* [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
@ 2017-11-06  8:56 ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Memory protection keys enable an application to protect
its address space from inadvertent access or corruption
by the application itself.

These patches, along with the pte-bit freeing patch series,
enable the protection key feature on powerpc for 4k and 64k
hashpage kernels. They also change the generic and x86
code to expose memkey features through sysfs. Finally,
testcases and Documentation are updated.

All patches can be found at --
https://github.com/rampai/memorykeys.git memkey.v9

The overall idea:
-----------------
 A process allocates a key and associates it with
 an address range within its address space.
 The process can then dynamically set read/write
 permissions on the key without involving the
 kernel. Any code that violates the permissions
 of the address range, as defined by its associated
 key, will receive a segmentation fault.
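
 From userspace, the flow above might look like the following
 minimal sketch. It assumes a C library that provides the
 pkey_alloc(), pkey_mprotect(), pkey_set() and pkey_free()
 wrappers (glibc 2.27 or later); the function name pkey_demo()
 is made up for illustration and is not part of this series.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Illustrative only. Returns 0 if the whole sequence worked, -1 if
 * protection keys are not usable here (old kernel, unsupported CPU,
 * no free key) or any step failed.
 */
int pkey_demo(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	int pkey = -1, ret = -1;
	char *buf;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Ask the kernel for a key with no initial restrictions. */
	pkey = pkey_alloc(0, 0);
	if (pkey < 0)
		goto out;

	/* Associate the key with the address range (a system call). */
	if (pkey_mprotect(buf, len, PROT_READ | PROT_WRITE, pkey) != 0)
		goto out;

	buf[0] = 'a';		/* the key still permits writes */

	/* Make the key read-only; the C library updates the key register. */
	if (pkey_set(pkey, PKEY_DISABLE_WRITE) != 0)
		goto out;

	/* A store to buf here would raise SIGSEGV; loads still work. */
	if (buf[0] != 'a')
		goto out;

	/* Re-enable writes, again without a system call. */
	if (pkey_set(pkey, 0) != 0)
		goto out;

	buf[0] = 'b';
	ret = 0;
out:
	munmap(buf, len);
	if (pkey >= 0)
		pkey_free(pkey);
	return ret;
}
```

 The mapping is unmapped before the key is freed so that no
 range still refers to a key that has been returned to the
 kernel.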

This patch series enables the feature on the PPC64 HPTE
platform.

ISA3.0 section 5.7.13 describes the detailed
specifications.


High-level view of the design:
------------------------------
When an application associates a key with an address
range, program the key in the Linux PTE.
When the MMU detects a page fault, allocate a hash
page and program the key into the HPTE. Finally,
when the MMU detects a key violation due to an
invalid application access, invoke the registered
signal handler and provide the violated key number.
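
On the application side, the last step could be consumed
roughly as sketched below. This assumes the violated key is
reported the way x86 already does it, through si_code ==
SEGV_PKUERR and the si_pkey field of siginfo_t; the helper
names faulting_pkey() and install_pkey_segv_handler() and the
exit codes are hypothetical.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <unistd.h>

#ifndef SEGV_PKUERR
#define SEGV_PKUERR 4	/* value used by the Linux uapi headers */
#endif

/* Return the violated key number, or -1 if this is not a pkey fault. */
int faulting_pkey(const siginfo_t *si)
{
	if (si->si_signo != SIGSEGV || si->si_code != SEGV_PKUERR)
		return -1;
	return (int)si->si_pkey;
}

/*
 * Returning from the handler would simply retry the faulting access,
 * so this sketch terminates instead. Exit code 2 marks a key
 * violation, 1 any other segmentation fault (arbitrary choices).
 */
static void segv_handler(int sig, siginfo_t *si, void *ctx)
{
	(void)sig;
	(void)ctx;
	_exit(faulting_pkey(si) >= 0 ? 2 : 1);
}

/* Register the handler with SA_SIGINFO so that siginfo_t is passed. */
int install_pkey_segv_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = segv_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSEGV, &sa, NULL);
}
```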


Testing:
-------
This patch series has passed all the protection key
tests available in the selftest directory. The
tests are updated to work on both x86 and powerpc.
The selftests have passed on x86 and powerpc hardware.

History:
-------
version v9:
	(1) used jump-labels to optimize code
		-- Balbir
	(2) fixed a register initialization bug noted
		by Balbir
	(3) fixed inappropriate use of paca to pass
		siginfo and keys to signal handler
	(4) cleaned up comment style so that it is not
		right-justified -- mpe
	(5) restructured the patches to depend on the
		availability of VM_PKEY_BIT4 in
		include/linux/mm.h
	(6) Incorporated comments from Dave Hansen
		towards changes to selftest and got
		them tested on x86.

version v8:
	(1) Contents of the AMR register withdrawn from
	the siginfo structure. Applications can always
	read the AMR register.
	(2) AMR/IAMR/UAMOR are now available through 
		ptrace system call. -- thanks to Thiago
	(3) code changes to handle legacy power cpus
	that do not support execute-disable.
	(4) incorporates many code improvement
		suggestions.

version v7:
	(1) refers to device tree property to enable
		protection keys.
	(2) adds 4K PTE support.
	(3) fixes a couple of bugs noticed by Thiago
	(4) decouples this patch series from arch-
	 independent code. This patch series can
	 now stand by itself, with one kludge
	patch(2).

version v6:
	(1) selftest changes are broken down into 20
		incremental patches.
	(2) A separate key allocation mask that
		includes PKEY_DISABLE_EXECUTE is 
		added for powerpc
	(3) pkey feature is enabled for 64K HPT case
		only. RPT and 4k HPT are disabled.
	(4) Documentation is updated to better 
		capture the semantics.
	(5) introduced arch_pkeys_enabled() to find
		if an arch enables pkeys. Correspondingly
		changed the logic that displays the
		key value in smaps.
	(6) code rearranged in many places based on
		comments from Dave Hansen, Balbir,
		Anshuman.	
	(7) fixed one bug where a bogus key could be
		associated successfully in
		pkey_mprotect().

version v5:
	(1) reverted back to the old design -- store
	 the key in the pte, instead of bypassing
	 it. The v4 design slowed down the hash
	 page path.
	(2) detects key violation when kernel is told
		to access user pages.
	(3) further refined the patches into smaller
		consumable units
	(4) page fault handlers capture the faulting
	 key from the pte instead of the vma. This
	 closes a race between the key update in
	 the vma and a key fault caused by the key
	 programmed in the pte.
	(5) a key created with access-denied should
	 also set it up to deny write. Fixed it.
	(6) protection-key number is displayed in
 		smaps the x86 way.

version v4:
	(1) patches no longer depend on the pte bits
		to program the hpte
			-- comment by Balbir
	(2) documentation updates
	(3) fixed a bug in the selftest.
	(4) unlike x86, powerpc lets signal handler
		change key permission bits; the
		change will persist across signal
		handler boundaries. Earlier we
		allowed the signal handler to
		modify a field in the siginfo
		structure which would then be used
		by the kernel to program the key
		protection register (AMR)
		 -- resolves an issue raised by Ben.
		"Calls to sys_swapcontext with a
		made-up context will end up with a
		crap AMR if done by code who didn't
		know about that register".
	(5) these changes enable protection keys on
 		4k-page kernels as well.

version v3:
	(1) split the patches into smaller consumable
		patches.
	(2) added the ability to disable execute
		permission on a key at creation.
	(3) rename calc_pte_to_hpte_pkey_bits() to
	pte_to_hpte_pkey_bits()
		-- suggested by Anshuman
	(4) some code optimization and clarity in
		do_page_fault()
	(5) A bug fix while invalidating a hpte slot
		in __hash_page_4K()
		-- noticed by Aneesh
	

version v2:
	(1) documentation and selftest added.
 	(2) fixed a bug in 4k hpte backed 64k pte
		where page invalidation was not
		done correctly, and initialization
		of second-part-of-the-pte was not
		done correctly if the pte was not
		yet Hashed with a hpte.
		--	Reported by Aneesh.
	(3) Fixed ABI breakage caused in siginfo
		structure.
		-- Reported by Anshuman.
	

version v1: Initial version

Ram Pai (47):
  mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS
    is enabled
  mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
  powerpc: initial pkey plumbing
  powerpc: track allocation status of all pkeys
  powerpc: helper function to read,write AMR,IAMR,UAMOR registers
  powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  powerpc: cleanup AMR, IAMR when a key is allocated or freed
  powerpc: implementation for arch_set_user_pkey_access()
  powerpc: ability to create execute-disabled pkeys
  powerpc: store and restore the pkey state across context switches
  powerpc: introduce execute-only pkey
  powerpc: ability to associate pkey to a vma
  powerpc: implementation for arch_override_mprotect_pkey()
  powerpc: map vma key-protection bits to pte key bits.
  powerpc: Program HPTE key protection bits
  powerpc: helper to validate key-access permissions of a pte
  powerpc: check key protection for user page access
  powerpc: implementation for arch_vma_access_permitted()
  powerpc: Handle exceptions caused by pkey violation
  powerpc: introduce get_mm_addr_key() helper
  powerpc: Deliver SEGV signal on pkey violation
  powerpc: Enable pkey subsystem
  powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
  powerpc: sys_pkey_mprotect() system call
  powerpc: add sys_pkey_modify() system call
  mm, x86 : introduce arch_pkeys_enabled()
  mm: display pkey in smaps if arch_pkeys_enabled() is true
  Documentation/x86: Move protecton key documentation to arch neutral
    directory
  Documentation/vm: PowerPC specific updates to memory protection keys
  selftest/x86: Move protecton key selftest to arch neutral directory
  selftest/vm: rename all references to pkru to a generic name
  selftest/vm: move generic definitions to header file
  selftest/vm: typecast the pkey register
  selftest/vm: generic function to handle shadow key register
  selftest/vm: fix the wrong assert in pkey_disable_set()
  selftest/vm: fixed bugs in pkey_disable_clear()
  selftest/vm: clear the bits in shadow reg when a pkey is freed.
  selftest/vm: fix alloc_random_pkey() to make it really random
  selftest/vm: introduce two arch independent abstraction
  selftest/vm: pkey register should match shadow pkey
  selftest/vm: generic cleanup
  selftest/vm: powerpc implementation for generic abstraction
  selftest/vm: fix an assertion in test_pkey_alloc_exhaust()
  selftest/vm: associate key on a mapped page and detect access
    violation
  selftest/vm: associate key on a mapped page and detect write
    violation
  selftest/vm: detect write violation on a mapped access-denied-key
    page
  selftest/vm: sub-page allocator

Thiago Jung Bauermann (4):
  powerpc/ptrace: Add memory protection key regset
  mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  selftests/powerpc: Add ptrace tests for Protection Key register
  selftests/powerpc: Add core file test for Protection Key register

 Documentation/vm/protection-keys.txt               |  161 +++
 Documentation/x86/protection-keys.txt              |   85 --
 arch/powerpc/Kconfig                               |   15 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h      |    5 +
 arch/powerpc/include/asm/book3s/64/mmu.h           |   10 +
 arch/powerpc/include/asm/book3s/64/pgtable.h       |   42 +-
 arch/powerpc/include/asm/bug.h                     |    1 +
 arch/powerpc/include/asm/cputable.h                |   15 +-
 arch/powerpc/include/asm/mman.h                    |   13 +-
 arch/powerpc/include/asm/mmu.h                     |    9 +
 arch/powerpc/include/asm/mmu_context.h             |   24 +
 arch/powerpc/include/asm/pkeys.h                   |  247 ++++
 arch/powerpc/include/asm/processor.h               |    5 +
 arch/powerpc/include/asm/systbl.h                  |    4 +
 arch/powerpc/include/asm/unistd.h                  |    6 +-
 arch/powerpc/include/uapi/asm/elf.h                |    1 +
 arch/powerpc/include/uapi/asm/mman.h               |    6 +
 arch/powerpc/include/uapi/asm/unistd.h             |    4 +
 arch/powerpc/kernel/entry_64.S                     |    9 +
 arch/powerpc/kernel/process.c                      |    7 +
 arch/powerpc/kernel/prom.c                         |   18 +
 arch/powerpc/kernel/ptrace.c                       |   66 +
 arch/powerpc/kernel/traps.c                        |   19 +-
 arch/powerpc/mm/Makefile                           |    1 +
 arch/powerpc/mm/fault.c                            |   49 +-
 arch/powerpc/mm/hash_utils_64.c                    |   29 +
 arch/powerpc/mm/mmu_context_book3s64.c             |    2 +
 arch/powerpc/mm/pkeys.c                            |  463 +++++++
 arch/x86/include/asm/mmu_context.h                 |    4 +-
 arch/x86/include/asm/pkeys.h                       |    2 +
 arch/x86/kernel/fpu/xstate.c                       |    5 +
 arch/x86/kernel/setup.c                            |    8 -
 arch/x86/mm/pkeys.c                                |    9 +
 fs/proc/task_mmu.c                                 |   16 +-
 include/linux/mm.h                                 |   12 +-
 include/linux/pkeys.h                              |    7 +-
 include/uapi/linux/elf.h                           |    1 +
 mm/mprotect.c                                      |   88 ++
 tools/testing/selftests/powerpc/include/reg.h      |    1 +
 tools/testing/selftests/powerpc/ptrace/Makefile    |    5 +-
 tools/testing/selftests/powerpc/ptrace/core-pkey.c |  438 ++++++
 .../testing/selftests/powerpc/ptrace/ptrace-pkey.c |  443 ++++++
 tools/testing/selftests/vm/Makefile                |    1 +
 tools/testing/selftests/vm/pkey-helpers.h          |  405 ++++++
 tools/testing/selftests/vm/protection_keys.c       | 1464 ++++++++++++++++++++
 tools/testing/selftests/x86/Makefile               |    2 +-
 tools/testing/selftests/x86/pkey-helpers.h         |  220 ---
 tools/testing/selftests/x86/protection_keys.c      | 1395 -------------------
 48 files changed, 4095 insertions(+), 1747 deletions(-)
 create mode 100644 Documentation/vm/protection-keys.txt
 delete mode 100644 Documentation/x86/protection-keys.txt
 create mode 100644 arch/powerpc/include/asm/pkeys.h
 create mode 100644 arch/powerpc/mm/pkeys.c
 create mode 100644 tools/testing/selftests/powerpc/ptrace/core-pkey.c
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
 create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
 create mode 100644 tools/testing/selftests/vm/protection_keys.c
 delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
 delete mode 100644 tools/testing/selftests/x86/protection_keys.c


* [PATCH v9 01/51] mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS is enabled
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:56   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

VM_PKEY_BITx are currently defined only if CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
is enabled. Powerpc also needs these bits. Hence, let's define the
VM_PKEY_BITx bits for any architecture that enables
CONFIG_ARCH_HAS_PKEYS.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 fs/proc/task_mmu.c |    4 ++--
 include/linux/mm.h |    9 +++++----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 6744bd7..677866e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -677,13 +677,13 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_MERGEABLE)]	= "mg",
 		[ilog2(VM_UFFD_MISSING)]= "um",
 		[ilog2(VM_UFFD_WP)]	= "uw",
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_ARCH_HAS_PKEYS
 		/* These come out via ProtectionKey: */
 		[ilog2(VM_PKEY_BIT0)]	= "",
 		[ilog2(VM_PKEY_BIT1)]	= "",
 		[ilog2(VM_PKEY_BIT2)]	= "",
 		[ilog2(VM_PKEY_BIT3)]	= "",
-#endif
+#endif /* CONFIG_ARCH_HAS_PKEYS */
 	};
 	size_t i;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 43edf65..2c5ea48 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -218,15 +218,16 @@ extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
 #define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
 
-#if defined(CONFIG_X86)
-# define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
-#if defined (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS)
+#ifdef CONFIG_ARCH_HAS_PKEYS
 # define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
 # define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
 # define VM_PKEY_BIT1	VM_HIGH_ARCH_1
 # define VM_PKEY_BIT2	VM_HIGH_ARCH_2
 # define VM_PKEY_BIT3	VM_HIGH_ARCH_3
-#endif
+#endif /* CONFIG_ARCH_HAS_PKEYS */
+
+#if defined(CONFIG_X86)
+# define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
 #elif defined(CONFIG_PPC)
 # define VM_SAO		VM_ARCH_1	/* Strong Access Ordering (powerpc) */
 #elif defined(CONFIG_PARISC)
-- 
1.7.1


* [PATCH v9 02/51] mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:56   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Currently only 4 bits are allocated in the vma flags to hold 16
keys. This is sufficient for x86. PowerPC supports 32 keys,
which needs 5 bits. This patch allocates an additional bit.
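
To make the bit arithmetic concrete, here is a minimal userspace sketch
of a key field inside a 64-bit flags word. It is not the kernel code:
the helper names are hypothetical, and the field position (bit 32) is
an assumption made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the pkey field starts at bit 32 of a 64-bit flags word. */
#define PKEY_SHIFT 32

/* Number of distinct keys an nbits-wide field can hold. */
static unsigned int pkey_count(unsigned int nbits)
{
	return 1u << nbits;
}

/* Mask covering an nbits-wide pkey field inside the flags word. */
static uint64_t pkey_field_mask(unsigned int nbits)
{
	return ((UINT64_C(1) << nbits) - 1) << PKEY_SHIFT;
}

/* Store pkey in the field, leaving all other flag bits untouched. */
static uint64_t flags_set_pkey(uint64_t flags, unsigned int nbits,
			       unsigned int pkey)
{
	assert(pkey < pkey_count(nbits));
	return (flags & ~pkey_field_mask(nbits)) |
	       ((uint64_t)pkey << PKEY_SHIFT);
}

/* Read the pkey back out of an nbits-wide field. */
static unsigned int flags_get_pkey(uint64_t flags, unsigned int nbits)
{
	return (unsigned int)((flags & pkey_field_mask(nbits)) >> PKEY_SHIFT);
}
```

Reading a key stored in a 5-bit field through a 4-bit mask drops the
top bit, which is why the fifth vma bit is needed before keys 16..31
can be represented.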

Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 fs/proc/task_mmu.c |    1 +
 include/linux/mm.h |    3 ++-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 677866e..fad19a0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -683,6 +683,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_PKEY_BIT1)]	= "",
 		[ilog2(VM_PKEY_BIT2)]	= "",
 		[ilog2(VM_PKEY_BIT3)]	= "",
+		[ilog2(VM_PKEY_BIT4)]	= "",
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 	};
 	size_t i;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2c5ea48..f5330a9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -221,9 +221,10 @@ extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
 #ifdef CONFIG_ARCH_HAS_PKEYS
 # define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
 # define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
-# define VM_PKEY_BIT1	VM_HIGH_ARCH_1
+# define VM_PKEY_BIT1	VM_HIGH_ARCH_1	/* on x86 and 5-bit value on ppc64   */
 # define VM_PKEY_BIT2	VM_HIGH_ARCH_2
 # define VM_PKEY_BIT3	VM_HIGH_ARCH_3
+# define VM_PKEY_BIT4	VM_HIGH_ARCH_4
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 
 #if defined(CONFIG_X86)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 03/51] powerpc: initial pkey plumbing
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:56   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Basic plumbing to initialize the pkey system.
Nothing is enabled yet. A later patch will enable it
once all the infrastructure is in place.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/Kconfig                   |   15 ++++++++
 arch/powerpc/include/asm/mmu_context.h |    5 +++
 arch/powerpc/include/asm/pkeys.h       |   57 ++++++++++++++++++++++++++++++++
 arch/powerpc/mm/Makefile               |    1 +
 arch/powerpc/mm/hash_utils_64.c        |    4 ++
 arch/powerpc/mm/pkeys.c                |   30 +++++++++++++++++
 6 files changed, 112 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/pkeys.h
 create mode 100644 arch/powerpc/mm/pkeys.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index cb782ac..9fd389b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -865,6 +865,21 @@ config SECCOMP
 
 	  If unsure, say Y. Only embedded should say N here.
 
+config PPC_MEM_KEYS
+	prompt "PowerPC Memory Protection Keys"
+	def_bool y
+	depends on PPC_BOOK3S_64
+	select ARCH_USES_HIGH_VMA_FLAGS
+	select ARCH_HAS_PKEYS
+	help
+	  Memory Protection Keys provides a mechanism for enforcing
+	  page-based protections, but without requiring modification of the
+	  page tables when an application changes protection domains.
+
+	  For details, see Documentation/vm/protection-keys.txt
+
+	  If unsure, say y.
+
 endmenu
 
 config ISA_DMA_API
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 492d814..2c24447 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -142,5 +142,10 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 	/* by default, allow everything */
 	return true;
 }
+
+#ifndef CONFIG_PPC_MEM_KEYS
+#define pkey_initialize()
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
new file mode 100644
index 0000000..a54cb39
--- /dev/null
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -0,0 +1,57 @@
+/*
+ * PowerPC Memory Protection Keys management
+ * Copyright (c) 2017, IBM Corporation.
+ * Author: Ram Pai <linuxram@us.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _ASM_POWERPC_KEYS_H
+#define _ASM_POWERPC_KEYS_H
+
+#include <linux/jump_label.h>
+
+DECLARE_STATIC_KEY_TRUE(pkey_disabled);
+#define ARCH_VM_PKEY_FLAGS 0
+
+static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
+{
+	return false;
+}
+
+static inline int mm_pkey_alloc(struct mm_struct *mm)
+{
+	return -1;
+}
+
+static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
+{
+	return -EINVAL;
+}
+
+/*
+ * Try to dedicate one of the protection keys to be used as an
+ * execute-only protection key.
+ */
+static inline int execute_only_pkey(struct mm_struct *mm)
+{
+	return 0;
+}
+
+static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
+					      int prot, int pkey)
+{
+	return 0;
+}
+
+static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+					    unsigned long init_val)
+{
+	return 0;
+}
+
+extern void pkey_initialize(void);
+#endif /*_ASM_POWERPC_KEYS_H */
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index a0c327d..823b03d 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -44,3 +44,4 @@ obj-$(CONFIG_PPC_COPRO_BASE)	+= copro_fault.o
 obj-$(CONFIG_SPAPR_TCE_IOMMU)	+= mmu_context_iommu.o
 obj-$(CONFIG_PPC_PTDUMP)	+= dump_linuxpagetables.o
 obj-$(CONFIG_PPC_HTDUMP)	+= dump_hashpagetable.o
+obj-$(CONFIG_PPC_MEM_KEYS)	+= pkeys.o
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 578d5a3..1e74590 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -35,6 +35,7 @@
 #include <linux/memblock.h>
 #include <linux/context_tracking.h>
 #include <linux/libfdt.h>
+#include <linux/pkeys.h>
 
 #include <asm/debugfs.h>
 #include <asm/processor.h>
@@ -1050,6 +1051,9 @@ void __init hash__early_init_mmu(void)
 	pr_info("Initializing hash mmu with SLB\n");
 	/* Initialize SLB management */
 	slb_initialize();
+
+	/* initialize the key subsystem */
+	pkey_initialize();
 }
 
 #ifdef CONFIG_SMP
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
new file mode 100644
index 0000000..c97a7a0
--- /dev/null
+++ b/arch/powerpc/mm/pkeys.c
@@ -0,0 +1,30 @@
+/*
+ * PowerPC Memory Protection Keys management
+ * Copyright (c) 2017, IBM Corporation.
+ * Author: Ram Pai <linuxram@us.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/pkeys.h>
+
+DEFINE_STATIC_KEY_TRUE(pkey_disabled);
+bool pkey_execute_disable_supported;
+
+void __init pkey_initialize(void)
+{
+	/*
+	 * Disable the pkey system till everything is in place. A subsequent
+	 * patch will enable it.
+	 */
+	static_branch_enable(&pkey_disabled);
+
+	/*
+	 * Disable execute_disable support for now. A subsequent patch will
+	 * enable it.
+	 */
+	pkey_execute_disable_supported = false;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 04/51] powerpc: track allocation status of all pkeys
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:56   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

A total of 32 keys are available on power7 and above. However,
pkeys 0 and 1 are reserved, so effectively we have 30 pkeys.

On 4K kernels, we do not have 5 bits in the PTE to
represent all the keys; we only have 3 bits, which gives 8 keys.
Two of those keys are reserved: pkey 0 and pkey 1. So effectively
we have 6 pkeys.
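
The key counts above can be checked with a small userspace sketch.
It is only a model of the arithmetic: the function names are
hypothetical, and the sole inputs are the number of keys the PTE
format can represent and the fact that keys 0 and 1 are reserved.

```c
#include <stdint.h>

/*
 * Build a mask with a bit set for every key that must not be handed out.
 * usable_keys is the number of keys the PTE format can represent
 * (32 with 5 bits, 8 with 3 bits); keys 0 and 1 are always reserved.
 */
static uint32_t reserved_key_mask(int usable_keys)
{
	uint32_t mask = ~UINT32_C(0);
	int i;

	for (i = 2; i < usable_keys; i++)
		mask &= ~(UINT32_C(1) << i);
	return mask;
}

/* Count the keys left for applications (the zero bits in the mask). */
static int free_key_count(uint32_t mask)
{
	int n = 0, i;

	for (i = 0; i < 32; i++)
		if (!(mask & (UINT32_C(1) << i)))
			n++;
	return n;
}
```

With 32 representable keys the mask is 0x3 and 30 keys are free; with
8 representable keys the mask is 0xFFFFFF03 and 6 keys are free.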

This patch keeps track of reserved keys, allocated keys,
and keys that are currently free.
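
As a rough illustration of that bookkeeping, the sketch below models a
per-process bitmap in plain C. The struct and function names are
hypothetical, and a simple loop stands in for the kernel's ffz(); it
is a sketch under those assumptions, not the code in this patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-mm allocation bitmap. */
struct key_map {
	uint32_t allocated;	/* bit set -> key reserved or in use */
	uint32_t reserved;	/* bit set -> key never available */
};

/* Start with only the reserved keys marked as taken. */
static void key_map_init(struct key_map *m, uint32_t reserved)
{
	m->reserved = reserved;
	m->allocated = reserved;
}

/* Return the lowest free key, or -1 when every key is taken. */
static int key_alloc(struct key_map *m)
{
	int key;

	for (key = 0; key < 32; key++) {
		if (!(m->allocated & (UINT32_C(1) << key))) {
			m->allocated |= UINT32_C(1) << key;
			return key;
		}
	}
	return -1;
}

/* Free a key; reserved or unallocated keys are rejected with -1. */
static int key_free(struct key_map *m, int key)
{
	uint32_t bit;

	if (key < 0 || key >= 32)
		return -1;
	bit = UINT32_C(1) << key;
	if ((m->reserved & bit) || !(m->allocated & bit))
		return -1;
	m->allocated &= ~bit;
	return 0;
}
```

Seeding the bitmap with the reserved mask means a reserved key can
never be returned by the allocator, and checking the reserved mask on
free keeps it from being released by mistake.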

It also adds skeletal functions and macros that the
architecture-independent code expects to be available.

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu.h |    9 +++
 arch/powerpc/include/asm/mmu_context.h   |    1 +
 arch/powerpc/include/asm/pkeys.h         |   95 ++++++++++++++++++++++++++++-
 arch/powerpc/mm/mmu_context_book3s64.c   |    2 +
 arch/powerpc/mm/pkeys.c                  |   33 ++++++++++
 5 files changed, 136 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 37fdede..df17fbc 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -108,6 +108,15 @@ struct patb_entry {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	struct list_head iommu_group_mem_list;
 #endif
+
+#ifdef CONFIG_PPC_MEM_KEYS
+	/*
+	 * Each bit represents one protection key.
+	 * bit set   -> key allocated
+	 * bit unset -> key available for allocation
+	 */
+	u32 pkey_allocation_map;
+#endif
 } mm_context_t;
 
 /*
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 2c24447..6d7c4f1 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -145,6 +145,7 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 
 #ifndef CONFIG_PPC_MEM_KEYS
 #define pkey_initialize()
+#define pkey_mm_init(mm)
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index a54cb39..e5deac7 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -15,21 +15,101 @@
 #include <linux/jump_label.h>
 
 DECLARE_STATIC_KEY_TRUE(pkey_disabled);
-#define ARCH_VM_PKEY_FLAGS 0
+extern int pkeys_total; /* total pkeys as per device tree */
+extern u32 initial_allocation_mask; /* bits set for reserved keys */
+
+/*
+ * powerpc needs VM_PKEY_BIT* bit to enable pkey system.
+ * Without them, at least compilation needs to succeed.
+ */
+#ifndef VM_PKEY_BIT0
+#define VM_PKEY_SHIFT 0
+#define VM_PKEY_BIT0 0
+#define VM_PKEY_BIT1 0
+#define VM_PKEY_BIT2 0
+#define VM_PKEY_BIT3 0
+#endif
+
+/*
+ * powerpc needs an additional vma bit to support 32 keys. Till the additional
+ * vma bit lands in include/linux/mm.h we can only support 16 keys.
+ */
+#ifndef VM_PKEY_BIT4
+#define VM_PKEY_BIT4 0
+#endif
+
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
+			    VM_PKEY_BIT3 | VM_PKEY_BIT4)
+
+#define arch_max_pkey() pkeys_total
+
+#define pkey_alloc_mask(pkey) (0x1 << pkey)
+
+#define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
+
+#define __mm_pkey_allocated(mm, pkey) {	\
+	mm_pkey_allocation_map(mm) |= pkey_alloc_mask(pkey); \
+}
+
+#define __mm_pkey_free(mm, pkey) {	\
+	mm_pkey_allocation_map(mm) &= ~pkey_alloc_mask(pkey);	\
+}
+
+#define __mm_pkey_is_allocated(mm, pkey)	\
+	(mm_pkey_allocation_map(mm) & pkey_alloc_mask(pkey))
+
+#define __mm_pkey_is_reserved(pkey) (initial_allocation_mask & \
+				       pkey_alloc_mask(pkey))
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
-	return false;
+	/* A reserved key is never considered as 'explicitly allocated' */
+	return ((pkey < arch_max_pkey()) &&
+		!__mm_pkey_is_reserved(pkey) &&
+		__mm_pkey_is_allocated(mm, pkey));
 }
 
+/*
+ * Returns a positive, 5-bit key on success, or -1 on failure.
+ * Relies on the mmap_sem to protect against concurrency in mm_pkey_alloc() and
+ * mm_pkey_free().
+ */
 static inline int mm_pkey_alloc(struct mm_struct *mm)
 {
-	return -1;
+	/*
+	 * Note: this is the one and only place we make sure that the pkey is
+	 * valid as far as the hardware is concerned. The rest of the kernel
+	 * trusts that only good, valid pkeys come out of here.
+	 */
+	u32 all_pkeys_mask = (u32)(~(0x0));
+	int ret;
+
+	if (static_branch_likely(&pkey_disabled))
+		return -1;
+
+	/*
+	 * Are we out of pkeys? We must handle this specially because ffz()
+	 * behavior is undefined if there are no zeros.
+	 */
+	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
+		return -1;
+
+	ret = ffz((u32)mm_pkey_allocation_map(mm));
+	__mm_pkey_allocated(mm, ret);
+	return ret;
 }
 
 static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
 {
-	return -EINVAL;
+	if (static_branch_likely(&pkey_disabled))
+		return -1;
+
+	if (!mm_pkey_is_allocated(mm, pkey))
+		return -EINVAL;
+
+	__mm_pkey_free(mm, pkey);
+
+	return 0;
 }
 
 /*
@@ -53,5 +133,12 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	return 0;
 }
 
+static inline void pkey_mm_init(struct mm_struct *mm)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return;
+	mm_pkey_allocation_map(mm) = initial_allocation_mask;
+}
+
 extern void pkey_initialize(void);
 #endif /*_ASM_POWERPC_KEYS_H */
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 05e1538..5df223a 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -16,6 +16,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/mm.h>
+#include <linux/pkeys.h>
 #include <linux/spinlock.h>
 #include <linux/idr.h>
 #include <linux/export.h>
@@ -118,6 +119,7 @@ static int hash__init_new_context(struct mm_struct *mm)
 
 	subpage_prot_init_new_context(mm);
 
+	pkey_mm_init(mm);
 	return index;
 }
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index c97a7a0..512bdf2 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -13,18 +13,51 @@
 
 DEFINE_STATIC_KEY_TRUE(pkey_disabled);
 bool pkey_execute_disable_supported;
+int  pkeys_total;		/* Total pkeys as per device tree */
+u32  initial_allocation_mask;	/* Bits set for reserved keys */
 
 void __init pkey_initialize(void)
 {
+	int os_reserved, i;
+
 	/*
 	 * Disable the pkey system till everything is in place. A subsequent
 	 * patch will enable it.
 	 */
 	static_branch_enable(&pkey_disabled);
 
+	/* Let's assume 32 keys */
+	pkeys_total = 32;
+
+	/*
+	 * Adjust the upper limit, based on the number of bits supported by
+	 * arch-neutral code.
+	 */
+	pkeys_total = min_t(int, pkeys_total,
+			((ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT) + 1));
+
 	/*
 	 * Disable execute_disable support for now. A subsequent patch will
 	 * enable it.
 	 */
 	pkey_execute_disable_supported = false;
+
+#ifdef CONFIG_PPC_4K_PAGES
+	/*
+	 * The OS can manage only 8 pkeys due to its inability to represent them
+	 * in the Linux 4K PTE.
+	 */
+	os_reserved = pkeys_total - 8;
+#else
+	os_reserved = 0;
+#endif
+	/*
+	 * Bits are in LE format. NOTE: 1, 0 are reserved.
+	 * key 0 is the default key, which allows read/write/execute.
+	 * key 1 is recommended not to be used. PowerISA(3.0) page 1015,
+	 * 	programming note.
+	 */
+	initial_allocation_mask = ~0x0;
+	for (i = 2; i < (pkeys_total - os_reserved); i++)
+		initial_allocation_mask &= ~(0x1 << i);
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 04/51] powerpc: track allocation status of all pkeys
@ 2017-11-06  8:56   ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Total 32 keys are available on power7 and above. However
pkey 0,1 are reserved. So effectively we  have  30 pkeys.

On 4K kernels, we do not  have  5  bits  in  the  PTE to
represent  all the keys; we only have 3bits.Two of those
keys are reserved; pkey 0 and pkey 1. So effectively  we
have 6 pkeys.

This patch keeps track of reserved keys, allocated  keys
and keys that are currently free.

Also it  adds  skeletal  functions  and macros, that the
architecture-independent code expects to be available.

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu.h |    9 +++
 arch/powerpc/include/asm/mmu_context.h   |    1 +
 arch/powerpc/include/asm/pkeys.h         |   95 ++++++++++++++++++++++++++++-
 arch/powerpc/mm/mmu_context_book3s64.c   |    2 +
 arch/powerpc/mm/pkeys.c                  |   33 ++++++++++
 5 files changed, 136 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 37fdede..df17fbc 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -108,6 +108,15 @@ struct patb_entry {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	struct list_head iommu_group_mem_list;
 #endif
+
+#ifdef CONFIG_PPC_MEM_KEYS
+	/*
+	 * Each bit represents one protection key.
+	 * bit set   -> key allocated
+	 * bit unset -> key available for allocation
+	 */
+	u32 pkey_allocation_map;
+#endif
 } mm_context_t;
 
 /*
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 2c24447..6d7c4f1 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -145,6 +145,7 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 
 #ifndef CONFIG_PPC_MEM_KEYS
 #define pkey_initialize()
+#define pkey_mm_init(mm)
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index a54cb39..e5deac7 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -15,21 +15,101 @@
 #include <linux/jump_label.h>
 
 DECLARE_STATIC_KEY_TRUE(pkey_disabled);
-#define ARCH_VM_PKEY_FLAGS 0
+extern int pkeys_total; /* total pkeys as per device tree */
+extern u32 initial_allocation_mask; /* bits set for reserved keys */
+
+/*
+ * powerpc needs VM_PKEY_BIT* bit to enable pkey system.
+ * Without them, at least compilation needs to succeed.
+ */
+#ifndef VM_PKEY_BIT0
+#define VM_PKEY_SHIFT 0
+#define VM_PKEY_BIT0 0
+#define VM_PKEY_BIT1 0
+#define VM_PKEY_BIT2 0
+#define VM_PKEY_BIT3 0
+#endif
+
+/*
+ * powerpc needs an additional vma bit to support 32 keys. Till the additional
+ * vma bit lands in include/linux/mm.h we can only support 16 keys.
+ */
+#ifndef VM_PKEY_BIT4
+#define VM_PKEY_BIT4 0
+#endif
+
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
+			    VM_PKEY_BIT3 | VM_PKEY_BIT4)
+
+#define arch_max_pkey() pkeys_total
+
+#define pkey_alloc_mask(pkey) (0x1 << pkey)
+
+#define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
+
+#define __mm_pkey_allocated(mm, pkey) {	\
+	mm_pkey_allocation_map(mm) |= pkey_alloc_mask(pkey); \
+}
+
+#define __mm_pkey_free(mm, pkey) {	\
+	mm_pkey_allocation_map(mm) &= ~pkey_alloc_mask(pkey);	\
+}
+
+#define __mm_pkey_is_allocated(mm, pkey)	\
+	(mm_pkey_allocation_map(mm) & pkey_alloc_mask(pkey))
+
+#define __mm_pkey_is_reserved(pkey) (initial_allocation_mask & \
+				       pkey_alloc_mask(pkey))
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
-	return false;
+	/* A reserved key is never considered as 'explicitly allocated' */
+	return ((pkey < arch_max_pkey()) &&
+		!__mm_pkey_is_reserved(pkey) &&
+		__mm_pkey_is_allocated(mm, pkey));
 }
 
+/*
+ * Returns a positive, 5-bit key on success, or -1 on failure.
+ * Relies on the mmap_sem to protect against concurrency in mm_pkey_alloc() and
+ * mm_pkey_free().
+ */
 static inline int mm_pkey_alloc(struct mm_struct *mm)
 {
-	return -1;
+	/*
+	 * Note: this is the one and only place we make sure that the pkey is
+	 * valid as far as the hardware is concerned. The rest of the kernel
+	 * trusts that only good, valid pkeys come out of here.
+	 */
+	u32 all_pkeys_mask = (u32)(~(0x0));
+	int ret;
+
+	if (static_branch_likely(&pkey_disabled))
+		return -1;
+
+	/*
+	 * Are we out of pkeys? We must handle this specially because ffz()
+	 * behavior is undefined if there are no zeros.
+	 */
+	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
+		return -1;
+
+	ret = ffz((u32)mm_pkey_allocation_map(mm));
+	__mm_pkey_allocated(mm, ret);
+	return ret;
 }
 
 static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
 {
-	return -EINVAL;
+	if (static_branch_likely(&pkey_disabled))
+		return -1;
+
+	if (!mm_pkey_is_allocated(mm, pkey))
+		return -EINVAL;
+
+	__mm_pkey_free(mm, pkey);
+
+	return 0;
 }
 
 /*
@@ -53,5 +133,12 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	return 0;
 }
 
+static inline void pkey_mm_init(struct mm_struct *mm)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return;
+	mm_pkey_allocation_map(mm) = initial_allocation_mask;
+}
+
 extern void pkey_initialize(void);
 #endif /*_ASM_POWERPC_KEYS_H */
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 05e1538..5df223a 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -16,6 +16,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/mm.h>
+#include <linux/pkeys.h>
 #include <linux/spinlock.h>
 #include <linux/idr.h>
 #include <linux/export.h>
@@ -118,6 +119,7 @@ static int hash__init_new_context(struct mm_struct *mm)
 
 	subpage_prot_init_new_context(mm);
 
+	pkey_mm_init(mm);
 	return index;
 }
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index c97a7a0..512bdf2 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -13,18 +13,51 @@
 
 DEFINE_STATIC_KEY_TRUE(pkey_disabled);
 bool pkey_execute_disable_supported;
+int  pkeys_total;		/* Total pkeys as per device tree */
+u32  initial_allocation_mask;	/* Bits set for reserved keys */
 
 void __init pkey_initialize(void)
 {
+	int os_reserved, i;
+
 	/*
 	 * Disable the pkey system till everything is in place. A subsequent
 	 * patch will enable it.
 	 */
 	static_branch_enable(&pkey_disabled);
 
+	/* Lets assume 32 keys */
+	pkeys_total = 32;
+
+	/*
+	 * Adjust the upper limit, based on the number of bits supported by
+	 * arch-neutral code.
+	 */
+	pkeys_total = min_t(int, pkeys_total,
+			(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT));
+
 	/*
 	 * Disable execute_disable support for now. A subsequent patch will
 	 * enable it.
 	 */
 	pkey_execute_disable_supported = false;
+
+#ifdef CONFIG_PPC_4K_PAGES
+	/*
+	 * The OS can manage only 8 pkeys due to its inability to represent them
+	 * in the Linux 4K PTE.
+	 */
+	os_reserved = pkeys_total - 8;
+#else
+	os_reserved = 0;
+#endif
+	/*
+	 * Bits are in LE format. NOTE: 1, 0 are reserved.
+	 * key 0 is the default key, which allows read/write/execute.
+	 * key 1 is recommended not to be used. PowerISA(3.0) page 1015,
+	 * 	programming note.
+	 */
+	initial_allocation_mask = ~0x0;
+	for (i = 2; i < (pkeys_total - os_reserved); i++)
+		initial_allocation_mask &= ~(0x1 << i);
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
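
The allocation-mask arithmetic at the end of pkey_initialize() can be tried
out in userspace. The sketch below is not part of the patch:
build_initial_allocation_mask() is a hypothetical helper, and it assumes
pkeys_total is still 32 after the min_t() clamp. It shifts an unsigned
constant so that clearing bit 31 is well defined in standard C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the loop at the end of pkey_initialize().
 * A set bit means "reserved, never handed out"; a clear bit means "free".
 */
static uint32_t build_initial_allocation_mask(int pkeys_total, int os_reserved)
{
	uint32_t mask = ~UINT32_C(0);
	int i;

	assert(pkeys_total > 0 && pkeys_total <= 32);
	assert(os_reserved >= 0 && os_reserved <= pkeys_total);

	/* Keys 0 and 1 stay reserved, so start clearing at bit 2. */
	for (i = 2; i < pkeys_total - os_reserved; i++)
		mask &= ~(UINT32_C(1) << i);

	return mask;
}
```

Under these assumptions, build_initial_allocation_mask(32, 0) models the
64K-page case and leaves only keys 0 and 1 reserved (0x00000003), while
build_initial_allocation_mask(32, 24) models the 4K-page case and leaves
keys 2..7 available (0xffffff03).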

* [PATCH v9 05/51] powerpc: helper function to read,write AMR,IAMR,UAMOR registers
  2017-11-06  8:56 ` Ram Pai
  (?)
@ 2017-11-06  8:56   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Implement helper functions to read and write the key-related
registers: AMR, IAMR and UAMOR.

The AMR register tracks the read and write permissions of a key.
The IAMR register tracks the execute permission of a key.
The UAMOR register enables and disables a key.

Acked-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/pkeys.c |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 512bdf2..b6bdfdf 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -61,3 +61,39 @@ void __init pkey_initialize(void)
 	for (i = 2; i < (pkeys_total - os_reserved); i++)
 		initial_allocation_mask &= ~(0x1 << i);
 }
+
+static inline u64 read_amr(void)
+{
+	return mfspr(SPRN_AMR);
+}
+
+static inline void write_amr(u64 value)
+{
+	mtspr(SPRN_AMR, value);
+}
+
+static inline u64 read_iamr(void)
+{
+	if (!likely(pkey_execute_disable_supported))
+		return 0x0UL;
+
+	return mfspr(SPRN_IAMR);
+}
+
+static inline void write_iamr(u64 value)
+{
+	if (!likely(pkey_execute_disable_supported))
+		return;
+
+	mtspr(SPRN_IAMR, value);
+}
+
+static inline u64 read_uamor(void)
+{
+	return mfspr(SPRN_UAMOR);
+}
+
+static inline void write_uamor(u64 value)
+{
+	mtspr(SPRN_UAMOR, value);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 06/51] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:56   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Introduce helper functions that initialize the bits in the AMR,
IAMR and UAMOR registers that correspond to a given pkey.

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/pkeys.c |   47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index b6bdfdf..f3bf661 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -16,6 +16,10 @@
 int  pkeys_total;		/* Total pkeys as per device tree */
 u32  initial_allocation_mask;	/* Bits set for reserved keys */
 
+#define AMR_BITS_PER_PKEY 2
+#define PKEY_REG_BITS (sizeof(u64)*8)
+#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
+
 void __init pkey_initialize(void)
 {
 	int os_reserved, i;
@@ -97,3 +101,46 @@ static inline void write_uamor(u64 value)
 {
 	mtspr(SPRN_UAMOR, value);
 }
+
+static inline void init_amr(int pkey, u8 init_bits)
+{
+	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
+	u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
+
+	write_amr(old_amr | new_amr_bits);
+}
+
+static inline void init_iamr(int pkey, u8 init_bits)
+{
+	u64 new_iamr_bits = (((u64)init_bits & 0x1UL) << pkeyshift(pkey));
+	u64 old_iamr = read_iamr() & ~((u64)(0x1ul) << pkeyshift(pkey));
+
+	write_iamr(old_iamr | new_iamr_bits);
+}
+
+static void pkey_status_change(int pkey, bool enable)
+{
+	u64 old_uamor;
+
+	/* Reset the AMR and IAMR bits for this key */
+	init_amr(pkey, 0x0);
+	init_iamr(pkey, 0x0);
+
+	/* Enable/disable key */
+	old_uamor = read_uamor();
+	if (enable)
+		old_uamor |= (0x3ul << pkeyshift(pkey));
+	else
+		old_uamor &= ~(0x3ul << pkeyshift(pkey));
+	write_uamor(old_uamor);
+}
+
+void __arch_activate_pkey(int pkey)
+{
+	pkey_status_change(pkey, true);
+}
+
+void __arch_deactivate_pkey(int pkey)
+{
+	pkey_status_change(pkey, false);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
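
The bit layout implied by pkeyshift() is easier to see with concrete
numbers. The sketch below repeats the macro's arithmetic as a function and
applies the same read-modify-write pattern as init_amr() to a plain 64-bit
value instead of the AMR register. set_amr_field() is a hypothetical name
used only for illustration, and the 0..31 key range is an assumption taken
from the 32-key limit in the earlier patch.

```c
#include <assert.h>
#include <stdint.h>

#define AMR_BITS_PER_PKEY 2
#define PKEY_REG_BITS (sizeof(uint64_t) * 8)

/* Same arithmetic as the patch's pkeyshift(): key 0 sits in the top bits. */
static unsigned int pkeyshift(int pkey)
{
	assert(pkey >= 0 && pkey < 32);
	return (unsigned int)(PKEY_REG_BITS -
			      ((unsigned int)pkey + 1u) * AMR_BITS_PER_PKEY);
}

/*
 * Hypothetical stand-in for init_amr(): clears the 2-bit field that belongs
 * to pkey in a plain value, then installs the new bits and returns the result.
 */
static uint64_t set_amr_field(uint64_t amr, int pkey, uint8_t bits)
{
	uint64_t field_mask = UINT64_C(0x3) << pkeyshift(pkey);
	uint64_t new_bits = ((uint64_t)bits & UINT64_C(0x3)) << pkeyshift(pkey);

	return (amr & ~field_mask) | new_bits;
}
```

With this layout key 0 occupies bits 63:62 (shift 62), key 2 occupies bits
59:58 (shift 58), and key 31 occupies bits 1:0 (shift 0), so
set_amr_field(0, 2, 0x3) yields 0x0c00000000000000.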

* [PATCH v9 07/51] powerpc: cleanup AMR, IAMR when a key is allocated or freed
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:56   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:56 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Clear the bits corresponding to a key in the AMR and IAMR
registers when the key is newly allocated/activated or freed.
We don't want residual bits to cause the hardware to enforce
unintended behavior when the key is activated or freed.

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index e5deac7..0d00a54 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -69,6 +69,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 		__mm_pkey_is_allocated(mm, pkey));
 }
 
+extern void __arch_activate_pkey(int pkey);
+extern void __arch_deactivate_pkey(int pkey);
 /*
  * Returns a positive, 5-bit key on success, or -1 on failure.
  * Relies on the mmap_sem to protect against concurrency in mm_pkey_alloc() and
@@ -96,6 +98,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
 
 	ret = ffz((u32)mm_pkey_allocation_map(mm));
 	__mm_pkey_allocated(mm, ret);
+
+	/*
+	 * Enable the key in the hardware
+	 */
+	if (ret > 0)
+		__arch_activate_pkey(ret);
 	return ret;
 }
 
@@ -107,6 +115,10 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
 	if (!mm_pkey_is_allocated(mm, pkey))
 		return -EINVAL;
 
+	/*
+	 * Disable the key in the hardware
+	 */
+	__arch_deactivate_pkey(pkey);
 	__mm_pkey_free(mm, pkey);
 
 	return 0;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
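
The allocate/free flow that this patch hooks into can be modelled in
userspace as a bitmap. This is a simplified sketch, not the kernel code:
first_zero_bit(), model_pkey_alloc() and model_pkey_free() are hypothetical
names, the map is a plain uint32_t rather than the per-mm allocation map,
and the model does not distinguish reserved keys from allocated ones. Unlike
the kernel's ffz(), the lookup here returns -1 when no bit is clear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of ffz(): index of the lowest clear bit, or -1. */
static int first_zero_bit(uint32_t map)
{
	int i;

	for (i = 0; i < 32; i++)
		if (!(map & (UINT32_C(1) << i)))
			return i;
	return -1;
}

/* Allocate the lowest free key and mark it used; -1 when none is left. */
static int model_pkey_alloc(uint32_t *map)
{
	int key;

	assert(map);
	key = first_zero_bit(*map);
	if (key < 0)
		return -1;
	*map |= UINT32_C(1) << key;
	/* The patch calls __arch_activate_pkey(key) at this point. */
	return key;
}

/* Free a key; returns 0 on success, -1 if it was not allocated. */
static int model_pkey_free(uint32_t *map, int key)
{
	assert(map);
	if (key < 0 || key >= 32 || !(*map & (UINT32_C(1) << key)))
		return -1;
	/* The patch calls __arch_deactivate_pkey(key) before clearing the bit. */
	*map &= ~(UINT32_C(1) << key);
	return 0;
}
```

Starting from a map of 0x3 (keys 0 and 1 reserved), the first two
allocations return keys 2 and 3; freeing key 2 makes it the next key to be
handed out again.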

* [PATCH v9 08/51] powerpc: implementation for arch_set_user_pkey_access()
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

This patch provides the implementation that lets a user
allocate a key and enable it in the hardware.

It provides the plumbing, but it cannot be used until
the system call is implemented. The next patch will
do so.

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h |    6 ++++-
 arch/powerpc/mm/pkeys.c          |   40 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 0d00a54..652c750 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -139,10 +139,14 @@ static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 	return 0;
 }
 
+extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+				       unsigned long init_val);
 static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 					    unsigned long init_val)
 {
-	return 0;
+	if (static_branch_likely(&pkey_disabled))
+		return -EINVAL;
+	return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
 static inline void pkey_mm_init(struct mm_struct *mm)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index f3bf661..4a01c2f 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -9,6 +9,7 @@
  * (at your option) any later version.
  */
 
+#include <asm/mman.h>
 #include <linux/pkeys.h>
 
 DEFINE_STATIC_KEY_TRUE(pkey_disabled);
@@ -17,6 +18,9 @@
 u32  initial_allocation_mask;	/* Bits set for reserved keys */
 
 #define AMR_BITS_PER_PKEY 2
+#define AMR_RD_BIT 0x1UL
+#define AMR_WR_BIT 0x2UL
+#define IAMR_EX_BIT 0x1UL
 #define PKEY_REG_BITS (sizeof(u64)*8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
 
@@ -102,6 +106,20 @@ static inline void write_uamor(u64 value)
 	mtspr(SPRN_UAMOR, value);
 }
 
+static bool is_pkey_enabled(int pkey)
+{
+	u64 uamor = read_uamor();
+	u64 pkey_bits = 0x3ul << pkeyshift(pkey);
+	u64 uamor_pkey_bits = (uamor & pkey_bits);
+
+	/*
+	 * Both the bits in UAMOR corresponding to the key should be set or
+	 * reset.
+	 */
+	WARN_ON(uamor_pkey_bits && (uamor_pkey_bits != pkey_bits));
+	return !!(uamor_pkey_bits);
+}
+
 static inline void init_amr(int pkey, u8 init_bits)
 {
 	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
@@ -144,3 +162,25 @@ void __arch_deactivate_pkey(int pkey)
 {
 	pkey_status_change(pkey, false);
 }
+
+/*
+ * Set the access rights in AMR IAMR and UAMOR registers for @pkey to that
+ * specified in @init_val.
+ */
+int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+				unsigned long init_val)
+{
+	u64 new_amr_bits = 0x0ul;
+
+	if (!is_pkey_enabled(pkey))
+		return -EINVAL;
+
+	/* Set the bits we need in AMR: */
+	if (init_val & PKEY_DISABLE_ACCESS)
+		new_amr_bits |= AMR_RD_BIT | AMR_WR_BIT;
+	else if (init_val & PKEY_DISABLE_WRITE)
+		new_amr_bits |= AMR_WR_BIT;
+
+	init_amr(pkey, new_amr_bits);
+	return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
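
The translation from the generic init_val flags to the two AMR bits can be
checked in isolation. The sketch below copies the if/else chain from
__arch_set_user_pkey_access() into a hypothetical helper, amr_bits_for().
The PKEY_DISABLE_* values are those of the generic Linux uapi headers; the
AMR_*_BIT values are the ones defined in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the generic Linux uapi headers. */
#define PKEY_DISABLE_ACCESS 0x1
#define PKEY_DISABLE_WRITE  0x2

/* Values defined by this patch. */
#define AMR_RD_BIT UINT64_C(0x1)
#define AMR_WR_BIT UINT64_C(0x2)

/* Hypothetical helper: the 2-bit AMR field that corresponds to init_val. */
static uint64_t amr_bits_for(unsigned long init_val)
{
	uint64_t bits = 0;

	if (init_val & PKEY_DISABLE_ACCESS)
		bits |= AMR_RD_BIT | AMR_WR_BIT;	/* no read, no write */
	else if (init_val & PKEY_DISABLE_WRITE)
		bits |= AMR_WR_BIT;			/* read-only */

	/* The result must fit in the key's 2-bit field. */
	assert((bits & ~UINT64_C(0x3)) == 0);
	return bits;
}
```

Because PKEY_DISABLE_ACCESS is tested first, passing both flags gives the
same result as PKEY_DISABLE_ACCESS alone.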

* [PATCH v9 09/51] powerpc: ability to create execute-disabled pkeys
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

powerpc has hardware support to disable execute on a pkey.
This patch adds the ability to create execute-disabled
keys.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/uapi/asm/mman.h |    6 ++++++
 arch/powerpc/mm/pkeys.c              |   16 ++++++++++++++++
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
index e63bc37..65065ce 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -30,4 +30,10 @@
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
 
+/* Override any generic PKEY permission defines */
+#define PKEY_DISABLE_EXECUTE   0x4
+#undef PKEY_ACCESS_MASK
+#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
+				PKEY_DISABLE_WRITE  |\
+				PKEY_DISABLE_EXECUTE)
 #endif /* _UAPI_ASM_POWERPC_MMAN_H */
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 4a01c2f..3ddc13a 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -29,6 +29,14 @@ void __init pkey_initialize(void)
 	int os_reserved, i;
 
 	/*
+	 * We define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
+	 * generic defines for PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
+	 * Ensure that the bits are distinct.
+	 */
+	BUILD_BUG_ON(PKEY_DISABLE_EXECUTE &
+		     (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+
+	/*
 	 * Disable the pkey system till everything is in place. A subsequent
 	 * patch will enable it.
 	 */
@@ -171,10 +179,18 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 				unsigned long init_val)
 {
 	u64 new_amr_bits = 0x0ul;
+	u64 new_iamr_bits = 0x0ul;
 
 	if (!is_pkey_enabled(pkey))
 		return -EINVAL;
 
+	if (init_val & PKEY_DISABLE_EXECUTE) {
+		if (!pkey_execute_disable_supported)
+			return -EINVAL;
+		new_iamr_bits |= IAMR_EX_BIT;
+	}
+	init_iamr(pkey, new_iamr_bits);
+
 	/* Set the bits we need in AMR: */
 	if (init_val & PKEY_DISABLE_ACCESS)
 		new_amr_bits |= AMR_RD_BIT | AMR_WR_BIT;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
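
The execute-disable handling added here can be exercised outside the kernel
as well. In the sketch below, iamr_bits_for() is a hypothetical helper that
mirrors the new branch in __arch_set_user_pkey_access(), with the
pkey_execute_disable_supported flag passed in as a parameter. A C11
_Static_assert stands in for the BUILD_BUG_ON() distinctness check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define PKEY_DISABLE_ACCESS  0x1
#define PKEY_DISABLE_WRITE   0x2
#define PKEY_DISABLE_EXECUTE 0x4	/* powerpc addition from this patch */
#define IAMR_EX_BIT UINT64_C(0x1)

/* Compile-time version of the BUILD_BUG_ON() distinctness check. */
_Static_assert((PKEY_DISABLE_EXECUTE &
		(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)) == 0,
	       "PKEY_DISABLE_EXECUTE overlaps a generic flag");

/*
 * Hypothetical helper: compute the IAMR field for init_val.
 * Returns 0 and stores the field in *iamr_bits, or -EINVAL when
 * execute-disable is requested but not supported.
 */
static int iamr_bits_for(unsigned long init_val, bool exec_disable_supported,
			 uint64_t *iamr_bits)
{
	assert(iamr_bits);
	*iamr_bits = 0;

	if (init_val & PKEY_DISABLE_EXECUTE) {
		if (!exec_disable_supported)
			return -EINVAL;
		*iamr_bits |= IAMR_EX_BIT;
	}
	return 0;
}
```

A request without PKEY_DISABLE_EXECUTE succeeds regardless of hardware
support and yields a zero IAMR field, which matches the patch calling
init_iamr() unconditionally.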

* [PATCH v9 10/51] powerpc: store and restore the pkey state across context switches
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Store and restore the AMR, IAMR and UAMOR register state of the task
before scheduling out and after scheduling in, respectively.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mmu_context.h |    3 ++
 arch/powerpc/include/asm/pkeys.h       |    4 ++
 arch/powerpc/include/asm/processor.h   |    5 +++
 arch/powerpc/kernel/process.c          |    7 ++++
 arch/powerpc/mm/pkeys.c                |   49 +++++++++++++++++++++++++++++++-
 5 files changed, 67 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 6d7c4f1..4eccc2f 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -146,6 +146,9 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 #ifndef CONFIG_PPC_MEM_KEYS
 #define pkey_initialize()
 #define pkey_mm_init(mm)
+#define thread_pkey_regs_save(thread)
+#define thread_pkey_regs_restore(new_thread, old_thread)
+#define thread_pkey_regs_init(thread)
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 652c750..0b2d9f0 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -156,5 +156,9 @@ static inline void pkey_mm_init(struct mm_struct *mm)
 	mm_pkey_allocation_map(mm) = initial_allocation_mask;
 }
 
+extern void thread_pkey_regs_save(struct thread_struct *thread);
+extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
+				     struct thread_struct *old_thread);
+extern void thread_pkey_regs_init(struct thread_struct *thread);
 extern void pkey_initialize(void);
 #endif /*_ASM_POWERPC_KEYS_H */
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index fab7ff8..e3c417c 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -309,6 +309,11 @@ struct thread_struct {
 	struct thread_vr_state ckvr_state; /* Checkpointed VR state */
 	unsigned long	ckvrsave; /* Checkpointed VRSAVE */
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+#ifdef CONFIG_PPC_MEM_KEYS
+	unsigned long	amr;
+	unsigned long	iamr;
+	unsigned long	uamor;
+#endif
 #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
 	void*		kvm_shadow_vcpu; /* KVM internal data */
 #endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index a0c74bb..148b934 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -42,6 +42,7 @@
 #include <linux/hw_breakpoint.h>
 #include <linux/uaccess.h>
 #include <linux/elf-randomize.h>
+#include <linux/pkeys.h>
 
 #include <asm/pgtable.h>
 #include <asm/io.h>
@@ -1085,6 +1086,8 @@ static inline void save_sprs(struct thread_struct *t)
 		t->tar = mfspr(SPRN_TAR);
 	}
 #endif
+
+	thread_pkey_regs_save(t);
 }
 
 static inline void restore_sprs(struct thread_struct *old_thread,
@@ -1120,6 +1123,8 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 			mtspr(SPRN_TAR, new_thread->tar);
 	}
 #endif
+
+	thread_pkey_regs_restore(new_thread, old_thread);
 }
 
 #ifdef CONFIG_PPC_BOOK3S_64
@@ -1705,6 +1710,8 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
 	current->thread.tm_tfiar = 0;
 	current->thread.load_tm = 0;
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+
+	thread_pkey_regs_init(&current->thread);
 }
 EXPORT_SYMBOL(start_thread);
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 3ddc13a..469f370 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -16,6 +16,8 @@
 bool pkey_execute_disable_supported;
 int  pkeys_total;		/* Total pkeys as per device tree */
 u32  initial_allocation_mask;	/* Bits set for reserved keys */
+u64  pkey_amr_uamor_mask;	/* Bits in AMR/UAMOR not to be touched */
+u64  pkey_iamr_mask;		/* Bits in IAMR not to be touched */
 
 #define AMR_BITS_PER_PKEY 2
 #define AMR_RD_BIT 0x1UL
@@ -74,8 +76,16 @@ void __init pkey_initialize(void)
 	 * 	programming note.
 	 */
 	initial_allocation_mask = ~0x0;
-	for (i = 2; i < (pkeys_total - os_reserved); i++)
+
+	/* register mask is in BE format */
+	pkey_amr_uamor_mask = ~0x0ul;
+	pkey_iamr_mask = ~0x0ul;
+
+	for (i = 2; i < (pkeys_total - os_reserved); i++) {
 		initial_allocation_mask &= ~(0x1 << i);
+		pkey_amr_uamor_mask &= ~(0x3ul << pkeyshift(i));
+		pkey_iamr_mask &= ~(0x1ul << pkeyshift(i));
+	}
 }
 
 static inline u64 read_amr(void)
@@ -200,3 +210,40 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	init_amr(pkey, new_amr_bits);
 	return 0;
 }
+
+void thread_pkey_regs_save(struct thread_struct *thread)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return;
+
+	/*
+	 * TODO: Skip saving registers if @thread hasn't used any keys yet.
+	 */
+	thread->amr = read_amr();
+	thread->iamr = read_iamr();
+	thread->uamor = read_uamor();
+}
+
+void thread_pkey_regs_restore(struct thread_struct *new_thread,
+			      struct thread_struct *old_thread)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return;
+
+	/*
+	 * TODO: Just set UAMOR to zero if @new_thread hasn't used any keys yet.
+	 */
+	if (old_thread->amr != new_thread->amr)
+		write_amr(new_thread->amr);
+	if (old_thread->iamr != new_thread->iamr)
+		write_iamr(new_thread->iamr);
+	if (old_thread->uamor != new_thread->uamor)
+		write_uamor(new_thread->uamor);
+}
+
+void thread_pkey_regs_init(struct thread_struct *thread)
+{
+	write_amr(read_amr() & pkey_amr_uamor_mask);
+	write_iamr(read_iamr() & pkey_iamr_mask);
+	write_uamor(read_uamor() & pkey_amr_uamor_mask);
+}
-- 
1.7.1
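
Editorial sketch (not part of the patch): a minimal user-space model of
what the patch above does, assuming pkeyshift() is defined as in an
earlier patch of this series (64 minus (pkey + 1) * AMR_BITS_PER_PKEY).
The hw_* variables stand in for the real SPRs, and pkey_regs,
pkey_regs_save(), pkey_regs_restore() and pkey_build_masks() are
hypothetical names chosen for illustration only.

```c
#include <stdint.h>

#define AMR_BITS_PER_PKEY 2

/* Simulated special-purpose registers; the kernel uses mfspr()/mtspr(). */
static uint64_t hw_amr, hw_iamr, hw_uamor;
static unsigned int spr_writes;	/* counts simulated register writes */

struct pkey_regs {
	uint64_t amr;
	uint64_t iamr;
	uint64_t uamor;
};

/* Assumed definition: key 0 occupies the two most significant bits. */
static int pkeyshift(int pkey)
{
	return 64 - (pkey + 1) * AMR_BITS_PER_PKEY;
}

/* Scheduling out: snapshot the live registers into the thread's copy. */
static void pkey_regs_save(struct pkey_regs *t)
{
	t->amr = hw_amr;
	t->iamr = hw_iamr;
	t->uamor = hw_uamor;
}

/* Scheduling in: write a register only when the incoming value differs. */
static void pkey_regs_restore(const struct pkey_regs *new_t,
			      const struct pkey_regs *old_t)
{
	if (old_t->amr != new_t->amr) {
		hw_amr = new_t->amr;
		spr_writes++;
	}
	if (old_t->iamr != new_t->iamr) {
		hw_iamr = new_t->iamr;
		spr_writes++;
	}
	if (old_t->uamor != new_t->uamor) {
		hw_uamor = new_t->uamor;
		spr_writes++;
	}
}

/*
 * Build the "do not touch" masks: start with all bits set, then clear
 * the bits that belong to keys user space may allocate (2 .. last).
 */
static void pkey_build_masks(int pkeys_total, int os_reserved,
			     uint64_t *amr_uamor_mask, uint64_t *iamr_mask)
{
	*amr_uamor_mask = ~0ULL;
	*iamr_mask = ~0ULL;
	for (int i = 2; i < pkeys_total - os_reserved; i++) {
		*amr_uamor_mask &= ~(0x3ULL << pkeyshift(i));
		*iamr_mask &= ~(0x1ULL << pkeyshift(i));
	}
}
```

The comparison in the restore path only makes sense because the
outgoing thread's copy was saved immediately before, so it mirrors the
live register contents.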

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 11/51] powerpc: introduce execute-only pkey
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

This patch implements the execute-only pkey. The
architecture-independent layer expects the arch-dependent
layer to be able to create and enable a special key that
grants execute-only permission.

Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu.h |    1 +
 arch/powerpc/include/asm/pkeys.h         |    8 ++++-
 arch/powerpc/mm/pkeys.c                  |   56 ++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index df17fbc..44dbc91 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -116,6 +116,7 @@ struct patb_entry {
 	 * bit unset -> key available for allocation
 	 */
 	u32 pkey_allocation_map;
+	s16 execute_only_pkey; /* key holding execute-only protection */
 #endif
 } mm_context_t;
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 0b2d9f0..20d1f0e 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -128,9 +128,13 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
  * Try to dedicate one of the protection keys to be used as an
  * execute-only protection key.
  */
+extern int __execute_only_pkey(struct mm_struct *mm);
 static inline int execute_only_pkey(struct mm_struct *mm)
 {
-	return 0;
+	if (static_branch_likely(&pkey_disabled))
+		return -1;
+
+	return __execute_only_pkey(mm);
 }
 
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
@@ -154,6 +158,8 @@ static inline void pkey_mm_init(struct mm_struct *mm)
 	if (static_branch_likely(&pkey_disabled))
 		return;
 	mm_pkey_allocation_map(mm) = initial_allocation_mask;
+	/* -1 means unallocated or invalid */
+	mm->context.execute_only_pkey = -1;
 }
 
 extern void thread_pkey_regs_save(struct thread_struct *thread);
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 469f370..5da94fe 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -247,3 +247,59 @@ void thread_pkey_regs_init(struct thread_struct *thread)
 	write_iamr(read_iamr() & pkey_iamr_mask);
 	write_uamor(read_uamor() & pkey_amr_uamor_mask);
 }
+
+static inline bool pkey_allows_readwrite(int pkey)
+{
+	int pkey_shift = pkeyshift(pkey);
+
+	if (!is_pkey_enabled(pkey))
+		return true;
+
+	return !(read_amr() & ((AMR_RD_BIT|AMR_WR_BIT) << pkey_shift));
+}
+
+int __execute_only_pkey(struct mm_struct *mm)
+{
+	bool need_to_set_mm_pkey = false;
+	int execute_only_pkey = mm->context.execute_only_pkey;
+	int ret;
+
+	/* Do we need to assign a pkey for mm's execute-only maps? */
+	if (execute_only_pkey == -1) {
+		/* Go allocate one to use, which might fail */
+		execute_only_pkey = mm_pkey_alloc(mm);
+		if (execute_only_pkey < 0)
+			return -1;
+		need_to_set_mm_pkey = true;
+	}
+
+	/*
+	 * We do not want to go through the relatively costly dance to set AMR
+	 * if we do not need to. Check it first and assume that if the
+	 * execute-only pkey is readwrite-disabled then we do not have to set it
+	 * ourselves.
+	 */
+	if (!need_to_set_mm_pkey && !pkey_allows_readwrite(execute_only_pkey))
+		return execute_only_pkey;
+
+	/*
+	 * Set up AMR so that it denies access for everything other than
+	 * execution.
+	 */
+	ret = __arch_set_user_pkey_access(current, execute_only_pkey,
+					  PKEY_DISABLE_ACCESS |
+					  PKEY_DISABLE_WRITE);
+	/*
+	 * If the AMR-set operation failed somehow, just return -1 and
+	 * effectively disable execute-only support.
+	 */
+	if (ret) {
+		mm_pkey_free(mm, execute_only_pkey);
+		return -1;
+	}
+
+	/* We got one, store it and use it from here on out */
+	if (need_to_set_mm_pkey)
+		mm->context.execute_only_pkey = execute_only_pkey;
+	return execute_only_pkey;
+}
-- 
1.7.1
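
Editorial sketch (not part of the patch): a simplified user-space model
of the allocate-once, deny-read/write flow in __execute_only_pkey().
It assumes AMR_WR_BIT is 0x2 and that pkeyshift() places key 0 in the
two most significant AMR bits; neither definition appears in this
patch. The mm_model structure, the global amr variable and
pkey_alloc() are hypothetical stand-ins, and the AMR update cannot
fail here, unlike __arch_set_user_pkey_access().

```c
#include <stdbool.h>
#include <stdint.h>

#define AMR_BITS_PER_PKEY 2
#define AMR_RD_BIT 0x1ULL
#define AMR_WR_BIT 0x2ULL	/* assumed value */

struct mm_model {
	uint32_t alloc_map;	/* bit set: key reserved or in use */
	int execute_only_pkey;	/* -1: none assigned yet */
};

static uint64_t amr;	/* stand-in for the AMR register */

static int pkeyshift(int pkey)
{
	return 64 - (pkey + 1) * AMR_BITS_PER_PKEY;
}

/* Hypothetical allocator: hand out the lowest free key, or -1. */
static int pkey_alloc(struct mm_model *mm)
{
	for (int k = 0; k < 32; k++) {
		if (!(mm->alloc_map & (1u << k))) {
			mm->alloc_map |= 1u << k;
			return k;
		}
	}
	return -1;
}

/* True when neither the read-disable nor the write-disable bit is set. */
static bool pkey_allows_readwrite(int pkey)
{
	return !(amr & ((AMR_RD_BIT | AMR_WR_BIT) << pkeyshift(pkey)));
}

static int execute_only_pkey(struct mm_model *mm)
{
	int pkey = mm->execute_only_pkey;

	if (pkey == -1) {
		/* First request: dedicate a key, which might fail. */
		pkey = pkey_alloc(mm);
		if (pkey < 0)
			return -1;
		mm->execute_only_pkey = pkey;
	}

	/* Skip the register update when access is already denied. */
	if (pkey_allows_readwrite(pkey))
		amr |= (AMR_RD_BIT | AMR_WR_BIT) << pkeyshift(pkey);

	return pkey;
}
```

A second call on the same mm returns the cached key without allocating
another one, which is the property the generic mprotect() path relies
on.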

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 12/51] powerpc: ability to associate pkey to a vma
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Arch-independent code expects the arch to map a pkey
into the vma's protection bit setting. This patch
provides that ability.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mman.h  |    7 ++++++-
 arch/powerpc/include/asm/pkeys.h |   11 +++++++++++
 arch/powerpc/mm/pkeys.c          |    8 ++++++++
 3 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 30922f6..2999478 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -13,6 +13,7 @@
 
 #include <asm/cputable.h>
 #include <linux/mm.h>
+#include <linux/pkeys.h>
 #include <asm/cpu_has_feature.h>
 
 /*
@@ -22,7 +23,11 @@
 static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 		unsigned long pkey)
 {
-	return (prot & PROT_SAO) ? VM_SAO : 0;
+#ifdef CONFIG_PPC_MEM_KEYS
+	return (((prot & PROT_SAO) ? VM_SAO : 0) | pkey_to_vmflag_bits(pkey));
+#else
+	return ((prot & PROT_SAO) ? VM_SAO : 0);
+#endif
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 20d1f0e..1bd41ef 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -41,6 +41,17 @@
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
 			    VM_PKEY_BIT3 | VM_PKEY_BIT4)
 
+/* Override any generic PKEY permission defines */
+#define PKEY_DISABLE_EXECUTE   0x4
+#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS | \
+				PKEY_DISABLE_WRITE  | \
+				PKEY_DISABLE_EXECUTE)
+
+static inline u64 pkey_to_vmflag_bits(u16 pkey)
+{
+	return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
+}
+
 #define arch_max_pkey() pkeys_total
 
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 5da94fe..4d704ea 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -39,6 +39,14 @@ void __init pkey_initialize(void)
 		     (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
 
 	/*
+	 * pkey_to_vmflag_bits() assumes that the pkey bits are contiguous
+	 * in the vmaflag. Make sure that is really the case.
+	 */
+	BUILD_BUG_ON(__builtin_clzl(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT) +
+		     __builtin_popcountl(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)
+				!= (sizeof(u64) * BITS_PER_BYTE));
+
+	/*
 	 * Disable the pkey system till everything is in place. A subsequent
 	 * patch will enable it.
 	 */
-- 
1.7.1
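
Editorial sketch (not part of the patch): a user-space model of the
key-to-vm_flags packing and of the contiguity check that the
BUILD_BUG_ON() above performs. The five key bits come from
ARCH_VM_PKEY_FLAGS in the patch; the shift value of 32 is an assumption
made for this example, and vmflags_to_pkey() and
pkey_bits_contiguous() are hypothetical helper names. The builtins are
the GCC/Clang ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define VM_PKEY_SHIFT 32			/* assumed for illustration */
#define ARCH_VM_PKEY_FLAGS (0x1fULL << VM_PKEY_SHIFT)	/* five key bits */

/* Pack a key number into the vm_flags key field; excess bits are dropped. */
static uint64_t pkey_to_vmflag_bits(uint16_t pkey)
{
	return ((uint64_t)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS;
}

/* Inverse direction: extract the key number from vm_flags. */
static int vmflags_to_pkey(uint64_t vm_flags)
{
	return (int)((vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT);
}

/*
 * After shifting, a contiguous field starting at bit 0 satisfies
 * leading_zeros + set_bits == 64. Any hole in the field makes the sum
 * smaller than 64.
 */
static bool pkey_bits_contiguous(uint64_t flags, unsigned int shift)
{
	unsigned long long v = flags >> shift;

	if (v == 0)
		return false;	/* __builtin_clzll(0) is undefined */
	return __builtin_clzll(v) + __builtin_popcountll(v) == 64;
}
```

The shift-and-mask packing is only correct when the key bits are
adjacent, which is why the patch turns that assumption into a
build-time check.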

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 13/51] powerpc: implementation for arch_override_mprotect_pkey()
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Arch-independent code calls arch_override_mprotect_pkey()
to obtain the pkey that best matches the requested protection.

This patch provides the implementation.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mmu_context.h |    5 ++++
 arch/powerpc/include/asm/pkeys.h       |   21 +++++++++++++++++-
 arch/powerpc/mm/pkeys.c                |   36 ++++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 4eccc2f..a83d540 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -149,6 +149,11 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 #define thread_pkey_regs_save(thread)
 #define thread_pkey_regs_restore(new_thread, old_thread)
 #define thread_pkey_regs_init(thread)
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+	return 0;
+}
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 1bd41ef..441bbf3 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -52,6 +52,13 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
 	return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
 }
 
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return 0;
+	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
+}
+
 #define arch_max_pkey() pkeys_total
 
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
@@ -148,10 +155,22 @@ static inline int execute_only_pkey(struct mm_struct *mm)
 	return __execute_only_pkey(mm);
 }
 
+extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
+					 int prot, int pkey);
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 					      int prot, int pkey)
 {
-	return 0;
+	if (static_branch_likely(&pkey_disabled))
+		return 0;
+
+	/*
+	 * Is this an mprotect_pkey() call? If so, never override the value that
+	 * came from the user.
+	 */
+	if (pkey != -1)
+		return pkey;
+
+	return __arch_override_mprotect_pkey(vma, prot, pkey);
 }
 
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 4d704ea..f1c6195 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -311,3 +311,39 @@ int __execute_only_pkey(struct mm_struct *mm)
 		mm->context.execute_only_pkey = execute_only_pkey;
 	return execute_only_pkey;
 }
+
+static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
+{
+	/* Do this check first since the vm_flags should be hot */
+	if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+		return false;
+
+	return (vma_pkey(vma) == vma->vm_mm->context.execute_only_pkey);
+}
+
+/*
+ * This should only be called for *plain* mprotect calls.
+ */
+int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
+				  int pkey)
+{
+	/*
+	 * If the currently associated pkey is execute-only, but the requested
+	 * protection requires read or write, move it back to the default pkey.
+	 */
+	if (vma_is_pkey_exec_only(vma) && (prot & (PROT_READ | PROT_WRITE)))
+		return 0;
+
+	/*
+	 * The requested protection is execute-only. Hence let's use an
+	 * execute-only pkey.
+	 */
+	if (prot == PROT_EXEC) {
+		pkey = execute_only_pkey(vma->vm_mm);
+		if (pkey > 0)
+			return pkey;
+	}
+
+	/* Nothing to override. */
+	return vma_pkey(vma);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 13/51] powerpc: implementation for arch_override_mprotect_pkey()
@ 2017-11-06  8:57   ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

arch independent code calls arch_override_mprotect_pkey()
to return a pkey that best matches the requested protection.

This patch provides the implementation.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mmu_context.h |    5 ++++
 arch/powerpc/include/asm/pkeys.h       |   21 +++++++++++++++++-
 arch/powerpc/mm/pkeys.c                |   36 ++++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 4eccc2f..a83d540 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -149,6 +149,11 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 #define thread_pkey_regs_save(thread)
 #define thread_pkey_regs_restore(new_thread, old_thread)
 #define thread_pkey_regs_init(thread)
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+	return 0;
+}
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 1bd41ef..441bbf3 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -52,6 +52,13 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
 	return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
 }
 
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return 0;
+	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
+}
+
 #define arch_max_pkey() pkeys_total
 
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
@@ -148,10 +155,22 @@ static inline int execute_only_pkey(struct mm_struct *mm)
 	return __execute_only_pkey(mm);
 }
 
+extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
+					 int prot, int pkey);
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 					      int prot, int pkey)
 {
-	return 0;
+	if (static_branch_likely(&pkey_disabled))
+		return 0;
+
+	/*
+	 * Is this an mprotect_pkey() call? If so, never override the value that
+	 * came from the user.
+	 */
+	if (pkey != -1)
+		return pkey;
+
+	return __arch_override_mprotect_pkey(vma, prot, pkey);
 }
 
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 4d704ea..f1c6195 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -311,3 +311,39 @@ int __execute_only_pkey(struct mm_struct *mm)
 		mm->context.execute_only_pkey = execute_only_pkey;
 	return execute_only_pkey;
 }
+
+static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
+{
+	/* Do this check first since the vm_flags should be hot */
+	if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+		return false;
+
+	return (vma_pkey(vma) == vma->vm_mm->context.execute_only_pkey);
+}
+
+/*
+ * This should only be called for *plain* mprotect calls.
+ */
+int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
+				  int pkey)
+{
+	/*
+	 * If the currently associated pkey is execute-only, but the requested
+	 * protection requires read or write, move it back to the default pkey.
+	 */
+	if (vma_is_pkey_exec_only(vma) && (prot & (PROT_READ | PROT_WRITE)))
+		return 0;
+
+	/*
+	 * The requested protection is execute-only. Hence let's use an
+	 * execute-only pkey.
+	 */
+	if (prot == PROT_EXEC) {
+		pkey = execute_only_pkey(vma->vm_mm);
+		if (pkey > 0)
+			return pkey;
+	}
+
+	/* Nothing to override. */
+	return vma_pkey(vma);
+}
-- 
1.7.1
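
The key-selection order in __arch_override_mprotect_pkey() can be
sketched in userspace as below. This is only a model, not the kernel
code: struct vma_model, the P_* constants and the exec_only_pkey field
are hypothetical stand-ins, and the sketch assumes the execute-only
key has already been allocated (or is -1), whereas the patch obtains
it through execute_only_pkey().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for PROT_READ/PROT_WRITE/PROT_EXEC. */
enum { P_READ = 1, P_WRITE = 2, P_EXEC = 4 };

/* Hypothetical, flattened view of the vma and its mm. */
struct vma_model {
	int prot_flags;		/* current R/W/X permissions of the mapping */
	int pkey;		/* key currently associated with the vma */
	int exec_only_pkey;	/* the mm's execute-only key, or -1 if none */
};

/* True if the mapping is execute-only and uses the execute-only key. */
static bool vma_is_exec_only(const struct vma_model *vma)
{
	if ((vma->prot_flags & (P_READ | P_WRITE | P_EXEC)) != P_EXEC)
		return false;
	return vma->pkey == vma->exec_only_pkey;
}

/* Returns the key that mprotect() should store in the vma. */
static int override_mprotect_pkey(const struct vma_model *vma,
				  int prot, int pkey)
{
	/* A key passed by the user is never overridden. */
	if (pkey != -1)
		return pkey;

	/* Leaving execute-only: fall back to the default key 0. */
	if (vma_is_exec_only(vma) && (prot & (P_READ | P_WRITE)))
		return 0;

	/* Entering execute-only: use the execute-only key if there is one. */
	if (prot == P_EXEC && vma->exec_only_pkey > 0)
		return vma->exec_only_pkey;

	/* Nothing to override. */
	return vma->pkey;
}
```

In this model, a plain mprotect() with only P_EXEC on a mapping whose
mm has execute-only key 3 yields key 3, and a later plain mprotect()
that adds P_READ moves the mapping back to key 0.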

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 14/51] powerpc: map vma key-protection bits to pte key bits.
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Map the key protection bits of the vma to the pkey bits in
the PTE.

The PTE bits used for pkey are 3, 4, 5, 6 and 57. The first
four are the same four bits that were freed up initially in
this patch series. Remember? :-) Without those four bits
this patch wouldn't be possible.

BUT, on a 4k kernel, bits 3 and 4 could not be freed up. Remember?
Hence we have to be satisfied with bits 5, 6 and 57.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   25 ++++++++++++++++++++++++-
 arch/powerpc/include/asm/mman.h              |    6 ++++++
 arch/powerpc/include/asm/pkeys.h             |   12 ++++++++++++
 3 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 9a677cd..4c1ee6e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -39,6 +39,7 @@
 #define _RPAGE_RSV2		0x0800000000000000UL
 #define _RPAGE_RSV3		0x0400000000000000UL
 #define _RPAGE_RSV4		0x0200000000000000UL
+#define _RPAGE_RSV5		0x00040UL
 
 #define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
 #define _PAGE_PRESENT		0x8000000000000000UL	/* pte contains a translation */
@@ -58,6 +59,25 @@
 /* Max physical address bit as per radix table */
 #define _RPAGE_PA_MAX		57
 
+#ifdef CONFIG_PPC_MEM_KEYS
+#ifdef CONFIG_PPC_64K_PAGES
+#define H_PTE_PKEY_BIT0	_RPAGE_RSV1
+#define H_PTE_PKEY_BIT1	_RPAGE_RSV2
+#else /* CONFIG_PPC_64K_PAGES */
+#define H_PTE_PKEY_BIT0	0 /* _RPAGE_RSV1 is not available */
+#define H_PTE_PKEY_BIT1	0 /* _RPAGE_RSV2 is not available */
+#endif /* CONFIG_PPC_64K_PAGES */
+#define H_PTE_PKEY_BIT2	_RPAGE_RSV3
+#define H_PTE_PKEY_BIT3	_RPAGE_RSV4
+#define H_PTE_PKEY_BIT4	_RPAGE_RSV5
+#else /*  CONFIG_PPC_MEM_KEYS */
+#define H_PTE_PKEY_BIT0	0
+#define H_PTE_PKEY_BIT1	0
+#define H_PTE_PKEY_BIT2	0
+#define H_PTE_PKEY_BIT3	0
+#define H_PTE_PKEY_BIT4	0
+#endif /*  CONFIG_PPC_MEM_KEYS */
+
 /*
  * Max physical address bit we will use for now.
  *
@@ -121,13 +141,16 @@
 #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
 			 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE |	\
 			 _PAGE_SOFT_DIRTY)
+
+#define H_PTE_PKEY  (H_PTE_PKEY_BIT0 | H_PTE_PKEY_BIT1 | H_PTE_PKEY_BIT2 | \
+		     H_PTE_PKEY_BIT3 | H_PTE_PKEY_BIT4)
 /*
  * Mask of bits returned by pte_pgprot()
  */
 #define PAGE_PROT_BITS  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \
 			 H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \
 			 _PAGE_READ | _PAGE_WRITE |  _PAGE_DIRTY | _PAGE_EXEC | \
-			 _PAGE_SOFT_DIRTY)
+			 _PAGE_SOFT_DIRTY | H_PTE_PKEY)
 /*
  * We define 2 sets of base prot bits, one for basic pages (ie,
  * cacheable kernel and user pages) and one for non cacheable
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 2999478..07e3f54 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -33,7 +33,13 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 
 static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
 {
+#ifdef CONFIG_PPC_MEM_KEYS
+	return (vm_flags & VM_SAO) ?
+		__pgprot(_PAGE_SAO | vmflag_to_pte_pkey_bits(vm_flags)) :
+		__pgprot(0 | vmflag_to_pte_pkey_bits(vm_flags));
+#else
 	return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0);
+#endif
 }
 #define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 441bbf3..cfe61a9 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -52,6 +52,18 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
 	return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
 }
 
+static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return 0x0UL;
+
+	return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT4 : 0x0UL) |
+		((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT3 : 0x0UL) |
+		((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
+		((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT1 : 0x0UL) |
+		((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT0 : 0x0UL));
+}
+
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
 	if (static_branch_likely(&pkey_disabled))
-- 
1.7.1
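
As a cross-check of the bit numbers quoted in the changelog, the
sketch below converts IBM (big-endian) bit numbers into masks and
compares them with the _RPAGE_RSV* values visible in the hunk. The
IBM_BIT() macro and pkey_to_pte_bits() are hypothetical helpers, not
part of the patch. The value of _RPAGE_RSV1 is not shown in the hunk
context, so treating it as bit 3 is an assumption, as is the
assumption that VM_PKEY_BIT0 carries the least significant key bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM numbering: bit 0 is the most significant bit of the doubleword. */
#define IBM_BIT(n)	(UINT64_C(1) << (63 - (n)))

/* These masks match the _RPAGE_RSV2.._RPAGE_RSV5 defines in the hunk. */
static_assert(IBM_BIT(4)  == UINT64_C(0x0800000000000000), "_RPAGE_RSV2");
static_assert(IBM_BIT(5)  == UINT64_C(0x0400000000000000), "_RPAGE_RSV3");
static_assert(IBM_BIT(6)  == UINT64_C(0x0200000000000000), "_RPAGE_RSV4");
static_assert(IBM_BIT(57) == UINT64_C(0x0000000000000040), "_RPAGE_RSV5");

/*
 * Hypothetical helper: place a key number into the PTE the way
 * vmflag_to_pte_pkey_bits() orders the bits, i.e. the least
 * significant key bit lands in H_PTE_PKEY_BIT4 (bit 57) and the most
 * significant one in H_PTE_PKEY_BIT0 (assumed to be bit 3).
 */
static uint64_t pkey_to_pte_bits(unsigned int pkey, bool pages_64k)
{
	uint64_t bits = 0;

	if (pkey & 0x01)
		bits |= IBM_BIT(57);
	if (pkey & 0x02)
		bits |= IBM_BIT(6);
	if (pkey & 0x04)
		bits |= IBM_BIT(5);
	if (pages_64k) {
		/* Bits 3 and 4 are only available on 64K-page kernels. */
		if (pkey & 0x08)
			bits |= IBM_BIT(4);
		if (pkey & 0x10)
			bits |= IBM_BIT(3);
	}
	return bits;
}
```

With only bits 5, 6 and 57 available, a 4K-page kernel can encode
three key bits in the PTE, which is why the two upper key bits are
dropped in the pages_64k == false case of this model.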

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 15/51] powerpc: Program HPTE key protection bits
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Map the PTE protection key bits to the HPTE key protection bits
while creating HPTE entries.

Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |    5 +++++
 arch/powerpc/include/asm/mmu_context.h        |    6 ++++++
 arch/powerpc/include/asm/pkeys.h              |    9 +++++++++
 arch/powerpc/mm/hash_utils_64.c               |    1 +
 4 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 508275b..2e22357 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -90,6 +90,8 @@
 #define HPTE_R_PP0		ASM_CONST(0x8000000000000000)
 #define HPTE_R_TS		ASM_CONST(0x4000000000000000)
 #define HPTE_R_KEY_HI		ASM_CONST(0x3000000000000000)
+#define HPTE_R_KEY_BIT0		ASM_CONST(0x2000000000000000)
+#define HPTE_R_KEY_BIT1		ASM_CONST(0x1000000000000000)
 #define HPTE_R_RPN_SHIFT	12
 #define HPTE_R_RPN		ASM_CONST(0x0ffffffffffff000)
 #define HPTE_R_RPN_3_0		ASM_CONST(0x01fffffffffff000)
@@ -104,6 +106,9 @@
 #define HPTE_R_C		ASM_CONST(0x0000000000000080)
 #define HPTE_R_R		ASM_CONST(0x0000000000000100)
 #define HPTE_R_KEY_LO		ASM_CONST(0x0000000000000e00)
+#define HPTE_R_KEY_BIT2		ASM_CONST(0x0000000000000800)
+#define HPTE_R_KEY_BIT3		ASM_CONST(0x0000000000000400)
+#define HPTE_R_KEY_BIT4		ASM_CONST(0x0000000000000200)
 #define HPTE_R_KEY		(HPTE_R_KEY_LO | HPTE_R_KEY_HI)
 
 #define HPTE_V_1TB_SEG		ASM_CONST(0x4000000000000000)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index a83d540..a557735 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -154,6 +154,12 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 {
 	return 0;
 }
+
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+	return 0x0UL;
+}
+
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index cfe61a9..06a58fe 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -73,6 +73,15 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 
 #define arch_max_pkey() pkeys_total
 
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+	return (((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL) |
+		((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
+		((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+		((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
+		((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
+}
+
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
 
 #define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 1e74590..ddfc673 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -232,6 +232,7 @@ unsigned long htab_convert_pte_flags(unsigned long pteflags)
 		 */
 		rflags |= HPTE_R_M;
 
+	rflags |= pte_to_hpte_pkey_bits(pteflags);
 	return rflags;
 }
 
-- 
1.7.1
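
The translation done by pte_to_hpte_pkey_bits() can be exercised in
userspace with the sketch below. The HPTE_KEY_* masks are copied from
the mmu-hash.h hunk and the PTE masks from patch 14; the value used
for PTE_KEY_BIT0 (_RPAGE_RSV1) is an assumption, as is the 64K-page
layout. The decode helper hpte_key_to_pkey() is hypothetical and only
there to show that the split HPTE key field reassembles into the
original key number.

```c
#include <stdint.h>

/* PTE key bits, 64K-page layout. PTE_KEY_BIT0 is an assumed value. */
#define PTE_KEY_BIT0	UINT64_C(0x1000000000000000)
#define PTE_KEY_BIT1	UINT64_C(0x0800000000000000)
#define PTE_KEY_BIT2	UINT64_C(0x0400000000000000)
#define PTE_KEY_BIT3	UINT64_C(0x0200000000000000)
#define PTE_KEY_BIT4	UINT64_C(0x0000000000000040)

/* HPTE key bits, as defined in the mmu-hash.h hunk. */
#define HPTE_KEY_BIT0	UINT64_C(0x2000000000000000)
#define HPTE_KEY_BIT1	UINT64_C(0x1000000000000000)
#define HPTE_KEY_BIT2	UINT64_C(0x0000000000000800)
#define HPTE_KEY_BIT3	UINT64_C(0x0000000000000400)
#define HPTE_KEY_BIT4	UINT64_C(0x0000000000000200)
#define HPTE_KEY_HI	UINT64_C(0x3000000000000000)
#define HPTE_KEY_LO	UINT64_C(0x0000000000000e00)

/* Same bit-by-bit translation as pte_to_hpte_pkey_bits(). */
static uint64_t pte_to_hpte_key(uint64_t pteflags)
{
	return ((pteflags & PTE_KEY_BIT0) ? HPTE_KEY_BIT0 : 0) |
	       ((pteflags & PTE_KEY_BIT1) ? HPTE_KEY_BIT1 : 0) |
	       ((pteflags & PTE_KEY_BIT2) ? HPTE_KEY_BIT2 : 0) |
	       ((pteflags & PTE_KEY_BIT3) ? HPTE_KEY_BIT3 : 0) |
	       ((pteflags & PTE_KEY_BIT4) ? HPTE_KEY_BIT4 : 0);
}

/* Hypothetical decode: join the high and low key fields into 0..31. */
static unsigned int hpte_key_to_pkey(uint64_t rflags)
{
	return (unsigned int)(((rflags & HPTE_KEY_HI) >> 57) |
			      ((rflags & HPTE_KEY_LO) >> 9));
}
```

The two-bit high field and the three-bit low field together hold the
five key bits, with HPTE_KEY_BIT0 as the most significant bit and
HPTE_KEY_BIT4 as the least significant one.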

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 16/51] powerpc: helper to validate key-access permissions of a pte
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Add a helper function that checks whether read, write or execute
access is allowed on the pte.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |    4 +++
 arch/powerpc/include/asm/pkeys.h             |    9 ++++++++
 arch/powerpc/mm/pkeys.c                      |   28 ++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 4c1ee6e..c277a63 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -462,6 +462,10 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 		pte_update(mm, addr, ptep, 0, _PAGE_PRIVILEGED, 1);
 }
 
+#ifdef CONFIG_PPC_MEM_KEYS
+extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long addr, pte_t *ptep)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 06a58fe..3437a50 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -82,6 +82,15 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
 		((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
 }
 
+static inline u16 pte_to_pkey_bits(u64 pteflags)
+{
+	return (((pteflags & H_PTE_PKEY_BIT0) ? 0x10 : 0x0UL) |
+		((pteflags & H_PTE_PKEY_BIT1) ? 0x8 : 0x0UL) |
+		((pteflags & H_PTE_PKEY_BIT2) ? 0x4 : 0x0UL) |
+		((pteflags & H_PTE_PKEY_BIT3) ? 0x2 : 0x0UL) |
+		((pteflags & H_PTE_PKEY_BIT4) ? 0x1 : 0x0UL));
+}
+
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
 
 #define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index f1c6195..13902be 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -347,3 +347,31 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
 	/* Nothing to override. */
 	return vma_pkey(vma);
 }
+
+static bool pkey_access_permitted(int pkey, bool write, bool execute)
+{
+	int pkey_shift;
+	u64 amr;
+
+	if (!pkey)
+		return true;
+
+	if (!is_pkey_enabled(pkey))
+		return true;
+
+	pkey_shift = pkeyshift(pkey);
+	if (execute && !(read_iamr() & (IAMR_EX_BIT << pkey_shift)))
+		return true;
+
+	amr = read_amr(); /* Delay reading amr until absolutely needed */
+	return ((!write && !(amr & (AMR_RD_BIT << pkey_shift))) ||
+		(write &&  !(amr & (AMR_WR_BIT << pkey_shift))));
+}
+
+bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return true;
+
+	return pkey_access_permitted(pte_to_pkey_bits(pte), write, execute);
+}
-- 
1.7.1
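
The control flow of pkey_access_permitted() can be modelled in
userspace as below. The register layout is defined elsewhere in the
series and is therefore an assumption here: two bits per key, key 0 in
the two most significant bits, AMR_RD_BIT = 0x1, AMR_WR_BIT = 0x2,
IAMR_EX_BIT = 0x1, and a set bit meaning "access denied". The sketch
takes the AMR and IAMR values as parameters instead of reading the
registers, and it leaves out the is_pkey_enabled() check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout; these definitions are not part of this patch. */
#define AMR_BITS_PER_PKEY	2
#define AMR_RD_BIT		UINT64_C(0x1)
#define AMR_WR_BIT		UINT64_C(0x2)
#define IAMR_EX_BIT		UINT64_C(0x1)

/* Shift of a key's two-bit field; key 0 sits at the top of the register. */
static int pkeyshift(int pkey)
{
	return 64 - (pkey + 1) * AMR_BITS_PER_PKEY;
}

/* Mirrors the order of checks in pkey_access_permitted(). */
static bool pkey_access_permitted_model(uint64_t amr, uint64_t iamr,
					int pkey, bool write, bool execute)
{
	int shift;

	/* Key 0 is the default key and is never restricted. */
	if (pkey == 0)
		return true;

	shift = pkeyshift(pkey);
	if (execute && !(iamr & (IAMR_EX_BIT << shift)))
		return true;

	return (!write && !(amr & (AMR_RD_BIT << shift))) ||
	       (write && !(amr & (AMR_WR_BIT << shift)));
}
```

For example, with only the write-disable bit of key 1 set in the
modelled AMR, reads through key 1 are still permitted, writes through
key 1 are refused, and key 2 is unaffected.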

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 17/51] powerpc: check key protection for user page access
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Make sure that the kernel does not access user pages without
checking their key-protection.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c277a63..5ecb846 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -464,6 +464,19 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 
 #ifdef CONFIG_PPC_MEM_KEYS
 extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
+
+#define pte_access_permitted(pte, write) \
+	(pte_present(pte) && \
+	 ((!(write) || pte_write(pte)) && \
+	  arch_pte_access_permitted(pte_val(pte), !!write, 0)))
+
+/*
+ * We store key in pmd for huge tlb pages. So need to check for key protection.
+ */
+#define pmd_access_permitted(pmd, write) \
+	(pmd_present(pmd) && \
+	 ((!(write) || pmd_write(pmd)) && \
+	  arch_pte_access_permitted(pmd_val(pmd), !!write, 0)))
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-- 
1.7.1
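
The predicate built by the pte_access_permitted() macro combines three
conditions: the entry is present, a write is only allowed on a
writable entry, and the key check passes. A minimal userspace model,
with the key check reduced to two booleans, might look like this.
struct pte_model and its fields are hypothetical and only stand in for
pte_present(), pte_write() and arch_pte_access_permitted().

```c
#include <stdbool.h>

/* Hypothetical, pre-decoded view of a pte. */
struct pte_model {
	bool present;		/* stands in for pte_present() */
	bool writable;		/* stands in for pte_write() */
	bool key_allows_read;	/* key check result for a read */
	bool key_allows_write;	/* key check result for a write */
};

/* Stand-in for arch_pte_access_permitted(pte_val(pte), write, 0). */
static bool key_check(const struct pte_model *pte, bool write)
{
	return write ? pte->key_allows_write : pte->key_allows_read;
}

/* Same shape as the pte_access_permitted() macro. */
static bool pte_access_permitted_model(const struct pte_model *pte,
				       bool write)
{
	return pte->present &&
	       (!write || pte->writable) &&
	       key_check(pte, write);
}
```

In this model a writable, present entry whose key forbids writes is
still rejected for a write access, which is the case the patch adds on
top of the existing present and writable checks.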

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 18/51] powerpc: implementation for arch_vma_access_permitted()
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

This patch provides the implementation for
arch_vma_access_permitted(). It returns true if the
requested access is allowed by the pkey associated with
the vma.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mmu_context.h |    5 +++-
 arch/powerpc/mm/pkeys.c                |   34 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index a557735..95a3288 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -136,6 +136,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+#ifdef CONFIG_PPC_MEM_KEYS
+bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
+			       bool execute, bool foreign);
+#else /* CONFIG_PPC_MEM_KEYS */
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 		bool write, bool execute, bool foreign)
 {
@@ -143,7 +147,6 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 	return true;
 }
 
-#ifndef CONFIG_PPC_MEM_KEYS
 #define pkey_initialize()
 #define pkey_mm_init(mm)
 #define thread_pkey_regs_save(thread)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 13902be..3b221bd 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -375,3 +375,37 @@ bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
 
 	return pkey_access_permitted(pte_to_pkey_bits(pte), write, execute);
 }
+
+/*
+ * We only want to enforce protection keys on the current thread because we
+ * effectively have no access to AMR/IAMR for other threads or any way to tell
+ * which AMR/IAMR in a threaded process we could use.
+ *
+ * So do not enforce things if the VMA is not from the current mm, or if we are
+ * in a kernel thread.
+ */
+static inline bool vma_is_foreign(struct vm_area_struct *vma)
+{
+	if (!current->mm)
+		return true;
+
+	/* if it is not our ->mm, it has to be foreign */
+	if (current->mm != vma->vm_mm)
+		return true;
+
+	return false;
+}
+
+bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
+			       bool execute, bool foreign)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return true;
+	/*
+	 * Do not enforce our key-permissions on a foreign vma.
+	 */
+	if (foreign || vma_is_foreign(vma))
+		return true;
+
+	return pkey_access_permitted(vma_pkey(vma), write, execute);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 19/51] powerpc: Handle exceptions caused by pkey violation
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Handle Data and Instruction exceptions caused by memory
protection-key violations.

The CPU detects the key fault if the HPTE is already
programmed with the key.

However, if the page has not been hashed yet, the hardware
cannot detect the key fault. In that case the software
detects the pkey violation.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
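Note, not part of the patch: a minimal userspace sketch of how the
hardware-detected path picks the signal. It covers only the
page_fault_is_bad() branch, not the software-detected case. The name
bad_fault_model(), struct delivered and KEYFAULT_BIT_MODEL are
hypothetical; the bit value is assumed to match the kernel's
DSISR_KEYFAULT and should be treated as illustrative.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* expose BUS_OBJERR and friends */
#endif
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef SEGV_PKUERR
#define SEGV_PKUERR 4	/* Linux value; fallback for older libc headers */
#endif

/* Assumed stand-in for the key-fault bit in the fault error code. */
#define KEYFAULT_BIT_MODEL 0x00200000UL

struct delivered {	/* what would be sent to the task, if anything */
	int sig;
	int code;
};

/*
 * Returns SIGBUS without delivering anything for a kernel-mode fault.
 * For a user-mode fault it fills *out and returns 0.
 */
static int bad_fault_model(bool is_user, unsigned long error_code,
			   struct delivered *out)
{
	assert(out != NULL);
	out->sig = 0;
	out->code = 0;

	if (!is_user)
		return SIGBUS;

	if (error_code & KEYFAULT_BIT_MODEL) {
		out->sig = SIGSEGV;	/* key violation */
		out->code = SEGV_PKUERR;
	} else {
		out->sig = SIGBUS;	/* any other bad fault */
		out->code = BUS_OBJERR;
	}
	return 0;
}
```

Calling bad_fault_model(true, KEYFAULT_BIT_MODEL, &d) leaves SIGSEGV and
SEGV_PKUERR in d, while a user fault without the key bit keeps the
previous SIGBUS/BUS_OBJERR behaviour.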
 arch/powerpc/mm/fault.c |   32 +++++++++++++++++++++++++++-----
 1 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 4797d08..dfcd0e4 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -145,6 +145,24 @@ static noinline int bad_area(struct pt_regs *regs, unsigned long address)
 	return __bad_area(regs, address, SEGV_MAPERR);
 }
 
+static int bad_page_fault_exception(struct pt_regs *regs, unsigned long address,
+				    int si_code)
+{
+	int sig = SIGBUS;
+	int code = BUS_OBJERR;
+
+#ifdef CONFIG_PPC_MEM_KEYS
+	if (si_code & DSISR_KEYFAULT) {
+		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+		sig = SIGSEGV;
+		code = SEGV_PKUERR;
+	}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
+	_exception(sig, regs, code, address);
+	return 0;
+}
+
 static int do_sigbus(struct pt_regs *regs, unsigned long address,
 		     unsigned int fault)
 {
@@ -391,11 +409,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 		return 0;
 
 	if (unlikely(page_fault_is_bad(error_code))) {
-		if (is_user) {
-			_exception(SIGBUS, regs, BUS_OBJERR, address);
-			return 0;
-		}
-		return SIGBUS;
+		if (!is_user)
+			return SIGBUS;
+		return bad_page_fault_exception(regs, address, error_code);
 	}
 
 	/* Additional sanity check(s) */
@@ -498,6 +514,12 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * the fault.
 	 */
 	fault = handle_mm_fault(vma, address, flags);
+
+#ifdef CONFIG_PPC_MEM_KEYS
+	if (unlikely(fault & VM_FAULT_SIGSEGV))
+		return __bad_area(regs, address, SEGV_PKUERR);
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 	major |= fault & VM_FAULT_MAJOR;
 
 	/*
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 20/51] powerpc: introduce get_mm_addr_key() helper
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

The get_mm_addr_key() helper returns the pkey associated
with an address in a given mm_struct. It returns 0 if no
Linux PTE is found for the address.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
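Note, not part of the patch: the helper boils down to "find the PTE,
then gather the key bits out of it". The sketch below models only that
second step in userspace. The PTE_PKEY_BIT* positions are HYPOTHETICAL
and chosen for illustration; they are not the real H_PTE_PKEY_BIT*
values, and the *_model functions are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL bit positions, for illustration only. */
#define PTE_PKEY_BIT0 (UINT64_C(1) << 60)	/* most significant key bit */
#define PTE_PKEY_BIT1 (UINT64_C(1) << 59)
#define PTE_PKEY_BIT2 (UINT64_C(1) << 58)
#define PTE_PKEY_BIT3 (UINT64_C(1) << 57)
#define PTE_PKEY_BIT4 (UINT64_C(1) << 3)	/* least significant key bit */

/* Gather five scattered PTE bits into a 5-bit key number. */
static uint16_t pte_to_pkey_model(uint64_t pte)
{
	return (uint16_t)(((pte & PTE_PKEY_BIT0) ? 0x10 : 0) |
			  ((pte & PTE_PKEY_BIT1) ? 0x08 : 0) |
			  ((pte & PTE_PKEY_BIT2) ? 0x04 : 0) |
			  ((pte & PTE_PKEY_BIT3) ? 0x02 : 0) |
			  ((pte & PTE_PKEY_BIT4) ? 0x01 : 0));
}

/* Inverse mapping, used here only to build test PTE values. */
static uint64_t pkey_to_pte_bits_model(int pkey)
{
	assert(pkey >= 0 && pkey < 32);
	return ((pkey & 0x10) ? PTE_PKEY_BIT0 : 0) |
	       ((pkey & 0x08) ? PTE_PKEY_BIT1 : 0) |
	       ((pkey & 0x04) ? PTE_PKEY_BIT2 : 0) |
	       ((pkey & 0x02) ? PTE_PKEY_BIT3 : 0) |
	       ((pkey & 0x01) ? PTE_PKEY_BIT4 : 0);
}

/* ptep == NULL models "no Linux PTE found": report key 0. */
static uint16_t addr_key_model(const uint64_t *ptep)
{
	return ptep ? pte_to_pkey_model(*ptep) : 0;
}
```

The NULL case mirrors the helper's behaviour of returning key 0 when the
page-table walk finds nothing, so callers cannot tell "key 0" apart from
"no mapping" through this interface alone.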
 arch/powerpc/include/asm/mmu.h  |    9 +++++++++
 arch/powerpc/mm/hash_utils_64.c |   24 ++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 6364f5c..bb38312 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -260,6 +260,15 @@ static inline bool early_radix_enabled(void)
 }
 #endif
 
+#ifdef CONFIG_PPC_MEM_KEYS
+extern u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address);
+#else
+static inline u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address)
+{
+	return 0;
+}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 #endif /* !__ASSEMBLY__ */
 
 /* The kernel use the constants below to index in the page sizes array.
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index ddfc673..0108d12 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1575,6 +1575,30 @@ void hash_preload(struct mm_struct *mm, unsigned long ea,
 	local_irq_restore(flags);
 }
 
+#ifdef CONFIG_PPC_MEM_KEYS
+/*
+ * Return the protection key associated with the given address and the
+ * mm_struct.
+ */
+u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address)
+{
+	pte_t *ptep;
+	u16 pkey = 0;
+	unsigned long flags;
+
+	if (!mm || !mm->pgd)
+		return 0;
+
+	local_irq_save(flags);
+	ptep = find_linux_pte(mm->pgd, address, NULL, NULL);
+	if (ptep)
+		pkey = pte_to_pkey_bits(pte_val(READ_ONCE(*ptep)));
+	local_irq_restore(flags);
+
+	return pkey;
+}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 static inline void tm_flush_hash_page(int local)
 {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 21/51] powerpc: Deliver SEGV signal on pkey violation
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

The value of the pkey whose protection was violated is
made available in the si_pkey field of the siginfo
structure.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
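Note, not part of the patch: from userspace, a SIGSEGV handler installed
with SA_SIGINFO could recover the key roughly as below. This is a sketch
under assumptions: violated_pkey() is a hypothetical helper name, and
si_pkey is only read when the libc headers expose it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

#ifndef SEGV_PKUERR
#define SEGV_PKUERR 4	/* Linux value; fallback for older libc headers */
#endif

/*
 * Return the violated key for a protection-key fault.
 * Returns -1 if the signal is not a pkey fault, and -2 if it is one
 * but this libc does not expose si_pkey.
 */
static int violated_pkey(const siginfo_t *info)
{
	assert(info != NULL);

	if (info->si_signo != SIGSEGV || info->si_code != SEGV_PKUERR)
		return -1;
#ifdef si_pkey
	return (int)info->si_pkey;
#else
	return -2;
#endif
}
```

A handler would typically call violated_pkey(info) first and fall back
to its ordinary SIGSEGV handling when the result is negative.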
 arch/powerpc/include/asm/bug.h |    1 +
 arch/powerpc/kernel/traps.c    |   12 ++++++++-
 arch/powerpc/mm/fault.c        |   55 ++++++++++++++++++++++-----------------
 3 files changed, 43 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 3c04249..97c3847 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -133,6 +133,7 @@
 extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
 extern void bad_page_fault(struct pt_regs *, unsigned long, int);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
+extern void _exception_pkey(int, struct pt_regs *, int, unsigned long, int);
 extern void die(const char *, struct pt_regs *, long);
 extern bool die_will_crash(void);
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 13c9dcd..ed1c39b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -20,6 +20,7 @@
 #include <linux/sched/debug.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/pkeys.h>
 #include <linux/stddef.h>
 #include <linux/unistd.h>
 #include <linux/ptrace.h>
@@ -265,7 +266,9 @@ void user_single_step_siginfo(struct task_struct *tsk,
 	info->si_addr = (void __user *)regs->nip;
 }
 
-void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
+
+void _exception_pkey(int signr, struct pt_regs *regs, int code, unsigned long addr,
+		int key)
 {
 	siginfo_t info;
 	const char fmt32[] = KERN_INFO "%s[%d]: unhandled signal %d " \
@@ -292,9 +295,16 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
 	info.si_signo = signr;
 	info.si_code = code;
 	info.si_addr = (void __user *) addr;
+	info.si_pkey = key;
+
 	force_sig_info(signr, &info, current);
 }
 
+void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
+{
+	_exception_pkey(signr, regs, code, addr, 0);
+}
+
 void system_reset_exception(struct pt_regs *regs)
 {
 	/*
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index dfcd0e4..84523ed 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -107,7 +107,8 @@ static bool store_updates_sp(struct pt_regs *regs)
  */
 
 static int
-__bad_area_nosemaphore(struct pt_regs *regs, unsigned long address, int si_code)
+__bad_area_nosemaphore(struct pt_regs *regs, unsigned long address, int si_code,
+		int pkey)
 {
 	/*
 	 * If we are in kernel mode, bail out with a SEGV, this will
@@ -117,17 +118,18 @@ static bool store_updates_sp(struct pt_regs *regs)
 	if (!user_mode(regs))
 		return SIGSEGV;
 
-	_exception(SIGSEGV, regs, si_code, address);
+	_exception_pkey(SIGSEGV, regs, si_code, address, pkey);
 
 	return 0;
 }
 
 static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long address)
 {
-	return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
+	return __bad_area_nosemaphore(regs, address, SEGV_MAPERR, 0);
 }
 
-static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
+static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
+			int pkey)
 {
 	struct mm_struct *mm = current->mm;
 
@@ -137,30 +139,18 @@ static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
 	 */
 	up_read(&mm->mmap_sem);
 
-	return __bad_area_nosemaphore(regs, address, si_code);
+	return __bad_area_nosemaphore(regs, address, si_code, pkey);
 }
 
 static noinline int bad_area(struct pt_regs *regs, unsigned long address)
 {
-	return __bad_area(regs, address, SEGV_MAPERR);
+	return __bad_area(regs, address, SEGV_MAPERR, 0);
 }
 
-static int bad_page_fault_exception(struct pt_regs *regs, unsigned long address,
-				    int si_code)
+static int bad_key_fault_exception(struct pt_regs *regs, unsigned long address,
+				    int pkey)
 {
-	int sig = SIGBUS;
-	int code = BUS_OBJERR;
-
-#ifdef CONFIG_PPC_MEM_KEYS
-	if (si_code & DSISR_KEYFAULT) {
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
-		sig = SIGSEGV;
-		code = SEGV_PKUERR;
-	}
-#endif /* CONFIG_PPC_MEM_KEYS */
-
-	_exception(sig, regs, code, address);
-	return 0;
+	return __bad_area_nosemaphore(regs, address, SEGV_PKUERR, pkey);
 }
 
 static int do_sigbus(struct pt_regs *regs, unsigned long address,
@@ -411,7 +401,16 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (unlikely(page_fault_is_bad(error_code))) {
 		if (!is_user)
 			return SIGBUS;
-		return bad_page_fault_exception(regs, address, error_code);
+
+		if (error_code & DSISR_KEYFAULT) {
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs,
+					address);
+			return bad_key_fault_exception(regs, address,
+				 get_mm_addr_key(current->mm, address));
+		}
+
+		_exception_pkey(SIGBUS, regs, BUS_OBJERR, address, 0);
+		return 0;
 	}
 
 	/* Additional sanity check(s) */
@@ -516,8 +515,16 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 	fault = handle_mm_fault(vma, address, flags);
 
 #ifdef CONFIG_PPC_MEM_KEYS
-	if (unlikely(fault & VM_FAULT_SIGSEGV))
-		return __bad_area(regs, address, SEGV_PKUERR);
+	if (unlikely(fault & VM_FAULT_SIGSEGV)) {
+		/*
+		 * The PGD-PDT...PMD-PTE tree may not have been fully setup.
+		 * Hence we cannot walk the tree to locate the PTE, to locate
+		 * the key. Hence lets use vma_pkey() to get the key; instead
+		 * of get_mm_addr_key().
+		 */
+		up_read(&current->mm->mmap_sem);
+		return bad_key_fault_exception(regs, address, vma_pkey(vma));
+	}
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 	major |= fault & VM_FAULT_MAJOR;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 22/51] powerpc/ptrace: Add memory protection key regset
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

From: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>

The AMR, IAMR and UAMOR registers are part of the program
context. Allow them to be accessed via ptrace and through
core files. Only the AMR can be written, and only in the
bits that UAMOR permits.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
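Note, not part of the patch: the AMR update performed by pkey_set() is a
plain masked merge. A minimal userspace sketch, where
amr_masked_update() is a hypothetical name for the expression used in
the patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Take the bits selected by uamor from new_amr and keep every other
 * bit from old_amr.
 */
static uint64_t amr_masked_update(uint64_t old_amr, uint64_t new_amr,
				  uint64_t uamor)
{
	uint64_t result = (new_amr & uamor) | (old_amr & ~uamor);

	/* Bits outside the UAMOR mask must be unchanged. */
	assert(((result ^ old_amr) & ~uamor) == 0);
	return result;
}
```

With uamor == 0 the tracer's write has no effect, and with all bits set
in uamor the new value replaces the old one completely.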
 arch/powerpc/include/asm/pkeys.h    |    5 +++
 arch/powerpc/include/uapi/asm/elf.h |    1 +
 arch/powerpc/kernel/ptrace.c        |   66 +++++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/traps.c         |    7 ++++
 include/uapi/linux/elf.h            |    1 +
 5 files changed, 80 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 3437a50..9ee4731 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -213,6 +213,11 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
+static inline bool arch_pkeys_enabled(void)
+{
+	return !static_branch_likely(&pkey_disabled);
+}
+
 static inline void pkey_mm_init(struct mm_struct *mm)
 {
 	if (static_branch_likely(&pkey_disabled))
diff --git a/arch/powerpc/include/uapi/asm/elf.h b/arch/powerpc/include/uapi/asm/elf.h
index 5f201d4..860c592 100644
--- a/arch/powerpc/include/uapi/asm/elf.h
+++ b/arch/powerpc/include/uapi/asm/elf.h
@@ -97,6 +97,7 @@
 #define ELF_NTMSPRREG	3	/* include tfhar, tfiar, texasr */
 #define ELF_NEBB	3	/* includes ebbrr, ebbhr, bescr */
 #define ELF_NPMU	5	/* includes siar, sdar, sier, mmcr2, mmcr0 */
+#define ELF_NPKEY	3	/* includes amr, iamr, uamor */
 
 typedef unsigned long elf_greg_t64;
 typedef elf_greg_t64 elf_gregset_t64[ELF_NGREG];
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index f52ad5b..3718a04 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -35,6 +35,7 @@
 #include <linux/context_tracking.h>
 
 #include <linux/uaccess.h>
+#include <linux/pkeys.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/switch_to.h>
@@ -1775,6 +1776,61 @@ static int pmu_set(struct task_struct *target,
 	return ret;
 }
 #endif
+
+#ifdef CONFIG_PPC_MEM_KEYS
+static int pkey_active(struct task_struct *target,
+		       const struct user_regset *regset)
+{
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	return regset->n;
+}
+
+static int pkey_get(struct task_struct *target,
+		    const struct user_regset *regset,
+		    unsigned int pos, unsigned int count,
+		    void *kbuf, void __user *ubuf)
+{
+	BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));
+	BUILD_BUG_ON(TSO(iamr) + sizeof(unsigned long) != TSO(uamor));
+
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+				   &target->thread.amr, 0,
+				   ELF_NPKEY * sizeof(unsigned long));
+}
+
+static int pkey_set(struct task_struct *target,
+		      const struct user_regset *regset,
+		      unsigned int pos, unsigned int count,
+		      const void *kbuf, const void __user *ubuf)
+{
+	u64 new_amr;
+	int ret;
+
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	/* Only the AMR can be set from userspace */
+	if (pos != 0 || count != sizeof(new_amr))
+		return -EINVAL;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+				 &new_amr, 0, sizeof(new_amr));
+	if (ret)
+		return ret;
+
+	/* UAMOR determines which bits of the AMR can be set from userspace. */
+	target->thread.amr = (new_amr & target->thread.uamor) |
+		(target->thread.amr & ~target->thread.uamor);
+
+	return 0;
+}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 /*
  * These are our native regset flavors.
  */
@@ -1809,6 +1865,9 @@ enum powerpc_regset {
 	REGSET_EBB,		/* EBB registers */
 	REGSET_PMR,		/* Performance Monitor Registers */
 #endif
+#ifdef CONFIG_PPC_MEM_KEYS
+	REGSET_PKEY,		/* AMR register */
+#endif
 };
 
 static const struct user_regset native_regsets[] = {
@@ -1914,6 +1973,13 @@ enum powerpc_regset {
 		.active = pmu_active, .get = pmu_get, .set = pmu_set
 	},
 #endif
+#ifdef CONFIG_PPC_MEM_KEYS
+	[REGSET_PKEY] = {
+		.core_note_type = NT_PPC_PKEY, .n = ELF_NPKEY,
+		.size = sizeof(u64), .align = sizeof(u64),
+		.active = pkey_active, .get = pkey_get, .set = pkey_set
+	},
+#endif
 };
 
 static const struct user_regset_view user_ppc_native_view = {
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index ed1c39b..f449dc5 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -291,6 +291,13 @@ void _exception_pkey(int signr, struct pt_regs *regs, int code, unsigned long ad
 		local_irq_enable();
 
 	current->thread.trap_nr = code;
+
+	/*
+	 * Save all the pkey registers (AMR/IAMR/UAMOR), e.g. so that a core
+	 * dump can capture their contents if the task gets killed.
+	 */
+	thread_pkey_regs_save(&current->thread);
+
 	memset(&info, 0, sizeof(info));
 	info.si_signo = signr;
 	info.si_code = code;
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index c58627c..c017818 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -396,6 +396,7 @@
 #define NT_PPC_TM_CTAR	0x10d		/* TM checkpointed Target Address Register */
 #define NT_PPC_TM_CPPR	0x10e		/* TM checkpointed Program Priority Register */
 #define NT_PPC_TM_CDSCR	0x10f		/* TM checkpointed Data Stream Control Register */
+#define NT_PPC_PKEY	0x110		/* Memory Protection Keys registers */
 #define NT_386_TLS	0x200		/* i386 TLS slots (struct user_desc) */
 #define NT_386_IOPERM	0x201		/* x86 io permission bitmap (1=deny) */
 #define NT_X86_XSTATE	0x202		/* x86 extended state using xsave */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 22/51] powerpc/ptrace: Add memory protection key regset
@ 2017-11-06  8:57   ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

From: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>

The AMR, IAMR and UAMOR registers are part of the program context.
Allow them to be accessed via ptrace and through core files.
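
As a rough illustration of the rule pkey_set() applies, the userspace
sketch below mirrors the three-register layout and the UAMOR merge. The
struct name pkey_regs and the helper merge_amr() are hypothetical and not
part of this patch; only the register order and the merge expression are
taken from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative mirror of the registers exported by the regset, in the
 * order given by the ELF_NPKEY comment (amr, iamr, uamor). The struct
 * name is an assumption made for this sketch.
 */
struct pkey_regs {
	uint64_t amr;
	uint64_t iamr;
	uint64_t uamor;
};

/* Same contiguity requirement that the BUILD_BUG_ON()s in pkey_get() check. */
static_assert(offsetof(struct pkey_regs, iamr) ==
	      offsetof(struct pkey_regs, amr) + sizeof(uint64_t),
	      "iamr must follow amr");
static_assert(offsetof(struct pkey_regs, uamor) ==
	      offsetof(struct pkey_regs, iamr) + sizeof(uint64_t),
	      "uamor must follow iamr");

/*
 * Merge a requested AMR value into the current one: bits set in UAMOR are
 * taken from the request, all other bits keep their current value.
 */
static uint64_t merge_amr(uint64_t cur_amr, uint64_t new_amr, uint64_t uamor)
{
	return (new_amr & uamor) | (cur_amr & ~uamor);
}
```

With a UAMOR of zero the request is ignored entirely, and with all bits
set the request replaces the AMR outright.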

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h    |    5 +++
 arch/powerpc/include/uapi/asm/elf.h |    1 +
 arch/powerpc/kernel/ptrace.c        |   66 +++++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/traps.c         |    7 ++++
 include/uapi/linux/elf.h            |    1 +
 5 files changed, 80 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 3437a50..9ee4731 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -213,6 +213,11 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
+static inline bool arch_pkeys_enabled(void)
+{
+	return !static_branch_likely(&pkey_disabled);
+}
+
 static inline void pkey_mm_init(struct mm_struct *mm)
 {
 	if (static_branch_likely(&pkey_disabled))
diff --git a/arch/powerpc/include/uapi/asm/elf.h b/arch/powerpc/include/uapi/asm/elf.h
index 5f201d4..860c592 100644
--- a/arch/powerpc/include/uapi/asm/elf.h
+++ b/arch/powerpc/include/uapi/asm/elf.h
@@ -97,6 +97,7 @@
 #define ELF_NTMSPRREG	3	/* include tfhar, tfiar, texasr */
 #define ELF_NEBB	3	/* includes ebbrr, ebbhr, bescr */
 #define ELF_NPMU	5	/* includes siar, sdar, sier, mmcr2, mmcr0 */
+#define ELF_NPKEY	3	/* includes amr, iamr, uamor */
 
 typedef unsigned long elf_greg_t64;
 typedef elf_greg_t64 elf_gregset_t64[ELF_NGREG];
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index f52ad5b..3718a04 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -35,6 +35,7 @@
 #include <linux/context_tracking.h>
 
 #include <linux/uaccess.h>
+#include <linux/pkeys.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/switch_to.h>
@@ -1775,6 +1776,61 @@ static int pmu_set(struct task_struct *target,
 	return ret;
 }
 #endif
+
+#ifdef CONFIG_PPC_MEM_KEYS
+static int pkey_active(struct task_struct *target,
+		       const struct user_regset *regset)
+{
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	return regset->n;
+}
+
+static int pkey_get(struct task_struct *target,
+		    const struct user_regset *regset,
+		    unsigned int pos, unsigned int count,
+		    void *kbuf, void __user *ubuf)
+{
+	BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));
+	BUILD_BUG_ON(TSO(iamr) + sizeof(unsigned long) != TSO(uamor));
+
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+				   &target->thread.amr, 0,
+				   ELF_NPKEY * sizeof(unsigned long));
+}
+
+static int pkey_set(struct task_struct *target,
+		    const struct user_regset *regset,
+		    unsigned int pos, unsigned int count,
+		    const void *kbuf, const void __user *ubuf)
+{
+	u64 new_amr;
+	int ret;
+
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	/* Only the AMR can be set from userspace */
+	if (pos != 0 || count != sizeof(new_amr))
+		return -EINVAL;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+				 &new_amr, 0, sizeof(new_amr));
+	if (ret)
+		return ret;
+
+	/* UAMOR determines which bits of the AMR can be set from userspace. */
+	target->thread.amr = (new_amr & target->thread.uamor) |
+		(target->thread.amr & ~target->thread.uamor);
+
+	return 0;
+}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 /*
  * These are our native regset flavors.
  */
@@ -1809,6 +1865,9 @@ enum powerpc_regset {
 	REGSET_EBB,		/* EBB registers */
 	REGSET_PMR,		/* Performance Monitor Registers */
 #endif
+#ifdef CONFIG_PPC_MEM_KEYS
+	REGSET_PKEY,		/* AMR, IAMR and UAMOR registers */
+#endif
 };
 
 static const struct user_regset native_regsets[] = {
@@ -1914,6 +1973,13 @@ enum powerpc_regset {
 		.active = pmu_active, .get = pmu_get, .set = pmu_set
 	},
 #endif
+#ifdef CONFIG_PPC_MEM_KEYS
+	[REGSET_PKEY] = {
+		.core_note_type = NT_PPC_PKEY, .n = ELF_NPKEY,
+		.size = sizeof(u64), .align = sizeof(u64),
+		.active = pkey_active, .get = pkey_get, .set = pkey_set
+	},
+#endif
 };
 
 static const struct user_regset_view user_ppc_native_view = {
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index ed1c39b..f449dc5 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -291,6 +291,13 @@ void _exception_pkey(int signr, struct pt_regs *regs, int code, unsigned long ad
 		local_irq_enable();
 
 	current->thread.trap_nr = code;
+
+	/*
+	 * Save all the pkey registers (AMR/IAMR/UAMOR), e.g. so that a core
+	 * dump can capture their contents if the task gets killed.
+	 */
+	thread_pkey_regs_save(&current->thread);
+
 	memset(&info, 0, sizeof(info));
 	info.si_signo = signr;
 	info.si_code = code;
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index c58627c..c017818 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -396,6 +396,7 @@
 #define NT_PPC_TM_CTAR	0x10d		/* TM checkpointed Target Address Register */
 #define NT_PPC_TM_CPPR	0x10e		/* TM checkpointed Program Priority Register */
 #define NT_PPC_TM_CDSCR	0x10f		/* TM checkpointed Data Stream Control Register */
+#define NT_PPC_PKEY	0x110		/* Memory Protection Keys registers */
 #define NT_386_TLS	0x200		/* i386 TLS slots (struct user_desc) */
 #define NT_386_IOPERM	0x201		/* x86 io permission bitmap (1=deny) */
 #define NT_X86_XSTATE	0x202		/* x86 extended state using xsave */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 23/51] powerpc: Enable pkey subsystem
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

PAPR defines the 'ibm,processor-storage-keys' property. It exports two
values. The first value holds the number of data-access keys and the
second holds the number of instruction-access keys. Due to a bug in
the firmware, the number of instruction-access keys is always reported
as zero. However, any key can be configured to disable data access
and/or execution access. The unavailability of the second value is not
a big handicap, though it could have been used to determine whether the
platform supports disabling execution access.
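
The property handling described above can be sketched in plain C as
follows. This is only an illustration under stated assumptions: the
helper names cell_to_cpu() and parse_storage_keys() are hypothetical,
and the raw property is modelled as a byte buffer rather than a
flattened device-tree node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit device-tree cell on any host byte order. */
static uint32_t cell_to_cpu(const uint8_t *cell)
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
}

/*
 * Hypothetical helper: 'prop' is the raw property value and 'len' its
 * length in bytes. A missing cell reads as zero, which matches the
 * length checks in check_cpu_pkey_feature().
 */
static void parse_storage_keys(const uint8_t *prop, size_t len,
			       int *total_data, int *total_execute)
{
	size_t cells = len / sizeof(uint32_t);

	assert(total_data && total_execute);
	*total_data = (cells >= 1) ? (int)cell_to_cpu(prop) : 0;
	*total_execute = (cells >= 2) ? (int)cell_to_cpu(prop + 4) : 0;
}
```

A property of 32 data keys and 0 instruction keys, which is what the
firmware bug produces, decodes to total_data = 32 and total_execute = 0.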

Non-PAPR platforms do not define this property in the device tree yet.
For those platforms, we hardcode the CPUs that support pkeys by
consulting PowerISA 3.0.
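
The hardcoding amounts to two feature bits in the per-CPU feature masks.
The sketch below reproduces only the pkey-related bits for the CPU
families whose masks are named in this patch; the enum, the table and
model_has_feature() are hypothetical stand-ins for the kernel's
CPU_FTRS_* macros and cpu_has_feature().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as added to cputable.h by this patch. */
#define CPU_FTR_PKEY		0x2000000000000000ULL
#define CPU_FTR_PKEY_EXECUTE	0x8000000000000000ULL

/* Hypothetical model index, used only by this sketch. */
enum cpu_model { POWER6, POWER7, POWER8, POWER9 };

/* Only the pkey bits of each mask; all other features are omitted. */
static const uint64_t pkey_features[] = {
	[POWER6] = CPU_FTR_PKEY,
	[POWER7] = CPU_FTR_PKEY,
	[POWER8] = CPU_FTR_PKEY | CPU_FTR_PKEY_EXECUTE,
	[POWER9] = CPU_FTR_PKEY | CPU_FTR_PKEY_EXECUTE,
};

/* True if every bit in 'ftr' is present in the model's mask. */
static bool model_has_feature(enum cpu_model model, uint64_t ftr)
{
	return (pkey_features[model] & ftr) == ftr;
}
```

In this table POWER6 and POWER7 support keys for data access only, while
POWER8 and POWER9 additionally advertise execute-disable support.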

This patch calculates the number of keys supported by the platform.
It also determines whether the platform supports read, write and
execute access control through pkeys.
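
The enable decision made in pkey_initialize() can be modelled as a pure
function. This is a sketch, not the kernel code: struct pkey_platform
and pkeys_usable() are hypothetical, and max_keys stands in for the
upper bound the kernel derives from the VM flag bits, whose value is not
shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs pkey_initialize() consults. */
struct pkey_platform {
	bool is_lpar;		/* firmware_has_feature(FW_FEATURE_LPAR) */
	bool cpu_ftr_pkey;	/* cpu_has_feature(CPU_FTR_PKEY) */
	bool radix;		/* radix_enabled() */
	int dt_keys;		/* first cell of the property, 0 if absent */
	int max_keys;		/* assumed upper bound from the VM flag bits */
};

/* Returns the number of usable keys, or 0 if pkeys stay disabled. */
static int pkeys_usable(const struct pkey_platform *p)
{
	/* Assume 32 keys when the device tree gave no count. */
	int total = p->dt_keys ? p->dt_keys : 32;
	bool mmu_ok;

	assert(p->max_keys >= 0);
	if (total > p->max_keys)
		total = p->max_keys;

	/*
	 * As in the hunk, the LPAR check sees the count after the default
	 * has been applied; bare metal relies on the CPU feature bit.
	 */
	mmu_ok = p->is_lpar ? (total != 0) : p->cpu_ftr_pkey;

	if (!mmu_ok || p->radix || total == 0)
		return 0;
	return total;
}
```

Radix always disables the feature in this model, regardless of what the
device tree or the CPU feature bits report.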

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/cputable.h    |   15 +++++++++----
 arch/powerpc/include/asm/mmu_context.h |    1 +
 arch/powerpc/include/asm/pkeys.h       |   10 +++++++++
 arch/powerpc/kernel/prom.c             |   18 +++++++++++++++++
 arch/powerpc/mm/pkeys.c                |   33 +++++++++++++++++++++----------
 5 files changed, 61 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index 53b31c2..b288735 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -215,7 +215,9 @@ enum {
 #define CPU_FTR_DAWR			LONG_ASM_CONST(0x0400000000000000)
 #define CPU_FTR_DABRX			LONG_ASM_CONST(0x0800000000000000)
 #define CPU_FTR_PMAO_BUG		LONG_ASM_CONST(0x1000000000000000)
+#define CPU_FTR_PKEY			LONG_ASM_CONST(0x2000000000000000)
 #define CPU_FTR_POWER9_DD1		LONG_ASM_CONST(0x4000000000000000)
+#define CPU_FTR_PKEY_EXECUTE		LONG_ASM_CONST(0x8000000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -436,7 +438,8 @@ enum {
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_PURR | \
-	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_DABRX)
+	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_DABRX | \
+	    CPU_FTR_PKEY)
 #define CPU_FTRS_POWER6 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -444,7 +447,7 @@ enum {
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
 	    CPU_FTR_DSCR | CPU_FTR_UNALIGNED_LD_STD | \
 	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_CFAR | \
-	    CPU_FTR_DABRX)
+	    CPU_FTR_DABRX | CPU_FTR_PKEY)
 #define CPU_FTRS_POWER7 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -453,7 +456,7 @@ enum {
 	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
 	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
 	    CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | \
-	    CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX)
+	    CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX | CPU_FTR_PKEY)
 #define CPU_FTRS_POWER8 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -463,7 +466,8 @@ enum {
 	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
 	    CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
 	    CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-	    CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP)
+	    CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_PKEY |\
+	    CPU_FTR_PKEY_EXECUTE)
 #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
 #define CPU_FTRS_POWER8_DD1 (CPU_FTRS_POWER8 & ~CPU_FTR_DBELL)
 #define CPU_FTRS_POWER9 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
@@ -475,7 +479,8 @@ enum {
 	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
 	    CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
 	    CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-	    CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300)
+	    CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | \
+	    CPU_FTR_PKEY | CPU_FTR_PKEY_EXECUTE)
 #define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \
 			     (~CPU_FTR_SAO))
 #define CPU_FTRS_CELL	(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 95a3288..5a15d37 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -152,6 +152,7 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 #define thread_pkey_regs_save(thread)
 #define thread_pkey_regs_restore(new_thread, old_thread)
 #define thread_pkey_regs_init(thread)
+#define pkey_mmu_values(total_data, total_execute)
 
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 9ee4731..333fb28 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -13,6 +13,7 @@
 #define _ASM_POWERPC_KEYS_H
 
 #include <linux/jump_label.h>
+#include <asm/firmware.h>
 
 DECLARE_STATIC_KEY_TRUE(pkey_disabled);
 extern int pkeys_total; /* total pkeys as per device tree */
@@ -227,6 +228,15 @@ static inline void pkey_mm_init(struct mm_struct *mm)
 	mm->context.execute_only_pkey = -1;
 }
 
+static inline void pkey_mmu_values(int total_data, int total_execute)
+{
+	/*
+	 * Since any pkey can be used for data or execute, we will just treat
+	 * all keys as equal and track them as one entity.
+	 */
+	pkeys_total = total_data;
+}
+
 extern void thread_pkey_regs_save(struct thread_struct *thread);
 extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
 				     struct thread_struct *old_thread);
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index f830562..8b75e9b 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -35,6 +35,7 @@
 #include <linux/of_fdt.h>
 #include <linux/libfdt.h>
 #include <linux/cpu.h>
+#include <linux/pkeys.h>
 
 #include <asm/prom.h>
 #include <asm/rtas.h>
@@ -228,6 +229,22 @@ static void __init check_cpu_pa_features(unsigned long node)
 		      ibm_pa_features, ARRAY_SIZE(ibm_pa_features));
 }
 
+static void __init check_cpu_pkey_feature(unsigned long node)
+{
+	const __be32 *ftrs;
+	int len, total_data, total_execute;
+
+	ftrs = of_get_flat_dt_prop(node, "ibm,processor-storage-keys", &len);
+	if (ftrs == NULL)
+		return;
+
+	len /= sizeof(int);
+	total_execute = (len >= 2) ? be32_to_cpu(ftrs[1]) : 0;
+	total_data = (len >= 1) ? be32_to_cpu(ftrs[0]) : 0;
+	pkey_mmu_values(total_data, total_execute);
+}
+
+
 #ifdef CONFIG_PPC_STD_MMU_64
 static void __init init_mmu_slb_size(unsigned long node)
 {
@@ -391,6 +408,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node,
 
 		check_cpu_feature_properties(node);
 		check_cpu_pa_features(node);
+		check_cpu_pkey_feature(node);
 	}
 
 	identical_pvr_fixup(node);
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 3b221bd..5047371 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -26,6 +26,14 @@
 #define PKEY_REG_BITS (sizeof(u64)*8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
 
+static inline bool pkey_mmu_enabled(void)
+{
+	if (firmware_has_feature(FW_FEATURE_LPAR))
+		return pkeys_total;
+	else
+		return cpu_has_feature(CPU_FTR_PKEY);
+}
+
 void __init pkey_initialize(void)
 {
 	int os_reserved, i;
@@ -46,14 +54,9 @@ void __init pkey_initialize(void)
 		     __builtin_popcountl(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)
 				!= (sizeof(u64) * BITS_PER_BYTE));
 
-	/*
-	 * Disable the pkey system till everything is in place. A subsequent
-	 * patch will enable it.
-	 */
-	static_branch_enable(&pkey_disabled);
-
-	/* Lets assume 32 keys */
-	pkeys_total = 32;
+	/* Let's assume 32 keys if we are not told the number of pkeys. */
+	if (!pkeys_total)
+		pkeys_total = 32;
 
 	/*
 	 * Adjust the upper limit, based on the number of bits supported by
@@ -62,11 +65,19 @@ void __init pkey_initialize(void)
 	pkeys_total = min_t(int, pkeys_total,
 			(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT));
 
+	if (!pkey_mmu_enabled() || radix_enabled() || !pkeys_total)
+		static_branch_enable(&pkey_disabled);
+	else
+		static_branch_disable(&pkey_disabled);
+
+	if (static_branch_likely(&pkey_disabled))
+		return;
+
 	/*
-	 * Disable execute_disable support for now. A subsequent patch will
-	 * enable it.
+	 * The device tree cannot be relied on for execute_disable support.
+	 * Hence we depend on CPU FTR.
 	 */
-	pkey_execute_disable_supported = false;
+	pkey_execute_disable_supported = cpu_has_feature(CPU_FTR_PKEY_EXECUTE);
 
 #ifdef CONFIG_PPC_4K_PAGES
 	/*
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 24/51] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Finally, this patch provides the ability for a process to
allocate and free a protection key.
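
From userspace the two calls can be exercised roughly as follows. This
is a minimal sketch: the wrapper names are hypothetical, the fallback
numbers 384 and 385 are the powerpc values assigned by this patch and
are applied only when building for powerpc against older headers, and a
failed allocation is treated as "keys unavailable" rather than an error.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback for libc headers that predate the new powerpc numbers. */
#if defined(__powerpc__) && !defined(SYS_pkey_alloc)
#define SYS_pkey_alloc	384
#endif
#if defined(__powerpc__) && !defined(SYS_pkey_free)
#define SYS_pkey_free	385
#endif

/* Hypothetical thin wrapper; returns the key or -1 with errno set. */
static long raw_pkey_alloc(unsigned long flags, unsigned long init_rights)
{
#ifdef SYS_pkey_alloc
	return syscall(SYS_pkey_alloc, flags, init_rights);
#else
	(void)flags;
	(void)init_rights;
	errno = ENOSYS;
	return -1;
#endif
}

/* Hypothetical thin wrapper; returns 0 or -1 with errno set. */
static long raw_pkey_free(long pkey)
{
#ifdef SYS_pkey_free
	return syscall(SYS_pkey_free, pkey);
#else
	(void)pkey;
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Allocate a key and release it again. Returns 0 on success, 1 if no key
 * could be allocated (no kernel or hardware support, or none free) and
 * -1 if freeing an allocated key failed.
 */
static int pkey_alloc_free_cycle(void)
{
	long pkey = raw_pkey_alloc(0, 0);

	if (pkey < 0)
		return 1;
	return raw_pkey_free(pkey) == 0 ? 0 : -1;
}
```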

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/systbl.h      |    2 ++
 arch/powerpc/include/asm/unistd.h      |    4 +---
 arch/powerpc/include/uapi/asm/unistd.h |    2 ++
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 449912f..dea4a95 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -389,3 +389,5 @@
 COMPAT_SYS_SPU(pwritev2)
 SYSCALL(kexec_file_load)
 SYSCALL(statx)
+SYSCALL(pkey_alloc)
+SYSCALL(pkey_free)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index 9ba11db..e0273bc 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,13 +12,11 @@
 #include <uapi/asm/unistd.h>
 
 
-#define NR_syscalls		384
+#define NR_syscalls		386
 
 #define __NR__exit __NR_exit
 
 #define __IGNORE_pkey_mprotect
-#define __IGNORE_pkey_alloc
-#define __IGNORE_pkey_free
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index df8684f..5db4385 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -395,5 +395,7 @@
 #define __NR_pwritev2		381
 #define __NR_kexec_file_load	382
 #define __NR_statx		383
+#define __NR_pkey_alloc		384
+#define __NR_pkey_free		385
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 25/51] powerpc: sys_pkey_mprotect() system call
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

This patch provides the ability for a process to
associate a pkey with an address range.
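
A userspace caller might use the new system call along these lines. The
function name tag_page_with_pkey() is hypothetical, the fallback numbers
are the powerpc values assigned in this series and are applied only when
building for powerpc against older headers, and the sketch reports
"keys unavailable" instead of failing when no key can be allocated.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for libc headers that predate the new powerpc numbers. */
#if defined(__powerpc__) && !defined(SYS_pkey_alloc)
#define SYS_pkey_alloc		384
#endif
#if defined(__powerpc__) && !defined(SYS_pkey_free)
#define SYS_pkey_free		385
#endif
#if defined(__powerpc__) && !defined(SYS_pkey_mprotect)
#define SYS_pkey_mprotect	386
#endif

/*
 * Map one anonymous page, allocate a key and associate the key with the
 * page. Returns 0 on success, 1 if keys are unavailable and -1 on any
 * other failure.
 */
static int tag_page_with_pkey(void)
{
#if defined(SYS_pkey_alloc) && defined(SYS_pkey_free) && \
	defined(SYS_pkey_mprotect)
	long page = sysconf(_SC_PAGESIZE);
	long pkey;
	int ret = 1;
	void *addr;

	assert(page > 0);
	addr = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return -1;

	pkey = syscall(SYS_pkey_alloc, 0UL, 0UL);
	if (pkey >= 0) {
		long rc = syscall(SYS_pkey_mprotect, addr,
				  (unsigned long)page,
				  (unsigned long)(PROT_READ | PROT_WRITE),
				  pkey);

		ret = (rc == 0) ? 0 : -1;
		syscall(SYS_pkey_free, pkey);
	}

	munmap(addr, (size_t)page);
	return ret;
#else
	return 1;
#endif
}
```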

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/systbl.h      |    1 +
 arch/powerpc/include/asm/unistd.h      |    4 +---
 arch/powerpc/include/uapi/asm/unistd.h |    1 +
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index dea4a95..d61f9c9 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -391,3 +391,4 @@
 SYSCALL(statx)
 SYSCALL(pkey_alloc)
 SYSCALL(pkey_free)
+SYSCALL(pkey_mprotect)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index e0273bc..daf1ba9 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,12 +12,10 @@
 #include <uapi/asm/unistd.h>
 
 
-#define NR_syscalls		386
+#define NR_syscalls		387
 
 #define __NR__exit __NR_exit
 
-#define __IGNORE_pkey_mprotect
-
 #ifndef __ASSEMBLY__
 
 #include <linux/types.h>
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 5db4385..389c36f 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -397,5 +397,6 @@
 #define __NR_statx		383
 #define __NR_pkey_alloc		384
 #define __NR_pkey_free		385
+#define __NR_pkey_mprotect	386
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 26/51] powerpc: add sys_pkey_modify() system call
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

sys_pkey_modify() is a powerpc-specific system call. It lets
an application modify *any* attribute of a key.

Since powerpc disallows modification of the IAMR from user space,
an application cannot otherwise change a key's execute attribute.

This system call makes that possible.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
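The checks that sys_pkey_modify() performs in the pkeys.c hunk can be
modelled in user space as below. This is a sketch under assumptions: the
demo_ names, the bitmap-based struct demo_mm, the 32-key limit and the
composition of the access mask are hypothetical (the patch only refers to
PKEY_ACCESS_MASK by name), the value chosen for the execute bit is assumed,
and locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-key restriction bits; the value of the execute bit is an assumption. */
#define DEMO_PKEY_DISABLE_ACCESS	0x1UL
#define DEMO_PKEY_DISABLE_WRITE		0x2UL
#define DEMO_PKEY_DISABLE_EXECUTE	0x4UL

/* Assumed composition of the mask; the patch only shows its name. */
#define DEMO_PKEY_ACCESS_MASK \
	(DEMO_PKEY_DISABLE_ACCESS | DEMO_PKEY_DISABLE_WRITE | \
	 DEMO_PKEY_DISABLE_EXECUTE)

#define DEMO_MAX_PKEY 32

/* Hypothetical per-process state: allocation bitmap plus per-key rights. */
struct demo_mm {
	uint32_t alloc_map;
	unsigned long rights[DEMO_MAX_PKEY];
};

/* A key counts as allocated only if it is in range and its bit is set. */
static bool demo_pkey_is_allocated(const struct demo_mm *mm, int pkey)
{
	if (pkey < 0 || pkey >= DEMO_MAX_PKEY)
		return false;
	return (mm->alloc_map >> pkey) & 1u;
}

/* Same order of checks as the patch: value first, then allocation. */
long demo_pkey_modify(struct demo_mm *mm, int pkey, unsigned long new_val)
{
	assert(mm != NULL);

	/* Reject any bit outside the supported restriction bits. */
	if (new_val & ~DEMO_PKEY_ACCESS_MASK)
		return -EINVAL;

	/* Only keys the process has allocated may be modified. */
	if (!demo_pkey_is_allocated(mm, pkey))
		return -EINVAL;

	mm->rights[pkey] = new_val;
	return 0;
}
```

In this model both failure paths return -EINVAL, as in the patch, so a
caller cannot tell an unsupported value from an unallocated key by the
error code alone.
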
 arch/powerpc/include/asm/systbl.h      |    1 +
 arch/powerpc/include/asm/unistd.h      |    2 +-
 arch/powerpc/include/uapi/asm/unistd.h |    1 +
 arch/powerpc/kernel/entry_64.S         |    9 +++++++++
 arch/powerpc/mm/pkeys.c                |   17 +++++++++++++++++
 5 files changed, 29 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index d61f9c9..533cdc5 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -392,3 +392,4 @@
 SYSCALL(pkey_alloc)
 SYSCALL(pkey_free)
 SYSCALL(pkey_mprotect)
+PPC64ONLY(pkey_modify)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index daf1ba9..1e97086 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,7 +12,7 @@
 #include <uapi/asm/unistd.h>
 
 
-#define NR_syscalls		387
+#define NR_syscalls		388
 
 #define __NR__exit __NR_exit
 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 389c36f..318cd79 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -398,5 +398,6 @@
 #define __NR_pkey_alloc		384
 #define __NR_pkey_free		385
 #define __NR_pkey_mprotect	386
+#define __NR_pkey_modify	387
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 4a0fd4f..47c85f9 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -455,6 +455,15 @@ _GLOBAL(ppc_switch_endian)
 	bl	sys_switch_endian
 	b	.Lsyscall_exit
 
+_GLOBAL(ppc_pkey_modify)
+	bl	save_nvgprs
+#ifdef  CONFIG_PPC_MEM_KEYS
+	bl	sys_pkey_modify
+#else
+	bl	sys_ni_syscall
+#endif
+	b	.Lsyscall_exit
+
 _GLOBAL(ret_from_fork)
 	bl	schedule_tail
 	REST_NVGPRS(r1)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 5047371..2612f61 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -420,3 +420,20 @@ bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
 
 	return pkey_access_permitted(vma_pkey(vma), write, execute);
 }
+
+long sys_pkey_modify(int pkey, unsigned long new_val)
+{
+	bool ret;
+	/* Check for unsupported init values */
+	if (new_val & ~PKEY_ACCESS_MASK)
+		return -EINVAL;
+
+	down_write(&current->mm->mmap_sem);
+	ret = mm_pkey_is_allocated(current->mm, pkey);
+	up_write(&current->mm->mmap_sem);
+
+	if (!ret)
+		return -EINVAL;
+
+	return __arch_set_user_pkey_access(current, pkey, new_val);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 27/51] mm, x86 : introduce arch_pkeys_enabled()
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Arch-neutral code needs to know whether the architecture supports
protection keys in order to display the protection key in smaps.
Hence introduce arch_pkeys_enabled().

This patch also provides the x86 implementation of
arch_pkeys_enabled().

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
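The pattern used here, a generic stub that an architecture replaces with a
real implementation, can be sketched in a single user-space file. The names
below are hypothetical: DEMO_ARCH_HAS_PKEYS stands in for
CONFIG_ARCH_HAS_PKEYS, and demo_cpu_has_ospke for the
boot_cpu_has(X86_FEATURE_OSPKE) test.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Build with -DDEMO_ARCH_HAS_PKEYS to model an architecture that
 * supplies its own implementation; without it the stub is used.
 */
#ifdef DEMO_ARCH_HAS_PKEYS
/* Hypothetical stand-in for the CPU feature test on x86. */
static bool demo_cpu_has_ospke = true;

bool demo_arch_pkeys_enabled(void)
{
	return demo_cpu_has_ospke;
}
#else
/* Generic fallback, mirroring the stub added to include/linux/pkeys.h. */
static inline bool demo_arch_pkeys_enabled(void)
{
	return false;
}
#endif

/* Arch-neutral caller: compiles unchanged in both configurations. */
const char *demo_pkeys_status(void)
{
	return demo_arch_pkeys_enabled() ? "enabled" : "disabled";
}
```

The caller never tests the configuration macro itself, which is the point
of the helper: arch-neutral code asks one question and each configuration
answers it.
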
 arch/x86/include/asm/pkeys.h |    1 +
 arch/x86/kernel/fpu/xstate.c |    5 +++++
 include/linux/pkeys.h        |    5 +++++
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index a0ba1ff..f6c287b 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -6,6 +6,7 @@
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
+extern bool arch_pkeys_enabled(void);
 
 /*
  * Try to dedicate one of the protection keys to be used as an
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index f1d5476..a43db74 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -942,6 +942,11 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 
 	return 0;
 }
+
+bool arch_pkeys_enabled(void)
+{
+	return boot_cpu_has(X86_FEATURE_OSPKE);
+}
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 /*
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 0794ca7..3ca2e44 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -35,6 +35,11 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	return 0;
 }
 
+static inline bool arch_pkeys_enabled(void)
+{
+	return false;
+}
+
 static inline void copy_init_pkru_to_fpregs(void)
 {
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 28/51] mm: display pkey in smaps if arch_pkeys_enabled() is true
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Currently the architecture-specific code is expected to
display the protection key in smaps for a given vma.
This can lead to redundant code and possibly to divergent
formats in which the key gets displayed.

This patch moves the display into arch-neutral code. The
pkey is shown only if the architecture supports pkeys.

The x86 arch_show_smap() function is no longer needed.
Delete it.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
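Since the format string moves to one place, a consumer of
/proc/<pid>/smaps can rely on a single line layout. The sketch below
reproduces the "ProtectionKey:  %8u\n" format from the task_mmu.c hunk and
parses it back; the demo_ function names are hypothetical and the code runs
in user space, not in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Format one line with the same format string the hunk passes to seq_printf(). */
int demo_format_pkey_line(char *buf, size_t len, unsigned int pkey)
{
	assert(buf != NULL);
	return snprintf(buf, len, "ProtectionKey:  %8u\n", pkey);
}

/*
 * Parse such a line. The blank in the scanf format matches any run of
 * whitespace, so the padding produced by %8u does not matter.
 * Returns true and stores the key on success.
 */
bool demo_parse_pkey_line(const char *line, unsigned int *pkey)
{
	assert(line != NULL && pkey != NULL);
	return sscanf(line, "ProtectionKey: %u", pkey) == 1;
}
```

A tool scanning smaps would call demo_parse_pkey_line() on every line and
ignore the ones for which it returns false; on a system where
arch_pkeys_enabled() is false no such line is emitted at all.
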
 arch/x86/kernel/setup.c |    8 --------
 fs/proc/task_mmu.c      |   11 ++++++-----
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 0957dd7..b8b8d0e 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1357,11 +1357,3 @@ static int __init register_kernel_offset_dumper(void)
 	return 0;
 }
 __initcall(register_kernel_offset_dumper);
-
-void arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
-{
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
-		return;
-
-	seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
-}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index fad19a0..5ce3ec0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -18,6 +18,7 @@
 #include <linux/page_idle.h>
 #include <linux/shmem_fs.h>
 #include <linux/uaccess.h>
+#include <linux/pkeys.h>
 
 #include <asm/elf.h>
 #include <asm/tlb.h>
@@ -731,10 +732,6 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 }
 #endif /* HUGETLB_PAGE */
 
-void __weak arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
-{
-}
-
 static int show_smap(struct seq_file *m, void *v, int is_pid)
 {
 	struct proc_maps_private *priv = m->private;
@@ -854,9 +851,13 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 			   (unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
 
 	if (!rollup_mode) {
-		arch_show_smap(m, vma);
+#ifdef CONFIG_ARCH_HAS_PKEYS
+		if (arch_pkeys_enabled())
+			seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
+#endif
 		show_smap_vma_flags(m, vma);
 	}
+
 	m_cache_vma(m, vma);
 	return ret;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

From: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>

Expose useful information for programs using memory protection keys.
Provide implementations for powerpc and x86.

On a powerpc system with pkeys support, here is what is shown:

$ head /sys/kernel/mm/protection_keys/*
==> /sys/kernel/mm/protection_keys/disable_access_supported <==
true

==> /sys/kernel/mm/protection_keys/disable_execute_supported <==
true

==> /sys/kernel/mm/protection_keys/disable_write_supported <==
true

==> /sys/kernel/mm/protection_keys/total_keys <==
31

==> /sys/kernel/mm/protection_keys/usable_keys <==
27

And on an x86 system without pkeys support:

$ head /sys/kernel/mm/protection_keys/*
==> /sys/kernel/mm/protection_keys/disable_access_supported <==
false

==> /sys/kernel/mm/protection_keys/disable_execute_supported <==
false

==> /sys/kernel/mm/protection_keys/disable_write_supported <==
false

==> /sys/kernel/mm/protection_keys/total_keys <==
1

==> /sys/kernel/mm/protection_keys/usable_keys <==
0

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
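A program could consume the files shown above roughly as follows. This is a
user-space sketch under assumptions: the demo_ helpers are hypothetical, and
the only facts taken from the patch are the directory
/sys/kernel/mm/protection_keys and the "true\n" / "false\n" contents of the
*_supported files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Parse the text of a *_supported file; returns 0 on success, -1 otherwise. */
int demo_parse_bool(const char *text, bool *out)
{
	assert(text != NULL && out != NULL);

	if (strcmp(text, "true\n") == 0)
		*out = true;
	else if (strcmp(text, "false\n") == 0)
		*out = false;
	else
		return -1;
	return 0;
}

/* Read the first line of a file into buf; returns 0 on success, -1 on failure. */
int demo_read_attr(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fgets(buf, (int)len, f) != NULL;
	fclose(f);
	return ok ? 0 : -1;
}

/* True only if the file exists, is readable, and contains "true". */
bool demo_pkey_cap_supported(const char *path)
{
	char buf[16];
	bool val;

	return demo_read_attr(path, buf, sizeof(buf)) == 0 &&
	       demo_parse_bool(buf, &val) == 0 && val;
}
```

Calling demo_pkey_cap_supported() with
"/sys/kernel/mm/protection_keys/disable_execute_supported" treats a missing
file the same as "false", so the caller degrades gracefully on a kernel that
does not carry this patch.
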
 arch/powerpc/include/asm/pkeys.h   |    2 +
 arch/powerpc/mm/pkeys.c            |   24 ++++++++++
 arch/x86/include/asm/mmu_context.h |    4 +-
 arch/x86/include/asm/pkeys.h       |    1 +
 arch/x86/mm/pkeys.c                |    9 ++++
 include/linux/pkeys.h              |    2 +-
 mm/mprotect.c                      |   88 ++++++++++++++++++++++++++++++++++++
 7 files changed, 128 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 333fb28..6d70b1a 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -237,6 +237,8 @@ static inline void pkey_mmu_values(int total_data, int total_execute)
 	pkeys_total = total_data;
 }
 
+extern bool arch_supports_pkeys(int cap);
+extern unsigned int arch_usable_pkeys(void);
 extern void thread_pkey_regs_save(struct thread_struct *thread);
 extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
 				     struct thread_struct *old_thread);
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 2612f61..7e8468f 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -421,6 +421,30 @@ bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
 	return pkey_access_permitted(vma_pkey(vma), write, execute);
 }
 
+unsigned int arch_usable_pkeys(void)
+{
+	unsigned int reserved;
+
+	if (static_branch_likely(&pkey_disabled))
+		return 0;
+
+	/* Reserve one more to account for the execute-only pkey. */
+	reserved = hweight32(initial_allocation_mask) + 1;
+
+	return pkeys_total > reserved ? pkeys_total - reserved : 0;
+}
+
+bool arch_supports_pkeys(int cap)
+{
+	if (static_branch_likely(&pkey_disabled))
+		return false;
+
+	if (cap & PKEY_DISABLE_EXECUTE)
+		return pkey_execute_disable_supported;
+
+	return (cap & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+}
+
 long sys_pkey_modify(int pkey, unsigned long new_val)
 {
 	bool ret;
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 6699fc4..e3efabb 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -129,6 +129,8 @@ static inline void switch_ldt(struct mm_struct *prev, struct mm_struct *next)
 
 void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk);
 
+#define PKEY_INITIAL_ALLOCATION_MAP	1
+
 static inline int init_new_context(struct task_struct *tsk,
 				   struct mm_struct *mm)
 {
@@ -138,7 +140,7 @@ static inline int init_new_context(struct task_struct *tsk,
 	#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
 	if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
 		/* pkey 0 is the default and always allocated */
-		mm->context.pkey_allocation_map = 0x1;
+		mm->context.pkey_allocation_map = PKEY_INITIAL_ALLOCATION_MAP;
 		/* -1 means unallocated or invalid */
 		mm->context.execute_only_pkey = -1;
 	}
diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index f6c287b..6807288 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -106,5 +106,6 @@ extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
 extern void copy_init_pkru_to_fpregs(void);
+extern unsigned int arch_usable_pkeys(void);
 
 #endif /*_ASM_X86_PKEYS_H */
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index d7bc0ee..3083a59 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -122,6 +122,15 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot, int pkey
 	return vma_pkey(vma);
 }
 
+unsigned int arch_usable_pkeys(void)
+{
+	/* Reserve one more to account for the execute-only pkey. */
+	unsigned int reserved = (boot_cpu_has(X86_FEATURE_OSPKE) ?
+			hweight32(PKEY_INITIAL_ALLOCATION_MAP) : 0) + 1;
+
+	return arch_max_pkey() > reserved ? arch_max_pkey() - reserved : 0;
+}
+
 #define PKRU_AD_KEY(pkey)	(PKRU_AD_BIT << ((pkey) * PKRU_BITS_PER_PKEY))
 
 /*
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 3ca2e44..0784f20 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -11,6 +11,7 @@
 #define arch_max_pkey() (1)
 #define execute_only_pkey(mm) (0)
 #define arch_override_mprotect_pkey(vma, prot, pkey) (0)
+#define arch_usable_pkeys() (0)
 #define PKEY_DEDICATED_EXECUTE_ONLY 0
 #define ARCH_VM_PKEY_FLAGS 0
 
@@ -43,7 +44,6 @@ static inline bool arch_pkeys_enabled(void)
 static inline void copy_init_pkru_to_fpregs(void)
 {
 }
-
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 #endif /* _LINUX_PKEYS_H */
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ec39f73..43a4584 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -568,4 +568,92 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 	return ret;
 }
 
+#ifdef CONFIG_SYSFS
+
+#define PKEYS_ATTR_RO(_name)						\
+	static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
+
+static ssize_t total_keys_show(struct kobject *kobj,
+			       struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", arch_max_pkey());
+}
+PKEYS_ATTR_RO(total_keys);
+
+static ssize_t usable_keys_show(struct kobject *kobj,
+				struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", arch_usable_pkeys());
+}
+PKEYS_ATTR_RO(usable_keys);
+
+static ssize_t disable_access_supported_show(struct kobject *kobj,
+					      struct kobj_attribute *attr,
+					      char *buf)
+{
+	if (arch_pkeys_enabled()) {
+		strcpy(buf, "true\n");
+		return sizeof("true\n") - 1;
+	}
+
+	strcpy(buf, "false\n");
+	return sizeof("false\n") - 1;
+}
+PKEYS_ATTR_RO(disable_access_supported);
+
+static ssize_t disable_write_supported_show(struct kobject *kobj,
+					     struct kobj_attribute *attr,
+					     char *buf)
+{
+	if (arch_pkeys_enabled()) {
+		strcpy(buf, "true\n");
+		return sizeof("true\n") - 1;
+	}
+
+	strcpy(buf, "false\n");
+	return sizeof("false\n") - 1;
+}
+PKEYS_ATTR_RO(disable_write_supported);
+
+static ssize_t disable_execute_supported_show(struct kobject *kobj,
+					      struct kobj_attribute *attr,
+					      char *buf)
+{
+#ifdef PKEY_DISABLE_EXECUTE
+	if (arch_supports_pkeys(PKEY_DISABLE_EXECUTE)) {
+		strcpy(buf, "true\n");
+		return sizeof("true\n") - 1;
+	}
+#endif
+
+	strcpy(buf, "false\n");
+	return sizeof("false\n") - 1;
+}
+PKEYS_ATTR_RO(disable_execute_supported);
+
+static struct attribute *pkeys_attrs[] = {
+	&total_keys_attr.attr,
+	&usable_keys_attr.attr,
+	&disable_access_supported_attr.attr,
+	&disable_write_supported_attr.attr,
+	&disable_execute_supported_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group pkeys_attr_group = {
+	.attrs = pkeys_attrs,
+	.name = "protection_keys",
+};
+
+static int __init pkeys_sysfs_init(void)
+{
+	int err;
+
+	err = sysfs_create_group(mm_kobj, &pkeys_attr_group);
+
+	return err;
+}
+late_initcall(pkeys_sysfs_init);
+#endif /* CONFIG_SYSFS */
+
 #endif /* CONFIG_ARCH_HAS_PKEYS */
-- 
1.7.1
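
The usable_keys values in the sample output can be checked against the
arithmetic in arch_usable_pkeys(). The sketch below restates that arithmetic
in user space; the demo_ names are hypothetical, and the mask values are
assumptions chosen only for their bit counts, since the patch does not show
the contents of initial_allocation_mask.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count, standing in for the kernel's hweight32(). */
static unsigned int demo_hweight32(uint32_t w)
{
	unsigned int n = 0;

	for (; w; w &= w - 1)	/* clear the lowest set bit each round */
		n++;
	return n;
}

/*
 * total:        number of keys the architecture reports
 * initial_mask: keys counted as reserved before user space allocates any
 */
unsigned int demo_usable_pkeys(unsigned int total, uint32_t initial_mask)
{
	unsigned int reserved;

	/* A 32-bit mask can describe at most 32 keys. */
	assert(total <= 32);

	/* One extra key is set aside for the execute-only pkey. */
	reserved = demo_hweight32(initial_mask) + 1;

	return total > reserved ? total - reserved : 0;
}
```

With total = 31 and three bits set in the mask, reserved is 4 and the result
is 27, matching the powerpc output in the commit message. With total = 1 and
an empty mask, reserved is 1 and the result is 0, matching the x86 output
without pkeys support. Assuming 16 keys and only pkey 0 preallocated, the
same formula gives 14.
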

^ permalink raw reply related	[flat|nested] 197+ messages in thread

diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index d7bc0ee..3083a59 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -122,6 +122,15 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot, int pkey
 	return vma_pkey(vma);
 }
 
+unsigned int arch_usable_pkeys(void)
+{
+	/* Reserve one more to account for the execute-only pkey. */
+	unsigned int reserved = (boot_cpu_has(X86_FEATURE_OSPKE) ?
+			hweight32(PKEY_INITIAL_ALLOCATION_MAP) : 0) + 1;
+
+	return arch_max_pkey() > reserved ? arch_max_pkey() - reserved : 0;
+}
+
 #define PKRU_AD_KEY(pkey)	(PKRU_AD_BIT << ((pkey) * PKRU_BITS_PER_PKEY))
 
 /*
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 3ca2e44..0784f20 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -11,6 +11,7 @@
 #define arch_max_pkey() (1)
 #define execute_only_pkey(mm) (0)
 #define arch_override_mprotect_pkey(vma, prot, pkey) (0)
+#define arch_usable_pkeys() (0)
 #define PKEY_DEDICATED_EXECUTE_ONLY 0
 #define ARCH_VM_PKEY_FLAGS 0
 
@@ -43,7 +44,6 @@ static inline bool arch_pkeys_enabled(void)
 static inline void copy_init_pkru_to_fpregs(void)
 {
 }
-
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 #endif /* _LINUX_PKEYS_H */
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ec39f73..43a4584 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -568,4 +568,92 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 	return ret;
 }
 
+#ifdef CONFIG_SYSFS
+
+#define PKEYS_ATTR_RO(_name)						\
+	static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
+
+static ssize_t total_keys_show(struct kobject *kobj,
+			       struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", arch_max_pkey());
+}
+PKEYS_ATTR_RO(total_keys);
+
+static ssize_t usable_keys_show(struct kobject *kobj,
+				struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", arch_usable_pkeys());
+}
+PKEYS_ATTR_RO(usable_keys);
+
+static ssize_t disable_access_supported_show(struct kobject *kobj,
+					      struct kobj_attribute *attr,
+					      char *buf)
+{
+	if (arch_pkeys_enabled()) {
+		strcpy(buf, "true\n");
+		return sizeof("true\n") - 1;
+	}
+
+	strcpy(buf, "false\n");
+	return sizeof("false\n") - 1;
+}
+PKEYS_ATTR_RO(disable_access_supported);
+
+static ssize_t disable_write_supported_show(struct kobject *kobj,
+					     struct kobj_attribute *attr,
+					     char *buf)
+{
+	if (arch_pkeys_enabled()) {
+		strcpy(buf, "true\n");
+		return sizeof("true\n") - 1;
+	}
+
+	strcpy(buf, "false\n");
+	return sizeof("false\n") - 1;
+}
+PKEYS_ATTR_RO(disable_write_supported);
+
+static ssize_t disable_execute_supported_show(struct kobject *kobj,
+					      struct kobj_attribute *attr,
+					      char *buf)
+{
+#ifdef PKEY_DISABLE_EXECUTE
+	if (arch_supports_pkeys(PKEY_DISABLE_EXECUTE)) {
+		strcpy(buf, "true\n");
+		return sizeof("true\n") - 1;
+	}
+#endif
+
+	strcpy(buf, "false\n");
+	return sizeof("false\n") - 1;
+}
+PKEYS_ATTR_RO(disable_execute_supported);
+
+static struct attribute *pkeys_attrs[] = {
+	&total_keys_attr.attr,
+	&usable_keys_attr.attr,
+	&disable_access_supported_attr.attr,
+	&disable_write_supported_attr.attr,
+	&disable_execute_supported_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group pkeys_attr_group = {
+	.attrs = pkeys_attrs,
+	.name = "protection_keys",
+};
+
+static int __init pkeys_sysfs_init(void)
+{
+	int err;
+
+	err = sysfs_create_group(mm_kobj, &pkeys_attr_group);
+
+	return err;
+}
+late_initcall(pkeys_sysfs_init);
+#endif /* CONFIG_SYSFS */
+
 #endif /* CONFIG_ARCH_HAS_PKEYS */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 30/51] Documentation/x86: Move protection key documentation to arch neutral directory
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Since PowerPC and Intel both support memory protection keys, move
the documentation to an arch-neutral directory.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 Documentation/vm/protection-keys.txt  |   85 +++++++++++++++++++++++++++++++++
 Documentation/x86/protection-keys.txt |   85 ---------------------------------
 2 files changed, 85 insertions(+), 85 deletions(-)
 create mode 100644 Documentation/vm/protection-keys.txt
 delete mode 100644 Documentation/x86/protection-keys.txt
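
The Behavior section of the moved document says that si_code tells a
protection-key fault apart from a plain mprotect() fault. The sketch below
shows one way a process might inspect that field. It only triggers the
SEGV_ACCERR case (a PROT_NONE page), so it runs on systems without pkeys.
The helper names are hypothetical. Note that the Linux uapi headers spell the
pkey constant SEGV_PKUERR; the fallback value 4 is taken from those headers.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef SEGV_PKUERR
#define SEGV_PKUERR 4	/* value used by the Linux uapi siginfo header */
#endif

static sigjmp_buf fault_env;
static volatile sig_atomic_t fault_code;

/* Record why the fault happened, then unwind past the faulting access. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	fault_code = info->si_code;
	siglongjmp(fault_env, 1);
}

/* Map a SIGSEGV si_code to a short description. */
static const char *segv_cause(int si_code)
{
	switch (si_code) {
	case SEGV_PKUERR:
		return "protection key";
	case SEGV_ACCERR:
		return "page permissions";
	case SEGV_MAPERR:
		return "unmapped address";
	default:
		return "other";
	}
}

/* Write to a PROT_NONE page; return the si_code seen, or -1 on failure. */
static int probe_fault_code(void)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	volatile char *ptr;
	int code = -1;

	ptr = mmap(NULL, page, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (ptr == MAP_FAILED)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = segv_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old)) {
		munmap((void *)ptr, page);
		return -1;
	}

	if (sigsetjmp(fault_env, 1) == 0)
		ptr[0] = 1;		/* faults: the page is PROT_NONE */
	else
		code = fault_code;	/* reached via siglongjmp() */

	sigaction(SIGSEGV, &old, NULL);
	munmap((void *)ptr, page);
	return code;
}
```

A caller could print segv_cause(probe_fault_code()). On a pkey-protected
mapping the same handler would be expected to see SEGV_PKUERR instead.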

diff --git a/Documentation/vm/protection-keys.txt b/Documentation/vm/protection-keys.txt
new file mode 100644
index 0000000..fa46dcb
--- /dev/null
+++ b/Documentation/vm/protection-keys.txt
@@ -0,0 +1,85 @@
+Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature
+which will be found on future Intel CPUs.
+
+Memory Protection Keys provides a mechanism for enforcing page-based
+protections, but without requiring modification of the page tables
+when an application changes protection domains.  It works by
+dedicating 4 previously ignored bits in each page table entry to a
+"protection key", giving 16 possible keys.
+
+There is also a new user-accessible register (PKRU) with two separate
+bits (Access Disable and Write Disable) for each key.  Being a CPU
+register, PKRU is inherently thread-local, potentially giving each
+thread a different set of protections from every other thread.
+
+There are two new instructions (RDPKRU/WRPKRU) for reading and writing
+to the new register.  The feature is only available in 64-bit mode,
+even though there is theoretically space in the PAE PTEs.  These
+permissions are enforced on data access only and have no effect on
+instruction fetches.
+
+=========================== Syscalls ===========================
+
+There are 3 system calls which directly interact with pkeys:
+
+	int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
+	int pkey_free(int pkey);
+	int pkey_mprotect(unsigned long start, size_t len,
+			  unsigned long prot, int pkey);
+
+Before a pkey can be used, it must first be allocated with
+pkey_alloc().  An application calls the WRPKRU instruction
+directly in order to change access permissions to memory covered
+with a key.  In this example WRPKRU is wrapped by a C function
+called pkey_set().
+
+	int real_prot = PROT_READ|PROT_WRITE;
+	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
+	ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
+	... application runs here
+
+Now, if the application needs to update the data at 'ptr', it can
+gain access, do the update, then remove its write access:
+
+	pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
+	*ptr = foo; // assign something
+	pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again
+
+Now when it frees the memory, it will also free the pkey since it
+is no longer in use:
+
+	munmap(ptr, PAGE_SIZE);
+	pkey_free(pkey);
+
+(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
+ An example implementation can be found in
+ tools/testing/selftests/x86/protection_keys.c)
+
+=========================== Behavior ===========================
+
+The kernel attempts to make protection keys consistent with the
+behavior of a plain mprotect().  For instance if you do this:
+
+	mprotect(ptr, size, PROT_NONE);
+	something(ptr);
+
+you can expect the same effects with protection keys when doing this:
+
+	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_READ);
+	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
+	something(ptr);
+
+That should be true whether something() is a direct access to 'ptr'
+like:
+
+	*ptr = foo;
+
+or when the kernel does the access on the application's behalf like
+with a read():
+
+	read(fd, ptr, 1);
+
+The kernel will send a SIGSEGV in both cases, but si_code will be set
+to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
+the plain mprotect() permissions are violated.
diff --git a/Documentation/x86/protection-keys.txt b/Documentation/x86/protection-keys.txt
deleted file mode 100644
index fa46dcb..0000000
--- a/Documentation/x86/protection-keys.txt
+++ /dev/null
@@ -1,85 +0,0 @@
-Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature
-which will be found on future Intel CPUs.
-
-Memory Protection Keys provides a mechanism for enforcing page-based
-protections, but without requiring modification of the page tables
-when an application changes protection domains.  It works by
-dedicating 4 previously ignored bits in each page table entry to a
-"protection key", giving 16 possible keys.
-
-There is also a new user-accessible register (PKRU) with two separate
-bits (Access Disable and Write Disable) for each key.  Being a CPU
-register, PKRU is inherently thread-local, potentially giving each
-thread a different set of protections from every other thread.
-
-There are two new instructions (RDPKRU/WRPKRU) for reading and writing
-to the new register.  The feature is only available in 64-bit mode,
-even though there is theoretically space in the PAE PTEs.  These
-permissions are enforced on data access only and have no effect on
-instruction fetches.
-
-=========================== Syscalls ===========================
-
-There are 3 system calls which directly interact with pkeys:
-
-	int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
-	int pkey_free(int pkey);
-	int pkey_mprotect(unsigned long start, size_t len,
-			  unsigned long prot, int pkey);
-
-Before a pkey can be used, it must first be allocated with
-pkey_alloc().  An application calls the WRPKRU instruction
-directly in order to change access permissions to memory covered
-with a key.  In this example WRPKRU is wrapped by a C function
-called pkey_set().
-
-	int real_prot = PROT_READ|PROT_WRITE;
-	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
-	ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-	ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
-	... application runs here
-
-Now, if the application needs to update the data at 'ptr', it can
-gain access, do the update, then remove its write access:
-
-	pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
-	*ptr = foo; // assign something
-	pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again
-
-Now when it frees the memory, it will also free the pkey since it
-is no longer in use:
-
-	munmap(ptr, PAGE_SIZE);
-	pkey_free(pkey);
-
-(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
- An example implementation can be found in
- tools/testing/selftests/x86/protection_keys.c)
-
-=========================== Behavior ===========================
-
-The kernel attempts to make protection keys consistent with the
-behavior of a plain mprotect().  For instance if you do this:
-
-	mprotect(ptr, size, PROT_NONE);
-	something(ptr);
-
-you can expect the same effects with protection keys when doing this:
-
-	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_READ);
-	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
-	something(ptr);
-
-That should be true whether something() is a direct access to 'ptr'
-like:
-
-	*ptr = foo;
-
-or when the kernel does the access on the application's behalf like
-with a read():
-
-	read(fd, ptr, 1);
-
-The kernel will send a SIGSEGV in both cases, but si_code will be set
-to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
-the plain mprotect() permissions are violated.
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 31/51] Documentation/vm: PowerPC specific updates to memory protection keys
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Add documentation updates that capture PowerPC specific changes.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 Documentation/vm/protection-keys.txt |  126 +++++++++++++++++++++++++++-------
 1 files changed, 101 insertions(+), 25 deletions(-)
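
As an illustration of how userspace might consume the sysfs files documented
in this patch, here is a minimal sketch. The helper names are hypothetical.
The directory and attribute names are the ones proposed in this series, so the
code treats a missing file as "unknown" rather than as an error.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PKEYS_SYSFS_DIR "/sys/kernel/mm/protection_keys"

/* Parse a decimal attribute such as "27\n". Returns 0 or -1. */
static int parse_uint_attr(const char *text, unsigned int *out)
{
	unsigned long val;
	char *end;

	if (!isdigit((unsigned char)text[0]))
		return -1;
	errno = 0;
	val = strtoul(text, &end, 10);
	if (errno || val > UINT_MAX || (*end != '\n' && *end != '\0'))
		return -1;
	*out = (unsigned int)val;
	return 0;
}

/* Parse "true\n" or "false\n". Returns 1, 0, or -1 for anything else. */
static int parse_bool_attr(const char *text)
{
	if (!strcmp(text, "true\n") || !strcmp(text, "true"))
		return 1;
	if (!strcmp(text, "false\n") || !strcmp(text, "false"))
		return 0;
	return -1;
}

/* Read the first line of one attribute file. Returns 0 or -1. */
static int read_pkeys_attr(const char *name, char *buf, size_t len)
{
	char path[256];
	FILE *f;
	int ok;

	snprintf(path, sizeof(path), "%s/%s", PKEYS_SYSFS_DIR, name);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* interface not present on this kernel */
	ok = fgets(buf, (int)len, f) != NULL;
	fclose(f);
	return ok ? 0 : -1;
}

/* Keys a process can count on; 0 when the file is missing or malformed. */
static unsigned int usable_pkeys(void)
{
	char buf[32];
	unsigned int n;

	if (read_pkeys_attr("usable_keys", buf, sizeof(buf)) ||
	    parse_uint_attr(buf, &n))
		return 0;
	return n;
}

/* 1 or 0 for a "*_supported" file, -1 when it cannot be read. */
static int pkeys_cap_supported(const char *name)
{
	char buf[32];

	if (read_pkeys_attr(name, buf, sizeof(buf)))
		return -1;
	return parse_bool_attr(buf);
}
```

An application could call pkeys_cap_supported("disable_execute_supported")
before passing PKEY_DISABLE_EXECUTE to pkey_alloc(), and use usable_pkeys()
to size any per-key bookkeeping.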

diff --git a/Documentation/vm/protection-keys.txt b/Documentation/vm/protection-keys.txt
index fa46dcb..bc079b3 100644
--- a/Documentation/vm/protection-keys.txt
+++ b/Documentation/vm/protection-keys.txt
@@ -1,22 +1,46 @@
-Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature
-which will be found on future Intel CPUs.
-
-Memory Protection Keys provides a mechanism for enforcing page-based
-protections, but without requiring modification of the page tables
-when an application changes protection domains.  It works by
-dedicating 4 previously ignored bits in each page table entry to a
-"protection key", giving 16 possible keys.
-
-There is also a new user-accessible register (PKRU) with two separate
-bits (Access Disable and Write Disable) for each key.  Being a CPU
-register, PKRU is inherently thread-local, potentially giving each
-thread a different set of protections from every other thread.
-
-There are two new instructions (RDPKRU/WRPKRU) for reading and writing
-to the new register.  The feature is only available in 64-bit mode,
-even though there is theoretically space in the PAE PTEs.  These
-permissions are enforced on data access only and have no effect on
-instruction fetches.
+Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature found on
+future Intel CPUs and on PowerPC 5 and higher CPUs.
+
+Memory Protection Keys provide a mechanism for enforcing page-based
+protections, but without requiring modification of the page tables when an
+application changes protection domains.
+
+It works by dedicating bits in each page table entry to a "protection key".
+There is also a user-accessible register with two separate bits for each
+key.  Being a CPU register, the user-accessible register is inherently
+thread-local, potentially giving each thread a different set of protections
+from every other thread.
+
+On Intel:
+
+	Four previously ignored bits in each page table entry give 16 possible keys.
+
+	The user-accessible register (PKRU) has two bits per key: one to disable
+	access and one to disable write.
+
+	The feature is only available in 64-bit mode, even though there is
+	theoretically space in the PAE PTEs.  These permissions are enforced on
+	data access only and have no effect on instruction fetches.
+
+On PowerPC:
+
+	Five bits in the page table entry are used, giving 32 possible keys.
+	This support is currently for Hash Page Table mode only.
+
+	The user-accessible register (AMR) has two bits per key: one to disable
+	read and one to disable write. Access disable can be achieved by
+	disabling both read and write.
+
+	'mfspr mem, 0xd' reads the AMR register.
+	'mtspr 0xd, mem' writes into the AMR register.
+
+	Execution can be disabled by allocating a key with execute-disabled
+	permission. The execute permission on the key, however, cannot be
+	changed through a user-accessible register. Instead, a powerpc-specific
+	system call, sys_pkey_modify(), must be used. The CPU will not allow
+	execution of instructions in pages that are associated with an
+	execute-disabled key.
+
 
 =========================== Syscalls ===========================
 
@@ -28,9 +52,9 @@ There are 3 system calls which directly interact with pkeys:
 			  unsigned long prot, int pkey);
 
 Before a pkey can be used, it must first be allocated with
-pkey_alloc().  An application calls the WRPKRU instruction
+pkey_alloc().  An application executes WRPKRU or writes the AMR register
 directly in order to change access permissions to memory covered
-with a key.  In this example WRPKRU is wrapped by a C function
+with a key.  In this example the register write is wrapped by a C function
 called pkey_set().
 
 	int real_prot = PROT_READ|PROT_WRITE;
@@ -52,11 +76,11 @@ is no longer in use:
 	munmap(ptr, PAGE_SIZE);
 	pkey_free(pkey);
 
-(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
+(Note: pkey_set() is a wrapper for RDPKRU/WRPKRU or for AMR register accesses.
  An example implementation can be found in
- tools/testing/selftests/x86/protection_keys.c)
+ tools/testing/selftests/vm/protection_keys.c)
 
-=========================== Behavior ===========================
+=========================== Behavior =================================
 
 The kernel attempts to make protection keys consistent with the
 behavior of a plain mprotect().  For instance if you do this:
@@ -66,7 +90,7 @@ behavior of a plain mprotect().  For instance if you do this:
 
 you can expect the same effects with protection keys when doing this:
 
-	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_READ);
+	pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS);
 	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
 	something(ptr);
 
@@ -83,3 +107,55 @@ with a read():
 The kernel will send a SIGSEGV in both cases, but si_code will be set
 to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
 the plain mprotect() permissions are violated.
+
+========================== sysfs Interface ==========================
+
+Information about support of protection keys on the system can be
+found in the /sys/kernel/mm/protection_keys directory, which
+contains the following files:
+
+- total_keys: Shows the number of keys supported by the hardware.
+    Not all of those keys may be available for use by a process
+    because the platform or operating system may reserve some keys
+    for their own use.
+
+- usable_keys: Shows the minimum number of keys guaranteed to be
+    available for use by a process. In other words: total_keys minus
+    the keys reserved by the platform or operating system. This
+    number doesn't change to reflect keys that are already being
+    used by the process reading the file.
+
+    There may be one more key available than what is advertised in
+    this file because the kernel may use one key for mprotect()
+    calls setting up memory with execute-only permissions. This file
+    assumes that this key is being used, but if it is not the
+    process will have one more key it can use for other purposes.
+
+- disable_access_supported: Shows 'true' if the system supports keys
+    which disallow reading from a given page (i.e., the
+    PKEY_DISABLE_ACCESS flag is supported).
+
+- disable_write_supported: Shows 'true' if the system supports keys
+    which disallow writing to a given page (i.e., the
+    PKEY_DISABLE_WRITE flag is supported).
+
+- disable_execute_supported: Shows 'true' if the system supports keys
+    which disallow code execution from a given page (i.e., the
+    PKEY_DISABLE_EXECUTE flag is supported).
+
+====================================================================
+		Differences
+
+The following differences exist between x86 and powerpc.
+
+a) powerpc (PowerPC8 onwards) *also* allows creation of a key with
+   execute-disabled permission.
+	The following is allowed:
+	pkey = pkey_alloc(0, PKEY_DISABLE_EXECUTE);
+
+b) On powerpc, the access/write permission on a key can be modified by
+   programming the AMR register from the signal handler. The changes
+   persist across signal boundaries. On x86, the PKRU-specific fpregs
+   entry has to be modified to change the access/write permission on
+   a key.
+=====================================================================
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 31/51] Documentation/vm: PowerPC specific updates to memory protection keys
@ 2017-11-06  8:57   ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Add documentation updates that capture PowerPC specific changes.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 Documentation/vm/protection-keys.txt |  126 +++++++++++++++++++++++++++-------
 1 files changed, 101 insertions(+), 25 deletions(-)

diff --git a/Documentation/vm/protection-keys.txt b/Documentation/vm/protection-keys.txt
index fa46dcb..bc079b3 100644
--- a/Documentation/vm/protection-keys.txt
+++ b/Documentation/vm/protection-keys.txt
@@ -1,22 +1,46 @@
-Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature
-which will be found on future Intel CPUs.
-
-Memory Protection Keys provides a mechanism for enforcing page-based
-protections, but without requiring modification of the page tables
-when an application changes protection domains.  It works by
-dedicating 4 previously ignored bits in each page table entry to a
-"protection key", giving 16 possible keys.
-
-There is also a new user-accessible register (PKRU) with two separate
-bits (Access Disable and Write Disable) for each key.  Being a CPU
-register, PKRU is inherently thread-local, potentially giving each
-thread a different set of protections from every other thread.
-
-There are two new instructions (RDPKRU/WRPKRU) for reading and writing
-to the new register.  The feature is only available in 64-bit mode,
-even though there is theoretically space in the PAE PTEs.  These
-permissions are enforced on data access only and have no effect on
-instruction fetches.
+Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature found on
+future Intel CPUs and on PowerPC 5 and higher CPUs.
+
+Memory Protection Keys provide a mechanism for enforcing page-based
+protections, but without requiring modification of the page tables when an
+application changes protection domains.
+
+It works by dedicating bits in each page table entry to a "protection key".
+There is also a user-accessible register with two separate bits for each
+key.  Being a CPU register, the user-accessible register is inherently
+thread-local, potentially giving each thread a different set of protections
+from every other thread.
+
+On Intel:
+
+	Four previously ignored bits in each page table entry give 16 possible keys.
+
+	The user-accessible register (PKRU) has two bits per key: one to
+	disable access and one to disable write.
+
+	The feature is only available in 64-bit mode, even though there is
+	theoretically space in the PAE PTEs.  These permissions are enforced on
+	data access only and have no effect on instruction fetches.
+
+On PowerPC:
+
+	Five bits in the page table entry are used giving 32 possible keys.
+	This support is currently for Hash Page Table mode only.
+
+	The user-accessible register (AMR) has two bits per key: one to
+	disable read and one to disable write. Access disable is achieved
+	by disabling both read and write.
+
+	'mfspr mem, 0xd' reads the AMR register.
+	'mtspr 0xd, mem' writes into the AMR register.
+
+	Execution can be disabled by allocating a key with execute-disabled
+	permission. The execute permission on the key, however, cannot be
+	changed through a user-accessible register. Instead, a powerpc-specific
+	system call, sys_pkey_modify(), must be used. The CPU will not allow
+	execution of instructions in pages that are associated with an
+	execute-disabled key.
+
 
 =========================== Syscalls ===========================
 
@@ -28,9 +52,9 @@ There are 3 system calls which directly interact with pkeys:
 			  unsigned long prot, int pkey);
 
 Before a pkey can be used, it must first be allocated with
-pkey_alloc().  An application calls the WRPKRU instruction
+pkey_alloc().  An application executes WRPKRU (x86) or writes AMR (powerpc)
 directly in order to change access permissions to memory covered
-with a key.  In this example WRPKRU is wrapped by a C function
+with a key.  In this example the register write is wrapped by a C function
 called pkey_set().
 
 	int real_prot = PROT_READ|PROT_WRITE;
@@ -52,11 +76,11 @@ is no longer in use:
 	munmap(ptr, PAGE_SIZE);
 	pkey_free(pkey);
 
-(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
+(Note: pkey_set() is a wrapper for the RDPKRU/WRPKRU instructions or AMR accesses.
  An example implementation can be found in
- tools/testing/selftests/x86/protection_keys.c)
+ tools/testing/selftests/vm/protection_keys.c)
 
-=========================== Behavior ===========================
+=========================== Behavior =================================
 
 The kernel attempts to make protection keys consistent with the
 behavior of a plain mprotect().  For instance if you do this:
@@ -66,7 +90,7 @@ behavior of a plain mprotect().  For instance if you do this:
 
 you can expect the same effects with protection keys when doing this:
 
-	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_READ);
+	pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS);
 	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
 	something(ptr);
 
@@ -83,3 +107,55 @@ with a read():
 The kernel will send a SIGSEGV in both cases, but si_code will be set
 to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
 the plain mprotect() permissions are violated.
+
+========================== sysfs Interface ==========================
+
+Information about support of protection keys on the system can be
+found in the /sys/kernel/mm/protection_keys directory, which
+contains the following files:
+
+- total_keys: Shows the number of keys supported by the hardware.
+    Not all of those keys may be available for use by a process
+    because the platform or operating system may reserve some keys
+    for their own use.
+
+- usable_keys: Shows the minimum number of keys guaranteed to be
+    available for use by a process. In other words: total_keys minus
+    the keys reserved by the platform or operating system. This
+    number doesn't change to reflect keys that are already being
+    used by the process reading the file.
+
+    There may be one more key available than what is advertised in
+    this file because the kernel may use one key for mprotect()
+    calls setting up memory with execute-only permissions. This file
+    assumes that this key is being used, but if it is not the
+    process will have one more key it can use for other purposes.
+
+- disable_access_supported: Shows 'true' if the system supports keys
+    which disallow reading from a given page (i.e., the
+    PKEY_DISABLE_ACCESS flag is supported).
+
+- disable_write_supported: Shows 'true' if the system supports keys
+    which disallow writing to a given page (i.e., the
+    PKEY_DISABLE_WRITE flag is supported).
+
+- disable_execute_supported: Shows 'true' if the system supports keys
+    which disallow code execution from a given page (i.e., the
+    PKEY_DISABLE_EXECUTE flag is supported).
+
+====================================================================
+		Differences
+
+The following differences exist between x86 and powerpc.
+
+a) powerpc (PowerPC8 onwards) *also* allows creation of a key with
+   execute-disabled permission.
+	The following is allowed:
+	pkey = pkey_alloc(0, PKEY_DISABLE_EXECUTE);
+
+b) On powerpc the access/write permission on a key can be modified by
+   programming the AMR register from the signal handler. The changes
+   persist across signal boundaries. On x86, the PKRU specific fpregs
+   entry has to be modified to change the access/write permission on
+   a key.
+=====================================================================
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 32/51] selftest/x86: Move protecton key selftest to arch neutral directory
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
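Note (not part of the commit): the moved test manipulates PKRU through
pkey_get(), pkey_set(), pkey_disable_set() and pkey_disable_clear(). The
bit arithmetic behind them can be sketched as pure functions on a PKRU
image, which makes it checkable on machines without PKU hardware. The
function names below (pkru_get_rights() and friends) are hypothetical
and do not appear in the test; the constants match the values defined in
the moved files.

```c
#include <assert.h>
#include <stdint.h>

/* Generic pkey flags; guarded in case libc headers already define them. */
#ifndef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS	0x1
#endif
#ifndef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE	0x2
#endif

#define NR_PKEYS		16
#define PKRU_BITS_PER_PKEY	2
#define PKRU_RIGHTS_MASK	((uint32_t)(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE))

/* Extract the two rights bits of pkey from a PKRU image. */
static inline uint32_t pkru_get_rights(uint32_t pkru, int pkey)
{
	assert(pkey >= 0 && pkey < NR_PKEYS);
	return (pkru >> (pkey * PKRU_BITS_PER_PKEY)) & PKRU_RIGHTS_MASK;
}

/* Return a PKRU image in which the rights bits of pkey are replaced. */
static inline uint32_t pkru_set_rights(uint32_t pkru, int pkey, uint32_t rights)
{
	assert(pkey >= 0 && pkey < NR_PKEYS);
	assert(!(rights & ~PKRU_RIGHTS_MASK));

	/* mask out the old bits for pkey, then OR in the new ones */
	pkru &= ~(PKRU_RIGHTS_MASK << (pkey * PKRU_BITS_PER_PKEY));
	pkru |= rights << (pkey * PKRU_BITS_PER_PKEY);
	return pkru;
}

/* Add disable bits for pkey (deny more). */
static inline uint32_t pkru_rights_set(uint32_t pkru, int pkey, uint32_t flags)
{
	return pkru_set_rights(pkru, pkey,
			       pkru_get_rights(pkru, pkey) | flags);
}

/* Remove disable bits for pkey (allow more). */
static inline uint32_t pkru_rights_clear(uint32_t pkru, int pkey, uint32_t flags)
{
	return pkru_set_rights(pkru, pkey,
			       pkru_get_rights(pkru, pkey) & ~flags);
}
```

Setting a disable bit can only increase the register value and clearing
one can only decrease it, which is the property the rdpkru() assertions
in pkey_disable_set() and pkey_disable_clear() rely on.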
 tools/testing/selftests/vm/Makefile           |    1 +
 tools/testing/selftests/vm/pkey-helpers.h     |  220 ++++
 tools/testing/selftests/vm/protection_keys.c  | 1395 +++++++++++++++++++++++++
 tools/testing/selftests/x86/Makefile          |    2 +-
 tools/testing/selftests/x86/pkey-helpers.h    |  220 ----
 tools/testing/selftests/x86/protection_keys.c | 1395 -------------------------
 6 files changed, 1617 insertions(+), 1616 deletions(-)
 create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
 create mode 100644 tools/testing/selftests/vm/protection_keys.c
 delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
 delete mode 100644 tools/testing/selftests/x86/protection_keys.c

diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index e49eca1..6f18ef4 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -18,6 +18,7 @@ TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
 TEST_GEN_FILES += mlock-random-test
 TEST_GEN_FILES += virtual_address_range
+TEST_GEN_FILES += protection_keys
 
 TEST_PROGS := run_vmtests
 
diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
new file mode 100644
index 0000000..3818f25
--- /dev/null
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -0,0 +1,220 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PKEYS_HELPER_H
+#define _PKEYS_HELPER_H
+#define _GNU_SOURCE
+#include <string.h>
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <ucontext.h>
+#include <sys/mman.h>
+
+#define NR_PKEYS 16
+#define PKRU_BITS_PER_PKEY 2
+
+#ifndef DEBUG_LEVEL
+#define DEBUG_LEVEL 0
+#endif
+#define DPRINT_IN_SIGNAL_BUF_SIZE 4096
+extern int dprint_in_signal;
+extern char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
+static inline void sigsafe_printf(const char *format, ...)
+{
+	va_list ap;
+
+	va_start(ap, format);
+	if (!dprint_in_signal) {
+		vprintf(format, ap);
+	} else {
+		int len = vsnprintf(dprint_in_signal_buffer,
+				    DPRINT_IN_SIGNAL_BUF_SIZE,
+				    format, ap);
+		/*
+		 * len is amount that would have been printed,
+		 * but actual write is truncated at BUF_SIZE.
+		 */
+		if (len > DPRINT_IN_SIGNAL_BUF_SIZE)
+			len = DPRINT_IN_SIGNAL_BUF_SIZE;
+		write(1, dprint_in_signal_buffer, len);
+	}
+	va_end(ap);
+}
+#define dprintf_level(level, args...) do {	\
+	if (level <= DEBUG_LEVEL)		\
+		sigsafe_printf(args);		\
+	fflush(NULL);				\
+} while (0)
+#define dprintf0(args...) dprintf_level(0, args)
+#define dprintf1(args...) dprintf_level(1, args)
+#define dprintf2(args...) dprintf_level(2, args)
+#define dprintf3(args...) dprintf_level(3, args)
+#define dprintf4(args...) dprintf_level(4, args)
+
+extern unsigned int shadow_pkru;
+static inline unsigned int __rdpkru(void)
+{
+	unsigned int eax, edx;
+	unsigned int ecx = 0;
+	unsigned int pkru;
+
+	asm volatile(".byte 0x0f,0x01,0xee\n\t"
+		     : "=a" (eax), "=d" (edx)
+		     : "c" (ecx));
+	pkru = eax;
+	return pkru;
+}
+
+static inline unsigned int _rdpkru(int line)
+{
+	unsigned int pkru = __rdpkru();
+
+	dprintf4("rdpkru(line=%d) pkru: %x shadow: %x\n",
+			line, pkru, shadow_pkru);
+	assert(pkru == shadow_pkru);
+
+	return pkru;
+}
+
+#define rdpkru() _rdpkru(__LINE__)
+
+static inline void __wrpkru(unsigned int pkru)
+{
+	unsigned int eax = pkru;
+	unsigned int ecx = 0;
+	unsigned int edx = 0;
+
+	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+	asm volatile(".byte 0x0f,0x01,0xef\n\t"
+		     : : "a" (eax), "c" (ecx), "d" (edx));
+	assert(pkru == __rdpkru());
+}
+
+static inline void wrpkru(unsigned int pkru)
+{
+	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+	/* will do the shadow check for us: */
+	rdpkru();
+	__wrpkru(pkru);
+	shadow_pkru = pkru;
+	dprintf4("%s(%08x) pkru: %08x\n", __func__, pkru, __rdpkru());
+}
+
+/*
+ * These are technically racy, since something could
+ * change PKRU between the read and the write.
+ */
+static inline void __pkey_access_allow(int pkey, int do_allow)
+{
+	unsigned int pkru = rdpkru();
+	int bit = pkey * 2;
+
+	if (do_allow)
+		pkru &= ~(1U << bit);
+	else
+		pkru |= (1<<bit);
+
+	dprintf4("pkru now: %08x\n", rdpkru());
+	wrpkru(pkru);
+}
+
+static inline void __pkey_write_allow(int pkey, int do_allow_write)
+{
+	long pkru = rdpkru();
+	int bit = pkey * 2 + 1;
+
+	if (do_allow_write)
+		pkru &= ~(1U << bit);
+	else
+		pkru |= (1<<bit);
+
+	wrpkru(pkru);
+	dprintf4("pkru now: %08x\n", rdpkru());
+}
+
+#define PROT_PKEY0     0x10            /* protection key value (bit 0) */
+#define PROT_PKEY1     0x20            /* protection key value (bit 1) */
+#define PROT_PKEY2     0x40            /* protection key value (bit 2) */
+#define PROT_PKEY3     0x80            /* protection key value (bit 3) */
+
+#define PAGE_SIZE 4096
+#define MB	(1<<20)
+
+static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
+		unsigned int *ecx, unsigned int *edx)
+{
+	/* ecx is often an input as well as an output. */
+	asm volatile(
+		"cpuid;"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (*eax), "2" (*ecx));
+}
+
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
+#define X86_FEATURE_PKU        (1<<3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE      (1<<4) /* OS Protection Keys Enable */
+
+static inline int cpu_has_pku(void)
+{
+	unsigned int eax;
+	unsigned int ebx;
+	unsigned int ecx;
+	unsigned int edx;
+
+	eax = 0x7;
+	ecx = 0x0;
+	__cpuid(&eax, &ebx, &ecx, &edx);
+
+	if (!(ecx & X86_FEATURE_PKU)) {
+		dprintf2("cpu does not have PKU\n");
+		return 0;
+	}
+	if (!(ecx & X86_FEATURE_OSPKE)) {
+		dprintf2("cpu does not have OSPKE\n");
+		return 0;
+	}
+	return 1;
+}
+
+#define XSTATE_PKRU_BIT	(9)
+#define XSTATE_PKRU	0x200
+
+int pkru_xstate_offset(void)
+{
+	unsigned int eax;
+	unsigned int ebx;
+	unsigned int ecx;
+	unsigned int edx;
+	int xstate_offset;
+	int xstate_size;
+	unsigned long XSTATE_CPUID = 0xd;
+	int leaf;
+
+	/* assume that XSTATE_PKRU is set in XCR0 */
+	leaf = XSTATE_PKRU_BIT;
+	{
+		eax = XSTATE_CPUID;
+		ecx = leaf;
+		__cpuid(&eax, &ebx, &ecx, &edx);
+
+		if (leaf == XSTATE_PKRU_BIT) {
+			xstate_offset = ebx;
+			xstate_size = eax;
+		}
+	}
+
+	if (xstate_size == 0) {
+		printf("could not find size/offset of PKRU in xsave state\n");
+		return 0;
+	}
+
+	return xstate_offset;
+}
+
+#endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
new file mode 100644
index 0000000..555e43c
--- /dev/null
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -0,0 +1,1395 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Tests x86 Memory Protection Keys (see Documentation/vm/protection-keys.txt)
+ *
+ * There are examples in here of:
+ *  * how to set protection keys on memory
+ *  * how to set/clear bits in PKRU (the rights register)
+ *  * how to handle SEGV_PKUERR signals and extract pkey-relevant
+ *    information from the siginfo
+ *
+ * Things to add:
+ *	make sure KSM and KSM COW breaking works
+ *	prefault pages in at malloc, or not
+ *	protect MPX bounds tables with protection keys?
+ *	make sure VMA splitting/merging is working correctly
+ *	OOMs can destroy mm->mmap (see exit_mmap()), so make sure it is immune to pkeys
+ *	look for pkey "leaks" where it is still set on a VMA but "freed" back to the kernel
+ *	do a plain mprotect() to a mprotect_pkey() area and make sure the pkey sticks
+ *
+ * Compile like this:
+ *	gcc      -o protection_keys    -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
+ *	gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
+ */
+#define _GNU_SOURCE
+#include <errno.h>
+#include <linux/futex.h>
+#include <sys/time.h>
+#include <sys/syscall.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <ucontext.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/ptrace.h>
+#include <setjmp.h>
+
+#include "pkey-helpers.h"
+
+int iteration_nr = 1;
+int test_nr;
+
+unsigned int shadow_pkru;
+
+#define HPAGE_SIZE	(1UL<<21)
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
+#define ALIGN_UP(x, align_to)	(((x) + ((align_to)-1)) & ~((align_to)-1))
+#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
+#define ALIGN_PTR_UP(p, ptr_align_to)	((typeof(p))ALIGN_UP((unsigned long)(p),	ptr_align_to))
+#define ALIGN_PTR_DOWN(p, ptr_align_to)	((typeof(p))ALIGN_DOWN((unsigned long)(p),	ptr_align_to))
+#define __stringify_1(x...)     #x
+#define __stringify(x...)       __stringify_1(x)
+
+#define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
+
+int dprint_in_signal;
+char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
+
+extern void abort_hooks(void);
+#define pkey_assert(condition) do {		\
+	if (!(condition)) {			\
+		dprintf0("assert() at %s::%d test_nr: %d iteration: %d\n", \
+				__FILE__, __LINE__,	\
+				test_nr, iteration_nr);	\
+		dprintf0("errno at assert: %d\n", errno);	\
+		abort_hooks();			\
+		assert(condition);		\
+	}					\
+} while (0)
+#define raw_assert(cond) assert(cond)
+
+void cat_into_file(char *str, char *file)
+{
+	int fd = open(file, O_RDWR);
+	int ret;
+
+	dprintf2("%s(): writing '%s' to '%s'\n", __func__, str, file);
+	/*
+	 * these need to be raw because they are called under
+	 * pkey_assert()
+	 */
+	raw_assert(fd >= 0);
+	ret = write(fd, str, strlen(str));
+	if (ret != strlen(str)) {
+		perror("write to file failed");
+		fprintf(stderr, "filename: '%s' str: '%s'\n", file, str);
+		raw_assert(0);
+	}
+	close(fd);
+}
+
+#if CONTROL_TRACING > 0
+static int warned_tracing;
+int tracing_root_ok(void)
+{
+	if (geteuid() != 0) {
+		if (!warned_tracing)
+			fprintf(stderr, "WARNING: not run as root, "
+					"can not do tracing control\n");
+		warned_tracing = 1;
+		return 0;
+	}
+	return 1;
+}
+#endif
+
+void tracing_on(void)
+{
+#if CONTROL_TRACING > 0
+#define TRACEDIR "/sys/kernel/debug/tracing"
+	char pidstr[32];
+
+	if (!tracing_root_ok())
+		return;
+
+	sprintf(pidstr, "%d", getpid());
+	cat_into_file("0", TRACEDIR "/tracing_on");
+	cat_into_file("\n", TRACEDIR "/trace");
+	if (1) {
+		cat_into_file("function_graph", TRACEDIR "/current_tracer");
+		cat_into_file("1", TRACEDIR "/options/funcgraph-proc");
+	} else {
+		cat_into_file("nop", TRACEDIR "/current_tracer");
+	}
+	cat_into_file(pidstr, TRACEDIR "/set_ftrace_pid");
+	cat_into_file("1", TRACEDIR "/tracing_on");
+	dprintf1("enabled tracing\n");
+#endif
+}
+
+void tracing_off(void)
+{
+#if CONTROL_TRACING > 0
+	if (!tracing_root_ok())
+		return;
+	cat_into_file("0", "/sys/kernel/debug/tracing/tracing_on");
+#endif
+}
+
+void abort_hooks(void)
+{
+	fprintf(stderr, "running %s()...\n", __func__);
+	tracing_off();
+#ifdef SLEEP_ON_ABORT
+	sleep(SLEEP_ON_ABORT);
+#endif
+}
+
+static inline void __page_o_noops(void)
+{
+	/* 8 bytes per instruction * 512 instructions = 1 page */
+	asm(".rept 512 ; nopl 0x7eeeeeee(%eax) ; .endr");
+}
+
+/*
+ * This attempts to have roughly a page of instructions followed by a few
+ * instructions that do a write, and another page of instructions.  That
+ * way, we are pretty sure that the write is in the second page of
+ * instructions and has at least a page of padding behind it.
+ *
+ * *That* lets us be sure to madvise() away the write instruction, which
+ * will then fault, which makes sure that the fault code handles
+ * execute-only memory properly.
+ */
+__attribute__((__aligned__(PAGE_SIZE)))
+void lots_o_noops_around_write(int *write_to_me)
+{
+	dprintf3("running %s()\n", __func__);
+	__page_o_noops();
+	/* Assume this happens in the second page of instructions: */
+	*write_to_me = __LINE__;
+	/* pad out by another page: */
+	__page_o_noops();
+	dprintf3("%s() done\n", __func__);
+}
+
+/* Define some kernel-like types */
+#define  u8 uint8_t
+#define u16 uint16_t
+#define u32 uint32_t
+#define u64 uint64_t
+
+#ifdef __i386__
+#define SYS_mprotect_key 380
+#define SYS_pkey_alloc	 381
+#define SYS_pkey_free	 382
+#define REG_IP_IDX REG_EIP
+#define si_pkey_offset 0x14
+#else
+#define SYS_mprotect_key 329
+#define SYS_pkey_alloc	 330
+#define SYS_pkey_free	 331
+#define REG_IP_IDX REG_RIP
+#define si_pkey_offset 0x20
+#endif
+
+void dump_mem(void *dumpme, int len_bytes)
+{
+	char *c = (void *)dumpme;
+	int i;
+
+	for (i = 0; i < len_bytes; i += sizeof(u64)) {
+		u64 *ptr = (u64 *)(c + i);
+		dprintf1("dump[%03d][@%p]: %016jx\n", i, ptr, *ptr);
+	}
+}
+
+#define SEGV_BNDERR     3  /* failed address bound checks */
+#define SEGV_PKUERR     4
+
+static char *si_code_str(int si_code)
+{
+	if (si_code == SEGV_MAPERR)
+		return "SEGV_MAPERR";
+	if (si_code == SEGV_ACCERR)
+		return "SEGV_ACCERR";
+	if (si_code == SEGV_BNDERR)
+		return "SEGV_BNDERR";
+	if (si_code == SEGV_PKUERR)
+		return "SEGV_PKUERR";
+	return "UNKNOWN";
+}
+
+int pkru_faults;
+int last_si_pkey = -1;
+void signal_handler(int signum, siginfo_t *si, void *vucontext)
+{
+	ucontext_t *uctxt = vucontext;
+	int trapno;
+	unsigned long ip;
+	char *fpregs;
+	u32 *pkru_ptr;
+	u64 si_pkey;
+	u32 *si_pkey_ptr;
+	int pkru_offset;
+	fpregset_t fpregset;
+
+	dprint_in_signal = 1;
+	dprintf1(">>>>===============SIGSEGV============================\n");
+	dprintf1("%s()::%d, pkru: 0x%x shadow: %x\n", __func__, __LINE__,
+			__rdpkru(), shadow_pkru);
+
+	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
+	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
+	fpregset = uctxt->uc_mcontext.fpregs;
+	fpregs = (void *)fpregset;
+
+	dprintf2("%s() trapno: %d ip: 0x%lx info->si_code: %s/%d\n", __func__,
+			trapno, ip, si_code_str(si->si_code), si->si_code);
+#ifdef __i386__
+	/*
+	 * 32-bit has some extra padding so that userspace can tell whether
+	 * the XSTATE header is present in addition to the "legacy" FPU
+	 * state.  We just assume that it is here.
+	 */
+	fpregs += 0x70;
+#endif
+	pkru_offset = pkru_xstate_offset();
+	pkru_ptr = (void *)(&fpregs[pkru_offset]);
+
+	dprintf1("siginfo: %p\n", si);
+	dprintf1(" fpregs: %p\n", fpregs);
+	/*
+	 * If we got a PKRU fault, we *HAVE* to have at least one bit set in
+	 * here.
+	 */
+	dprintf1("pkru_xstate_offset: %d\n", pkru_xstate_offset());
+	if (DEBUG_LEVEL > 4)
+		dump_mem(pkru_ptr - 128, 256);
+	pkey_assert(*pkru_ptr);
+
+	si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
+	dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
+	dump_mem(si_pkey_ptr - 8, 24);
+	si_pkey = *si_pkey_ptr;
+	pkey_assert(si_pkey < NR_PKEYS);
+	last_si_pkey = si_pkey;
+
+	if ((si->si_code == SEGV_MAPERR) ||
+	    (si->si_code == SEGV_ACCERR) ||
+	    (si->si_code == SEGV_BNDERR)) {
+		printf("non-PK si_code, exiting...\n");
+		exit(4);
+	}
+
+	dprintf1("signal pkru from xsave: %08x\n", *pkru_ptr);
+	/* need __rdpkru() version so we do not do shadow_pkru checking */
+	dprintf1("signal pkru from  pkru: %08x\n", __rdpkru());
+	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
+	*(u64 *)pkru_ptr = 0x00000000;
+	dprintf1("WARNING: set PKRU=0 to allow faulting instruction to continue\n");
+	pkru_faults++;
+	dprintf1("<<<<==================================================\n");
+	return;
+	if (trapno == 14) {
+		fprintf(stderr,
+			"ERROR: In signal handler, page fault, trapno = %d, ip = %016lx\n",
+			trapno, ip);
+		fprintf(stderr, "si_addr %p\n", si->si_addr);
+		fprintf(stderr, "REG_ERR: %lx\n",
+				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
+		exit(1);
+	} else {
+		fprintf(stderr, "unexpected trap %d! at 0x%lx\n", trapno, ip);
+		fprintf(stderr, "si_addr %p\n", si->si_addr);
+		fprintf(stderr, "REG_ERR: %lx\n",
+				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
+		exit(2);
+	}
+	dprint_in_signal = 0;
+}
+
+int wait_all_children(void)
+{
+	int status;
+	return waitpid(-1, &status, 0);
+}
+
+void sig_chld(int x)
+{
+	dprint_in_signal = 1;
+	dprintf2("[%d] SIGCHLD: %d\n", getpid(), x);
+	dprint_in_signal = 0;
+}
+
+void setup_sigsegv_handler(void)
+{
+	int r, rs;
+	struct sigaction newact;
+	struct sigaction oldact;
+
+	/* #PF is mapped to sigsegv */
+	int signum  = SIGSEGV;
+
+	newact.sa_handler = 0;
+	newact.sa_sigaction = signal_handler;
+
+	/*sigset_t - signals to block while in the handler */
+	/* get the old signal mask. */
+	rs = sigprocmask(SIG_SETMASK, 0, &newact.sa_mask);
+	pkey_assert(rs == 0);
+
+	/* call sa_sigaction, not sa_handler*/
+	newact.sa_flags = SA_SIGINFO;
+
+	newact.sa_restorer = 0;  /* void(*)(), obsolete */
+	r = sigaction(signum, &newact, &oldact);
+	r = sigaction(SIGALRM, &newact, &oldact);
+	pkey_assert(r == 0);
+}
+
+void setup_handlers(void)
+{
+	signal(SIGCHLD, &sig_chld);
+	setup_sigsegv_handler();
+}
+
+pid_t fork_lazy_child(void)
+{
+	pid_t forkret;
+
+	forkret = fork();
+	pkey_assert(forkret >= 0);
+	dprintf3("[%d] fork() ret: %d\n", getpid(), forkret);
+
+	if (!forkret) {
+		/* in the child */
+		while (1) {
+			dprintf1("child sleeping...\n");
+			sleep(30);
+		}
+	}
+	return forkret;
+}
+
+void davecmp(void *_a, void *_b, int len)
+{
+	int i;
+	unsigned long *a = _a;
+	unsigned long *b = _b;
+
+	for (i = 0; i < len / sizeof(*a); i++) {
+		if (a[i] == b[i])
+			continue;
+
+		dprintf3("[%3d]: a: %016lx b: %016lx\n", i, a[i], b[i]);
+	}
+}
+
+void dumpit(char *f)
+{
+	int fd = open(f, O_RDONLY);
+	char buf[100];
+	int nr_read;
+
+	dprintf2("maps fd: %d\n", fd);
+	do {
+		nr_read = read(fd, &buf[0], sizeof(buf));
+		write(1, buf, nr_read);
+	} while (nr_read > 0);
+	close(fd);
+}
+
+#define PKEY_DISABLE_ACCESS    0x1
+#define PKEY_DISABLE_WRITE     0x2
+
+u32 pkey_get(int pkey, unsigned long flags)
+{
+	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
+	u32 pkru = __rdpkru();
+	u32 shifted_pkru;
+	u32 masked_pkru;
+
+	dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
+			__func__, pkey, flags, 0, 0);
+	dprintf2("%s() raw pkru: %x\n", __func__, pkru);
+
+	shifted_pkru = (pkru >> (pkey * PKRU_BITS_PER_PKEY));
+	dprintf2("%s() shifted_pkru: %x\n", __func__, shifted_pkru);
+	masked_pkru = shifted_pkru & mask;
+	dprintf2("%s() masked  pkru: %x\n", __func__, masked_pkru);
+	/*
+	 * shift down the relevant bits to the lowest two, then
+	 * mask off all the other high bits.
+	 */
+	return masked_pkru;
+}
+
+int pkey_set(int pkey, unsigned long rights, unsigned long flags)
+{
+	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
+	u32 old_pkru = __rdpkru();
+	u32 new_pkru;
+
+	/* make sure that 'rights' only contains the bits we expect: */
+	assert(!(rights & ~mask));
+
+	/* copy old pkru */
+	new_pkru = old_pkru;
+	/* mask out bits from pkey in old value: */
+	new_pkru &= ~(mask << (pkey * PKRU_BITS_PER_PKEY));
+	/* OR in new bits for pkey: */
+	new_pkru |= (rights << (pkey * PKRU_BITS_PER_PKEY));
+
+	__wrpkru(new_pkru);
+
+	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x pkru now: %x old_pkru: %x\n",
+			__func__, pkey, rights, flags, 0, __rdpkru(), old_pkru);
+	return 0;
+}
+
+void pkey_disable_set(int pkey, int flags)
+{
+	unsigned long syscall_flags = 0;
+	int ret;
+	int pkey_rights;
+	u32 orig_pkru = rdpkru();
+
+	dprintf1("START->%s(%d, 0x%x)\n", __func__,
+		pkey, flags);
+	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+
+	pkey_rights = pkey_get(pkey, syscall_flags);
+
+	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
+			pkey, pkey, pkey_rights);
+	pkey_assert(pkey_rights >= 0);
+
+	pkey_rights |= flags;
+
+	ret = pkey_set(pkey, pkey_rights, syscall_flags);
+	assert(!ret);
+	/*pkru and flags have the same format */
+	shadow_pkru |= flags << (pkey * 2);
+	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkru);
+
+	pkey_assert(ret >= 0);
+
+	pkey_rights = pkey_get(pkey, syscall_flags);
+	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
+			pkey, pkey, pkey_rights);
+
+	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
+	if (flags)
+		pkey_assert(rdpkru() > orig_pkru);
+	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
+		pkey, flags);
+}
+
+void pkey_disable_clear(int pkey, int flags)
+{
+	unsigned long syscall_flags = 0;
+	int ret;
+	int pkey_rights = pkey_get(pkey, syscall_flags);
+	u32 orig_pkru = rdpkru();
+
+	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+
+	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
+			pkey, pkey, pkey_rights);
+	pkey_assert(pkey_rights >= 0);
+
+	pkey_rights &= ~flags;
+
+	ret = pkey_set(pkey, pkey_rights, 0);
+	/* pkru and flags have the same format */
+	shadow_pkru &= ~(flags << (pkey * 2));
+	pkey_assert(ret >= 0);
+
+	pkey_rights = pkey_get(pkey, syscall_flags);
+	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
+			pkey, pkey, pkey_rights);
+
+	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
+	if (flags)
+		assert(rdpkru() <= orig_pkru);
+}
+
+void pkey_write_allow(int pkey)
+{
+	pkey_disable_clear(pkey, PKEY_DISABLE_WRITE);
+}
+void pkey_write_deny(int pkey)
+{
+	pkey_disable_set(pkey, PKEY_DISABLE_WRITE);
+}
+void pkey_access_allow(int pkey)
+{
+	pkey_disable_clear(pkey, PKEY_DISABLE_ACCESS);
+}
+void pkey_access_deny(int pkey)
+{
+	pkey_disable_set(pkey, PKEY_DISABLE_ACCESS);
+}
+
+int sys_mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
+		unsigned long pkey)
+{
+	int sret;
+
+	dprintf2("%s(0x%p, %zx, prot=%lx, pkey=%lx)\n", __func__,
+			ptr, size, orig_prot, pkey);
+
+	errno = 0;
+	sret = syscall(SYS_mprotect_key, ptr, size, orig_prot, pkey);
+	if (errno) {
+		dprintf2("SYS_mprotect_key sret: %d\n", sret);
+		dprintf2("SYS_mprotect_key prot: 0x%lx\n", orig_prot);
+		dprintf2("SYS_mprotect_key failed, errno: %d\n", errno);
+		if (DEBUG_LEVEL >= 2)
+			perror("SYS_mprotect_pkey");
+	}
+	return sret;
+}
+
+int sys_pkey_alloc(unsigned long flags, unsigned long init_val)
+{
+	int ret = syscall(SYS_pkey_alloc, flags, init_val);
+	dprintf1("%s(flags=%lx, init_val=%lx) syscall ret: %d errno: %d\n",
+			__func__, flags, init_val, ret, errno);
+	return ret;
+}
+
+int alloc_pkey(void)
+{
+	int ret;
+	unsigned long init_val = 0x0;
+
+	dprintf1("alloc_pkey()::%d, pkru: 0x%x shadow: %x\n",
+			__LINE__, __rdpkru(), shadow_pkru);
+	ret = sys_pkey_alloc(0, init_val);
+	/*
+	 * pkey_alloc() sets PKRU, so we need to reflect it in
+	 * shadow_pkru:
+	 */
+	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	if (ret) {
+		/* clear both the bits: */
+		shadow_pkru &= ~(0x3      << (ret * 2));
+		dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
+				__LINE__, ret, __rdpkru(), shadow_pkru);
+		/*
+		 * move the new state in from init_val
+		 * (remember, we cheated and init_val == pkru format)
+		 */
+		shadow_pkru |=  (init_val << (ret * 2));
+	}
+	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	dprintf1("alloc_pkey()::%d errno: %d\n", __LINE__, errno);
+	/* for shadow checking: */
+	rdpkru();
+	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	return ret;
+}
+
+int sys_pkey_free(unsigned long pkey)
+{
+	int ret = syscall(SYS_pkey_free, pkey);
+	dprintf1("%s(pkey=%ld) syscall ret: %d\n", __func__, pkey, ret);
+	return ret;
+}
+
+/*
+ * I had a bug where pkey bits could be set by mprotect() but
+ * not cleared.  This ensures we get lots of random bit sets
+ * and clears on the vma and pte pkey bits.
+ */
+int alloc_random_pkey(void)
+{
+	int max_nr_pkey_allocs;
+	int ret;
+	int i;
+	int alloced_pkeys[NR_PKEYS];
+	int nr_alloced = 0;
+	int random_index;
+	memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
+
+	/* allocate every possible key and make a note of which ones we got */
+	max_nr_pkey_allocs = NR_PKEYS;
+	max_nr_pkey_allocs = 1;
+	for (i = 0; i < max_nr_pkey_allocs; i++) {
+		int new_pkey = alloc_pkey();
+		if (new_pkey < 0)
+			break;
+		alloced_pkeys[nr_alloced++] = new_pkey;
+	}
+
+	pkey_assert(nr_alloced > 0);
+	/* select a random one out of the allocated ones */
+	random_index = rand() % nr_alloced;
+	ret = alloced_pkeys[random_index];
+	/* now zero it out so we don't free it next */
+	alloced_pkeys[random_index] = 0;
+
+	/* go through the allocated ones that we did not want and free them */
+	for (i = 0; i < nr_alloced; i++) {
+		int free_ret;
+		if (!alloced_pkeys[i])
+			continue;
+		free_ret = sys_pkey_free(alloced_pkeys[i]);
+		pkey_assert(!free_ret);
+	}
+	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	return ret;
+}
+
+int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
+		unsigned long pkey)
+{
+	int nr_iterations = random() % 100;
+	int ret;
+
+	while (0) {
+		int rpkey = alloc_random_pkey();
+		ret = sys_mprotect_pkey(ptr, size, orig_prot, pkey);
+		dprintf1("sys_mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
+				ptr, size, orig_prot, pkey, ret);
+		if (nr_iterations-- < 0)
+			break;
+
+		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+		sys_pkey_free(rpkey);
+		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	}
+	pkey_assert(pkey < NR_PKEYS);
+
+	ret = sys_mprotect_pkey(ptr, size, orig_prot, pkey);
+	dprintf1("mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
+			ptr, size, orig_prot, pkey, ret);
+	pkey_assert(!ret);
+	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	return ret;
+}
+
+struct pkey_malloc_record {
+	void *ptr;
+	long size;
+};
+struct pkey_malloc_record *pkey_malloc_records;
+long nr_pkey_malloc_records;
+void record_pkey_malloc(void *ptr, long size)
+{
+	long i;
+	struct pkey_malloc_record *rec = NULL;
+
+	for (i = 0; i < nr_pkey_malloc_records; i++) {
+		rec = &pkey_malloc_records[i];
+		/* find a free record */
+		if (rec)
+			break;
+	}
+	if (!rec) {
+		/* every record is full */
+		size_t old_nr_records = nr_pkey_malloc_records;
+		size_t new_nr_records = (nr_pkey_malloc_records * 2 + 1);
+		size_t new_size = new_nr_records * sizeof(struct pkey_malloc_record);
+		dprintf2("new_nr_records: %zd\n", new_nr_records);
+		dprintf2("new_size: %zd\n", new_size);
+		pkey_malloc_records = realloc(pkey_malloc_records, new_size);
+		pkey_assert(pkey_malloc_records != NULL);
+		rec = &pkey_malloc_records[nr_pkey_malloc_records];
+		/*
+		 * realloc() does not initialize memory, so zero it from
+		 * the first new record all the way to the end.
+		 */
+		for (i = 0; i < new_nr_records - old_nr_records; i++)
+			memset(rec + i, 0, sizeof(*rec));
+	}
+	dprintf3("filling malloc record[%d/%p]: {%p, %ld}\n",
+		(int)(rec - pkey_malloc_records), rec, ptr, size);
+	rec->ptr = ptr;
+	rec->size = size;
+	nr_pkey_malloc_records++;
+}
+
+void free_pkey_malloc(void *ptr)
+{
+	long i;
+	int ret;
+	dprintf3("%s(%p)\n", __func__, ptr);
+	for (i = 0; i < nr_pkey_malloc_records; i++) {
+		struct pkey_malloc_record *rec = &pkey_malloc_records[i];
+		dprintf4("looking for ptr %p at record[%ld/%p]: {%p, %ld}\n",
+				ptr, i, rec, rec->ptr, rec->size);
+		if ((ptr <  rec->ptr) ||
+		    (ptr >= rec->ptr + rec->size))
+			continue;
+
+		dprintf3("found ptr %p at record[%ld/%p]: {%p, %ld}\n",
+				ptr, i, rec, rec->ptr, rec->size);
+		nr_pkey_malloc_records--;
+		ret = munmap(rec->ptr, rec->size);
+		dprintf3("munmap ret: %d\n", ret);
+		pkey_assert(!ret);
+		dprintf3("clearing rec->ptr, rec: %p\n", rec);
+		rec->ptr = NULL;
+		dprintf3("done clearing rec->ptr, rec: %p\n", rec);
+		return;
+	}
+	pkey_assert(false);
+}
+
+
+void *malloc_pkey_with_mprotect(long size, int prot, u16 pkey)
+{
+	void *ptr;
+	int ret;
+
+	rdpkru();
+	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
+			size, prot, pkey);
+	pkey_assert(pkey < NR_PKEYS);
+	ptr = mmap(NULL, size, prot, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	pkey_assert(ptr != (void *)-1);
+	ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
+	pkey_assert(!ret);
+	record_pkey_malloc(ptr, size);
+	rdpkru();
+
+	dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
+	return ptr;
+}
+
+void *malloc_pkey_anon_huge(long size, int prot, u16 pkey)
+{
+	int ret;
+	void *ptr;
+
+	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
+			size, prot, pkey);
+	/*
+	 * Guarantee we can fit at least one huge page in the resulting
+	 * allocation by allocating space for 2:
+	 */
+	size = ALIGN_UP(size, HPAGE_SIZE * 2);
+	ptr = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	pkey_assert(ptr != (void *)-1);
+	record_pkey_malloc(ptr, size);
+	mprotect_pkey(ptr, size, prot, pkey);
+
+	dprintf1("unaligned ptr: %p\n", ptr);
+	ptr = ALIGN_PTR_UP(ptr, HPAGE_SIZE);
+	dprintf1("  aligned ptr: %p\n", ptr);
+	ret = madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE);
+	dprintf1("MADV_HUGEPAGE ret: %d\n", ret);
+	ret = madvise(ptr, HPAGE_SIZE, MADV_WILLNEED);
+	dprintf1("MADV_WILLNEED ret: %d\n", ret);
+	memset(ptr, 0, HPAGE_SIZE);
+
+	dprintf1("mmap()'d thp for pkey %d @ %p\n", pkey, ptr);
+	return ptr;
+}
+
+int hugetlb_setup_ok;
+#define GET_NR_HUGE_PAGES 10
+void setup_hugetlbfs(void)
+{
+	int err;
+	int fd;
+	char buf[] = "123";
+
+	if (geteuid() != 0) {
+		fprintf(stderr, "WARNING: not run as root, can not do hugetlb test\n");
+		return;
+	}
+
+	cat_into_file(__stringify(GET_NR_HUGE_PAGES), "/proc/sys/vm/nr_hugepages");
+
+	/*
+	 * Now go make sure that we got the pages and that they
+	 * are 2M pages.  Someone might have made 1G the default.
+	 */
+	fd = open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages", O_RDONLY);
+	if (fd < 0) {
+		perror("opening sysfs 2M hugetlb config");
+		return;
+	}
+
+	/* -1 to guarantee leaving the trailing \0 */
+	err = read(fd, buf, sizeof(buf)-1);
+	close(fd);
+	if (err <= 0) {
+		perror("reading sysfs 2M hugetlb config");
+		return;
+	}
+
+	if (atoi(buf) != GET_NR_HUGE_PAGES) {
+		fprintf(stderr, "could not confirm 2M pages, got: '%s' expected %d\n",
+			buf, GET_NR_HUGE_PAGES);
+		return;
+	}
+
+	hugetlb_setup_ok = 1;
+}
+
+void *malloc_pkey_hugetlb(long size, int prot, u16 pkey)
+{
+	void *ptr;
+	int flags = MAP_ANONYMOUS|MAP_PRIVATE|MAP_HUGETLB;
+
+	if (!hugetlb_setup_ok)
+		return PTR_ERR_ENOTSUP;
+
+	dprintf1("doing %s(%ld, %x, %x)\n", __func__, size, prot, pkey);
+	size = ALIGN_UP(size, HPAGE_SIZE * 2);
+	pkey_assert(pkey < NR_PKEYS);
+	ptr = mmap(NULL, size, PROT_NONE, flags, -1, 0);
+	pkey_assert(ptr != (void *)-1);
+	mprotect_pkey(ptr, size, prot, pkey);
+
+	record_pkey_malloc(ptr, size);
+
+	dprintf1("mmap()'d hugetlbfs for pkey %d @ %p\n", pkey, ptr);
+	return ptr;
+}
+
+void *malloc_pkey_mmap_dax(long size, int prot, u16 pkey)
+{
+	void *ptr;
+	int fd;
+
+	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
+			size, prot, pkey);
+	pkey_assert(pkey < NR_PKEYS);
+	fd = open("/dax/foo", O_RDWR);
+	pkey_assert(fd >= 0);
+
+	ptr = mmap(0, size, prot, MAP_SHARED, fd, 0);
+	pkey_assert(ptr != (void *)-1);
+
+	mprotect_pkey(ptr, size, prot, pkey);
+
+	record_pkey_malloc(ptr, size);
+
+	dprintf1("mmap()'d for pkey %d @ %p\n", pkey, ptr);
+	close(fd);
+	return ptr;
+}
+
+void *(*pkey_malloc[])(long size, int prot, u16 pkey) = {
+
+	malloc_pkey_with_mprotect,
+	malloc_pkey_anon_huge,
+	malloc_pkey_hugetlb
+/* can not do direct with the pkey_mprotect() API:
+	malloc_pkey_mmap_direct,
+	malloc_pkey_mmap_dax,
+*/
+};
+
+void *malloc_pkey(long size, int prot, u16 pkey)
+{
+	void *ret;
+	static int malloc_type;
+	int nr_malloc_types = ARRAY_SIZE(pkey_malloc);
+
+	pkey_assert(pkey < NR_PKEYS);
+
+	while (1) {
+		pkey_assert(malloc_type < nr_malloc_types);
+
+		ret = pkey_malloc[malloc_type](size, prot, pkey);
+		pkey_assert(ret != (void *)-1);
+
+		malloc_type++;
+		if (malloc_type >= nr_malloc_types)
+			malloc_type = (random()%nr_malloc_types);
+
+		/* try again if the malloc_type we tried is unsupported */
+		if (ret == PTR_ERR_ENOTSUP)
+			continue;
+
+		break;
+	}
+
+	dprintf3("%s(%ld, prot=%x, pkey=%x) returning: %p\n", __func__,
+			size, prot, pkey, ret);
+	return ret;
+}
+
+int last_pkru_faults;
+void expected_pk_fault(int pkey)
+{
+	dprintf2("%s(): last_pkru_faults: %d pkru_faults: %d\n",
+			__func__, last_pkru_faults, pkru_faults);
+	dprintf2("%s(%d): last_si_pkey: %d\n", __func__, pkey, last_si_pkey);
+	pkey_assert(last_pkru_faults + 1 == pkru_faults);
+	pkey_assert(last_si_pkey == pkey);
+	/*
+	 * The signal handler should have cleared out PKRU to let the
+	 * test program continue.  We now have to restore it.
+	 */
+	if (__rdpkru() != 0)
+		pkey_assert(0);
+
+	__wrpkru(shadow_pkru);
+	dprintf1("%s() set PKRU=%x to restore state after signal nuked it\n",
+			__func__, shadow_pkru);
+	last_pkru_faults = pkru_faults;
+	last_si_pkey = -1;
+}
+
+void do_not_expect_pk_fault(void)
+{
+	pkey_assert(last_pkru_faults == pkru_faults);
+}
+
+int test_fds[10] = { -1 };
+int nr_test_fds;
+void __save_test_fd(int fd)
+{
+	pkey_assert(fd >= 0);
+	pkey_assert(nr_test_fds < ARRAY_SIZE(test_fds));
+	test_fds[nr_test_fds] = fd;
+	nr_test_fds++;
+}
+
+int get_test_read_fd(void)
+{
+	int test_fd = open("/etc/passwd", O_RDONLY);
+	__save_test_fd(test_fd);
+	return test_fd;
+}
+
+void close_test_fds(void)
+{
+	int i;
+
+	for (i = 0; i < nr_test_fds; i++) {
+		if (test_fds[i] < 0)
+			continue;
+		close(test_fds[i]);
+		test_fds[i] = -1;
+	}
+	nr_test_fds = 0;
+}
+
+#define barrier() __asm__ __volatile__("": : :"memory")
+__attribute__((noinline)) int read_ptr(int *ptr)
+{
+	/*
+	 * Keep GCC from optimizing this away somehow
+	 */
+	barrier();
+	return *ptr;
+}
+
+void test_read_of_write_disabled_region(int *ptr, u16 pkey)
+{
+	int ptr_contents;
+
+	dprintf1("disabling write access to PKEY[%02d], doing read\n", pkey);
+	pkey_write_deny(pkey);
+	ptr_contents = read_ptr(ptr);
+	dprintf1("*ptr: %d\n", ptr_contents);
+	dprintf1("\n");
+}
+void test_read_of_access_disabled_region(int *ptr, u16 pkey)
+{
+	int ptr_contents;
+
+	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n", pkey, ptr);
+	rdpkru();
+	pkey_access_deny(pkey);
+	ptr_contents = read_ptr(ptr);
+	dprintf1("*ptr: %d\n", ptr_contents);
+	expected_pk_fault(pkey);
+}
+void test_write_of_write_disabled_region(int *ptr, u16 pkey)
+{
+	dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
+	pkey_write_deny(pkey);
+	*ptr = __LINE__;
+	expected_pk_fault(pkey);
+}
+void test_write_of_access_disabled_region(int *ptr, u16 pkey)
+{
+	dprintf1("disabling access to PKEY[%02d], doing write\n", pkey);
+	pkey_access_deny(pkey);
+	*ptr = __LINE__;
+	expected_pk_fault(pkey);
+}
+void test_kernel_write_of_access_disabled_region(int *ptr, u16 pkey)
+{
+	int ret;
+	int test_fd = get_test_read_fd();
+
+	dprintf1("disabling access to PKEY[%02d], "
+		 "having kernel read() to buffer\n", pkey);
+	pkey_access_deny(pkey);
+	ret = read(test_fd, ptr, 1);
+	dprintf1("read ret: %d\n", ret);
+	pkey_assert(ret);
+}
+void test_kernel_write_of_write_disabled_region(int *ptr, u16 pkey)
+{
+	int ret;
+	int test_fd = get_test_read_fd();
+
+	pkey_write_deny(pkey);
+	ret = read(test_fd, ptr, 100);
+	dprintf1("read ret: %d\n", ret);
+	if (ret < 0 && (DEBUG_LEVEL > 0))
+		perror("verbose read result (OK for this to be bad)");
+	pkey_assert(ret);
+}
+
+void test_kernel_gup_of_access_disabled_region(int *ptr, u16 pkey)
+{
+	int pipe_ret, vmsplice_ret;
+	struct iovec iov;
+	int pipe_fds[2];
+
+	pipe_ret = pipe(pipe_fds);
+
+	pkey_assert(pipe_ret == 0);
+	dprintf1("disabling access to PKEY[%02d], "
+		 "having kernel vmsplice from buffer\n", pkey);
+	pkey_access_deny(pkey);
+	iov.iov_base = ptr;
+	iov.iov_len = PAGE_SIZE;
+	vmsplice_ret = vmsplice(pipe_fds[1], &iov, 1, SPLICE_F_GIFT);
+	dprintf1("vmsplice() ret: %d\n", vmsplice_ret);
+	pkey_assert(vmsplice_ret == -1);
+
+	close(pipe_fds[0]);
+	close(pipe_fds[1]);
+}
+
+void test_kernel_gup_write_to_write_disabled_region(int *ptr, u16 pkey)
+{
+	int ignored = 0xdada;
+	int futex_ret;
+	int some_int = __LINE__;
+
+	dprintf1("disabling write to PKEY[%02d], "
+		 "doing futex gunk in buffer\n", pkey);
+	*ptr = some_int;
+	pkey_write_deny(pkey);
+	futex_ret = syscall(SYS_futex, ptr, FUTEX_WAIT, some_int-1, NULL,
+			&ignored, ignored);
+	if (DEBUG_LEVEL > 0)
+		perror("futex");
+	dprintf1("futex() ret: %d\n", futex_ret);
+}
+
+/* Assumes that all pkeys other than 'pkey' are unallocated */
+void test_pkey_syscalls_on_non_allocated_pkey(int *ptr, u16 pkey)
+{
+	int err;
+	int i;
+
+	/* Note: 0 is the default pkey, so don't mess with it */
+	for (i = 1; i < NR_PKEYS; i++) {
+		if (pkey == i)
+			continue;
+
+		dprintf1("trying get/set/free to non-allocated pkey: %2d\n", i);
+		err = sys_pkey_free(i);
+		pkey_assert(err);
+
+		err = sys_pkey_free(i);
+		pkey_assert(err);
+
+		err = sys_mprotect_pkey(ptr, PAGE_SIZE, PROT_READ, i);
+		pkey_assert(err);
+	}
+}
+
+/* Assumes that all pkeys other than 'pkey' are unallocated */
+void test_pkey_syscalls_bad_args(int *ptr, u16 pkey)
+{
+	int err;
+	int bad_pkey = NR_PKEYS+99;
+
+	/* pass a known-invalid pkey in: */
+	err = sys_mprotect_pkey(ptr, PAGE_SIZE, PROT_READ, bad_pkey);
+	pkey_assert(err);
+}
+
+/* Assumes that all pkeys other than 'pkey' are unallocated */
+void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
+{
+	int err;
+	int allocated_pkeys[NR_PKEYS] = {0};
+	int nr_allocated_pkeys = 0;
+	int i;
+
+	for (i = 0; i < NR_PKEYS*2; i++) {
+		int new_pkey;
+		dprintf1("%s() alloc loop: %d\n", __func__, i);
+		new_pkey = alloc_pkey();
+		dprintf4("%s()::%d, err: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+				__LINE__, err, __rdpkru(), shadow_pkru);
+		rdpkru(); /* for shadow checking */
+		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
+		if ((new_pkey == -1) && (errno == ENOSPC)) {
+			dprintf2("%s() failed to allocate pkey after %d tries\n",
+				__func__, nr_allocated_pkeys);
+			break;
+		}
+		pkey_assert(nr_allocated_pkeys < NR_PKEYS);
+		allocated_pkeys[nr_allocated_pkeys++] = new_pkey;
+	}
+
+	dprintf3("%s()::%d\n", __func__, __LINE__);
+
+	/*
+	 * ensure it did not reach the end of the loop without
+	 * failure:
+	 */
+	pkey_assert(i < NR_PKEYS*2);
+
+	/*
+	 * There are 16 pkeys supported in hardware.  One is taken
+	 * up for the default (0) and another can be taken up by
+	 * an execute-only mapping.  Ensure that we can allocate
+	 * at least 14 (16-2).
+	 */
+	pkey_assert(i >= NR_PKEYS-2);
+
+	for (i = 0; i < nr_allocated_pkeys; i++) {
+		err = sys_pkey_free(allocated_pkeys[i]);
+		pkey_assert(!err);
+		rdpkru(); /* for shadow checking */
+	}
+}
+
+void test_ptrace_of_child(int *ptr, u16 pkey)
+{
+	__attribute__((__unused__)) int peek_result;
+	pid_t child_pid;
+	void *ignored = 0;
+	long ret;
+	int status;
+	/*
+	 * This is the "control" for our little experiment.  Make sure
+	 * we can always access it when ptracing.
+	 */
+	int *plain_ptr_unaligned = malloc(HPAGE_SIZE);
+	int *plain_ptr = ALIGN_PTR_UP(plain_ptr_unaligned, PAGE_SIZE);
+
+	/*
+	 * Fork a child which is an exact copy of this process, of course.
+	 * That means we can do all of our tests via ptrace() and then plain
+	 * memory access and ensure they work differently.
+	 */
+	child_pid = fork_lazy_child();
+	dprintf1("[%d] child pid: %d\n", getpid(), child_pid);
+
+	ret = ptrace(PTRACE_ATTACH, child_pid, ignored, ignored);
+	if (ret)
+		perror("attach");
+	dprintf1("[%d] attach ret: %ld %d\n", getpid(), ret, __LINE__);
+	pkey_assert(ret != -1);
+	ret = waitpid(child_pid, &status, WUNTRACED);
+	if ((ret != child_pid) || !(WIFSTOPPED(status))) {
+		fprintf(stderr, "weird waitpid result %ld stat %x\n",
+				ret, status);
+		pkey_assert(0);
+	}
+	dprintf2("waitpid ret: %ld\n", ret);
+	dprintf2("waitpid status: %d\n", status);
+
+	pkey_access_deny(pkey);
+	pkey_write_deny(pkey);
+
+	/* Write access, untested for now:
+	ret = ptrace(PTRACE_POKEDATA, child_pid, peek_at, data);
+	pkey_assert(ret != -1);
+	dprintf1("poke at %p: %ld\n", peek_at, ret);
+	*/
+
+	/*
+	 * Try to access the pkey-protected "ptr" via ptrace:
+	 */
+	ret = ptrace(PTRACE_PEEKDATA, child_pid, ptr, ignored);
+	/* expect it to work, without an error: */
+	pkey_assert(ret != -1);
+	/* Now access from the current task, and expect an exception: */
+	peek_result = read_ptr(ptr);
+	expected_pk_fault(pkey);
+
+	/*
+	 * Try to access the NON-pkey-protected "plain_ptr" via ptrace:
+	 */
+	ret = ptrace(PTRACE_PEEKDATA, child_pid, plain_ptr, ignored);
+	/* expect it to work, without an error: */
+	pkey_assert(ret != -1);
+	/* Now access from the current task, and expect NO exception: */
+	peek_result = read_ptr(plain_ptr);
+	do_not_expect_pk_fault();
+
+	ret = ptrace(PTRACE_DETACH, child_pid, ignored, 0);
+	pkey_assert(ret != -1);
+
+	ret = kill(child_pid, SIGKILL);
+	pkey_assert(ret != -1);
+
+	wait(&status);
+
+	free(plain_ptr_unaligned);
+}
+
+void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
+{
+	void *p1;
+	int scratch;
+	int ptr_contents;
+	int ret;
+
+	p1 = ALIGN_PTR_UP(&lots_o_noops_around_write, PAGE_SIZE);
+	dprintf3("&lots_o_noops: %p\n", &lots_o_noops_around_write);
+	/* lots_o_noops_around_write should be page-aligned already */
+	assert(p1 == &lots_o_noops_around_write);
+
+	/* Point 'p1' at the *second* page of the function: */
+	p1 += PAGE_SIZE;
+
+	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
+	lots_o_noops_around_write(&scratch);
+	ptr_contents = read_ptr(p1);
+	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
+
+	ret = mprotect_pkey(p1, PAGE_SIZE, PROT_EXEC, (u64)pkey);
+	pkey_assert(!ret);
+	pkey_access_deny(pkey);
+
+	dprintf2("pkru: %x\n", rdpkru());
+
+	/*
+	 * Make sure this is an *instruction* fault
+	 */
+	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
+	lots_o_noops_around_write(&scratch);
+	do_not_expect_pk_fault();
+	ptr_contents = read_ptr(p1);
+	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
+	expected_pk_fault(pkey);
+}
+
+void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
+{
+	int size = PAGE_SIZE;
+	int sret;
+
+	if (cpu_has_pku()) {
+		dprintf1("SKIP: %s: CPU supports pkeys\n", __func__);
+		return;
+	}
+
+	sret = syscall(SYS_mprotect_key, ptr, size, PROT_READ, pkey);
+	pkey_assert(sret < 0);
+}
+
+void (*pkey_tests[])(int *ptr, u16 pkey) = {
+	test_read_of_write_disabled_region,
+	test_read_of_access_disabled_region,
+	test_write_of_write_disabled_region,
+	test_write_of_access_disabled_region,
+	test_kernel_write_of_access_disabled_region,
+	test_kernel_write_of_write_disabled_region,
+	test_kernel_gup_of_access_disabled_region,
+	test_kernel_gup_write_to_write_disabled_region,
+	test_executing_on_unreadable_memory,
+	test_ptrace_of_child,
+	test_pkey_syscalls_on_non_allocated_pkey,
+	test_pkey_syscalls_bad_args,
+	test_pkey_alloc_exhaust,
+};
+
+void run_tests_once(void)
+{
+	int *ptr;
+	int prot = PROT_READ|PROT_WRITE;
+
+	for (test_nr = 0; test_nr < ARRAY_SIZE(pkey_tests); test_nr++) {
+		int pkey;
+		int orig_pkru_faults = pkru_faults;
+
+		dprintf1("======================\n");
+		dprintf1("test %d preparing...\n", test_nr);
+
+		tracing_on();
+		pkey = alloc_random_pkey();
+		dprintf1("test %d starting with pkey: %d\n", test_nr, pkey);
+		ptr = malloc_pkey(PAGE_SIZE, prot, pkey);
+		dprintf1("test %d starting...\n", test_nr);
+		pkey_tests[test_nr](ptr, pkey);
+		dprintf1("freeing test memory: %p\n", ptr);
+		free_pkey_malloc(ptr);
+		sys_pkey_free(pkey);
+
+		dprintf1("pkru_faults: %d\n", pkru_faults);
+		dprintf1("orig_pkru_faults: %d\n", orig_pkru_faults);
+
+		tracing_off();
+		close_test_fds();
+
+		printf("test %2d PASSED (iteration %d)\n", test_nr, iteration_nr);
+		dprintf1("======================\n\n");
+	}
+	iteration_nr++;
+}
+
+void pkey_setup_shadow(void)
+{
+	shadow_pkru = __rdpkru();
+}
+
+int main(void)
+{
+	int nr_iterations = 22;
+
+	setup_handlers();
+
+	printf("has pku: %d\n", cpu_has_pku());
+
+	if (!cpu_has_pku()) {
+		int size = PAGE_SIZE;
+		int *ptr;
+
+		printf("running PKEY tests for unsupported CPU/OS\n");
+
+		ptr  = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+		assert(ptr != (void *)-1);
+		test_mprotect_pkey_on_unsupported_cpu(ptr, 1);
+		exit(0);
+	}
+
+	pkey_setup_shadow();
+	printf("startup pkru: %x\n", rdpkru());
+	setup_hugetlbfs();
+
+	while (nr_iterations-- > 0)
+		run_tests_once();
+
+	printf("done (all tests OK)\n");
+	return 0;
+}
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 7b1adee..9687501 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -7,7 +7,7 @@ include ../lib.mk
 
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall test_mremap_vdso \
 			check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test ioperm \
-			protection_keys test_vdso
+			test_vdso
 TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
diff --git a/tools/testing/selftests/x86/pkey-helpers.h b/tools/testing/selftests/x86/pkey-helpers.h
deleted file mode 100644
index 3818f25..0000000
--- a/tools/testing/selftests/x86/pkey-helpers.h
+++ /dev/null
@@ -1,220 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _PKEYS_HELPER_H
-#define _PKEYS_HELPER_H
-#define _GNU_SOURCE
-#include <string.h>
-#include <stdarg.h>
-#include <stdio.h>
-#include <stdint.h>
-#include <stdbool.h>
-#include <signal.h>
-#include <assert.h>
-#include <stdlib.h>
-#include <ucontext.h>
-#include <sys/mman.h>
-
-#define NR_PKEYS 16
-#define PKRU_BITS_PER_PKEY 2
-
-#ifndef DEBUG_LEVEL
-#define DEBUG_LEVEL 0
-#endif
-#define DPRINT_IN_SIGNAL_BUF_SIZE 4096
-extern int dprint_in_signal;
-extern char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
-static inline void sigsafe_printf(const char *format, ...)
-{
-	va_list ap;
-
-	va_start(ap, format);
-	if (!dprint_in_signal) {
-		vprintf(format, ap);
-	} else {
-		int len = vsnprintf(dprint_in_signal_buffer,
-				    DPRINT_IN_SIGNAL_BUF_SIZE,
-				    format, ap);
-		/*
-		 * len is amount that would have been printed,
-		 * but actual write is truncated at BUF_SIZE.
-		 */
-		if (len > DPRINT_IN_SIGNAL_BUF_SIZE)
-			len = DPRINT_IN_SIGNAL_BUF_SIZE;
-		write(1, dprint_in_signal_buffer, len);
-	}
-	va_end(ap);
-}
-#define dprintf_level(level, args...) do {	\
-	if (level <= DEBUG_LEVEL)		\
-		sigsafe_printf(args);		\
-	fflush(NULL);				\
-} while (0)
-#define dprintf0(args...) dprintf_level(0, args)
-#define dprintf1(args...) dprintf_level(1, args)
-#define dprintf2(args...) dprintf_level(2, args)
-#define dprintf3(args...) dprintf_level(3, args)
-#define dprintf4(args...) dprintf_level(4, args)
-
-extern unsigned int shadow_pkru;
-static inline unsigned int __rdpkru(void)
-{
-	unsigned int eax, edx;
-	unsigned int ecx = 0;
-	unsigned int pkru;
-
-	asm volatile(".byte 0x0f,0x01,0xee\n\t"
-		     : "=a" (eax), "=d" (edx)
-		     : "c" (ecx));
-	pkru = eax;
-	return pkru;
-}
-
-static inline unsigned int _rdpkru(int line)
-{
-	unsigned int pkru = __rdpkru();
-
-	dprintf4("rdpkru(line=%d) pkru: %x shadow: %x\n",
-			line, pkru, shadow_pkru);
-	assert(pkru == shadow_pkru);
-
-	return pkru;
-}
-
-#define rdpkru() _rdpkru(__LINE__)
-
-static inline void __wrpkru(unsigned int pkru)
-{
-	unsigned int eax = pkru;
-	unsigned int ecx = 0;
-	unsigned int edx = 0;
-
-	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
-	asm volatile(".byte 0x0f,0x01,0xef\n\t"
-		     : : "a" (eax), "c" (ecx), "d" (edx));
-	assert(pkru == __rdpkru());
-}
-
-static inline void wrpkru(unsigned int pkru)
-{
-	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
-	/* will do the shadow check for us: */
-	rdpkru();
-	__wrpkru(pkru);
-	shadow_pkru = pkru;
-	dprintf4("%s(%08x) pkru: %08x\n", __func__, pkru, __rdpkru());
-}
-
-/*
- * These are technically racy. since something could
- * change PKRU between the read and the write.
- */
-static inline void __pkey_access_allow(int pkey, int do_allow)
-{
-	unsigned int pkru = rdpkru();
-	int bit = pkey * 2;
-
-	if (do_allow)
-		pkru &= (1<<bit);
-	else
-		pkru |= (1<<bit);
-
-	dprintf4("pkru now: %08x\n", rdpkru());
-	wrpkru(pkru);
-}
-
-static inline void __pkey_write_allow(int pkey, int do_allow_write)
-{
-	long pkru = rdpkru();
-	int bit = pkey * 2 + 1;
-
-	if (do_allow_write)
-		pkru &= (1<<bit);
-	else
-		pkru |= (1<<bit);
-
-	wrpkru(pkru);
-	dprintf4("pkru now: %08x\n", rdpkru());
-}
-
-#define PROT_PKEY0     0x10            /* protection key value (bit 0) */
-#define PROT_PKEY1     0x20            /* protection key value (bit 1) */
-#define PROT_PKEY2     0x40            /* protection key value (bit 2) */
-#define PROT_PKEY3     0x80            /* protection key value (bit 3) */
-
-#define PAGE_SIZE 4096
-#define MB	(1<<20)
-
-static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
-		unsigned int *ecx, unsigned int *edx)
-{
-	/* ecx is often an input as well as an output. */
-	asm volatile(
-		"cpuid;"
-		: "=a" (*eax),
-		  "=b" (*ebx),
-		  "=c" (*ecx),
-		  "=d" (*edx)
-		: "0" (*eax), "2" (*ecx));
-}
-
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
-#define X86_FEATURE_PKU        (1<<3) /* Protection Keys for Userspace */
-#define X86_FEATURE_OSPKE      (1<<4) /* OS Protection Keys Enable */
-
-static inline int cpu_has_pku(void)
-{
-	unsigned int eax;
-	unsigned int ebx;
-	unsigned int ecx;
-	unsigned int edx;
-
-	eax = 0x7;
-	ecx = 0x0;
-	__cpuid(&eax, &ebx, &ecx, &edx);
-
-	if (!(ecx & X86_FEATURE_PKU)) {
-		dprintf2("cpu does not have PKU\n");
-		return 0;
-	}
-	if (!(ecx & X86_FEATURE_OSPKE)) {
-		dprintf2("cpu does not have OSPKE\n");
-		return 0;
-	}
-	return 1;
-}
-
-#define XSTATE_PKRU_BIT	(9)
-#define XSTATE_PKRU	0x200
-
-int pkru_xstate_offset(void)
-{
-	unsigned int eax;
-	unsigned int ebx;
-	unsigned int ecx;
-	unsigned int edx;
-	int xstate_offset;
-	int xstate_size;
-	unsigned long XSTATE_CPUID = 0xd;
-	int leaf;
-
-	/* assume that XSTATE_PKRU is set in XCR0 */
-	leaf = XSTATE_PKRU_BIT;
-	{
-		eax = XSTATE_CPUID;
-		ecx = leaf;
-		__cpuid(&eax, &ebx, &ecx, &edx);
-
-		if (leaf == XSTATE_PKRU_BIT) {
-			xstate_offset = ebx;
-			xstate_size = eax;
-		}
-	}
-
-	if (xstate_size == 0) {
-		printf("could not find size/offset of PKRU in xsave state\n");
-		return 0;
-	}
-
-	return xstate_offset;
-}
-
-#endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/x86/protection_keys.c b/tools/testing/selftests/x86/protection_keys.c
deleted file mode 100644
index 555e43c..0000000
--- a/tools/testing/selftests/x86/protection_keys.c
+++ /dev/null
@@ -1,1395 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Tests x86 Memory Protection Keys (see Documentation/x86/protection-keys.txt)
- *
- * There are examples in here of:
- *  * how to set protection keys on memory
- *  * how to set/clear bits in PKRU (the rights register)
- *  * how to handle SEGV_PKRU signals and extract pkey-relevant
- *    information from the siginfo
- *
- * Things to add:
- *	make sure KSM and KSM COW breaking works
- *	prefault pages in at malloc, or not
- *	protect MPX bounds tables with protection keys?
- *	make sure VMA splitting/merging is working correctly
- *	OOMs can destroy mm->mmap (see exit_mmap()), so make sure it is immune to pkeys
- *	look for pkey "leaks" where it is still set on a VMA but "freed" back to the kernel
- *	do a plain mprotect() to a mprotect_pkey() area and make sure the pkey sticks
- *
- * Compile like this:
- *	gcc      -o protection_keys    -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
- *	gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
- */
-#define _GNU_SOURCE
-#include <errno.h>
-#include <linux/futex.h>
-#include <sys/time.h>
-#include <sys/syscall.h>
-#include <string.h>
-#include <stdio.h>
-#include <stdint.h>
-#include <stdbool.h>
-#include <signal.h>
-#include <assert.h>
-#include <stdlib.h>
-#include <ucontext.h>
-#include <sys/mman.h>
-#include <sys/types.h>
-#include <sys/wait.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include <sys/ptrace.h>
-#include <setjmp.h>
-
-#include "pkey-helpers.h"
-
-int iteration_nr = 1;
-int test_nr;
-
-unsigned int shadow_pkru;
-
-#define HPAGE_SIZE	(1UL<<21)
-#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
-#define ALIGN_UP(x, align_to)	(((x) + ((align_to)-1)) & ~((align_to)-1))
-#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
-#define ALIGN_PTR_UP(p, ptr_align_to)	((typeof(p))ALIGN_UP((unsigned long)(p),	ptr_align_to))
-#define ALIGN_PTR_DOWN(p, ptr_align_to)	((typeof(p))ALIGN_DOWN((unsigned long)(p),	ptr_align_to))
-#define __stringify_1(x...)     #x
-#define __stringify(x...)       __stringify_1(x)
-
-#define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
-
-int dprint_in_signal;
-char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
-
-extern void abort_hooks(void);
-#define pkey_assert(condition) do {		\
-	if (!(condition)) {			\
-		dprintf0("assert() at %s::%d test_nr: %d iteration: %d\n", \
-				__FILE__, __LINE__,	\
-				test_nr, iteration_nr);	\
-		dprintf0("errno at assert: %d", errno);	\
-		abort_hooks();			\
-		assert(condition);		\
-	}					\
-} while (0)
-#define raw_assert(cond) assert(cond)
-
-void cat_into_file(char *str, char *file)
-{
-	int fd = open(file, O_RDWR);
-	int ret;
-
-	dprintf2("%s(): writing '%s' to '%s'\n", __func__, str, file);
-	/*
-	 * these need to be raw because they are called under
-	 * pkey_assert()
-	 */
-	raw_assert(fd >= 0);
-	ret = write(fd, str, strlen(str));
-	if (ret != strlen(str)) {
-		perror("write to file failed");
-		fprintf(stderr, "filename: '%s' str: '%s'\n", file, str);
-		raw_assert(0);
-	}
-	close(fd);
-}
-
-#if CONTROL_TRACING > 0
-static int warned_tracing;
-int tracing_root_ok(void)
-{
-	if (geteuid() != 0) {
-		if (!warned_tracing)
-			fprintf(stderr, "WARNING: not run as root, "
-					"can not do tracing control\n");
-		warned_tracing = 1;
-		return 0;
-	}
-	return 1;
-}
-#endif
-
-void tracing_on(void)
-{
-#if CONTROL_TRACING > 0
-#define TRACEDIR "/sys/kernel/debug/tracing"
-	char pidstr[32];
-
-	if (!tracing_root_ok())
-		return;
-
-	sprintf(pidstr, "%d", getpid());
-	cat_into_file("0", TRACEDIR "/tracing_on");
-	cat_into_file("\n", TRACEDIR "/trace");
-	if (1) {
-		cat_into_file("function_graph", TRACEDIR "/current_tracer");
-		cat_into_file("1", TRACEDIR "/options/funcgraph-proc");
-	} else {
-		cat_into_file("nop", TRACEDIR "/current_tracer");
-	}
-	cat_into_file(pidstr, TRACEDIR "/set_ftrace_pid");
-	cat_into_file("1", TRACEDIR "/tracing_on");
-	dprintf1("enabled tracing\n");
-#endif
-}
-
-void tracing_off(void)
-{
-#if CONTROL_TRACING > 0
-	if (!tracing_root_ok())
-		return;
-	cat_into_file("0", "/sys/kernel/debug/tracing/tracing_on");
-#endif
-}
-
-void abort_hooks(void)
-{
-	fprintf(stderr, "running %s()...\n", __func__);
-	tracing_off();
-#ifdef SLEEP_ON_ABORT
-	sleep(SLEEP_ON_ABORT);
-#endif
-}
-
-static inline void __page_o_noops(void)
-{
-	/* 8-bytes of instruction * 512 bytes = 1 page */
-	asm(".rept 512 ; nopl 0x7eeeeeee(%eax) ; .endr");
-}
-
-/*
- * This attempts to have roughly a page of instructions followed by a few
- * instructions that do a write, and another page of instructions.  That
- * way, we are pretty sure that the write is in the second page of
- * instructions and has at least a page of padding behind it.
- *
- * *That* lets us be sure to madvise() away the write instruction, which
- * will then fault, which makes sure that the fault code handles
- * execute-only memory properly.
- */
-__attribute__((__aligned__(PAGE_SIZE)))
-void lots_o_noops_around_write(int *write_to_me)
-{
-	dprintf3("running %s()\n", __func__);
-	__page_o_noops();
-	/* Assume this happens in the second page of instructions: */
-	*write_to_me = __LINE__;
-	/* pad out by another page: */
-	__page_o_noops();
-	dprintf3("%s() done\n", __func__);
-}
-
-/* Define some kernel-like types */
-#define  u8 uint8_t
-#define u16 uint16_t
-#define u32 uint32_t
-#define u64 uint64_t
-
-#ifdef __i386__
-#define SYS_mprotect_key 380
-#define SYS_pkey_alloc	 381
-#define SYS_pkey_free	 382
-#define REG_IP_IDX REG_EIP
-#define si_pkey_offset 0x14
-#else
-#define SYS_mprotect_key 329
-#define SYS_pkey_alloc	 330
-#define SYS_pkey_free	 331
-#define REG_IP_IDX REG_RIP
-#define si_pkey_offset 0x20
-#endif
-
-void dump_mem(void *dumpme, int len_bytes)
-{
-	char *c = (void *)dumpme;
-	int i;
-
-	for (i = 0; i < len_bytes; i += sizeof(u64)) {
-		u64 *ptr = (u64 *)(c + i);
-		dprintf1("dump[%03d][@%p]: %016jx\n", i, ptr, *ptr);
-	}
-}
-
-#define SEGV_BNDERR     3  /* failed address bound checks */
-#define SEGV_PKUERR     4
-
-static char *si_code_str(int si_code)
-{
-	if (si_code == SEGV_MAPERR)
-		return "SEGV_MAPERR";
-	if (si_code == SEGV_ACCERR)
-		return "SEGV_ACCERR";
-	if (si_code == SEGV_BNDERR)
-		return "SEGV_BNDERR";
-	if (si_code == SEGV_PKUERR)
-		return "SEGV_PKUERR";
-	return "UNKNOWN";
-}
-
-int pkru_faults;
-int last_si_pkey = -1;
-void signal_handler(int signum, siginfo_t *si, void *vucontext)
-{
-	ucontext_t *uctxt = vucontext;
-	int trapno;
-	unsigned long ip;
-	char *fpregs;
-	u32 *pkru_ptr;
-	u64 si_pkey;
-	u32 *si_pkey_ptr;
-	int pkru_offset;
-	fpregset_t fpregset;
-
-	dprint_in_signal = 1;
-	dprintf1(">>>>===============SIGSEGV============================\n");
-	dprintf1("%s()::%d, pkru: 0x%x shadow: %x\n", __func__, __LINE__,
-			__rdpkru(), shadow_pkru);
-
-	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
-	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
-	fpregset = uctxt->uc_mcontext.fpregs;
-	fpregs = (void *)fpregset;
-
-	dprintf2("%s() trapno: %d ip: 0x%lx info->si_code: %s/%d\n", __func__,
-			trapno, ip, si_code_str(si->si_code), si->si_code);
-#ifdef __i386__
-	/*
-	 * 32-bit has some extra padding so that userspace can tell whether
-	 * the XSTATE header is present in addition to the "legacy" FPU
-	 * state.  We just assume that it is here.
-	 */
-	fpregs += 0x70;
-#endif
-	pkru_offset = pkru_xstate_offset();
-	pkru_ptr = (void *)(&fpregs[pkru_offset]);
-
-	dprintf1("siginfo: %p\n", si);
-	dprintf1(" fpregs: %p\n", fpregs);
-	/*
-	 * If we got a PKRU fault, we *HAVE* to have at least one bit set in
-	 * here.
-	 */
-	dprintf1("pkru_xstate_offset: %d\n", pkru_xstate_offset());
-	if (DEBUG_LEVEL > 4)
-		dump_mem(pkru_ptr - 128, 256);
-	pkey_assert(*pkru_ptr);
-
-	si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
-	dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
-	dump_mem(si_pkey_ptr - 8, 24);
-	si_pkey = *si_pkey_ptr;
-	pkey_assert(si_pkey < NR_PKEYS);
-	last_si_pkey = si_pkey;
-
-	if ((si->si_code == SEGV_MAPERR) ||
-	    (si->si_code == SEGV_ACCERR) ||
-	    (si->si_code == SEGV_BNDERR)) {
-		printf("non-PK si_code, exiting...\n");
-		exit(4);
-	}
-
-	dprintf1("signal pkru from xsave: %08x\n", *pkru_ptr);
-	/* need __rdpkru() version so we do not do shadow_pkru checking */
-	dprintf1("signal pkru from  pkru: %08x\n", __rdpkru());
-	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
-	*(u64 *)pkru_ptr = 0x00000000;
-	dprintf1("WARNING: set PRKU=0 to allow faulting instruction to continue\n");
-	pkru_faults++;
-	dprintf1("<<<<==================================================\n");
-	return;
-	if (trapno == 14) {
-		fprintf(stderr,
-			"ERROR: In signal handler, page fault, trapno = %d, ip = %016lx\n",
-			trapno, ip);
-		fprintf(stderr, "si_addr %p\n", si->si_addr);
-		fprintf(stderr, "REG_ERR: %lx\n",
-				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
-		exit(1);
-	} else {
-		fprintf(stderr, "unexpected trap %d! at 0x%lx\n", trapno, ip);
-		fprintf(stderr, "si_addr %p\n", si->si_addr);
-		fprintf(stderr, "REG_ERR: %lx\n",
-				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
-		exit(2);
-	}
-	dprint_in_signal = 0;
-}
-
-int wait_all_children(void)
-{
-	int status;
-	return waitpid(-1, &status, 0);
-}
-
-void sig_chld(int x)
-{
-	dprint_in_signal = 1;
-	dprintf2("[%d] SIGCHLD: %d\n", getpid(), x);
-	dprint_in_signal = 0;
-}
-
-void setup_sigsegv_handler(void)
-{
-	int r, rs;
-	struct sigaction newact;
-	struct sigaction oldact;
-
-	/* #PF is mapped to sigsegv */
-	int signum  = SIGSEGV;
-
-	newact.sa_handler = 0;
-	newact.sa_sigaction = signal_handler;
-
-	/*sigset_t - signals to block while in the handler */
-	/* get the old signal mask. */
-	rs = sigprocmask(SIG_SETMASK, 0, &newact.sa_mask);
-	pkey_assert(rs == 0);
-
-	/* call sa_sigaction, not sa_handler*/
-	newact.sa_flags = SA_SIGINFO;
-
-	newact.sa_restorer = 0;  /* void(*)(), obsolete */
-	r = sigaction(signum, &newact, &oldact);
-	r = sigaction(SIGALRM, &newact, &oldact);
-	pkey_assert(r == 0);
-}
-
-void setup_handlers(void)
-{
-	signal(SIGCHLD, &sig_chld);
-	setup_sigsegv_handler();
-}
-
-pid_t fork_lazy_child(void)
-{
-	pid_t forkret;
-
-	forkret = fork();
-	pkey_assert(forkret >= 0);
-	dprintf3("[%d] fork() ret: %d\n", getpid(), forkret);
-
-	if (!forkret) {
-		/* in the child */
-		while (1) {
-			dprintf1("child sleeping...\n");
-			sleep(30);
-		}
-	}
-	return forkret;
-}
-
-void davecmp(void *_a, void *_b, int len)
-{
-	int i;
-	unsigned long *a = _a;
-	unsigned long *b = _b;
-
-	for (i = 0; i < len / sizeof(*a); i++) {
-		if (a[i] == b[i])
-			continue;
-
-		dprintf3("[%3d]: a: %016lx b: %016lx\n", i, a[i], b[i]);
-	}
-}
-
-void dumpit(char *f)
-{
-	int fd = open(f, O_RDONLY);
-	char buf[100];
-	int nr_read;
-
-	dprintf2("maps fd: %d\n", fd);
-	do {
-		nr_read = read(fd, &buf[0], sizeof(buf));
-		write(1, buf, nr_read);
-	} while (nr_read > 0);
-	close(fd);
-}
-
-#define PKEY_DISABLE_ACCESS    0x1
-#define PKEY_DISABLE_WRITE     0x2
-
-u32 pkey_get(int pkey, unsigned long flags)
-{
-	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 pkru = __rdpkru();
-	u32 shifted_pkru;
-	u32 masked_pkru;
-
-	dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
-			__func__, pkey, flags, 0, 0);
-	dprintf2("%s() raw pkru: %x\n", __func__, pkru);
-
-	shifted_pkru = (pkru >> (pkey * PKRU_BITS_PER_PKEY));
-	dprintf2("%s() shifted_pkru: %x\n", __func__, shifted_pkru);
-	masked_pkru = shifted_pkru & mask;
-	dprintf2("%s() masked  pkru: %x\n", __func__, masked_pkru);
-	/*
-	 * shift down the relevant bits to the lowest two, then
-	 * mask off all the other high bits.
-	 */
-	return masked_pkru;
-}
-
-int pkey_set(int pkey, unsigned long rights, unsigned long flags)
-{
-	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 old_pkru = __rdpkru();
-	u32 new_pkru;
-
-	/* make sure that 'rights' only contains the bits we expect: */
-	assert(!(rights & ~mask));
-
-	/* copy old pkru */
-	new_pkru = old_pkru;
-	/* mask out bits from pkey in old value: */
-	new_pkru &= ~(mask << (pkey * PKRU_BITS_PER_PKEY));
-	/* OR in new bits for pkey: */
-	new_pkru |= (rights << (pkey * PKRU_BITS_PER_PKEY));
-
-	__wrpkru(new_pkru);
-
-	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x pkru now: %x old_pkru: %x\n",
-			__func__, pkey, rights, flags, 0, __rdpkru(), old_pkru);
-	return 0;
-}
-
-void pkey_disable_set(int pkey, int flags)
-{
-	unsigned long syscall_flags = 0;
-	int ret;
-	int pkey_rights;
-	u32 orig_pkru = rdpkru();
-
-	dprintf1("START->%s(%d, 0x%x)\n", __func__,
-		pkey, flags);
-	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
-
-	pkey_rights = pkey_get(pkey, syscall_flags);
-
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
-	pkey_assert(pkey_rights >= 0);
-
-	pkey_rights |= flags;
-
-	ret = pkey_set(pkey, pkey_rights, syscall_flags);
-	assert(!ret);
-	/*pkru and flags have the same format */
-	shadow_pkru |= flags << (pkey * 2);
-	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkru);
-
-	pkey_assert(ret >= 0);
-
-	pkey_rights = pkey_get(pkey, syscall_flags);
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
-
-	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
-	if (flags)
-		pkey_assert(rdpkru() > orig_pkru);
-	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
-		pkey, flags);
-}
-
-void pkey_disable_clear(int pkey, int flags)
-{
-	unsigned long syscall_flags = 0;
-	int ret;
-	int pkey_rights = pkey_get(pkey, syscall_flags);
-	u32 orig_pkru = rdpkru();
-
-	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
-
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
-	pkey_assert(pkey_rights >= 0);
-
-	pkey_rights |= flags;
-
-	ret = pkey_set(pkey, pkey_rights, 0);
-	/* pkru and flags have the same format */
-	shadow_pkru &= ~(flags << (pkey * 2));
-	pkey_assert(ret >= 0);
-
-	pkey_rights = pkey_get(pkey, syscall_flags);
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
-
-	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
-	if (flags)
-		assert(rdpkru() > orig_pkru);
-}
-
-void pkey_write_allow(int pkey)
-{
-	pkey_disable_clear(pkey, PKEY_DISABLE_WRITE);
-}
-void pkey_write_deny(int pkey)
-{
-	pkey_disable_set(pkey, PKEY_DISABLE_WRITE);
-}
-void pkey_access_allow(int pkey)
-{
-	pkey_disable_clear(pkey, PKEY_DISABLE_ACCESS);
-}
-void pkey_access_deny(int pkey)
-{
-	pkey_disable_set(pkey, PKEY_DISABLE_ACCESS);
-}
-
-int sys_mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
-		unsigned long pkey)
-{
-	int sret;
-
-	dprintf2("%s(0x%p, %zx, prot=%lx, pkey=%lx)\n", __func__,
-			ptr, size, orig_prot, pkey);
-
-	errno = 0;
-	sret = syscall(SYS_mprotect_key, ptr, size, orig_prot, pkey);
-	if (errno) {
-		dprintf2("SYS_mprotect_key sret: %d\n", sret);
-		dprintf2("SYS_mprotect_key prot: 0x%lx\n", orig_prot);
-		dprintf2("SYS_mprotect_key failed, errno: %d\n", errno);
-		if (DEBUG_LEVEL >= 2)
-			perror("SYS_mprotect_pkey");
-	}
-	return sret;
-}
-
-int sys_pkey_alloc(unsigned long flags, unsigned long init_val)
-{
-	int ret = syscall(SYS_pkey_alloc, flags, init_val);
-	dprintf1("%s(flags=%lx, init_val=%lx) syscall ret: %d errno: %d\n",
-			__func__, flags, init_val, ret, errno);
-	return ret;
-}
-
-int alloc_pkey(void)
-{
-	int ret;
-	unsigned long init_val = 0x0;
-
-	dprintf1("alloc_pkey()::%d, pkru: 0x%x shadow: %x\n",
-			__LINE__, __rdpkru(), shadow_pkru);
-	ret = sys_pkey_alloc(0, init_val);
-	/*
-	 * pkey_alloc() sets PKRU, so we need to reflect it in
-	 * shadow_pkru:
-	 */
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	if (ret) {
-		/* clear both the bits: */
-		shadow_pkru &= ~(0x3      << (ret * 2));
-		dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-				__LINE__, ret, __rdpkru(), shadow_pkru);
-		/*
-		 * move the new state in from init_val
-		 * (remember, we cheated and init_val == pkru format)
-		 */
-		shadow_pkru |=  (init_val << (ret * 2));
-	}
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	dprintf1("alloc_pkey()::%d errno: %d\n", __LINE__, errno);
-	/* for shadow checking: */
-	rdpkru();
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	return ret;
-}
-
-int sys_pkey_free(unsigned long pkey)
-{
-	int ret = syscall(SYS_pkey_free, pkey);
-	dprintf1("%s(pkey=%ld) syscall ret: %d\n", __func__, pkey, ret);
-	return ret;
-}
-
-/*
- * I had a bug where pkey bits could be set by mprotect() but
- * not cleared.  This ensures we get lots of random bit sets
- * and clears on the vma and pte pkey bits.
- */
-int alloc_random_pkey(void)
-{
-	int max_nr_pkey_allocs;
-	int ret;
-	int i;
-	int alloced_pkeys[NR_PKEYS];
-	int nr_alloced = 0;
-	int random_index;
-	memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
-
-	/* allocate every possible key and make a note of which ones we got */
-	max_nr_pkey_allocs = NR_PKEYS;
-	max_nr_pkey_allocs = 1;
-	for (i = 0; i < max_nr_pkey_allocs; i++) {
-		int new_pkey = alloc_pkey();
-		if (new_pkey < 0)
-			break;
-		alloced_pkeys[nr_alloced++] = new_pkey;
-	}
-
-	pkey_assert(nr_alloced > 0);
-	/* select a random one out of the allocated ones */
-	random_index = rand() % nr_alloced;
-	ret = alloced_pkeys[random_index];
-	/* now zero it out so we don't free it next */
-	alloced_pkeys[random_index] = 0;
-
-	/* go through the allocated ones that we did not want and free them */
-	for (i = 0; i < nr_alloced; i++) {
-		int free_ret;
-		if (!alloced_pkeys[i])
-			continue;
-		free_ret = sys_pkey_free(alloced_pkeys[i]);
-		pkey_assert(!free_ret);
-	}
-	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	return ret;
-}
-
-int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
-		unsigned long pkey)
-{
-	int nr_iterations = random() % 100;
-	int ret;
-
-	while (0) {
-		int rpkey = alloc_random_pkey();
-		ret = sys_mprotect_pkey(ptr, size, orig_prot, pkey);
-		dprintf1("sys_mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
-				ptr, size, orig_prot, pkey, ret);
-		if (nr_iterations-- < 0)
-			break;
-
-		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-		sys_pkey_free(rpkey);
-		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	}
-	pkey_assert(pkey < NR_PKEYS);
-
-	ret = sys_mprotect_pkey(ptr, size, orig_prot, pkey);
-	dprintf1("mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
-			ptr, size, orig_prot, pkey, ret);
-	pkey_assert(!ret);
-	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	return ret;
-}
-
-struct pkey_malloc_record {
-	void *ptr;
-	long size;
-};
-struct pkey_malloc_record *pkey_malloc_records;
-long nr_pkey_malloc_records;
-void record_pkey_malloc(void *ptr, long size)
-{
-	long i;
-	struct pkey_malloc_record *rec = NULL;
-
-	for (i = 0; i < nr_pkey_malloc_records; i++) {
-		rec = &pkey_malloc_records[i];
-		/* find a free record */
-		if (rec)
-			break;
-	}
-	if (!rec) {
-		/* every record is full */
-		size_t old_nr_records = nr_pkey_malloc_records;
-		size_t new_nr_records = (nr_pkey_malloc_records * 2 + 1);
-		size_t new_size = new_nr_records * sizeof(struct pkey_malloc_record);
-		dprintf2("new_nr_records: %zd\n", new_nr_records);
-		dprintf2("new_size: %zd\n", new_size);
-		pkey_malloc_records = realloc(pkey_malloc_records, new_size);
-		pkey_assert(pkey_malloc_records != NULL);
-		rec = &pkey_malloc_records[nr_pkey_malloc_records];
-		/*
-		 * realloc() does not initialize memory, so zero it from
-		 * the first new record all the way to the end.
-		 */
-		for (i = 0; i < new_nr_records - old_nr_records; i++)
-			memset(rec + i, 0, sizeof(*rec));
-	}
-	dprintf3("filling malloc record[%d/%p]: {%p, %ld}\n",
-		(int)(rec - pkey_malloc_records), rec, ptr, size);
-	rec->ptr = ptr;
-	rec->size = size;
-	nr_pkey_malloc_records++;
-}
-
-void free_pkey_malloc(void *ptr)
-{
-	long i;
-	int ret;
-	dprintf3("%s(%p)\n", __func__, ptr);
-	for (i = 0; i < nr_pkey_malloc_records; i++) {
-		struct pkey_malloc_record *rec = &pkey_malloc_records[i];
-		dprintf4("looking for ptr %p at record[%ld/%p]: {%p, %ld}\n",
-				ptr, i, rec, rec->ptr, rec->size);
-		if ((ptr <  rec->ptr) ||
-		    (ptr >= rec->ptr + rec->size))
-			continue;
-
-		dprintf3("found ptr %p at record[%ld/%p]: {%p, %ld}\n",
-				ptr, i, rec, rec->ptr, rec->size);
-		nr_pkey_malloc_records--;
-		ret = munmap(rec->ptr, rec->size);
-		dprintf3("munmap ret: %d\n", ret);
-		pkey_assert(!ret);
-		dprintf3("clearing rec->ptr, rec: %p\n", rec);
-		rec->ptr = NULL;
-		dprintf3("done clearing rec->ptr, rec: %p\n", rec);
-		return;
-	}
-	pkey_assert(false);
-}
-
-
-void *malloc_pkey_with_mprotect(long size, int prot, u16 pkey)
-{
-	void *ptr;
-	int ret;
-
-	rdpkru();
-	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
-			size, prot, pkey);
-	pkey_assert(pkey < NR_PKEYS);
-	ptr = mmap(NULL, size, prot, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-	pkey_assert(ptr != (void *)-1);
-	ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
-	pkey_assert(!ret);
-	record_pkey_malloc(ptr, size);
-	rdpkru();
-
-	dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
-	return ptr;
-}
-
-void *malloc_pkey_anon_huge(long size, int prot, u16 pkey)
-{
-	int ret;
-	void *ptr;
-
-	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
-			size, prot, pkey);
-	/*
-	 * Guarantee we can fit at least one huge page in the resulting
-	 * allocation by allocating space for 2:
-	 */
-	size = ALIGN_UP(size, HPAGE_SIZE * 2);
-	ptr = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-	pkey_assert(ptr != (void *)-1);
-	record_pkey_malloc(ptr, size);
-	mprotect_pkey(ptr, size, prot, pkey);
-
-	dprintf1("unaligned ptr: %p\n", ptr);
-	ptr = ALIGN_PTR_UP(ptr, HPAGE_SIZE);
-	dprintf1("  aligned ptr: %p\n", ptr);
-	ret = madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE);
-	dprintf1("MADV_HUGEPAGE ret: %d\n", ret);
-	ret = madvise(ptr, HPAGE_SIZE, MADV_WILLNEED);
-	dprintf1("MADV_WILLNEED ret: %d\n", ret);
-	memset(ptr, 0, HPAGE_SIZE);
-
-	dprintf1("mmap()'d thp for pkey %d @ %p\n", pkey, ptr);
-	return ptr;
-}
-
-int hugetlb_setup_ok;
-#define GET_NR_HUGE_PAGES 10
-void setup_hugetlbfs(void)
-{
-	int err;
-	int fd;
-	char buf[] = "123";
-
-	if (geteuid() != 0) {
-		fprintf(stderr, "WARNING: not run as root, can not do hugetlb test\n");
-		return;
-	}
-
-	cat_into_file(__stringify(GET_NR_HUGE_PAGES), "/proc/sys/vm/nr_hugepages");
-
-	/*
-	 * Now go make sure that we got the pages and that they
-	 * are 2M pages.  Someone might have made 1G the default.
-	 */
-	fd = open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages", O_RDONLY);
-	if (fd < 0) {
-		perror("opening sysfs 2M hugetlb config");
-		return;
-	}
-
-	/* -1 to guarantee leaving the trailing \0 */
-	err = read(fd, buf, sizeof(buf)-1);
-	close(fd);
-	if (err <= 0) {
-		perror("reading sysfs 2M hugetlb config");
-		return;
-	}
-
-	if (atoi(buf) != GET_NR_HUGE_PAGES) {
-		fprintf(stderr, "could not confirm 2M pages, got: '%s' expected %d\n",
-			buf, GET_NR_HUGE_PAGES);
-		return;
-	}
-
-	hugetlb_setup_ok = 1;
-}
-
-void *malloc_pkey_hugetlb(long size, int prot, u16 pkey)
-{
-	void *ptr;
-	int flags = MAP_ANONYMOUS|MAP_PRIVATE|MAP_HUGETLB;
-
-	if (!hugetlb_setup_ok)
-		return PTR_ERR_ENOTSUP;
-
-	dprintf1("doing %s(%ld, %x, %x)\n", __func__, size, prot, pkey);
-	size = ALIGN_UP(size, HPAGE_SIZE * 2);
-	pkey_assert(pkey < NR_PKEYS);
-	ptr = mmap(NULL, size, PROT_NONE, flags, -1, 0);
-	pkey_assert(ptr != (void *)-1);
-	mprotect_pkey(ptr, size, prot, pkey);
-
-	record_pkey_malloc(ptr, size);
-
-	dprintf1("mmap()'d hugetlbfs for pkey %d @ %p\n", pkey, ptr);
-	return ptr;
-}
-
-void *malloc_pkey_mmap_dax(long size, int prot, u16 pkey)
-{
-	void *ptr;
-	int fd;
-
-	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
-			size, prot, pkey);
-	pkey_assert(pkey < NR_PKEYS);
-	fd = open("/dax/foo", O_RDWR);
-	pkey_assert(fd >= 0);
-
-	ptr = mmap(0, size, prot, MAP_SHARED, fd, 0);
-	pkey_assert(ptr != (void *)-1);
-
-	mprotect_pkey(ptr, size, prot, pkey);
-
-	record_pkey_malloc(ptr, size);
-
-	dprintf1("mmap()'d for pkey %d @ %p\n", pkey, ptr);
-	close(fd);
-	return ptr;
-}
-
-void *(*pkey_malloc[])(long size, int prot, u16 pkey) = {
-
-	malloc_pkey_with_mprotect,
-	malloc_pkey_anon_huge,
-	malloc_pkey_hugetlb
-/* can not do direct with the pkey_mprotect() API:
-	malloc_pkey_mmap_direct,
-	malloc_pkey_mmap_dax,
-*/
-};
-
-void *malloc_pkey(long size, int prot, u16 pkey)
-{
-	void *ret;
-	static int malloc_type;
-	int nr_malloc_types = ARRAY_SIZE(pkey_malloc);
-
-	pkey_assert(pkey < NR_PKEYS);
-
-	while (1) {
-		pkey_assert(malloc_type < nr_malloc_types);
-
-		ret = pkey_malloc[malloc_type](size, prot, pkey);
-		pkey_assert(ret != (void *)-1);
-
-		malloc_type++;
-		if (malloc_type >= nr_malloc_types)
-			malloc_type = (random()%nr_malloc_types);
-
-		/* try again if the malloc_type we tried is unsupported */
-		if (ret == PTR_ERR_ENOTSUP)
-			continue;
-
-		break;
-	}
-
-	dprintf3("%s(%ld, prot=%x, pkey=%x) returning: %p\n", __func__,
-			size, prot, pkey, ret);
-	return ret;
-}
-
-int last_pkru_faults;
-void expected_pk_fault(int pkey)
-{
-	dprintf2("%s(): last_pkru_faults: %d pkru_faults: %d\n",
-			__func__, last_pkru_faults, pkru_faults);
-	dprintf2("%s(%d): last_si_pkey: %d\n", __func__, pkey, last_si_pkey);
-	pkey_assert(last_pkru_faults + 1 == pkru_faults);
-	pkey_assert(last_si_pkey == pkey);
-	/*
-	 * The signal handler shold have cleared out PKRU to let the
-	 * test program continue.  We now have to restore it.
-	 */
-	if (__rdpkru() != 0)
-		pkey_assert(0);
-
-	__wrpkru(shadow_pkru);
-	dprintf1("%s() set PKRU=%x to restore state after signal nuked it\n",
-			__func__, shadow_pkru);
-	last_pkru_faults = pkru_faults;
-	last_si_pkey = -1;
-}
-
-void do_not_expect_pk_fault(void)
-{
-	pkey_assert(last_pkru_faults == pkru_faults);
-}
-
-int test_fds[10] = { -1 };
-int nr_test_fds;
-void __save_test_fd(int fd)
-{
-	pkey_assert(fd >= 0);
-	pkey_assert(nr_test_fds < ARRAY_SIZE(test_fds));
-	test_fds[nr_test_fds] = fd;
-	nr_test_fds++;
-}
-
-int get_test_read_fd(void)
-{
-	int test_fd = open("/etc/passwd", O_RDONLY);
-	__save_test_fd(test_fd);
-	return test_fd;
-}
-
-void close_test_fds(void)
-{
-	int i;
-
-	for (i = 0; i < nr_test_fds; i++) {
-		if (test_fds[i] < 0)
-			continue;
-		close(test_fds[i]);
-		test_fds[i] = -1;
-	}
-	nr_test_fds = 0;
-}
-
-#define barrier() __asm__ __volatile__("": : :"memory")
-__attribute__((noinline)) int read_ptr(int *ptr)
-{
-	/*
-	 * Keep GCC from optimizing this away somehow
-	 */
-	barrier();
-	return *ptr;
-}
-
-void test_read_of_write_disabled_region(int *ptr, u16 pkey)
-{
-	int ptr_contents;
-
-	dprintf1("disabling write access to PKEY[1], doing read\n");
-	pkey_write_deny(pkey);
-	ptr_contents = read_ptr(ptr);
-	dprintf1("*ptr: %d\n", ptr_contents);
-	dprintf1("\n");
-}
-void test_read_of_access_disabled_region(int *ptr, u16 pkey)
-{
-	int ptr_contents;
-
-	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n", pkey, ptr);
-	rdpkru();
-	pkey_access_deny(pkey);
-	ptr_contents = read_ptr(ptr);
-	dprintf1("*ptr: %d\n", ptr_contents);
-	expected_pk_fault(pkey);
-}
-void test_write_of_write_disabled_region(int *ptr, u16 pkey)
-{
-	dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
-	pkey_write_deny(pkey);
-	*ptr = __LINE__;
-	expected_pk_fault(pkey);
-}
-void test_write_of_access_disabled_region(int *ptr, u16 pkey)
-{
-	dprintf1("disabling access to PKEY[%02d], doing write\n", pkey);
-	pkey_access_deny(pkey);
-	*ptr = __LINE__;
-	expected_pk_fault(pkey);
-}
-void test_kernel_write_of_access_disabled_region(int *ptr, u16 pkey)
-{
-	int ret;
-	int test_fd = get_test_read_fd();
-
-	dprintf1("disabling access to PKEY[%02d], "
-		 "having kernel read() to buffer\n", pkey);
-	pkey_access_deny(pkey);
-	ret = read(test_fd, ptr, 1);
-	dprintf1("read ret: %d\n", ret);
-	pkey_assert(ret);
-}
-void test_kernel_write_of_write_disabled_region(int *ptr, u16 pkey)
-{
-	int ret;
-	int test_fd = get_test_read_fd();
-
-	pkey_write_deny(pkey);
-	ret = read(test_fd, ptr, 100);
-	dprintf1("read ret: %d\n", ret);
-	if (ret < 0 && (DEBUG_LEVEL > 0))
-		perror("verbose read result (OK for this to be bad)");
-	pkey_assert(ret);
-}
-
-void test_kernel_gup_of_access_disabled_region(int *ptr, u16 pkey)
-{
-	int pipe_ret, vmsplice_ret;
-	struct iovec iov;
-	int pipe_fds[2];
-
-	pipe_ret = pipe(pipe_fds);
-
-	pkey_assert(pipe_ret == 0);
-	dprintf1("disabling access to PKEY[%02d], "
-		 "having kernel vmsplice from buffer\n", pkey);
-	pkey_access_deny(pkey);
-	iov.iov_base = ptr;
-	iov.iov_len = PAGE_SIZE;
-	vmsplice_ret = vmsplice(pipe_fds[1], &iov, 1, SPLICE_F_GIFT);
-	dprintf1("vmsplice() ret: %d\n", vmsplice_ret);
-	pkey_assert(vmsplice_ret == -1);
-
-	close(pipe_fds[0]);
-	close(pipe_fds[1]);
-}
-
-void test_kernel_gup_write_to_write_disabled_region(int *ptr, u16 pkey)
-{
-	int ignored = 0xdada;
-	int futex_ret;
-	int some_int = __LINE__;
-
-	dprintf1("disabling write to PKEY[%02d], "
-		 "doing futex gunk in buffer\n", pkey);
-	*ptr = some_int;
-	pkey_write_deny(pkey);
-	futex_ret = syscall(SYS_futex, ptr, FUTEX_WAIT, some_int-1, NULL,
-			&ignored, ignored);
-	if (DEBUG_LEVEL > 0)
-		perror("futex");
-	dprintf1("futex() ret: %d\n", futex_ret);
-}
-
-/* Assumes that all pkeys other than 'pkey' are unallocated */
-void test_pkey_syscalls_on_non_allocated_pkey(int *ptr, u16 pkey)
-{
-	int err;
-	int i;
-
-	/* Note: 0 is the default pkey, so don't mess with it */
-	for (i = 1; i < NR_PKEYS; i++) {
-		if (pkey == i)
-			continue;
-
-		dprintf1("trying get/set/free to non-allocated pkey: %2d\n", i);
-		err = sys_pkey_free(i);
-		pkey_assert(err);
-
-		err = sys_pkey_free(i);
-		pkey_assert(err);
-
-		err = sys_mprotect_pkey(ptr, PAGE_SIZE, PROT_READ, i);
-		pkey_assert(err);
-	}
-}
-
-/* Assumes that all pkeys other than 'pkey' are unallocated */
-void test_pkey_syscalls_bad_args(int *ptr, u16 pkey)
-{
-	int err;
-	int bad_pkey = NR_PKEYS+99;
-
-	/* pass a known-invalid pkey in: */
-	err = sys_mprotect_pkey(ptr, PAGE_SIZE, PROT_READ, bad_pkey);
-	pkey_assert(err);
-}
-
-/* Assumes that all pkeys other than 'pkey' are unallocated */
-void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
-{
-	int err;
-	int allocated_pkeys[NR_PKEYS] = {0};
-	int nr_allocated_pkeys = 0;
-	int i;
-
-	for (i = 0; i < NR_PKEYS*2; i++) {
-		int new_pkey;
-		dprintf1("%s() alloc loop: %d\n", __func__, i);
-		new_pkey = alloc_pkey();
-		dprintf4("%s()::%d, err: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-				__LINE__, err, __rdpkru(), shadow_pkru);
-		rdpkru(); /* for shadow checking */
-		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
-		if ((new_pkey == -1) && (errno == ENOSPC)) {
-			dprintf2("%s() failed to allocate pkey after %d tries\n",
-				__func__, nr_allocated_pkeys);
-			break;
-		}
-		pkey_assert(nr_allocated_pkeys < NR_PKEYS);
-		allocated_pkeys[nr_allocated_pkeys++] = new_pkey;
-	}
-
-	dprintf3("%s()::%d\n", __func__, __LINE__);
-
-	/*
-	 * ensure it did not reach the end of the loop without
-	 * failure:
-	 */
-	pkey_assert(i < NR_PKEYS*2);
-
-	/*
-	 * There are 16 pkeys supported in hardware.  One is taken
-	 * up for the default (0) and another can be taken up by
-	 * an execute-only mapping.  Ensure that we can allocate
-	 * at least 14 (16-2).
-	 */
-	pkey_assert(i >= NR_PKEYS-2);
-
-	for (i = 0; i < nr_allocated_pkeys; i++) {
-		err = sys_pkey_free(allocated_pkeys[i]);
-		pkey_assert(!err);
-		rdpkru(); /* for shadow checking */
-	}
-}
-
-void test_ptrace_of_child(int *ptr, u16 pkey)
-{
-	__attribute__((__unused__)) int peek_result;
-	pid_t child_pid;
-	void *ignored = 0;
-	long ret;
-	int status;
-	/*
-	 * This is the "control" for our little expermient.  Make sure
-	 * we can always access it when ptracing.
-	 */
-	int *plain_ptr_unaligned = malloc(HPAGE_SIZE);
-	int *plain_ptr = ALIGN_PTR_UP(plain_ptr_unaligned, PAGE_SIZE);
-
-	/*
-	 * Fork a child which is an exact copy of this process, of course.
-	 * That means we can do all of our tests via ptrace() and then plain
-	 * memory access and ensure they work differently.
-	 */
-	child_pid = fork_lazy_child();
-	dprintf1("[%d] child pid: %d\n", getpid(), child_pid);
-
-	ret = ptrace(PTRACE_ATTACH, child_pid, ignored, ignored);
-	if (ret)
-		perror("attach");
-	dprintf1("[%d] attach ret: %ld %d\n", getpid(), ret, __LINE__);
-	pkey_assert(ret != -1);
-	ret = waitpid(child_pid, &status, WUNTRACED);
-	if ((ret != child_pid) || !(WIFSTOPPED(status))) {
-		fprintf(stderr, "weird waitpid result %ld stat %x\n",
-				ret, status);
-		pkey_assert(0);
-	}
-	dprintf2("waitpid ret: %ld\n", ret);
-	dprintf2("waitpid status: %d\n", status);
-
-	pkey_access_deny(pkey);
-	pkey_write_deny(pkey);
-
-	/* Write access, untested for now:
-	ret = ptrace(PTRACE_POKEDATA, child_pid, peek_at, data);
-	pkey_assert(ret != -1);
-	dprintf1("poke at %p: %ld\n", peek_at, ret);
-	*/
-
-	/*
-	 * Try to access the pkey-protected "ptr" via ptrace:
-	 */
-	ret = ptrace(PTRACE_PEEKDATA, child_pid, ptr, ignored);
-	/* expect it to work, without an error: */
-	pkey_assert(ret != -1);
-	/* Now access from the current task, and expect an exception: */
-	peek_result = read_ptr(ptr);
-	expected_pk_fault(pkey);
-
-	/*
-	 * Try to access the NON-pkey-protected "plain_ptr" via ptrace:
-	 */
-	ret = ptrace(PTRACE_PEEKDATA, child_pid, plain_ptr, ignored);
-	/* expect it to work, without an error: */
-	pkey_assert(ret != -1);
-	/* Now access from the current task, and expect NO exception: */
-	peek_result = read_ptr(plain_ptr);
-	do_not_expect_pk_fault();
-
-	ret = ptrace(PTRACE_DETACH, child_pid, ignored, 0);
-	pkey_assert(ret != -1);
-
-	ret = kill(child_pid, SIGKILL);
-	pkey_assert(ret != -1);
-
-	wait(&status);
-
-	free(plain_ptr_unaligned);
-}
-
-void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
-{
-	void *p1;
-	int scratch;
-	int ptr_contents;
-	int ret;
-
-	p1 = ALIGN_PTR_UP(&lots_o_noops_around_write, PAGE_SIZE);
-	dprintf3("&lots_o_noops: %p\n", &lots_o_noops_around_write);
-	/* lots_o_noops_around_write should be page-aligned already */
-	assert(p1 == &lots_o_noops_around_write);
-
-	/* Point 'p1' at the *second* page of the function: */
-	p1 += PAGE_SIZE;
-
-	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
-	lots_o_noops_around_write(&scratch);
-	ptr_contents = read_ptr(p1);
-	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
-
-	ret = mprotect_pkey(p1, PAGE_SIZE, PROT_EXEC, (u64)pkey);
-	pkey_assert(!ret);
-	pkey_access_deny(pkey);
-
-	dprintf2("pkru: %x\n", rdpkru());
-
-	/*
-	 * Make sure this is an *instruction* fault
-	 */
-	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
-	lots_o_noops_around_write(&scratch);
-	do_not_expect_pk_fault();
-	ptr_contents = read_ptr(p1);
-	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
-	expected_pk_fault(pkey);
-}
-
-void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
-{
-	int size = PAGE_SIZE;
-	int sret;
-
-	if (cpu_has_pku()) {
-		dprintf1("SKIP: %s: no CPU support\n", __func__);
-		return;
-	}
-
-	sret = syscall(SYS_mprotect_key, ptr, size, PROT_READ, pkey);
-	pkey_assert(sret < 0);
-}
-
-void (*pkey_tests[])(int *ptr, u16 pkey) = {
-	test_read_of_write_disabled_region,
-	test_read_of_access_disabled_region,
-	test_write_of_write_disabled_region,
-	test_write_of_access_disabled_region,
-	test_kernel_write_of_access_disabled_region,
-	test_kernel_write_of_write_disabled_region,
-	test_kernel_gup_of_access_disabled_region,
-	test_kernel_gup_write_to_write_disabled_region,
-	test_executing_on_unreadable_memory,
-	test_ptrace_of_child,
-	test_pkey_syscalls_on_non_allocated_pkey,
-	test_pkey_syscalls_bad_args,
-	test_pkey_alloc_exhaust,
-};
-
-void run_tests_once(void)
-{
-	int *ptr;
-	int prot = PROT_READ|PROT_WRITE;
-
-	for (test_nr = 0; test_nr < ARRAY_SIZE(pkey_tests); test_nr++) {
-		int pkey;
-		int orig_pkru_faults = pkru_faults;
-
-		dprintf1("======================\n");
-		dprintf1("test %d preparing...\n", test_nr);
-
-		tracing_on();
-		pkey = alloc_random_pkey();
-		dprintf1("test %d starting with pkey: %d\n", test_nr, pkey);
-		ptr = malloc_pkey(PAGE_SIZE, prot, pkey);
-		dprintf1("test %d starting...\n", test_nr);
-		pkey_tests[test_nr](ptr, pkey);
-		dprintf1("freeing test memory: %p\n", ptr);
-		free_pkey_malloc(ptr);
-		sys_pkey_free(pkey);
-
-		dprintf1("pkru_faults: %d\n", pkru_faults);
-		dprintf1("orig_pkru_faults: %d\n", orig_pkru_faults);
-
-		tracing_off();
-		close_test_fds();
-
-		printf("test %2d PASSED (iteration %d)\n", test_nr, iteration_nr);
-		dprintf1("======================\n\n");
-	}
-	iteration_nr++;
-}
-
-void pkey_setup_shadow(void)
-{
-	shadow_pkru = __rdpkru();
-}
-
-int main(void)
-{
-	int nr_iterations = 22;
-
-	setup_handlers();
-
-	printf("has pku: %d\n", cpu_has_pku());
-
-	if (!cpu_has_pku()) {
-		int size = PAGE_SIZE;
-		int *ptr;
-
-		printf("running PKEY tests for unsupported CPU/OS\n");
-
-		ptr  = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-		assert(ptr != (void *)-1);
-		test_mprotect_pkey_on_unsupported_cpu(ptr, 1);
-		exit(0);
-	}
-
-	pkey_setup_shadow();
-	printf("startup pkru: %x\n", rdpkru());
-	setup_hugetlbfs();
-
-	while (nr_iterations-- > 0)
-		run_tests_once();
-
-	printf("done (all tests OK)\n");
-	return 0;
-}
-- 
1.7.1

* [PATCH v9 32/51] selftest/x86: Move protection key selftest to arch neutral directory
@ 2017-11-06  8:57   ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
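Reader's note, not part of the commit: the test being moved packs two
"disable" bits per key into the x86 PKRU register and edits them in
pkey_get() and pkey_set(). The sketch below reproduces only that bit
arithmetic, assuming a plain uint32_t stands in for the register so it
runs without PKU hardware. The names pkru_get_rights() and
pkru_set_rights() are hypothetical and do not appear in the selftest.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_DISABLE_ACCESS 0x1
#define PKEY_DISABLE_WRITE  0x2
#define PKRU_BITS_PER_PKEY  2

/* Hypothetical helper: return the two rights bits that belong to 'pkey'. */
static uint32_t pkru_get_rights(uint32_t pkru, int pkey)
{
	uint32_t mask = PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE;

	/* Shift the key's field down to bits 0-1, then drop everything else. */
	return (pkru >> (pkey * PKRU_BITS_PER_PKEY)) & mask;
}

/* Hypothetical helper: return 'pkru' with the field for 'pkey' replaced. */
static uint32_t pkru_set_rights(uint32_t pkru, int pkey, uint32_t rights)
{
	uint32_t mask = PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE;

	/* 'rights' may only contain the two disable bits. */
	assert(!(rights & ~mask));

	/* Clear the old field for this key, then OR in the new bits. */
	pkru &= ~(mask << (pkey * PKRU_BITS_PER_PKEY));
	pkru |= rights << (pkey * PKRU_BITS_PER_PKEY);
	return pkru;
}
```

With this layout, denying writes on key 1 sets bit 3 of the value, and
clearing both bits on key 15 touches only bits 30 and 31. Keeping the
helpers free of side effects is a choice made for this sketch; the
selftest itself reads and writes the real register.
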
 tools/testing/selftests/vm/Makefile           |    1 +
 tools/testing/selftests/vm/pkey-helpers.h     |  220 ++++
 tools/testing/selftests/vm/protection_keys.c  | 1395 +++++++++++++++++++++++++
 tools/testing/selftests/x86/Makefile          |    2 +-
 tools/testing/selftests/x86/pkey-helpers.h    |  220 ----
 tools/testing/selftests/x86/protection_keys.c | 1395 -------------------------
 6 files changed, 1617 insertions(+), 1616 deletions(-)
 create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
 create mode 100644 tools/testing/selftests/vm/protection_keys.c
 delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
 delete mode 100644 tools/testing/selftests/x86/protection_keys.c

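A second note, on sigsafe_printf() in pkey-helpers.h: its comment says
the vsnprintf() return value is the length that would have been printed,
while the stored output is truncated to the buffer. The sketch below
illustrates that clamping in isolation. format_clamped() is a
hypothetical name, and unlike the helper in this patch it clamps to
size - 1 (the bytes actually stored before the trailing NUL) and also
guards against a negative return value.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: format into buf and return the number of bytes
 * actually stored, not counting the trailing NUL. Returns 0 on error.
 */
static int format_clamped(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (len < 0 || size == 0)
		return 0;
	/* vsnprintf() reports the untruncated length; clamp it. */
	if ((size_t)len >= size)
		len = (int)size - 1;
	return len;
}
```

The clamped length is what a later write(1, buf, len) should use, so
that only bytes vsnprintf() really produced reach the file descriptor.
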
diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index e49eca1..6f18ef4 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -18,6 +18,7 @@ TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
 TEST_GEN_FILES += mlock-random-test
 TEST_GEN_FILES += virtual_address_range
+TEST_GEN_FILES += protection_keys
 
 TEST_PROGS := run_vmtests
 
diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
new file mode 100644
index 0000000..3818f25
--- /dev/null
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -0,0 +1,220 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PKEYS_HELPER_H
+#define _PKEYS_HELPER_H
+#define _GNU_SOURCE
+#include <string.h>
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <ucontext.h>
+#include <sys/mman.h>
+
+#define NR_PKEYS 16
+#define PKRU_BITS_PER_PKEY 2
+
+#ifndef DEBUG_LEVEL
+#define DEBUG_LEVEL 0
+#endif
+#define DPRINT_IN_SIGNAL_BUF_SIZE 4096
+extern int dprint_in_signal;
+extern char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
+static inline void sigsafe_printf(const char *format, ...)
+{
+	va_list ap;
+
+	va_start(ap, format);
+	if (!dprint_in_signal) {
+		vprintf(format, ap);
+	} else {
+		int len = vsnprintf(dprint_in_signal_buffer,
+				    DPRINT_IN_SIGNAL_BUF_SIZE,
+				    format, ap);
+		/*
+		 * len is amount that would have been printed,
+		 * but actual write is truncated at BUF_SIZE.
+		 */
+		if (len > DPRINT_IN_SIGNAL_BUF_SIZE)
+			len = DPRINT_IN_SIGNAL_BUF_SIZE;
+		write(1, dprint_in_signal_buffer, len);
+	}
+	va_end(ap);
+}
+#define dprintf_level(level, args...) do {	\
+	if (level <= DEBUG_LEVEL)		\
+		sigsafe_printf(args);		\
+	fflush(NULL);				\
+} while (0)
+#define dprintf0(args...) dprintf_level(0, args)
+#define dprintf1(args...) dprintf_level(1, args)
+#define dprintf2(args...) dprintf_level(2, args)
+#define dprintf3(args...) dprintf_level(3, args)
+#define dprintf4(args...) dprintf_level(4, args)
+
+extern unsigned int shadow_pkru;
+static inline unsigned int __rdpkru(void)
+{
+	unsigned int eax, edx;
+	unsigned int ecx = 0;
+	unsigned int pkru;
+
+	asm volatile(".byte 0x0f,0x01,0xee\n\t"
+		     : "=a" (eax), "=d" (edx)
+		     : "c" (ecx));
+	pkru = eax;
+	return pkru;
+}
+
+static inline unsigned int _rdpkru(int line)
+{
+	unsigned int pkru = __rdpkru();
+
+	dprintf4("rdpkru(line=%d) pkru: %x shadow: %x\n",
+			line, pkru, shadow_pkru);
+	assert(pkru == shadow_pkru);
+
+	return pkru;
+}
+
+#define rdpkru() _rdpkru(__LINE__)
+
+static inline void __wrpkru(unsigned int pkru)
+{
+	unsigned int eax = pkru;
+	unsigned int ecx = 0;
+	unsigned int edx = 0;
+
+	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+	asm volatile(".byte 0x0f,0x01,0xef\n\t"
+		     : : "a" (eax), "c" (ecx), "d" (edx));
+	assert(pkru == __rdpkru());
+}
+
+static inline void wrpkru(unsigned int pkru)
+{
+	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+	/* will do the shadow check for us: */
+	rdpkru();
+	__wrpkru(pkru);
+	shadow_pkru = pkru;
+	dprintf4("%s(%08x) pkru: %08x\n", __func__, pkru, __rdpkru());
+}
+
+/*
+ * These are technically racy, since something could
+ * change PKRU between the read and the write.
+ */
+static inline void __pkey_access_allow(int pkey, int do_allow)
+{
+	unsigned int pkru = rdpkru();
+	int bit = pkey * 2;
+
+	if (do_allow)
+		pkru &= ~(1<<bit);
+	else
+		pkru |= (1<<bit);
+
+	dprintf4("pkru now: %08x\n", rdpkru());
+	wrpkru(pkru);
+}
+
+static inline void __pkey_write_allow(int pkey, int do_allow_write)
+{
+	long pkru = rdpkru();
+	int bit = pkey * 2 + 1;
+
+	if (do_allow_write)
+		pkru &= ~(1<<bit);
+	else
+		pkru |= (1<<bit);
+
+	wrpkru(pkru);
+	dprintf4("pkru now: %08x\n", rdpkru());
+}
+
+#define PROT_PKEY0     0x10            /* protection key value (bit 0) */
+#define PROT_PKEY1     0x20            /* protection key value (bit 1) */
+#define PROT_PKEY2     0x40            /* protection key value (bit 2) */
+#define PROT_PKEY3     0x80            /* protection key value (bit 3) */
+
+#define PAGE_SIZE 4096
+#define MB	(1<<20)
+
+static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
+		unsigned int *ecx, unsigned int *edx)
+{
+	/* ecx is often an input as well as an output. */
+	asm volatile(
+		"cpuid;"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (*eax), "2" (*ecx));
+}
+
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
+#define X86_FEATURE_PKU        (1<<3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE      (1<<4) /* OS Protection Keys Enable */
+
+static inline int cpu_has_pku(void)
+{
+	unsigned int eax;
+	unsigned int ebx;
+	unsigned int ecx;
+	unsigned int edx;
+
+	eax = 0x7;
+	ecx = 0x0;
+	__cpuid(&eax, &ebx, &ecx, &edx);
+
+	if (!(ecx & X86_FEATURE_PKU)) {
+		dprintf2("cpu does not have PKU\n");
+		return 0;
+	}
+	if (!(ecx & X86_FEATURE_OSPKE)) {
+		dprintf2("cpu does not have OSPKE\n");
+		return 0;
+	}
+	return 1;
+}
+
+#define XSTATE_PKRU_BIT	(9)
+#define XSTATE_PKRU	0x200
+
+int pkru_xstate_offset(void)
+{
+	unsigned int eax;
+	unsigned int ebx;
+	unsigned int ecx;
+	unsigned int edx;
+	int xstate_offset;
+	int xstate_size;
+	unsigned long XSTATE_CPUID = 0xd;
+	int leaf;
+
+	/* assume that XSTATE_PKRU is set in XCR0 */
+	leaf = XSTATE_PKRU_BIT;
+	{
+		eax = XSTATE_CPUID;
+		ecx = leaf;
+		__cpuid(&eax, &ebx, &ecx, &edx);
+
+		if (leaf == XSTATE_PKRU_BIT) {
+			xstate_offset = ebx;
+			xstate_size = eax;
+		}
+	}
+
+	if (xstate_size == 0) {
+		printf("could not find size/offset of PKRU in xsave state\n");
+		return 0;
+	}
+
+	return xstate_offset;
+}
+
+#endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
new file mode 100644
index 0000000..555e43c
--- /dev/null
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -0,0 +1,1395 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Tests x86 Memory Protection Keys (see Documentation/x86/protection-keys.txt)
+ *
+ * There are examples in here of:
+ *  * how to set protection keys on memory
+ *  * how to set/clear bits in PKRU (the rights register)
+ *  * how to handle SEGV_PKRU signals and extract pkey-relevant
+ *    information from the siginfo
+ *
+ * Things to add:
+ *	make sure KSM and KSM COW breaking works
+ *	prefault pages in at malloc, or not
+ *	protect MPX bounds tables with protection keys?
+ *	make sure VMA splitting/merging is working correctly
+ *	OOMs can destroy mm->mmap (see exit_mmap()), so make sure it is immune to pkeys
+ *	look for pkey "leaks" where it is still set on a VMA but "freed" back to the kernel
+ *	do a plain mprotect() to a mprotect_pkey() area and make sure the pkey sticks
+ *
+ * Compile like this:
+ *	gcc      -o protection_keys    -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
+ *	gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
+ */
+#define _GNU_SOURCE
+#include <errno.h>
+#include <linux/futex.h>
+#include <sys/time.h>
+#include <sys/syscall.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <ucontext.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/ptrace.h>
+#include <setjmp.h>
+
+#include "pkey-helpers.h"
+
+int iteration_nr = 1;
+int test_nr;
+
+unsigned int shadow_pkru;
+
+#define HPAGE_SIZE	(1UL<<21)
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
+#define ALIGN_UP(x, align_to)	(((x) + ((align_to)-1)) & ~((align_to)-1))
+#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
+#define ALIGN_PTR_UP(p, ptr_align_to)	((typeof(p))ALIGN_UP((unsigned long)(p),	ptr_align_to))
+#define ALIGN_PTR_DOWN(p, ptr_align_to)	((typeof(p))ALIGN_DOWN((unsigned long)(p),	ptr_align_to))
+#define __stringify_1(x...)     #x
+#define __stringify(x...)       __stringify_1(x)
+
+#define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
+
+int dprint_in_signal;
+char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
+
+extern void abort_hooks(void);
+#define pkey_assert(condition) do {		\
+	if (!(condition)) {			\
+		dprintf0("assert() at %s::%d test_nr: %d iteration: %d\n", \
+				__FILE__, __LINE__,	\
+				test_nr, iteration_nr);	\
+		dprintf0("errno at assert: %d\n", errno);	\
+		abort_hooks();			\
+		assert(condition);		\
+	}					\
+} while (0)
+#define raw_assert(cond) assert(cond)
+
+void cat_into_file(char *str, char *file)
+{
+	int fd = open(file, O_RDWR);
+	int ret;
+
+	dprintf2("%s(): writing '%s' to '%s'\n", __func__, str, file);
+	/*
+	 * these need to be raw because they are called under
+	 * pkey_assert()
+	 */
+	raw_assert(fd >= 0);
+	ret = write(fd, str, strlen(str));
+	if (ret != strlen(str)) {
+		perror("write to file failed");
+		fprintf(stderr, "filename: '%s' str: '%s'\n", file, str);
+		raw_assert(0);
+	}
+	close(fd);
+}
+
+#if CONTROL_TRACING > 0
+static int warned_tracing;
+int tracing_root_ok(void)
+{
+	if (geteuid() != 0) {
+		if (!warned_tracing)
+			fprintf(stderr, "WARNING: not run as root, "
+					"can not do tracing control\n");
+		warned_tracing = 1;
+		return 0;
+	}
+	return 1;
+}
+#endif
+
+void tracing_on(void)
+{
+#if CONTROL_TRACING > 0
+#define TRACEDIR "/sys/kernel/debug/tracing"
+	char pidstr[32];
+
+	if (!tracing_root_ok())
+		return;
+
+	sprintf(pidstr, "%d", getpid());
+	cat_into_file("0", TRACEDIR "/tracing_on");
+	cat_into_file("\n", TRACEDIR "/trace");
+	if (1) {
+		cat_into_file("function_graph", TRACEDIR "/current_tracer");
+		cat_into_file("1", TRACEDIR "/options/funcgraph-proc");
+	} else {
+		cat_into_file("nop", TRACEDIR "/current_tracer");
+	}
+	cat_into_file(pidstr, TRACEDIR "/set_ftrace_pid");
+	cat_into_file("1", TRACEDIR "/tracing_on");
+	dprintf1("enabled tracing\n");
+#endif
+}
+
+void tracing_off(void)
+{
+#if CONTROL_TRACING > 0
+	if (!tracing_root_ok())
+		return;
+	cat_into_file("0", "/sys/kernel/debug/tracing/tracing_on");
+#endif
+}
+
+void abort_hooks(void)
+{
+	fprintf(stderr, "running %s()...\n", __func__);
+	tracing_off();
+#ifdef SLEEP_ON_ABORT
+	sleep(SLEEP_ON_ABORT);
+#endif
+}
+
+static inline void __page_o_noops(void)
+{
+	/* 8-byte nop instruction * 512 repetitions = 1 page */
+	asm(".rept 512 ; nopl 0x7eeeeeee(%eax) ; .endr");
+}
+
+/*
+ * This attempts to have roughly a page of instructions followed by a few
+ * instructions that do a write, and another page of instructions.  That
+ * way, we are pretty sure that the write is in the second page of
+ * instructions and has at least a page of padding behind it.
+ *
+ * *That* lets us be sure to madvise() away the write instruction, which
+ * will then fault, which makes sure that the fault code handles
+ * execute-only memory properly.
+ */
+__attribute__((__aligned__(PAGE_SIZE)))
+void lots_o_noops_around_write(int *write_to_me)
+{
+	dprintf3("running %s()\n", __func__);
+	__page_o_noops();
+	/* Assume this happens in the second page of instructions: */
+	*write_to_me = __LINE__;
+	/* pad out by another page: */
+	__page_o_noops();
+	dprintf3("%s() done\n", __func__);
+}
+
+/* Define some kernel-like types */
+#define  u8 uint8_t
+#define u16 uint16_t
+#define u32 uint32_t
+#define u64 uint64_t
+
+#ifdef __i386__
+#define SYS_mprotect_key 380
+#define SYS_pkey_alloc	 381
+#define SYS_pkey_free	 382
+#define REG_IP_IDX REG_EIP
+#define si_pkey_offset 0x14
+#else
+#define SYS_mprotect_key 329
+#define SYS_pkey_alloc	 330
+#define SYS_pkey_free	 331
+#define REG_IP_IDX REG_RIP
+#define si_pkey_offset 0x20
+#endif
+
+void dump_mem(void *dumpme, int len_bytes)
+{
+	char *c = (void *)dumpme;
+	int i;
+
+	for (i = 0; i < len_bytes; i += sizeof(u64)) {
+		u64 *ptr = (u64 *)(c + i);
+		dprintf1("dump[%03d][@%p]: %016jx\n", i, ptr, *ptr);
+	}
+}
+
+#define SEGV_BNDERR     3  /* failed address bound checks */
+#define SEGV_PKUERR     4
+
+static char *si_code_str(int si_code)
+{
+	if (si_code == SEGV_MAPERR)
+		return "SEGV_MAPERR";
+	if (si_code == SEGV_ACCERR)
+		return "SEGV_ACCERR";
+	if (si_code == SEGV_BNDERR)
+		return "SEGV_BNDERR";
+	if (si_code == SEGV_PKUERR)
+		return "SEGV_PKUERR";
+	return "UNKNOWN";
+}
+
+int pkru_faults;
+int last_si_pkey = -1;
+void signal_handler(int signum, siginfo_t *si, void *vucontext)
+{
+	ucontext_t *uctxt = vucontext;
+	int trapno;
+	unsigned long ip;
+	char *fpregs;
+	u32 *pkru_ptr;
+	u64 si_pkey;
+	u32 *si_pkey_ptr;
+	int pkru_offset;
+	fpregset_t fpregset;
+
+	dprint_in_signal = 1;
+	dprintf1(">>>>===============SIGSEGV============================\n");
+	dprintf1("%s()::%d, pkru: 0x%x shadow: %x\n", __func__, __LINE__,
+			__rdpkru(), shadow_pkru);
+
+	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
+	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
+	fpregset = uctxt->uc_mcontext.fpregs;
+	fpregs = (void *)fpregset;
+
+	dprintf2("%s() trapno: %d ip: 0x%lx info->si_code: %s/%d\n", __func__,
+			trapno, ip, si_code_str(si->si_code), si->si_code);
+#ifdef __i386__
+	/*
+	 * 32-bit has some extra padding so that userspace can tell whether
+	 * the XSTATE header is present in addition to the "legacy" FPU
+	 * state.  We just assume that it is here.
+	 */
+	fpregs += 0x70;
+#endif
+	pkru_offset = pkru_xstate_offset();
+	pkru_ptr = (void *)(&fpregs[pkru_offset]);
+
+	dprintf1("siginfo: %p\n", si);
+	dprintf1(" fpregs: %p\n", fpregs);
+	/*
+	 * If we got a PKRU fault, we *HAVE* to have at least one bit set in
+	 * here.
+	 */
+	dprintf1("pkru_xstate_offset: %d\n", pkru_xstate_offset());
+	if (DEBUG_LEVEL > 4)
+		dump_mem(pkru_ptr - 128, 256);
+	pkey_assert(*pkru_ptr);
+
+	si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
+	dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
+	dump_mem(si_pkey_ptr - 8, 24);
+	si_pkey = *si_pkey_ptr;
+	pkey_assert(si_pkey < NR_PKEYS);
+	last_si_pkey = si_pkey;
+
+	if ((si->si_code == SEGV_MAPERR) ||
+	    (si->si_code == SEGV_ACCERR) ||
+	    (si->si_code == SEGV_BNDERR)) {
+		printf("non-PK si_code, exiting...\n");
+		exit(4);
+	}
+
+	dprintf1("signal pkru from xsave: %08x\n", *pkru_ptr);
+	/* need __rdpkru() version so we do not do shadow_pkru checking */
+	dprintf1("signal pkru from  pkru: %08x\n", __rdpkru());
+	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
+	*(u64 *)pkru_ptr = 0x00000000;
+	dprintf1("WARNING: set PKRU=0 to allow faulting instruction to continue\n");
+	pkru_faults++;
+	dprintf1("<<<<==================================================\n");
+	return;
+	if (trapno == 14) {
+		fprintf(stderr,
+			"ERROR: In signal handler, page fault, trapno = %d, ip = %016lx\n",
+			trapno, ip);
+		fprintf(stderr, "si_addr %p\n", si->si_addr);
+		fprintf(stderr, "REG_ERR: %lx\n",
+				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
+		exit(1);
+	} else {
+		fprintf(stderr, "unexpected trap %d! at 0x%lx\n", trapno, ip);
+		fprintf(stderr, "si_addr %p\n", si->si_addr);
+		fprintf(stderr, "REG_ERR: %lx\n",
+				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
+		exit(2);
+	}
+	dprint_in_signal = 0;
+}
+
+int wait_all_children(void)
+{
+	int status;
+	return waitpid(-1, &status, 0);
+}
+
+void sig_chld(int x)
+{
+	dprint_in_signal = 1;
+	dprintf2("[%d] SIGCHLD: %d\n", getpid(), x);
+	dprint_in_signal = 0;
+}
+
+void setup_sigsegv_handler(void)
+{
+	int r, rs;
+	struct sigaction newact;
+	struct sigaction oldact;
+
+	/* #PF is mapped to sigsegv */
+	int signum  = SIGSEGV;
+
+	newact.sa_handler = 0;
+	newact.sa_sigaction = signal_handler;
+
+	/*sigset_t - signals to block while in the handler */
+	/* get the old signal mask. */
+	rs = sigprocmask(SIG_SETMASK, 0, &newact.sa_mask);
+	pkey_assert(rs == 0);
+
+	/* call sa_sigaction, not sa_handler*/
+	newact.sa_flags = SA_SIGINFO;
+
+	newact.sa_restorer = 0;  /* void(*)(), obsolete */
+	r = sigaction(signum, &newact, &oldact);
+	r = sigaction(SIGALRM, &newact, &oldact);
+	pkey_assert(r == 0);
+}
+
+void setup_handlers(void)
+{
+	signal(SIGCHLD, &sig_chld);
+	setup_sigsegv_handler();
+}
+
+pid_t fork_lazy_child(void)
+{
+	pid_t forkret;
+
+	forkret = fork();
+	pkey_assert(forkret >= 0);
+	dprintf3("[%d] fork() ret: %d\n", getpid(), forkret);
+
+	if (!forkret) {
+		/* in the child */
+		while (1) {
+			dprintf1("child sleeping...\n");
+			sleep(30);
+		}
+	}
+	return forkret;
+}
+
+void davecmp(void *_a, void *_b, int len)
+{
+	int i;
+	unsigned long *a = _a;
+	unsigned long *b = _b;
+
+	for (i = 0; i < len / sizeof(*a); i++) {
+		if (a[i] == b[i])
+			continue;
+
+		dprintf3("[%3d]: a: %016lx b: %016lx\n", i, a[i], b[i]);
+	}
+}
+
+void dumpit(char *f)
+{
+	int fd = open(f, O_RDONLY);
+	char buf[100];
+	int nr_read;
+
+	dprintf2("maps fd: %d\n", fd);
+	do {
+		nr_read = read(fd, &buf[0], sizeof(buf));
+		write(1, buf, nr_read);
+	} while (nr_read > 0);
+	close(fd);
+}
+
+#define PKEY_DISABLE_ACCESS    0x1
+#define PKEY_DISABLE_WRITE     0x2
+
+u32 pkey_get(int pkey, unsigned long flags)
+{
+	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
+	u32 pkru = __rdpkru();
+	u32 shifted_pkru;
+	u32 masked_pkru;
+
+	dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
+			__func__, pkey, flags, 0, 0);
+	dprintf2("%s() raw pkru: %x\n", __func__, pkru);
+
+	shifted_pkru = (pkru >> (pkey * PKRU_BITS_PER_PKEY));
+	dprintf2("%s() shifted_pkru: %x\n", __func__, shifted_pkru);
+	masked_pkru = shifted_pkru & mask;
+	dprintf2("%s() masked  pkru: %x\n", __func__, masked_pkru);
+	/*
+	 * shift down the relevant bits to the lowest two, then
+	 * mask off all the other high bits.
+	 */
+	return masked_pkru;
+}
+
+int pkey_set(int pkey, unsigned long rights, unsigned long flags)
+{
+	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
+	u32 old_pkru = __rdpkru();
+	u32 new_pkru;
+
+	/* make sure that 'rights' only contains the bits we expect: */
+	assert(!(rights & ~mask));
+
+	/* copy old pkru */
+	new_pkru = old_pkru;
+	/* mask out bits from pkey in old value: */
+	new_pkru &= ~(mask << (pkey * PKRU_BITS_PER_PKEY));
+	/* OR in new bits for pkey: */
+	new_pkru |= (rights << (pkey * PKRU_BITS_PER_PKEY));
+
+	__wrpkru(new_pkru);
+
+	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x pkru now: %x old_pkru: %x\n",
+			__func__, pkey, rights, flags, 0, __rdpkru(), old_pkru);
+	return 0;
+}
+
+void pkey_disable_set(int pkey, int flags)
+{
+	unsigned long syscall_flags = 0;
+	int ret;
+	int pkey_rights;
+	u32 orig_pkru = rdpkru();
+
+	dprintf1("START->%s(%d, 0x%x)\n", __func__,
+		pkey, flags);
+	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+
+	pkey_rights = pkey_get(pkey, syscall_flags);
+
+	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
+			pkey, pkey, pkey_rights);
+	pkey_assert(pkey_rights >= 0);
+
+	pkey_rights |= flags;
+
+	ret = pkey_set(pkey, pkey_rights, syscall_flags);
+	assert(!ret);
+	/*pkru and flags have the same format */
+	shadow_pkru |= flags << (pkey * 2);
+	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkru);
+
+	pkey_assert(ret >= 0);
+
+	pkey_rights = pkey_get(pkey, syscall_flags);
+	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
+			pkey, pkey, pkey_rights);
+
+	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
+	if (flags)
+		pkey_assert(rdpkru() >= orig_pkru);
+	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
+		pkey, flags);
+}
+
+void pkey_disable_clear(int pkey, int flags)
+{
+	unsigned long syscall_flags = 0;
+	int ret;
+	int pkey_rights = pkey_get(pkey, syscall_flags);
+	u32 orig_pkru = rdpkru();
+
+	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+
+	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
+			pkey, pkey, pkey_rights);
+	pkey_assert(pkey_rights >= 0);
+
+	pkey_rights &= ~flags;
+
+	ret = pkey_set(pkey, pkey_rights, 0);
+	/* pkru and flags have the same format */
+	shadow_pkru &= ~(flags << (pkey * 2));
+	pkey_assert(ret >= 0);
+
+	pkey_rights = pkey_get(pkey, syscall_flags);
+	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
+			pkey, pkey, pkey_rights);
+
+	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
+	if (flags)
+		assert(rdpkru() <= orig_pkru);
+}
+
+void pkey_write_allow(int pkey)
+{
+	pkey_disable_clear(pkey, PKEY_DISABLE_WRITE);
+}
+void pkey_write_deny(int pkey)
+{
+	pkey_disable_set(pkey, PKEY_DISABLE_WRITE);
+}
+void pkey_access_allow(int pkey)
+{
+	pkey_disable_clear(pkey, PKEY_DISABLE_ACCESS);
+}
+void pkey_access_deny(int pkey)
+{
+	pkey_disable_set(pkey, PKEY_DISABLE_ACCESS);
+}
+
+int sys_mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
+		unsigned long pkey)
+{
+	int sret;
+
+	dprintf2("%s(0x%p, %zx, prot=%lx, pkey=%lx)\n", __func__,
+			ptr, size, orig_prot, pkey);
+
+	errno = 0;
+	sret = syscall(SYS_mprotect_key, ptr, size, orig_prot, pkey);
+	if (errno) {
+		dprintf2("SYS_mprotect_key sret: %d\n", sret);
+		dprintf2("SYS_mprotect_key prot: 0x%lx\n", orig_prot);
+		dprintf2("SYS_mprotect_key failed, errno: %d\n", errno);
+		if (DEBUG_LEVEL >= 2)
+			perror("SYS_mprotect_pkey");
+	}
+	return sret;
+}
+
+int sys_pkey_alloc(unsigned long flags, unsigned long init_val)
+{
+	int ret = syscall(SYS_pkey_alloc, flags, init_val);
+	dprintf1("%s(flags=%lx, init_val=%lx) syscall ret: %d errno: %d\n",
+			__func__, flags, init_val, ret, errno);
+	return ret;
+}
+
+int alloc_pkey(void)
+{
+	int ret;
+	unsigned long init_val = 0x0;
+
+	dprintf1("alloc_pkey()::%d, pkru: 0x%x shadow: %x\n",
+			__LINE__, __rdpkru(), shadow_pkru);
+	ret = sys_pkey_alloc(0, init_val);
+	/*
+	 * pkey_alloc() sets PKRU, so we need to reflect it in
+	 * shadow_pkru:
+	 */
+	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	if (ret > 0) {
+		/* clear both the bits: */
+		shadow_pkru &= ~(0x3      << (ret * 2));
+		dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
+				__LINE__, ret, __rdpkru(), shadow_pkru);
+		/*
+		 * move the new state in from init_val
+		 * (remember, we cheated and init_val == pkru format)
+		 */
+		shadow_pkru |=  (init_val << (ret * 2));
+	}
+	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	dprintf1("alloc_pkey()::%d errno: %d\n", __LINE__, errno);
+	/* for shadow checking: */
+	rdpkru();
+	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	return ret;
+}
+
+int sys_pkey_free(unsigned long pkey)
+{
+	int ret = syscall(SYS_pkey_free, pkey);
+	dprintf1("%s(pkey=%ld) syscall ret: %d\n", __func__, pkey, ret);
+	return ret;
+}
+
+/*
+ * I had a bug where pkey bits could be set by mprotect() but
+ * not cleared.  This ensures we get lots of random bit sets
+ * and clears on the vma and pte pkey bits.
+ */
+int alloc_random_pkey(void)
+{
+	int max_nr_pkey_allocs;
+	int ret;
+	int i;
+	int alloced_pkeys[NR_PKEYS];
+	int nr_alloced = 0;
+	int random_index;
+	memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
+
+	/* allocate every possible key and make a note of which ones we got */
+	max_nr_pkey_allocs = NR_PKEYS;
+	max_nr_pkey_allocs = 1;
+	for (i = 0; i < max_nr_pkey_allocs; i++) {
+		int new_pkey = alloc_pkey();
+		if (new_pkey < 0)
+			break;
+		alloced_pkeys[nr_alloced++] = new_pkey;
+	}
+
+	pkey_assert(nr_alloced > 0);
+	/* select a random one out of the allocated ones */
+	random_index = rand() % nr_alloced;
+	ret = alloced_pkeys[random_index];
+	/* now zero it out so we don't free it next */
+	alloced_pkeys[random_index] = 0;
+
+	/* go through the allocated ones that we did not want and free them */
+	for (i = 0; i < nr_alloced; i++) {
+		int free_ret;
+		if (!alloced_pkeys[i])
+			continue;
+		free_ret = sys_pkey_free(alloced_pkeys[i]);
+		pkey_assert(!free_ret);
+	}
+	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	return ret;
+}
+
+int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
+		unsigned long pkey)
+{
+	int nr_iterations = random() % 100;
+	int ret;
+
+	while (0) {
+		int rpkey = alloc_random_pkey();
+		ret = sys_mprotect_pkey(ptr, size, orig_prot, pkey);
+		dprintf1("sys_mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
+				ptr, size, orig_prot, pkey, ret);
+		if (nr_iterations-- < 0)
+			break;
+
+		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+		sys_pkey_free(rpkey);
+		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	}
+	pkey_assert(pkey < NR_PKEYS);
+
+	ret = sys_mprotect_pkey(ptr, size, orig_prot, pkey);
+	dprintf1("mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
+			ptr, size, orig_prot, pkey, ret);
+	pkey_assert(!ret);
+	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkru(), shadow_pkru);
+	return ret;
+}
+
+struct pkey_malloc_record {
+	void *ptr;
+	long size;
+};
+struct pkey_malloc_record *pkey_malloc_records;
+long nr_pkey_malloc_records;
+void record_pkey_malloc(void *ptr, long size)
+{
+	long i;
+	struct pkey_malloc_record *rec = NULL;
+
+	for (i = 0; i < nr_pkey_malloc_records; i++) {
+		rec = &pkey_malloc_records[i];
+		/* find a free record */
+		if (rec)
+			break;
+	}
+	if (!rec) {
+		/* every record is full */
+		size_t old_nr_records = nr_pkey_malloc_records;
+		size_t new_nr_records = (nr_pkey_malloc_records * 2 + 1);
+		size_t new_size = new_nr_records * sizeof(struct pkey_malloc_record);
+		dprintf2("new_nr_records: %zd\n", new_nr_records);
+		dprintf2("new_size: %zd\n", new_size);
+		pkey_malloc_records = realloc(pkey_malloc_records, new_size);
+		pkey_assert(pkey_malloc_records != NULL);
+		rec = &pkey_malloc_records[nr_pkey_malloc_records];
+		/*
+		 * realloc() does not initialize memory, so zero it from
+		 * the first new record all the way to the end.
+		 */
+		for (i = 0; i < new_nr_records - old_nr_records; i++)
+			memset(rec + i, 0, sizeof(*rec));
+	}
+	dprintf3("filling malloc record[%d/%p]: {%p, %ld}\n",
+		(int)(rec - pkey_malloc_records), rec, ptr, size);
+	rec->ptr = ptr;
+	rec->size = size;
+	nr_pkey_malloc_records++;
+}
+
+void free_pkey_malloc(void *ptr)
+{
+	long i;
+	int ret;
+	dprintf3("%s(%p)\n", __func__, ptr);
+	for (i = 0; i < nr_pkey_malloc_records; i++) {
+		struct pkey_malloc_record *rec = &pkey_malloc_records[i];
+		dprintf4("looking for ptr %p at record[%ld/%p]: {%p, %ld}\n",
+				ptr, i, rec, rec->ptr, rec->size);
+		if ((ptr <  rec->ptr) ||
+		    (ptr >= rec->ptr + rec->size))
+			continue;
+
+		dprintf3("found ptr %p at record[%ld/%p]: {%p, %ld}\n",
+				ptr, i, rec, rec->ptr, rec->size);
+		nr_pkey_malloc_records--;
+		ret = munmap(rec->ptr, rec->size);
+		dprintf3("munmap ret: %d\n", ret);
+		pkey_assert(!ret);
+		dprintf3("clearing rec->ptr, rec: %p\n", rec);
+		rec->ptr = NULL;
+		dprintf3("done clearing rec->ptr, rec: %p\n", rec);
+		return;
+	}
+	pkey_assert(false);
+}
+
+
+void *malloc_pkey_with_mprotect(long size, int prot, u16 pkey)
+{
+	void *ptr;
+	int ret;
+
+	rdpkru();
+	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
+			size, prot, pkey);
+	pkey_assert(pkey < NR_PKEYS);
+	ptr = mmap(NULL, size, prot, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	pkey_assert(ptr != (void *)-1);
+	ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
+	pkey_assert(!ret);
+	record_pkey_malloc(ptr, size);
+	rdpkru();
+
+	dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
+	return ptr;
+}
+
+void *malloc_pkey_anon_huge(long size, int prot, u16 pkey)
+{
+	int ret;
+	void *ptr;
+
+	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
+			size, prot, pkey);
+	/*
+	 * Guarantee we can fit at least one huge page in the resulting
+	 * allocation by allocating space for 2:
+	 */
+	size = ALIGN_UP(size, HPAGE_SIZE * 2);
+	ptr = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	pkey_assert(ptr != (void *)-1);
+	record_pkey_malloc(ptr, size);
+	mprotect_pkey(ptr, size, prot, pkey);
+
+	dprintf1("unaligned ptr: %p\n", ptr);
+	ptr = ALIGN_PTR_UP(ptr, HPAGE_SIZE);
+	dprintf1("  aligned ptr: %p\n", ptr);
+	ret = madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE);
+	dprintf1("MADV_HUGEPAGE ret: %d\n", ret);
+	ret = madvise(ptr, HPAGE_SIZE, MADV_WILLNEED);
+	dprintf1("MADV_WILLNEED ret: %d\n", ret);
+	memset(ptr, 0, HPAGE_SIZE);
+
+	dprintf1("mmap()'d thp for pkey %d @ %p\n", pkey, ptr);
+	return ptr;
+}
+
+int hugetlb_setup_ok;
+#define GET_NR_HUGE_PAGES 10
+void setup_hugetlbfs(void)
+{
+	int err;
+	int fd;
+	char buf[] = "123";
+
+	if (geteuid() != 0) {
+		fprintf(stderr, "WARNING: not run as root, can not do hugetlb test\n");
+		return;
+	}
+
+	cat_into_file(__stringify(GET_NR_HUGE_PAGES), "/proc/sys/vm/nr_hugepages");
+
+	/*
+	 * Now go make sure that we got the pages and that they
+	 * are 2M pages.  Someone might have made 1G the default.
+	 */
+	fd = open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages", O_RDONLY);
+	if (fd < 0) {
+		perror("opening sysfs 2M hugetlb config");
+		return;
+	}
+
+	/* -1 to guarantee leaving the trailing \0 */
+	err = read(fd, buf, sizeof(buf)-1);
+	close(fd);
+	if (err <= 0) {
+		perror("reading sysfs 2M hugetlb config");
+		return;
+	}
+
+	if (atoi(buf) != GET_NR_HUGE_PAGES) {
+		fprintf(stderr, "could not confirm 2M pages, got: '%s' expected %d\n",
+			buf, GET_NR_HUGE_PAGES);
+		return;
+	}
+
+	hugetlb_setup_ok = 1;
+}
+
+void *malloc_pkey_hugetlb(long size, int prot, u16 pkey)
+{
+	void *ptr;
+	int flags = MAP_ANONYMOUS|MAP_PRIVATE|MAP_HUGETLB;
+
+	if (!hugetlb_setup_ok)
+		return PTR_ERR_ENOTSUP;
+
+	dprintf1("doing %s(%ld, %x, %x)\n", __func__, size, prot, pkey);
+	size = ALIGN_UP(size, HPAGE_SIZE * 2);
+	pkey_assert(pkey < NR_PKEYS);
+	ptr = mmap(NULL, size, PROT_NONE, flags, -1, 0);
+	pkey_assert(ptr != (void *)-1);
+	mprotect_pkey(ptr, size, prot, pkey);
+
+	record_pkey_malloc(ptr, size);
+
+	dprintf1("mmap()'d hugetlbfs for pkey %d @ %p\n", pkey, ptr);
+	return ptr;
+}
+
+void *malloc_pkey_mmap_dax(long size, int prot, u16 pkey)
+{
+	void *ptr;
+	int fd;
+
+	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
+			size, prot, pkey);
+	pkey_assert(pkey < NR_PKEYS);
+	fd = open("/dax/foo", O_RDWR);
+	pkey_assert(fd >= 0);
+
+	ptr = mmap(0, size, prot, MAP_SHARED, fd, 0);
+	pkey_assert(ptr != (void *)-1);
+
+	mprotect_pkey(ptr, size, prot, pkey);
+
+	record_pkey_malloc(ptr, size);
+
+	dprintf1("mmap()'d for pkey %d @ %p\n", pkey, ptr);
+	close(fd);
+	return ptr;
+}
+
+void *(*pkey_malloc[])(long size, int prot, u16 pkey) = {
+
+	malloc_pkey_with_mprotect,
+	malloc_pkey_anon_huge,
+	malloc_pkey_hugetlb
+/* can not do direct with the pkey_mprotect() API:
+	malloc_pkey_mmap_direct,
+	malloc_pkey_mmap_dax,
+*/
+};
+
+void *malloc_pkey(long size, int prot, u16 pkey)
+{
+	void *ret;
+	static int malloc_type;
+	int nr_malloc_types = ARRAY_SIZE(pkey_malloc);
+
+	pkey_assert(pkey < NR_PKEYS);
+
+	while (1) {
+		pkey_assert(malloc_type < nr_malloc_types);
+
+		ret = pkey_malloc[malloc_type](size, prot, pkey);
+		pkey_assert(ret != (void *)-1);
+
+		malloc_type++;
+		if (malloc_type >= nr_malloc_types)
+			malloc_type = (random()%nr_malloc_types);
+
+		/* try again if the malloc_type we tried is unsupported */
+		if (ret == PTR_ERR_ENOTSUP)
+			continue;
+
+		break;
+	}
+
+	dprintf3("%s(%ld, prot=%x, pkey=%x) returning: %p\n", __func__,
+			size, prot, pkey, ret);
+	return ret;
+}
+
+int last_pkru_faults;
+void expected_pk_fault(int pkey)
+{
+	dprintf2("%s(): last_pkru_faults: %d pkru_faults: %d\n",
+			__func__, last_pkru_faults, pkru_faults);
+	dprintf2("%s(%d): last_si_pkey: %d\n", __func__, pkey, last_si_pkey);
+	pkey_assert(last_pkru_faults + 1 == pkru_faults);
+	pkey_assert(last_si_pkey == pkey);
+	/*
+	 * The signal handler should have cleared out PKRU to let the
+	 * test program continue.  We now have to restore it.
+	 */
+	if (__rdpkru() != 0)
+		pkey_assert(0);
+
+	__wrpkru(shadow_pkru);
+	dprintf1("%s() set PKRU=%x to restore state after signal nuked it\n",
+			__func__, shadow_pkru);
+	last_pkru_faults = pkru_faults;
+	last_si_pkey = -1;
+}
+
+void do_not_expect_pk_fault(void)
+{
+	pkey_assert(last_pkru_faults == pkru_faults);
+}
+
+int test_fds[10] = { -1 };
+int nr_test_fds;
+void __save_test_fd(int fd)
+{
+	pkey_assert(fd >= 0);
+	pkey_assert(nr_test_fds < ARRAY_SIZE(test_fds));
+	test_fds[nr_test_fds] = fd;
+	nr_test_fds++;
+}
+
+int get_test_read_fd(void)
+{
+	int test_fd = open("/etc/passwd", O_RDONLY);
+	__save_test_fd(test_fd);
+	return test_fd;
+}
+
+void close_test_fds(void)
+{
+	int i;
+
+	for (i = 0; i < nr_test_fds; i++) {
+		if (test_fds[i] < 0)
+			continue;
+		close(test_fds[i]);
+		test_fds[i] = -1;
+	}
+	nr_test_fds = 0;
+}
+
+#define barrier() __asm__ __volatile__("": : :"memory")
+__attribute__((noinline)) int read_ptr(int *ptr)
+{
+	/*
+	 * Keep GCC from optimizing this away somehow
+	 */
+	barrier();
+	return *ptr;
+}
+
+void test_read_of_write_disabled_region(int *ptr, u16 pkey)
+{
+	int ptr_contents;
+
+	dprintf1("disabling write access to PKEY[%02d], doing read\n", pkey);
+	pkey_write_deny(pkey);
+	ptr_contents = read_ptr(ptr);
+	dprintf1("*ptr: %d\n", ptr_contents);
+	dprintf1("\n");
+}
+void test_read_of_access_disabled_region(int *ptr, u16 pkey)
+{
+	int ptr_contents;
+
+	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n", pkey, ptr);
+	rdpkru();
+	pkey_access_deny(pkey);
+	ptr_contents = read_ptr(ptr);
+	dprintf1("*ptr: %d\n", ptr_contents);
+	expected_pk_fault(pkey);
+}
+void test_write_of_write_disabled_region(int *ptr, u16 pkey)
+{
+	dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
+	pkey_write_deny(pkey);
+	*ptr = __LINE__;
+	expected_pk_fault(pkey);
+}
+void test_write_of_access_disabled_region(int *ptr, u16 pkey)
+{
+	dprintf1("disabling access to PKEY[%02d], doing write\n", pkey);
+	pkey_access_deny(pkey);
+	*ptr = __LINE__;
+	expected_pk_fault(pkey);
+}
+void test_kernel_write_of_access_disabled_region(int *ptr, u16 pkey)
+{
+	int ret;
+	int test_fd = get_test_read_fd();
+
+	dprintf1("disabling access to PKEY[%02d], "
+		 "having kernel read() to buffer\n", pkey);
+	pkey_access_deny(pkey);
+	ret = read(test_fd, ptr, 1);
+	dprintf1("read ret: %d\n", ret);
+	pkey_assert(ret);
+}
+void test_kernel_write_of_write_disabled_region(int *ptr, u16 pkey)
+{
+	int ret;
+	int test_fd = get_test_read_fd();
+
+	pkey_write_deny(pkey);
+	ret = read(test_fd, ptr, 100);
+	dprintf1("read ret: %d\n", ret);
+	if (ret < 0 && (DEBUG_LEVEL > 0))
+		perror("verbose read result (OK for this to be bad)");
+	pkey_assert(ret);
+}
+
+void test_kernel_gup_of_access_disabled_region(int *ptr, u16 pkey)
+{
+	int pipe_ret, vmsplice_ret;
+	struct iovec iov;
+	int pipe_fds[2];
+
+	pipe_ret = pipe(pipe_fds);
+
+	pkey_assert(pipe_ret == 0);
+	dprintf1("disabling access to PKEY[%02d], "
+		 "having kernel vmsplice from buffer\n", pkey);
+	pkey_access_deny(pkey);
+	iov.iov_base = ptr;
+	iov.iov_len = PAGE_SIZE;
+	vmsplice_ret = vmsplice(pipe_fds[1], &iov, 1, SPLICE_F_GIFT);
+	dprintf1("vmsplice() ret: %d\n", vmsplice_ret);
+	pkey_assert(vmsplice_ret == -1);
+
+	close(pipe_fds[0]);
+	close(pipe_fds[1]);
+}
+
+void test_kernel_gup_write_to_write_disabled_region(int *ptr, u16 pkey)
+{
+	int ignored = 0xdada;
+	int futex_ret;
+	int some_int = __LINE__;
+
+	dprintf1("disabling write to PKEY[%02d], "
+		 "doing futex gunk in buffer\n", pkey);
+	*ptr = some_int;
+	pkey_write_deny(pkey);
+	futex_ret = syscall(SYS_futex, ptr, FUTEX_WAIT, some_int-1, NULL,
+			&ignored, ignored);
+	if (DEBUG_LEVEL > 0)
+		perror("futex");
+	dprintf1("futex() ret: %d\n", futex_ret);
+}
+
+/* Assumes that all pkeys other than 'pkey' are unallocated */
+void test_pkey_syscalls_on_non_allocated_pkey(int *ptr, u16 pkey)
+{
+	int err;
+	int i;
+
+	/* Note: 0 is the default pkey, so don't mess with it */
+	for (i = 1; i < NR_PKEYS; i++) {
+		if (pkey == i)
+			continue;
+
+		dprintf1("trying free/mprotect on non-allocated pkey: %2d\n", i);
+		err = sys_pkey_free(i);
+		pkey_assert(err);
+
+		err = sys_pkey_free(i);
+		pkey_assert(err);
+
+		err = sys_mprotect_pkey(ptr, PAGE_SIZE, PROT_READ, i);
+		pkey_assert(err);
+	}
+}
+
+/* Assumes that all pkeys other than 'pkey' are unallocated */
+void test_pkey_syscalls_bad_args(int *ptr, u16 pkey)
+{
+	int err;
+	int bad_pkey = NR_PKEYS+99;
+
+	/* pass a known-invalid pkey in: */
+	err = sys_mprotect_pkey(ptr, PAGE_SIZE, PROT_READ, bad_pkey);
+	pkey_assert(err);
+}
+
+/* Assumes that all pkeys other than 'pkey' are unallocated */
+void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
+{
+	int err;
+	int allocated_pkeys[NR_PKEYS] = {0};
+	int nr_allocated_pkeys = 0;
+	int i;
+
+	for (i = 0; i < NR_PKEYS*2; i++) {
+		int new_pkey;
+		dprintf1("%s() alloc loop: %d\n", __func__, i);
+		new_pkey = alloc_pkey();
+		dprintf4("%s()::%d, new_pkey: %d pkru: 0x%x shadow: 0x%x\n", __func__,
+				__LINE__, new_pkey, __rdpkru(), shadow_pkru);
+		rdpkru(); /* for shadow checking */
+		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
+		if ((new_pkey == -1) && (errno == ENOSPC)) {
+			dprintf2("%s() failed to allocate pkey after %d tries\n",
+				__func__, nr_allocated_pkeys);
+			break;
+		}
+		pkey_assert(nr_allocated_pkeys < NR_PKEYS);
+		allocated_pkeys[nr_allocated_pkeys++] = new_pkey;
+	}
+
+	dprintf3("%s()::%d\n", __func__, __LINE__);
+
+	/*
+	 * ensure it did not reach the end of the loop without
+	 * failure:
+	 */
+	pkey_assert(i < NR_PKEYS*2);
+
+	/*
+	 * There are NR_PKEYS pkeys supported in hardware (16 on x86).
+	 * One is taken up for the default (0) and another can be taken
+	 * up by an execute-only mapping.  Ensure that we can allocate
+	 * at least NR_PKEYS-2 (14 on x86).
+	 */
+	pkey_assert(i >= NR_PKEYS-2);
+
+	for (i = 0; i < nr_allocated_pkeys; i++) {
+		err = sys_pkey_free(allocated_pkeys[i]);
+		pkey_assert(!err);
+		rdpkru(); /* for shadow checking */
+	}
+}
+
+void test_ptrace_of_child(int *ptr, u16 pkey)
+{
+	__attribute__((__unused__)) int peek_result;
+	pid_t child_pid;
+	void *ignored = 0;
+	long ret;
+	int status;
+	/*
+	 * This is the "control" for our little experiment.  Make sure
+	 * we can always access it when ptracing.
+	 */
+	int *plain_ptr_unaligned = malloc(HPAGE_SIZE);
+	int *plain_ptr = ALIGN_PTR_UP(plain_ptr_unaligned, PAGE_SIZE);
+
+	/*
+	 * Fork a child which is an exact copy of this process, of course.
+	 * That means we can do all of our tests via ptrace() and then plain
+	 * memory access and ensure they work differently.
+	 */
+	child_pid = fork_lazy_child();
+	dprintf1("[%d] child pid: %d\n", getpid(), child_pid);
+
+	ret = ptrace(PTRACE_ATTACH, child_pid, ignored, ignored);
+	if (ret)
+		perror("attach");
+	dprintf1("[%d] attach ret: %ld %d\n", getpid(), ret, __LINE__);
+	pkey_assert(ret != -1);
+	ret = waitpid(child_pid, &status, WUNTRACED);
+	if ((ret != child_pid) || !(WIFSTOPPED(status))) {
+		fprintf(stderr, "weird waitpid result %ld stat %x\n",
+				ret, status);
+		pkey_assert(0);
+	}
+	dprintf2("waitpid ret: %ld\n", ret);
+	dprintf2("waitpid status: %d\n", status);
+
+	pkey_access_deny(pkey);
+	pkey_write_deny(pkey);
+
+	/* Write access, untested for now:
+	ret = ptrace(PTRACE_POKEDATA, child_pid, peek_at, data);
+	pkey_assert(ret != -1);
+	dprintf1("poke at %p: %ld\n", peek_at, ret);
+	*/
+
+	/*
+	 * Try to access the pkey-protected "ptr" via ptrace:
+	 */
+	ret = ptrace(PTRACE_PEEKDATA, child_pid, ptr, ignored);
+	/* expect it to work, without an error: */
+	pkey_assert(ret != -1);
+	/* Now access from the current task, and expect an exception: */
+	peek_result = read_ptr(ptr);
+	expected_pk_fault(pkey);
+
+	/*
+	 * Try to access the NON-pkey-protected "plain_ptr" via ptrace:
+	 */
+	ret = ptrace(PTRACE_PEEKDATA, child_pid, plain_ptr, ignored);
+	/* expect it to work, without an error: */
+	pkey_assert(ret != -1);
+	/* Now access from the current task, and expect NO exception: */
+	peek_result = read_ptr(plain_ptr);
+	do_not_expect_pk_fault();
+
+	ret = ptrace(PTRACE_DETACH, child_pid, ignored, 0);
+	pkey_assert(ret != -1);
+
+	ret = kill(child_pid, SIGKILL);
+	pkey_assert(ret != -1);
+
+	wait(&status);
+
+	free(plain_ptr_unaligned);
+}
+
+void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
+{
+	void *p1;
+	int scratch;
+	int ptr_contents;
+	int ret;
+
+	p1 = ALIGN_PTR_UP(&lots_o_noops_around_write, PAGE_SIZE);
+	dprintf3("&lots_o_noops: %p\n", &lots_o_noops_around_write);
+	/* lots_o_noops_around_write should be page-aligned already */
+	assert(p1 == &lots_o_noops_around_write);
+
+	/* Point 'p1' at the *second* page of the function: */
+	p1 += PAGE_SIZE;
+
+	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
+	lots_o_noops_around_write(&scratch);
+	ptr_contents = read_ptr(p1);
+	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
+
+	ret = mprotect_pkey(p1, PAGE_SIZE, PROT_EXEC, (u64)pkey);
+	pkey_assert(!ret);
+	pkey_access_deny(pkey);
+
+	dprintf2("pkru: %x\n", rdpkru());
+
+	/*
+	 * Make sure this is an *instruction* fault
+	 */
+	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
+	lots_o_noops_around_write(&scratch);
+	do_not_expect_pk_fault();
+	ptr_contents = read_ptr(p1);
+	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
+	expected_pk_fault(pkey);
+}
+
+void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
+{
+	int size = PAGE_SIZE;
+	int sret;
+
+	if (cpu_has_pku()) {
+		dprintf1("SKIP: %s: CPU supports pkeys\n", __func__);
+		return;
+	}
+
+	sret = syscall(SYS_mprotect_key, ptr, size, PROT_READ, pkey);
+	pkey_assert(sret < 0);
+}
+
+void (*pkey_tests[])(int *ptr, u16 pkey) = {
+	test_read_of_write_disabled_region,
+	test_read_of_access_disabled_region,
+	test_write_of_write_disabled_region,
+	test_write_of_access_disabled_region,
+	test_kernel_write_of_access_disabled_region,
+	test_kernel_write_of_write_disabled_region,
+	test_kernel_gup_of_access_disabled_region,
+	test_kernel_gup_write_to_write_disabled_region,
+	test_executing_on_unreadable_memory,
+	test_ptrace_of_child,
+	test_pkey_syscalls_on_non_allocated_pkey,
+	test_pkey_syscalls_bad_args,
+	test_pkey_alloc_exhaust,
+};
+
+void run_tests_once(void)
+{
+	int *ptr;
+	int prot = PROT_READ|PROT_WRITE;
+
+	for (test_nr = 0; test_nr < ARRAY_SIZE(pkey_tests); test_nr++) {
+		int pkey;
+		int orig_pkru_faults = pkru_faults;
+
+		dprintf1("======================\n");
+		dprintf1("test %d preparing...\n", test_nr);
+
+		tracing_on();
+		pkey = alloc_random_pkey();
+		dprintf1("test %d starting with pkey: %d\n", test_nr, pkey);
+		ptr = malloc_pkey(PAGE_SIZE, prot, pkey);
+		dprintf1("test %d starting...\n", test_nr);
+		pkey_tests[test_nr](ptr, pkey);
+		dprintf1("freeing test memory: %p\n", ptr);
+		free_pkey_malloc(ptr);
+		sys_pkey_free(pkey);
+
+		dprintf1("pkru_faults: %d\n", pkru_faults);
+		dprintf1("orig_pkru_faults: %d\n", orig_pkru_faults);
+
+		tracing_off();
+		close_test_fds();
+
+		printf("test %2d PASSED (iteration %d)\n", test_nr, iteration_nr);
+		dprintf1("======================\n\n");
+	}
+	iteration_nr++;
+}
+
+void pkey_setup_shadow(void)
+{
+	shadow_pkru = __rdpkru();
+}
+
+int main(void)
+{
+	int nr_iterations = 22;
+
+	setup_handlers();
+
+	printf("has pku: %d\n", cpu_has_pku());
+
+	if (!cpu_has_pku()) {
+		int size = PAGE_SIZE;
+		int *ptr;
+
+		printf("running PKEY tests for unsupported CPU/OS\n");
+
+		ptr = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+		assert(ptr != (void *)-1);
+		test_mprotect_pkey_on_unsupported_cpu(ptr, 1);
+		exit(0);
+	}
+
+	pkey_setup_shadow();
+	printf("startup pkru: %x\n", rdpkru());
+	setup_hugetlbfs();
+
+	while (nr_iterations-- > 0)
+		run_tests_once();
+
+	printf("done (all tests OK)\n");
+	return 0;
+}
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 7b1adee..9687501 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -7,7 +7,7 @@ include ../lib.mk
 
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall test_mremap_vdso \
 			check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test ioperm \
-			protection_keys test_vdso
+			test_vdso
 TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
diff --git a/tools/testing/selftests/x86/pkey-helpers.h b/tools/testing/selftests/x86/pkey-helpers.h
deleted file mode 100644
index 3818f25..0000000
--- a/tools/testing/selftests/x86/pkey-helpers.h
+++ /dev/null
@@ -1,220 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _PKEYS_HELPER_H
-#define _PKEYS_HELPER_H
-#define _GNU_SOURCE
-#include <string.h>
-#include <stdarg.h>
-#include <stdio.h>
-#include <stdint.h>
-#include <stdbool.h>
-#include <signal.h>
-#include <assert.h>
-#include <stdlib.h>
-#include <ucontext.h>
-#include <sys/mman.h>
-
-#define NR_PKEYS 16
-#define PKRU_BITS_PER_PKEY 2
-
-#ifndef DEBUG_LEVEL
-#define DEBUG_LEVEL 0
-#endif
-#define DPRINT_IN_SIGNAL_BUF_SIZE 4096
-extern int dprint_in_signal;
-extern char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
-static inline void sigsafe_printf(const char *format, ...)
-{
-	va_list ap;
-
-	va_start(ap, format);
-	if (!dprint_in_signal) {
-		vprintf(format, ap);
-	} else {
-		int len = vsnprintf(dprint_in_signal_buffer,
-				    DPRINT_IN_SIGNAL_BUF_SIZE,
-				    format, ap);
-		/*
-		 * len is amount that would have been printed,
-		 * but actual write is truncated at BUF_SIZE.
-		 */
-		if (len > DPRINT_IN_SIGNAL_BUF_SIZE)
-			len = DPRINT_IN_SIGNAL_BUF_SIZE;
-		write(1, dprint_in_signal_buffer, len);
-	}
-	va_end(ap);
-}
-#define dprintf_level(level, args...) do {	\
-	if (level <= DEBUG_LEVEL)		\
-		sigsafe_printf(args);		\
-	fflush(NULL);				\
-} while (0)
-#define dprintf0(args...) dprintf_level(0, args)
-#define dprintf1(args...) dprintf_level(1, args)
-#define dprintf2(args...) dprintf_level(2, args)
-#define dprintf3(args...) dprintf_level(3, args)
-#define dprintf4(args...) dprintf_level(4, args)
-
-extern unsigned int shadow_pkru;
-static inline unsigned int __rdpkru(void)
-{
-	unsigned int eax, edx;
-	unsigned int ecx = 0;
-	unsigned int pkru;
-
-	asm volatile(".byte 0x0f,0x01,0xee\n\t"
-		     : "=a" (eax), "=d" (edx)
-		     : "c" (ecx));
-	pkru = eax;
-	return pkru;
-}
-
-static inline unsigned int _rdpkru(int line)
-{
-	unsigned int pkru = __rdpkru();
-
-	dprintf4("rdpkru(line=%d) pkru: %x shadow: %x\n",
-			line, pkru, shadow_pkru);
-	assert(pkru == shadow_pkru);
-
-	return pkru;
-}
-
-#define rdpkru() _rdpkru(__LINE__)
-
-static inline void __wrpkru(unsigned int pkru)
-{
-	unsigned int eax = pkru;
-	unsigned int ecx = 0;
-	unsigned int edx = 0;
-
-	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
-	asm volatile(".byte 0x0f,0x01,0xef\n\t"
-		     : : "a" (eax), "c" (ecx), "d" (edx));
-	assert(pkru == __rdpkru());
-}
-
-static inline void wrpkru(unsigned int pkru)
-{
-	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
-	/* will do the shadow check for us: */
-	rdpkru();
-	__wrpkru(pkru);
-	shadow_pkru = pkru;
-	dprintf4("%s(%08x) pkru: %08x\n", __func__, pkru, __rdpkru());
-}
-
-/*
- * These are technically racy. since something could
- * change PKRU between the read and the write.
- */
-static inline void __pkey_access_allow(int pkey, int do_allow)
-{
-	unsigned int pkru = rdpkru();
-	int bit = pkey * 2;
-
-	if (do_allow)
-		pkru &= (1<<bit);
-	else
-		pkru |= (1<<bit);
-
-	dprintf4("pkru now: %08x\n", rdpkru());
-	wrpkru(pkru);
-}
-
-static inline void __pkey_write_allow(int pkey, int do_allow_write)
-{
-	long pkru = rdpkru();
-	int bit = pkey * 2 + 1;
-
-	if (do_allow_write)
-		pkru &= (1<<bit);
-	else
-		pkru |= (1<<bit);
-
-	wrpkru(pkru);
-	dprintf4("pkru now: %08x\n", rdpkru());
-}
-
-#define PROT_PKEY0     0x10            /* protection key value (bit 0) */
-#define PROT_PKEY1     0x20            /* protection key value (bit 1) */
-#define PROT_PKEY2     0x40            /* protection key value (bit 2) */
-#define PROT_PKEY3     0x80            /* protection key value (bit 3) */
-
-#define PAGE_SIZE 4096
-#define MB	(1<<20)
-
-static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
-		unsigned int *ecx, unsigned int *edx)
-{
-	/* ecx is often an input as well as an output. */
-	asm volatile(
-		"cpuid;"
-		: "=a" (*eax),
-		  "=b" (*ebx),
-		  "=c" (*ecx),
-		  "=d" (*edx)
-		: "0" (*eax), "2" (*ecx));
-}
-
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
-#define X86_FEATURE_PKU        (1<<3) /* Protection Keys for Userspace */
-#define X86_FEATURE_OSPKE      (1<<4) /* OS Protection Keys Enable */
-
-static inline int cpu_has_pku(void)
-{
-	unsigned int eax;
-	unsigned int ebx;
-	unsigned int ecx;
-	unsigned int edx;
-
-	eax = 0x7;
-	ecx = 0x0;
-	__cpuid(&eax, &ebx, &ecx, &edx);
-
-	if (!(ecx & X86_FEATURE_PKU)) {
-		dprintf2("cpu does not have PKU\n");
-		return 0;
-	}
-	if (!(ecx & X86_FEATURE_OSPKE)) {
-		dprintf2("cpu does not have OSPKE\n");
-		return 0;
-	}
-	return 1;
-}
-
-#define XSTATE_PKRU_BIT	(9)
-#define XSTATE_PKRU	0x200
-
-int pkru_xstate_offset(void)
-{
-	unsigned int eax;
-	unsigned int ebx;
-	unsigned int ecx;
-	unsigned int edx;
-	int xstate_offset;
-	int xstate_size;
-	unsigned long XSTATE_CPUID = 0xd;
-	int leaf;
-
-	/* assume that XSTATE_PKRU is set in XCR0 */
-	leaf = XSTATE_PKRU_BIT;
-	{
-		eax = XSTATE_CPUID;
-		ecx = leaf;
-		__cpuid(&eax, &ebx, &ecx, &edx);
-
-		if (leaf == XSTATE_PKRU_BIT) {
-			xstate_offset = ebx;
-			xstate_size = eax;
-		}
-	}
-
-	if (xstate_size == 0) {
-		printf("could not find size/offset of PKRU in xsave state\n");
-		return 0;
-	}
-
-	return xstate_offset;
-}
-
-#endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/x86/protection_keys.c b/tools/testing/selftests/x86/protection_keys.c
deleted file mode 100644
index 555e43c..0000000
--- a/tools/testing/selftests/x86/protection_keys.c
+++ /dev/null
@@ -1,1395 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Tests x86 Memory Protection Keys (see Documentation/x86/protection-keys.txt)
- *
- * There are examples in here of:
- *  * how to set protection keys on memory
- *  * how to set/clear bits in PKRU (the rights register)
- *  * how to handle SEGV_PKRU signals and extract pkey-relevant
- *    information from the siginfo
- *
- * Things to add:
- *	make sure KSM and KSM COW breaking works
- *	prefault pages in at malloc, or not
- *	protect MPX bounds tables with protection keys?
- *	make sure VMA splitting/merging is working correctly
- *	OOMs can destroy mm->mmap (see exit_mmap()), so make sure it is immune to pkeys
- *	look for pkey "leaks" where it is still set on a VMA but "freed" back to the kernel
- *	do a plain mprotect() to a mprotect_pkey() area and make sure the pkey sticks
- *
- * Compile like this:
- *	gcc      -o protection_keys    -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
- *	gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
- */
-#define _GNU_SOURCE
-#include <errno.h>
-#include <linux/futex.h>
-#include <sys/time.h>
-#include <sys/syscall.h>
-#include <string.h>
-#include <stdio.h>
-#include <stdint.h>
-#include <stdbool.h>
-#include <signal.h>
-#include <assert.h>
-#include <stdlib.h>
-#include <ucontext.h>
-#include <sys/mman.h>
-#include <sys/types.h>
-#include <sys/wait.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include <sys/ptrace.h>
-#include <setjmp.h>
-
-#include "pkey-helpers.h"
-
-int iteration_nr = 1;
-int test_nr;
-
-unsigned int shadow_pkru;
-
-#define HPAGE_SIZE	(1UL<<21)
-#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
-#define ALIGN_UP(x, align_to)	(((x) + ((align_to)-1)) & ~((align_to)-1))
-#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
-#define ALIGN_PTR_UP(p, ptr_align_to)	((typeof(p))ALIGN_UP((unsigned long)(p),	ptr_align_to))
-#define ALIGN_PTR_DOWN(p, ptr_align_to)	((typeof(p))ALIGN_DOWN((unsigned long)(p),	ptr_align_to))
-#define __stringify_1(x...)     #x
-#define __stringify(x...)       __stringify_1(x)
-
-#define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
-
-int dprint_in_signal;
-char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
-
-extern void abort_hooks(void);
-#define pkey_assert(condition) do {		\
-	if (!(condition)) {			\
-		dprintf0("assert() at %s::%d test_nr: %d iteration: %d\n", \
-				__FILE__, __LINE__,	\
-				test_nr, iteration_nr);	\
-		dprintf0("errno at assert: %d", errno);	\
-		abort_hooks();			\
-		assert(condition);		\
-	}					\
-} while (0)
-#define raw_assert(cond) assert(cond)
-
-void cat_into_file(char *str, char *file)
-{
-	int fd = open(file, O_RDWR);
-	int ret;
-
-	dprintf2("%s(): writing '%s' to '%s'\n", __func__, str, file);
-	/*
-	 * these need to be raw because they are called under
-	 * pkey_assert()
-	 */
-	raw_assert(fd >= 0);
-	ret = write(fd, str, strlen(str));
-	if (ret != strlen(str)) {
-		perror("write to file failed");
-		fprintf(stderr, "filename: '%s' str: '%s'\n", file, str);
-		raw_assert(0);
-	}
-	close(fd);
-}
-
-#if CONTROL_TRACING > 0
-static int warned_tracing;
-int tracing_root_ok(void)
-{
-	if (geteuid() != 0) {
-		if (!warned_tracing)
-			fprintf(stderr, "WARNING: not run as root, "
-					"can not do tracing control\n");
-		warned_tracing = 1;
-		return 0;
-	}
-	return 1;
-}
-#endif
-
-void tracing_on(void)
-{
-#if CONTROL_TRACING > 0
-#define TRACEDIR "/sys/kernel/debug/tracing"
-	char pidstr[32];
-
-	if (!tracing_root_ok())
-		return;
-
-	sprintf(pidstr, "%d", getpid());
-	cat_into_file("0", TRACEDIR "/tracing_on");
-	cat_into_file("\n", TRACEDIR "/trace");
-	if (1) {
-		cat_into_file("function_graph", TRACEDIR "/current_tracer");
-		cat_into_file("1", TRACEDIR "/options/funcgraph-proc");
-	} else {
-		cat_into_file("nop", TRACEDIR "/current_tracer");
-	}
-	cat_into_file(pidstr, TRACEDIR "/set_ftrace_pid");
-	cat_into_file("1", TRACEDIR "/tracing_on");
-	dprintf1("enabled tracing\n");
-#endif
-}
-
-void tracing_off(void)
-{
-#if CONTROL_TRACING > 0
-	if (!tracing_root_ok())
-		return;
-	cat_into_file("0", "/sys/kernel/debug/tracing/tracing_on");
-#endif
-}
-
-void abort_hooks(void)
-{
-	fprintf(stderr, "running %s()...\n", __func__);
-	tracing_off();
-#ifdef SLEEP_ON_ABORT
-	sleep(SLEEP_ON_ABORT);
-#endif
-}
-
-static inline void __page_o_noops(void)
-{
-	/* 8-bytes of instruction * 512 bytes = 1 page */
-	asm(".rept 512 ; nopl 0x7eeeeeee(%eax) ; .endr");
-}
-
-/*
- * This attempts to have roughly a page of instructions followed by a few
- * instructions that do a write, and another page of instructions.  That
- * way, we are pretty sure that the write is in the second page of
- * instructions and has at least a page of padding behind it.
- *
- * *That* lets us be sure to madvise() away the write instruction, which
- * will then fault, which makes sure that the fault code handles
- * execute-only memory properly.
- */
-__attribute__((__aligned__(PAGE_SIZE)))
-void lots_o_noops_around_write(int *write_to_me)
-{
-	dprintf3("running %s()\n", __func__);
-	__page_o_noops();
-	/* Assume this happens in the second page of instructions: */
-	*write_to_me = __LINE__;
-	/* pad out by another page: */
-	__page_o_noops();
-	dprintf3("%s() done\n", __func__);
-}
-
-/* Define some kernel-like types */
-#define  u8 uint8_t
-#define u16 uint16_t
-#define u32 uint32_t
-#define u64 uint64_t
-
-#ifdef __i386__
-#define SYS_mprotect_key 380
-#define SYS_pkey_alloc	 381
-#define SYS_pkey_free	 382
-#define REG_IP_IDX REG_EIP
-#define si_pkey_offset 0x14
-#else
-#define SYS_mprotect_key 329
-#define SYS_pkey_alloc	 330
-#define SYS_pkey_free	 331
-#define REG_IP_IDX REG_RIP
-#define si_pkey_offset 0x20
-#endif
-
-void dump_mem(void *dumpme, int len_bytes)
-{
-	char *c = (void *)dumpme;
-	int i;
-
-	for (i = 0; i < len_bytes; i += sizeof(u64)) {
-		u64 *ptr = (u64 *)(c + i);
-		dprintf1("dump[%03d][@%p]: %016jx\n", i, ptr, *ptr);
-	}
-}
-
-#define SEGV_BNDERR     3  /* failed address bound checks */
-#define SEGV_PKUERR     4
-
-static char *si_code_str(int si_code)
-{
-	if (si_code == SEGV_MAPERR)
-		return "SEGV_MAPERR";
-	if (si_code == SEGV_ACCERR)
-		return "SEGV_ACCERR";
-	if (si_code == SEGV_BNDERR)
-		return "SEGV_BNDERR";
-	if (si_code == SEGV_PKUERR)
-		return "SEGV_PKUERR";
-	return "UNKNOWN";
-}
-
-int pkru_faults;
-int last_si_pkey = -1;
-void signal_handler(int signum, siginfo_t *si, void *vucontext)
-{
-	ucontext_t *uctxt = vucontext;
-	int trapno;
-	unsigned long ip;
-	char *fpregs;
-	u32 *pkru_ptr;
-	u64 si_pkey;
-	u32 *si_pkey_ptr;
-	int pkru_offset;
-	fpregset_t fpregset;
-
-	dprint_in_signal = 1;
-	dprintf1(">>>>===============SIGSEGV============================\n");
-	dprintf1("%s()::%d, pkru: 0x%x shadow: %x\n", __func__, __LINE__,
-			__rdpkru(), shadow_pkru);
-
-	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
-	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
-	fpregset = uctxt->uc_mcontext.fpregs;
-	fpregs = (void *)fpregset;
-
-	dprintf2("%s() trapno: %d ip: 0x%lx info->si_code: %s/%d\n", __func__,
-			trapno, ip, si_code_str(si->si_code), si->si_code);
-#ifdef __i386__
-	/*
-	 * 32-bit has some extra padding so that userspace can tell whether
-	 * the XSTATE header is present in addition to the "legacy" FPU
-	 * state.  We just assume that it is here.
-	 */
-	fpregs += 0x70;
-#endif
-	pkru_offset = pkru_xstate_offset();
-	pkru_ptr = (void *)(&fpregs[pkru_offset]);
-
-	dprintf1("siginfo: %p\n", si);
-	dprintf1(" fpregs: %p\n", fpregs);
-	/*
-	 * If we got a PKRU fault, we *HAVE* to have at least one bit set in
-	 * here.
-	 */
-	dprintf1("pkru_xstate_offset: %d\n", pkru_xstate_offset());
-	if (DEBUG_LEVEL > 4)
-		dump_mem(pkru_ptr - 128, 256);
-	pkey_assert(*pkru_ptr);
-
-	si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
-	dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
-	dump_mem(si_pkey_ptr - 8, 24);
-	si_pkey = *si_pkey_ptr;
-	pkey_assert(si_pkey < NR_PKEYS);
-	last_si_pkey = si_pkey;
-
-	if ((si->si_code == SEGV_MAPERR) ||
-	    (si->si_code == SEGV_ACCERR) ||
-	    (si->si_code == SEGV_BNDERR)) {
-		printf("non-PK si_code, exiting...\n");
-		exit(4);
-	}
-
-	dprintf1("signal pkru from xsave: %08x\n", *pkru_ptr);
-	/* need __rdpkru() version so we do not do shadow_pkru checking */
-	dprintf1("signal pkru from  pkru: %08x\n", __rdpkru());
-	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
-	*(u64 *)pkru_ptr = 0x00000000;
-	dprintf1("WARNING: set PRKU=0 to allow faulting instruction to continue\n");
-	pkru_faults++;
-	dprintf1("<<<<==================================================\n");
-	return;
-	if (trapno == 14) {
-		fprintf(stderr,
-			"ERROR: In signal handler, page fault, trapno = %d, ip = %016lx\n",
-			trapno, ip);
-		fprintf(stderr, "si_addr %p\n", si->si_addr);
-		fprintf(stderr, "REG_ERR: %lx\n",
-				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
-		exit(1);
-	} else {
-		fprintf(stderr, "unexpected trap %d! at 0x%lx\n", trapno, ip);
-		fprintf(stderr, "si_addr %p\n", si->si_addr);
-		fprintf(stderr, "REG_ERR: %lx\n",
-				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
-		exit(2);
-	}
-	dprint_in_signal = 0;
-}
-
-int wait_all_children(void)
-{
-	int status;
-	return waitpid(-1, &status, 0);
-}
-
-void sig_chld(int x)
-{
-	dprint_in_signal = 1;
-	dprintf2("[%d] SIGCHLD: %d\n", getpid(), x);
-	dprint_in_signal = 0;
-}
-
-void setup_sigsegv_handler(void)
-{
-	int r, rs;
-	struct sigaction newact;
-	struct sigaction oldact;
-
-	/* #PF is mapped to sigsegv */
-	int signum  = SIGSEGV;
-
-	newact.sa_handler = 0;
-	newact.sa_sigaction = signal_handler;
-
-	/*sigset_t - signals to block while in the handler */
-	/* get the old signal mask. */
-	rs = sigprocmask(SIG_SETMASK, 0, &newact.sa_mask);
-	pkey_assert(rs == 0);
-
-	/* call sa_sigaction, not sa_handler*/
-	newact.sa_flags = SA_SIGINFO;
-
-	newact.sa_restorer = 0;  /* void(*)(), obsolete */
-	r = sigaction(signum, &newact, &oldact);
-	r = sigaction(SIGALRM, &newact, &oldact);
-	pkey_assert(r == 0);
-}
-
-void setup_handlers(void)
-{
-	signal(SIGCHLD, &sig_chld);
-	setup_sigsegv_handler();
-}
-
-pid_t fork_lazy_child(void)
-{
-	pid_t forkret;
-
-	forkret = fork();
-	pkey_assert(forkret >= 0);
-	dprintf3("[%d] fork() ret: %d\n", getpid(), forkret);
-
-	if (!forkret) {
-		/* in the child */
-		while (1) {
-			dprintf1("child sleeping...\n");
-			sleep(30);
-		}
-	}
-	return forkret;
-}
-
-void davecmp(void *_a, void *_b, int len)
-{
-	int i;
-	unsigned long *a = _a;
-	unsigned long *b = _b;
-
-	for (i = 0; i < len / sizeof(*a); i++) {
-		if (a[i] == b[i])
-			continue;
-
-		dprintf3("[%3d]: a: %016lx b: %016lx\n", i, a[i], b[i]);
-	}
-}
-
-void dumpit(char *f)
-{
-	int fd = open(f, O_RDONLY);
-	char buf[100];
-	int nr_read;
-
-	dprintf2("maps fd: %d\n", fd);
-	do {
-		nr_read = read(fd, &buf[0], sizeof(buf));
-		write(1, buf, nr_read);
-	} while (nr_read > 0);
-	close(fd);
-}
-
-#define PKEY_DISABLE_ACCESS    0x1
-#define PKEY_DISABLE_WRITE     0x2
-
-u32 pkey_get(int pkey, unsigned long flags)
-{
-	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 pkru = __rdpkru();
-	u32 shifted_pkru;
-	u32 masked_pkru;
-
-	dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
-			__func__, pkey, flags, 0, 0);
-	dprintf2("%s() raw pkru: %x\n", __func__, pkru);
-
-	shifted_pkru = (pkru >> (pkey * PKRU_BITS_PER_PKEY));
-	dprintf2("%s() shifted_pkru: %x\n", __func__, shifted_pkru);
-	masked_pkru = shifted_pkru & mask;
-	dprintf2("%s() masked  pkru: %x\n", __func__, masked_pkru);
-	/*
-	 * shift down the relevant bits to the lowest two, then
-	 * mask off all the other high bits.
-	 */
-	return masked_pkru;
-}
-
-int pkey_set(int pkey, unsigned long rights, unsigned long flags)
-{
-	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 old_pkru = __rdpkru();
-	u32 new_pkru;
-
-	/* make sure that 'rights' only contains the bits we expect: */
-	assert(!(rights & ~mask));
-
-	/* copy old pkru */
-	new_pkru = old_pkru;
-	/* mask out bits from pkey in old value: */
-	new_pkru &= ~(mask << (pkey * PKRU_BITS_PER_PKEY));
-	/* OR in new bits for pkey: */
-	new_pkru |= (rights << (pkey * PKRU_BITS_PER_PKEY));
-
-	__wrpkru(new_pkru);
-
-	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x pkru now: %x old_pkru: %x\n",
-			__func__, pkey, rights, flags, 0, __rdpkru(), old_pkru);
-	return 0;
-}
-
-void pkey_disable_set(int pkey, int flags)
-{
-	unsigned long syscall_flags = 0;
-	int ret;
-	int pkey_rights;
-	u32 orig_pkru = rdpkru();
-
-	dprintf1("START->%s(%d, 0x%x)\n", __func__,
-		pkey, flags);
-	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
-
-	pkey_rights = pkey_get(pkey, syscall_flags);
-
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
-	pkey_assert(pkey_rights >= 0);
-
-	pkey_rights |= flags;
-
-	ret = pkey_set(pkey, pkey_rights, syscall_flags);
-	assert(!ret);
-	/*pkru and flags have the same format */
-	shadow_pkru |= flags << (pkey * 2);
-	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkru);
-
-	pkey_assert(ret >= 0);
-
-	pkey_rights = pkey_get(pkey, syscall_flags);
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
-
-	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
-	if (flags)
-		pkey_assert(rdpkru() > orig_pkru);
-	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
-		pkey, flags);
-}
-
-void pkey_disable_clear(int pkey, int flags)
-{
-	unsigned long syscall_flags = 0;
-	int ret;
-	int pkey_rights = pkey_get(pkey, syscall_flags);
-	u32 orig_pkru = rdpkru();
-
-	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
-
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
-	pkey_assert(pkey_rights >= 0);
-
-	pkey_rights |= flags;
-
-	ret = pkey_set(pkey, pkey_rights, 0);
-	/* pkru and flags have the same format */
-	shadow_pkru &= ~(flags << (pkey * 2));
-	pkey_assert(ret >= 0);
-
-	pkey_rights = pkey_get(pkey, syscall_flags);
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
-
-	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
-	if (flags)
-		assert(rdpkru() > orig_pkru);
-}
-
-void pkey_write_allow(int pkey)
-{
-	pkey_disable_clear(pkey, PKEY_DISABLE_WRITE);
-}
-void pkey_write_deny(int pkey)
-{
-	pkey_disable_set(pkey, PKEY_DISABLE_WRITE);
-}
-void pkey_access_allow(int pkey)
-{
-	pkey_disable_clear(pkey, PKEY_DISABLE_ACCESS);
-}
-void pkey_access_deny(int pkey)
-{
-	pkey_disable_set(pkey, PKEY_DISABLE_ACCESS);
-}
-
-int sys_mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
-		unsigned long pkey)
-{
-	int sret;
-
-	dprintf2("%s(0x%p, %zx, prot=%lx, pkey=%lx)\n", __func__,
-			ptr, size, orig_prot, pkey);
-
-	errno = 0;
-	sret = syscall(SYS_mprotect_key, ptr, size, orig_prot, pkey);
-	if (errno) {
-		dprintf2("SYS_mprotect_key sret: %d\n", sret);
-		dprintf2("SYS_mprotect_key prot: 0x%lx\n", orig_prot);
-		dprintf2("SYS_mprotect_key failed, errno: %d\n", errno);
-		if (DEBUG_LEVEL >= 2)
-			perror("SYS_mprotect_pkey");
-	}
-	return sret;
-}
-
-int sys_pkey_alloc(unsigned long flags, unsigned long init_val)
-{
-	int ret = syscall(SYS_pkey_alloc, flags, init_val);
-	dprintf1("%s(flags=%lx, init_val=%lx) syscall ret: %d errno: %d\n",
-			__func__, flags, init_val, ret, errno);
-	return ret;
-}
-
-int alloc_pkey(void)
-{
-	int ret;
-	unsigned long init_val = 0x0;
-
-	dprintf1("alloc_pkey()::%d, pkru: 0x%x shadow: %x\n",
-			__LINE__, __rdpkru(), shadow_pkru);
-	ret = sys_pkey_alloc(0, init_val);
-	/*
-	 * pkey_alloc() sets PKRU, so we need to reflect it in
-	 * shadow_pkru:
-	 */
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	if (ret) {
-		/* clear both the bits: */
-		shadow_pkru &= ~(0x3      << (ret * 2));
-		dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-				__LINE__, ret, __rdpkru(), shadow_pkru);
-		/*
-		 * move the new state in from init_val
-		 * (remember, we cheated and init_val == pkru format)
-		 */
-		shadow_pkru |=  (init_val << (ret * 2));
-	}
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	dprintf1("alloc_pkey()::%d errno: %d\n", __LINE__, errno);
-	/* for shadow checking: */
-	rdpkru();
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	return ret;
-}
-
-int sys_pkey_free(unsigned long pkey)
-{
-	int ret = syscall(SYS_pkey_free, pkey);
-	dprintf1("%s(pkey=%ld) syscall ret: %d\n", __func__, pkey, ret);
-	return ret;
-}
-
-/*
- * I had a bug where pkey bits could be set by mprotect() but
- * not cleared.  This ensures we get lots of random bit sets
- * and clears on the vma and pte pkey bits.
- */
-int alloc_random_pkey(void)
-{
-	int max_nr_pkey_allocs;
-	int ret;
-	int i;
-	int alloced_pkeys[NR_PKEYS];
-	int nr_alloced = 0;
-	int random_index;
-	memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
-
-	/* allocate every possible key and make a note of which ones we got */
-	max_nr_pkey_allocs = NR_PKEYS;
-	max_nr_pkey_allocs = 1;
-	for (i = 0; i < max_nr_pkey_allocs; i++) {
-		int new_pkey = alloc_pkey();
-		if (new_pkey < 0)
-			break;
-		alloced_pkeys[nr_alloced++] = new_pkey;
-	}
-
-	pkey_assert(nr_alloced > 0);
-	/* select a random one out of the allocated ones */
-	random_index = rand() % nr_alloced;
-	ret = alloced_pkeys[random_index];
-	/* now zero it out so we don't free it next */
-	alloced_pkeys[random_index] = 0;
-
-	/* go through the allocated ones that we did not want and free them */
-	for (i = 0; i < nr_alloced; i++) {
-		int free_ret;
-		if (!alloced_pkeys[i])
-			continue;
-		free_ret = sys_pkey_free(alloced_pkeys[i]);
-		pkey_assert(!free_ret);
-	}
-	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	return ret;
-}
-
-int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
-		unsigned long pkey)
-{
-	int nr_iterations = random() % 100;
-	int ret;
-
-	while (0) {
-		int rpkey = alloc_random_pkey();
-		ret = sys_mprotect_pkey(ptr, size, orig_prot, pkey);
-		dprintf1("sys_mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
-				ptr, size, orig_prot, pkey, ret);
-		if (nr_iterations-- < 0)
-			break;
-
-		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-		sys_pkey_free(rpkey);
-		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	}
-	pkey_assert(pkey < NR_PKEYS);
-
-	ret = sys_mprotect_pkey(ptr, size, orig_prot, pkey);
-	dprintf1("mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
-			ptr, size, orig_prot, pkey, ret);
-	pkey_assert(!ret);
-	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	return ret;
-}
-
-struct pkey_malloc_record {
-	void *ptr;
-	long size;
-};
-struct pkey_malloc_record *pkey_malloc_records;
-long nr_pkey_malloc_records;
-void record_pkey_malloc(void *ptr, long size)
-{
-	long i;
-	struct pkey_malloc_record *rec = NULL;
-
-	for (i = 0; i < nr_pkey_malloc_records; i++) {
-		rec = &pkey_malloc_records[i];
-		/* find a free record */
-		if (rec)
-			break;
-	}
-	if (!rec) {
-		/* every record is full */
-		size_t old_nr_records = nr_pkey_malloc_records;
-		size_t new_nr_records = (nr_pkey_malloc_records * 2 + 1);
-		size_t new_size = new_nr_records * sizeof(struct pkey_malloc_record);
-		dprintf2("new_nr_records: %zd\n", new_nr_records);
-		dprintf2("new_size: %zd\n", new_size);
-		pkey_malloc_records = realloc(pkey_malloc_records, new_size);
-		pkey_assert(pkey_malloc_records != NULL);
-		rec = &pkey_malloc_records[nr_pkey_malloc_records];
-		/*
-		 * realloc() does not initialize memory, so zero it from
-		 * the first new record all the way to the end.
-		 */
-		for (i = 0; i < new_nr_records - old_nr_records; i++)
-			memset(rec + i, 0, sizeof(*rec));
-	}
-	dprintf3("filling malloc record[%d/%p]: {%p, %ld}\n",
-		(int)(rec - pkey_malloc_records), rec, ptr, size);
-	rec->ptr = ptr;
-	rec->size = size;
-	nr_pkey_malloc_records++;
-}
-
-void free_pkey_malloc(void *ptr)
-{
-	long i;
-	int ret;
-	dprintf3("%s(%p)\n", __func__, ptr);
-	for (i = 0; i < nr_pkey_malloc_records; i++) {
-		struct pkey_malloc_record *rec = &pkey_malloc_records[i];
-		dprintf4("looking for ptr %p at record[%ld/%p]: {%p, %ld}\n",
-				ptr, i, rec, rec->ptr, rec->size);
-		if ((ptr <  rec->ptr) ||
-		    (ptr >= rec->ptr + rec->size))
-			continue;
-
-		dprintf3("found ptr %p at record[%ld/%p]: {%p, %ld}\n",
-				ptr, i, rec, rec->ptr, rec->size);
-		nr_pkey_malloc_records--;
-		ret = munmap(rec->ptr, rec->size);
-		dprintf3("munmap ret: %d\n", ret);
-		pkey_assert(!ret);
-		dprintf3("clearing rec->ptr, rec: %p\n", rec);
-		rec->ptr = NULL;
-		dprintf3("done clearing rec->ptr, rec: %p\n", rec);
-		return;
-	}
-	pkey_assert(false);
-}
-
-
-void *malloc_pkey_with_mprotect(long size, int prot, u16 pkey)
-{
-	void *ptr;
-	int ret;
-
-	rdpkru();
-	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
-			size, prot, pkey);
-	pkey_assert(pkey < NR_PKEYS);
-	ptr = mmap(NULL, size, prot, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-	pkey_assert(ptr != (void *)-1);
-	ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
-	pkey_assert(!ret);
-	record_pkey_malloc(ptr, size);
-	rdpkru();
-
-	dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
-	return ptr;
-}
-
-void *malloc_pkey_anon_huge(long size, int prot, u16 pkey)
-{
-	int ret;
-	void *ptr;
-
-	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
-			size, prot, pkey);
-	/*
-	 * Guarantee we can fit at least one huge page in the resulting
-	 * allocation by allocating space for 2:
-	 */
-	size = ALIGN_UP(size, HPAGE_SIZE * 2);
-	ptr = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-	pkey_assert(ptr != (void *)-1);
-	record_pkey_malloc(ptr, size);
-	mprotect_pkey(ptr, size, prot, pkey);
-
-	dprintf1("unaligned ptr: %p\n", ptr);
-	ptr = ALIGN_PTR_UP(ptr, HPAGE_SIZE);
-	dprintf1("  aligned ptr: %p\n", ptr);
-	ret = madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE);
-	dprintf1("MADV_HUGEPAGE ret: %d\n", ret);
-	ret = madvise(ptr, HPAGE_SIZE, MADV_WILLNEED);
-	dprintf1("MADV_WILLNEED ret: %d\n", ret);
-	memset(ptr, 0, HPAGE_SIZE);
-
-	dprintf1("mmap()'d thp for pkey %d @ %p\n", pkey, ptr);
-	return ptr;
-}
-
-int hugetlb_setup_ok;
-#define GET_NR_HUGE_PAGES 10
-void setup_hugetlbfs(void)
-{
-	int err;
-	int fd;
-	char buf[] = "123";
-
-	if (geteuid() != 0) {
-		fprintf(stderr, "WARNING: not run as root, can not do hugetlb test\n");
-		return;
-	}
-
-	cat_into_file(__stringify(GET_NR_HUGE_PAGES), "/proc/sys/vm/nr_hugepages");
-
-	/*
-	 * Now go make sure that we got the pages and that they
-	 * are 2M pages.  Someone might have made 1G the default.
-	 */
-	fd = open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages", O_RDONLY);
-	if (fd < 0) {
-		perror("opening sysfs 2M hugetlb config");
-		return;
-	}
-
-	/* -1 to guarantee leaving the trailing \0 */
-	err = read(fd, buf, sizeof(buf)-1);
-	close(fd);
-	if (err <= 0) {
-		perror("reading sysfs 2M hugetlb config");
-		return;
-	}
-
-	if (atoi(buf) != GET_NR_HUGE_PAGES) {
-		fprintf(stderr, "could not confirm 2M pages, got: '%s' expected %d\n",
-			buf, GET_NR_HUGE_PAGES);
-		return;
-	}
-
-	hugetlb_setup_ok = 1;
-}
-
-void *malloc_pkey_hugetlb(long size, int prot, u16 pkey)
-{
-	void *ptr;
-	int flags = MAP_ANONYMOUS|MAP_PRIVATE|MAP_HUGETLB;
-
-	if (!hugetlb_setup_ok)
-		return PTR_ERR_ENOTSUP;
-
-	dprintf1("doing %s(%ld, %x, %x)\n", __func__, size, prot, pkey);
-	size = ALIGN_UP(size, HPAGE_SIZE * 2);
-	pkey_assert(pkey < NR_PKEYS);
-	ptr = mmap(NULL, size, PROT_NONE, flags, -1, 0);
-	pkey_assert(ptr != (void *)-1);
-	mprotect_pkey(ptr, size, prot, pkey);
-
-	record_pkey_malloc(ptr, size);
-
-	dprintf1("mmap()'d hugetlbfs for pkey %d @ %p\n", pkey, ptr);
-	return ptr;
-}
-
-void *malloc_pkey_mmap_dax(long size, int prot, u16 pkey)
-{
-	void *ptr;
-	int fd;
-
-	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
-			size, prot, pkey);
-	pkey_assert(pkey < NR_PKEYS);
-	fd = open("/dax/foo", O_RDWR);
-	pkey_assert(fd >= 0);
-
-	ptr = mmap(0, size, prot, MAP_SHARED, fd, 0);
-	pkey_assert(ptr != (void *)-1);
-
-	mprotect_pkey(ptr, size, prot, pkey);
-
-	record_pkey_malloc(ptr, size);
-
-	dprintf1("mmap()'d for pkey %d @ %p\n", pkey, ptr);
-	close(fd);
-	return ptr;
-}
-
-void *(*pkey_malloc[])(long size, int prot, u16 pkey) = {
-
-	malloc_pkey_with_mprotect,
-	malloc_pkey_anon_huge,
-	malloc_pkey_hugetlb
-/* can not do direct with the pkey_mprotect() API:
-	malloc_pkey_mmap_direct,
-	malloc_pkey_mmap_dax,
-*/
-};
-
-void *malloc_pkey(long size, int prot, u16 pkey)
-{
-	void *ret;
-	static int malloc_type;
-	int nr_malloc_types = ARRAY_SIZE(pkey_malloc);
-
-	pkey_assert(pkey < NR_PKEYS);
-
-	while (1) {
-		pkey_assert(malloc_type < nr_malloc_types);
-
-		ret = pkey_malloc[malloc_type](size, prot, pkey);
-		pkey_assert(ret != (void *)-1);
-
-		malloc_type++;
-		if (malloc_type >= nr_malloc_types)
-			malloc_type = (random()%nr_malloc_types);
-
-		/* try again if the malloc_type we tried is unsupported */
-		if (ret == PTR_ERR_ENOTSUP)
-			continue;
-
-		break;
-	}
-
-	dprintf3("%s(%ld, prot=%x, pkey=%x) returning: %p\n", __func__,
-			size, prot, pkey, ret);
-	return ret;
-}
-
-int last_pkru_faults;
-void expected_pk_fault(int pkey)
-{
-	dprintf2("%s(): last_pkru_faults: %d pkru_faults: %d\n",
-			__func__, last_pkru_faults, pkru_faults);
-	dprintf2("%s(%d): last_si_pkey: %d\n", __func__, pkey, last_si_pkey);
-	pkey_assert(last_pkru_faults + 1 == pkru_faults);
-	pkey_assert(last_si_pkey == pkey);
-	/*
-	 * The signal handler shold have cleared out PKRU to let the
-	 * test program continue.  We now have to restore it.
-	 */
-	if (__rdpkru() != 0)
-		pkey_assert(0);
-
-	__wrpkru(shadow_pkru);
-	dprintf1("%s() set PKRU=%x to restore state after signal nuked it\n",
-			__func__, shadow_pkru);
-	last_pkru_faults = pkru_faults;
-	last_si_pkey = -1;
-}
-
-void do_not_expect_pk_fault(void)
-{
-	pkey_assert(last_pkru_faults == pkru_faults);
-}
-
-int test_fds[10] = { -1 };
-int nr_test_fds;
-void __save_test_fd(int fd)
-{
-	pkey_assert(fd >= 0);
-	pkey_assert(nr_test_fds < ARRAY_SIZE(test_fds));
-	test_fds[nr_test_fds] = fd;
-	nr_test_fds++;
-}
-
-int get_test_read_fd(void)
-{
-	int test_fd = open("/etc/passwd", O_RDONLY);
-	__save_test_fd(test_fd);
-	return test_fd;
-}
-
-void close_test_fds(void)
-{
-	int i;
-
-	for (i = 0; i < nr_test_fds; i++) {
-		if (test_fds[i] < 0)
-			continue;
-		close(test_fds[i]);
-		test_fds[i] = -1;
-	}
-	nr_test_fds = 0;
-}
-
-#define barrier() __asm__ __volatile__("": : :"memory")
-__attribute__((noinline)) int read_ptr(int *ptr)
-{
-	/*
-	 * Keep GCC from optimizing this away somehow
-	 */
-	barrier();
-	return *ptr;
-}
-
-void test_read_of_write_disabled_region(int *ptr, u16 pkey)
-{
-	int ptr_contents;
-
-	dprintf1("disabling write access to PKEY[1], doing read\n");
-	pkey_write_deny(pkey);
-	ptr_contents = read_ptr(ptr);
-	dprintf1("*ptr: %d\n", ptr_contents);
-	dprintf1("\n");
-}
-void test_read_of_access_disabled_region(int *ptr, u16 pkey)
-{
-	int ptr_contents;
-
-	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n", pkey, ptr);
-	rdpkru();
-	pkey_access_deny(pkey);
-	ptr_contents = read_ptr(ptr);
-	dprintf1("*ptr: %d\n", ptr_contents);
-	expected_pk_fault(pkey);
-}
-void test_write_of_write_disabled_region(int *ptr, u16 pkey)
-{
-	dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
-	pkey_write_deny(pkey);
-	*ptr = __LINE__;
-	expected_pk_fault(pkey);
-}
-void test_write_of_access_disabled_region(int *ptr, u16 pkey)
-{
-	dprintf1("disabling access to PKEY[%02d], doing write\n", pkey);
-	pkey_access_deny(pkey);
-	*ptr = __LINE__;
-	expected_pk_fault(pkey);
-}
-void test_kernel_write_of_access_disabled_region(int *ptr, u16 pkey)
-{
-	int ret;
-	int test_fd = get_test_read_fd();
-
-	dprintf1("disabling access to PKEY[%02d], "
-		 "having kernel read() to buffer\n", pkey);
-	pkey_access_deny(pkey);
-	ret = read(test_fd, ptr, 1);
-	dprintf1("read ret: %d\n", ret);
-	pkey_assert(ret);
-}
-void test_kernel_write_of_write_disabled_region(int *ptr, u16 pkey)
-{
-	int ret;
-	int test_fd = get_test_read_fd();
-
-	pkey_write_deny(pkey);
-	ret = read(test_fd, ptr, 100);
-	dprintf1("read ret: %d\n", ret);
-	if (ret < 0 && (DEBUG_LEVEL > 0))
-		perror("verbose read result (OK for this to be bad)");
-	pkey_assert(ret);
-}
-
-void test_kernel_gup_of_access_disabled_region(int *ptr, u16 pkey)
-{
-	int pipe_ret, vmsplice_ret;
-	struct iovec iov;
-	int pipe_fds[2];
-
-	pipe_ret = pipe(pipe_fds);
-
-	pkey_assert(pipe_ret == 0);
-	dprintf1("disabling access to PKEY[%02d], "
-		 "having kernel vmsplice from buffer\n", pkey);
-	pkey_access_deny(pkey);
-	iov.iov_base = ptr;
-	iov.iov_len = PAGE_SIZE;
-	vmsplice_ret = vmsplice(pipe_fds[1], &iov, 1, SPLICE_F_GIFT);
-	dprintf1("vmsplice() ret: %d\n", vmsplice_ret);
-	pkey_assert(vmsplice_ret == -1);
-
-	close(pipe_fds[0]);
-	close(pipe_fds[1]);
-}
-
-void test_kernel_gup_write_to_write_disabled_region(int *ptr, u16 pkey)
-{
-	int ignored = 0xdada;
-	int futex_ret;
-	int some_int = __LINE__;
-
-	dprintf1("disabling write to PKEY[%02d], "
-		 "doing futex gunk in buffer\n", pkey);
-	*ptr = some_int;
-	pkey_write_deny(pkey);
-	futex_ret = syscall(SYS_futex, ptr, FUTEX_WAIT, some_int-1, NULL,
-			&ignored, ignored);
-	if (DEBUG_LEVEL > 0)
-		perror("futex");
-	dprintf1("futex() ret: %d\n", futex_ret);
-}
-
-/* Assumes that all pkeys other than 'pkey' are unallocated */
-void test_pkey_syscalls_on_non_allocated_pkey(int *ptr, u16 pkey)
-{
-	int err;
-	int i;
-
-	/* Note: 0 is the default pkey, so don't mess with it */
-	for (i = 1; i < NR_PKEYS; i++) {
-		if (pkey == i)
-			continue;
-
-		dprintf1("trying get/set/free to non-allocated pkey: %2d\n", i);
-		err = sys_pkey_free(i);
-		pkey_assert(err);
-
-		err = sys_pkey_free(i);
-		pkey_assert(err);
-
-		err = sys_mprotect_pkey(ptr, PAGE_SIZE, PROT_READ, i);
-		pkey_assert(err);
-	}
-}
-
-/* Assumes that all pkeys other than 'pkey' are unallocated */
-void test_pkey_syscalls_bad_args(int *ptr, u16 pkey)
-{
-	int err;
-	int bad_pkey = NR_PKEYS+99;
-
-	/* pass a known-invalid pkey in: */
-	err = sys_mprotect_pkey(ptr, PAGE_SIZE, PROT_READ, bad_pkey);
-	pkey_assert(err);
-}
-
-/* Assumes that all pkeys other than 'pkey' are unallocated */
-void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
-{
-	int err;
-	int allocated_pkeys[NR_PKEYS] = {0};
-	int nr_allocated_pkeys = 0;
-	int i;
-
-	for (i = 0; i < NR_PKEYS*2; i++) {
-		int new_pkey;
-		dprintf1("%s() alloc loop: %d\n", __func__, i);
-		new_pkey = alloc_pkey();
-		dprintf4("%s()::%d, err: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-				__LINE__, err, __rdpkru(), shadow_pkru);
-		rdpkru(); /* for shadow checking */
-		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
-		if ((new_pkey == -1) && (errno == ENOSPC)) {
-			dprintf2("%s() failed to allocate pkey after %d tries\n",
-				__func__, nr_allocated_pkeys);
-			break;
-		}
-		pkey_assert(nr_allocated_pkeys < NR_PKEYS);
-		allocated_pkeys[nr_allocated_pkeys++] = new_pkey;
-	}
-
-	dprintf3("%s()::%d\n", __func__, __LINE__);
-
-	/*
-	 * ensure it did not reach the end of the loop without
-	 * failure:
-	 */
-	pkey_assert(i < NR_PKEYS*2);
-
-	/*
-	 * There are 16 pkeys supported in hardware.  One is taken
-	 * up for the default (0) and another can be taken up by
-	 * an execute-only mapping.  Ensure that we can allocate
-	 * at least 14 (16-2).
-	 */
-	pkey_assert(i >= NR_PKEYS-2);
-
-	for (i = 0; i < nr_allocated_pkeys; i++) {
-		err = sys_pkey_free(allocated_pkeys[i]);
-		pkey_assert(!err);
-		rdpkru(); /* for shadow checking */
-	}
-}
-
-void test_ptrace_of_child(int *ptr, u16 pkey)
-{
-	__attribute__((__unused__)) int peek_result;
-	pid_t child_pid;
-	void *ignored = 0;
-	long ret;
-	int status;
-	/*
-	 * This is the "control" for our little expermient.  Make sure
-	 * we can always access it when ptracing.
-	 */
-	int *plain_ptr_unaligned = malloc(HPAGE_SIZE);
-	int *plain_ptr = ALIGN_PTR_UP(plain_ptr_unaligned, PAGE_SIZE);
-
-	/*
-	 * Fork a child which is an exact copy of this process, of course.
-	 * That means we can do all of our tests via ptrace() and then plain
-	 * memory access and ensure they work differently.
-	 */
-	child_pid = fork_lazy_child();
-	dprintf1("[%d] child pid: %d\n", getpid(), child_pid);
-
-	ret = ptrace(PTRACE_ATTACH, child_pid, ignored, ignored);
-	if (ret)
-		perror("attach");
-	dprintf1("[%d] attach ret: %ld %d\n", getpid(), ret, __LINE__);
-	pkey_assert(ret != -1);
-	ret = waitpid(child_pid, &status, WUNTRACED);
-	if ((ret != child_pid) || !(WIFSTOPPED(status))) {
-		fprintf(stderr, "weird waitpid result %ld stat %x\n",
-				ret, status);
-		pkey_assert(0);
-	}
-	dprintf2("waitpid ret: %ld\n", ret);
-	dprintf2("waitpid status: %d\n", status);
-
-	pkey_access_deny(pkey);
-	pkey_write_deny(pkey);
-
-	/* Write access, untested for now:
-	ret = ptrace(PTRACE_POKEDATA, child_pid, peek_at, data);
-	pkey_assert(ret != -1);
-	dprintf1("poke at %p: %ld\n", peek_at, ret);
-	*/
-
-	/*
-	 * Try to access the pkey-protected "ptr" via ptrace:
-	 */
-	ret = ptrace(PTRACE_PEEKDATA, child_pid, ptr, ignored);
-	/* expect it to work, without an error: */
-	pkey_assert(ret != -1);
-	/* Now access from the current task, and expect an exception: */
-	peek_result = read_ptr(ptr);
-	expected_pk_fault(pkey);
-
-	/*
-	 * Try to access the NON-pkey-protected "plain_ptr" via ptrace:
-	 */
-	ret = ptrace(PTRACE_PEEKDATA, child_pid, plain_ptr, ignored);
-	/* expect it to work, without an error: */
-	pkey_assert(ret != -1);
-	/* Now access from the current task, and expect NO exception: */
-	peek_result = read_ptr(plain_ptr);
-	do_not_expect_pk_fault();
-
-	ret = ptrace(PTRACE_DETACH, child_pid, ignored, 0);
-	pkey_assert(ret != -1);
-
-	ret = kill(child_pid, SIGKILL);
-	pkey_assert(ret != -1);
-
-	wait(&status);
-
-	free(plain_ptr_unaligned);
-}
-
-void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
-{
-	void *p1;
-	int scratch;
-	int ptr_contents;
-	int ret;
-
-	p1 = ALIGN_PTR_UP(&lots_o_noops_around_write, PAGE_SIZE);
-	dprintf3("&lots_o_noops: %p\n", &lots_o_noops_around_write);
-	/* lots_o_noops_around_write should be page-aligned already */
-	assert(p1 == &lots_o_noops_around_write);
-
-	/* Point 'p1' at the *second* page of the function: */
-	p1 += PAGE_SIZE;
-
-	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
-	lots_o_noops_around_write(&scratch);
-	ptr_contents = read_ptr(p1);
-	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
-
-	ret = mprotect_pkey(p1, PAGE_SIZE, PROT_EXEC, (u64)pkey);
-	pkey_assert(!ret);
-	pkey_access_deny(pkey);
-
-	dprintf2("pkru: %x\n", rdpkru());
-
-	/*
-	 * Make sure this is an *instruction* fault
-	 */
-	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
-	lots_o_noops_around_write(&scratch);
-	do_not_expect_pk_fault();
-	ptr_contents = read_ptr(p1);
-	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
-	expected_pk_fault(pkey);
-}
-
-void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
-{
-	int size = PAGE_SIZE;
-	int sret;
-
-	if (cpu_has_pku()) {
-		dprintf1("SKIP: %s: no CPU support\n", __func__);
-		return;
-	}
-
-	sret = syscall(SYS_mprotect_key, ptr, size, PROT_READ, pkey);
-	pkey_assert(sret < 0);
-}
-
-void (*pkey_tests[])(int *ptr, u16 pkey) = {
-	test_read_of_write_disabled_region,
-	test_read_of_access_disabled_region,
-	test_write_of_write_disabled_region,
-	test_write_of_access_disabled_region,
-	test_kernel_write_of_access_disabled_region,
-	test_kernel_write_of_write_disabled_region,
-	test_kernel_gup_of_access_disabled_region,
-	test_kernel_gup_write_to_write_disabled_region,
-	test_executing_on_unreadable_memory,
-	test_ptrace_of_child,
-	test_pkey_syscalls_on_non_allocated_pkey,
-	test_pkey_syscalls_bad_args,
-	test_pkey_alloc_exhaust,
-};
-
-void run_tests_once(void)
-{
-	int *ptr;
-	int prot = PROT_READ|PROT_WRITE;
-
-	for (test_nr = 0; test_nr < ARRAY_SIZE(pkey_tests); test_nr++) {
-		int pkey;
-		int orig_pkru_faults = pkru_faults;
-
-		dprintf1("======================\n");
-		dprintf1("test %d preparing...\n", test_nr);
-
-		tracing_on();
-		pkey = alloc_random_pkey();
-		dprintf1("test %d starting with pkey: %d\n", test_nr, pkey);
-		ptr = malloc_pkey(PAGE_SIZE, prot, pkey);
-		dprintf1("test %d starting...\n", test_nr);
-		pkey_tests[test_nr](ptr, pkey);
-		dprintf1("freeing test memory: %p\n", ptr);
-		free_pkey_malloc(ptr);
-		sys_pkey_free(pkey);
-
-		dprintf1("pkru_faults: %d\n", pkru_faults);
-		dprintf1("orig_pkru_faults: %d\n", orig_pkru_faults);
-
-		tracing_off();
-		close_test_fds();
-
-		printf("test %2d PASSED (iteration %d)\n", test_nr, iteration_nr);
-		dprintf1("======================\n\n");
-	}
-	iteration_nr++;
-}
-
-void pkey_setup_shadow(void)
-{
-	shadow_pkru = __rdpkru();
-}
-
-int main(void)
-{
-	int nr_iterations = 22;
-
-	setup_handlers();
-
-	printf("has pku: %d\n", cpu_has_pku());
-
-	if (!cpu_has_pku()) {
-		int size = PAGE_SIZE;
-		int *ptr;
-
-		printf("running PKEY tests for unsupported CPU/OS\n");
-
-		ptr  = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-		assert(ptr != (void *)-1);
-		test_mprotect_pkey_on_unsupported_cpu(ptr, 1);
-		exit(0);
-	}
-
-	pkey_setup_shadow();
-	printf("startup pkru: %x\n", rdpkru());
-	setup_hugetlbfs();
-
-	while (nr_iterations-- > 0)
-		run_tests_once();
-
-	printf("done (all tests OK)\n");
-	return 0;
-}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 33/51] selftest/vm: rename all references to pkru to a generic name
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Some pkru references are renamed to pkey_reg (the rights register
and its shadow copy) and other pkru references are renamed to pkey.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
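Note for reviewers (not part of the commit): the renamed helpers keep
treating the register as a packed array of PKEY_BITS_PER_PKEY (2) bits
per key, with the access-disable bit at pkey * 2 and the write-disable
bit at pkey * 2 + 1. The sketch below models only that layout on a plain
integer, so it runs on any machine. The sketch_* names are hypothetical
and do not exist in the selftest. The flag values mirror
PKEY_DISABLE_ACCESS (0x1) and PKEY_DISABLE_WRITE (0x2). This is the
x86-style layout the test currently assumes; other architectures may
lay the bits out differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values mirror PKEY_DISABLE_ACCESS/PKEY_DISABLE_WRITE. */
#define SKETCH_DISABLE_ACCESS	0x1u
#define SKETCH_DISABLE_WRITE	0x2u
#define SKETCH_BITS_PER_PKEY	2
#define SKETCH_NR_PKEYS		16

/* Extract the two rights bits for 'pkey' from a register value. */
static uint32_t sketch_pkey_rights(uint32_t pkey_reg, int pkey)
{
	uint32_t mask = SKETCH_DISABLE_ACCESS | SKETCH_DISABLE_WRITE;

	assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);
	return (pkey_reg >> (pkey * SKETCH_BITS_PER_PKEY)) & mask;
}

/* Return a register value with the rights bits for 'pkey' replaced. */
static uint32_t sketch_pkey_set_rights(uint32_t pkey_reg, int pkey,
				       uint32_t rights)
{
	uint32_t mask = SKETCH_DISABLE_ACCESS | SKETCH_DISABLE_WRITE;
	int shift = pkey * SKETCH_BITS_PER_PKEY;

	assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);
	/* mask out the old bits for this pkey, then OR in the new ones */
	pkey_reg &= ~(mask << shift);
	pkey_reg |= (rights & mask) << shift;
	return pkey_reg;
}
```

Working on a value rather than on the hardware register keeps the
sketch independent of the rdpkru/wrpkru instructions, which is the same
separation the rename to pkey_reg is aiming for.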
 tools/testing/selftests/vm/pkey-helpers.h    |   85 +++++-----
 tools/testing/selftests/vm/protection_keys.c |  227 ++++++++++++++------------
 2 files changed, 164 insertions(+), 148 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 3818f25..2d91d34 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -14,7 +14,7 @@
 #include <sys/mman.h>
 
 #define NR_PKEYS 16
-#define PKRU_BITS_PER_PKEY 2
+#define PKEY_BITS_PER_PKEY 2
 
 #ifndef DEBUG_LEVEL
 #define DEBUG_LEVEL 0
@@ -54,85 +54,88 @@ static inline void sigsafe_printf(const char *format, ...)
 #define dprintf3(args...) dprintf_level(3, args)
 #define dprintf4(args...) dprintf_level(4, args)
 
-extern unsigned int shadow_pkru;
-static inline unsigned int __rdpkru(void)
+extern unsigned int shadow_pkey_reg;
+static inline unsigned int __rdpkey_reg(void)
 {
 	unsigned int eax, edx;
 	unsigned int ecx = 0;
-	unsigned int pkru;
+	unsigned int pkey_reg;
 
 	asm volatile(".byte 0x0f,0x01,0xee\n\t"
 		     : "=a" (eax), "=d" (edx)
 		     : "c" (ecx));
-	pkru = eax;
-	return pkru;
+	pkey_reg = eax;
+	return pkey_reg;
 }
 
-static inline unsigned int _rdpkru(int line)
+static inline unsigned int _rdpkey_reg(int line)
 {
-	unsigned int pkru = __rdpkru();
+	unsigned int pkey_reg = __rdpkey_reg();
 
-	dprintf4("rdpkru(line=%d) pkru: %x shadow: %x\n",
-			line, pkru, shadow_pkru);
-	assert(pkru == shadow_pkru);
+	dprintf4("rdpkey_reg(line=%d) pkey_reg: %x shadow: %x\n",
+			line, pkey_reg, shadow_pkey_reg);
+	assert(pkey_reg == shadow_pkey_reg);
 
-	return pkru;
+	return pkey_reg;
 }
 
-#define rdpkru() _rdpkru(__LINE__)
+#define rdpkey_reg() _rdpkey_reg(__LINE__)
 
-static inline void __wrpkru(unsigned int pkru)
+static inline void __wrpkey_reg(unsigned int pkey_reg)
 {
-	unsigned int eax = pkru;
+	unsigned int eax = pkey_reg;
 	unsigned int ecx = 0;
 	unsigned int edx = 0;
 
-	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+	dprintf4("%s() changing %08x to %08x\n", __func__,
+			__rdpkey_reg(), pkey_reg);
 	asm volatile(".byte 0x0f,0x01,0xef\n\t"
 		     : : "a" (eax), "c" (ecx), "d" (edx));
-	assert(pkru == __rdpkru());
+	assert(pkey_reg == __rdpkey_reg());
 }
 
-static inline void wrpkru(unsigned int pkru)
+static inline void wrpkey_reg(unsigned int pkey_reg)
 {
-	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+	dprintf4("%s() changing %08x to %08x\n", __func__,
+			__rdpkey_reg(), pkey_reg);
 	/* will do the shadow check for us: */
-	rdpkru();
-	__wrpkru(pkru);
-	shadow_pkru = pkru;
-	dprintf4("%s(%08x) pkru: %08x\n", __func__, pkru, __rdpkru());
+	rdpkey_reg();
+	__wrpkey_reg(pkey_reg);
+	shadow_pkey_reg = pkey_reg;
+	dprintf4("%s(%08x) pkey_reg: %08x\n", __func__,
+			pkey_reg, __rdpkey_reg());
 }
 
 /*
  * These are technically racy. since something could
- * change PKRU between the read and the write.
+ * change PKEY register between the read and the write.
  */
 static inline void __pkey_access_allow(int pkey, int do_allow)
 {
-	unsigned int pkru = rdpkru();
+	unsigned int pkey_reg = rdpkey_reg();
 	int bit = pkey * 2;
 
 	if (do_allow)
-		pkru &= (1<<bit);
+		pkey_reg &= (1<<bit);
 	else
-		pkru |= (1<<bit);
+		pkey_reg |= (1<<bit);
 
-	dprintf4("pkru now: %08x\n", rdpkru());
-	wrpkru(pkru);
+	dprintf4("pkey_reg now: %08x\n", rdpkey_reg());
+	wrpkey_reg(pkey_reg);
 }
 
 static inline void __pkey_write_allow(int pkey, int do_allow_write)
 {
-	long pkru = rdpkru();
+	long pkey_reg = rdpkey_reg();
 	int bit = pkey * 2 + 1;
 
 	if (do_allow_write)
-		pkru &= (1<<bit);
+		pkey_reg &= (1<<bit);
 	else
-		pkru |= (1<<bit);
+		pkey_reg |= (1<<bit);
 
-	wrpkru(pkru);
-	dprintf4("pkru now: %08x\n", rdpkru());
+	wrpkey_reg(pkey_reg);
+	dprintf4("pkey_reg now: %08x\n", rdpkey_reg());
 }
 
 #define PROT_PKEY0     0x10            /* protection key value (bit 0) */
@@ -182,10 +185,10 @@ static inline int cpu_has_pku(void)
 	return 1;
 }
 
-#define XSTATE_PKRU_BIT	(9)
-#define XSTATE_PKRU	0x200
+#define XSTATE_PKEY_BIT	(9)
+#define XSTATE_PKEY	0x200
 
-int pkru_xstate_offset(void)
+int pkey_reg_xstate_offset(void)
 {
 	unsigned int eax;
 	unsigned int ebx;
@@ -196,21 +199,21 @@ int pkru_xstate_offset(void)
 	unsigned long XSTATE_CPUID = 0xd;
 	int leaf;
 
-	/* assume that XSTATE_PKRU is set in XCR0 */
-	leaf = XSTATE_PKRU_BIT;
+	/* assume that XSTATE_PKEY is set in XCR0 */
+	leaf = XSTATE_PKEY_BIT;
 	{
 		eax = XSTATE_CPUID;
 		ecx = leaf;
 		__cpuid(&eax, &ebx, &ecx, &edx);
 
-		if (leaf == XSTATE_PKRU_BIT) {
+		if (leaf == XSTATE_PKEY_BIT) {
 			xstate_offset = ebx;
 			xstate_size = eax;
 		}
 	}
 
 	if (xstate_size == 0) {
-		printf("could not find size/offset of PKRU in xsave state\n");
+		printf("could not find size/offset of PKEY in xsave state\n");
 		return 0;
 	}
 
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 555e43c..27b11e6 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1,11 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Tests x86 Memory Protection Keys (see Documentation/x86/protection-keys.txt)
+ * Tests Memory Protection Keys (see Documentation/vm/protection-keys.txt)
  *
  * There are examples in here of:
  *  * how to set protection keys on memory
- *  * how to set/clear bits in PKRU (the rights register)
- *  * how to handle SEGV_PKRU signals and extract pkey-relevant
+ *  * how to set/clear bits in pkey registers (the rights register)
+ *  * how to handle SEGV_PKUERR signals and extract pkey-relevant
  *    information from the siginfo
  *
  * Things to add:
@@ -48,7 +48,7 @@
 int iteration_nr = 1;
 int test_nr;
 
-unsigned int shadow_pkru;
+unsigned int shadow_pkey_reg;
 
 #define HPAGE_SIZE	(1UL<<21)
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
@@ -229,7 +229,7 @@ void dump_mem(void *dumpme, int len_bytes)
 	return "UNKNOWN";
 }
 
-int pkru_faults;
+int pkey_faults;
 int last_si_pkey = -1;
 void signal_handler(int signum, siginfo_t *si, void *vucontext)
 {
@@ -237,16 +237,16 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	int trapno;
 	unsigned long ip;
 	char *fpregs;
-	u32 *pkru_ptr;
+	u32 *pkey_reg_ptr;
 	u64 si_pkey;
 	u32 *si_pkey_ptr;
-	int pkru_offset;
+	int pkey_reg_offset;
 	fpregset_t fpregset;
 
 	dprint_in_signal = 1;
 	dprintf1(">>>>===============SIGSEGV============================\n");
-	dprintf1("%s()::%d, pkru: 0x%x shadow: %x\n", __func__, __LINE__,
-			__rdpkru(), shadow_pkru);
+	dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__, __LINE__,
+			__rdpkey_reg(), shadow_pkey_reg);
 
 	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
 	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
@@ -263,19 +263,19 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	 */
 	fpregs += 0x70;
 #endif
-	pkru_offset = pkru_xstate_offset();
-	pkru_ptr = (void *)(&fpregs[pkru_offset]);
+	pkey_reg_offset = pkey_reg_xstate_offset();
+	pkey_reg_ptr = (void *)(&fpregs[pkey_reg_offset]);
 
 	dprintf1("siginfo: %p\n", si);
 	dprintf1(" fpregs: %p\n", fpregs);
 	/*
-	 * If we got a PKRU fault, we *HAVE* to have at least one bit set in
+	 * If we got a PKEY fault, we *HAVE* to have at least one bit set in
 	 * here.
 	 */
-	dprintf1("pkru_xstate_offset: %d\n", pkru_xstate_offset());
+	dprintf1("pkey_reg_xstate_offset: %d\n", pkey_reg_xstate_offset());
 	if (DEBUG_LEVEL > 4)
-		dump_mem(pkru_ptr - 128, 256);
-	pkey_assert(*pkru_ptr);
+		dump_mem(pkey_reg_ptr - 128, 256);
+	pkey_assert(*pkey_reg_ptr);
 
 	si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
 	dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
@@ -291,13 +291,16 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 		exit(4);
 	}
 
-	dprintf1("signal pkru from xsave: %08x\n", *pkru_ptr);
-	/* need __rdpkru() version so we do not do shadow_pkru checking */
-	dprintf1("signal pkru from  pkru: %08x\n", __rdpkru());
+	dprintf1("signal pkey_reg from xsave: %08x\n", *pkey_reg_ptr);
+	/*
+	 * need __rdpkey_reg() version so we do not do shadow_pkey_reg
+	 * checking
+	 */
+	dprintf1("signal pkey_reg from  pkey_reg: %08x\n", __rdpkey_reg());
 	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
-	*(u64 *)pkru_ptr = 0x00000000;
+	*(u64 *)pkey_reg_ptr = 0x00000000;
 	dprintf1("WARNING: set PRKU=0 to allow faulting instruction to continue\n");
-	pkru_faults++;
+	pkey_faults++;
 	dprintf1("<<<<==================================================\n");
 	return;
 	if (trapno == 14) {
@@ -415,45 +418,47 @@ void dumpit(char *f)
 u32 pkey_get(int pkey, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 pkru = __rdpkru();
-	u32 shifted_pkru;
-	u32 masked_pkru;
+	u32 pkey_reg = __rdpkey_reg();
+	u32 shifted_pkey_reg;
+	u32 masked_pkey_reg;
 
 	dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
 			__func__, pkey, flags, 0, 0);
-	dprintf2("%s() raw pkru: %x\n", __func__, pkru);
+	dprintf2("%s() raw pkey_reg: %x\n", __func__, pkey_reg);
 
-	shifted_pkru = (pkru >> (pkey * PKRU_BITS_PER_PKEY));
-	dprintf2("%s() shifted_pkru: %x\n", __func__, shifted_pkru);
-	masked_pkru = shifted_pkru & mask;
-	dprintf2("%s() masked  pkru: %x\n", __func__, masked_pkru);
+	shifted_pkey_reg = (pkey_reg >> (pkey * PKEY_BITS_PER_PKEY));
+	dprintf2("%s() shifted_pkey_reg: %x\n", __func__, shifted_pkey_reg);
+	masked_pkey_reg = shifted_pkey_reg & mask;
+	dprintf2("%s() masked  pkey_reg: %x\n", __func__, masked_pkey_reg);
 	/*
 	 * shift down the relevant bits to the lowest two, then
 	 * mask off all the other high bits.
 	 */
-	return masked_pkru;
+	return masked_pkey_reg;
 }
 
 int pkey_set(int pkey, unsigned long rights, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 old_pkru = __rdpkru();
-	u32 new_pkru;
+	u32 old_pkey_reg = __rdpkey_reg();
+	u32 new_pkey_reg;
 
 	/* make sure that 'rights' only contains the bits we expect: */
 	assert(!(rights & ~mask));
 
-	/* copy old pkru */
-	new_pkru = old_pkru;
+	/* copy old pkey_reg */
+	new_pkey_reg = old_pkey_reg;
 	/* mask out bits from pkey in old value: */
-	new_pkru &= ~(mask << (pkey * PKRU_BITS_PER_PKEY));
+	new_pkey_reg &= ~(mask << (pkey * PKEY_BITS_PER_PKEY));
 	/* OR in new bits for pkey: */
-	new_pkru |= (rights << (pkey * PKRU_BITS_PER_PKEY));
+	new_pkey_reg |= (rights << (pkey * PKEY_BITS_PER_PKEY));
 
-	__wrpkru(new_pkru);
+	__wrpkey_reg(new_pkey_reg);
 
-	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x pkru now: %x old_pkru: %x\n",
-			__func__, pkey, rights, flags, 0, __rdpkru(), old_pkru);
+	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x"
+	       " pkey_reg now: %x old_pkey_reg: %x\n",
+		__func__, pkey, rights, flags, 0, __rdpkey_reg(),
+		old_pkey_reg);
 	return 0;
 }
 
@@ -462,7 +467,7 @@ void pkey_disable_set(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights;
-	u32 orig_pkru = rdpkru();
+	u32 orig_pkey_reg = rdpkey_reg();
 
 	dprintf1("START->%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
@@ -478,9 +483,9 @@ void pkey_disable_set(int pkey, int flags)
 
 	ret = pkey_set(pkey, pkey_rights, syscall_flags);
 	assert(!ret);
-	/*pkru and flags have the same format */
-	shadow_pkru |= flags << (pkey * 2);
-	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkru);
+	/*pkey_reg and flags have the same format */
+	shadow_pkey_reg |= flags << (pkey * 2);
+	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkey_reg);
 
 	pkey_assert(ret >= 0);
 
@@ -488,9 +493,9 @@ void pkey_disable_set(int pkey, int flags)
 	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
 			pkey, pkey, pkey_rights);
 
-	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
+	dprintf1("%s(%d) pkey_reg: 0x%x\n", __func__, pkey, rdpkey_reg());
 	if (flags)
-		pkey_assert(rdpkru() > orig_pkru);
+		pkey_assert(rdpkey_reg() > orig_pkey_reg);
 	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
 }
@@ -500,7 +505,7 @@ void pkey_disable_clear(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights = pkey_get(pkey, syscall_flags);
-	u32 orig_pkru = rdpkru();
+	u32 orig_pkey_reg = rdpkey_reg();
 
 	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
 
@@ -511,17 +516,17 @@ void pkey_disable_clear(int pkey, int flags)
 	pkey_rights |= flags;
 
 	ret = pkey_set(pkey, pkey_rights, 0);
-	/* pkru and flags have the same format */
-	shadow_pkru &= ~(flags << (pkey * 2));
+	/* pkey_reg and flags have the same format */
+	shadow_pkey_reg &= ~(flags << (pkey * 2));
 	pkey_assert(ret >= 0);
 
 	pkey_rights = pkey_get(pkey, syscall_flags);
 	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
 			pkey, pkey, pkey_rights);
 
-	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
+	dprintf1("%s(%d) pkey_reg: 0x%x\n", __func__, pkey, rdpkey_reg());
 	if (flags)
-		assert(rdpkru() > orig_pkru);
+		assert(rdpkey_reg() > orig_pkey_reg);
 }
 
 void pkey_write_allow(int pkey)
@@ -574,33 +579,38 @@ int alloc_pkey(void)
 	int ret;
 	unsigned long init_val = 0x0;
 
-	dprintf1("alloc_pkey()::%d, pkru: 0x%x shadow: %x\n",
-			__LINE__, __rdpkru(), shadow_pkru);
+	dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__,
+			__LINE__, __rdpkey_reg(), shadow_pkey_reg);
 	ret = sys_pkey_alloc(0, init_val);
 	/*
-	 * pkey_alloc() sets PKRU, so we need to reflect it in
-	 * shadow_pkru:
+	 * pkey_alloc() sets PKEY register, so we need to reflect it in
+	 * shadow_pkey_reg:
 	 */
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+			__func__, __LINE__, ret, __rdpkey_reg(),
+			shadow_pkey_reg);
 	if (ret) {
 		/* clear both the bits: */
-		shadow_pkru &= ~(0x3      << (ret * 2));
-		dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-				__LINE__, ret, __rdpkru(), shadow_pkru);
+		shadow_pkey_reg &= ~(0x3      << (ret * 2));
+		dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+				__func__,
+				__LINE__, ret, __rdpkey_reg(),
+				shadow_pkey_reg);
 		/*
 		 * move the new state in from init_val
-		 * (remember, we cheated and init_val == pkru format)
+		 * (remember, we cheated and init_val == pkey_reg format)
 		 */
-		shadow_pkru |=  (init_val << (ret * 2));
+		shadow_pkey_reg |=  (init_val << (ret * 2));
 	}
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	dprintf1("alloc_pkey()::%d errno: %d\n", __LINE__, errno);
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+			__func__, __LINE__, ret, __rdpkey_reg(),
+			shadow_pkey_reg);
+	dprintf1("%s()::%d errno: %d\n", __func__, __LINE__, errno);
 	/* for shadow checking: */
-	rdpkru();
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+	rdpkey_reg();
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+		__func__, __LINE__, ret, __rdpkey_reg(),
+		shadow_pkey_reg);
 	return ret;
 }
 
@@ -651,8 +661,8 @@ int alloc_random_pkey(void)
 		free_ret = sys_pkey_free(alloced_pkeys[i]);
 		pkey_assert(!free_ret);
 	}
-	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+	dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkey_reg(), shadow_pkey_reg);
 	return ret;
 }
 
@@ -670,11 +680,13 @@ int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
 		if (nr_iterations-- < 0)
 			break;
 
-		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+		dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+			__func__, __LINE__, ret, __rdpkey_reg(),
+			shadow_pkey_reg);
 		sys_pkey_free(rpkey);
-		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+		dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+			__func__, __LINE__, ret, __rdpkey_reg(),
+			shadow_pkey_reg);
 	}
 	pkey_assert(pkey < NR_PKEYS);
 
@@ -682,8 +694,8 @@ int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
 	dprintf1("mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
 			ptr, size, orig_prot, pkey, ret);
 	pkey_assert(!ret);
-	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+	dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkey_reg(), shadow_pkey_reg);
 	return ret;
 }
 
@@ -761,7 +773,7 @@ void free_pkey_malloc(void *ptr)
 	void *ptr;
 	int ret;
 
-	rdpkru();
+	rdpkey_reg();
 	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
 			size, prot, pkey);
 	pkey_assert(pkey < NR_PKEYS);
@@ -770,7 +782,7 @@ void free_pkey_malloc(void *ptr)
 	ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
 	pkey_assert(!ret);
 	record_pkey_malloc(ptr, size);
-	rdpkru();
+	rdpkey_reg();
 
 	dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
 	return ptr;
@@ -933,31 +945,31 @@ void setup_hugetlbfs(void)
 	return ret;
 }
 
-int last_pkru_faults;
-void expected_pk_fault(int pkey)
+int last_pkey_faults;
+void expected_pkey_fault(int pkey)
 {
-	dprintf2("%s(): last_pkru_faults: %d pkru_faults: %d\n",
-			__func__, last_pkru_faults, pkru_faults);
+	dprintf2("%s(): last_pkey_faults: %d pkey_faults: %d\n",
+			__func__, last_pkey_faults, pkey_faults);
 	dprintf2("%s(%d): last_si_pkey: %d\n", __func__, pkey, last_si_pkey);
-	pkey_assert(last_pkru_faults + 1 == pkru_faults);
+	pkey_assert(last_pkey_faults + 1 == pkey_faults);
 	pkey_assert(last_si_pkey == pkey);
 	/*
-	 * The signal handler shold have cleared out PKRU to let the
+	 * The signal handler should have cleared out PKEY register to let the
 	 * test program continue.  We now have to restore it.
 	 */
-	if (__rdpkru() != 0)
+	if (__rdpkey_reg() != 0)
 		pkey_assert(0);
 
-	__wrpkru(shadow_pkru);
-	dprintf1("%s() set PKRU=%x to restore state after signal nuked it\n",
-			__func__, shadow_pkru);
-	last_pkru_faults = pkru_faults;
+	__wrpkey_reg(shadow_pkey_reg);
+	dprintf1("%s() set pkey_reg=%x to restore state after signal "
+		       "nuked it\n", __func__, shadow_pkey_reg);
+	last_pkey_faults = pkey_faults;
 	last_si_pkey = -1;
 }
 
-void do_not_expect_pk_fault(void)
+void do_not_expect_pkey_fault(void)
 {
-	pkey_assert(last_pkru_faults == pkru_faults);
+	pkey_assert(last_pkey_faults == pkey_faults);
 }
 
 int test_fds[10] = { -1 };
@@ -1015,25 +1027,25 @@ void test_read_of_access_disabled_region(int *ptr, u16 pkey)
 	int ptr_contents;
 
 	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n", pkey, ptr);
-	rdpkru();
+	rdpkey_reg();
 	pkey_access_deny(pkey);
 	ptr_contents = read_ptr(ptr);
 	dprintf1("*ptr: %d\n", ptr_contents);
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 }
 void test_write_of_write_disabled_region(int *ptr, u16 pkey)
 {
 	dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
 	pkey_write_deny(pkey);
 	*ptr = __LINE__;
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 }
 void test_write_of_access_disabled_region(int *ptr, u16 pkey)
 {
 	dprintf1("disabling access to PKEY[%02d], doing write\n", pkey);
 	pkey_access_deny(pkey);
 	*ptr = __LINE__;
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 }
 void test_kernel_write_of_access_disabled_region(int *ptr, u16 pkey)
 {
@@ -1145,9 +1157,10 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 		int new_pkey;
 		dprintf1("%s() alloc loop: %d\n", __func__, i);
 		new_pkey = alloc_pkey();
-		dprintf4("%s()::%d, err: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-				__LINE__, err, __rdpkru(), shadow_pkru);
-		rdpkru(); /* for shadow checking */
+		dprintf4("%s()::%d, err: %d pkey_reg: 0x%x shadow: 0x%x\n",
+				__func__, __LINE__, err, __rdpkey_reg(),
+				shadow_pkey_reg);
+		rdpkey_reg(); /* for shadow checking */
 		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
 		if ((new_pkey == -1) && (errno == ENOSPC)) {
 			dprintf2("%s() failed to allocate pkey after %d tries\n",
@@ -1177,7 +1190,7 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 	for (i = 0; i < nr_allocated_pkeys; i++) {
 		err = sys_pkey_free(allocated_pkeys[i]);
 		pkey_assert(!err);
-		rdpkru(); /* for shadow checking */
+		rdpkey_reg(); /* for shadow checking */
 	}
 }
 
@@ -1234,7 +1247,7 @@ void test_ptrace_of_child(int *ptr, u16 pkey)
 	pkey_assert(ret != -1);
 	/* Now access from the current task, and expect an exception: */
 	peek_result = read_ptr(ptr);
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 
 	/*
 	 * Try to access the NON-pkey-protected "plain_ptr" via ptrace:
@@ -1244,7 +1257,7 @@ void test_ptrace_of_child(int *ptr, u16 pkey)
 	pkey_assert(ret != -1);
 	/* Now access from the current task, and expect NO exception: */
 	peek_result = read_ptr(plain_ptr);
-	do_not_expect_pk_fault();
+	do_not_expect_pkey_fault();
 
 	ret = ptrace(PTRACE_DETACH, child_pid, ignored, 0);
 	pkey_assert(ret != -1);
@@ -1281,17 +1294,17 @@ void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
 	pkey_assert(!ret);
 	pkey_access_deny(pkey);
 
-	dprintf2("pkru: %x\n", rdpkru());
+	dprintf2("pkey_reg: %x\n", rdpkey_reg());
 
 	/*
 	 * Make sure this is an *instruction* fault
 	 */
 	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
 	lots_o_noops_around_write(&scratch);
-	do_not_expect_pk_fault();
+	do_not_expect_pkey_fault();
 	ptr_contents = read_ptr(p1);
 	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 }
 
 void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
@@ -1331,7 +1344,7 @@ void run_tests_once(void)
 
 	for (test_nr = 0; test_nr < ARRAY_SIZE(pkey_tests); test_nr++) {
 		int pkey;
-		int orig_pkru_faults = pkru_faults;
+		int orig_pkey_faults = pkey_faults;
 
 		dprintf1("======================\n");
 		dprintf1("test %d preparing...\n", test_nr);
@@ -1346,8 +1359,8 @@ void run_tests_once(void)
 		free_pkey_malloc(ptr);
 		sys_pkey_free(pkey);
 
-		dprintf1("pkru_faults: %d\n", pkru_faults);
-		dprintf1("orig_pkru_faults: %d\n", orig_pkru_faults);
+		dprintf1("pkey_faults: %d\n", pkey_faults);
+		dprintf1("orig_pkey_faults: %d\n", orig_pkey_faults);
 
 		tracing_off();
 		close_test_fds();
@@ -1360,7 +1373,7 @@ void run_tests_once(void)
 
 void pkey_setup_shadow(void)
 {
-	shadow_pkru = __rdpkru();
+	shadow_pkey_reg = __rdpkey_reg();
 }
 
 int main(void)
@@ -1384,7 +1397,7 @@ int main(void)
 	}
 
 	pkey_setup_shadow();
-	printf("startup pkru: %x\n", rdpkru());
+	printf("startup pkey_reg: %x\n", rdpkey_reg());
 	setup_hugetlbfs();
 
 	while (nr_iterations-- > 0)
-- 
1.7.1


* [PATCH v9 33/51] selftest/vm: rename all references to pkru to a generic name
@ 2017-11-06  8:57   ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Some pkru references are renamed to pkey_reg,
and other pkru references are renamed to pkey.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/pkey-helpers.h    |   85 +++++-----
 tools/testing/selftests/vm/protection_keys.c |  227 ++++++++++++++------------
 2 files changed, 164 insertions(+), 148 deletions(-)
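
Note for readers of the rename: the helpers touched below (pkey_get(),
pkey_set() and the shadow_pkey_reg updates) all rely on the same layout,
two rights bits per key at bit offset pkey * PKEY_BITS_PER_PKEY. The
following is only a minimal sketch of that arithmetic on a plain integer,
not the selftest code itself. The sketch_* function names are hypothetical,
and it assumes the 32-bit, 16-key x86 layout that this header currently
hard-codes. It does not touch the hardware register.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the Linux uapi definitions of the same names. */
#define PKEY_DISABLE_ACCESS 0x1
#define PKEY_DISABLE_WRITE  0x2
/* Assumed layout, as hard-coded in pkey-helpers.h for x86. */
#define PKEY_BITS_PER_PKEY  2
#define NR_PKEYS            16

/* Hypothetical helper: extract the two rights bits of one key. */
static uint32_t sketch_pkey_get(uint32_t reg, int pkey)
{
	uint32_t mask = PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE;

	assert(pkey >= 0 && pkey < NR_PKEYS);
	/* shift the key's bits down to the lowest two, then mask */
	return (reg >> (pkey * PKEY_BITS_PER_PKEY)) & mask;
}

/* Hypothetical helper: return reg with one key's rights replaced. */
static uint32_t sketch_pkey_set(uint32_t reg, int pkey, uint32_t rights)
{
	uint32_t mask = PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE;
	int shift = pkey * PKEY_BITS_PER_PKEY;

	assert(pkey >= 0 && pkey < NR_PKEYS);
	/* 'rights' may only contain the two bits we expect */
	assert(!(rights & ~mask));

	reg &= ~(mask << shift);	/* clear: AND with the complemented mask */
	reg |= rights << shift;		/* OR in the new rights */
	return reg;
}
```

With these definitions, sketch_pkey_set(0, 3, PKEY_DISABLE_WRITE) returns
0x80, and sketch_pkey_get() on that value returns PKEY_DISABLE_WRITE for
key 3 and 0 for every other key. Note that clearing a key's bits needs the
complemented mask; an AND with the uncomplemented mask would instead wipe
every other key's bits.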

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 3818f25..2d91d34 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -14,7 +14,7 @@
 #include <sys/mman.h>
 
 #define NR_PKEYS 16
-#define PKRU_BITS_PER_PKEY 2
+#define PKEY_BITS_PER_PKEY 2
 
 #ifndef DEBUG_LEVEL
 #define DEBUG_LEVEL 0
@@ -54,85 +54,88 @@ static inline void sigsafe_printf(const char *format, ...)
 #define dprintf3(args...) dprintf_level(3, args)
 #define dprintf4(args...) dprintf_level(4, args)
 
-extern unsigned int shadow_pkru;
-static inline unsigned int __rdpkru(void)
+extern unsigned int shadow_pkey_reg;
+static inline unsigned int __rdpkey_reg(void)
 {
 	unsigned int eax, edx;
 	unsigned int ecx = 0;
-	unsigned int pkru;
+	unsigned int pkey_reg;
 
 	asm volatile(".byte 0x0f,0x01,0xee\n\t"
 		     : "=a" (eax), "=d" (edx)
 		     : "c" (ecx));
-	pkru = eax;
-	return pkru;
+	pkey_reg = eax;
+	return pkey_reg;
 }
 
-static inline unsigned int _rdpkru(int line)
+static inline unsigned int _rdpkey_reg(int line)
 {
-	unsigned int pkru = __rdpkru();
+	unsigned int pkey_reg = __rdpkey_reg();
 
-	dprintf4("rdpkru(line=%d) pkru: %x shadow: %x\n",
-			line, pkru, shadow_pkru);
-	assert(pkru == shadow_pkru);
+	dprintf4("rdpkey_reg(line=%d) pkey_reg: %x shadow: %x\n",
+			line, pkey_reg, shadow_pkey_reg);
+	assert(pkey_reg == shadow_pkey_reg);
 
-	return pkru;
+	return pkey_reg;
 }
 
-#define rdpkru() _rdpkru(__LINE__)
+#define rdpkey_reg() _rdpkey_reg(__LINE__)
 
-static inline void __wrpkru(unsigned int pkru)
+static inline void __wrpkey_reg(unsigned int pkey_reg)
 {
-	unsigned int eax = pkru;
+	unsigned int eax = pkey_reg;
 	unsigned int ecx = 0;
 	unsigned int edx = 0;
 
-	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+	dprintf4("%s() changing %08x to %08x\n", __func__,
+			__rdpkey_reg(), pkey_reg);
 	asm volatile(".byte 0x0f,0x01,0xef\n\t"
 		     : : "a" (eax), "c" (ecx), "d" (edx));
-	assert(pkru == __rdpkru());
+	assert(pkey_reg == __rdpkey_reg());
 }
 
-static inline void wrpkru(unsigned int pkru)
+static inline void wrpkey_reg(unsigned int pkey_reg)
 {
-	dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+	dprintf4("%s() changing %08x to %08x\n", __func__,
+			__rdpkey_reg(), pkey_reg);
 	/* will do the shadow check for us: */
-	rdpkru();
-	__wrpkru(pkru);
-	shadow_pkru = pkru;
-	dprintf4("%s(%08x) pkru: %08x\n", __func__, pkru, __rdpkru());
+	rdpkey_reg();
+	__wrpkey_reg(pkey_reg);
+	shadow_pkey_reg = pkey_reg;
+	dprintf4("%s(%08x) pkey_reg: %08x\n", __func__,
+			pkey_reg, __rdpkey_reg());
 }
 
 /*
  * These are technically racy. since something could
- * change PKRU between the read and the write.
+ * change PKEY register between the read and the write.
  */
 static inline void __pkey_access_allow(int pkey, int do_allow)
 {
-	unsigned int pkru = rdpkru();
+	unsigned int pkey_reg = rdpkey_reg();
 	int bit = pkey * 2;
 
 	if (do_allow)
-		pkru &= (1<<bit);
+		pkey_reg &= (1<<bit);
 	else
-		pkru |= (1<<bit);
+		pkey_reg |= (1<<bit);
 
-	dprintf4("pkru now: %08x\n", rdpkru());
-	wrpkru(pkru);
+	dprintf4("pkey_reg now: %08x\n", rdpkey_reg());
+	wrpkey_reg(pkey_reg);
 }
 
 static inline void __pkey_write_allow(int pkey, int do_allow_write)
 {
-	long pkru = rdpkru();
+	long pkey_reg = rdpkey_reg();
 	int bit = pkey * 2 + 1;
 
 	if (do_allow_write)
-		pkru &= (1<<bit);
+		pkey_reg &= (1<<bit);
 	else
-		pkru |= (1<<bit);
+		pkey_reg |= (1<<bit);
 
-	wrpkru(pkru);
-	dprintf4("pkru now: %08x\n", rdpkru());
+	wrpkey_reg(pkey_reg);
+	dprintf4("pkey_reg now: %08x\n", rdpkey_reg());
 }
 
 #define PROT_PKEY0     0x10            /* protection key value (bit 0) */
@@ -182,10 +185,10 @@ static inline int cpu_has_pku(void)
 	return 1;
 }
 
-#define XSTATE_PKRU_BIT	(9)
-#define XSTATE_PKRU	0x200
+#define XSTATE_PKEY_BIT	(9)
+#define XSTATE_PKEY	0x200
 
-int pkru_xstate_offset(void)
+int pkey_reg_xstate_offset(void)
 {
 	unsigned int eax;
 	unsigned int ebx;
@@ -196,21 +199,21 @@ int pkru_xstate_offset(void)
 	unsigned long XSTATE_CPUID = 0xd;
 	int leaf;
 
-	/* assume that XSTATE_PKRU is set in XCR0 */
-	leaf = XSTATE_PKRU_BIT;
+	/* assume that XSTATE_PKEY is set in XCR0 */
+	leaf = XSTATE_PKEY_BIT;
 	{
 		eax = XSTATE_CPUID;
 		ecx = leaf;
 		__cpuid(&eax, &ebx, &ecx, &edx);
 
-		if (leaf == XSTATE_PKRU_BIT) {
+		if (leaf == XSTATE_PKEY_BIT) {
 			xstate_offset = ebx;
 			xstate_size = eax;
 		}
 	}
 
 	if (xstate_size == 0) {
-		printf("could not find size/offset of PKRU in xsave state\n");
+		printf("could not find size/offset of PKEY in xsave state\n");
 		return 0;
 	}
 
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 555e43c..27b11e6 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1,11 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Tests x86 Memory Protection Keys (see Documentation/x86/protection-keys.txt)
+ * Tests Memory Protection Keys (see Documentation/vm/protection-keys.txt)
  *
  * There are examples in here of:
  *  * how to set protection keys on memory
- *  * how to set/clear bits in PKRU (the rights register)
- *  * how to handle SEGV_PKRU signals and extract pkey-relevant
+ *  * how to set/clear bits in pkey registers (the rights register)
+ *  * how to handle SEGV_PKUERR signals and extract pkey-relevant
  *    information from the siginfo
  *
  * Things to add:
@@ -48,7 +48,7 @@
 int iteration_nr = 1;
 int test_nr;
 
-unsigned int shadow_pkru;
+unsigned int shadow_pkey_reg;
 
 #define HPAGE_SIZE	(1UL<<21)
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
@@ -229,7 +229,7 @@ void dump_mem(void *dumpme, int len_bytes)
 	return "UNKNOWN";
 }
 
-int pkru_faults;
+int pkey_faults;
 int last_si_pkey = -1;
 void signal_handler(int signum, siginfo_t *si, void *vucontext)
 {
@@ -237,16 +237,16 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	int trapno;
 	unsigned long ip;
 	char *fpregs;
-	u32 *pkru_ptr;
+	u32 *pkey_reg_ptr;
 	u64 si_pkey;
 	u32 *si_pkey_ptr;
-	int pkru_offset;
+	int pkey_reg_offset;
 	fpregset_t fpregset;
 
 	dprint_in_signal = 1;
 	dprintf1(">>>>===============SIGSEGV============================\n");
-	dprintf1("%s()::%d, pkru: 0x%x shadow: %x\n", __func__, __LINE__,
-			__rdpkru(), shadow_pkru);
+	dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__, __LINE__,
+			__rdpkey_reg(), shadow_pkey_reg);
 
 	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
 	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
@@ -263,19 +263,19 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	 */
 	fpregs += 0x70;
 #endif
-	pkru_offset = pkru_xstate_offset();
-	pkru_ptr = (void *)(&fpregs[pkru_offset]);
+	pkey_reg_offset = pkey_reg_xstate_offset();
+	pkey_reg_ptr = (void *)(&fpregs[pkey_reg_offset]);
 
 	dprintf1("siginfo: %p\n", si);
 	dprintf1(" fpregs: %p\n", fpregs);
 	/*
-	 * If we got a PKRU fault, we *HAVE* to have at least one bit set in
+	 * If we got a PKEY fault, we *HAVE* to have at least one bit set in
 	 * here.
 	 */
-	dprintf1("pkru_xstate_offset: %d\n", pkru_xstate_offset());
+	dprintf1("pkey_reg_xstate_offset: %d\n", pkey_reg_xstate_offset());
 	if (DEBUG_LEVEL > 4)
-		dump_mem(pkru_ptr - 128, 256);
-	pkey_assert(*pkru_ptr);
+		dump_mem(pkey_reg_ptr - 128, 256);
+	pkey_assert(*pkey_reg_ptr);
 
 	si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
 	dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
@@ -291,13 +291,16 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 		exit(4);
 	}
 
-	dprintf1("signal pkru from xsave: %08x\n", *pkru_ptr);
-	/* need __rdpkru() version so we do not do shadow_pkru checking */
-	dprintf1("signal pkru from  pkru: %08x\n", __rdpkru());
+	dprintf1("signal pkey_reg from xsave: %08x\n", *pkey_reg_ptr);
+	/*
+	 * need __rdpkey_reg() version so we do not do shadow_pkey_reg
+	 * checking
+	 */
+	dprintf1("signal pkey_reg from  pkey_reg: %08x\n", __rdpkey_reg());
 	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
-	*(u64 *)pkru_ptr = 0x00000000;
+	*(u64 *)pkey_reg_ptr = 0x00000000;
 	dprintf1("WARNING: set PRKU=0 to allow faulting instruction to continue\n");
-	pkru_faults++;
+	pkey_faults++;
 	dprintf1("<<<<==================================================\n");
 	return;
 	if (trapno == 14) {
@@ -415,45 +418,47 @@ void dumpit(char *f)
 u32 pkey_get(int pkey, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 pkru = __rdpkru();
-	u32 shifted_pkru;
-	u32 masked_pkru;
+	u32 pkey_reg = __rdpkey_reg();
+	u32 shifted_pkey_reg;
+	u32 masked_pkey_reg;
 
 	dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
 			__func__, pkey, flags, 0, 0);
-	dprintf2("%s() raw pkru: %x\n", __func__, pkru);
+	dprintf2("%s() raw pkey_reg: %x\n", __func__, pkey_reg);
 
-	shifted_pkru = (pkru >> (pkey * PKRU_BITS_PER_PKEY));
-	dprintf2("%s() shifted_pkru: %x\n", __func__, shifted_pkru);
-	masked_pkru = shifted_pkru & mask;
-	dprintf2("%s() masked  pkru: %x\n", __func__, masked_pkru);
+	shifted_pkey_reg = (pkey_reg >> (pkey * PKEY_BITS_PER_PKEY));
+	dprintf2("%s() shifted_pkey_reg: %x\n", __func__, shifted_pkey_reg);
+	masked_pkey_reg = shifted_pkey_reg & mask;
+	dprintf2("%s() masked  pkey_reg: %x\n", __func__, masked_pkey_reg);
 	/*
 	 * shift down the relevant bits to the lowest two, then
 	 * mask off all the other high bits.
 	 */
-	return masked_pkru;
+	return masked_pkey_reg;
 }
 
 int pkey_set(int pkey, unsigned long rights, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 old_pkru = __rdpkru();
-	u32 new_pkru;
+	u32 old_pkey_reg = __rdpkey_reg();
+	u32 new_pkey_reg;
 
 	/* make sure that 'rights' only contains the bits we expect: */
 	assert(!(rights & ~mask));
 
-	/* copy old pkru */
-	new_pkru = old_pkru;
+	/* copy old pkey_reg */
+	new_pkey_reg = old_pkey_reg;
 	/* mask out bits from pkey in old value: */
-	new_pkru &= ~(mask << (pkey * PKRU_BITS_PER_PKEY));
+	new_pkey_reg &= ~(mask << (pkey * PKEY_BITS_PER_PKEY));
 	/* OR in new bits for pkey: */
-	new_pkru |= (rights << (pkey * PKRU_BITS_PER_PKEY));
+	new_pkey_reg |= (rights << (pkey * PKEY_BITS_PER_PKEY));
 
-	__wrpkru(new_pkru);
+	__wrpkey_reg(new_pkey_reg);
 
-	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x pkru now: %x old_pkru: %x\n",
-			__func__, pkey, rights, flags, 0, __rdpkru(), old_pkru);
+	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x"
+	       " pkey_reg now: %x old_pkey_reg: %x\n",
+		__func__, pkey, rights, flags, 0, __rdpkey_reg(),
+		old_pkey_reg);
 	return 0;
 }
 
@@ -462,7 +467,7 @@ void pkey_disable_set(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights;
-	u32 orig_pkru = rdpkru();
+	u32 orig_pkey_reg = rdpkey_reg();
 
 	dprintf1("START->%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
@@ -478,9 +483,9 @@ void pkey_disable_set(int pkey, int flags)
 
 	ret = pkey_set(pkey, pkey_rights, syscall_flags);
 	assert(!ret);
-	/*pkru and flags have the same format */
-	shadow_pkru |= flags << (pkey * 2);
-	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkru);
+	/*pkey_reg and flags have the same format */
+	shadow_pkey_reg |= flags << (pkey * 2);
+	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkey_reg);
 
 	pkey_assert(ret >= 0);
 
@@ -488,9 +493,9 @@ void pkey_disable_set(int pkey, int flags)
 	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
 			pkey, pkey, pkey_rights);
 
-	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
+	dprintf1("%s(%d) pkey_reg: 0x%x\n", __func__, pkey, rdpkey_reg());
 	if (flags)
-		pkey_assert(rdpkru() > orig_pkru);
+		pkey_assert(rdpkey_reg() > orig_pkey_reg);
 	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
 }
@@ -500,7 +505,7 @@ void pkey_disable_clear(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights = pkey_get(pkey, syscall_flags);
-	u32 orig_pkru = rdpkru();
+	u32 orig_pkey_reg = rdpkey_reg();
 
 	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
 
@@ -511,17 +516,17 @@ void pkey_disable_clear(int pkey, int flags)
 	pkey_rights |= flags;
 
 	ret = pkey_set(pkey, pkey_rights, 0);
-	/* pkru and flags have the same format */
-	shadow_pkru &= ~(flags << (pkey * 2));
+	/* pkey_reg and flags have the same format */
+	shadow_pkey_reg &= ~(flags << (pkey * 2));
 	pkey_assert(ret >= 0);
 
 	pkey_rights = pkey_get(pkey, syscall_flags);
 	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
 			pkey, pkey, pkey_rights);
 
-	dprintf1("%s(%d) pkru: 0x%x\n", __func__, pkey, rdpkru());
+	dprintf1("%s(%d) pkey_reg: 0x%x\n", __func__, pkey, rdpkey_reg());
 	if (flags)
-		assert(rdpkru() > orig_pkru);
+		assert(rdpkey_reg() > orig_pkey_reg);
 }
 
 void pkey_write_allow(int pkey)
@@ -574,33 +579,38 @@ int alloc_pkey(void)
 	int ret;
 	unsigned long init_val = 0x0;
 
-	dprintf1("alloc_pkey()::%d, pkru: 0x%x shadow: %x\n",
-			__LINE__, __rdpkru(), shadow_pkru);
+	dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__,
+			__LINE__, __rdpkey_reg(), shadow_pkey_reg);
 	ret = sys_pkey_alloc(0, init_val);
 	/*
-	 * pkey_alloc() sets PKRU, so we need to reflect it in
-	 * shadow_pkru:
+	 * pkey_alloc() sets PKEY register, so we need to reflect it in
+	 * shadow_pkey_reg:
 	 */
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+			__func__, __LINE__, ret, __rdpkey_reg(),
+			shadow_pkey_reg);
 	if (ret) {
 		/* clear both the bits: */
-		shadow_pkru &= ~(0x3      << (ret * 2));
-		dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-				__LINE__, ret, __rdpkru(), shadow_pkru);
+		shadow_pkey_reg &= ~(0x3      << (ret * 2));
+		dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+				__func__,
+				__LINE__, ret, __rdpkey_reg(),
+				shadow_pkey_reg);
 		/*
 		 * move the new state in from init_val
-		 * (remember, we cheated and init_val == pkru format)
+		 * (remember, we cheated and init_val == pkey_reg format)
 		 */
-		shadow_pkru |=  (init_val << (ret * 2));
+		shadow_pkey_reg |=  (init_val << (ret * 2));
 	}
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
-	dprintf1("alloc_pkey()::%d errno: %d\n", __LINE__, errno);
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+			__func__, __LINE__, ret, __rdpkey_reg(),
+			shadow_pkey_reg);
+	dprintf1("%s()::%d errno: %d\n", __func__, __LINE__, errno);
 	/* for shadow checking: */
-	rdpkru();
-	dprintf4("alloc_pkey()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n",
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+	rdpkey_reg();
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+		__func__, __LINE__, ret, __rdpkey_reg(),
+		shadow_pkey_reg);
 	return ret;
 }
 
@@ -651,8 +661,8 @@ int alloc_random_pkey(void)
 		free_ret = sys_pkey_free(alloced_pkeys[i]);
 		pkey_assert(!free_ret);
 	}
-	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+	dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkey_reg(), shadow_pkey_reg);
 	return ret;
 }
 
@@ -670,11 +680,13 @@ int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
 		if (nr_iterations-- < 0)
 			break;
 
-		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+		dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+			__func__, __LINE__, ret, __rdpkey_reg(),
+			shadow_pkey_reg);
 		sys_pkey_free(rpkey);
-		dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+		dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+			__func__, __LINE__, ret, __rdpkey_reg(),
+			shadow_pkey_reg);
 	}
 	pkey_assert(pkey < NR_PKEYS);
 
@@ -682,8 +694,8 @@ int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
 	dprintf1("mprotect_pkey(%p, %zx, prot=0x%lx, pkey=%ld) ret: %d\n",
 			ptr, size, orig_prot, pkey, ret);
 	pkey_assert(!ret);
-	dprintf1("%s()::%d, ret: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkru(), shadow_pkru);
+	dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n", __func__,
+			__LINE__, ret, __rdpkey_reg(), shadow_pkey_reg);
 	return ret;
 }
 
@@ -761,7 +773,7 @@ void free_pkey_malloc(void *ptr)
 	void *ptr;
 	int ret;
 
-	rdpkru();
+	rdpkey_reg();
 	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
 			size, prot, pkey);
 	pkey_assert(pkey < NR_PKEYS);
@@ -770,7 +782,7 @@ void free_pkey_malloc(void *ptr)
 	ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
 	pkey_assert(!ret);
 	record_pkey_malloc(ptr, size);
-	rdpkru();
+	rdpkey_reg();
 
 	dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
 	return ptr;
@@ -933,31 +945,31 @@ void setup_hugetlbfs(void)
 	return ret;
 }
 
-int last_pkru_faults;
-void expected_pk_fault(int pkey)
+int last_pkey_faults;
+void expected_pkey_fault(int pkey)
 {
-	dprintf2("%s(): last_pkru_faults: %d pkru_faults: %d\n",
-			__func__, last_pkru_faults, pkru_faults);
+	dprintf2("%s(): last_pkey_faults: %d pkey_faults: %d\n",
+			__func__, last_pkey_faults, pkey_faults);
 	dprintf2("%s(%d): last_si_pkey: %d\n", __func__, pkey, last_si_pkey);
-	pkey_assert(last_pkru_faults + 1 == pkru_faults);
+	pkey_assert(last_pkey_faults + 1 == pkey_faults);
 	pkey_assert(last_si_pkey == pkey);
 	/*
-	 * The signal handler shold have cleared out PKRU to let the
+	 * The signal handler should have cleared out PKEY register to let the
 	 * test program continue.  We now have to restore it.
 	 */
-	if (__rdpkru() != 0)
+	if (__rdpkey_reg() != 0)
 		pkey_assert(0);
 
-	__wrpkru(shadow_pkru);
-	dprintf1("%s() set PKRU=%x to restore state after signal nuked it\n",
-			__func__, shadow_pkru);
-	last_pkru_faults = pkru_faults;
+	__wrpkey_reg(shadow_pkey_reg);
+	dprintf1("%s() set pkey_reg=%x to restore state after signal "
+		       "nuked it\n", __func__, shadow_pkey_reg);
+	last_pkey_faults = pkey_faults;
 	last_si_pkey = -1;
 }
 
-void do_not_expect_pk_fault(void)
+void do_not_expect_pkey_fault(void)
 {
-	pkey_assert(last_pkru_faults == pkru_faults);
+	pkey_assert(last_pkey_faults == pkey_faults);
 }
 
 int test_fds[10] = { -1 };
@@ -1015,25 +1027,25 @@ void test_read_of_access_disabled_region(int *ptr, u16 pkey)
 	int ptr_contents;
 
 	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n", pkey, ptr);
-	rdpkru();
+	rdpkey_reg();
 	pkey_access_deny(pkey);
 	ptr_contents = read_ptr(ptr);
 	dprintf1("*ptr: %d\n", ptr_contents);
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 }
 void test_write_of_write_disabled_region(int *ptr, u16 pkey)
 {
 	dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
 	pkey_write_deny(pkey);
 	*ptr = __LINE__;
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 }
 void test_write_of_access_disabled_region(int *ptr, u16 pkey)
 {
 	dprintf1("disabling access to PKEY[%02d], doing write\n", pkey);
 	pkey_access_deny(pkey);
 	*ptr = __LINE__;
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 }
 void test_kernel_write_of_access_disabled_region(int *ptr, u16 pkey)
 {
@@ -1145,9 +1157,10 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 		int new_pkey;
 		dprintf1("%s() alloc loop: %d\n", __func__, i);
 		new_pkey = alloc_pkey();
-		dprintf4("%s()::%d, err: %d pkru: 0x%x shadow: 0x%x\n", __func__,
-				__LINE__, err, __rdpkru(), shadow_pkru);
-		rdpkru(); /* for shadow checking */
+		dprintf4("%s()::%d, err: %d pkey_reg: 0x%x shadow: 0x%x\n",
+				__func__, __LINE__, err, __rdpkey_reg(),
+				shadow_pkey_reg);
+		rdpkey_reg(); /* for shadow checking */
 		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
 		if ((new_pkey == -1) && (errno == ENOSPC)) {
 			dprintf2("%s() failed to allocate pkey after %d tries\n",
@@ -1177,7 +1190,7 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 	for (i = 0; i < nr_allocated_pkeys; i++) {
 		err = sys_pkey_free(allocated_pkeys[i]);
 		pkey_assert(!err);
-		rdpkru(); /* for shadow checking */
+		rdpkey_reg(); /* for shadow checking */
 	}
 }
 
@@ -1234,7 +1247,7 @@ void test_ptrace_of_child(int *ptr, u16 pkey)
 	pkey_assert(ret != -1);
 	/* Now access from the current task, and expect an exception: */
 	peek_result = read_ptr(ptr);
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 
 	/*
 	 * Try to access the NON-pkey-protected "plain_ptr" via ptrace:
@@ -1244,7 +1257,7 @@ void test_ptrace_of_child(int *ptr, u16 pkey)
 	pkey_assert(ret != -1);
 	/* Now access from the current task, and expect NO exception: */
 	peek_result = read_ptr(plain_ptr);
-	do_not_expect_pk_fault();
+	do_not_expect_pkey_fault();
 
 	ret = ptrace(PTRACE_DETACH, child_pid, ignored, 0);
 	pkey_assert(ret != -1);
@@ -1281,17 +1294,17 @@ void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
 	pkey_assert(!ret);
 	pkey_access_deny(pkey);
 
-	dprintf2("pkru: %x\n", rdpkru());
+	dprintf2("pkey_reg: %x\n", rdpkey_reg());
 
 	/*
 	 * Make sure this is an *instruction* fault
 	 */
 	madvise(p1, PAGE_SIZE, MADV_DONTNEED);
 	lots_o_noops_around_write(&scratch);
-	do_not_expect_pk_fault();
+	do_not_expect_pkey_fault();
 	ptr_contents = read_ptr(p1);
 	dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
-	expected_pk_fault(pkey);
+	expected_pkey_fault(pkey);
 }
 
 void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
@@ -1331,7 +1344,7 @@ void run_tests_once(void)
 
 	for (test_nr = 0; test_nr < ARRAY_SIZE(pkey_tests); test_nr++) {
 		int pkey;
-		int orig_pkru_faults = pkru_faults;
+		int orig_pkey_faults = pkey_faults;
 
 		dprintf1("======================\n");
 		dprintf1("test %d preparing...\n", test_nr);
@@ -1346,8 +1359,8 @@ void run_tests_once(void)
 		free_pkey_malloc(ptr);
 		sys_pkey_free(pkey);
 
-		dprintf1("pkru_faults: %d\n", pkru_faults);
-		dprintf1("orig_pkru_faults: %d\n", orig_pkru_faults);
+		dprintf1("pkey_faults: %d\n", pkey_faults);
+		dprintf1("orig_pkey_faults: %d\n", orig_pkey_faults);
 
 		tracing_off();
 		close_test_fds();
@@ -1360,7 +1373,7 @@ void run_tests_once(void)
 
 void pkey_setup_shadow(void)
 {
-	shadow_pkru = __rdpkru();
+	shadow_pkey_reg = __rdpkey_reg();
 }
 
 int main(void)
@@ -1384,7 +1397,7 @@ int main(void)
 	}
 
 	pkey_setup_shadow();
-	printf("startup pkru: %x\n", rdpkru());
+	printf("startup pkey_reg: %x\n", rdpkey_reg());
 	setup_hugetlbfs();
 
 	while (nr_iterations-- > 0)
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 197+ messages in thread
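
The hunks above keep shadow_pkey_reg in step with the hardware register: each key owns two bits, alloc_pkey() clears both of them for the new key and then ORs in the initial rights. The bookkeeping can be sketched in isolation as below. This is only an illustration, assuming the x86 two-bits-per-key layout that the test still uses at this point in the series; shadow_alloc() and shadow_rights() are hypothetical names and do not exist in the selftest.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_BITS_PER_PKEY  2
#define PKEY_DISABLE_ACCESS 0x1
#define PKEY_DISABLE_WRITE  0x2

/*
 * Hypothetical helper: apply to a shadow value the same update that
 * alloc_pkey() applies to shadow_pkey_reg after a successful allocation.
 */
static uint32_t shadow_alloc(uint32_t shadow, int pkey, uint32_t init_val)
{
	int shift = pkey * PKEY_BITS_PER_PKEY;

	shadow &= ~((uint32_t)0x3 << shift);	/* clear both rights bits */
	shadow |= init_val << shift;		/* install the initial rights */
	return shadow;
}

/* Hypothetical helper: extract the two rights bits that belong to one key. */
static uint32_t shadow_rights(uint32_t shadow, int pkey)
{
	return (shadow >> (pkey * PKEY_BITS_PER_PKEY)) & 0x3;
}
```

A caller would compare shadow_rights(shadow, pkey) against the rights read back from the real register, which is what the rdpkey_reg() "shadow checking" calls in the test do for the whole register at once.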

* [PATCH v9 34/51] selftest/vm: move generic definitions to header file
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Move all the generic definitions and helper functions to the
header file.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/pkey-helpers.h    |   62 +++++++++++++++++++++++--
 tools/testing/selftests/vm/protection_keys.c |   54 ----------------------
 2 files changed, 57 insertions(+), 59 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 2d91d34..1b15b54 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -13,8 +13,31 @@
 #include <ucontext.h>
 #include <sys/mman.h>
 
+/* Define some kernel-like types */
+#define  u8 uint8_t
+#define u16 uint16_t
+#define u32 uint32_t
+#define u64 uint64_t
+
+#ifdef __i386__
+#define SYS_mprotect_key 380
+#define SYS_pkey_alloc	 381
+#define SYS_pkey_free	 382
+#define REG_IP_IDX REG_EIP
+#define si_pkey_offset 0x14
+#else
+#define SYS_mprotect_key 329
+#define SYS_pkey_alloc	 330
+#define SYS_pkey_free	 331
+#define REG_IP_IDX REG_RIP
+#define si_pkey_offset 0x20
+#endif
+
 #define NR_PKEYS 16
 #define PKEY_BITS_PER_PKEY 2
+#define PKEY_DISABLE_ACCESS    0x1
+#define PKEY_DISABLE_WRITE     0x2
+#define HPAGE_SIZE	(1UL<<21)
 
 #ifndef DEBUG_LEVEL
 #define DEBUG_LEVEL 0
@@ -138,11 +161,6 @@ static inline void __pkey_write_allow(int pkey, int do_allow_write)
 	dprintf4("pkey_reg now: %08x\n", rdpkey_reg());
 }
 
-#define PROT_PKEY0     0x10            /* protection key value (bit 0) */
-#define PROT_PKEY1     0x20            /* protection key value (bit 1) */
-#define PROT_PKEY2     0x40            /* protection key value (bit 2) */
-#define PROT_PKEY3     0x80            /* protection key value (bit 3) */
-
 #define PAGE_SIZE 4096
 #define MB	(1<<20)
 
@@ -220,4 +238,38 @@ int pkey_reg_xstate_offset(void)
 	return xstate_offset;
 }
 
+static inline void __page_o_noops(void)
+{
+	/* 8 bytes of instruction * 512 repetitions = 1 page */
+	asm(".rept 512 ; nopl 0x7eeeeeee(%eax) ; .endr");
+}
+
 #endif /* _PKEYS_HELPER_H */
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
+#define ALIGN_UP(x, align_to)	(((x) + ((align_to)-1)) & ~((align_to)-1))
+#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
+#define ALIGN_PTR_UP(p, ptr_align_to)	\
+		((typeof(p))ALIGN_UP((unsigned long)(p), ptr_align_to))
+#define ALIGN_PTR_DOWN(p, ptr_align_to) \
+	((typeof(p))ALIGN_DOWN((unsigned long)(p), ptr_align_to))
+#define __stringify_1(x...)     #x
+#define __stringify(x...)       __stringify_1(x)
+
+#define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
+
+int dprint_in_signal;
+char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
+
+extern void abort_hooks(void);
+#define pkey_assert(condition) do {		\
+	if (!(condition)) {			\
+		dprintf0("assert() at %s::%d test_nr: %d iteration: %d\n", \
+				__FILE__, __LINE__,	\
+				test_nr, iteration_nr);	\
+		dprintf0("errno at assert: %d", errno);	\
+		abort_hooks();			\
+		assert(condition);		\
+	}					\
+} while (0)
+#define raw_assert(cond) assert(cond)
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 27b11e6..dec05e0 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -49,34 +49,9 @@
 int test_nr;
 
 unsigned int shadow_pkey_reg;
-
-#define HPAGE_SIZE	(1UL<<21)
-#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
-#define ALIGN_UP(x, align_to)	(((x) + ((align_to)-1)) & ~((align_to)-1))
-#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
-#define ALIGN_PTR_UP(p, ptr_align_to)	((typeof(p))ALIGN_UP((unsigned long)(p),	ptr_align_to))
-#define ALIGN_PTR_DOWN(p, ptr_align_to)	((typeof(p))ALIGN_DOWN((unsigned long)(p),	ptr_align_to))
-#define __stringify_1(x...)     #x
-#define __stringify(x...)       __stringify_1(x)
-
-#define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
-
 int dprint_in_signal;
 char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
 
-extern void abort_hooks(void);
-#define pkey_assert(condition) do {		\
-	if (!(condition)) {			\
-		dprintf0("assert() at %s::%d test_nr: %d iteration: %d\n", \
-				__FILE__, __LINE__,	\
-				test_nr, iteration_nr);	\
-		dprintf0("errno at assert: %d", errno);	\
-		abort_hooks();			\
-		assert(condition);		\
-	}					\
-} while (0)
-#define raw_assert(cond) assert(cond)
-
 void cat_into_file(char *str, char *file)
 {
 	int fd = open(file, O_RDWR);
@@ -154,12 +129,6 @@ void abort_hooks(void)
 #endif
 }
 
-static inline void __page_o_noops(void)
-{
-	/* 8-bytes of instruction * 512 bytes = 1 page */
-	asm(".rept 512 ; nopl 0x7eeeeeee(%eax) ; .endr");
-}
-
 /*
  * This attempts to have roughly a page of instructions followed by a few
  * instructions that do a write, and another page of instructions.  That
@@ -182,26 +151,6 @@ void lots_o_noops_around_write(int *write_to_me)
 	dprintf3("%s() done\n", __func__);
 }
 
-/* Define some kernel-like types */
-#define  u8 uint8_t
-#define u16 uint16_t
-#define u32 uint32_t
-#define u64 uint64_t
-
-#ifdef __i386__
-#define SYS_mprotect_key 380
-#define SYS_pkey_alloc	 381
-#define SYS_pkey_free	 382
-#define REG_IP_IDX REG_EIP
-#define si_pkey_offset 0x14
-#else
-#define SYS_mprotect_key 329
-#define SYS_pkey_alloc	 330
-#define SYS_pkey_free	 331
-#define REG_IP_IDX REG_RIP
-#define si_pkey_offset 0x20
-#endif
-
 void dump_mem(void *dumpme, int len_bytes)
 {
 	char *c = (void *)dumpme;
@@ -412,9 +361,6 @@ void dumpit(char *f)
 	close(fd);
 }
 
-#define PKEY_DISABLE_ACCESS    0x1
-#define PKEY_DISABLE_WRITE     0x2
-
 u32 pkey_get(int pkey, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
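
The alignment macros moved into pkey-helpers.h only give the expected answer when align_to is a power of two, since they work by masking off the low bits. A short sketch of what they compute follows. The macro bodies are copied from the patch; the wrapper functions round_up_to(), round_down_to() and nr_sample_sizes() are hypothetical and exist only so the macros can be exercised on their own.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
#define ALIGN_UP(x, align_to)	(((x) + ((align_to)-1)) & ~((align_to)-1))
#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
#define HPAGE_SIZE	(1UL<<21)

/* Hypothetical wrapper: round x up to the next multiple of align_to. */
static unsigned long round_up_to(unsigned long x, unsigned long align_to)
{
	return ALIGN_UP(x, align_to);
}

/* Hypothetical wrapper: round x down to the previous multiple of align_to. */
static unsigned long round_down_to(unsigned long x, unsigned long align_to)
{
	return ALIGN_DOWN(x, align_to);
}

/* Hypothetical helper: ARRAY_SIZE() counts elements, not bytes. */
static size_t nr_sample_sizes(void)
{
	static const unsigned long sizes[] = { 4096, HPAGE_SIZE, 1UL << 30 };

	return ARRAY_SIZE(sizes);
}
```

An already aligned value is left unchanged by both macros, which is why the test can call them on pointers without first checking alignment.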

* [PATCH v9 35/51] selftest/vm: typecast the pkey register
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

This is in preparation to accommodate a pkey register whose size
differs across architectures.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/pkey-helpers.h    |   27 +++++-----
 tools/testing/selftests/vm/protection_keys.c |   71 ++++++++++++++------------
 2 files changed, 52 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 1b15b54..b03f7e5 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -18,6 +18,7 @@
 #define u16 uint16_t
 #define u32 uint32_t
 #define u64 uint64_t
+#define pkey_reg_t u32
 
 #ifdef __i386__
 #define SYS_mprotect_key 380
@@ -77,12 +78,12 @@ static inline void sigsafe_printf(const char *format, ...)
 #define dprintf3(args...) dprintf_level(3, args)
 #define dprintf4(args...) dprintf_level(4, args)
 
-extern unsigned int shadow_pkey_reg;
-static inline unsigned int __rdpkey_reg(void)
+extern pkey_reg_t shadow_pkey_reg;
+static inline pkey_reg_t __rdpkey_reg(void)
 {
 	unsigned int eax, edx;
 	unsigned int ecx = 0;
-	unsigned int pkey_reg;
+	pkey_reg_t pkey_reg;
 
 	asm volatile(".byte 0x0f,0x01,0xee\n\t"
 		     : "=a" (eax), "=d" (edx)
@@ -91,11 +92,11 @@ static inline unsigned int __rdpkey_reg(void)
 	return pkey_reg;
 }
 
-static inline unsigned int _rdpkey_reg(int line)
+static inline pkey_reg_t _rdpkey_reg(int line)
 {
-	unsigned int pkey_reg = __rdpkey_reg();
+	pkey_reg_t pkey_reg = __rdpkey_reg();
 
-	dprintf4("rdpkey_reg(line=%d) pkey_reg: %x shadow: %x\n",
+	dprintf4("rdpkey_reg(line=%d) pkey_reg: %016lx shadow: %016lx\n",
 			line, pkey_reg, shadow_pkey_reg);
 	assert(pkey_reg == shadow_pkey_reg);
 
@@ -104,11 +105,11 @@ static inline unsigned int _rdpkey_reg(int line)
 
 #define rdpkey_reg() _rdpkey_reg(__LINE__)
 
-static inline void __wrpkey_reg(unsigned int pkey_reg)
+static inline void __wrpkey_reg(pkey_reg_t pkey_reg)
 {
-	unsigned int eax = pkey_reg;
-	unsigned int ecx = 0;
-	unsigned int edx = 0;
+	pkey_reg_t eax = pkey_reg;
+	pkey_reg_t ecx = 0;
+	pkey_reg_t edx = 0;
 
 	dprintf4("%s() changing %08x to %08x\n", __func__,
 			__rdpkey_reg(), pkey_reg);
@@ -117,7 +118,7 @@ static inline void __wrpkey_reg(unsigned int pkey_reg)
 	assert(pkey_reg == __rdpkey_reg());
 }
 
-static inline void wrpkey_reg(unsigned int pkey_reg)
+static inline void wrpkey_reg(pkey_reg_t pkey_reg)
 {
 	dprintf4("%s() changing %08x to %08x\n", __func__,
 			__rdpkey_reg(), pkey_reg);
@@ -135,7 +136,7 @@ static inline void wrpkey_reg(unsigned int pkey_reg)
  */
 static inline void __pkey_access_allow(int pkey, int do_allow)
 {
-	unsigned int pkey_reg = rdpkey_reg();
+	pkey_reg_t pkey_reg = rdpkey_reg();
 	int bit = pkey * 2;
 
 	if (do_allow)
@@ -149,7 +150,7 @@ static inline void __pkey_access_allow(int pkey, int do_allow)
 
 static inline void __pkey_write_allow(int pkey, int do_allow_write)
 {
-	long pkey_reg = rdpkey_reg();
+	pkey_reg_t pkey_reg = rdpkey_reg();
 	int bit = pkey * 2 + 1;
 
 	if (do_allow_write)
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index dec05e0..2e8de01 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -48,7 +48,7 @@
 int iteration_nr = 1;
 int test_nr;
 
-unsigned int shadow_pkey_reg;
+pkey_reg_t shadow_pkey_reg;
 int dprint_in_signal;
 char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
 
@@ -158,7 +158,7 @@ void dump_mem(void *dumpme, int len_bytes)
 
 	for (i = 0; i < len_bytes; i += sizeof(u64)) {
 		u64 *ptr = (u64 *)(c + i);
-		dprintf1("dump[%03d][@%p]: %016jx\n", i, ptr, *ptr);
+		dprintf1("dump[%03d][@%p]: %016lx\n", i, ptr, *ptr);
 	}
 }
 
@@ -186,15 +186,16 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	int trapno;
 	unsigned long ip;
 	char *fpregs;
-	u32 *pkey_reg_ptr;
-	u64 si_pkey;
+	pkey_reg_t *pkey_reg_ptr;
+	u32 si_pkey;
 	u32 *si_pkey_ptr;
 	int pkey_reg_offset;
 	fpregset_t fpregset;
 
 	dprint_in_signal = 1;
 	dprintf1(">>>>===============SIGSEGV============================\n");
-	dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__, __LINE__,
+	dprintf1("%s()::%d, pkey_reg: 0x%016lx shadow: %016lx\n",
+			__func__, __LINE__,
 			__rdpkey_reg(), shadow_pkey_reg);
 
 	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
@@ -202,8 +203,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	fpregset = uctxt->uc_mcontext.fpregs;
 	fpregs = (void *)fpregset;
 
-	dprintf2("%s() trapno: %d ip: 0x%lx info->si_code: %s/%d\n", __func__,
-			trapno, ip, si_code_str(si->si_code), si->si_code);
+	dprintf2("%s() trapno: %d ip: 0x%016lx info->si_code: %s/%d\n",
+			__func__, trapno, ip, si_code_str(si->si_code),
+			si->si_code);
 #ifdef __i386__
 	/*
 	 * 32-bit has some extra padding so that userspace can tell whether
@@ -240,12 +242,12 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 		exit(4);
 	}
 
-	dprintf1("signal pkey_reg from xsave: %08x\n", *pkey_reg_ptr);
+	dprintf1("signal pkey_reg from xsave: %016lx\n", *pkey_reg_ptr);
 	/*
 	 * need __rdpkey_reg() version so we do not do shadow_pkey_reg
 	 * checking
 	 */
-	dprintf1("signal pkey_reg from  pkey_reg: %08x\n", __rdpkey_reg());
+	dprintf1("signal pkey_reg from  pkey_reg: %016lx\n", __rdpkey_reg());
 	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
 	*(u64 *)pkey_reg_ptr = 0x00000000;
 	dprintf1("WARNING: set PRKU=0 to allow faulting instruction to continue\n");
@@ -364,8 +366,8 @@ void dumpit(char *f)
 u32 pkey_get(int pkey, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 pkey_reg = __rdpkey_reg();
-	u32 shifted_pkey_reg;
+	pkey_reg_t pkey_reg = __rdpkey_reg();
+	pkey_reg_t shifted_pkey_reg;
 	u32 masked_pkey_reg;
 
 	dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
@@ -386,8 +388,8 @@ u32 pkey_get(int pkey, unsigned long flags)
 int pkey_set(int pkey, unsigned long rights, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 old_pkey_reg = __rdpkey_reg();
-	u32 new_pkey_reg;
+	pkey_reg_t old_pkey_reg = __rdpkey_reg();
+	pkey_reg_t new_pkey_reg;
 
 	/* make sure that 'rights' only contains the bits we expect: */
 	assert(!(rights & ~mask));
@@ -401,10 +403,10 @@ int pkey_set(int pkey, unsigned long rights, unsigned long flags)
 
 	__wrpkey_reg(new_pkey_reg);
 
-	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x"
-	       " pkey_reg now: %x old_pkey_reg: %x\n",
-		__func__, pkey, rights, flags, 0, __rdpkey_reg(),
-		old_pkey_reg);
+	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x "
+			"pkey_reg now: %016lx old_pkey_reg: %016lx\n",
+			__func__, pkey, rights, flags,
+			0, __rdpkey_reg(), old_pkey_reg);
 	return 0;
 }
 
@@ -413,7 +415,7 @@ void pkey_disable_set(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights;
-	u32 orig_pkey_reg = rdpkey_reg();
+	pkey_reg_t orig_pkey_reg = rdpkey_reg();
 
 	dprintf1("START->%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
@@ -421,8 +423,6 @@ void pkey_disable_set(int pkey, int flags)
 
 	pkey_rights = pkey_get(pkey, syscall_flags);
 
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
 	pkey_assert(pkey_rights >= 0);
 
 	pkey_rights |= flags;
@@ -431,7 +431,8 @@ void pkey_disable_set(int pkey, int flags)
 	assert(!ret);
 	/*pkey_reg and flags have the same format */
 	shadow_pkey_reg |= flags << (pkey * 2);
-	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkey_reg);
+	dprintf1("%s(%d) shadow: 0x%016lx\n",
+		__func__, pkey, shadow_pkey_reg);
 
 	pkey_assert(ret >= 0);
 
@@ -439,7 +440,8 @@ void pkey_disable_set(int pkey, int flags)
 	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
 			pkey, pkey, pkey_rights);
 
-	dprintf1("%s(%d) pkey_reg: 0x%x\n", __func__, pkey, rdpkey_reg());
+	dprintf1("%s(%d) pkey_reg: 0x%016lx\n",
+		__func__, pkey, rdpkey_reg());
 	if (flags)
 		pkey_assert(rdpkey_reg() > orig_pkey_reg);
 	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
@@ -451,7 +453,7 @@ void pkey_disable_clear(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights = pkey_get(pkey, syscall_flags);
-	u32 orig_pkey_reg = rdpkey_reg();
+	pkey_reg_t orig_pkey_reg = rdpkey_reg();
 
 	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
 
@@ -470,7 +472,8 @@ void pkey_disable_clear(int pkey, int flags)
 	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
 			pkey, pkey, pkey_rights);
 
-	dprintf1("%s(%d) pkey_reg: 0x%x\n", __func__, pkey, rdpkey_reg());
+	dprintf1("%s(%d) pkey_reg: 0x%016lx\n", __func__,
+			pkey, rdpkey_reg());
 	if (flags)
 		assert(rdpkey_reg() > orig_pkey_reg);
 }
@@ -525,20 +528,21 @@ int alloc_pkey(void)
 	int ret;
 	unsigned long init_val = 0x0;
 
-	dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__,
+	dprintf1("%s()::%d, pkey_reg: 0x%016lx shadow: %016lx\n", __func__,
 			__LINE__, __rdpkey_reg(), shadow_pkey_reg);
 	ret = sys_pkey_alloc(0, init_val);
 	/*
 	 * pkey_alloc() sets PKEY register, so we need to reflect it in
 	 * shadow_pkey_reg:
 	 */
-	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx shadow: 0x%016lx\n",
 			__func__, __LINE__, ret, __rdpkey_reg(),
 			shadow_pkey_reg);
 	if (ret) {
 		/* clear both the bits: */
 		shadow_pkey_reg &= ~(0x3      << (ret * 2));
-		dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+		dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx "
+				"shadow: 0x%016lx\n",
 				__func__,
 				__LINE__, ret, __rdpkey_reg(),
 				shadow_pkey_reg);
@@ -548,13 +552,13 @@ int alloc_pkey(void)
 		 */
 		shadow_pkey_reg |=  (init_val << (ret * 2));
 	}
-	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx shadow: 0x%016lx\n",
 			__func__, __LINE__, ret, __rdpkey_reg(),
 			shadow_pkey_reg);
 	dprintf1("%s()::%d errno: %d\n", __func__, __LINE__, errno);
 	/* for shadow checking: */
 	rdpkey_reg();
-	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx shadow: 0x%016lx\n",
 		__func__, __LINE__, ret, __rdpkey_reg(),
 		shadow_pkey_reg);
 	return ret;
@@ -1103,9 +1107,10 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 		int new_pkey;
 		dprintf1("%s() alloc loop: %d\n", __func__, i);
 		new_pkey = alloc_pkey();
-		dprintf4("%s()::%d, err: %d pkey_reg: 0x%x shadow: 0x%x\n",
-				__func__, __LINE__, err, __rdpkey_reg(),
-				shadow_pkey_reg);
+		dprintf4("%s()::%d, err: %d pkey_reg: 0x%016lx "
+				"shadow: 0x%016lx\n",
+			__func__, __LINE__, err, __rdpkey_reg(),
+			shadow_pkey_reg);
 		rdpkey_reg(); /* for shadow checking */
 		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
 		if ((new_pkey == -1) && (errno == ENOSPC)) {
@@ -1343,7 +1348,7 @@ int main(void)
 	}
 
 	pkey_setup_shadow();
-	printf("startup pkey_reg: %x\n", rdpkey_reg());
+	printf("startup pkey_reg: 0x%016lx\n", rdpkey_reg());
 	setup_hugetlbfs();
 
 	while (nr_iterations-- > 0)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
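
The patch above switches the format strings to %016lx while pkey_reg_t is still u32, so as far as I can tell the specifier and the argument width only agree once the type becomes 64-bit on some architecture. One way to keep the format string independent of the typedef is to widen the value explicitly and use the <inttypes.h> macros. The sketch below is only an illustration of that idea: format_pkey_reg() is a hypothetical helper, not part of the selftest, and the typedef stands in for the patch's "#define pkey_reg_t u32".

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: x86 width; another architecture could use uint64_t here. */
typedef uint32_t pkey_reg_t;

/*
 * Hypothetical helper: format a pkey register value as 16 hex digits
 * regardless of how wide pkey_reg_t is on this architecture.  The cast
 * makes the argument match PRIx64 exactly.
 */
static int format_pkey_reg(char *buf, size_t len, pkey_reg_t reg)
{
	return snprintf(buf, len, "0x%016" PRIx64, (uint64_t)reg);
}
```

With this approach only the typedef changes per architecture; the callers and their format strings stay the same.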

* [PATCH v9 35/51] selftest/vm: typecast the pkey register
@ 2017-11-06  8:57   ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

This is in preparation to accommodate a pkey register whose size
differs across architectures.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/pkey-helpers.h    |   27 +++++-----
 tools/testing/selftests/vm/protection_keys.c |   71 ++++++++++++++------------
 2 files changed, 52 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 1b15b54..b03f7e5 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -18,6 +18,7 @@
 #define u16 uint16_t
 #define u32 uint32_t
 #define u64 uint64_t
+#define pkey_reg_t u32
 
 #ifdef __i386__
 #define SYS_mprotect_key 380
@@ -77,12 +78,12 @@ static inline void sigsafe_printf(const char *format, ...)
 #define dprintf3(args...) dprintf_level(3, args)
 #define dprintf4(args...) dprintf_level(4, args)
 
-extern unsigned int shadow_pkey_reg;
-static inline unsigned int __rdpkey_reg(void)
+extern pkey_reg_t shadow_pkey_reg;
+static inline pkey_reg_t __rdpkey_reg(void)
 {
 	unsigned int eax, edx;
 	unsigned int ecx = 0;
-	unsigned int pkey_reg;
+	pkey_reg_t pkey_reg;
 
 	asm volatile(".byte 0x0f,0x01,0xee\n\t"
 		     : "=a" (eax), "=d" (edx)
@@ -91,11 +92,11 @@ static inline unsigned int __rdpkey_reg(void)
 	return pkey_reg;
 }
 
-static inline unsigned int _rdpkey_reg(int line)
+static inline pkey_reg_t _rdpkey_reg(int line)
 {
-	unsigned int pkey_reg = __rdpkey_reg();
+	pkey_reg_t pkey_reg = __rdpkey_reg();
 
-	dprintf4("rdpkey_reg(line=%d) pkey_reg: %x shadow: %x\n",
+	dprintf4("rdpkey_reg(line=%d) pkey_reg: %016lx shadow: %016lx\n",
 			line, pkey_reg, shadow_pkey_reg);
 	assert(pkey_reg == shadow_pkey_reg);
 
@@ -104,11 +105,11 @@ static inline unsigned int _rdpkey_reg(int line)
 
 #define rdpkey_reg() _rdpkey_reg(__LINE__)
 
-static inline void __wrpkey_reg(unsigned int pkey_reg)
+static inline void __wrpkey_reg(pkey_reg_t pkey_reg)
 {
-	unsigned int eax = pkey_reg;
-	unsigned int ecx = 0;
-	unsigned int edx = 0;
+	pkey_reg_t eax = pkey_reg;
+	pkey_reg_t ecx = 0;
+	pkey_reg_t edx = 0;
 
 	dprintf4("%s() changing %08x to %08x\n", __func__,
 			__rdpkey_reg(), pkey_reg);
@@ -117,7 +118,7 @@ static inline void __wrpkey_reg(unsigned int pkey_reg)
 	assert(pkey_reg == __rdpkey_reg());
 }
 
-static inline void wrpkey_reg(unsigned int pkey_reg)
+static inline void wrpkey_reg(pkey_reg_t pkey_reg)
 {
 	dprintf4("%s() changing %08x to %08x\n", __func__,
 			__rdpkey_reg(), pkey_reg);
@@ -135,7 +136,7 @@ static inline void wrpkey_reg(unsigned int pkey_reg)
  */
 static inline void __pkey_access_allow(int pkey, int do_allow)
 {
-	unsigned int pkey_reg = rdpkey_reg();
+	pkey_reg_t pkey_reg = rdpkey_reg();
 	int bit = pkey * 2;
 
 	if (do_allow)
@@ -149,7 +150,7 @@ static inline void __pkey_access_allow(int pkey, int do_allow)
 
 static inline void __pkey_write_allow(int pkey, int do_allow_write)
 {
-	long pkey_reg = rdpkey_reg();
+	pkey_reg_t pkey_reg = rdpkey_reg();
 	int bit = pkey * 2 + 1;
 
 	if (do_allow_write)
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index dec05e0..2e8de01 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -48,7 +48,7 @@
 int iteration_nr = 1;
 int test_nr;
 
-unsigned int shadow_pkey_reg;
+pkey_reg_t shadow_pkey_reg;
 int dprint_in_signal;
 char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
 
@@ -158,7 +158,7 @@ void dump_mem(void *dumpme, int len_bytes)
 
 	for (i = 0; i < len_bytes; i += sizeof(u64)) {
 		u64 *ptr = (u64 *)(c + i);
-		dprintf1("dump[%03d][@%p]: %016jx\n", i, ptr, *ptr);
+		dprintf1("dump[%03d][@%p]: %016lx\n", i, ptr, *ptr);
 	}
 }
 
@@ -186,15 +186,16 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	int trapno;
 	unsigned long ip;
 	char *fpregs;
-	u32 *pkey_reg_ptr;
-	u64 si_pkey;
+	pkey_reg_t *pkey_reg_ptr;
+	u32 si_pkey;
 	u32 *si_pkey_ptr;
 	int pkey_reg_offset;
 	fpregset_t fpregset;
 
 	dprint_in_signal = 1;
 	dprintf1(">>>>===============SIGSEGV============================\n");
-	dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__, __LINE__,
+	dprintf1("%s()::%d, pkey_reg: 0x%016lx shadow: %016lx\n",
+			__func__, __LINE__,
 			__rdpkey_reg(), shadow_pkey_reg);
 
 	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
@@ -202,8 +203,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	fpregset = uctxt->uc_mcontext.fpregs;
 	fpregs = (void *)fpregset;
 
-	dprintf2("%s() trapno: %d ip: 0x%lx info->si_code: %s/%d\n", __func__,
-			trapno, ip, si_code_str(si->si_code), si->si_code);
+	dprintf2("%s() trapno: %d ip: 0x%016lx info->si_code: %s/%d\n",
+			__func__, trapno, ip, si_code_str(si->si_code),
+			si->si_code);
 #ifdef __i386__
 	/*
 	 * 32-bit has some extra padding so that userspace can tell whether
@@ -240,12 +242,12 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 		exit(4);
 	}
 
-	dprintf1("signal pkey_reg from xsave: %08x\n", *pkey_reg_ptr);
+	dprintf1("signal pkey_reg from xsave: %016lx\n", *pkey_reg_ptr);
 	/*
 	 * need __rdpkey_reg() version so we do not do shadow_pkey_reg
 	 * checking
 	 */
-	dprintf1("signal pkey_reg from  pkey_reg: %08x\n", __rdpkey_reg());
+	dprintf1("signal pkey_reg from  pkey_reg: %016lx\n", __rdpkey_reg());
 	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
 	*(u64 *)pkey_reg_ptr = 0x00000000;
 	dprintf1("WARNING: set PRKU=0 to allow faulting instruction to continue\n");
@@ -364,8 +366,8 @@ void dumpit(char *f)
 u32 pkey_get(int pkey, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 pkey_reg = __rdpkey_reg();
-	u32 shifted_pkey_reg;
+	pkey_reg_t pkey_reg = __rdpkey_reg();
+	pkey_reg_t shifted_pkey_reg;
 	u32 masked_pkey_reg;
 
 	dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
@@ -386,8 +388,8 @@ u32 pkey_get(int pkey, unsigned long flags)
 int pkey_set(int pkey, unsigned long rights, unsigned long flags)
 {
 	u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
-	u32 old_pkey_reg = __rdpkey_reg();
-	u32 new_pkey_reg;
+	pkey_reg_t old_pkey_reg = __rdpkey_reg();
+	pkey_reg_t new_pkey_reg;
 
 	/* make sure that 'rights' only contains the bits we expect: */
 	assert(!(rights & ~mask));
@@ -401,10 +403,10 @@ int pkey_set(int pkey, unsigned long rights, unsigned long flags)
 
 	__wrpkey_reg(new_pkey_reg);
 
-	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x"
-	       " pkey_reg now: %x old_pkey_reg: %x\n",
-		__func__, pkey, rights, flags, 0, __rdpkey_reg(),
-		old_pkey_reg);
+	dprintf3("%s(pkey=%d, rights=%lx, flags=%lx) = %x "
+			"pkey_reg now: %016lx old_pkey_reg: %016lx\n",
+			__func__, pkey, rights, flags,
+			0, __rdpkey_reg(), old_pkey_reg);
 	return 0;
 }
 
@@ -413,7 +415,7 @@ void pkey_disable_set(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights;
-	u32 orig_pkey_reg = rdpkey_reg();
+	pkey_reg_t orig_pkey_reg = rdpkey_reg();
 
 	dprintf1("START->%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
@@ -421,8 +423,6 @@ void pkey_disable_set(int pkey, int flags)
 
 	pkey_rights = pkey_get(pkey, syscall_flags);
 
-	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
-			pkey, pkey, pkey_rights);
 	pkey_assert(pkey_rights >= 0);
 
 	pkey_rights |= flags;
@@ -431,7 +431,8 @@ void pkey_disable_set(int pkey, int flags)
 	assert(!ret);
 	/*pkey_reg and flags have the same format */
 	shadow_pkey_reg |= flags << (pkey * 2);
-	dprintf1("%s(%d) shadow: 0x%x\n", __func__, pkey, shadow_pkey_reg);
+	dprintf1("%s(%d) shadow: 0x%016lx\n",
+		__func__, pkey, shadow_pkey_reg);
 
 	pkey_assert(ret >= 0);
 
@@ -439,7 +440,8 @@ void pkey_disable_set(int pkey, int flags)
 	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
 			pkey, pkey, pkey_rights);
 
-	dprintf1("%s(%d) pkey_reg: 0x%x\n", __func__, pkey, rdpkey_reg());
+	dprintf1("%s(%d) pkey_reg: 0x%lx\n",
+		__func__, pkey, rdpkey_reg());
 	if (flags)
 		pkey_assert(rdpkey_reg() > orig_pkey_reg);
 	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
@@ -451,7 +453,7 @@ void pkey_disable_clear(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights = pkey_get(pkey, syscall_flags);
-	u32 orig_pkey_reg = rdpkey_reg();
+	pkey_reg_t orig_pkey_reg = rdpkey_reg();
 
 	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
 
@@ -470,7 +472,8 @@ void pkey_disable_clear(int pkey, int flags)
 	dprintf1("%s(%d) pkey_get(%d): %x\n", __func__,
 			pkey, pkey, pkey_rights);
 
-	dprintf1("%s(%d) pkey_reg: 0x%x\n", __func__, pkey, rdpkey_reg());
+	dprintf1("%s(%d) pkey_reg: 0x%016lx\n", __func__,
+			pkey, rdpkey_reg());
 	if (flags)
 		assert(rdpkey_reg() > orig_pkey_reg);
 }
@@ -525,20 +528,21 @@ int alloc_pkey(void)
 	int ret;
 	unsigned long init_val = 0x0;
 
-	dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__,
+	dprintf1("%s()::%d, pkey_reg: 0x%016lx shadow: %016lx\n", __func__,
 			__LINE__, __rdpkey_reg(), shadow_pkey_reg);
 	ret = sys_pkey_alloc(0, init_val);
 	/*
 	 * pkey_alloc() sets PKEY register, so we need to reflect it in
 	 * shadow_pkey_reg:
 	 */
-	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx shadow: 0x%016lx\n",
 			__func__, __LINE__, ret, __rdpkey_reg(),
 			shadow_pkey_reg);
 	if (ret) {
 		/* clear both the bits: */
 		shadow_pkey_reg &= ~(0x3      << (ret * 2));
-		dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+		dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx "
+				"shadow: 0x%016lx\n",
 				__func__,
 				__LINE__, ret, __rdpkey_reg(),
 				shadow_pkey_reg);
@@ -548,13 +552,13 @@ int alloc_pkey(void)
 		 */
 		shadow_pkey_reg |=  (init_val << (ret * 2));
 	}
-	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx shadow: 0x%016lx\n",
 			__func__, __LINE__, ret, __rdpkey_reg(),
 			shadow_pkey_reg);
 	dprintf1("%s()::%d errno: %d\n", __func__, __LINE__, errno);
 	/* for shadow checking: */
 	rdpkey_reg();
-	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n",
+	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx shadow: 0x%016lx\n",
 		__func__, __LINE__, ret, __rdpkey_reg(),
 		shadow_pkey_reg);
 	return ret;
@@ -1103,9 +1107,10 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 		int new_pkey;
 		dprintf1("%s() alloc loop: %d\n", __func__, i);
 		new_pkey = alloc_pkey();
-		dprintf4("%s()::%d, err: %d pkey_reg: 0x%x shadow: 0x%x\n",
-				__func__, __LINE__, err, __rdpkey_reg(),
-				shadow_pkey_reg);
+		dprintf4("%s()::%d, err: %d pkey_reg: 0x%016lx "
+				"shadow: 0x%016lx\n",
+			__func__, __LINE__, err, __rdpkey_reg(),
+			shadow_pkey_reg);
 		rdpkey_reg(); /* for shadow checking */
 		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
 		if ((new_pkey == -1) && (errno == ENOSPC)) {
@@ -1343,7 +1348,7 @@ int main(void)
 	}
 
 	pkey_setup_shadow();
-	printf("startup pkey_reg: %x\n", rdpkey_reg());
+	printf("startup pkey_reg: 0x%016lx\n", rdpkey_reg());
 	setup_hugetlbfs();
 
 	while (nr_iterations-- > 0)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
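
Editorial sketch, not part of the posted patch: the patch above keeps
pkey_reg_t defined as u32 while switching the format strings to %016lx.
Where long is 64 bits wide, the format and the argument width then differ.
One hedged way to keep the two in step is to widen the value to uint64_t
and print it with PRIx64. The typedef and the helper name format_pkey_reg()
are assumptions made for illustration only.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: mirrors "#define pkey_reg_t u32" from the patch. */
typedef uint32_t pkey_reg_t;

/*
 * Hypothetical helper: format a register value as 16 hex digits,
 * whatever the width of pkey_reg_t, by widening it to uint64_t.
 * Returns the number of characters written, excluding the NUL.
 */
static int format_pkey_reg(char *buf, size_t len, pkey_reg_t reg)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "0x%016" PRIx64, (uint64_t)reg);
}
```

With this shape, changing the typedef to a 64-bit type later would not
require touching the format string.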

* [PATCH v9 36/51] selftest/vm: generic function to handle shadow key register
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Add helper functions to handle the shadow pkey register.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/pkey-helpers.h    |   27 ++++++++++++++++++++
 tools/testing/selftests/vm/protection_keys.c |   34 ++++++++++++++++---------
 2 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index b03f7e5..d521f53 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -44,6 +44,33 @@
 #define DEBUG_LEVEL 0
 #endif
 #define DPRINT_IN_SIGNAL_BUF_SIZE 4096
+
+static inline u32 pkey_to_shift(int pkey)
+{
+	return pkey * PKEY_BITS_PER_PKEY;
+}
+
+static inline pkey_reg_t reset_bits(int pkey, pkey_reg_t bits)
+{
+	u32 shift = pkey_to_shift(pkey);
+
+	return ~(bits << shift);
+}
+
+static inline pkey_reg_t left_shift_bits(int pkey, pkey_reg_t bits)
+{
+	u32 shift = pkey_to_shift(pkey);
+
+	return (bits << shift);
+}
+
+static inline pkey_reg_t right_shift_bits(int pkey, pkey_reg_t bits)
+{
+	u32 shift = pkey_to_shift(pkey);
+
+	return (bits >> shift);
+}
+
 extern int dprint_in_signal;
 extern char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
 static inline void sigsafe_printf(const char *format, ...)
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 2e8de01..8e2e277 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -374,7 +374,7 @@ u32 pkey_get(int pkey, unsigned long flags)
 			__func__, pkey, flags, 0, 0);
 	dprintf2("%s() raw pkey_reg: %x\n", __func__, pkey_reg);
 
-	shifted_pkey_reg = (pkey_reg >> (pkey * PKEY_BITS_PER_PKEY));
+	shifted_pkey_reg = right_shift_bits(pkey, pkey_reg);
 	dprintf2("%s() shifted_pkey_reg: %x\n", __func__, shifted_pkey_reg);
 	masked_pkey_reg = shifted_pkey_reg & mask;
 	dprintf2("%s() masked  pkey_reg: %x\n", __func__, masked_pkey_reg);
@@ -397,9 +397,9 @@ int pkey_set(int pkey, unsigned long rights, unsigned long flags)
 	/* copy old pkey_reg */
 	new_pkey_reg = old_pkey_reg;
 	/* mask out bits from pkey in old value: */
-	new_pkey_reg &= ~(mask << (pkey * PKEY_BITS_PER_PKEY));
+	new_pkey_reg &= reset_bits(pkey, mask);
 	/* OR in new bits for pkey: */
-	new_pkey_reg |= (rights << (pkey * PKEY_BITS_PER_PKEY));
+	new_pkey_reg |= left_shift_bits(pkey, rights);
 
 	__wrpkey_reg(new_pkey_reg);
 
@@ -430,7 +430,7 @@ void pkey_disable_set(int pkey, int flags)
 	ret = pkey_set(pkey, pkey_rights, syscall_flags);
 	assert(!ret);
 	/*pkey_reg and flags have the same format */
-	shadow_pkey_reg |= flags << (pkey * 2);
+	shadow_pkey_reg |= left_shift_bits(pkey, flags);
 	dprintf1("%s(%d) shadow: 0x%016lx\n",
 		__func__, pkey, shadow_pkey_reg);
 
@@ -465,7 +465,7 @@ void pkey_disable_clear(int pkey, int flags)
 
 	ret = pkey_set(pkey, pkey_rights, 0);
 	/* pkey_reg and flags have the same format */
-	shadow_pkey_reg &= ~(flags << (pkey * 2));
+	shadow_pkey_reg &= reset_bits(pkey, flags);
 	pkey_assert(ret >= 0);
 
 	pkey_rights = pkey_get(pkey, syscall_flags);
@@ -523,6 +523,21 @@ int sys_pkey_alloc(unsigned long flags, unsigned long init_val)
 	return ret;
 }
 
+void pkey_setup_shadow(void)
+{
+	shadow_pkey_reg = __rdpkey_reg();
+}
+
+void pkey_reset_shadow(u32 key)
+{
+	shadow_pkey_reg &= reset_bits(key, 0x3);
+}
+
+void pkey_set_shadow(u32 key, u64 init_val)
+{
+	shadow_pkey_reg |=  left_shift_bits(key, init_val);
+}
+
 int alloc_pkey(void)
 {
 	int ret;
@@ -540,7 +555,7 @@ int alloc_pkey(void)
 			shadow_pkey_reg);
 	if (ret) {
 		/* clear both the bits: */
-		shadow_pkey_reg &= ~(0x3      << (ret * 2));
+		pkey_reset_shadow(ret);
 		dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx "
 				"shadow: 0x%016lx\n",
 				__func__,
@@ -550,7 +565,7 @@ int alloc_pkey(void)
 		 * move the new state in from init_val
 		 * (remember, we cheated and init_val == pkey_reg format)
 		 */
-		shadow_pkey_reg |=  (init_val << (ret * 2));
+		pkey_set_shadow(ret, init_val);
 	}
 	dprintf4("%s()::%d, ret: %d pkey_reg: 0x%016lx shadow: 0x%016lx\n",
 			__func__, __LINE__, ret, __rdpkey_reg(),
@@ -1322,11 +1337,6 @@ void run_tests_once(void)
 	iteration_nr++;
 }
 
-void pkey_setup_shadow(void)
-{
-	shadow_pkey_reg = __rdpkey_reg();
-}
-
 int main(void)
 {
 	int nr_iterations = 22;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
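
Editorial sketch, not part of the posted patch: the helpers introduced
above can be exercised outside the selftest. The version below assumes two
bits per key and a 64-bit pkey_reg_t purely for illustration; set_rights()
and get_rights() are hypothetical wrappers that follow the same
mask-then-OR sequence as pkey_set() and pkey_get().

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_BITS_PER_PKEY	2	/* assumption: two bits per key */
#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2

typedef uint64_t pkey_reg_t;		/* assumption: width for illustration */

static inline uint32_t pkey_to_shift(int pkey)
{
	return pkey * PKEY_BITS_PER_PKEY;
}

/* Mask that clears 'bits' within the field belonging to 'pkey'. */
static inline pkey_reg_t reset_bits(int pkey, pkey_reg_t bits)
{
	return ~(bits << pkey_to_shift(pkey));
}

static inline pkey_reg_t left_shift_bits(int pkey, pkey_reg_t bits)
{
	return bits << pkey_to_shift(pkey);
}

static inline pkey_reg_t right_shift_bits(int pkey, pkey_reg_t bits)
{
	return bits >> pkey_to_shift(pkey);
}

/* Hypothetical: replace the rights of one key in a register image. */
static pkey_reg_t set_rights(pkey_reg_t reg, int pkey, pkey_reg_t rights)
{
	pkey_reg_t mask = PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE;

	assert(!(rights & ~mask));
	reg &= reset_bits(pkey, mask);		/* drop the old field */
	reg |= left_shift_bits(pkey, rights);	/* OR in the new one */
	return reg;
}

/* Hypothetical: extract the rights of one key from a register image. */
static pkey_reg_t get_rights(pkey_reg_t reg, int pkey)
{
	pkey_reg_t mask = PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE;

	return right_shift_bits(pkey, reg) & mask;
}
```

For example, set_rights(0, 3, PKEY_DISABLE_WRITE) places the value 0x2 at
bit offset 6, and get_rights() on the result returns PKEY_DISABLE_WRITE
for key 3 and 0 for every other key.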

* [PATCH v9 37/51] selftest/vm: fix the wrong assert in pkey_disable_set()
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

If the flag is 0, no bits will be set. Hence we cannot expect
the resulting bitmap to have a higher value than what it
was earlier.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/protection_keys.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 8e2e277..5aba137 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -443,7 +443,7 @@ void pkey_disable_set(int pkey, int flags)
 	dprintf1("%s(%d) pkey_reg: 0x%lx\n",
 		__func__, pkey, rdpkey_reg());
 	if (flags)
-		pkey_assert(rdpkey_reg() > orig_pkey_reg);
+		pkey_assert(rdpkey_reg() >= orig_pkey_reg);
 	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
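
Editorial sketch, not part of the posted patch: the relaxed comparison can
be illustrated with a register image held in a plain variable. OR-ing in
bits that are already set leaves the value unchanged, so only ">=" is
guaranteed to hold. disable_set() is a hypothetical stand-in that assumes
two bits per key and a 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pkey_reg_t;	/* assumption: width for illustration */

/* Hypothetical: OR the disable flags into the two-bit field of one key. */
static pkey_reg_t disable_set(pkey_reg_t reg, int pkey, pkey_reg_t flags)
{
	assert(pkey >= 0 && pkey < 32);
	assert(!(flags & ~(pkey_reg_t)0x3));
	return reg | (flags << (pkey * 2));
}
```

Calling disable_set() twice with the same key and flags returns the same
value the second time, which a strict ">" check would reject.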

* [PATCH v9 38/51] selftest/vm: fixed bugs in pkey_disable_clear()
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Instead of clearing the bits, pkey_disable_clear() was setting
the bits. Fixed it.

Also fixed a wrong assertion in that function. When bits are
cleared, the resulting register value will be less than the original.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/protection_keys.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 5aba137..384cc9a 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -461,7 +461,7 @@ void pkey_disable_clear(int pkey, int flags)
 			pkey, pkey, pkey_rights);
 	pkey_assert(pkey_rights >= 0);
 
-	pkey_rights |= flags;
+	pkey_rights &= ~flags;
 
 	ret = pkey_set(pkey, pkey_rights, 0);
 	/* pkey_reg and flags have the same format */
@@ -475,7 +475,7 @@ void pkey_disable_clear(int pkey, int flags)
 	dprintf1("%s(%d) pkey_reg: 0x%016lx\n", __func__,
 			pkey, rdpkey_reg());
 	if (flags)
-		assert(rdpkey_reg() > orig_pkey_reg);
+		assert(rdpkey_reg() < orig_pkey_reg);
 }
 
 void pkey_write_allow(int pkey)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
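
Editorial sketch, not part of the posted patch: the difference between the
old and the new statement is easiest to see on the two-bit rights value
alone. The function names below are hypothetical; clear_rights() follows
the corrected "&= ~flags" form and buggy_clear_rights() follows the "|="
form that the patch removes.

```c
#include <assert.h>

#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2

/* Corrected form: drop the requested disable bits. */
static int clear_rights(int rights, int flags)
{
	assert(!(flags & ~(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)));
	return rights & ~flags;
}

/* Old form, kept only for comparison: it sets the bits instead. */
static int buggy_clear_rights(int rights, int flags)
{
	return rights | flags;
}
```

Starting from rights of 0x3, clearing PKEY_DISABLE_WRITE yields 0x1 with
the corrected form, while the old form leaves 0x3 in place. The value only
decreases when at least one of the requested bits was set beforehand.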

* [PATCH v9 39/51] selftest/vm: clear the bits in shadow reg when a pkey is freed.
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

When a key is freed, the key is no longer in effect.
Clear the bits corresponding to the pkey in the shadow
register. Otherwise it will carry some spurious bits
which can trigger false-positive asserts.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/protection_keys.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 384cc9a..2823d4d 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -582,6 +582,9 @@ int alloc_pkey(void)
 int sys_pkey_free(unsigned long pkey)
 {
 	int ret = syscall(SYS_pkey_free, pkey);
+
+	if (!ret)
+		shadow_pkey_reg &= reset_bits(pkey, PKEY_DISABLE_ACCESS);
 	dprintf1("%s(pkey=%ld) syscall ret: %d\n", __func__, pkey, ret);
 	return ret;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread
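
Editorial sketch, not part of the posted patch: a shadow copy stays
consistent after a free when the whole field of the freed key is dropped.
The sketch below clears both bits of the key, which is an assumption about
the intent described in the commit message; the patch as posted passes
PKEY_DISABLE_ACCESS to reset_bits(). shadow_free_key() is a hypothetical
name and the 64-bit width is chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_BITS_PER_PKEY	2	/* assumption: two bits per key */

/* Hypothetical: forget the rights of a freed key in the shadow copy. */
static void shadow_free_key(uint64_t *shadow, int pkey)
{
	assert(shadow);
	assert(pkey >= 0 && pkey < 32);
	*shadow &= ~((uint64_t)0x3 << (pkey * PKEY_BITS_PER_PKEY));
}
```

With keys 1 and 2 fully disabled the shadow holds 0x3c; freeing key 2
leaves 0x0c, so later comparisons no longer see stale bits for that key.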

* [PATCH v9 40/51] selftest/vm: fix alloc_random_pkey() to make it really random
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

alloc_random_pkey() was allocating the same pkey every time.
Not all pkeys were getting tested. Fixed it.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/protection_keys.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 2823d4d..1a14027 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -24,6 +24,7 @@
 #define _GNU_SOURCE
 #include <errno.h>
 #include <linux/futex.h>
+#include <time.h>
 #include <sys/time.h>
 #include <sys/syscall.h>
 #include <string.h>
@@ -602,13 +603,15 @@ int alloc_random_pkey(void)
 	int alloced_pkeys[NR_PKEYS];
 	int nr_alloced = 0;
 	int random_index;
+
 	memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
+	srand((unsigned int)time(NULL));
 
 	/* allocate every possible key and make a note of which ones we got */
 	max_nr_pkey_allocs = NR_PKEYS;
-	max_nr_pkey_allocs = 1;
 	for (i = 0; i < max_nr_pkey_allocs; i++) {
 		int new_pkey = alloc_pkey();
+
 		if (new_pkey < 0)
 			break;
 		alloced_pkeys[nr_alloced++] = new_pkey;
@@ -624,13 +627,14 @@ int alloc_random_pkey(void)
 	/* go through the allocated ones that we did not want and free them */
 	for (i = 0; i < nr_alloced; i++) {
 		int free_ret;
+
 		if (!alloced_pkeys[i])
 			continue;
 		free_ret = sys_pkey_free(alloced_pkeys[i]);
 		pkey_assert(!free_ret);
 	}
-	dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n", __func__,
-			__LINE__, ret, __rdpkey_reg(), shadow_pkey_reg);
+	dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%016lx\n",
+		__func__, __LINE__, ret, __rdpkey_reg(), shadow_pkey_reg);
 	return ret;
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 41/51] selftest/vm: introduce two arch independent abstraction
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Introduce two arch-independent abstractions:

open_hugepage_file() <- opens the huge page file.
get_start_key()      <- provides the first non-reserved key.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/pkey-helpers.h    |   11 +++++++++++
 tools/testing/selftests/vm/protection_keys.c |    6 +++---
 2 files changed, 14 insertions(+), 3 deletions(-)
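
To show how the two helpers named in the commit message are meant to be
used, here is a small sketch. It assumes the x86 values (first
non-reserved key 1, NR_PKEYS 16); count_candidate_pkeys() is a
hypothetical helper added only for illustration. Unlike the helper in the
diff, this open_hugepage_file() passes its flag argument through to
open(), which is presumably the intent.

```c
#include <fcntl.h>

#define NR_PKEYS 16	/* x86 value; other architectures may differ */

/* Open the sysfs file that reports the number of 2M huge pages. */
static inline int open_hugepage_file(int flag)
{
	return open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages",
		    flag);
}

/* First key a test may touch; key 0 is the default pkey. */
static inline int get_start_key(void)
{
	return 1;
}

/* Hypothetical helper: count the keys a test loop visits, skipping 'pkey'. */
static int count_candidate_pkeys(int pkey)
{
	int count = 0;

	for (int i = get_start_key(); i < NR_PKEYS; i++) {
		if (i == pkey)
			continue;
		count++;
	}
	return count;
}
```

Because the loop starts at get_start_key() instead of a literal 1, an
architecture that reserves more low-numbered keys only has to return a
different value from that one helper.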

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index d521f53..30755be 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -301,3 +301,14 @@ static inline void __page_o_noops(void)
 	}					\
 } while (0)
 #define raw_assert(cond) assert(cond)
+
+static inline int open_hugepage_file(int flag)
+{
+	return open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages",
+		 O_RDONLY);
+}
+
+static inline int get_start_key(void)
+{
+	return 1;
+}
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 1a14027..19ae991 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -809,7 +809,7 @@ void setup_hugetlbfs(void)
 	 * Now go make sure that we got the pages and that they
 	 * are 2M pages.  Someone might have made 1G the default.
 	 */
-	fd = open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages", O_RDONLY);
+	fd = open_hugepage_file(O_RDONLY);
 	if (fd < 0) {
 		perror("opening sysfs 2M hugetlb config");
 		return;
@@ -1087,10 +1087,10 @@ void test_kernel_gup_write_to_write_disabled_region(int *ptr, u16 pkey)
 void test_pkey_syscalls_on_non_allocated_pkey(int *ptr, u16 pkey)
 {
 	int err;
-	int i;
+	int i = get_start_key();
 
 	/* Note: 0 is the default pkey, so don't mess with it */
-	for (i = 1; i < NR_PKEYS; i++) {
+	for (; i < NR_PKEYS; i++) {
 		if (pkey == i)
 			continue;
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 42/51] selftest/vm: pkey register should match shadow pkey
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

expected_pkey_fault() compares the contents of the pkey
register with 0. That check does not always hold: the
architecture may set some bits by default, and those bits
can never be changed. Compare the value against the shadow
pkey register instead, which is supposed to track the bits
accurately throughout.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/protection_keys.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
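
The reasoning in the commit message above can be illustrated with a small
simulation. Nothing here touches real hardware: HW_FIXED_BITS, hw_reg,
sim_wrpkey_reg(), sim_rdpkey_reg(), sim_set_bits() and
reg_matches_shadow() are hypothetical names that model a register in
which the architecture keeps some bits permanently set, and
shadow_pkey_reg mirrors the selftest variable of the same name.

```c
#include <stdint.h>

typedef uint64_t pkey_reg_t;

/* Hypothetical: bits that the simulated architecture always keeps set. */
#define HW_FIXED_BITS 0xC000000000000000ULL

static pkey_reg_t hw_reg = HW_FIXED_BITS;		/* simulated register */
static pkey_reg_t shadow_pkey_reg = HW_FIXED_BITS;	/* software copy */

/* Writes cannot clear the fixed bits. */
static void sim_wrpkey_reg(pkey_reg_t val)
{
	hw_reg = val | HW_FIXED_BITS;
}

static pkey_reg_t sim_rdpkey_reg(void)
{
	return hw_reg;
}

/* Update the shadow and the register together so the two stay in step. */
static void sim_set_bits(pkey_reg_t bits)
{
	shadow_pkey_reg |= bits;
	sim_wrpkey_reg(shadow_pkey_reg);
}

/* Returns 1 when the register holds exactly what the shadow expects. */
static int reg_matches_shadow(void)
{
	return sim_rdpkey_reg() == shadow_pkey_reg;
}
```

On such a register a check like sim_rdpkey_reg() != 0 trips even right
after a write of 0, whereas reg_matches_shadow() holds as long as the
shadow starts out with the fixed bits included.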

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 19ae991..2600f7a 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -926,10 +926,10 @@ void expected_pkey_fault(int pkey)
 	pkey_assert(last_pkey_faults + 1 == pkey_faults);
 	pkey_assert(last_si_pkey == pkey);
 	/*
-	 * The signal handler shold have cleared out PKEY register to let the
+	 * The signal handler should have cleared out pkey-register to let the
 	 * test program continue.  We now have to restore it.
 	 */
-	if (__rdpkey_reg() != 0)
+	if (__rdpkey_reg() != shadow_pkey_reg)
 		pkey_assert(0);
 
 	__wrpkey_reg(shadow_pkey_reg);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 43/51] selftest/vm: generic cleanup
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Clean up the code to satisfy the coding style.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/protection_keys.c |   81 ++++++++++++++------------
 1 files changed, 43 insertions(+), 38 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 2600f7a..3868434 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -4,7 +4,7 @@
  *
  * There are examples in here of:
  *  * how to set protection keys on memory
- *  * how to set/clear bits in pkey registers (the rights register)
+ *  * how to set/clear bits in Protection Key registers (the rights register)
  *  * how to handle SEGV_PKUERR signals and extract pkey-relevant
  *    information from the siginfo
  *
@@ -13,13 +13,18 @@
  *	prefault pages in at malloc, or not
  *	protect MPX bounds tables with protection keys?
  *	make sure VMA splitting/merging is working correctly
- *	OOMs can destroy mm->mmap (see exit_mmap()), so make sure it is immune to pkeys
- *	look for pkey "leaks" where it is still set on a VMA but "freed" back to the kernel
- *	do a plain mprotect() to a mprotect_pkey() area and make sure the pkey sticks
+ *	OOMs can destroy mm->mmap (see exit_mmap()),
+ *			so make sure it is immune to pkeys
+ *	look for pkey "leaks" where it is still set on a VMA
+ *			 but "freed" back to the kernel
+ *	do a plain mprotect() to a mprotect_pkey() area and make
+ *			 sure the pkey sticks
  *
  * Compile like this:
- *	gcc      -o protection_keys    -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
- *	gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
+ *	gcc      -o protection_keys    -O2 -g -std=gnu99
+ *			 -pthread -Wall protection_keys.c -lrt -ldl -lm
+ *	gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99
+ *			 -pthread -Wall protection_keys.c -lrt -ldl -lm
  */
 #define _GNU_SOURCE
 #include <errno.h>
@@ -251,26 +256,11 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	dprintf1("signal pkey_reg from  pkey_reg: %016lx\n", __rdpkey_reg());
 	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
 	*(u64 *)pkey_reg_ptr = 0x00000000;
-	dprintf1("WARNING: set PRKU=0 to allow faulting instruction to continue\n");
+	dprintf1("WARNING: set PKEY_REG=0 to allow faulting instruction "
+			"to continue\n");
 	pkey_faults++;
 	dprintf1("<<<<==================================================\n");
 	return;
-	if (trapno == 14) {
-		fprintf(stderr,
-			"ERROR: In signal handler, page fault, trapno = %d, ip = %016lx\n",
-			trapno, ip);
-		fprintf(stderr, "si_addr %p\n", si->si_addr);
-		fprintf(stderr, "REG_ERR: %lx\n",
-				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
-		exit(1);
-	} else {
-		fprintf(stderr, "unexpected trap %d! at 0x%lx\n", trapno, ip);
-		fprintf(stderr, "si_addr %p\n", si->si_addr);
-		fprintf(stderr, "REG_ERR: %lx\n",
-				(unsigned long)uctxt->uc_mcontext.gregs[REG_ERR]);
-		exit(2);
-	}
-	dprint_in_signal = 0;
 }
 
 int wait_all_children(void)
@@ -415,7 +405,7 @@ void pkey_disable_set(int pkey, int flags)
 {
 	unsigned long syscall_flags = 0;
 	int ret;
-	int pkey_rights;
+	u32 pkey_rights;
 	pkey_reg_t orig_pkey_reg = rdpkey_reg();
 
 	dprintf1("START->%s(%d, 0x%x)\n", __func__,
@@ -453,7 +443,7 @@ void pkey_disable_clear(int pkey, int flags)
 {
 	unsigned long syscall_flags = 0;
 	int ret;
-	int pkey_rights = pkey_get(pkey, syscall_flags);
+	u32 pkey_rights = pkey_get(pkey, syscall_flags);
 	pkey_reg_t orig_pkey_reg = rdpkey_reg();
 
 	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
@@ -516,9 +506,10 @@ int sys_mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
 	return sret;
 }
 
-int sys_pkey_alloc(unsigned long flags, unsigned long init_val)
+int sys_pkey_alloc(unsigned long flags, u64 init_val)
 {
 	int ret = syscall(SYS_pkey_alloc, flags, init_val);
+
 	dprintf1("%s(flags=%lx, init_val=%lx) syscall ret: %d errno: %d\n",
 			__func__, flags, init_val, ret, errno);
 	return ret;
@@ -542,7 +533,7 @@ void pkey_set_shadow(u32 key, u64 init_val)
 int alloc_pkey(void)
 {
 	int ret;
-	unsigned long init_val = 0x0;
+	u64 init_val = 0x0;
 
 	dprintf1("%s()::%d, pkey_reg: 0x%016lx shadow: %016lx\n", __func__,
 			__LINE__, __rdpkey_reg(), shadow_pkey_reg);
@@ -692,7 +683,9 @@ void record_pkey_malloc(void *ptr, long size)
 		/* every record is full */
 		size_t old_nr_records = nr_pkey_malloc_records;
 		size_t new_nr_records = (nr_pkey_malloc_records * 2 + 1);
-		size_t new_size = new_nr_records * sizeof(struct pkey_malloc_record);
+		size_t new_size = new_nr_records *
+				sizeof(struct pkey_malloc_record);
+
 		dprintf2("new_nr_records: %zd\n", new_nr_records);
 		dprintf2("new_size: %zd\n", new_size);
 		pkey_malloc_records = realloc(pkey_malloc_records, new_size);
@@ -716,9 +709,11 @@ void free_pkey_malloc(void *ptr)
 {
 	long i;
 	int ret;
+
 	dprintf3("%s(%p)\n", __func__, ptr);
 	for (i = 0; i < nr_pkey_malloc_records; i++) {
 		struct pkey_malloc_record *rec = &pkey_malloc_records[i];
+
 		dprintf4("looking for ptr %p at record[%ld/%p]: {%p, %ld}\n",
 				ptr, i, rec, rec->ptr, rec->size);
 		if ((ptr <  rec->ptr) ||
@@ -799,11 +794,13 @@ void setup_hugetlbfs(void)
 	char buf[] = "123";
 
 	if (geteuid() != 0) {
-		fprintf(stderr, "WARNING: not run as root, can not do hugetlb test\n");
+		fprintf(stderr,
+			"WARNING: not run as root, can not do hugetlb test\n");
 		return;
 	}
 
-	cat_into_file(__stringify(GET_NR_HUGE_PAGES), "/proc/sys/vm/nr_hugepages");
+	cat_into_file(__stringify(GET_NR_HUGE_PAGES),
+				"/proc/sys/vm/nr_hugepages");
 
 	/*
 	 * Now go make sure that we got the pages and that they
@@ -824,7 +821,8 @@ void setup_hugetlbfs(void)
 	}
 
 	if (atoi(buf) != GET_NR_HUGE_PAGES) {
-		fprintf(stderr, "could not confirm 2M pages, got: '%s' expected %d\n",
+		fprintf(stderr, "could not confirm 2M pages, got:"
+			       " '%s' expected %d\n",
 			buf, GET_NR_HUGE_PAGES);
 		return;
 	}
@@ -957,6 +955,7 @@ void __save_test_fd(int fd)
 int get_test_read_fd(void)
 {
 	int test_fd = open("/etc/passwd", O_RDONLY);
+
 	__save_test_fd(test_fd);
 	return test_fd;
 }
@@ -998,7 +997,8 @@ void test_read_of_access_disabled_region(int *ptr, u16 pkey)
 {
 	int ptr_contents;
 
-	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n", pkey, ptr);
+	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n",
+			 pkey, ptr);
 	rdpkey_reg();
 	pkey_access_deny(pkey);
 	ptr_contents = read_ptr(ptr);
@@ -1120,13 +1120,14 @@ void test_pkey_syscalls_bad_args(int *ptr, u16 pkey)
 /* Assumes that all pkeys other than 'pkey' are unallocated */
 void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 {
-	int err;
+	int err = 0;
 	int allocated_pkeys[NR_PKEYS] = {0};
 	int nr_allocated_pkeys = 0;
 	int i;
 
 	for (i = 0; i < NR_PKEYS*2; i++) {
 		int new_pkey;
+
 		dprintf1("%s() alloc loop: %d\n", __func__, i);
 		new_pkey = alloc_pkey();
 		dprintf4("%s()::%d, err: %d pkey_reg: 0x%016lx "
@@ -1134,9 +1135,11 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 			__func__, __LINE__, err, __rdpkey_reg(),
 			shadow_pkey_reg);
 		rdpkey_reg(); /* for shadow checking */
-		dprintf2("%s() errno: %d ENOSPC: %d\n", __func__, errno, ENOSPC);
+		dprintf2("%s() errno: %d ENOSPC: %d\n",
+				__func__, errno, ENOSPC);
 		if ((new_pkey == -1) && (errno == ENOSPC)) {
-			dprintf2("%s() failed to allocate pkey after %d tries\n",
+			dprintf2("%s() failed to allocate pkey "
+					"after %d tries\n",
 				__func__, nr_allocated_pkeys);
 			break;
 		}
@@ -1338,7 +1341,8 @@ void run_tests_once(void)
 		tracing_off();
 		close_test_fds();
 
-		printf("test %2d PASSED (iteration %d)\n", test_nr, iteration_nr);
+		printf("test %2d PASSED (iteration %d)\n",
+				test_nr, iteration_nr);
 		dprintf1("======================\n\n");
 	}
 	iteration_nr++;
@@ -1350,7 +1354,7 @@ int main(void)
 
 	setup_handlers();
 
-	printf("has pku: %d\n", cpu_has_pku());
+	printf("has pkey: %d\n", cpu_has_pku());
 
 	if (!cpu_has_pku()) {
 		int size = PAGE_SIZE;
@@ -1358,7 +1362,8 @@ int main(void)
 
 		printf("running PKEY tests for unsupported CPU/OS\n");
 
-		ptr  = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+		ptr  = mmap(NULL, size, PROT_NONE,
+				MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
 		assert(ptr != (void *)-1);
 		test_mprotect_pkey_on_unsupported_cpu(ptr, 1);
 		exit(0);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 44/51] selftest/vm: powerpc implementation for generic abstraction
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Introduce the powerpc implementation of the different
abstractions.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/pkey-helpers.h    |  109 ++++++++++++++++++++++----
 tools/testing/selftests/vm/protection_keys.c |   38 ++++++----
 2 files changed, 117 insertions(+), 30 deletions(-)
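
One visible difference between the two implementations is where a key's
bits sit in the register. The sketch below puts both shift calculations
from this patch side by side so they can be compared on any machine. The
_x86 and _ppc64 suffixes, the X86_/PPC64_ prefixes and the pkey_mask_*()
helpers are hypothetical names used only here; the constant values are
the ones defined in the patch.

```c
#include <stdint.h>

#define PKEY_BITS_PER_PKEY		2
#define X86_NR_PKEYS			16
#define PPC64_NR_PKEYS			32
#define X86_PKEY_DISABLE_ACCESS		0x1
#define PPC64_PKEY_DISABLE_ACCESS	0x3	/* disable read and write */

/* x86: key 0 occupies the least significant bit pair. */
static inline uint32_t pkey_to_shift_x86(int pkey)
{
	return pkey * PKEY_BITS_PER_PKEY;
}

/* powerpc64: key 0 occupies the most significant bit pair. */
static inline uint32_t pkey_to_shift_ppc64(int pkey)
{
	return (PPC64_NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
}

/* Mask that disables access for 'pkey' in the 32-bit x86 register. */
static inline uint32_t pkey_mask_x86(int pkey)
{
	return (uint32_t)X86_PKEY_DISABLE_ACCESS << pkey_to_shift_x86(pkey);
}

/* Mask that disables access for 'pkey' in the 64-bit powerpc64 register. */
static inline uint64_t pkey_mask_ppc64(int pkey)
{
	return (uint64_t)PPC64_PKEY_DISABLE_ACCESS << pkey_to_shift_ppc64(pkey);
}
```

For key 1 this gives a shift of 2 on x86 and 60 on powerpc64, which is
why the register type has to be 64 bits wide on powerpc64.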

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 30755be..f764d66 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -18,27 +18,54 @@
 #define u16 uint16_t
 #define u32 uint32_t
 #define u64 uint64_t
-#define pkey_reg_t u32
 
-#ifdef __i386__
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+
+#ifdef __i386__ /* arch */
 #define SYS_mprotect_key 380
-#define SYS_pkey_alloc	 381
-#define SYS_pkey_free	 382
+#define SYS_pkey_alloc   381
+#define SYS_pkey_free    382
 #define REG_IP_IDX REG_EIP
 #define si_pkey_offset 0x14
-#else
+#elif __x86_64__
 #define SYS_mprotect_key 329
-#define SYS_pkey_alloc	 330
-#define SYS_pkey_free	 331
+#define SYS_pkey_alloc   330
+#define SYS_pkey_free    331
 #define REG_IP_IDX REG_RIP
 #define si_pkey_offset 0x20
-#endif
+#endif /* __x86_64__ */
+
+#define NR_PKEYS		16
+#define NR_RESERVED_PKEYS	1
+#define PKEY_BITS_PER_PKEY	2
+#define PKEY_DISABLE_ACCESS	0x1
+#define PKEY_DISABLE_WRITE	0x2
+#define HPAGE_SIZE		(1UL<<21)
+#define pkey_reg_t u32
 
-#define NR_PKEYS 16
-#define PKEY_BITS_PER_PKEY 2
-#define PKEY_DISABLE_ACCESS    0x1
-#define PKEY_DISABLE_WRITE     0x2
-#define HPAGE_SIZE	(1UL<<21)
+#elif __powerpc64__ /* arch */
+
+#define SYS_mprotect_key 386
+#define SYS_pkey_alloc	 384
+#define SYS_pkey_free	 385
+#define si_pkey_offset	0x20
+#define REG_IP_IDX PT_NIP
+#define REG_TRAPNO PT_TRAP
+#define gregs gp_regs
+#define fpregs fp_regs
+
+#define NR_PKEYS		32
+#define NR_RESERVED_PKEYS_4K	26
+#define NR_RESERVED_PKEYS_64K	3
+#define PKEY_BITS_PER_PKEY	2
+#define PKEY_DISABLE_ACCESS	0x3  /* disable read and write */
+#define PKEY_DISABLE_WRITE	0x2
+#define HPAGE_SIZE		(1UL<<24)
+#define pkey_reg_t u64
+
+#else /* arch */
+	NOT SUPPORTED
+#endif /* arch */
 
 #ifndef DEBUG_LEVEL
 #define DEBUG_LEVEL 0
@@ -47,7 +74,11 @@
 
 static inline u32 pkey_to_shift(int pkey)
 {
+#if defined(__i386__) || defined(__x86_64__) /* arch */
 	return pkey * PKEY_BITS_PER_PKEY;
+#elif __powerpc64__ /* arch */
+	return (NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
+#endif /* arch */
 }
 
 static inline pkey_reg_t reset_bits(int pkey, pkey_reg_t bits)
@@ -108,6 +139,7 @@ static inline void sigsafe_printf(const char *format, ...)
 extern pkey_reg_t shadow_pkey_reg;
 static inline pkey_reg_t __rdpkey_reg(void)
 {
+#if defined(__i386__) || defined(__x86_64__) /* arch */
 	unsigned int eax, edx;
 	unsigned int ecx = 0;
 	pkey_reg_t pkey_reg;
@@ -115,7 +147,13 @@ static inline pkey_reg_t __rdpkey_reg(void)
 	asm volatile(".byte 0x0f,0x01,0xee\n\t"
 		     : "=a" (eax), "=d" (edx)
 		     : "c" (ecx));
-	pkey_reg = eax;
+#elif __powerpc64__ /* arch */
+	pkey_reg_t eax;
+	pkey_reg_t pkey_reg;
+
+	asm volatile("mfspr %0, 0xd" : "=r" ((pkey_reg_t)(eax)));
+#endif /* arch */
+	pkey_reg = (pkey_reg_t)eax;
 	return pkey_reg;
 }
 
@@ -135,6 +173,7 @@ static inline pkey_reg_t _rdpkey_reg(int line)
 static inline void __wrpkey_reg(pkey_reg_t pkey_reg)
 {
 	pkey_reg_t eax = pkey_reg;
+#if defined(__i386__) || defined(__x86_64__) /* arch */
 	pkey_reg_t ecx = 0;
 	pkey_reg_t edx = 0;
 
@@ -143,6 +182,14 @@ static inline void __wrpkey_reg(pkey_reg_t pkey_reg)
 	asm volatile(".byte 0x0f,0x01,0xef\n\t"
 		     : : "a" (eax), "c" (ecx), "d" (edx));
 	assert(pkey_reg == __rdpkey_reg());
+
+#elif __powerpc64__ /* arch */
+	dprintf4("%s() changing %llx to %llx\n",
+			 __func__, __rdpkey_reg(), pkey_reg);
+	asm volatile("mtspr 0xd, %0" : : "r" ((unsigned long)(eax)) : "memory");
+#endif /* arch */
+	dprintf4("%s() pkey register after changing %016lx to %016lx\n",
+			 __func__, __rdpkey_reg(), pkey_reg);
 }
 
 static inline void wrpkey_reg(pkey_reg_t pkey_reg)
@@ -189,6 +236,8 @@ static inline void __pkey_write_allow(int pkey, int do_allow_write)
 	dprintf4("pkey_reg now: %08x\n", rdpkey_reg());
 }
 
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+
 #define PAGE_SIZE 4096
 #define MB	(1<<20)
 
@@ -271,8 +320,18 @@ static inline void __page_o_noops(void)
 	/* 8-bytes of instruction * 512 bytes = 1 page */
 	asm(".rept 512 ; nopl 0x7eeeeeee(%eax) ; .endr");
 }
+#elif __powerpc64__ /* arch */
 
-#endif /* _PKEYS_HELPER_H */
+#define PAGE_SIZE (0x1UL << 16)
+static inline int cpu_has_pku(void)
+{
+	return 1;
+}
+
+/* 4 bytes of instruction * 16384 instructions = 1 page (64K) */
+#define __page_o_noops() asm(".rept 16384 ; nop; .endr")
+
+#endif /* arch */
 
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
 #define ALIGN_UP(x, align_to)	(((x) + ((align_to)-1)) & ~((align_to)-1))
@@ -304,11 +363,29 @@ static inline void __page_o_noops(void)
 
 static inline int open_hugepage_file(int flag)
 {
-	return open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages",
+	int fd;
+
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+	fd = open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages",
 		 O_RDONLY);
+#elif __powerpc64__ /* arch */
+	fd = open("/sys/kernel/mm/hugepages/hugepages-16384kB/nr_hugepages",
+		O_RDONLY);
+#else /* arch */
+	NOT SUPPORTED
+#endif /* arch */
+	return fd;
 }
 
 static inline int get_start_key(void)
 {
+#if defined(__i386__) || defined(__x86_64__) /* arch */
 	return 1;
+#elif __powerpc64__ /* arch */
+	return 0;
+#else /* arch */
+	NOT SUPPORTED
+#endif /* arch */
 }
+
+#endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 3868434..4fe42cc 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -186,17 +186,20 @@ void dump_mem(void *dumpme, int len_bytes)
 
 int pkey_faults;
 int last_si_pkey = -1;
+void pkey_access_allow(int pkey);
 void signal_handler(int signum, siginfo_t *si, void *vucontext)
 {
 	ucontext_t *uctxt = vucontext;
 	int trapno;
 	unsigned long ip;
 	char *fpregs;
+#if defined(__i386__) || defined(__x86_64__) /* arch */
 	pkey_reg_t *pkey_reg_ptr;
-	u32 si_pkey;
-	u32 *si_pkey_ptr;
 	int pkey_reg_offset;
 	fpregset_t fpregset;
+#endif /* defined(__i386__) || defined(__x86_64__) */
+	u32 si_pkey;
+	u32 *si_pkey_ptr;
 
 	dprint_in_signal = 1;
 	dprintf1(">>>>===============SIGSEGV============================\n");
@@ -206,12 +209,14 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 
 	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
 	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
-	fpregset = uctxt->uc_mcontext.fpregs;
-	fpregs = (void *)fpregset;
+	fpregs = (char *) uctxt->uc_mcontext.fpregs;
 
 	dprintf2("%s() trapno: %d ip: 0x%016lx info->si_code: %s/%d\n",
 			__func__, trapno, ip, si_code_str(si->si_code),
 			si->si_code);
+
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+
 #ifdef __i386__
 	/*
 	 * 32-bit has some extra padding so that userspace can tell whether
@@ -219,20 +224,21 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	 * state.  We just assume that it is here.
 	 */
 	fpregs += 0x70;
-#endif
-	pkey_reg_offset = pkey_reg_xstate_offset();
-	pkey_reg_ptr = (void *)(&fpregs[pkey_reg_offset]);
+#endif /* __i386__ */
 
-	dprintf1("siginfo: %p\n", si);
-	dprintf1(" fpregs: %p\n", fpregs);
+	pkey_reg_ptr = (void *)(&fpregs[pkey_reg_xstate_offset()]);
 	/*
-	 * If we got a PKEY fault, we *HAVE* to have at least one bit set in
+	 * If we got a key fault, we *HAVE* to have at least one bit set in
 	 * here.
 	 */
 	dprintf1("pkey_reg_xstate_offset: %d\n", pkey_reg_xstate_offset());
 	if (DEBUG_LEVEL > 4)
 		dump_mem(pkey_reg_ptr - 128, 256);
 	pkey_assert(*pkey_reg_ptr);
+#endif /* defined(__i386__) || defined(__x86_64__) */
+
+	dprintf1("siginfo: %p\n", si);
+	dprintf1(" fpregs: %p\n", fpregs);
 
 	si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
 	dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
@@ -248,19 +254,23 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 		exit(4);
 	}
 
-	dprintf1("signal pkey_reg from xsave: %016lx\n", *pkey_reg_ptr);
 	/*
 	 * need __rdpkey_reg() version so we do not do shadow_pkey_reg
 	 * checking
 	 */
 	dprintf1("signal pkey_reg from  pkey_reg: %016lx\n", __rdpkey_reg());
-	dprintf1("si_pkey from siginfo: %jx\n", si_pkey);
-	*(u64 *)pkey_reg_ptr = 0x00000000;
+	dprintf1("si_pkey from siginfo: %x\n", si_pkey);
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+	dprintf1("signal pkey_reg from xsave: %016lx\n", *pkey_reg_ptr);
+	*(u64 *)pkey_reg_ptr &= reset_bits(si_pkey, PKEY_DISABLE_ACCESS);
+#elif __powerpc64__
+	pkey_access_allow(si_pkey);
+#endif
+	shadow_pkey_reg &= reset_bits(si_pkey, PKEY_DISABLE_ACCESS);
 	dprintf1("WARNING: set PKEY_REG=0 to allow faulting instruction "
 			"to continue\n");
 	pkey_faults++;
 	dprintf1("<<<<==================================================\n");
-	return;
 }
 
 int wait_all_children(void)
-- 
1.7.1
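
The pkey_to_shift() change above reflects the opposite bit order of the two
key registers. A minimal sketch of that arithmetic follows, assuming the
layout implied by the patch's own formulas: key 0 in the lowest two bits on
x86, key 0 in the highest two bits of the 64-bit register on powerpc. The
names x86_pkey_to_shift(), ppc_pkey_to_shift() and pkey_mask() are
hypothetical and exist only for this illustration; they are not part of the
selftest.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_BITS_PER_PKEY	2
#define X86_NR_PKEYS		16	/* NR_PKEYS in the x86 branch of the patch */
#define PPC_NR_PKEYS		32	/* NR_PKEYS in the powerpc64 branch */

/* x86: key 0 occupies the lowest two bits of the 32-bit register. */
static inline uint32_t x86_pkey_to_shift(int pkey)
{
	assert(pkey >= 0 && pkey < X86_NR_PKEYS);
	return (uint32_t)pkey * PKEY_BITS_PER_PKEY;
}

/* powerpc: key 0 occupies the highest two bits of the 64-bit register. */
static inline uint32_t ppc_pkey_to_shift(int pkey)
{
	assert(pkey >= 0 && pkey < PPC_NR_PKEYS);
	return (uint32_t)(PPC_NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
}

/* Mask covering the two permission bits that start at 'shift'. */
static inline uint64_t pkey_mask(uint32_t shift)
{
	assert(shift <= 62);
	return (uint64_t)0x3 << shift;
}
```

With these definitions, key 0 maps to shift 0 on x86 and to shift 62 on
powerpc, while key 31 on powerpc maps to shift 0.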

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 45/51] selftest/vm: fix an assertion in test_pkey_alloc_exhaust()
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

The maximum number of keys that can be allocated must take
into account that some keys are reserved by the architecture
for specific purposes and hence cannot be allocated.

Fix the assertion in test_pkey_alloc_exhaust() accordingly.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/pkey-helpers.h    |   14 ++++++++++++++
 tools/testing/selftests/vm/protection_keys.c |    9 ++++-----
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index f764d66..3ea3e06 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -388,4 +388,18 @@ static inline int get_start_key(void)
 #endif /* arch */
 }
 
+static inline int arch_reserved_keys(void)
+{
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+	return NR_RESERVED_PKEYS;
+#elif __powerpc64__ /* arch */
+	if (sysconf(_SC_PAGESIZE) == 4096)
+		return NR_RESERVED_PKEYS_4K;
+	else
+		return NR_RESERVED_PKEYS_64K;
+#else /* arch */
+	NOT SUPPORTED
+#endif /* arch */
+}
+
 #endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 4fe42cc..8f0dd94 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1166,12 +1166,11 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 	pkey_assert(i < NR_PKEYS*2);
 
 	/*
-	 * There are 16 pkeys supported in hardware.  One is taken
-	 * up for the default (0) and another can be taken up by
-	 * an execute-only mapping.  Ensure that we can allocate
-	 * at least 14 (16-2).
+	 * There are NR_PKEYS pkeys supported in hardware, of which
+	 * arch_reserved_keys() are reserved. One more can be taken up by an
+	 * execute-only mapping. Ensure that we can allocate at least the rest.
 	 */
-	pkey_assert(i >= NR_PKEYS-2);
+	pkey_assert(i >= (NR_PKEYS-arch_reserved_keys()-1));
 
 	for (i = 0; i < nr_allocated_pkeys; i++) {
 		err = sys_pkey_free(allocated_pkeys[i]);
-- 
1.7.1
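
The lower bound in the new assertion can be checked by hand. The sketch
below is only an illustration of that arithmetic, using the constants that
patch 44/51 defines (NR_PKEYS, NR_RESERVED_PKEYS, NR_RESERVED_PKEYS_4K,
NR_RESERVED_PKEYS_64K); the helper name min_allocatable_keys() is
hypothetical and does not appear in the selftest.

```c
#include <assert.h>

/*
 * Lower bound used by the assertion: total keys, minus the keys the
 * architecture reserves, minus one key that an execute-only mapping
 * may have consumed.
 */
static inline int min_allocatable_keys(int nr_pkeys, int nr_reserved)
{
	assert(nr_reserved >= 0 && nr_reserved < nr_pkeys);
	return nr_pkeys - nr_reserved - 1;
}
```

With the values from the patch this gives 16 - 1 - 1 = 14 on x86 (the same
bound as the old hard-coded check), 32 - 26 - 1 = 5 on powerpc with 4K
pages, and 32 - 3 - 1 = 28 on powerpc with 64K pages.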

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 46/51] selftest/vm: associate key on a mapped page and detect access violation
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Detect an access violation on a page to which an access-disabled
key is associated well after the page is mapped.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/protection_keys.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 8f0dd94..998a44f 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1015,6 +1015,24 @@ void test_read_of_access_disabled_region(int *ptr, u16 pkey)
 	dprintf1("*ptr: %d\n", ptr_contents);
 	expected_pkey_fault(pkey);
 }
+
+void test_read_of_access_disabled_region_with_page_already_mapped(int *ptr,
+		u16 pkey)
+{
+	int ptr_contents;
+
+	dprintf1("disabling access to PKEY[%02d], doing read @ %p\n",
+				pkey, ptr);
+	ptr_contents = read_ptr(ptr);
+	dprintf1("reading ptr before disabling the read : %d\n",
+			ptr_contents);
+	rdpkey_reg();
+	pkey_access_deny(pkey);
+	ptr_contents = read_ptr(ptr);
+	dprintf1("*ptr: %d\n", ptr_contents);
+	expected_pkey_fault(pkey);
+}
+
 void test_write_of_write_disabled_region(int *ptr, u16 pkey)
 {
 	dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
@@ -1309,6 +1327,7 @@ void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
 void (*pkey_tests[])(int *ptr, u16 pkey) = {
 	test_read_of_write_disabled_region,
 	test_read_of_access_disabled_region,
+	test_read_of_access_disabled_region_with_page_already_mapped,
 	test_write_of_write_disabled_region,
 	test_write_of_access_disabled_region,
 	test_kernel_write_of_access_disabled_region,
-- 
1.7.1
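
The ordering this test exercises (touch the page, then deny access, then
touch it again) can be illustrated with a small software model of the key
register. This is a sketch only: it uses the x86 bit values from
pkey-helpers.h (PKEY_DISABLE_ACCESS 0x1, PKEY_DISABLE_WRITE 0x2) and the
x86 bit order, it performs no real memory protection, and the model_*()
helpers are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_PKEYS		16
#define PKEY_BITS_PER_PKEY	2
#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2

/* Return the two permission bits of 'pkey' in register value 'reg'. */
static inline uint32_t model_get_rights(uint32_t reg, int pkey)
{
	assert(pkey >= 0 && pkey < NR_PKEYS);
	return (reg >> (pkey * PKEY_BITS_PER_PKEY)) & 0x3u;
}

/* Return a copy of 'reg' with the rights of 'pkey' replaced by 'rights'. */
static inline uint32_t model_set_rights(uint32_t reg, int pkey, uint32_t rights)
{
	uint32_t shift;

	assert(pkey >= 0 && pkey < NR_PKEYS);
	shift = (uint32_t)pkey * PKEY_BITS_PER_PKEY;
	reg &= ~(0x3u << shift);
	return reg | ((rights & 0x3u) << shift);
}

/* A read is allowed unless access is disabled for the key. */
static inline bool model_read_allowed(uint32_t reg, int pkey)
{
	return !(model_get_rights(reg, pkey) & PKEY_DISABLE_ACCESS);
}

/* A write is allowed only if neither access nor write is disabled. */
static inline bool model_write_allowed(uint32_t reg, int pkey)
{
	return !(model_get_rights(reg, pkey) &
		 (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
}
```

In this model a read through key 1 succeeds while the register is 0 and
fails once PKEY_DISABLE_ACCESS is set for key 1, regardless of whether the
page was touched beforehand. That is the behaviour the new test expects
from the hardware.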

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 47/51] selftest/vm: associate key on a mapped page and detect write violation
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Detect a write violation on a page to which a write-disabled
key is associated well after the page is mapped.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 tools/testing/selftests/vm/protection_keys.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 998a44f..0b7b826 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1033,6 +1033,17 @@ void test_read_of_access_disabled_region_with_page_already_mapped(int *ptr,
 	expected_pkey_fault(pkey);
 }
 
+void test_write_of_write_disabled_region_with_page_already_mapped(int *ptr,
+		u16 pkey)
+{
+	*ptr = __LINE__;
+	dprintf1("disabling write access to PKEY[%02d] after accessing "
+		"the page, doing write\n", pkey);
+	pkey_write_deny(pkey);
+	*ptr = __LINE__;
+	expected_pkey_fault(pkey);
+}
+
 void test_write_of_write_disabled_region(int *ptr, u16 pkey)
 {
 	dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
@@ -1329,6 +1340,7 @@ void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
 	test_read_of_access_disabled_region,
 	test_read_of_access_disabled_region_with_page_already_mapped,
 	test_write_of_write_disabled_region,
+	test_write_of_write_disabled_region_with_page_already_mapped,
 	test_write_of_access_disabled_region,
 	test_kernel_write_of_access_disabled_region,
 	test_kernel_write_of_write_disabled_region,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 48/51] selftest/vm: detect write violation on a mapped access-denied-key page
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Detect a write violation on a page whose associated key is
access-disabled after the page has already been mapped.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
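For illustration only, not part of this patch: a minimal model of which
accesses are expected to fault for a given set of key rights. It assumes
the generic PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE values from the
Linux uapi headers; the helper names pkey_read_faults() and
pkey_write_faults() are hypothetical. The new test expects its second
store to fault because an access-disabled key denies writes as well as
reads.

```c
#include <stdbool.h>

/* Generic rights bits, assumed to match the Linux uapi definitions. */
#define PKEY_DISABLE_ACCESS 0x1
#define PKEY_DISABLE_WRITE  0x2

/* Hypothetical helper: a load faults only when all access is disabled. */
static bool pkey_read_faults(unsigned int rights)
{
	return rights & PKEY_DISABLE_ACCESS;
}

/* Hypothetical helper: a store faults when either bit is set. */
static bool pkey_write_faults(unsigned int rights)
{
	return rights & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE);
}
```

In this model a write-disabled key still allows loads, while an
access-disabled key rejects both loads and stores.
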
 tools/testing/selftests/vm/protection_keys.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 0b7b826..c790bff 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1058,6 +1058,18 @@ void test_write_of_access_disabled_region(int *ptr, u16 pkey)
 	*ptr = __LINE__;
 	expected_pkey_fault(pkey);
 }
+
+void test_write_of_access_disabled_region_with_page_already_mapped(int *ptr,
+			u16 pkey)
+{
+	*ptr = __LINE__;
+	dprintf1("disabling access to PKEY[%02d] after accessing "
+		"the page, doing write\n", pkey);
+	pkey_access_deny(pkey);
+	*ptr = __LINE__;
+	expected_pkey_fault(pkey);
+}
+
 void test_kernel_write_of_access_disabled_region(int *ptr, u16 pkey)
 {
 	int ret;
@@ -1342,6 +1354,7 @@ void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
 	test_write_of_write_disabled_region,
 	test_write_of_write_disabled_region_with_page_already_mapped,
 	test_write_of_access_disabled_region,
+	test_write_of_access_disabled_region_with_page_already_mapped,
 	test_kernel_write_of_access_disabled_region,
 	test_kernel_write_of_write_disabled_region,
 	test_kernel_gup_of_access_disabled_region,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 49/51] selftest/vm: sub-page allocator
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Introduce a new allocator that allocates 4k hardware pages to back a
64k Linux page. This allocator is applicable only on powerpc.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
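For illustration only, not part of this patch: a small sketch of the
page arithmetic behind the allocator, assuming a 64k linux page and a 4k
hardware page. hw_pages_for() and the two size macros are hypothetical
names, not selftest helpers.

```c
#include <assert.h>
#include <stddef.h>

#define LINUX_PAGE_SIZE (64 * 1024)	/* assumed 64k linux page */
#define HW_PAGE_SIZE    (4 * 1024)	/* assumed 4k hardware page */

/*
 * Hypothetical helper: number of 4k hardware pages backing a region of
 * 'size' bytes, after rounding the region up to whole 64k linux pages.
 */
static size_t hw_pages_for(size_t size)
{
	size_t linux_pages;

	assert(size > 0);
	linux_pages = (size + LINUX_PAGE_SIZE - 1) / LINUX_PAGE_SIZE;
	return linux_pages * (LINUX_PAGE_SIZE / HW_PAGE_SIZE);
}
```

Under these assumptions a single 64k linux page is backed by 16
hardware pages.
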
 tools/testing/selftests/vm/protection_keys.c |   30 ++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index c790bff..7b3649f 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -765,6 +765,35 @@ void free_pkey_malloc(void *ptr)
 	return ptr;
 }
 
+void *malloc_pkey_with_mprotect_subpage(long size, int prot, u16 pkey)
+{
+#ifdef __powerpc64__
+	void *ptr;
+	int ret;
+
+	dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
+			size, prot, pkey);
+	pkey_assert(pkey < NR_PKEYS);
+	ptr = mmap(NULL, size, prot, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	pkey_assert(ptr != (void *)-1);
+
+	ret = syscall(__NR_subpage_prot, ptr, size, NULL);
+	if (ret) {
+		perror("subpage_prot");
+		return PTR_ERR_ENOTSUP;
+	}
+
+	ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
+	pkey_assert(!ret);
+	record_pkey_malloc(ptr, size);
+
+	dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
+	return ptr;
+#else /*  __powerpc64__ */
+	return PTR_ERR_ENOTSUP;
+#endif /*  __powerpc64__ */
+}
+
 void *malloc_pkey_anon_huge(long size, int prot, u16 pkey)
 {
 	int ret;
@@ -887,6 +916,7 @@ void setup_hugetlbfs(void)
 void *(*pkey_malloc[])(long size, int prot, u16 pkey) = {
 
 	malloc_pkey_with_mprotect,
+	malloc_pkey_with_mprotect_subpage,
 	malloc_pkey_anon_huge,
 	malloc_pkey_hugetlb
 /* can not do direct with the pkey_mprotect() API:
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 50/51] selftests/powerpc: Add ptrace tests for Protection Key register
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

From: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>

This test exercises ptrace read and write access to the AMR, and checks
that ptrace refuses writes to the IAMR and UAMOR.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
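For illustration only, not part of this patch: a sketch of how
pkeyshift() places each key's two-bit field in the 64-bit AMR, with
key 0 in the most significant bits. The macros mirror the ones in
ptrace-pkey.c; amr_deny_all() is a hypothetical helper that corresponds
to the "3ul << pkeyshift(pkey)" expressions in the test, and the bound
of 32 keys is only what follows from 64 bits at 2 bits per key.

```c
#include <assert.h>
#include <stdint.h>

#define AMR_BITS_PER_PKEY 2
#define PKEY_REG_BITS (sizeof(uint64_t) * 8)

/* Bit offset of a key's two-bit field; key 0 occupies the top two bits. */
static unsigned int pkeyshift(int pkey)
{
	assert(pkey >= 0 && pkey < 32);
	return PKEY_REG_BITS - ((pkey + 1) * AMR_BITS_PER_PKEY);
}

/* Hypothetical helper: mask with both bits of the key's field set. */
static uint64_t amr_deny_all(int pkey)
{
	return UINT64_C(3) << pkeyshift(pkey);
}
```

For example, key 0 maps to shift 62 and key 31 to shift 0.
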
 tools/testing/selftests/powerpc/include/reg.h      |    1 +
 tools/testing/selftests/powerpc/ptrace/Makefile    |    5 +-
 .../testing/selftests/powerpc/ptrace/ptrace-pkey.c |  443 ++++++++++++++++++++
 3 files changed, 448 insertions(+), 1 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c

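For illustration only, not part of this patch: a stripped-down sketch of
the prod/wait handshake the test uses between parent and child. To stay
short it uses an anonymous shared mmap() rather than the SysV shared
memory segment used by the test, and struct handshake and
run_handshake() are hypothetical names.

```c
#define _DEFAULT_SOURCE
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct handshake {
	sem_t sem_parent;	/* the parent waits on this one */
	sem_t sem_child;	/* the child waits on this one */
	unsigned long value;	/* data handed back and forth */
};

/* Returns the value the child confirmed after the parent's update, or -1. */
static long run_handshake(void)
{
	struct handshake *hs;
	long result = -1;
	int status;
	pid_t pid;

	hs = mmap(NULL, sizeof(*hs), PROT_READ | PROT_WRITE,
		  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (hs == MAP_FAILED)
		return -1;

	/* pshared = 1 so that both semaphores work across fork(). */
	if (sem_init(&hs->sem_parent, 1, 0) || sem_init(&hs->sem_child, 1, 0))
		goto out;

	pid = fork();
	if (pid < 0)
		goto out;

	if (pid == 0) {
		hs->value = 1;			/* child publishes a value */
		sem_post(&hs->sem_parent);	/* prod the parent */
		sem_wait(&hs->sem_child);	/* wait for the parent's update */
		_exit(hs->value == 2 ? 0 : 1);
	}

	sem_wait(&hs->sem_parent);		/* wait for the child */
	if (hs->value == 1)
		hs->value = 2;			/* the parent's update */
	sem_post(&hs->sem_child);		/* prod the child */

	if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
	    WEXITSTATUS(status) == 0)
		result = hs->value;

	sem_destroy(&hs->sem_parent);
	sem_destroy(&hs->sem_child);
out:
	munmap(hs, sizeof(*hs));
	return result;
}
```

Each side posts the other's semaphore only after its own data is in
place, so neither side reads a value before it has been written.
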
diff --git a/tools/testing/selftests/powerpc/include/reg.h b/tools/testing/selftests/powerpc/include/reg.h
index 4afdebc..7f348c0 100644
--- a/tools/testing/selftests/powerpc/include/reg.h
+++ b/tools/testing/selftests/powerpc/include/reg.h
@@ -54,6 +54,7 @@
 #define SPRN_DSCR_PRIV 0x11	/* Privilege State DSCR */
 #define SPRN_DSCR      0x03	/* Data Stream Control Register */
 #define SPRN_PPR       896	/* Program Priority Register */
+#define SPRN_AMR       13	/* Authority Mask Register - problem state */
 
 /* TEXASR register bits */
 #define TEXASR_FC	0xFE00000000000000
diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile b/tools/testing/selftests/powerpc/ptrace/Makefile
index 4803052..fd896b2 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 TEST_PROGS := ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
               ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx \
-              ptrace-tm-spd-vsx ptrace-tm-spr
+              ptrace-tm-spd-vsx ptrace-tm-spr ptrace-pkey
 
 include ../../lib.mk
 
@@ -9,6 +9,9 @@ all: $(TEST_PROGS)
 
 CFLAGS += -m64 -I../../../../../usr/include -I../tm -mhtm -fno-pie
 
+ptrace-pkey: ../harness.c ../utils.c ../lib/reg.S ptrace.h ptrace-pkey.c
+	$(LINK.c) $^ $(LDLIBS) -pthread -o $@
+
 $(TEST_PROGS): ../harness.c ../utils.c ../lib/reg.S ptrace.h
 
 clean:
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
new file mode 100644
index 0000000..2e5b676
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
@@ -0,0 +1,443 @@
+/*
+ * Ptrace test for Memory Protection Key registers
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ * Copyright (C) 2017 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <semaphore.h>
+#include "ptrace.h"
+
+#ifndef __NR_pkey_alloc
+#define __NR_pkey_alloc		384
+#endif
+
+#ifndef __NR_pkey_free
+#define __NR_pkey_free		385
+#endif
+
+#ifndef NT_PPC_PKEY
+#define NT_PPC_PKEY		0x110
+#endif
+
+#ifndef PKEY_DISABLE_EXECUTE
+#define PKEY_DISABLE_EXECUTE	0x4
+#endif
+
+#define AMR_BITS_PER_PKEY 2
+#define PKEY_REG_BITS (sizeof(u64) * 8)
+#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey + 1) * AMR_BITS_PER_PKEY))
+
+static const char user_read[] = "[User Read (Running)]";
+static const char user_write[] = "[User Write (Running)]";
+static const char ptrace_read_running[] = "[Ptrace Read (Running)]";
+static const char ptrace_write_running[] = "[Ptrace Write (Running)]";
+
+/* Information shared between the parent and the child. */
+struct shared_info {
+	/* AMR value the parent expects to read from the child. */
+	unsigned long amr1;
+
+	/* AMR value the parent is expected to write to the child. */
+	unsigned long amr2;
+
+	/* AMR value that ptrace should refuse to write to the child. */
+	unsigned long amr3;
+
+	/* IAMR value the parent expects to read from the child. */
+	unsigned long expected_iamr;
+
+	/* UAMOR value the parent expects to read from the child. */
+	unsigned long expected_uamor;
+
+	/*
+	 * IAMR and UAMOR values that ptrace should refuse to write to the child
+	 * (even though they're valid ones) because userspace doesn't have
+	 * access to those registers.
+	 */
+	unsigned long new_iamr;
+	unsigned long new_uamor;
+
+	/* The parent waits on this semaphore. */
+	sem_t sem_parent;
+
+	/* If true, the child should give up as well. */
+	bool parent_gave_up;
+
+	/* The child waits on this semaphore. */
+	sem_t sem_child;
+
+	/* If true, the parent should give up as well. */
+	bool child_gave_up;
+};
+
+#define CHILD_FAIL_IF(x, info)						\
+	do {								\
+		if ((x)) {						\
+			fprintf(stderr,					\
+				"[FAIL] Test FAILED on line %d\n", __LINE__); \
+			(info)->child_gave_up = true;			\
+			prod_parent(info);				\
+			return 1;					\
+		}							\
+	} while (0)
+
+#define PARENT_FAIL_IF(x, info)						\
+	do {								\
+		if ((x)) {						\
+			fprintf(stderr,					\
+				"[FAIL] Test FAILED on line %d\n", __LINE__); \
+			(info)->parent_gave_up = true;			\
+			prod_child(info);				\
+			return 1;					\
+		}							\
+	} while (0)
+
+static int wait_child(struct shared_info *info)
+{
+	int ret;
+
+	/* Wait until the child prods us. */
+	ret = sem_wait(&info->sem_parent);
+	if (ret) {
+		perror("Error waiting for child");
+		return TEST_FAIL;
+	}
+
+	return info->child_gave_up ? TEST_FAIL : TEST_PASS;
+}
+
+static int prod_child(struct shared_info *info)
+{
+	int ret;
+
+	/* Unblock the child now. */
+	ret = sem_post(&info->sem_child);
+	if (ret) {
+		perror("Error prodding child");
+		return TEST_FAIL;
+	}
+
+	return TEST_PASS;
+}
+
+static int wait_parent(struct shared_info *info)
+{
+	int ret;
+
+	/* Wait until the parent prods us. */
+	ret = sem_wait(&info->sem_child);
+	if (ret) {
+		perror("Error waiting for parent");
+		return TEST_FAIL;
+	}
+
+	return info->parent_gave_up ? TEST_FAIL : TEST_PASS;
+}
+
+static int prod_parent(struct shared_info *info)
+{
+	int ret;
+
+	/* Unblock the parent now. */
+	ret = sem_post(&info->sem_parent);
+	if (ret) {
+		perror("Error prodding parent");
+		return TEST_FAIL;
+	}
+
+	return TEST_PASS;
+}
+
+static int sys_pkey_alloc(unsigned long flags, unsigned long init_access_rights)
+{
+	return syscall(__NR_pkey_alloc, flags, init_access_rights);
+}
+
+static int sys_pkey_free(int pkey)
+{
+	return syscall(__NR_pkey_free, pkey);
+}
+
+static int ptrace_read_regs(pid_t child, unsigned long regs[], int n)
+{
+	struct iovec iov;
+	long ret;
+
+	FAIL_IF(start_trace(child));
+
+	iov.iov_base = regs;
+	iov.iov_len = n * sizeof(unsigned long);
+
+	ret = ptrace(PTRACE_GETREGSET, child, NT_PPC_PKEY, &iov);
+	FAIL_IF(ret != 0);
+
+	FAIL_IF(stop_trace(child));
+
+	return TEST_PASS;
+}
+
+static long ptrace_write_regs(pid_t child, unsigned long regs[], int n)
+{
+	struct iovec iov;
+	long ret;
+
+	FAIL_IF(start_trace(child));
+
+	iov.iov_base = regs;
+	iov.iov_len = n * sizeof(unsigned long);
+
+	ret = ptrace(PTRACE_SETREGSET, child, NT_PPC_PKEY, &iov);
+
+	FAIL_IF(stop_trace(child));
+
+	return ret;
+}
+
+static int child(struct shared_info *info)
+{
+	unsigned long reg;
+	bool disable_execute = true;
+	int pkey1, pkey2, pkey3;
+	int ret;
+
+	/* Get some pkeys so that we can change their bits in the AMR. */
+	pkey1 = sys_pkey_alloc(0, PKEY_DISABLE_EXECUTE);
+	if (pkey1 < 0) {
+		pkey1 = sys_pkey_alloc(0, 0);
+		CHILD_FAIL_IF(pkey1 < 0, info);
+
+		disable_execute = false;
+	}
+
+	pkey2 = sys_pkey_alloc(0, 0);
+	CHILD_FAIL_IF(pkey2 < 0, info);
+
+	pkey3 = sys_pkey_alloc(0, 0);
+	CHILD_FAIL_IF(pkey3 < 0, info);
+
+	info->amr1 = 3ul << pkeyshift(pkey1);
+	info->amr2 = 3ul << pkeyshift(pkey2);
+	info->amr3 = info->amr2 | 3ul << pkeyshift(pkey3);
+
+	if (disable_execute)
+		info->expected_iamr = 1ul << pkeyshift(pkey1);
+	else
+		info->expected_iamr = 0;
+
+	info->expected_uamor = 3ul << pkeyshift(pkey1) |
+				3ul << pkeyshift(pkey2);
+	info->new_iamr = 1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2);
+	info->new_uamor = 3ul << pkeyshift(pkey1);
+
+	/*
+	 * We won't use pkey3. We just want a plausible but invalid key to test
+	 * whether ptrace will let us write to AMR bits we are not supposed to.
+	 *
+	 * This also tests whether the kernel restores the UAMOR permissions
+	 * after a key is freed.
+	 */
+	sys_pkey_free(pkey3);
+
+	printf("%-30s AMR: %016lx pkey1: %d pkey2: %d pkey3: %d\n",
+	       user_write, info->amr1, pkey1, pkey2, pkey3);
+
+	mtspr(SPRN_AMR, info->amr1);
+
+	/* Wait for parent to read our AMR value and write a new one. */
+	ret = prod_parent(info);
+	CHILD_FAIL_IF(ret, info);
+
+	ret = wait_parent(info);
+	if (ret)
+		return ret;
+
+	reg = mfspr(SPRN_AMR);
+
+	printf("%-30s AMR: %016lx\n", user_read, reg);
+
+	CHILD_FAIL_IF(reg != info->amr2, info);
+
+	/*
+	 * Wait for parent to try to write an invalid AMR value.
+	 */
+	ret = prod_parent(info);
+	CHILD_FAIL_IF(ret, info);
+
+	ret = wait_parent(info);
+	if (ret)
+		return ret;
+
+	reg = mfspr(SPRN_AMR);
+
+	printf("%-30s AMR: %016lx\n", user_read, reg);
+
+	CHILD_FAIL_IF(reg != info->amr2, info);
+
+	/*
+	 * Wait for parent to try to write an IAMR and a UAMOR value. We can't
+	 * verify them, but we can verify that the AMR didn't change.
+	 */
+	ret = prod_parent(info);
+	CHILD_FAIL_IF(ret, info);
+
+	ret = wait_parent(info);
+	if (ret)
+		return ret;
+
+	reg = mfspr(SPRN_AMR);
+
+	printf("%-30s AMR: %016lx\n", user_read, reg);
+
+	CHILD_FAIL_IF(reg != info->amr2, info);
+
+	/* Now let parent know that we are finished. */
+
+	ret = prod_parent(info);
+	CHILD_FAIL_IF(ret, info);
+
+	return TEST_PASS;
+}
+
+static int parent(struct shared_info *info, pid_t pid)
+{
+	unsigned long regs[4];
+	int ret, status;
+
+	ret = wait_child(info);
+	if (ret)
+		return ret;
+
+	/* Verify that we can read the pkey registers from the child. */
+	ret = ptrace_read_regs(pid, regs, 3);
+	PARENT_FAIL_IF(ret, info);
+
+	printf("%-30s AMR: %016lx IAMR: %016lx UAMOR: %016lx\n",
+	       ptrace_read_running, regs[0], regs[1], regs[2]);
+
+	PARENT_FAIL_IF(regs[0] != info->amr1, info);
+	PARENT_FAIL_IF(regs[1] != info->expected_iamr, info);
+	PARENT_FAIL_IF(regs[2] != info->expected_uamor, info);
+
+	/* Write valid AMR value in child. */
+	ret = ptrace_write_regs(pid, &info->amr2, 1);
+	PARENT_FAIL_IF(ret, info);
+
+	printf("%-30s AMR: %016lx\n", ptrace_write_running, info->amr2);
+
+	/* Wake up child so that it can verify it changed. */
+	ret = prod_child(info);
+	PARENT_FAIL_IF(ret, info);
+
+	ret = wait_child(info);
+	if (ret)
+		return ret;
+
+	/* Write invalid AMR value in child. */
+	ret = ptrace_write_regs(pid, &info->amr3, 1);
+	PARENT_FAIL_IF(ret, info);
+
+	printf("%-30s AMR: %016lx\n", ptrace_write_running, info->amr3);
+
+	/* Wake up child so that it can verify it didn't change. */
+	ret = prod_child(info);
+	PARENT_FAIL_IF(ret, info);
+
+	ret = wait_child(info);
+	if (ret)
+		return ret;
+
+	/* Try to write to IAMR. */
+	regs[0] = info->amr1;
+	regs[1] = info->new_iamr;
+	ret = ptrace_write_regs(pid, regs, 2);
+	PARENT_FAIL_IF(!ret, info);
+
+	printf("%-30s AMR: %016lx IAMR: %016lx\n",
+	       ptrace_write_running, regs[0], regs[1]);
+
+	/* Try to write to IAMR and UAMOR. */
+	regs[2] = info->new_uamor;
+	ret = ptrace_write_regs(pid, regs, 3);
+	PARENT_FAIL_IF(!ret, info);
+
+	printf("%-30s AMR: %016lx IAMR: %016lx UAMOR: %016lx\n",
+	       ptrace_write_running, regs[0], regs[1], regs[2]);
+
+	/* Verify that all registers still have their expected values. */
+	ret = ptrace_read_regs(pid, regs, 3);
+	PARENT_FAIL_IF(ret, info);
+
+	printf("%-30s AMR: %016lx IAMR: %016lx UAMOR: %016lx\n",
+	       ptrace_read_running, regs[0], regs[1], regs[2]);
+
+	PARENT_FAIL_IF(regs[0] != info->amr2, info);
+	PARENT_FAIL_IF(regs[1] != info->expected_iamr, info);
+	PARENT_FAIL_IF(regs[2] != info->expected_uamor, info);
+
+	/* Wake up child so that it can verify AMR didn't change and wrap up. */
+	ret = prod_child(info);
+	PARENT_FAIL_IF(ret, info);
+
+	ret = wait(&status);
+	if (ret != pid) {
+		printf("Child's exit status not captured\n");
+		ret = TEST_PASS;
+	} else if (!WIFEXITED(status)) {
+		printf("Child exited abnormally\n");
+		ret = TEST_FAIL;
+	} else
+		ret = WEXITSTATUS(status) ? TEST_FAIL : TEST_PASS;
+
+	return ret;
+}
+
+static int ptrace_pkey(void)
+{
+	struct shared_info *info;
+	int shm_id;
+	int ret;
+	pid_t pid;
+
+	shm_id = shmget(IPC_PRIVATE, sizeof(*info), 0777 | IPC_CREAT);
+	info = shmat(shm_id, NULL, 0);
+
+	ret = sem_init(&info->sem_parent, 1, 0);
+	if (ret) {
+		perror("Semaphore initialization failed");
+		return TEST_FAIL;
+	}
+	ret = sem_init(&info->sem_child, 1, 0);
+	if (ret) {
+		perror("Semaphore initialization failed");
+		return TEST_FAIL;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		perror("fork() failed");
+		ret = TEST_FAIL;
+	} else if (pid == 0)
+		ret = child(info);
+	else
+		ret = parent(info, pid);
+
+	if (pid) {
+		sem_destroy(&info->sem_parent);
+		sem_destroy(&info->sem_child);
+		shmctl(shm_id, IPC_RMID, NULL);
+	}
+	/* Detach last: the semaphores live in the shared segment. */
+	shmdt(info);
+
+	return ret;
+}
+
+int main(int argc, char *argv[])
+{
+	return test_harness(ptrace_pkey, "ptrace_pkey");
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 50/51] selftests/powerpc: Add ptrace tests for Protection Key register
@ 2017-11-06  8:57   ` Ram Pai
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

From: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>

This test exercises ptrace read and write access to the AMR, and checks
that ptrace refuses writes to the IAMR and UAMOR.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 tools/testing/selftests/powerpc/include/reg.h      |    1 +
 tools/testing/selftests/powerpc/ptrace/Makefile    |    5 +-
 .../testing/selftests/powerpc/ptrace/ptrace-pkey.c |  443 ++++++++++++++++++++
 3 files changed, 448 insertions(+), 1 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c

diff --git a/tools/testing/selftests/powerpc/include/reg.h b/tools/testing/selftests/powerpc/include/reg.h
index 4afdebc..7f348c0 100644
--- a/tools/testing/selftests/powerpc/include/reg.h
+++ b/tools/testing/selftests/powerpc/include/reg.h
@@ -54,6 +54,7 @@
 #define SPRN_DSCR_PRIV 0x11	/* Privilege State DSCR */
 #define SPRN_DSCR      0x03	/* Data Stream Control Register */
 #define SPRN_PPR       896	/* Program Priority Register */
+#define SPRN_AMR       13	/* Authority Mask Register - problem state */
 
 /* TEXASR register bits */
 #define TEXASR_FC	0xFE00000000000000
diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile b/tools/testing/selftests/powerpc/ptrace/Makefile
index 4803052..fd896b2 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 TEST_PROGS := ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
               ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx \
-              ptrace-tm-spd-vsx ptrace-tm-spr
+              ptrace-tm-spd-vsx ptrace-tm-spr ptrace-pkey
 
 include ../../lib.mk
 
@@ -9,6 +9,9 @@ all: $(TEST_PROGS)
 
 CFLAGS += -m64 -I../../../../../usr/include -I../tm -mhtm -fno-pie
 
+ptrace-pkey: ../harness.c ../utils.c ../lib/reg.S ptrace.h ptrace-pkey.c
+	$(LINK.c) $^ $(LDLIBS) -pthread -o $@
+
 $(TEST_PROGS): ../harness.c ../utils.c ../lib/reg.S ptrace.h
 
 clean:
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
new file mode 100644
index 0000000..2e5b676
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
@@ -0,0 +1,443 @@
+/*
+ * Ptrace test for Memory Protection Key registers
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ * Copyright (C) 2017 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <semaphore.h>
+#include "ptrace.h"
+
+#ifndef __NR_pkey_alloc
+#define __NR_pkey_alloc		384
+#endif
+
+#ifndef __NR_pkey_free
+#define __NR_pkey_free		385
+#endif
+
+#ifndef NT_PPC_PKEY
+#define NT_PPC_PKEY		0x110
+#endif
+
+#ifndef PKEY_DISABLE_EXECUTE
+#define PKEY_DISABLE_EXECUTE	0x4
+#endif
+
+#define AMR_BITS_PER_PKEY 2
+#define PKEY_REG_BITS (sizeof(u64) * 8)
+#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey + 1) * AMR_BITS_PER_PKEY))
+
+static const char user_read[] = "[User Read (Running)]";
+static const char user_write[] = "[User Write (Running)]";
+static const char ptrace_read_running[] = "[Ptrace Read (Running)]";
+static const char ptrace_write_running[] = "[Ptrace Write (Running)]";
+
+/* Information shared between the parent and the child. */
+struct shared_info {
+	/* AMR value the parent expects to read from the child. */
+	unsigned long amr1;
+
+	/* AMR value the parent is expected to write to the child. */
+	unsigned long amr2;
+
+	/* AMR value that ptrace should refuse to write to the child. */
+	unsigned long amr3;
+
+	/* IAMR value the parent expects to read from the child. */
+	unsigned long expected_iamr;
+
+	/* UAMOR value the parent expects to read from the child. */
+	unsigned long expected_uamor;
+
+	/*
+	 * IAMR and UAMOR values that ptrace should refuse to write to the child
+	 * (even though they're valid ones) because userspace doesn't have
+	 * access to those registers.
+	 */
+	unsigned long new_iamr;
+	unsigned long new_uamor;
+
+	/* The parent waits on this semaphore. */
+	sem_t sem_parent;
+
+	/* If true, the child should give up as well. */
+	bool parent_gave_up;
+
+	/* The child waits on this semaphore. */
+	sem_t sem_child;
+
+	/* If true, the parent should give up as well. */
+	bool child_gave_up;
+};
+
+#define CHILD_FAIL_IF(x, info)						\
+	do {								\
+		if ((x)) {						\
+			fprintf(stderr,					\
+				"[FAIL] Test FAILED on line %d\n", __LINE__); \
+			(info)->child_gave_up = true;			\
+			prod_parent(info);				\
+			return 1;					\
+		}							\
+	} while (0)
+
+#define PARENT_FAIL_IF(x, info)						\
+	do {								\
+		if ((x)) {						\
+			fprintf(stderr,					\
+				"[FAIL] Test FAILED on line %d\n", __LINE__); \
+			(info)->parent_gave_up = true;			\
+			prod_child(info);				\
+			return 1;					\
+		}							\
+	} while (0)
+
+static int wait_child(struct shared_info *info)
+{
+	int ret;
+
+	/* Wait until the child prods us. */
+	ret = sem_wait(&info->sem_parent);
+	if (ret) {
+		perror("Error waiting for child");
+		return TEST_FAIL;
+	}
+
+	return info->child_gave_up ? TEST_FAIL : TEST_PASS;
+}
+
+static int prod_child(struct shared_info *info)
+{
+	int ret;
+
+	/* Unblock the child now. */
+	ret = sem_post(&info->sem_child);
+	if (ret) {
+		perror("Error prodding child");
+		return TEST_FAIL;
+	}
+
+	return TEST_PASS;
+}
+
+static int wait_parent(struct shared_info *info)
+{
+	int ret;
+
+	/* Wait until the parent prods us. */
+	ret = sem_wait(&info->sem_child);
+	if (ret) {
+		perror("Error waiting for parent");
+		return TEST_FAIL;
+	}
+
+	return info->parent_gave_up ? TEST_FAIL : TEST_PASS;
+}
+
+static int prod_parent(struct shared_info *info)
+{
+	int ret;
+
+	/* Unblock the parent now. */
+	ret = sem_post(&info->sem_parent);
+	if (ret) {
+		perror("Error prodding parent");
+		return TEST_FAIL;
+	}
+
+	return TEST_PASS;
+}
+
+static int sys_pkey_alloc(unsigned long flags, unsigned long init_access_rights)
+{
+	return syscall(__NR_pkey_alloc, flags, init_access_rights);
+}
+
+static int sys_pkey_free(int pkey)
+{
+	return syscall(__NR_pkey_free, pkey);
+}
+
+static int ptrace_read_regs(pid_t child, unsigned long regs[], int n)
+{
+	struct iovec iov;
+	long ret;
+
+	FAIL_IF(start_trace(child));
+
+	iov.iov_base = regs;
+	iov.iov_len = n * sizeof(unsigned long);
+
+	ret = ptrace(PTRACE_GETREGSET, child, NT_PPC_PKEY, &iov);
+	FAIL_IF(ret != 0);
+
+	FAIL_IF(stop_trace(child));
+
+	return TEST_PASS;
+}
+
+static long ptrace_write_regs(pid_t child, unsigned long regs[], int n)
+{
+	struct iovec iov;
+	long ret;
+
+	FAIL_IF(start_trace(child));
+
+	iov.iov_base = regs;
+	iov.iov_len = n * sizeof(unsigned long);
+
+	ret = ptrace(PTRACE_SETREGSET, child, NT_PPC_PKEY, &iov);
+
+	FAIL_IF(stop_trace(child));
+
+	return ret;
+}
+
+static int child(struct shared_info *info)
+{
+	unsigned long reg;
+	bool disable_execute = true;
+	int pkey1, pkey2, pkey3;
+	int ret;
+
+	/* Get some pkeys so that we can change their bits in the AMR. */
+	pkey1 = sys_pkey_alloc(0, PKEY_DISABLE_EXECUTE);
+	if (pkey1 < 0) {
+		pkey1 = sys_pkey_alloc(0, 0);
+		CHILD_FAIL_IF(pkey1 < 0, info);
+
+		disable_execute = false;
+	}
+
+	pkey2 = sys_pkey_alloc(0, 0);
+	CHILD_FAIL_IF(pkey2 < 0, info);
+
+	pkey3 = sys_pkey_alloc(0, 0);
+	CHILD_FAIL_IF(pkey3 < 0, info);
+
+	info->amr1 = 3ul << pkeyshift(pkey1);
+	info->amr2 = 3ul << pkeyshift(pkey2);
+	info->amr3 = info->amr2 | 3ul << pkeyshift(pkey3);
+
+	if (disable_execute)
+		info->expected_iamr = 1ul << pkeyshift(pkey1);
+	else
+		info->expected_iamr = 0;
+
+	info->expected_uamor = 3ul << pkeyshift(pkey1) |
+				3ul << pkeyshift(pkey2);
+	info->new_iamr = 1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2);
+	info->new_uamor = 3ul << pkeyshift(pkey1);
+
+	/*
+	 * We won't use pkey3. We just want a plausible but invalid key to test
+	 * whether ptrace will let us write to AMR bits we are not supposed to.
+	 *
+	 * This also tests whether the kernel restores the UAMOR permissions
+	 * after a key is freed.
+	 */
+	sys_pkey_free(pkey3);
+
+	printf("%-30s AMR: %016lx pkey1: %d pkey2: %d pkey3: %d\n",
+	       user_write, info->amr1, pkey1, pkey2, pkey3);
+
+	mtspr(SPRN_AMR, info->amr1);
+
+	/* Wait for parent to read our AMR value and write a new one. */
+	ret = prod_parent(info);
+	CHILD_FAIL_IF(ret, info);
+
+	ret = wait_parent(info);
+	if (ret)
+		return ret;
+
+	reg = mfspr(SPRN_AMR);
+
+	printf("%-30s AMR: %016lx\n", user_read, reg);
+
+	CHILD_FAIL_IF(reg != info->amr2, info);
+
+	/*
+	 * Wait for parent to try to write an invalid AMR value.
+	 */
+	ret = prod_parent(info);
+	CHILD_FAIL_IF(ret, info);
+
+	ret = wait_parent(info);
+	if (ret)
+		return ret;
+
+	reg = mfspr(SPRN_AMR);
+
+	printf("%-30s AMR: %016lx\n", user_read, reg);
+
+	CHILD_FAIL_IF(reg != info->amr2, info);
+
+	/*
+	 * Wait for parent to try to write an IAMR and a UAMOR value. We can't
+	 * verify them, but we can verify that the AMR didn't change.
+	 */
+	ret = prod_parent(info);
+	CHILD_FAIL_IF(ret, info);
+
+	ret = wait_parent(info);
+	if (ret)
+		return ret;
+
+	reg = mfspr(SPRN_AMR);
+
+	printf("%-30s AMR: %016lx\n", user_read, reg);
+
+	CHILD_FAIL_IF(reg != info->amr2, info);
+
+	/* Now let parent know that we are finished. */
+
+	ret = prod_parent(info);
+	CHILD_FAIL_IF(ret, info);
+
+	return TEST_PASS;
+}
+
+static int parent(struct shared_info *info, pid_t pid)
+{
+	unsigned long regs[4];
+	int ret, status;
+
+	ret = wait_child(info);
+	if (ret)
+		return ret;
+
+	/* Verify that we can read the pkey registers from the child. */
+	ret = ptrace_read_regs(pid, regs, 3);
+	PARENT_FAIL_IF(ret, info);
+
+	printf("%-30s AMR: %016lx IAMR: %016lx UAMOR: %016lx\n",
+	       ptrace_read_running, regs[0], regs[1], regs[2]);
+
+	PARENT_FAIL_IF(regs[0] != info->amr1, info);
+	PARENT_FAIL_IF(regs[1] != info->expected_iamr, info);
+	PARENT_FAIL_IF(regs[2] != info->expected_uamor, info);
+
+	/* Write valid AMR value in child. */
+	ret = ptrace_write_regs(pid, &info->amr2, 1);
+	PARENT_FAIL_IF(ret, info);
+
+	printf("%-30s AMR: %016lx\n", ptrace_write_running, info->amr2);
+
+	/* Wake up child so that it can verify it changed. */
+	ret = prod_child(info);
+	PARENT_FAIL_IF(ret, info);
+
+	ret = wait_child(info);
+	if (ret)
+		return ret;
+
+	/* Write invalid AMR value in child. */
+	ret = ptrace_write_regs(pid, &info->amr3, 1);
+	PARENT_FAIL_IF(ret, info);
+
+	printf("%-30s AMR: %016lx\n", ptrace_write_running, info->amr3);
+
+	/* Wake up child so that it can verify it didn't change. */
+	ret = prod_child(info);
+	PARENT_FAIL_IF(ret, info);
+
+	ret = wait_child(info);
+	if (ret)
+		return ret;
+
+	/* Try to write to IAMR. */
+	regs[0] = info->amr1;
+	regs[1] = info->new_iamr;
+	ret = ptrace_write_regs(pid, regs, 2);
+	PARENT_FAIL_IF(!ret, info);
+
+	printf("%-30s AMR: %016lx IAMR: %016lx\n",
+	       ptrace_write_running, regs[0], regs[1]);
+
+	/* Try to write to IAMR and UAMOR. */
+	regs[2] = info->new_uamor;
+	ret = ptrace_write_regs(pid, regs, 3);
+	PARENT_FAIL_IF(!ret, info);
+
+	printf("%-30s AMR: %016lx IAMR: %016lx UAMOR: %016lx\n",
+	       ptrace_write_running, regs[0], regs[1], regs[2]);
+
+	/* Verify that all registers still have their expected values. */
+	ret = ptrace_read_regs(pid, regs, 3);
+	PARENT_FAIL_IF(ret, info);
+
+	printf("%-30s AMR: %016lx IAMR: %016lx UAMOR: %016lx\n",
+	       ptrace_read_running, regs[0], regs[1], regs[2]);
+
+	PARENT_FAIL_IF(regs[0] != info->amr2, info);
+	PARENT_FAIL_IF(regs[1] != info->expected_iamr, info);
+	PARENT_FAIL_IF(regs[2] != info->expected_uamor, info);
+
+	/* Wake up child so that it can verify AMR didn't change and wrap up. */
+	ret = prod_child(info);
+	PARENT_FAIL_IF(ret, info);
+
+	ret = wait(&status);
+	if (ret != pid) {
+		printf("Child's exit status not captured\n");
+		ret = TEST_PASS;
+	} else if (!WIFEXITED(status)) {
+		printf("Child exited abnormally\n");
+		ret = TEST_FAIL;
+	} else
+		ret = WEXITSTATUS(status) ? TEST_FAIL : TEST_PASS;
+
+	return ret;
+}
+
+static int ptrace_pkey(void)
+{
+	struct shared_info *info;
+	int shm_id;
+	int ret;
+	pid_t pid;
+
+	shm_id = shmget(IPC_PRIVATE, sizeof(*info), 0777 | IPC_CREAT);
+	info = shmat(shm_id, NULL, 0);
+
+	ret = sem_init(&info->sem_parent, 1, 0);
+	if (ret) {
+		perror("Semaphore initialization failed");
+		return TEST_FAIL;
+	}
+	ret = sem_init(&info->sem_child, 1, 0);
+	if (ret) {
+		perror("Semaphore initialization failed");
+		return TEST_FAIL;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		perror("fork() failed");
+		ret = TEST_FAIL;
+	} else if (pid == 0)
+		ret = child(info);
+	else
+		ret = parent(info, pid);
+
+	if (pid) {
+		sem_destroy(&info->sem_parent);
+		sem_destroy(&info->sem_child);
+		shmctl(shm_id, IPC_RMID, NULL);
+	}
+
+	shmdt(info);
+
+	return ret;
+}
+
+int main(int argc, char *argv[])
+{
+	return test_harness(ptrace_pkey, "ptrace_pkey");
+}
-- 
1.7.1
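
The AMR values exchanged in the test above can be checked by hand.
The sketch below is a minimal model, not kernel code. It reuses the
layout of the pkeyshift() macro used by the test (2 bits per key, key 0
in the most significant bits) and assumes, as the test's expectations
suggest, that a ptrace write only changes AMR bits that are enabled in
the UAMOR. The helper names pkey_shift() and model_ptrace_amr_write()
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define AMR_BITS_PER_PKEY 2
#define PKEY_REG_BITS 64

/* Same layout as the test's pkeyshift(): key 0 occupies the top two bits. */
static unsigned int pkey_shift(int pkey)
{
	assert(pkey >= 0 && pkey < PKEY_REG_BITS / AMR_BITS_PER_PKEY);
	return PKEY_REG_BITS - (pkey + 1) * AMR_BITS_PER_PKEY;
}

/*
 * Assumed model of a ptrace write to the AMR: bits covered by the UAMOR
 * take the new value, all other bits keep the old one.
 */
static uint64_t model_ptrace_amr_write(uint64_t old_amr, uint64_t new_amr,
				       uint64_t uamor)
{
	return (new_amr & uamor) | (old_amr & ~uamor);
}
```

With three arbitrary key numbers standing in for pkey1, pkey2 and the
freed pkey3, writing amr3 over amr2 leaves amr2 unchanged in this
model, which is what the child checks after the "invalid AMR" write.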

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* [PATCH v9 51/51] selftests/powerpc: Add core file test for Protection Key register
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06  8:57   ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-06  8:57 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

From: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>

This test verifies that the AMR, IAMR and UAMOR are being written
to a process' core file.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 tools/testing/selftests/powerpc/ptrace/Makefile    |    2 +-
 tools/testing/selftests/powerpc/ptrace/core-pkey.c |  438 ++++++++++++++++++++
 2 files changed, 439 insertions(+), 1 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/core-pkey.c

diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile b/tools/testing/selftests/powerpc/ptrace/Makefile
index fd896b2..ca25fda 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 TEST_PROGS := ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
               ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx \
-              ptrace-tm-spd-vsx ptrace-tm-spr ptrace-pkey
+              ptrace-tm-spd-vsx ptrace-tm-spr ptrace-pkey core-pkey
 
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/ptrace/core-pkey.c b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
new file mode 100644
index 0000000..2328f8c
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
@@ -0,0 +1,438 @@
+/*
+ * Ptrace test for Memory Protection Key registers
+ *
+ * Copyright (C) 2015 Anshuman Khandual, IBM Corporation.
+ * Copyright (C) 2017 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <limits.h>
+#include <semaphore.h>
+#include <linux/kernel.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include "ptrace.h"
+
+#ifndef __NR_pkey_alloc
+#define __NR_pkey_alloc		384
+#endif
+
+#ifndef __NR_pkey_free
+#define __NR_pkey_free		385
+#endif
+
+#ifndef NT_PPC_PKEY
+#define NT_PPC_PKEY		0x110
+#endif
+
+#ifndef PKEY_DISABLE_EXECUTE
+#define PKEY_DISABLE_EXECUTE	0x4
+#endif
+
+#define AMR_BITS_PER_PKEY 2
+#define PKEY_REG_BITS (sizeof(u64) * 8)
+#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey + 1) * AMR_BITS_PER_PKEY))
+
+#define CORE_FILE_LIMIT	(5 * 1024 * 1024)	/* 5 MB should be enough */
+
+static const char core_pattern_file[] = "/proc/sys/kernel/core_pattern";
+
+static const char user_write[] = "[User Write (Running)]";
+static const char core_read_running[] = "[Core Read (Running)]";
+
+/* Information shared between the parent and the child. */
+struct shared_info {
+	/* AMR value the parent expects to read in the core file. */
+	unsigned long amr;
+
+	/* IAMR value the parent expects to read from the child. */
+	unsigned long iamr;
+
+	/* UAMOR value the parent expects to read from the child. */
+	unsigned long uamor;
+
+	/* When the child crashed. */
+	time_t core_time;
+};
+
+static int sys_pkey_alloc(unsigned long flags, unsigned long init_access_rights)
+{
+	return syscall(__NR_pkey_alloc, flags, init_access_rights);
+}
+
+static int sys_pkey_free(int pkey)
+{
+	return syscall(__NR_pkey_free, pkey);
+}
+
+static int increase_core_file_limit(void)
+{
+	struct rlimit rlim;
+	int ret;
+
+	ret = getrlimit(RLIMIT_CORE, &rlim);
+	FAIL_IF(ret);
+
+	if (rlim.rlim_cur != RLIM_INFINITY && rlim.rlim_cur < CORE_FILE_LIMIT) {
+		rlim.rlim_cur = CORE_FILE_LIMIT;
+
+		if (rlim.rlim_max != RLIM_INFINITY &&
+		    rlim.rlim_max < CORE_FILE_LIMIT)
+			rlim.rlim_max = CORE_FILE_LIMIT;
+
+		ret = setrlimit(RLIMIT_CORE, &rlim);
+		FAIL_IF(ret);
+	}
+
+	ret = getrlimit(RLIMIT_FSIZE, &rlim);
+	FAIL_IF(ret);
+
+	if (rlim.rlim_cur != RLIM_INFINITY && rlim.rlim_cur < CORE_FILE_LIMIT) {
+		rlim.rlim_cur = CORE_FILE_LIMIT;
+
+		if (rlim.rlim_max != RLIM_INFINITY &&
+		    rlim.rlim_max < CORE_FILE_LIMIT)
+			rlim.rlim_max = CORE_FILE_LIMIT;
+
+		ret = setrlimit(RLIMIT_FSIZE, &rlim);
+		FAIL_IF(ret);
+	}
+
+	return TEST_PASS;
+}
+
+static int child(struct shared_info *info)
+{
+	bool disable_execute = true;
+	int pkey1, pkey2, pkey3;
+	int *ptr, ret;
+
+	ret = increase_core_file_limit();
+	FAIL_IF(ret);
+
+	/* Get some pkeys so that we can change their bits in the AMR. */
+	pkey1 = sys_pkey_alloc(0, PKEY_DISABLE_EXECUTE);
+	if (pkey1 < 0) {
+		pkey1 = sys_pkey_alloc(0, 0);
+		FAIL_IF(pkey1 < 0);
+
+		disable_execute = false;
+	}
+
+	pkey2 = sys_pkey_alloc(0, 0);
+	FAIL_IF(pkey2 < 0);
+
+	pkey3 = sys_pkey_alloc(0, 0);
+	FAIL_IF(pkey3 < 0);
+
+	info->amr = 3ul << pkeyshift(pkey1) | 2ul << pkeyshift(pkey2);
+
+	if (disable_execute)
+		info->iamr = 1ul << pkeyshift(pkey1);
+	else
+		info->iamr = 0;
+
+	info->uamor = 3ul << pkeyshift(pkey1) | 3ul << pkeyshift(pkey2);
+
+	printf("%-30s AMR: %016lx pkey1: %d pkey2: %d pkey3: %d\n",
+	       user_write, info->amr, pkey1, pkey2, pkey3);
+
+	mtspr(SPRN_AMR, info->amr);
+
+	/*
+	 * We won't use pkey3. This tests whether the kernel restores the UAMOR
+	 * permissions after a key is freed.
+	 */
+	sys_pkey_free(pkey3);
+
+	info->core_time = time(NULL);
+
+	/* Crash. */
+	ptr = 0;
+	*ptr = 1;
+
+	/* Shouldn't get here. */
+	FAIL_IF(true);
+
+	return TEST_FAIL;
+}
+
+/* Return file size if filename exists and is not stale, or TEST_FAIL if not. */
+static off_t try_core_file(const char *filename, struct shared_info *info,
+			   pid_t pid)
+{
+	struct stat buf;
+	int ret;
+
+	ret = stat(filename, &buf);
+	if (ret == -1)
+		return TEST_FAIL;
+
+	/* Make sure we're not using a stale core file. */
+	return buf.st_mtime >= info->core_time ? buf.st_size : TEST_FAIL;
+}
+
+static Elf64_Nhdr *next_note(Elf64_Nhdr *nhdr)
+{
+	return (void *) nhdr + sizeof(*nhdr) +
+		__ALIGN_KERNEL(nhdr->n_namesz, 4) +
+		__ALIGN_KERNEL(nhdr->n_descsz, 4);
+}
+
+static int check_core_file(struct shared_info *info, Elf64_Ehdr *ehdr,
+			   off_t core_size)
+{
+	unsigned long *regs;
+	Elf64_Phdr *phdr;
+	Elf64_Nhdr *nhdr;
+	size_t phdr_size;
+	void *p = ehdr, *note;
+	int ret;
+
+	ret = memcmp(ehdr->e_ident, ELFMAG, SELFMAG);
+	FAIL_IF(ret);
+
+	FAIL_IF(ehdr->e_type != ET_CORE);
+	FAIL_IF(ehdr->e_machine != EM_PPC64);
+	FAIL_IF(ehdr->e_phoff == 0 || ehdr->e_phnum == 0);
+
+	/*
+	 * e_phnum is at most 65535 so calculating the size of the
+	 * program header cannot overflow.
+	 */
+	phdr_size = sizeof(*phdr) * ehdr->e_phnum;
+
+	/* Sanity check the program header table location. */
+	FAIL_IF(ehdr->e_phoff + phdr_size < ehdr->e_phoff);
+	FAIL_IF(ehdr->e_phoff + phdr_size > core_size);
+
+	/* Find the PT_NOTE segment. */
+	for (phdr = p + ehdr->e_phoff;
+	     (void *) phdr < p + ehdr->e_phoff + phdr_size;
+	     phdr = (void *) phdr + ehdr->e_phentsize)
+		if (phdr->p_type == PT_NOTE)
+			break;
+
+	FAIL_IF((void *) phdr >= p + ehdr->e_phoff + phdr_size);
+
+	/* Find the NT_PPC_PKEY note. */
+	for (nhdr = p + phdr->p_offset;
+	     (void *) nhdr < p + phdr->p_offset + phdr->p_filesz;
+	     nhdr = next_note(nhdr))
+		if (nhdr->n_type == NT_PPC_PKEY)
+			break;
+
+	FAIL_IF((void *) nhdr >= p + phdr->p_offset + phdr->p_filesz);
+	FAIL_IF(nhdr->n_descsz == 0);
+
+	p = nhdr;
+	note = p + sizeof(*nhdr) + __ALIGN_KERNEL(nhdr->n_namesz, 4);
+
+	regs = (unsigned long *) note;
+
+	printf("%-30s AMR: %016lx IAMR: %016lx UAMOR: %016lx\n",
+	       core_read_running, regs[0], regs[1], regs[2]);
+
+	FAIL_IF(regs[0] != info->amr);
+	FAIL_IF(regs[1] != info->iamr);
+	FAIL_IF(regs[2] != info->uamor);
+
+	return TEST_PASS;
+}
+
+static int parent(struct shared_info *info, pid_t pid)
+{
+	char *filenames, *filename[3];
+	int fd, i, ret, status;
+	off_t core_size;
+	void *core;
+
+	ret = wait(&status);
+	if (ret != pid) {
+		printf("Child's exit status not captured\n");
+		return TEST_FAIL;
+	} else if (!WIFSIGNALED(status) || !WCOREDUMP(status)) {
+		printf("Child didn't dump core\n");
+		return TEST_FAIL;
+	}
+
+	/* Construct array of core file names to try. */
+
+	filename[0] = filenames = malloc(PATH_MAX);
+	if (!filenames) {
+		perror("Error allocating memory");
+		return TEST_FAIL;
+	}
+
+	ret = snprintf(filename[0], PATH_MAX, "core-pkey.%d", pid);
+	if (ret < 0 || ret >= PATH_MAX) {
+		ret = TEST_FAIL;
+		goto out;
+	}
+
+	filename[1] = filename[0] + ret + 1;
+	ret = snprintf(filename[1], PATH_MAX - ret - 1, "core.%d", pid);
+	if (ret < 0 || ret >= PATH_MAX - ret - 1) {
+		ret = TEST_FAIL;
+		goto out;
+	}
+	filename[2] = "core";
+
+	for (i = 0; i < 3; i++) {
+		core_size = try_core_file(filename[i], info, pid);
+		if (core_size != TEST_FAIL)
+			break;
+	}
+
+	if (i == 3) {
+		printf("Couldn't find core file\n");
+		ret = TEST_FAIL;
+		goto out;
+	}
+
+	fd = open(filename[i], O_RDONLY);
+	if (fd == -1) {
+		perror("Error opening core file");
+		ret = TEST_FAIL;
+		goto out;
+	}
+
+	core = mmap(NULL, core_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (core == (void *) -1) {
+		perror("Error mmaping core file");
+		ret = TEST_FAIL;
+		goto out;
+	}
+
+	ret = check_core_file(info, core, core_size);
+
+	munmap(core, core_size);
+	close(fd);
+	unlink(filename[i]);
+
+ out:
+	free(filenames);
+
+	return ret;
+}
+
+static int write_core_pattern(const char *core_pattern)
+{
+	size_t len = strlen(core_pattern), ret;
+	FILE *f;
+
+	f = fopen(core_pattern_file, "w");
+	if (!f) {
+		perror("Error writing to core_pattern file");
+		return TEST_FAIL;
+	}
+
+	ret = fwrite(core_pattern, 1, len, f);
+	fclose(f);
+	if (ret != len) {
+		perror("Error writing to core_pattern file");
+		return TEST_FAIL;
+	}
+
+	return TEST_PASS;
+}
+
+static int setup_core_pattern(char **core_pattern_, bool *changed_)
+{
+	FILE *f;
+	char *core_pattern;
+	int ret;
+
+	core_pattern = malloc(PATH_MAX);
+	if (!core_pattern) {
+		perror("Error allocating memory");
+		return TEST_FAIL;
+	}
+
+	f = fopen(core_pattern_file, "r");
+	if (!f) {
+		perror("Error opening core_pattern file");
+		ret = TEST_FAIL;
+		goto out;
+	}
+
+	ret = fread(core_pattern, 1, PATH_MAX, f);
+	fclose(f);
+	if (!ret) {
+		perror("Error reading core_pattern file");
+		ret = TEST_FAIL;
+		goto out;
+	}
+
+	/* Check whether we can predict the name of the core file. */
+	if (!strcmp(core_pattern, "core") || !strcmp(core_pattern, "core.%p"))
+		*changed_ = false;
+	else {
+		ret = write_core_pattern("core-pkey.%p");
+		if (ret)
+			goto out;
+
+		*changed_ = true;
+	}
+
+	*core_pattern_ = core_pattern;
+	ret = TEST_PASS;
+
+ out:
+	if (ret)
+		free(core_pattern);
+
+	return ret;
+}
+
+static int core_pkey(void)
+{
+	char *core_pattern;
+	bool changed_core_pattern;
+	struct shared_info *info;
+	int shm_id;
+	int ret;
+	pid_t pid;
+
+	ret = setup_core_pattern(&core_pattern, &changed_core_pattern);
+	if (ret)
+		return ret;
+
+	shm_id = shmget(IPC_PRIVATE, sizeof(*info), 0777 | IPC_CREAT);
+	info = shmat(shm_id, NULL, 0);
+
+	pid = fork();
+	if (pid < 0) {
+		perror("fork() failed");
+		ret = TEST_FAIL;
+	} else if (pid == 0)
+		ret = child(info);
+	else
+		ret = parent(info, pid);
+
+	shmdt(info);
+
+	if (pid) {
+		shmctl(shm_id, IPC_RMID, NULL);
+
+		if (changed_core_pattern)
+			write_core_pattern(core_pattern);
+	}
+
+	free(core_pattern);
+
+	return ret;
+}
+
+int main(int argc, char *argv[])
+{
+	return test_harness(core_pkey, "core_pkey");
+}
-- 
1.7.1
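
The note lookup in check_core_file() steps from one note header to the
next using 4-byte-aligned name and descriptor sizes. The sketch below
shows the same walk over an in-memory buffer with explicit bounds
checks added. It is an illustration under assumptions, not the code of
the patch: struct note_hdr is a local stand-in with the same three
fields as Elf64_Nhdr, and align4() and find_note() are hypothetical
helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as Elf64_Nhdr; declared locally to stay self-contained. */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/* Round up to a multiple of 4, like __ALIGN_KERNEL(n, 4) in the test. */
static uint64_t align4(uint64_t n)
{
	return (n + 3) & ~(uint64_t)3;
}

/*
 * Return a pointer to the descriptor of the first note of the given type,
 * or NULL if the segment has no such note or a note runs past the end.
 */
static const void *find_note(const unsigned char *seg, size_t seg_len,
			     uint32_t type, uint32_t *desc_len)
{
	size_t off = 0;

	assert(seg != NULL && desc_len != NULL);

	while (seg_len - off >= sizeof(struct note_hdr)) {
		struct note_hdr hdr;
		uint64_t name_len, body_len;

		memcpy(&hdr, seg + off, sizeof(hdr));
		name_len = align4(hdr.n_namesz);
		body_len = name_len + align4(hdr.n_descsz);
		off += sizeof(hdr);

		if (body_len > seg_len - off)
			return NULL;	/* truncated note */

		if (hdr.n_type == type) {
			*desc_len = hdr.n_descsz;
			return seg + off + name_len;
		}

		off += body_len;
	}

	return NULL;
}
```

Called with the contents of the PT_NOTE segment and the NT_PPC_PKEY
type (0x110 in the patch), the returned descriptor would hold the
register values that the test compares against.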

^ permalink raw reply related	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-06  8:56 ` Ram Pai
@ 2017-11-06 21:28   ` Florian Weimer
  -1 siblings, 0 replies; 197+ messages in thread
From: Florian Weimer @ 2017-11-06 21:28 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

* Ram Pai:

> Testing:
> -------
> This patch series has passed all the protection key
> tests available in the selftest directory. The
> tests are updated to work on both x86 and powerpc.
> The selftests have passed on x86 and powerpc hardware.

How do you deal with the key reuse problem?  Is it the same as x86-64,
where it's quite easy to accidentally grant existing threads access to
a just-allocated key, either due to key reuse or a changed init_pkru
parameter?

What about siglongjmp from a signal handler?

  <https://sourceware.org/bugzilla/show_bug.cgi?id=22396>

I wonder if it's possible to fix some of these things before the exact
semantics of these interfaces are set in stone.
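
To make the key reuse concern concrete, here is a toy model of it. It
is only a sketch under assumptions: it is not the kernel's
implementation on any architecture, the struct and the toy_*() helpers
are hypothetical names, and it simply assumes that freeing a key leaves
the per-thread access rights of other threads untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_KEYS 32
#define NR_THREADS 2

struct toy_process {
	uint32_t allocated;				/* bitmap of allocated keys */
	bool access_disabled[NR_THREADS][NR_KEYS];	/* per-thread rights */
};

static void toy_init(struct toy_process *p)
{
	p->allocated = UINT32_C(1);	/* key 0 is the default key */
	for (int t = 0; t < NR_THREADS; t++)
		for (int k = 0; k < NR_KEYS; k++)
			p->access_disabled[t][k] = (k != 0);
}

/* Allocate the lowest free key; only the calling thread's rights are set. */
static int toy_pkey_alloc(struct toy_process *p, int thread, bool disable_access)
{
	assert(thread >= 0 && thread < NR_THREADS);

	for (int k = 1; k < NR_KEYS; k++) {
		if (!(p->allocated & (UINT32_C(1) << k))) {
			p->allocated |= UINT32_C(1) << k;
			p->access_disabled[thread][k] = disable_access;
			return k;
		}
	}

	return -1;
}

static void toy_pkey_free(struct toy_process *p, int key)
{
	assert(key > 0 && key < NR_KEYS);
	p->allocated &= ~(UINT32_C(1) << key);
	/* Assumption of this model: per-thread rights are left untouched. */
}
```

In this model, if thread 0 allocates a key with access enabled and
frees it, and thread 1 then allocates a key with access disabled,
thread 1 receives the same key number while thread 0 still holds its
old access rights to it.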

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-06 21:28   ` Florian Weimer
@ 2017-11-07  1:22     ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-07  1:22 UTC (permalink / raw)
  To: Florian Weimer
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

On Mon, Nov 06, 2017 at 10:28:41PM +0100, Florian Weimer wrote:
> * Ram Pai:
> 
> > Testing:
> > -------
> > This patch series has passed all the protection key
> > tests available in the selftest directory. The
> > tests are updated to work on both x86 and powerpc.
> > The selftests have passed on x86 and powerpc hardware.
> 
> How do you deal with the key reuse problem?  Is it the same as x86-64,
> where it's quite easy to accidentally grant existing threads access to
> a just-allocated key, either due to key reuse or a changed init_pkru
> parameter?

I am not sure how, on x86-64, two threads can be allocated the same key
at the same time. Key allocation is guarded by the mmap_sem
semaphore, so there cannot be a race where two threads get allocated
the same key.

Can you point me to the issue, if it is already discussed somewhere?

As far as the semantics are concerned, a key allocated in one thread's
context has no meaning if used in some other thread's context within the
same process.  The app should not try to reuse a key allocated in one
thread's context in some other thread's context.

> 
> What about siglongjmp from a signal handler?

On powerpc there is some relief.  The permissions on a key can be
modified from anywhere, including from the signal handler, and the
effect is immediate.  You don't have to wait until the
signal handler returns for the key permissions to be restored.

Also, after sigsetjmp() returns,
possibly because of a siglongjmp(), the program can restore the permissions
on any key.

At least that is my theory. Can you give me a test case, if you have one
handy?

> 
>   <https://sourceware.org/bugzilla/show_bug.cgi?id=22396>
> 
> I wonder if it's possible to fix some of these things before the exact
> semantics of these interfaces are set in stone.

Will try.

RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-07  1:22     ` Ram Pai
  (?)
@ 2017-11-07  7:32       ` Florian Weimer
  -1 siblings, 0 replies; 197+ messages in thread
From: Florian Weimer @ 2017-11-07  7:32 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

* Ram Pai:

> On Mon, Nov 06, 2017 at 10:28:41PM +0100, Florian Weimer wrote:
>> * Ram Pai:
>> 
>> > Testing:
>> > -------
>> > This patch series has passed all the protection key
>> > tests available in the selftest directory.The
>> > tests are updated to work on both x86 and powerpc.
>> > The selftests have passed on x86 and powerpc hardware.
>> 
>> How do you deal with the key reuse problem?  Is it the same as x86-64,
>> where it's quite easy to accidentally grant existing threads access to
>> a just-allocated key, either due to key reuse or a changed init_pkru
>> parameter?
>
> I am not sure how on x86-64, two threads get allocated the same key
> at the same time? the key allocation is guarded under the mmap_sem
> semaphore. So there cannot be a race where two threads get allocated
> the same key.

The problem is a pkey_alloc/pthread_create/pkey_free/pkey_alloc
sequence.  The pthread_create call makes the new thread inherit the
access rights of the current thread, but then the key is deallocated.
Reallocation of the same key will have that thread retain its access
rights, which is IMHO not correct.
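
The sequence can be sketched with a small toy model in C. This is not kernel or glibc code: the `toy_*` names, the 16-key limit, and the two-bits-per-key layout of the rights word are assumptions made for illustration. The model only captures the two facts stated above: key allocation is process-wide, while access rights live in a per-thread register that a new thread copies from its creator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NKEYS          16
#define TOY_DISABLE_ACCESS 1u   /* hypothetical bit layout: 2 bits per key */
#define TOY_DISABLE_WRITE  2u

struct toy_process { uint32_t alloc_map; };  /* one bit per allocated key */
struct toy_thread  { uint32_t rights; };     /* per-thread rights register */

/* Allocate the lowest free key; only the caller's rights are initialised. */
static int toy_pkey_alloc(struct toy_process *p, struct toy_thread *caller,
                          uint32_t init_rights)
{
    for (int k = 1; k < TOY_NKEYS; k++) {
        if (!(p->alloc_map & (1u << k))) {
            p->alloc_map |= 1u << k;
            caller->rights &= ~(3u << (2 * k));
            caller->rights |= (init_rights & 3u) << (2 * k);
            return k;
        }
    }
    return -1;
}

/* Free the key in the process-wide map; no thread's rights are touched. */
static void toy_pkey_free(struct toy_process *p, int key)
{
    assert(key > 0 && key < TOY_NKEYS);
    p->alloc_map &= ~(1u << key);
}

/* A new thread starts with a copy of its creator's rights. */
static struct toy_thread toy_thread_create(const struct toy_thread *parent)
{
    return *parent;
}

static bool toy_can_access(const struct toy_thread *t, int key)
{
    return !((t->rights >> (2 * key)) & TOY_DISABLE_ACCESS);
}

/* pkey_alloc / pthread_create / pkey_free / pkey_alloc, in the toy model. */
static bool toy_stale_thread_keeps_access(void)
{
    struct toy_process p = { .alloc_map = 1u };   /* key 0 is the default key */
    struct toy_thread main_thread = { .rights = 0 };

    int k1 = toy_pkey_alloc(&p, &main_thread, 0); /* full access */
    struct toy_thread worker = toy_thread_create(&main_thread);
    toy_pkey_free(&p, k1);
    int k2 = toy_pkey_alloc(&p, &main_thread, TOY_DISABLE_ACCESS);

    /* Same key number comes back; the worker still holds the old rights. */
    return k1 == k2 && !toy_can_access(&main_thread, k2)
                    && toy_can_access(&worker, k2);
}
```

In this model `toy_stale_thread_keeps_access()` returns true: the reallocated key has the same number, the allocating thread asked for no access, and the worker thread can still access anything tagged with it.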

> Can you point me to the issue, if it is already discussed somewhere?

See ‘MPK: pkey_free and key reuse’ on various lists (including
linux-mm and linux-arch).

It has a test case attached which demonstrates the behavior.

> As far as the semantics is concerned, a key allocated in one thread's
> context has no meaning if used in some other threads context within the
> same process.  The app should not try to re-use a key allocated in a
> thread's context in some other threads's context.

Uh-oh, that's not how this feature works on x86-64 at all.  There, the
keys are a process-global resource.  Treating them per-thread
seriously reduces their usefulness.

>> What about siglongjmp from a signal handler?
>
> On powerpc there is some relief.  the permissions on a key can be
> modified from anywhere, including from the signal handler, and the
> effect will be immediate.  You dont have to wait till the
> signal handler returns for the key permissions to be restore.

My concern is that the signal handler knows nothing about protection
keys, but the current x86-64 semantics will cause it to clobber the
access rights of the current thread.
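
A toy model in C of what I mean, assuming the behaviour discussed in this thread: on signal delivery the thread's rights are saved and replaced by a default value, and they are only put back when the handler returns through the kernel. The `toy_*` names and the `TOY_INIT_RIGHTS` constant are hypothetical and exist only for this sketch; none of this is kernel or glibc code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical default: access disabled for every key except key 0. */
#define TOY_INIT_RIGHTS 0x55555554u

struct toy_thread {
    uint32_t rights;        /* live per-thread rights register */
    uint32_t saved_rights;  /* copy kept in the signal frame */
};

/* Signal delivery: save the interrupted rights, install the default. */
static void toy_deliver_signal(struct toy_thread *t)
{
    t->saved_rights = t->rights;
    t->rights = TOY_INIT_RIGHTS;
}

/* Normal handler return: the saved rights are restored. */
static void toy_sigreturn(struct toy_thread *t)
{
    t->rights = t->saved_rights;
}

/* siglongjmp out of the handler: no sigreturn, so nothing is restored. */
static void toy_siglongjmp(struct toy_thread *t)
{
    (void)t;
}

static uint32_t toy_rights_after_return(uint32_t before)
{
    struct toy_thread t = { .rights = before, .saved_rights = 0 };
    toy_deliver_signal(&t);
    toy_sigreturn(&t);
    return t.rights;
}

static uint32_t toy_rights_after_siglongjmp(uint32_t before)
{
    struct toy_thread t = { .rights = before, .saved_rights = 0 };
    toy_deliver_signal(&t);
    toy_siglongjmp(&t);
    return t.rights;
}
```

With a handler that never touches protection keys, the normal return path leaves the thread's rights as they were, while the siglongjmp path leaves the thread running with the handler's default rights.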

> also after return from the sigsetjmp();
> possibly caused by siglongjmp(), the program can restore the permission
> on any key.

So that's not really an option.

> Atleast that is my theory. Can you give me a testcase; if you have one
> handy.

The glibc patch I posted under the ‘MPK: pkey_free and key reuse’
thread covers this, too.

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-07  7:32       ` Florian Weimer
                           ` (3 preceding siblings ...)
  (?)
@ 2017-11-07 22:39         ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-07 22:39 UTC (permalink / raw)
  To: Florian Weimer
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

On Tue, Nov 07, 2017 at 08:32:16AM +0100, Florian Weimer wrote:
> * Ram Pai:
> 
> > On Mon, Nov 06, 2017 at 10:28:41PM +0100, Florian Weimer wrote:
> >> * Ram Pai:
> >> 
> >> > Testing:
> >> > -------
> >> > This patch series has passed all the protection key
> >> > tests available in the selftest directory.The
> >> > tests are updated to work on both x86 and powerpc.
> >> > The selftests have passed on x86 and powerpc hardware.
> >> 
> >> How do you deal with the key reuse problem?  Is it the same as x86-64,
> >> where it's quite easy to accidentally grant existing threads access to
> >> a just-allocated key, either due to key reuse or a changed init_pkru
> >> parameter?
> >
> > I am not sure how on x86-64, two threads get allocated the same key
> > at the same time? the key allocation is guarded under the mmap_sem
> > semaphore. So there cannot be a race where two threads get allocated
> > the same key.
> 
> The problem is a pkey_alloc/pthread_create/pkey_free/pkey_alloc
> sequence.  The pthread_create call makes the new thread inherit the
> access rights of the current thread, but then the key is deallocated.
> Reallocation of the same key will have that thread retain its access
> rights, which is IMHO not correct.

(Dave Hansen: please correct me if I misspeak below.)

As I understand the current semantics of sys_pkey_free(),
the calling thread is saying "disassociate me from this key". Other
threads continue to be associated with the key and could continue to
get key faults, but the calling thread will not get key faults on that
key any more.

Also, the key should not be reallocated until all the threads in the process
have disassociated from the key by calling sys_pkey_free().

From that point of view, I think there is a bug in the implementation of
pkey on x86, and now on powerpc as well.

> 
> > Can you point me to the issue, if it is already discussed somewhere?
> 
> See ‘MPK: pkey_free and key reuse’ on various lists (including
> linux-mm and linux-arch).
> 
> It has a test case attached which demonstrates the behavior.
> 
> > As far as the semantics is concerned, a key allocated in one thread's
> > context has no meaning if used in some other threads context within the
> > same process.  The app should not try to re-use a key allocated in a
> > thread's context in some other threads's context.
> 
> Uh-oh, that's not how this feature works on x86-64 at all.  There, the
> keys are a process-global resource.  Treating them per-thread
> seriously reduces their usefulness.

Sorry, I was not thinking straight. Let me restate.

A key is a global resource, but the permissions on a key are
local to a thread. For example, the same key could disable
access to a page for one thread while disabling only writes
to the same page for another thread.
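
A minimal sketch of that restatement as a toy model in C. The `toy_*` names and the two-bits-per-key layout of the rights word are assumptions for illustration only; this is not the kernel's data structure. Each thread carries its own rights word, and the same key number is looked up in whichever thread is doing the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_DISABLE_ACCESS 1u   /* hypothetical: bit 0 of the key's 2-bit field */
#define TOY_DISABLE_WRITE  2u   /* hypothetical: bit 1 of the key's 2-bit field */

struct toy_thread { uint32_t rights; };  /* one rights word per thread */

/* Set this thread's rights for one key; other threads are unaffected. */
static void toy_set_rights(struct toy_thread *t, int key, uint32_t bits)
{
    assert(key >= 0 && key < 16);
    t->rights &= ~(3u << (2 * key));
    t->rights |= (bits & 3u) << (2 * key);
}

/* Reads are blocked only by the access-disable bit. */
static bool toy_may_read(const struct toy_thread *t, int key)
{
    return !((t->rights >> (2 * key)) & TOY_DISABLE_ACCESS);
}

/* Writes are blocked by either bit. */
static bool toy_may_write(const struct toy_thread *t, int key)
{
    return ((t->rights >> (2 * key)) & 3u) == 0;
}
```

For example, if thread A sets `TOY_DISABLE_ACCESS` on key 3 and thread B sets `TOY_DISABLE_WRITE` on the same key, then in this model A can neither read nor write a page tagged with key 3, while B can read it but not write it.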

> 
> >> What about siglongjmp from a signal handler?
> >
> > On powerpc there is some relief.  the permissions on a key can be
> > modified from anywhere, including from the signal handler, and the
> > effect will be immediate.  You dont have to wait till the
> > signal handler returns for the key permissions to be restore.
> 
> My concern is that the signal handler knows nothing about protection
> keys, but the current x86-64 semantics will cause it to clobber the
> access rights of the current thread.
> 
> > also after return from the sigsetjmp();
> > possibly caused by siglongjmp(), the program can restore the permission
> > on any key.
> 
> So that's not really an option.
> 
> > Atleast that is my theory. Can you give me a testcase; if you have one
> > handy.
> 
> The glibc patch I posted under the ‘MPK: pkey_free and key reuse’
> thread covers this, too.

Thanks, I will try the test case with my kernel patches. But since, on
powerpc, one can change the permissions on a key in the signal handler
and the change takes effect immediately, there should not be a bug
on powerpc.

x86 requires returning from the signal handler
back to the kernel in order to change the permissions on a key,
which can cause issues with longjmp.

RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
@ 2017-11-07 22:39         ` linuxram
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-07 22:39 UTC (permalink / raw)
  To: Florian Weimer
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

On Tue, Nov 07, 2017 at 08:32:16AM +0100, Florian Weimer wrote:
> * Ram Pai:
> 
> > On Mon, Nov 06, 2017 at 10:28:41PM +0100, Florian Weimer wrote:
> >> * Ram Pai:
> >> 
> >> > Testing:
> >> > -------
> >> > This patch series has passed all the protection key
> >> > tests available in the selftest directory.The
> >> > tests are updated to work on both x86 and powerpc.
> >> > The selftests have passed on x86 and powerpc hardware.
> >> 
> >> How do you deal with the key reuse problem?  Is it the same as x86-64,
> >> where it's quite easy to accidentally grant existing threads access to
> >> a just-allocated key, either due to key reuse or a changed init_pkru
> >> parameter?
> >
> > I am not sure how on x86-64, two threads get allocated the same key
> > at the same time? the key allocation is guarded under the mmap_sem
> > semaphore. So there cannot be a race where two threads get allocated
> > the same key.
> 
> The problem is a pkey_alloc/pthread_create/pkey_free/pkey_alloc
> sequence.  The pthread_create call makes the new thread inherit the
> access rights of the current thread, but then the key is deallocated.
> Reallocation of the same key will have that thread retain its access
> rights, which is IMHO not correct.

(Dave Hansen: please correct me if I misspeak below)

As per the current semantics of sys_pkey_free(), the way I understand it,
the calling thread is saying: disassociate me from this key. Other
threads continue to be associated with the key and could continue to
get key-faults, but the calling thread will not get key-faults on that
key any more.

Also, the key should not get reallocated until all the threads in the
process have disassociated from the key by calling sys_pkey_free().

From that point of view, I think there is a bug in the implementation of
pkey on x86, and now on powerpc as well.

> 
> > Can you point me to the issue, if it is already discussed somewhere?
> 
> See ‘MPK: pkey_free and key reuse’ on various lists (including
> linux-mm and linux-arch).
> 
> It has a test case attached which demonstrates the behavior.
> 
> > As far as the semantics is concerned, a key allocated in one thread's
> > context has no meaning if used in some other threads context within the
> > same process.  The app should not try to re-use a key allocated in a
> > thread's context in some other threads's context.
> 
> Uh-oh, that's not how this feature works on x86-64 at all.  There, the
> keys are a process-global resource.  Treating them per-thread
> seriously reduces their usefulness.

Sorry. I was not thinking right. Let me restate.

A key is a global resource, but the permissions on a key are
local to a thread. For example, the same key could disable
access to a page for one thread, while it disables only writes
to the same page for another thread.

> 
> >> What about siglongjmp from a signal handler?
> >
> > On powerpc there is some relief.  the permissions on a key can be
> > modified from anywhere, including from the signal handler, and the
> > effect will be immediate.  You dont have to wait till the
> > signal handler returns for the key permissions to be restore.
> 
> My concern is that the signal handler knows nothing about protection
> keys, but the current x86-64 semantics will cause it to clobber the
> access rights of the current thread.
> 
> > also after return from the sigsetjmp();
> > possibly caused by siglongjmp(), the program can restore the permission
> > on any key.
> 
> So that's not really an option.
> 
> > Atleast that is my theory. Can you give me a testcase; if you have one
> > handy.
> 
> The glibc patch I posted under the ‘MPK: pkey_free and key reuse’
> thread covers this, too.

Thanks. I will try the test case with my kernel patches. But on
powerpc one can change the permissions on a key from within the signal
handler, and the change takes effect immediately, so there should not
be a bug on powerpc.

x86 has the requirement that it must return from the signal handler
back to the kernel in order to change the permissions on a key, which
can cause issues with longjmp.

RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-07 22:39         ` linuxram
  (?)
  (?)
@ 2017-11-07 22:47           ` dave.hansen
  -1 siblings, 0 replies; 197+ messages in thread
From: Dave Hansen @ 2017-11-07 22:47 UTC (permalink / raw)
  To: Ram Pai, Florian Weimer
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, linux-kernel, mhocko, linux-mm, paulus, aneesh.kumar,
	linux-kselftest, bauerman, linuxppc-dev, khandual

On 11/07/2017 02:39 PM, Ram Pai wrote:
> 
> As per the current semantics of sys_pkey_free(); the way I understand it,
> the calling thread is saying disassociate me from this key.

No.  It is saying: "this *process* no longer has any uses of this key,
it can be reused".

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-07 22:47           ` dave.hansen
@ 2017-11-07 23:44             ` Ram Pai
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-07 23:44 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Florian Weimer, linux-arch, x86, arnd, corbet, linux-doc,
	linux-kernel, mhocko, linux-mm, mingo, paulus, ebiederm,
	linux-kselftest, bauerman, akpm, khandual, linuxppc-dev,
	aneesh.kumar

On Tue, Nov 07, 2017 at 02:47:10PM -0800, Dave Hansen wrote:
> On 11/07/2017 02:39 PM, Ram Pai wrote:
> > 
> > As per the current semantics of sys_pkey_free(); the way I understand it,
> > the calling thread is saying disassociate me from this key.
> 
> No.  It is saying: "this *process* no longer has any uses of this key,
> it can be reused".

OK, in light of the corrected semantics, I see no bug in the implementation.

> On Mon, Nov 06, 2017 at 10:28:41PM +0100, Florian Weimer wrote:
...
> The problem is a pkey_alloc/pthread_create/pkey_free/pkey_alloc
> sequence.  The pthread_create call makes the new thread inherit the
> access rights of the current thread, but then the key is deallocated.
> Reallocation of the same key will have that thread retain its access
> rights, which is IMHO not correct.

Again, in light of the corrected semantics, the child thread, or any
thread, should not free a key without first cleaning up:
(a) disassociate the key from its address space
(b) reset the permissions on the key across all the threads of the
process.

Any such stale bits can cause unexpected behavior if the
same key gets reallocated by sys_pkey_alloc().
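
The pkey_alloc/pthread_create/pkey_free/pkey_alloc sequence quoted above
can be replayed in a small userspace model. This is a sketch under
stated assumptions, not the kernel implementation: the allocation map is
process-wide, each thread has a private rights word with two bits per
key, a new thread starts with a copy of its creator's rights, allocation
picks the lowest free key, and freeing a key touches no thread's rights.
All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_KEYS        16
#define MODEL_DISABLE_ACCESS 0x1u

struct model_proc {
	uint32_t alloc_map;	/* one bit per key, shared by the process */
};

struct model_thread {
	uint32_t rights;	/* two bits per key, private to the thread */
};

static uint32_t model_rights(const struct model_thread *t, int key)
{
	assert(key >= 0 && key < MODEL_NR_KEYS);
	return (t->rights >> (2 * key)) & 0x3u;
}

/* Allocate the lowest free key; only the caller's rights are set. */
static int model_alloc(struct model_proc *p, struct model_thread *caller,
		       uint32_t init_rights)
{
	for (int key = 0; key < MODEL_NR_KEYS; key++) {
		if (p->alloc_map & (1u << key))
			continue;
		p->alloc_map |= 1u << key;
		caller->rights &= ~(0x3u << (2 * key));
		caller->rights |= (init_rights & 0x3u) << (2 * key);
		return key;
	}
	return -1;	/* no key left */
}

/* Free a key for the whole process; no thread's rights are touched. */
static void model_free(struct model_proc *p, int key)
{
	assert(key >= 0 && key < MODEL_NR_KEYS);
	p->alloc_map &= ~(1u << key);
}

/* A new thread starts with a copy of its creator's rights. */
static struct model_thread model_clone(const struct model_thread *parent)
{
	return *parent;
}
```

Replaying the sequence in this model (allocate with full access, clone,
free, allocate again with MODEL_DISABLE_ACCESS) hands back the same key
number, and the cloned thread still holds the old, permissive bits for
it. That leftover state is what the cleanup steps above are meant to
prevent.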


-- 
Ram Pai

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 44/51] selftest/vm: powerpc implementation for generic abstraction
  2017-11-06  8:57   ` Ram Pai
  (?)
  (?)
@ 2017-11-09 18:47     ` leitao
  -1 siblings, 0 replies; 197+ messages in thread
From: Breno Leitao @ 2017-11-09 18:47 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

Hi Ram,

On Mon, Nov 06, 2017 at 12:57:36AM -0800, Ram Pai wrote:
> @@ -206,12 +209,14 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
>  
>  	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
>  	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
> -	fpregset = uctxt->uc_mcontext.fpregs;
> -	fpregs = (void *)fpregset;

Since you have now removed all references to fpregset, you probably want
to remove the declaration of that variable above as well.

> @@ -219,20 +224,21 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
>  	 * state.  We just assume that it is here.
>  	 */
>  	fpregs += 0x70;
> -#endif
> -	pkey_reg_offset = pkey_reg_xstate_offset();

With this change, you removed all references to the variable
pkey_reg_offset, so its declaration could be removed as well.

> -	*(u64 *)pkey_reg_ptr = 0x00000000;
> +	dprintf1("si_pkey from siginfo: %lx\n", si_pkey);
> +#if defined(__i386__) || defined(__x86_64__) /* arch */
> +	dprintf1("signal pkey_reg from xsave: %016lx\n", *pkey_reg_ptr);
> +	*(u64 *)pkey_reg_ptr &= reset_bits(si_pkey, PKEY_DISABLE_ACCESS);
> +#elif __powerpc64__

Since the variable pkey_reg_ptr is only used in Intel code (inside
#ifdefs), you probably want to #ifdef the variable declaration too, to
avoid triggering an "unused variable" warning on non-Intel machines.
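
A minimal sketch of that suggestion, outside the selftest itself: the
pointer is declared inside the same preprocessor branch that uses it, so
other builds never see an unused variable. The function name
model_fixup_reg and the bit layout (two bits per key, access-disable in
the low bit) are assumptions made for illustration, not the selftest's
actual helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: on x86 builds, clear the access-disable bit of
 * `si_pkey` in a saved register image; elsewhere, return it unchanged.
 */
uint64_t model_fixup_reg(uint64_t reg, int si_pkey)
{
	assert(si_pkey >= 0 && si_pkey < 16);
#if defined(__i386__) || defined(__x86_64__)
	/* Declared only in the branch that uses it. */
	uint64_t *pkey_reg_ptr = &reg;

	*pkey_reg_ptr &= ~(UINT64_C(1) << (2 * si_pkey));
#else
	(void)si_pkey;	/* nothing to patch on other architectures here */
#endif
	return reg;
}
```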

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-07  1:22     ` Ram Pai
  (?)
  (?)
@ 2017-11-09 22:23       ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-09 22:23 UTC (permalink / raw)
  To: Florian Weimer
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

On Mon, Nov 06, 2017 at 05:22:18PM -0800, Ram Pai wrote:
> On Mon, Nov 06, 2017 at 10:28:41PM +0100, Florian Weimer wrote:
> > * Ram Pai:
> > 
> > > Testing:
> > > -------
> > > This patch series has passed all the protection key
> > > tests available in the selftest directory.The
> > > tests are updated to work on both x86 and powerpc.
> > > The selftests have passed on x86 and powerpc hardware.
> > 
....snip....

> > What about siglongjmp from a signal handler?
> 
> On powerpc there is some relief.  the permissions on a key can be
> modified from anywhere, including from the signal handler, and the
> effect will be immediate.  You dont have to wait till the
> signal handler returns for the key permissions to be restore.
> 
> also after return from the sigsetjmp();
> possibly caused by siglongjmp(), the program can restore the permission
> on any key.
> 
> Atleast that is my theory. Can you give me a testcase; if you have one
> handy.
> 
> > 
> >   <https://sourceware.org/bugzilla/show_bug.cgi?id=22396>
> > 

Reading through the bug report, you mention the following:
"The application may not be able to save and restore the protection bits
for all keys because the kernel API does not actually specify that the
set of keys is a small, fixed set."

What exact kernel API do you need? This patch set exposes the total
number of keys and max keys through sysfs.
https://marc.info/?l=linux-kernel&m=150995950219669&w=2

Is this sufficient, or do you need something else?

RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 44/51] selftest/vm: powerpc implementation for generic abstraction
  2017-11-09 18:47     ` leitao
  (?)
  (?)
@ 2017-11-09 23:37       ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-09 23:37 UTC (permalink / raw)
  To: Breno Leitao
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

On Thu, Nov 09, 2017 at 04:47:15PM -0200, Breno Leitao wrote:
> Hi Ram,
> 
> On Mon, Nov 06, 2017 at 12:57:36AM -0800, Ram Pai wrote:
> > @@ -206,12 +209,14 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
> >  
> >  	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
> >  	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
> > -	fpregset = uctxt->uc_mcontext.fpregs;
> > -	fpregs = (void *)fpregset;
> 
> Since you removed all references for fpregset now, you probably want to
> remove the declaration of the variable above.

fpregs is still needed.

> 
> > @@ -219,20 +224,21 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
> >  	 * state.  We just assume that it is here.
> >  	 */
> >  	fpregs += 0x70;
> > -#endif
> > -	pkey_reg_offset = pkey_reg_xstate_offset();
> 
> With this code, you removed all the reference for variable
> pkey_reg_offset, thus, its declaration could be removed also.

yes. will fix it.

> 
> > -	*(u64 *)pkey_reg_ptr = 0x00000000;
> > +	dprintf1("si_pkey from siginfo: %lx\n", si_pkey);
> > +#if defined(__i386__) || defined(__x86_64__) /* arch */
> > +	dprintf1("signal pkey_reg from xsave: %016lx\n", *pkey_reg_ptr);
> > +	*(u64 *)pkey_reg_ptr &= reset_bits(si_pkey, PKEY_DISABLE_ACCESS);
> > +#elif __powerpc64__
> 
> Since the variable pkey_reg_ptr is only used for Intel code (inside
> #ifdefs), you probably want to #ifdef the variable declaration also,
> avoid triggering "unused variable" warning on non-Intel machines.

Yes. Actually, it will trigger the warning on Intel machines. Fixed it.

Thanks Breno!
RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 44/51] selftest/vm: powerpc implementation for generic abstraction
  2017-11-09 23:37       ` linuxram
  (?)
  (?)
@ 2017-11-10 11:36         ` leitao
  -1 siblings, 0 replies; 197+ messages in thread
From: Breno Leitao @ 2017-11-10 11:36 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

Hi Ram,

On Thu, Nov 09, 2017 at 03:37:46PM -0800, Ram Pai wrote:
> On Thu, Nov 09, 2017 at 04:47:15PM -0200, Breno Leitao wrote:
> > On Mon, Nov 06, 2017 at 12:57:36AM -0800, Ram Pai wrote:
> > > @@ -206,12 +209,14 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
> > >  
> > >  	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
> > >  	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
> > > -	fpregset = uctxt->uc_mcontext.fpregs;
> > > -	fpregs = (void *)fpregset;
> > 
> > Since you removed all references for fpregset now, you probably want to
> > remove the declaration of the variable above.
> 
> fpregs is still needed.

Right, fpregs is still needed, but fpregset is not. Every reference to
that variable was removed by your patch.

Grepping for this identifier in a tree with your patches applied, I see
only the declaration:

 $ grep fpregset protection_keys.c 
 fpregset_t fpregset;
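
A minimal sketch of the declaration-scoping point raised in this subthread,
assuming a hypothetical helper (handle_fault_sketch() is not a function in
protection_keys.c): a variable that only one architecture's branch uses is
declared inside that branch, so no unused declaration is left behind on the
other architectures.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the per-arch part of a signal handler.
 * The name and signature are assumptions made for illustration.
 * si_pkey is assumed to be a small non-negative key number.
 */
uint64_t handle_fault_sketch(uint64_t *saved_reg, int si_pkey)
{
#if defined(__i386__) || defined(__x86_64__)
	/* Declared inside the only branch that uses it. */
	uint64_t *pkey_reg_ptr = saved_reg;

	/* Clear the access-disable bit (bit 2 * pkey) of the faulting key. */
	*pkey_reg_ptr &= ~(UINT64_C(1) << (2 * si_pkey));
	return *pkey_reg_ptr;
#else
	/* In this sketch other architectures leave the saved value alone. */
	(void)si_pkey;
	return *saved_reg;
#endif
}
```

Building with -Wall (which enables -Wunused-variable in GCC) is one way to
catch leftovers such as the fpregset declaration shown by the grep.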

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-06  8:56 ` Ram Pai
                     ` (2 preceding siblings ...)
  (?)
@ 2017-11-10 18:10   ` christophe.leroy
  -1 siblings, 0 replies; 197+ messages in thread
From: Christophe LEROY @ 2017-11-10 18:10 UTC (permalink / raw)
  To: Ram Pai, mpe, mingo, akpm, corbet, arnd
  Cc: linux-arch, ebiederm, linux-doc, x86, dave.hansen, linux-kernel,
	mhocko, linux-mm, paulus, aneesh.kumar, linux-kselftest,
	bauerman, linuxppc-dev, khandual

Hi

On 06/11/2017 at 09:56, Ram Pai wrote:
> Memory protection keys enable applications to protect its
> address space from inadvertent access from or corruption
> by itself.
> 
> These patches along with the pte-bit freeing patch series
> enables the protection key feature on powerpc; 4k and 64k
> hashpage kernels. It also changes the generic and x86
> code to expose memkey features through sysfs. Finally
> testcases and Documentation is updated.
> 
> All patches can be found at --
> https://github.com/rampai/memorykeys.git memkey.v9

As far as I can see, you are focusing the implementation on 64-bit
powerpc. This could also be implemented on 32-bit powerpc. For instance,
the 8xx has MMU Access Protection Registers, which can be used to define
16 domains and could, I think, be used to implement protection keys.

Of course, the challenge after that would be to find 4 spare PTE bits.
I'm sure we can find them on the 8xx: at least when using 16k pages we
already have 2 bits available, and by merging PAGE_SHARED and PAGE_USER
and reducing PAGE_RO to a single bit we can get the 4 spare bits.

Therefore I think it would be great if you could implement a framework
common to both PPC32 and PPC64.

Christophe
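
To make the bit budget concrete: 16 domains need a 4-bit key per page,
which is why 4 spare PTE bits are required. Below is a minimal sketch of
packing such a key into a 32-bit PTE word. The shift, the mask and both
helper names are hypothetical and do not reflect the real 8xx PTE layout.

```c
#include <stdint.h>

/* Assumed layout for illustration only: 4 contiguous bits starting at bit 4. */
#define SKETCH_PTE_PKEY_SHIFT 4
#define SKETCH_PTE_PKEY_MASK  (UINT32_C(0xf) << SKETCH_PTE_PKEY_SHIFT)

/* Store a key (0..15) in the PTE without disturbing the other bits. */
uint32_t pte_set_pkey_sketch(uint32_t pte, unsigned int pkey)
{
	uint32_t field = ((uint32_t)pkey << SKETCH_PTE_PKEY_SHIFT) &
			 SKETCH_PTE_PKEY_MASK;

	return (pte & ~SKETCH_PTE_PKEY_MASK) | field;
}

/* Read the key back out of the PTE. */
unsigned int pte_get_pkey_sketch(uint32_t pte)
{
	return (pte & SKETCH_PTE_PKEY_MASK) >> SKETCH_PTE_PKEY_SHIFT;
}
```

If the spare bits turn out not to be contiguous, the same two helpers would
have to gather and scatter them, but the 4-bit requirement stays the same.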

> 
> The overall idea:
> -----------------
>   A process allocates a key and associates it with
>   an address range within its address space.
>   The process then can dynamically set read/write
>   permissions on the key without involving the
>   kernel. Any code that violates the permissions
>   of the address space; as defined by its associated
>   key, will receive a segmentation fault.
> 
> This patch series enables the feature on PPC64 HPTE
> platform.
> 
> ISA3.0 section 5.7.13 describes the detailed
> specifications.
> 
> 
> Highlevel view of the design:
> ---------------------------
> When an application associates a key with a address
> address range, program the key in the Linux PTE.
> When the MMU detects a page fault, allocate a hash
> page and program the key into HPTE. And finally
> when the MMU detects a key violation; due to
> invalid application access, invoke the registered
> signal handler and provide the violated key number.
> 
> 
> Testing:
> -------
> This patch series has passed all the protection key
> tests available in the selftest directory.The
> tests are updated to work on both x86 and powerpc.
> The selftests have passed on x86 and powerpc hardware.
> 
> History:
> -------
> version v9:
> 	(1) used jump-labels to optimize code
> 		-- Balbir
> 	(2) fixed a register initialization bug noted
> 		by Balbir
> 	(3) fixed inappropriate use of paca to pass
> 		siginfo and keys to signal handler
> 	(4) Cleanup of comment style not to be right
> 		justified -- mpe
> 	(5) restructured the patches to depend on the
> 		availability of VM_PKEY_BIT4 in
> 		include/linux/mm.h
> 	(6) Incorporated comments from Dave Hansen
> 		towards changes to selftest and got
> 		them tested on x86.
> 
> version v8:
> 	(1) Contents of the AMR register withdrawn from
> 	the siginfo structure. Applications can always
> 	read the AMR register.
> 	(2) AMR/IAMR/UAMOR are now available through
> 		ptrace system call. -- thanks to Thiago
> 	(3) code changes to handle legacy power cpus
> 	that do not support execute-disable.
> 	(4) incorporates many code improvement
> 		suggestions.
> 
> version v7:
> 	(1) refers to device tree property to enable
> 		protection keys.
> 	(2) adds 4K PTE support.
> 	(3) fixes a couple of bugs noticed by Thiago
> 	(4) decouples this patch series from arch-
> 	 independent code. This patch series can
> 	 now stand by itself, with one kludge
> 	patch(2).
> 
> version v6:
> 	(1) selftest changes are broken down into 20
> 		incremental patches.
> 	(2) A separate key allocation mask that
> 		includes PKEY_DISABLE_EXECUTE is
> 		added for powerpc
> 	(3) pkey feature is enabled for 64K HPT case
> 		only. RPT and 4k HPT is disabled.
> 	(4) Documentation is updated to better
> 		capture the semantics.
> 	(5) introduced arch_pkeys_enabled() to find
> 		if an arch enables pkeys. Correspond-
> 		ing change the logic that displays
> 		key value in smaps.
> 	(6) code rearranged in many places based on
> 		comments from Dave Hansen, Balbir,
> 		Anshuman.	
> 	(7) fixed one bug where a bogus key could be
> 		associated successfully in
> 		pkey_mprotect().
> 
> version v5:
> 	(1) reverted back to the old design -- store
> 	 the key in the pte, instead of bypassing
> 	 it. The v4 design slowed down the hash
> 	 page path.
> 	(2) detects key violation when kernel is told
> 		to access user pages.
> 	(3) further refined the patches into smaller
> 		consumable units
> 	(4) page faults handlers captures the fault-
> 		ing key
> 	 from the pte instead of the vma. This
> 	 closes a race between where the key
> 	 update in the vma and a key fault caused
> 	 by the key programmed in the pte.
> 	(5) a key created with access-denied should
> 	 also set it up to deny write. Fixed it.
> 	(6) protection-key number is displayed in
>   		smaps the x86 way.
> 
> version v4:
> 	(1) patches no more depend on the pte bits
> 		to program the hpte
> 			-- comment by Balbir
> 	(2) documentation updates
> 	(3) fixed a bug in the selftest.
> 	(4) unlike x86, powerpc lets signal handler
> 		change key permission bits; the
> 		change will persist across signal
> 		handler boundaries. Earlier we
> 		allowed the signal handler to
> 		modify a field in the siginfo
> 		structure which would than be used
> 		by the kernel to program the key
> 		protection register (AMR)
> 		 -- resolves a issue raised by Ben.
> 		"Calls to sys_swapcontext with a
> 		made-up context will end up with a
> 		crap AMR if done by code who didn't
> 		know about that register".
> 	(5) these changes enable protection keys on
>   		4k-page kernel aswell.
> 
> version v3:
> 	(1) split the patches into smaller consumable
> 		patches.
> 	(2) added the ability to disable execute
> 		permission on a key at creation.
> 	(3) rename calc_pte_to_hpte_pkey_bits() to
> 	pte_to_hpte_pkey_bits()
> 		-- suggested by Anshuman
> 	(4) some code optimization and clarity in
> 		do_page_fault()
> 	(5) A bug fix while invalidating a hpte slot
> 		in __hash_page_4K()
> 		-- noticed by Aneesh
> 	
> 
> version v2:
> 	(1) documentation and selftest added.
>   	(2) fixed a bug in 4k hpte backed 64k pte
> 		where page invalidation was not
> 		done correctly, and initialization
> 		of second-part-of-the-pte was not
> 		done correctly if the pte was not
> 		yet Hashed with a hpte.
> 		--	Reported by Aneesh.
> 	(3) Fixed ABI breakage caused in siginfo
> 		structure.
> 		-- Reported by Anshuman.
> 	
> 
> version v1: Initial version
> 
> Ram Pai (47):
>    mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS
>      is enabled
>    mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
>    powerpc: initial pkey plumbing
>    powerpc: track allocation status of all pkeys
>    powerpc: helper function to read,write AMR,IAMR,UAMOR registers
>    powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
>    powerpc: cleanup AMR, IAMR when a key is allocated or freed
>    powerpc: implementation for arch_set_user_pkey_access()
>    powerpc: ability to create execute-disabled pkeys
>    powerpc: store and restore the pkey state across context switches
>    powerpc: introduce execute-only pkey
>    powerpc: ability to associate pkey to a vma
>    powerpc: implementation for arch_override_mprotect_pkey()
>    powerpc: map vma key-protection bits to pte key bits.
>    powerpc: Program HPTE key protection bits
>    powerpc: helper to validate key-access permissions of a pte
>    powerpc: check key protection for user page access
>    powerpc: implementation for arch_vma_access_permitted()
>    powerpc: Handle exceptions caused by pkey violation
>    powerpc: introduce get_mm_addr_key() helper
>    powerpc: Deliver SEGV signal on pkey violation
>    powerpc: Enable pkey subsystem
>    powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
>    powerpc: sys_pkey_mprotect() system call
>    powerpc: add sys_pkey_modify() system call
>    mm, x86 : introduce arch_pkeys_enabled()
>    mm: display pkey in smaps if arch_pkeys_enabled() is true
>    Documentation/x86: Move protecton key documentation to arch neutral
>      directory
>    Documentation/vm: PowerPC specific updates to memory protection keys
>    selftest/x86: Move protecton key selftest to arch neutral directory
>    selftest/vm: rename all references to pkru to a generic name
>    selftest/vm: move generic definitions to header file
>    selftest/vm: typecast the pkey register
>    selftest/vm: generic function to handle shadow key register
>    selftest/vm: fix the wrong assert in pkey_disable_set()
>    selftest/vm: fixed bugs in pkey_disable_clear()
>    selftest/vm: clear the bits in shadow reg when a pkey is freed.
>    selftest/vm: fix alloc_random_pkey() to make it really random
>    selftest/vm: introduce two arch independent abstraction
>    selftest/vm: pkey register should match shadow pkey
>    selftest/vm: generic cleanup
>    selftest/vm: powerpc implementation for generic abstraction
>    selftest/vm: fix an assertion in test_pkey_alloc_exhaust()
>    selftest/vm: associate key on a mapped page and detect access
>      violation
>    selftest/vm: associate key on a mapped page and detect write
>      violation
>    selftest/vm: detect write violation on a mapped access-denied-key
>      page
>    selftest/vm: sub-page allocator
> 
> Thiago Jung Bauermann (4):
>    powerpc/ptrace: Add memory protection key regset
>    mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
>    selftests/powerpc: Add ptrace tests for Protection Key register
>    selftests/powerpc: Add core file test for Protection Key register
> 
>   Documentation/vm/protection-keys.txt               |  161 +++
>   Documentation/x86/protection-keys.txt              |   85 --
>   arch/powerpc/Kconfig                               |   15 +
>   arch/powerpc/include/asm/book3s/64/mmu-hash.h      |    5 +
>   arch/powerpc/include/asm/book3s/64/mmu.h           |   10 +
>   arch/powerpc/include/asm/book3s/64/pgtable.h       |   42 +-
>   arch/powerpc/include/asm/bug.h                     |    1 +
>   arch/powerpc/include/asm/cputable.h                |   15 +-
>   arch/powerpc/include/asm/mman.h                    |   13 +-
>   arch/powerpc/include/asm/mmu.h                     |    9 +
>   arch/powerpc/include/asm/mmu_context.h             |   24 +
>   arch/powerpc/include/asm/pkeys.h                   |  247 ++++
>   arch/powerpc/include/asm/processor.h               |    5 +
>   arch/powerpc/include/asm/systbl.h                  |    4 +
>   arch/powerpc/include/asm/unistd.h                  |    6 +-
>   arch/powerpc/include/uapi/asm/elf.h                |    1 +
>   arch/powerpc/include/uapi/asm/mman.h               |    6 +
>   arch/powerpc/include/uapi/asm/unistd.h             |    4 +
>   arch/powerpc/kernel/entry_64.S                     |    9 +
>   arch/powerpc/kernel/process.c                      |    7 +
>   arch/powerpc/kernel/prom.c                         |   18 +
>   arch/powerpc/kernel/ptrace.c                       |   66 +
>   arch/powerpc/kernel/traps.c                        |   19 +-
>   arch/powerpc/mm/Makefile                           |    1 +
>   arch/powerpc/mm/fault.c                            |   49 +-
>   arch/powerpc/mm/hash_utils_64.c                    |   29 +
>   arch/powerpc/mm/mmu_context_book3s64.c             |    2 +
>   arch/powerpc/mm/pkeys.c                            |  463 +++++++
>   arch/x86/include/asm/mmu_context.h                 |    4 +-
>   arch/x86/include/asm/pkeys.h                       |    2 +
>   arch/x86/kernel/fpu/xstate.c                       |    5 +
>   arch/x86/kernel/setup.c                            |    8 -
>   arch/x86/mm/pkeys.c                                |    9 +
>   fs/proc/task_mmu.c                                 |   16 +-
>   include/linux/mm.h                                 |   12 +-
>   include/linux/pkeys.h                              |    7 +-
>   include/uapi/linux/elf.h                           |    1 +
>   mm/mprotect.c                                      |   88 ++
>   tools/testing/selftests/powerpc/include/reg.h      |    1 +
>   tools/testing/selftests/powerpc/ptrace/Makefile    |    5 +-
>   tools/testing/selftests/powerpc/ptrace/core-pkey.c |  438 ++++++
>   .../testing/selftests/powerpc/ptrace/ptrace-pkey.c |  443 ++++++
>   tools/testing/selftests/vm/Makefile                |    1 +
>   tools/testing/selftests/vm/pkey-helpers.h          |  405 ++++++
>   tools/testing/selftests/vm/protection_keys.c       | 1464 ++++++++++++++++++++
>   tools/testing/selftests/x86/Makefile               |    2 +-
>   tools/testing/selftests/x86/pkey-helpers.h         |  220 ---
>   tools/testing/selftests/x86/protection_keys.c      | 1395 -------------------
>   48 files changed, 4095 insertions(+), 1747 deletions(-)
>   create mode 100644 Documentation/vm/protection-keys.txt
>   delete mode 100644 Documentation/x86/protection-keys.txt
>   create mode 100644 arch/powerpc/include/asm/pkeys.h
>   create mode 100644 arch/powerpc/mm/pkeys.c
>   create mode 100644 tools/testing/selftests/powerpc/ptrace/core-pkey.c
>   create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
>   create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
>   create mode 100644 tools/testing/selftests/vm/protection_keys.c
>   delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
>   delete mode 100644 tools/testing/selftests/x86/protection_keys.c
> 

^ permalink raw reply	[flat|nested] 197+ messages in thread

* [Linux-kselftest-mirror] [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
@ 2017-11-10 18:10   ` christophe.leroy
  0 siblings, 0 replies; 197+ messages in thread
From: christophe.leroy @ 2017-11-10 18:10 UTC (permalink / raw)


Hi

On 06/11/2017 at 09:56, Ram Pai wrote:
> Memory protection keys enable applications to protect its
> address space from inadvertent access from or corruption
> by itself.
> 
> These patches along with the pte-bit freeing patch series
> enables the protection key feature on powerpc; 4k and 64k
> hashpage kernels. It also changes the generic and x86
> code to expose memkey features through sysfs. Finally
> testcases and Documentation is updated.
> 
> All patches can be found at --
> https://github.com/rampai/memorykeys.git memkey.v9

As far as I can see, you are focusing the implementation on 64-bit
powerpc. This could also be implemented on 32-bit powerpc. For instance,
the 8xx has MMU Access Protection Registers, which can be used to define
16 domains and could, I think, be used to implement protection keys.

Of course, the challenge after that would be to find 4 spare PTE bits.
I'm sure we can find them on the 8xx: at least when using 16k pages we
already have 2 bits available, and by merging PAGE_SHARED and PAGE_USER
and reducing PAGE_RO to a single bit we can get the 4 spare bits.

Therefore I think it would be great if you could implement a framework
common to both PPC32 and PPC64.

Christophe

> 
> The overall idea:
> -----------------
>   A process allocates a key and associates it with
>   an address range within its address space.
>   The process then can dynamically set read/write
>   permissions on the key without involving the
>   kernel. Any code that violates the permissions
>   of the address space; as defined by its associated
>   key, will receive a segmentation fault.
> 
> This patch series enables the feature on PPC64 HPTE
> platform.
> 
> ISA3.0 section 5.7.13 describes the detailed
> specifications.
> 
> 
> Highlevel view of the design:
> ---------------------------
> When an application associates a key with a address
> address range, program the key in the Linux PTE.
> When the MMU detects a page fault, allocate a hash
> page and program the key into HPTE. And finally
> when the MMU detects a key violation; due to
> invalid application access, invoke the registered
> signal handler and provide the violated key number.
> 
> 
> Testing:
> -------
> This patch series has passed all the protection key
> tests available in the selftest directory.The
> tests are updated to work on both x86 and powerpc.
> The selftests have passed on x86 and powerpc hardware.
> 
> History:
> -------
> version v9:
> 	(1) used jump-labels to optimize code
> 		-- Balbir
> 	(2) fixed a register initialization bug noted
> 		by Balbir
> 	(3) fixed inappropriate use of paca to pass
> 		siginfo and keys to signal handler
> 	(4) Cleanup of comment style not to be right
> 		justified -- mpe
> 	(5) restructured the patches to depend on the
> 		availability of VM_PKEY_BIT4 in
> 		include/linux/mm.h
> 	(6) Incorporated comments from Dave Hansen
> 		towards changes to selftest and got
> 		them tested on x86.
> 
> version v8:
> 	(1) Contents of the AMR register withdrawn from
> 	the siginfo structure. Applications can always
> 	read the AMR register.
> 	(2) AMR/IAMR/UAMOR are now available through
> 		ptrace system call. -- thanks to Thiago
> 	(3) code changes to handle legacy power cpus
> 	that do not support execute-disable.
> 	(4) incorporates many code improvement
> 		suggestions.
> 
> version v7:
> 	(1) refers to device tree property to enable
> 		protection keys.
> 	(2) adds 4K PTE support.
> 	(3) fixes a couple of bugs noticed by Thiago
> 	(4) decouples this patch series from arch-
> 	 independent code. This patch series can
> 	 now stand by itself, with one kludge
> 	patch(2).
> 
> version v6:
> 	(1) selftest changes are broken down into 20
> 		incremental patches.
> 	(2) A separate key allocation mask that
> 		includes PKEY_DISABLE_EXECUTE is
> 		added for powerpc
> 	(3) pkey feature is enabled for 64K HPT case
> 		only. RPT and 4k HPT is disabled.
> 	(4) Documentation is updated to better
> 		capture the semantics.
> 	(5) introduced arch_pkeys_enabled() to find
> 		if an arch enables pkeys. Correspond-
> 		ing change the logic that displays
> 		key value in smaps.
> 	(6) code rearranged in many places based on
> 		comments from Dave Hansen, Balbir,
> 		Anshuman.	
> 	(7) fixed one bug where a bogus key could be
> 		associated successfully in
> 		pkey_mprotect().
> 
> version v5:
> 	(1) reverted back to the old design -- store
> 	 the key in the pte, instead of bypassing
> 	 it. The v4 design slowed down the hash
> 	 page path.
> 	(2) detects key violation when kernel is told
> 		to access user pages.
> 	(3) further refined the patches into smaller
> 		consumable units
> 	(4) page fault handlers capture the faulting
> 		key from the pte instead of the vma.
> 		This closes a race between the key
> 		update in the vma and a key fault
> 		caused by the key programmed in the pte.
> 	(5) a key created with access-denied should
> 	 also set it up to deny write. Fixed it.
> 	(6) protection-key number is displayed in
>   		smaps the x86 way.
> 
> version v4:
> 	(1) patches no longer depend on the pte bits
> 		to program the hpte
> 			-- comment by Balbir
> 	(2) documentation updates
> 	(3) fixed a bug in the selftest.
> 	(4) unlike x86, powerpc lets signal handler
> 		change key permission bits; the
> 		change will persist across signal
> 		handler boundaries. Earlier we
> 		allowed the signal handler to
> 		modify a field in the siginfo
> 		structure which would then be used
> 		by the kernel to program the key
> 		protection register (AMR)
> 		 -- resolves an issue raised by Ben.
> 		"Calls to sys_swapcontext with a
> 		made-up context will end up with a
> 		crap AMR if done by code who didn't
> 		know about that register".
> 	(5) these changes enable protection keys on
>   		4k-page kernel as well.
> 
> version v3:
> 	(1) split the patches into smaller consumable
> 		patches.
> 	(2) added the ability to disable execute
> 		permission on a key at creation.
> 	(3) rename calc_pte_to_hpte_pkey_bits() to
> 	pte_to_hpte_pkey_bits()
> 		-- suggested by Anshuman
> 	(4) some code optimization and clarity in
> 		do_page_fault()
> 	(5) A bug fix while invalidating a hpte slot
> 		in __hash_page_4K()
> 		-- noticed by Aneesh
> 	
> 
> version v2:
> 	(1) documentation and selftest added.
>   	(2) fixed a bug in 4k hpte backed 64k pte
> 		where page invalidation was not
> 		done correctly, and initialization
> 		of second-part-of-the-pte was not
> 		done correctly if the pte was not
> 		yet Hashed with a hpte.
> 		--	Reported by Aneesh.
> 	(3) Fixed ABI breakage caused in siginfo
> 		structure.
> 		-- Reported by Anshuman.
> 	
> 
> version v1: Initial version
> 
> Ram Pai (47):
>    mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS
>      is enabled
>    mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
>    powerpc: initial pkey plumbing
>    powerpc: track allocation status of all pkeys
>    powerpc: helper function to read,write AMR,IAMR,UAMOR registers
>    powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
>    powerpc: cleanup AMR, IAMR when a key is allocated or freed
>    powerpc: implementation for arch_set_user_pkey_access()
>    powerpc: ability to create execute-disabled pkeys
>    powerpc: store and restore the pkey state across context switches
>    powerpc: introduce execute-only pkey
>    powerpc: ability to associate pkey to a vma
>    powerpc: implementation for arch_override_mprotect_pkey()
>    powerpc: map vma key-protection bits to pte key bits.
>    powerpc: Program HPTE key protection bits
>    powerpc: helper to validate key-access permissions of a pte
>    powerpc: check key protection for user page access
>    powerpc: implementation for arch_vma_access_permitted()
>    powerpc: Handle exceptions caused by pkey violation
>    powerpc: introduce get_mm_addr_key() helper
>    powerpc: Deliver SEGV signal on pkey violation
>    powerpc: Enable pkey subsystem
>    powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
>    powerpc: sys_pkey_mprotect() system call
>    powerpc: add sys_pkey_modify() system call
>    mm, x86 : introduce arch_pkeys_enabled()
>    mm: display pkey in smaps if arch_pkeys_enabled() is true
>    Documentation/x86: Move protecton key documentation to arch neutral
>      directory
>    Documentation/vm: PowerPC specific updates to memory protection keys
>    selftest/x86: Move protecton key selftest to arch neutral directory
>    selftest/vm: rename all references to pkru to a generic name
>    selftest/vm: move generic definitions to header file
>    selftest/vm: typecast the pkey register
>    selftest/vm: generic function to handle shadow key register
>    selftest/vm: fix the wrong assert in pkey_disable_set()
>    selftest/vm: fixed bugs in pkey_disable_clear()
>    selftest/vm: clear the bits in shadow reg when a pkey is freed.
>    selftest/vm: fix alloc_random_pkey() to make it really random
>    selftest/vm: introduce two arch independent abstraction
>    selftest/vm: pkey register should match shadow pkey
>    selftest/vm: generic cleanup
>    selftest/vm: powerpc implementation for generic abstraction
>    selftest/vm: fix an assertion in test_pkey_alloc_exhaust()
>    selftest/vm: associate key on a mapped page and detect access
>      violation
>    selftest/vm: associate key on a mapped page and detect write
>      violation
>    selftest/vm: detect write violation on a mapped access-denied-key
>      page
>    selftest/vm: sub-page allocator
> 
> Thiago Jung Bauermann (4):
>    powerpc/ptrace: Add memory protection key regset
>    mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
>    selftests/powerpc: Add ptrace tests for Protection Key register
>    selftests/powerpc: Add core file test for Protection Key register
> 
>   Documentation/vm/protection-keys.txt               |  161 +++
>   Documentation/x86/protection-keys.txt              |   85 --
>   arch/powerpc/Kconfig                               |   15 +
>   arch/powerpc/include/asm/book3s/64/mmu-hash.h      |    5 +
>   arch/powerpc/include/asm/book3s/64/mmu.h           |   10 +
>   arch/powerpc/include/asm/book3s/64/pgtable.h       |   42 +-
>   arch/powerpc/include/asm/bug.h                     |    1 +
>   arch/powerpc/include/asm/cputable.h                |   15 +-
>   arch/powerpc/include/asm/mman.h                    |   13 +-
>   arch/powerpc/include/asm/mmu.h                     |    9 +
>   arch/powerpc/include/asm/mmu_context.h             |   24 +
>   arch/powerpc/include/asm/pkeys.h                   |  247 ++++
>   arch/powerpc/include/asm/processor.h               |    5 +
>   arch/powerpc/include/asm/systbl.h                  |    4 +
>   arch/powerpc/include/asm/unistd.h                  |    6 +-
>   arch/powerpc/include/uapi/asm/elf.h                |    1 +
>   arch/powerpc/include/uapi/asm/mman.h               |    6 +
>   arch/powerpc/include/uapi/asm/unistd.h             |    4 +
>   arch/powerpc/kernel/entry_64.S                     |    9 +
>   arch/powerpc/kernel/process.c                      |    7 +
>   arch/powerpc/kernel/prom.c                         |   18 +
>   arch/powerpc/kernel/ptrace.c                       |   66 +
>   arch/powerpc/kernel/traps.c                        |   19 +-
>   arch/powerpc/mm/Makefile                           |    1 +
>   arch/powerpc/mm/fault.c                            |   49 +-
>   arch/powerpc/mm/hash_utils_64.c                    |   29 +
>   arch/powerpc/mm/mmu_context_book3s64.c             |    2 +
>   arch/powerpc/mm/pkeys.c                            |  463 +++++++
>   arch/x86/include/asm/mmu_context.h                 |    4 +-
>   arch/x86/include/asm/pkeys.h                       |    2 +
>   arch/x86/kernel/fpu/xstate.c                       |    5 +
>   arch/x86/kernel/setup.c                            |    8 -
>   arch/x86/mm/pkeys.c                                |    9 +
>   fs/proc/task_mmu.c                                 |   16 +-
>   include/linux/mm.h                                 |   12 +-
>   include/linux/pkeys.h                              |    7 +-
>   include/uapi/linux/elf.h                           |    1 +
>   mm/mprotect.c                                      |   88 ++
>   tools/testing/selftests/powerpc/include/reg.h      |    1 +
>   tools/testing/selftests/powerpc/ptrace/Makefile    |    5 +-
>   tools/testing/selftests/powerpc/ptrace/core-pkey.c |  438 ++++++
>   .../testing/selftests/powerpc/ptrace/ptrace-pkey.c |  443 ++++++
>   tools/testing/selftests/vm/Makefile                |    1 +
>   tools/testing/selftests/vm/pkey-helpers.h          |  405 ++++++
>   tools/testing/selftests/vm/protection_keys.c       | 1464 ++++++++++++++++++++
>   tools/testing/selftests/x86/Makefile               |    2 +-
>   tools/testing/selftests/x86/pkey-helpers.h         |  220 ---
>   tools/testing/selftests/x86/protection_keys.c      | 1395 -------------------
>   48 files changed, 4095 insertions(+), 1747 deletions(-)
>   create mode 100644 Documentation/vm/protection-keys.txt
>   delete mode 100644 Documentation/x86/protection-keys.txt
>   create mode 100644 arch/powerpc/include/asm/pkeys.h
>   create mode 100644 arch/powerpc/mm/pkeys.c
>   create mode 100644 tools/testing/selftests/powerpc/ptrace/core-pkey.c
>   create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
>   create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
>   create mode 100644 tools/testing/selftests/vm/protection_keys.c
>   delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
>   delete mode 100644 tools/testing/selftests/x86/protection_keys.c
> 

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
@ 2017-11-10 18:10   ` christophe.leroy
  0 siblings, 0 replies; 197+ messages in thread
From: Christophe LEROY @ 2017-11-10 18:10 UTC (permalink / raw)
  To: Ram Pai, mpe, mingo, akpm, corbet, arnd
  Cc: linux-arch, ebiederm, linux-doc, x86, dave.hansen, linux-kernel,
	mhocko, linux-mm, paulus, aneesh.kumar, linux-kselftest,
	bauerman, linuxppc-dev, khandual

Hi

On 06/11/2017 at 09:56, Ram Pai wrote:
> Memory protection keys enable an application to protect its
> address space from inadvertent access or corruption
> by itself.
> 
> These patches, along with the pte-bit freeing patch series,
> enable the protection key feature on powerpc 4k and 64k
> hashpage kernels. They also change the generic and x86
> code to expose memkey features through sysfs. Finally,
> testcases and Documentation are updated.
> 
> All patches can be found at --
> https://github.com/rampai/memorykeys.git memkey.v9

As far as I can see, you are focusing the implementation on 64-bit
powerpc. This could also be implemented on 32-bit powerpc. For instance,
the 8xx has MMU Access Protection Registers which can be used to define
16 domains and could, I think, be used for implementing protection keys.
Of course, the challenge after that would be to find 4 spare PTE bits.
I'm sure we can find them on the 8xx: at least when using 16k pages we
already have 2 bits available, and by merging PAGE_SHARED and PAGE_USER
and reducing PAGE_RO to only one bit we can get the 4 spare bits.
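
To make the bit count concrete, here is a minimal sketch of the arithmetic:
16 domains need a 4-bit key field in the PTE. The field position and all
names below (PKEY_SHIFT, pte_set_pkey(), pte_get_pkey()) are assumptions
made up for illustration. They are not the real 8xx PTE layout and not
anything from this series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: a 4-bit key field somewhere in a 32-bit PTE. */
#define PKEY_BITS	4u
#define PKEY_COUNT	(1u << PKEY_BITS)	/* 16 keys, one per domain */
#define PKEY_SHIFT	4u			/* assumed position, illustration only */
#define PKEY_MASK	((uint32_t)(PKEY_COUNT - 1u) << PKEY_SHIFT)

/* Store a key in the PTE without disturbing the other bits. */
static inline uint32_t pte_set_pkey(uint32_t pte, unsigned int key)
{
	assert(key < PKEY_COUNT);
	return (pte & ~PKEY_MASK) | ((uint32_t)key << PKEY_SHIFT);
}

/* Extract the key that was stored in the PTE. */
static inline unsigned int pte_get_pkey(uint32_t pte)
{
	return (pte & PKEY_MASK) >> PKEY_SHIFT;
}
```

With fewer than 4 spare bits the same helpers would still work, but only
for a correspondingly smaller number of keys (for example 4 keys with 2 bits).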

Therefore I think it would be great if you could implement a framework 
common to both PPC32 and PPC64.

Christophe

> 
> The overall idea:
> -----------------
>   A process allocates a key and associates it with
>   an address range within its address space.
>   The process then can dynamically set read/write
>   permissions on the key without involving the
>   kernel. Any code that violates the permissions
>   of the address space, as defined by its associated
>   key, will receive a segmentation fault.
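
The idea quoted above can be sketched in user-space terms as a per-key
permission register. This is only a simplified model, assuming a 64-bit
register with two deny bits per key and key 0 in the most significant
pair. The names (key_shift(), key_set_rights(), key_allows()) are
hypothetical and the model is not the code from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model: 64-bit register, 2 deny bits per key (assumption). */
#define REG_BITS	64u
#define BITS_PER_KEY	2u
#define KEY_COUNT	(REG_BITS / BITS_PER_KEY)	/* 32 keys in this model */
#define DENY_READ	0x1ull
#define DENY_WRITE	0x2ull

/* Bit offset of a key's two-bit field; key 0 sits in the top two bits. */
static inline unsigned int key_shift(unsigned int key)
{
	assert(key < KEY_COUNT);
	return REG_BITS - (key + 1u) * BITS_PER_KEY;
}

/* Replace the deny bits of one key, leaving all other keys untouched. */
static inline uint64_t key_set_rights(uint64_t reg, unsigned int key,
				      uint64_t deny)
{
	unsigned int shift = key_shift(key);

	reg &= ~((DENY_READ | DENY_WRITE) << shift);
	return reg | ((deny & (DENY_READ | DENY_WRITE)) << shift);
}

/* Would an access to a page tagged with this key be permitted? */
static inline bool key_allows(uint64_t reg, unsigned int key, bool is_write)
{
	uint64_t deny = (reg >> key_shift(key)) & (DENY_READ | DENY_WRITE);

	return !(deny & (is_write ? DENY_WRITE : DENY_READ));
}
```

Changing rights is then a plain register update, which is why no system
call is needed once a key has been attached to an address range.
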
> 
> This patch series enables the feature on PPC64 HPTE
> platform.
> 
> ISA3.0 section 5.7.13 describes the detailed
> specifications.
> 
> 
> High-level view of the design:
> ---------------------------
> When an application associates a key with an
> address range, program the key in the Linux PTE.
> When the MMU detects a page fault, allocate a hash
> page and program the key into HPTE. And finally
> when the MMU detects a key violation, due to
> invalid application access, invoke the registered
> signal handler and provide the violated key number.
> 
> 
> Testing:
> -------
> This patch series has passed all the protection key
> tests available in the selftest directory. The
> tests are updated to work on both x86 and powerpc.
> The selftests have passed on x86 and powerpc hardware.
> 
> History:
> -------
> version v9:
> 	(1) used jump-labels to optimize code
> 		-- Balbir
> 	(2) fixed a register initialization bug noted
> 		by Balbir
> 	(3) fixed inappropriate use of paca to pass
> 		siginfo and keys to signal handler
> 	(4) Cleanup of comment style not to be right
> 		justified -- mpe
> 	(5) restructured the patches to depend on the
> 		availability of VM_PKEY_BIT4 in
> 		include/linux/mm.h
> 	(6) Incorporated comments from Dave Hansen
> 		towards changes to selftest and got
> 		them tested on x86.
> 
> version v8:
> 	(1) Contents of the AMR register withdrawn from
> 	the siginfo structure. Applications can always
> 	read the AMR register.
> 	(2) AMR/IAMR/UAMOR are now available through
> 		ptrace system call. -- thanks to Thiago
> 	(3) code changes to handle legacy power cpus
> 	that do not support execute-disable.
> 	(4) incorporates many code improvement
> 		suggestions.
> 
> version v7:
> 	(1) refers to device tree property to enable
> 		protection keys.
> 	(2) adds 4K PTE support.
> 	(3) fixes a couple of bugs noticed by Thiago
> 	(4) decouples this patch series from arch-
> 	 independent code. This patch series can
> 	 now stand by itself, with one kludge
> 	patch(2).
> 
> version v6:
> 	(1) selftest changes are broken down into 20
> 		incremental patches.
> 	(2) A separate key allocation mask that
> 		includes PKEY_DISABLE_EXECUTE is
> 		added for powerpc
> 	(3) pkey feature is enabled for 64K HPT case
> 		only. RPT and 4k HPT are disabled.
> 	(4) Documentation is updated to better
> 		capture the semantics.
> 	(5) introduced arch_pkeys_enabled() to find
> 		if an arch enables pkeys. Correspondingly
> 		changed the logic that displays the
> 		key value in smaps.
> 	(6) code rearranged in many places based on
> 		comments from Dave Hansen, Balbir,
> 		Anshuman.	
> 	(7) fixed one bug where a bogus key could be
> 		associated successfully in
> 		pkey_mprotect().
> 
> version v5:
> 	(1) reverted back to the old design -- store
> 	 the key in the pte, instead of bypassing
> 	 it. The v4 design slowed down the hash
> 	 page path.
> 	(2) detects key violation when kernel is told
> 		to access user pages.
> 	(3) further refined the patches into smaller
> 		consumable units
> 	(4) page fault handlers capture the faulting
> 		key from the pte instead of the vma.
> 		This closes a race between the key
> 		update in the vma and a key fault
> 		caused by the key programmed in the pte.
> 	(5) a key created with access-denied should
> 	 also set it up to deny write. Fixed it.
> 	(6) protection-key number is displayed in
>   		smaps the x86 way.
> 
> version v4:
> 	(1) patches no longer depend on the pte bits
> 		to program the hpte
> 			-- comment by Balbir
> 	(2) documentation updates
> 	(3) fixed a bug in the selftest.
> 	(4) unlike x86, powerpc lets signal handler
> 		change key permission bits; the
> 		change will persist across signal
> 		handler boundaries. Earlier we
> 		allowed the signal handler to
> 		modify a field in the siginfo
> 		structure which would then be used
> 		by the kernel to program the key
> 		protection register (AMR)
> 		 -- resolves an issue raised by Ben.
> 		"Calls to sys_swapcontext with a
> 		made-up context will end up with a
> 		crap AMR if done by code who didn't
> 		know about that register".
> 	(5) these changes enable protection keys on
>   		4k-page kernel as well.
> 
> version v3:
> 	(1) split the patches into smaller consumable
> 		patches.
> 	(2) added the ability to disable execute
> 		permission on a key at creation.
> 	(3) rename calc_pte_to_hpte_pkey_bits() to
> 	pte_to_hpte_pkey_bits()
> 		-- suggested by Anshuman
> 	(4) some code optimization and clarity in
> 		do_page_fault()
> 	(5) A bug fix while invalidating a hpte slot
> 		in __hash_page_4K()
> 		-- noticed by Aneesh
> 	
> 
> version v2:
> 	(1) documentation and selftest added.
>   	(2) fixed a bug in 4k hpte backed 64k pte
> 		where page invalidation was not
> 		done correctly, and initialization
> 		of second-part-of-the-pte was not
> 		done correctly if the pte was not
> 		yet Hashed with a hpte.
> 		--	Reported by Aneesh.
> 	(3) Fixed ABI breakage caused in siginfo
> 		structure.
> 		-- Reported by Anshuman.
> 	
> 
> version v1: Initial version
> 
> Ram Pai (47):
>    mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS
>      is enabled
>    mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
>    powerpc: initial pkey plumbing
>    powerpc: track allocation status of all pkeys
>    powerpc: helper function to read,write AMR,IAMR,UAMOR registers
>    powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
>    powerpc: cleanup AMR, IAMR when a key is allocated or freed
>    powerpc: implementation for arch_set_user_pkey_access()
>    powerpc: ability to create execute-disabled pkeys
>    powerpc: store and restore the pkey state across context switches
>    powerpc: introduce execute-only pkey
>    powerpc: ability to associate pkey to a vma
>    powerpc: implementation for arch_override_mprotect_pkey()
>    powerpc: map vma key-protection bits to pte key bits.
>    powerpc: Program HPTE key protection bits
>    powerpc: helper to validate key-access permissions of a pte
>    powerpc: check key protection for user page access
>    powerpc: implementation for arch_vma_access_permitted()
>    powerpc: Handle exceptions caused by pkey violation
>    powerpc: introduce get_mm_addr_key() helper
>    powerpc: Deliver SEGV signal on pkey violation
>    powerpc: Enable pkey subsystem
>    powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
>    powerpc: sys_pkey_mprotect() system call
>    powerpc: add sys_pkey_modify() system call
>    mm, x86 : introduce arch_pkeys_enabled()
>    mm: display pkey in smaps if arch_pkeys_enabled() is true
>    Documentation/x86: Move protecton key documentation to arch neutral
>      directory
>    Documentation/vm: PowerPC specific updates to memory protection keys
>    selftest/x86: Move protecton key selftest to arch neutral directory
>    selftest/vm: rename all references to pkru to a generic name
>    selftest/vm: move generic definitions to header file
>    selftest/vm: typecast the pkey register
>    selftest/vm: generic function to handle shadow key register
>    selftest/vm: fix the wrong assert in pkey_disable_set()
>    selftest/vm: fixed bugs in pkey_disable_clear()
>    selftest/vm: clear the bits in shadow reg when a pkey is freed.
>    selftest/vm: fix alloc_random_pkey() to make it really random
>    selftest/vm: introduce two arch independent abstraction
>    selftest/vm: pkey register should match shadow pkey
>    selftest/vm: generic cleanup
>    selftest/vm: powerpc implementation for generic abstraction
>    selftest/vm: fix an assertion in test_pkey_alloc_exhaust()
>    selftest/vm: associate key on a mapped page and detect access
>      violation
>    selftest/vm: associate key on a mapped page and detect write
>      violation
>    selftest/vm: detect write violation on a mapped access-denied-key
>      page
>    selftest/vm: sub-page allocator
> 
> Thiago Jung Bauermann (4):
>    powerpc/ptrace: Add memory protection key regset
>    mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
>    selftests/powerpc: Add ptrace tests for Protection Key register
>    selftests/powerpc: Add core file test for Protection Key register
> 
>   Documentation/vm/protection-keys.txt               |  161 +++
>   Documentation/x86/protection-keys.txt              |   85 --
>   arch/powerpc/Kconfig                               |   15 +
>   arch/powerpc/include/asm/book3s/64/mmu-hash.h      |    5 +
>   arch/powerpc/include/asm/book3s/64/mmu.h           |   10 +
>   arch/powerpc/include/asm/book3s/64/pgtable.h       |   42 +-
>   arch/powerpc/include/asm/bug.h                     |    1 +
>   arch/powerpc/include/asm/cputable.h                |   15 +-
>   arch/powerpc/include/asm/mman.h                    |   13 +-
>   arch/powerpc/include/asm/mmu.h                     |    9 +
>   arch/powerpc/include/asm/mmu_context.h             |   24 +
>   arch/powerpc/include/asm/pkeys.h                   |  247 ++++
>   arch/powerpc/include/asm/processor.h               |    5 +
>   arch/powerpc/include/asm/systbl.h                  |    4 +
>   arch/powerpc/include/asm/unistd.h                  |    6 +-
>   arch/powerpc/include/uapi/asm/elf.h                |    1 +
>   arch/powerpc/include/uapi/asm/mman.h               |    6 +
>   arch/powerpc/include/uapi/asm/unistd.h             |    4 +
>   arch/powerpc/kernel/entry_64.S                     |    9 +
>   arch/powerpc/kernel/process.c                      |    7 +
>   arch/powerpc/kernel/prom.c                         |   18 +
>   arch/powerpc/kernel/ptrace.c                       |   66 +
>   arch/powerpc/kernel/traps.c                        |   19 +-
>   arch/powerpc/mm/Makefile                           |    1 +
>   arch/powerpc/mm/fault.c                            |   49 +-
>   arch/powerpc/mm/hash_utils_64.c                    |   29 +
>   arch/powerpc/mm/mmu_context_book3s64.c             |    2 +
>   arch/powerpc/mm/pkeys.c                            |  463 +++++++
>   arch/x86/include/asm/mmu_context.h                 |    4 +-
>   arch/x86/include/asm/pkeys.h                       |    2 +
>   arch/x86/kernel/fpu/xstate.c                       |    5 +
>   arch/x86/kernel/setup.c                            |    8 -
>   arch/x86/mm/pkeys.c                                |    9 +
>   fs/proc/task_mmu.c                                 |   16 +-
>   include/linux/mm.h                                 |   12 +-
>   include/linux/pkeys.h                              |    7 +-
>   include/uapi/linux/elf.h                           |    1 +
>   mm/mprotect.c                                      |   88 ++
>   tools/testing/selftests/powerpc/include/reg.h      |    1 +
>   tools/testing/selftests/powerpc/ptrace/Makefile    |    5 +-
>   tools/testing/selftests/powerpc/ptrace/core-pkey.c |  438 ++++++
>   .../testing/selftests/powerpc/ptrace/ptrace-pkey.c |  443 ++++++
>   tools/testing/selftests/vm/Makefile                |    1 +
>   tools/testing/selftests/vm/pkey-helpers.h          |  405 ++++++
>   tools/testing/selftests/vm/protection_keys.c       | 1464 ++++++++++++++++++++
>   tools/testing/selftests/x86/Makefile               |    2 +-
>   tools/testing/selftests/x86/pkey-helpers.h         |  220 ---
>   tools/testing/selftests/x86/protection_keys.c      | 1395 -------------------
>   48 files changed, 4095 insertions(+), 1747 deletions(-)
>   create mode 100644 Documentation/vm/protection-keys.txt
>   delete mode 100644 Documentation/x86/protection-keys.txt
>   create mode 100644 arch/powerpc/include/asm/pkeys.h
>   create mode 100644 arch/powerpc/mm/pkeys.c
>   create mode 100644 tools/testing/selftests/powerpc/ptrace/core-pkey.c
>   create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
>   create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
>   create mode 100644 tools/testing/selftests/vm/protection_keys.c
>   delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
>   delete mode 100644 tools/testing/selftests/x86/protection_keys.c
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
@ 2017-11-10 18:10   ` christophe.leroy
  0 siblings, 0 replies; 197+ messages in thread
From: Christophe LEROY @ 2017-11-10 18:10 UTC (permalink / raw)
  To: Ram Pai, mpe, mingo, akpm, corbet, arnd
  Cc: linux-arch, ebiederm, linux-doc, x86, dave.hansen, linux-kernel,
	mhocko, linux-mm, paulus, aneesh.kumar, linux-kselftest,
	bauerman, linuxppc-dev, khandual

Hi

On 06/11/2017 at 09:56, Ram Pai wrote:
> Memory protection keys enable an application to protect its
> address space from inadvertent access or corruption by the
> application itself.
> 
> These patches, along with the pte-bit freeing patch series,
> enable the protection key feature on powerpc 4k and 64k
> hashpage kernels. They also change the generic and x86
> code to expose memkey features through sysfs. Finally,
> testcases and Documentation are updated.
> 
> All patches can be found at --
> https://github.com/rampai/memorykeys.git memkey.v9

As far as I can see, you are focusing the implementation on 64-bit
powerpc. This could also be implemented on 32-bit powerpc. For instance,
the 8xx has MMU Access Protection Registers, which can be used to define
16 domains and could, I think, be used to implement protection keys.
Of course, the challenge after that would be to find 4 spare PTE bits.
I'm sure we can find them on the 8xx, at least when using 16k pages,
where we already have 2 bits available; merging PAGE_SHARED and PAGE_USER
and reducing PAGE_RO to a single bit would give us the 4 spare bits.

Therefore I think it would be great if you could implement a framework 
common to both PPC32 and PPC64.

Christophe

> 
> The overall idea:
> -----------------
>   A process allocates a key and associates it with
>   an address range within its address space.
>   The process can then dynamically set read/write
>   permissions on the key without involving the
>   kernel. Any code that violates the permissions
>   of the address space, as defined by its associated
>   key, will receive a segmentation fault.
> 
> This patch series enables the feature on PPC64 HPTE
> platform.
> 
> ISA3.0 section 5.7.13 describes the detailed
> specifications.
> 
> 
> Highlevel view of the design:
> ---------------------------
> When an application associates a key with an
> address range, program the key in the Linux PTE.
> When the MMU detects a page fault, allocate a hash
> page and program the key into the HPTE. Finally,
> when the MMU detects a key violation due to an
> invalid application access, invoke the registered
> signal handler and provide the violated key number.
> 
> 
> Testing:
> -------
> This patch series has passed all the protection key
> tests available in the selftest directory. The
> tests are updated to work on both x86 and powerpc.
> The selftests have passed on x86 and powerpc hardware.
> 
> History:
> -------
> version v9:
> 	(1) used jump-labels to optimize code
> 		-- Balbir
> 	(2) fixed a register initialization bug noted
> 		by Balbir
> 	(3) fixed inappropriate use of paca to pass
> 		siginfo and keys to signal handler
> 	(4) Cleanup of comment style not to be right
> 		justified -- mpe
> 	(5) restructured the patches to depend on the
> 		availability of VM_PKEY_BIT4 in
> 		include/linux/mm.h
> 	(6) Incorporated comments from Dave Hansen
> 		towards changes to selftest and got
> 		them tested on x86.
> 
> version v8:
> 	(1) Contents of the AMR register withdrawn from
> 	the siginfo structure. Applications can always
> 	read the AMR register.
> 	(2) AMR/IAMR/UAMOR are now available through
> 		ptrace system call. -- thanks to Thiago
> 	(3) code changes to handle legacy power cpus
> 	that do not support execute-disable.
> 	(4) incorporates many code improvement
> 		suggestions.
> 
> version v7:
> 	(1) refers to device tree property to enable
> 		protection keys.
> 	(2) adds 4K PTE support.
> 	(3) fixes a couple of bugs noticed by Thiago
> 	(4) decouples this patch series from
> 		arch-independent code. This patch series
> 		can now stand by itself, with one kludge
> 		patch (2).
> 
> version v6:
> 	(1) selftest changes are broken down into 20
> 		incremental patches.
> 	(2) A separate key allocation mask that
> 		includes PKEY_DISABLE_EXECUTE is
> 		added for powerpc
> 	(3) pkey feature is enabled for 64K HPT case
> 		only. RPT and 4k HPT is disabled.
> 	(4) Documentation is updated to better
> 		capture the semantics.
> 	(5) introduced arch_pkeys_enabled() to find
> 		if an arch enables pkeys. Correspondingly
> 		changed the logic that displays the
> 		key value in smaps.
> 	(6) code rearranged in many places based on
> 		comments from Dave Hansen, Balbir,
> 		Anshuman.	
> 	(7) fixed one bug where a bogus key could be
> 		associated successfully in
> 		pkey_mprotect().
> 
> version v5:
> 	(1) reverted back to the old design -- store
> 		the key in the pte, instead of bypassing
> 		it. The v4 design slowed down the hash
> 		page path.
> 	(2) detects key violation when kernel is told
> 		to access user pages.
> 	(3) further refined the patches into smaller
> 		consumable units
> 	(4) page fault handlers capture the faulting
> 		key from the pte instead of the vma. This
> 		closes a race between the key update in
> 		the vma and a key fault caused by the
> 		key programmed in the pte.
> 	(5) a key created with access-denied should
> 		also be set up to deny write. Fixed it.
> 	(6) protection-key number is displayed in
> 		smaps the x86 way.
> 
> version v4:
> 	(1) patches no more depend on the pte bits
> 		to program the hpte
> 			-- comment by Balbir
> 	(2) documentation updates
> 	(3) fixed a bug in the selftest.
> 	(4) unlike x86, powerpc lets signal handler
> 		change key permission bits; the
> 		change will persist across signal
> 		handler boundaries. Earlier we
> 		allowed the signal handler to
> 		modify a field in the siginfo
> 		structure which would then be used
> 		by the kernel to program the key
> 		protection register (AMR)
> 		 -- resolves an issue raised by Ben.
> 		"Calls to sys_swapcontext with a
> 		made-up context will end up with a
> 		crap AMR if done by code who didn't
> 		know about that register".
> 	(5) these changes enable protection keys on
> 		4k-page kernel as well.
> 
> version v3:
> 	(1) split the patches into smaller consumable
> 		patches.
> 	(2) added the ability to disable execute
> 		permission on a key at creation.
> 	(3) rename calc_pte_to_hpte_pkey_bits() to
> 	pte_to_hpte_pkey_bits()
> 		-- suggested by Anshuman
> 	(4) some code optimization and clarity in
> 		do_page_fault()
> 	(5) A bug fix while invalidating a hpte slot
> 		in __hash_page_4K()
> 		-- noticed by Aneesh
> 	
> 
> version v2:
> 	(1) documentation and selftest added.
>   	(2) fixed a bug in 4k hpte backed 64k pte
> 		where page invalidation was not
> 		done correctly, and initialization
> 		of second-part-of-the-pte was not
> 		done correctly if the pte was not
> 		yet Hashed with a hpte.
> 		--	Reported by Aneesh.
> 	(3) Fixed ABI breakage caused in siginfo
> 		structure.
> 		-- Reported by Anshuman.
> 	
> 
> version v1: Initial version
> 
> Ram Pai (47):
>    mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS
>      is enabled
>    mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
>    powerpc: initial pkey plumbing
>    powerpc: track allocation status of all pkeys
>    powerpc: helper function to read,write AMR,IAMR,UAMOR registers
>    powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
>    powerpc: cleanup AMR, IAMR when a key is allocated or freed
>    powerpc: implementation for arch_set_user_pkey_access()
>    powerpc: ability to create execute-disabled pkeys
>    powerpc: store and restore the pkey state across context switches
>    powerpc: introduce execute-only pkey
>    powerpc: ability to associate pkey to a vma
>    powerpc: implementation for arch_override_mprotect_pkey()
>    powerpc: map vma key-protection bits to pte key bits.
>    powerpc: Program HPTE key protection bits
>    powerpc: helper to validate key-access permissions of a pte
>    powerpc: check key protection for user page access
>    powerpc: implementation for arch_vma_access_permitted()
>    powerpc: Handle exceptions caused by pkey violation
>    powerpc: introduce get_mm_addr_key() helper
>    powerpc: Deliver SEGV signal on pkey violation
>    powerpc: Enable pkey subsystem
>    powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
>    powerpc: sys_pkey_mprotect() system call
>    powerpc: add sys_pkey_modify() system call
>    mm, x86 : introduce arch_pkeys_enabled()
>    mm: display pkey in smaps if arch_pkeys_enabled() is true
>    Documentation/x86: Move protecton key documentation to arch neutral
>      directory
>    Documentation/vm: PowerPC specific updates to memory protection keys
>    selftest/x86: Move protecton key selftest to arch neutral directory
>    selftest/vm: rename all references to pkru to a generic name
>    selftest/vm: move generic definitions to header file
>    selftest/vm: typecast the pkey register
>    selftest/vm: generic function to handle shadow key register
>    selftest/vm: fix the wrong assert in pkey_disable_set()
>    selftest/vm: fixed bugs in pkey_disable_clear()
>    selftest/vm: clear the bits in shadow reg when a pkey is freed.
>    selftest/vm: fix alloc_random_pkey() to make it really random
>    selftest/vm: introduce two arch independent abstraction
>    selftest/vm: pkey register should match shadow pkey
>    selftest/vm: generic cleanup
>    selftest/vm: powerpc implementation for generic abstraction
>    selftest/vm: fix an assertion in test_pkey_alloc_exhaust()
>    selftest/vm: associate key on a mapped page and detect access
>      violation
>    selftest/vm: associate key on a mapped page and detect write
>      violation
>    selftest/vm: detect write violation on a mapped access-denied-key
>      page
>    selftest/vm: sub-page allocator
> 
> Thiago Jung Bauermann (4):
>    powerpc/ptrace: Add memory protection key regset
>    mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
>    selftests/powerpc: Add ptrace tests for Protection Key register
>    selftests/powerpc: Add core file test for Protection Key register
> 
>   Documentation/vm/protection-keys.txt               |  161 +++
>   Documentation/x86/protection-keys.txt              |   85 --
>   arch/powerpc/Kconfig                               |   15 +
>   arch/powerpc/include/asm/book3s/64/mmu-hash.h      |    5 +
>   arch/powerpc/include/asm/book3s/64/mmu.h           |   10 +
>   arch/powerpc/include/asm/book3s/64/pgtable.h       |   42 +-
>   arch/powerpc/include/asm/bug.h                     |    1 +
>   arch/powerpc/include/asm/cputable.h                |   15 +-
>   arch/powerpc/include/asm/mman.h                    |   13 +-
>   arch/powerpc/include/asm/mmu.h                     |    9 +
>   arch/powerpc/include/asm/mmu_context.h             |   24 +
>   arch/powerpc/include/asm/pkeys.h                   |  247 ++++
>   arch/powerpc/include/asm/processor.h               |    5 +
>   arch/powerpc/include/asm/systbl.h                  |    4 +
>   arch/powerpc/include/asm/unistd.h                  |    6 +-
>   arch/powerpc/include/uapi/asm/elf.h                |    1 +
>   arch/powerpc/include/uapi/asm/mman.h               |    6 +
>   arch/powerpc/include/uapi/asm/unistd.h             |    4 +
>   arch/powerpc/kernel/entry_64.S                     |    9 +
>   arch/powerpc/kernel/process.c                      |    7 +
>   arch/powerpc/kernel/prom.c                         |   18 +
>   arch/powerpc/kernel/ptrace.c                       |   66 +
>   arch/powerpc/kernel/traps.c                        |   19 +-
>   arch/powerpc/mm/Makefile                           |    1 +
>   arch/powerpc/mm/fault.c                            |   49 +-
>   arch/powerpc/mm/hash_utils_64.c                    |   29 +
>   arch/powerpc/mm/mmu_context_book3s64.c             |    2 +
>   arch/powerpc/mm/pkeys.c                            |  463 +++++++
>   arch/x86/include/asm/mmu_context.h                 |    4 +-
>   arch/x86/include/asm/pkeys.h                       |    2 +
>   arch/x86/kernel/fpu/xstate.c                       |    5 +
>   arch/x86/kernel/setup.c                            |    8 -
>   arch/x86/mm/pkeys.c                                |    9 +
>   fs/proc/task_mmu.c                                 |   16 +-
>   include/linux/mm.h                                 |   12 +-
>   include/linux/pkeys.h                              |    7 +-
>   include/uapi/linux/elf.h                           |    1 +
>   mm/mprotect.c                                      |   88 ++
>   tools/testing/selftests/powerpc/include/reg.h      |    1 +
>   tools/testing/selftests/powerpc/ptrace/Makefile    |    5 +-
>   tools/testing/selftests/powerpc/ptrace/core-pkey.c |  438 ++++++
>   .../testing/selftests/powerpc/ptrace/ptrace-pkey.c |  443 ++++++
>   tools/testing/selftests/vm/Makefile                |    1 +
>   tools/testing/selftests/vm/pkey-helpers.h          |  405 ++++++
>   tools/testing/selftests/vm/protection_keys.c       | 1464 ++++++++++++++++++++
>   tools/testing/selftests/x86/Makefile               |    2 +-
>   tools/testing/selftests/x86/pkey-helpers.h         |  220 ---
>   tools/testing/selftests/x86/protection_keys.c      | 1395 -------------------
>   48 files changed, 4095 insertions(+), 1747 deletions(-)
>   create mode 100644 Documentation/vm/protection-keys.txt
>   delete mode 100644 Documentation/x86/protection-keys.txt
>   create mode 100644 arch/powerpc/include/asm/pkeys.h
>   create mode 100644 arch/powerpc/mm/pkeys.c
>   create mode 100644 tools/testing/selftests/powerpc/ptrace/core-pkey.c
>   create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
>   create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
>   create mode 100644 tools/testing/selftests/vm/protection_keys.c
>   delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
>   delete mode 100644 tools/testing/selftests/x86/protection_keys.c
> 

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys
  2017-11-10 18:10   ` christophe.leroy
  (?)
  (?)
@ 2017-11-12 20:45     ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-12 20:45 UTC (permalink / raw)
  To: Christophe LEROY
  Cc: mpe, mingo, akpm, corbet, arnd, linux-arch, ebiederm, linux-doc,
	x86, dave.hansen, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

On Fri, Nov 10, 2017 at 07:10:31PM +0100, Christophe LEROY wrote:
> Hi
> 
> On 06/11/2017 at 09:56, Ram Pai wrote:
> >Memory protection keys enable an application to protect its
> >address space from inadvertent access or corruption by the
> >application itself.
> >
> >These patches, along with the pte-bit freeing patch series,
> >enable the protection key feature on powerpc 4k and 64k
> >hashpage kernels. They also change the generic and x86
> >code to expose memkey features through sysfs. Finally,
> >testcases and Documentation are updated.
> >
> >All patches can be found at --
> >https://github.com/rampai/memorykeys.git memkey.v9
> 
> As far as I can see, you are focusing the implementation on 64-bit
> powerpc. This could also be implemented on 32-bit powerpc. For
> instance, the 8xx has MMU Access Protection Registers, which can be
> used to define 16 domains and could, I think, be used to implement
> protection keys.

I was assuming that no 32-bit powerpc systems support memory keys.
Sounds like that was a wrong assumption.

However, I think the framework as it stands today should work. All
the functionality is captured in pkeys.c and pkeys.h, which are generic
ppc files.  It's just a matter of providing the 32-bit implementation
for whichever sub-arch supports it.  Can you point me to problem
areas? I will fix them.

Thanks for your interest. Together we should be able to make it
happen.


> Of course, the challenge after that would be to find 4 spare PTE
> bits. I'm sure we can find them on the 8xx, at least when using 16k
> pages, where we already have 2 bits available; merging PAGE_SHARED
> and PAGE_USER and reducing PAGE_RO to a single bit would give us the
> 4 spare bits.

Yes. This needs to happen in parallel.
RP

> 
> Therefore I think it would be great if you could implement a
> framework common to both PPC32 and PPC64.

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 23/51] powerpc: Enable pkey subsystem
  2017-11-06  8:57   ` Ram Pai
  (?)
  (?)
@ 2017-11-13  0:54     ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-11-13  0:54 UTC (permalink / raw)
  To: mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm

On Mon, Nov 06, 2017 at 12:57:15AM -0800, Ram Pai wrote:
> PAPR defines the 'ibm,processor-storage-keys' property. It exports two
> values. The first value holds the number of data-access keys and the
> second holds the number of instruction-access keys.  Due to a bug in
> the firmware, the number of instruction-access keys is always reported
> as zero. However, any key can be configured to disable data-access
> and/or disable execution-access. The unavailability of the second value
> is not a big handicap, though it could have been used to determine if
> the platform supported disable-execution-access.
> 
> Non-PAPR platforms do not define this property in the device tree yet.
> Here, we hardcode the CPUs that support pkeys by consulting
> PowerISA3.0.
> 
> This patch calculates the number of keys supported by the platform.
> It also determines the platform support for read/write/execution
> access for pkeys.

> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
> 
....snip...

> +static inline bool pkey_mmu_enabled(void)
> +{
> +	if (firmware_has_feature(FW_FEATURE_LPAR))
> +		return pkeys_total;
> +	else
> +		return cpu_has_feature(CPU_FTR_PKEY);
> +}
> +
>  void __init pkey_initialize(void)
>  {
>  	int os_reserved, i;
> @@ -46,14 +54,9 @@ void __init pkey_initialize(void)
>  		     __builtin_popcountl(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)
>  				!= (sizeof(u64) * BITS_PER_BYTE));
> 
> -	/*
> -	 * Disable the pkey system till everything is in place. A subsequent
> -	 * patch will enable it.
> -	 */
> -	static_branch_enable(&pkey_disabled);
> -
> -	/* Lets assume 32 keys */
> -	pkeys_total = 32;

vvvvvvvvvvvvvvvvvvvv
> +	/* Let's assume 32 keys if we are not told the number of pkeys. */
> +	if (!pkeys_total)
> +		pkeys_total = 32;
^^^^^^^^^^^^^^^^^^^^

There is a small bug here.

On a KVM guest or an LPAR, if the device tree does not expose pkeys,
the pkey subsystem must be disabled.

Unfortunately, the code above blindly sets pkeys_total to 32 whenever
it is zero. This makes pkey_mmu_enabled() return true, so the guest
erroneously enables the pkey subsystem.

The fix is to delete the code marked above.
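
To illustrate the failure mode, here is a minimal user-space sketch of
the decision logic. It is a model, not kernel code: struct platform, its
fields and the assume_32 flag are hypothetical stand-ins for
firmware_has_feature(), cpu_has_feature(), radix_enabled() and the
device-tree key count. The min_t() clamp is left out, and how non-PAPR
platforms obtain their key count is in the snipped part of the patch and
is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state consulted by the patch. */
struct platform {
	bool is_lpar;		/* models firmware_has_feature(FW_FEATURE_LPAR) */
	bool cpu_has_pkey;	/* models cpu_has_feature(CPU_FTR_PKEY) */
	bool radix;		/* models radix_enabled() */
	int dt_pkeys;		/* key count from the device tree, 0 if absent */
};

static bool pkey_mmu_enabled(const struct platform *p, int pkeys_total)
{
	if (p->is_lpar)
		return pkeys_total != 0;
	return p->cpu_has_pkey;
}

/* Returns true when the pkey subsystem would end up enabled. */
static bool pkeys_enabled(const struct platform *p, bool assume_32)
{
	int pkeys_total = p->dt_pkeys;

	/* The default under discussion; assume_32 == false models the fix. */
	if (assume_32 && !pkeys_total)
		pkeys_total = 32;

	return pkey_mmu_enabled(p, pkeys_total) && !p->radix &&
	       pkeys_total != 0;
}

/* An LPAR whose device tree exposes no keys. */
static void check_lpar_without_keys(void)
{
	const struct platform guest = { .is_lpar = true };

	assert(pkeys_enabled(&guest, true));	/* bug: wrongly enabled */
	assert(!pkeys_enabled(&guest, false));	/* fix: stays disabled */
}
```

Calling check_lpar_without_keys() exercises both variants for the guest
case described above.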

> 
>  	/*
>  	 * Adjust the upper limit, based on the number of bits supported by
> @@ -62,11 +65,19 @@ void __init pkey_initialize(void)
>  	pkeys_total = min_t(int, pkeys_total,
>  			(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT));
> 
> +	if (!pkey_mmu_enabled() || radix_enabled() || !pkeys_total)
> +		static_branch_enable(&pkey_disabled);
> +	else
> +		static_branch_disable(&pkey_disabled);
> +

RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-11-06  8:57   ` Ram Pai
  (?)
  (?)
@ 2017-12-18 18:54     ` dave.hansen
  -1 siblings, 0 replies; 197+ messages in thread
From: Dave Hansen @ 2017-12-18 18:54 UTC (permalink / raw)
  To: Ram Pai, mpe, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, benh, paulus, khandual,
	aneesh.kumar, bsingharora, hbabu, mhocko, bauerman, ebiederm

On 11/06/2017 12:57 AM, Ram Pai wrote:
> Expose useful information for programs using memory protection keys.
> Provide implementation for powerpc and x86.
> 
> On a powerpc system with pkeys support, here is what is shown:
> 
> $ head /sys/kernel/mm/protection_keys/*
> ==> /sys/kernel/mm/protection_keys/disable_access_supported <==
> true

This is cute, but I don't think it should be part of the ABI.  Put it in
debugfs if you want it for cute tests.  The stuff that this tells you
can and should come from pkey_alloc() for the ABI.

http://man7.org/linux/man-pages/man7/pkeys.7.html

>        Any application wanting to use protection keys needs to be able to
>        function without them.  They might be unavailable because the
>        hardware that the application runs on does not support them, the
>        kernel code does not contain support, the kernel support has been
>        disabled, or because the keys have all been allocated, perhaps by a
>        library the application is using.  It is recommended that
>        applications wanting to use protection keys should simply call
>        pkey_alloc(2) and test whether the call succeeds, instead of
>        attempting to detect support for the feature in any other way.
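
The probing approach recommended by the man page can be sketched
roughly as follows. This is only an illustration: the helper names
try_pkey_alloc(), release_pkey() and pkeys_usable() are made up here,
and the raw syscall numbers are used so the sketch does not depend on a
libc that ships the pkey_alloc() wrapper.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Try to allocate one protection key with no access restrictions.
 * Returns the key (>= 0) on success, or -1 with errno set.
 */
static int try_pkey_alloc(void)
{
#ifdef SYS_pkey_alloc
	return (int)syscall(SYS_pkey_alloc, 0UL, 0UL);
#else
	errno = ENOSYS;		/* headers too old to know the syscall */
	return -1;
#endif
}

/* Release a key obtained from try_pkey_alloc(). Returns 0 on success. */
static int release_pkey(int pkey)
{
#ifdef SYS_pkey_free
	return (int)syscall(SYS_pkey_free, pkey);
#else
	(void)pkey;
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * True if a key can be allocated right now. Any failure (for example
 * ENOSPC or ENOSYS) means the caller should run without keys.
 */
static bool pkeys_usable(void)
{
	int pkey = try_pkey_alloc();

	if (pkey < 0)
		return false;
	release_pkey(pkey);
	return true;
}
```

An application would normally keep the key it gets instead of freeing
it right away; pkeys_usable() frees it only to keep the sketch free of
side effects.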

Do you really not have a standard way on ppc to say whether hardware
features are supported by the kernel?  For instance, how do you know if
a given set of registers is known to, and is being context-switched by,
the kernel?

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-18 18:54     ` dave.hansen
  (?)
  (?)
@ 2017-12-18 22:18       ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-12-18 22:18 UTC (permalink / raw)
  To: Dave Hansen
  Cc: mpe, mingo, akpm, corbet, arnd, linuxppc-dev, linux-mm, x86,
	linux-arch, linux-doc, linux-kselftest, linux-kernel, benh,
	paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm

On Mon, Dec 18, 2017 at 10:54:26AM -0800, Dave Hansen wrote:
> On 11/06/2017 12:57 AM, Ram Pai wrote:
> > Expose useful information for programs using memory protection keys.
> > Provide implementation for powerpc and x86.
> > 
> > On a powerpc system with pkeys support, here is what is shown:
> > 
> > $ head /sys/kernel/mm/protection_keys/*

> > ==> /sys/kernel/mm/protection_keys/disable_access_supported <==
> > true
> 
> This is cute, but I don't think it should be part of the ABI.  Put it in

thanks :)

> debugfs if you want it for cute tests.  The stuff that this tells you
> can and should come from pkey_alloc() for the ABI.

Applications can make system calls with different parameters and on
failure determine indirectly that such a feature may not be available in
the kernel/hardware.  But from an application point of view, I think, it
is a very clumsy/difficult way to determine that.

For example, an application can keep making pkey_alloc() calls and count
till the call fails, to determine the number of keys supported by the
system. And then the application has to release those keys too.  Too
much side-effect just to determine a simple thing. Do we want the
application to endure this pain?
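
For concreteness, the counting approach described above would look
roughly like this. It is a sketch only: count_allocatable_pkeys() and
MAX_PROBE are names invented for the example, and MAX_PROBE is an
arbitrary cap, not an architectural limit.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

#define MAX_PROBE 64	/* hypothetical cap on how many keys we try */

/*
 * Count how many keys the calling process can allocate right now by
 * allocating until the call fails, then freeing every key again.
 * Returns 0 if the syscalls are unknown or no key is available.
 */
static int count_allocatable_pkeys(void)
{
	int n = 0;
#if defined(SYS_pkey_alloc) && defined(SYS_pkey_free)
	int keys[MAX_PROBE];

	while (n < MAX_PROBE) {
		long key = syscall(SYS_pkey_alloc, 0UL, 0UL);

		if (key < 0)
			break;
		keys[n++] = (int)key;
	}

	/* The side effect: every probed key has to be released. */
	for (int i = 0; i < n; i++)
		syscall(SYS_pkey_free, keys[i]);
#endif
	return n;
}
```

Besides the extra syscalls, the result is only a snapshot: another
thread or a library may allocate keys between the probe and the first
real use.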

I think we should aim to provide sufficient API/ABI for the application
to consume the feature efficiently, and not any more.

I do not claim that the ABI exposed by this patch is sufficiently
optimal. But I do believe it is tending towards it.

Currently the following ABI is exposed.

a) total number of keys available in the system. This information may
	not be useful and can possibly be dropped.

b) minimum number of keys available to the application.
	If libraries consume a few, they could provide a library
	interface to the application informing the number available to
	the application.  The library interface can leverage (b) to
	provide the information.

c) types of disable-rights supported by keys.
	Helps the application to determine the types of disable-features
	available. This is helpful; otherwise the app has to
	make a pkey_alloc() call with the corresponding parameter set
	and see if it succeeds or fails. Painful from an application
	point of view, in my opinion.
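
As a rough sketch of how an application might consume (c), the
following reads one boolean attribute. The only file name taken from
this patch is disable_access_supported; the helper read_sysfs_bool() is
made up for the example, and treating "false" as the negative value is
an assumption, since only "true" is shown in the example output. The
file exists only on a kernel carrying this patch, so a missing file is
reported as "unknown".

```c
#include <stdio.h>
#include <string.h>

/* Path as shown in the patch description. */
#define PKEY_DISABLE_ACCESS_ATTR \
	"/sys/kernel/mm/protection_keys/disable_access_supported"

/*
 * Read a one-line boolean attribute.
 * Returns 1 for "true", 0 for "false" (assumed spelling), and -1 if
 * the file is missing, unreadable, or holds anything else.
 */
static int read_sysfs_bool(const char *path)
{
	char buf[16] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	if (strcmp(buf, "true") == 0)
		return 1;
	if (strcmp(buf, "false") == 0)
		return 0;
	return -1;
}
```

A caller would use read_sysfs_bool(PKEY_DISABLE_ACCESS_ATTR) and fall
back to probing with pkey_alloc() when the result is -1.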

> 
> http://man7.org/linux/man-pages/man7/pkeys.7.html
> 
> >        Any application wanting to use protection keys needs to be able to
> >        function without them.  They might be unavailable because the
> >        hardware that the application runs on does not support them, the
> >        kernel code does not contain support, the kernel support has been
> >        disabled, or because the keys have all been allocated, perhaps by a
> >        library the application is using.  It is recommended that
> >        applications wanting to use protection keys should simply call
> >        pkey_alloc(2) and test whether the call succeeds, instead of
> >        attempting to detect support for the feature in any other way.
> 
> Do you really not have standard way on ppc to say whether hardware
> features are supported by the kernel?  For instance, how do you know if
> a given set of registers are known to and are being context-switched by
> the kernel?

I think on x86 you look for some hardware registers to determine which
hardware features are enabled by the kernel.

We do not have generic support for something like that on ppc.
The kernel looks at the device tree to determine what hardware features
are available. But it does not have a mechanism to tell the hardware to
track which of its features are currently enabled/used by the kernel; at
least not for the memory-key feature.

RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
@ 2017-12-18 22:18       ` linuxram
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-12-18 22:18 UTC (permalink / raw)
  To: Dave Hansen
  Cc: mpe, mingo, akpm, corbet, arnd, linuxppc-dev, linux-mm, x86,
	linux-arch, linux-doc, linux-kselftest, linux-kernel, benh,
	paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm

On Mon, Dec 18, 2017 at 10:54:26AM -0800, Dave Hansen wrote:
> On 11/06/2017 12:57 AM, Ram Pai wrote:
> > Expose useful information for programs using memory protection keys.
> > Provide implementation for powerpc and x86.
> > 
> > On a powerpc system with pkeys support, here is what is shown:
> > 
> > $ head /sys/kernel/mm/protection_keys/*

> > ==> /sys/kernel/mm/protection_keys/disable_access_supported <==
> > true
> 
> This is cute, but I don't think it should be part of the ABI.  Put it in

thanks :)

> debugfs if you want it for cute tests.  The stuff that this tells you
> can and should come from pkey_alloc() for the ABI.

Applications can make system calls with different parameters and on
failure determine indirectly that such a feature may not be available in
the kernel/hardware.  But from an application point of view, I think, it
is a very clumsy/difficult way to determine that.

For example, an application can keep making pkey_alloc() calls and count
till the call fails, to determine the number of keys supported by the
system. And then the application has to release those keys too.  That is
too many side effects just to determine a simple thing. Do we want the
application to endure this pain?
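
The counting approach described above can be sketched as follows. This is a minimal illustration, not code from the patch series: it assumes the glibc pkey_alloc()/pkey_free() wrappers (glibc 2.27 or later), and the function name and the MAX_PROBE_KEYS bound are hypothetical.

```c
#define _GNU_SOURCE
#include <sys/mman.h>

/* Upper bound on keys we are willing to probe; an assumption of this sketch. */
#define MAX_PROBE_KEYS 64

/*
 * Count how many protection keys this process can still allocate by
 * calling pkey_alloc() until it fails, then freeing every key again.
 * Returns 0 when protection keys are unavailable for any reason.
 */
int count_allocatable_pkeys(void)
{
	int keys[MAX_PROBE_KEYS];
	int n = 0;

	while (n < MAX_PROBE_KEYS) {
		int key = pkey_alloc(0, 0);

		if (key < 0)
			break;	/* no keys left, or no pkey support at all */
		keys[n++] = key;
	}

	/* Undo the side effect: give every probed key back. */
	for (int i = 0; i < n; i++)
		pkey_free(keys[i]);

	return n;
}
```

The result only covers keys that are free at the time of the call; keys already held by a library are not counted, which is the side-effect problem raised above.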

I think we should aim to provide sufficient API/ABI for the application
to consume the feature efficiently, and not any more.

I do not claim that the ABI exposed by this patch is sufficiently
optimal. But I do believe it is tending towards it.

Currently the following ABI is exposed:

a) total number of keys available in the system. This information may
	not be useful and can possibly be dropped.

b) minimum number of keys available to the application.
	If libraries consume a few, they could provide a library
	interface to the application informing the number available to
	the application.  The library interface can leverage (b) to
	provide the information.

c) types of disable-rights supported by keys.
	Helps the application to determine the types of disable-features
	available. This is helpful, otherwise the app has to 
	make pkey_alloc() call with the corresponding parameter set
	and see if it succeeds or fails. Painful from an application
	point of view, in my opinion.
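
The probing that item (c) wants to avoid could look roughly like this. It is only a sketch under the assumption that the glibc pkey_alloc()/pkey_free() wrappers are available; the function name is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/mman.h>

/*
 * Returns 1 if pkey_alloc() accepts the given initial access rights
 * (for example PKEY_DISABLE_ACCESS or PKEY_DISABLE_WRITE), 0 otherwise.
 * A 0 result cannot tell "right not supported" apart from "no keys left".
 */
int pkey_right_supported(unsigned int rights)
{
	int key = pkey_alloc(0, rights);

	if (key < 0)
		return 0;
	pkey_free(key);	/* release the key used only for probing */
	return 1;
}
```

The ambiguity noted in the comment is part of why probing is awkward: a failure does not say which of the possible causes applied.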

> 
> http://man7.org/linux/man-pages/man7/pkeys.7.html
> 
> >        Any application wanting to use protection keys needs to be able to
> >        function without them.  They might be unavailable because the
> >        hardware that the application runs on does not support them, the
> >        kernel code does not contain support, the kernel support has been
> >        disabled, or because the keys have all been allocated, perhaps by a
> >        library the application is using.  It is recommended that
> >        applications wanting to use protection keys should simply call
> >        pkey_alloc(2) and test whether the call succeeds, instead of
> >        attempting to detect support for the feature in any other way.
> 
> Do you really not have standard way on ppc to say whether hardware
> features are supported by the kernel?  For instance, how do you know if
> a given set of registers are known to and are being context-switched by
> the kernel?

I think on x86 you look for some hardware registers to determine which
hardware features are enabled by the kernel.

We do not have generic support for something like that on ppc.
The kernel looks at the device tree to determine what hardware features
are available. But it does not have a mechanism to tell the hardware to track
which of its features are currently enabled/used by the kernel; at least
not for the memory-key feature.

RP

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-18 22:18       ` linuxram
  (?)
  (?)
@ 2017-12-18 22:28         ` dave.hansen
  -1 siblings, 0 replies; 197+ messages in thread
From: Dave Hansen @ 2017-12-18 22:28 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, mingo, akpm, corbet, arnd, linuxppc-dev, linux-mm, x86,
	linux-arch, linux-doc, linux-kselftest, linux-kernel, benh,
	paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm

On 12/18/2017 02:18 PM, Ram Pai wrote:
> b) minimum number of keys available to the application.
> 	If libraries consume a few, they could provide a library
> 	interface to the application informing the number available to
> 	the application.  The library interface can leverage (b) to
> 	provide the information.

OK, let's see a real user of this including a few libraries.  Then we'll
put it in the kernel.

> c) types of disable-rights supported by keys.
> 	Helps the application to determine the types of disable-features
> 	available. This is helpful, otherwise the app has to 
> 	make pkey_alloc() call with the corresponding parameter set
> 	and see if it succeeds or fails. Painful from an application
> 	point of view, in my opinion.

Again, let's see a real-world use of this.  How does it look?  How does
an app "fall back" if it can't set a restriction the way it wants to?

Are we *sure* that such an interface makes sense?  For instance, will it
be possible for some keys to be execute-disable while others are only
write-disable?

> I think on x86 you look for some hardware registers to determine which
> hardware features are enabled by the kernel.

No, we use CPUID.  It's a part of the ISA that's designed for
enumerating CPU and (sometimes) OS support for CPU features.
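
For illustration, the CPUID check on x86 can be sketched as below. It assumes a GCC or Clang toolchain that provides <cpuid.h>; the function name is hypothetical. CPUID leaf 7, subleaf 0 reports PKU (hardware support) in ECX bit 3 and OSPKE (the OS has enabled protection keys) in ECX bit 4.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Returns 1 if the OS has enabled protection keys (CPUID.7.0:ECX bit 4,
 * OSPKE), 0 if it has not, and -1 when not built for x86.
 */
int x86_ospke_enabled(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;	/* leaf 7 is not implemented on this CPU */
	return (ecx >> 4) & 1;
#else
	return -1;
#endif
}
```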

> We do not have generic support for something like that on ppc.
> The kernel looks at the device tree to determine what hardware features
> are available. But it does not have a mechanism to tell the hardware to track
> which of its features are currently enabled/used by the kernel; at least
> not for the memory-key feature.

Bummer.  You're missing out.

But, you could still do this with a syscall.  "Hey, kernel, do you
support this feature?"

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-18 22:28         ` dave.hansen
  (?)
  (?)
@ 2017-12-18 23:15           ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-12-18 23:15 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-arch, ebiederm, arnd, corbet, x86, linux-doc, linux-kernel,
	mhocko, linux-mm, mingo, paulus, aneesh.kumar, linux-kselftest,
	bauerman, akpm, linuxppc-dev, khandual

On Mon, Dec 18, 2017 at 02:28:14PM -0800, Dave Hansen wrote:
> On 12/18/2017 02:18 PM, Ram Pai wrote:
> > b) minimum number of keys available to the application.
> > 	If libraries consume a few, they could provide a library
> > 	interface to the application informing the number available to
> > 	the application.  The library interface can leverage (b) to
> > 	provide the information.
> 
> OK, let's see a real user of this including a few libraries.  Then we'll
> put it in the kernel.
> 
> > c) types of disable-rights supported by keys.
> > 	Helps the application to determine the types of disable-features
> > 	available. This is helpful, otherwise the app has to 
> > 	make pkey_alloc() call with the corresponding parameter set
> > 	and see if it succeeds or fails. Painful from an application
> > 	point of view, in my opinion.
> 
> Again, let's see a real-world use of this.  How does it look?  How does
> an app "fall back" if it can't set a restriction the way it wants to?
> 
> Are we *sure* that such an interface makes sense?  For instance, will it
> be possible for some keys to be execute-disable while others are only
> write-disable?

Can it be on x86?

It's not possible on ppc. All the keys support the same set of
disable attributes, all the time.

However, you are right. It's conceivable that some arch could provide a
feature where key 'a' is x-attribute-disable and key 'b' is
y-attribute-disable.  But then it's a bit of a headache
for an application to consume such a feature.

Ben: I recall you requesting this feature.  Thoughts?

> 
> > I think on x86 you look for some hardware registers to determine
> > which hardware features are enabled by the kernel.
> 
> No, we use CPUID.  It's a part of the ISA that's designed for
> enumerating CPU and (sometimes) OS support for CPU features.
> 
> > We do not have generic support for something like that on ppc.  The
> > kernel looks at the device tree to determine what hardware features
> > are available. But it does not have a mechanism to tell the hardware to
> > track which of its features are currently enabled/used by the
> > kernel; at least not for the memory-key feature.
> 
> Bummer.  You're missing out.
> 
> But, you could still do this with a syscall.  "Hey, kernel, do you
> support this feature?"

Or do a powerpc-specific sysfs interface,
or a debugfs interface.
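
From the application side, consuming such a sysfs attribute could be sketched as below. The helper name is hypothetical, and the only path known from this thread is the one in the patch description (/sys/kernel/mm/protection_keys/disable_access_supported), which may not exist on any given kernel.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read a sysfs attribute that holds "true" or "false".
 * Returns 1 for "true", 0 for "false", and -1 if the file is missing,
 * unreadable, or holds anything else.
 */
int read_sysfs_bool(const char *path)
{
	char buf[16] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';	/* drop the trailing newline */

	if (strcmp(buf, "true") == 0)
		return 1;
	if (strcmp(buf, "false") == 0)
		return 0;
	return -1;
}
```

A caller would treat -1 the same as "feature not advertised" and fall back to calling pkey_alloc() directly.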

RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-18 23:15           ` linuxram
  (?)
  (?)
@ 2017-12-19  8:31             ` paubert
  -1 siblings, 0 replies; 197+ messages in thread
From: Gabriel Paubert @ 2017-12-19  8:31 UTC (permalink / raw)
  To: Ram Pai
  Cc: Dave Hansen, linux-arch, corbet, arnd, linux-doc, x86,
	linux-kernel, mhocko, linux-mm, mingo, paulus, ebiederm,
	linux-kselftest, bauerman, akpm, khandual, linuxppc-dev,
	aneesh.kumar

On Mon, Dec 18, 2017 at 03:15:51PM -0800, Ram Pai wrote:
> On Mon, Dec 18, 2017 at 02:28:14PM -0800, Dave Hansen wrote:
> > On 12/18/2017 02:18 PM, Ram Pai wrote:
> > > b) minimum number of keys available to the application.
> > > 	If libraries consume a few, they could provide a library
> > > 	interface to the application informing the number available to
> > > 	the application.  The library interface can leverage (b) to
> > > 	provide the information.
> > 
> > OK, let's see a real user of this including a few libraries.  Then we'll
> > put it in the kernel.
> > 
> > > c) types of disable-rights supported by keys.
> > > 	Helps the application to determine the types of disable-features
> > > 	available. This is helpful, otherwise the app has to 
> > > 	make pkey_alloc() call with the corresponding parameter set
> > > 	and see if it succeeds or fails. Painful from an application
> > > 	point of view, in my opinion.
> > 
> > Again, let's see a real-world use of this.  How does it look?  How does
> > an app "fall back" if it can't set a restriction the way it wants to?
> > 
> > Are we *sure* that such an interface makes sense?  For instance, will it
> > be possible for some keys to be execute-disable while others are only
> > write-disable?
> 
> Can it be on x86?
> 
> It's not possible on ppc. All the keys support the same set of
> disable attributes, all the time.
> 
> However, you are right. It's conceivable that some arch could provide a
> feature where key 'a' is x-attribute-disable and key 'b' is
> y-attribute-disable.  But then it's a bit of a headache
> for an application to consume such a feature.
> 
> Ben: I recall you requesting this feature.  Thoughts?
> 
> > 
> > > I think on x86 you look for some hardware registers to determine
> > > which hardware features are enabled by the kernel.
> > 
> > No, we use CPUID.  It's a part of the ISA that's designed for
> > enumerating CPU and (sometimes) OS support for CPU features.
> > 
> > > We do not have generic support for something like that on ppc.  The
> > > kernel looks at the device tree to determine what hardware features
> > > are available. But it does not have a mechanism to tell the hardware to
> > > track which of its features are currently enabled/used by the
> > > kernel; at least not for the memory-key feature.
> > 
> > Bummer.  You're missing out.
> > 
> > But, you could still do this with a syscall.  "Hey, kernel, do you
> > support this feature?"
> 
> Or do a powerpc-specific sysfs interface,
> or a debugfs interface.

getauxval(3)?

Called with AT_HWCAP or AT_HWCAP2 as its parameter, it already gives
information about features supported by the hardware and the kernel.

Taking one bit to expose the availability of protection keys to
applications does not look impossible.
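
A sketch of what the application side of this suggestion might look like follows. No AT_HWCAP2 bit for protection keys is defined anywhere in this thread, so the mask is left to the caller and the function name is hypothetical; getauxval() and AT_HWCAP2 themselves are standard glibc.

```c
#include <sys/auxv.h>

/*
 * Returns 1 if every bit in 'mask' is set in the AT_HWCAP2 auxiliary
 * vector entry, 0 otherwise. The mask for protection keys is
 * hypothetical: this thread does not assign one.
 */
int hwcap2_has(unsigned long mask)
{
	unsigned long caps = getauxval(AT_HWCAP2);

	return mask != 0 && (caps & mask) == mask;
}
```

Because the auxiliary vector is populated by the kernel at exec time, the check costs no system call and has no side effects on key allocation.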

Am I missing something obvious?

	Gabriel

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-18 18:54     ` dave.hansen
  (?)
  (?)
@ 2017-12-19 10:50       ` mpe
  -1 siblings, 0 replies; 197+ messages in thread
From: Michael Ellerman @ 2017-12-19 10:50 UTC (permalink / raw)
  To: Dave Hansen, Ram Pai, mingo, akpm, corbet, arnd
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, benh, paulus, khandual,
	aneesh.kumar, bsingharora, hbabu, mhocko, bauerman, ebiederm

Dave Hansen <dave.hansen@intel.com> writes:

> On 11/06/2017 12:57 AM, Ram Pai wrote:
>> Expose useful information for programs using memory protection keys.
>> Provide implementation for powerpc and x86.
>> 
>> On a powerpc system with pkeys support, here is what is shown:
>> 
>> $ head /sys/kernel/mm/protection_keys/*
>> ==> /sys/kernel/mm/protection_keys/disable_access_supported <==
>> true
>
> This is cute, but I don't think it should be part of the ABI.  Put it in
> debugfs if you want it for cute tests.  The stuff that this tells you
> can and should come from pkey_alloc() for the ABI.

Yeah I agree this is not sysfs material.

In particular the total/usable numbers are completely useless vs other
threads allocating pkeys out from under you.
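
To make the race concrete, here is a minimal sketch of how a program
could count the keys it can allocate right now.  The helper name
count_allocatable_pkeys() and the MAX_PROBE bound are assumptions made
for this example.  The result is only a snapshot: another thread or a
library may allocate or free keys immediately after it returns.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sys/syscall.h>	/* SYS_pkey_alloc, SYS_pkey_free if known */
#include <unistd.h>		/* syscall() */

#define MAX_PROBE	64	/* arbitrary upper bound for this sketch */

/*
 * Allocate keys until the kernel refuses, free them all again, and
 * return how many were obtained.  Returns 0 when pkeys are unsupported.
 */
int count_allocatable_pkeys(void)
{
	int keys[MAX_PROBE];
	int n = 0;
	int i;

#if defined(SYS_pkey_alloc) && defined(SYS_pkey_free)
	while (n < MAX_PROBE) {
		long k = syscall(SYS_pkey_alloc, 0UL, 0UL);

		if (k < 0)
			break;		/* no more keys, or no support */
		keys[n++] = (int)k;
	}
	for (i = 0; i < n; i++)
		syscall(SYS_pkey_free, keys[i]);
#else
	(void)keys;
	(void)i;
#endif
	assert(n >= 0 && n <= MAX_PROBE);
	return n;
}
```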

> http://man7.org/linux/man-pages/man7/pkeys.7.html
>
>>        Any application wanting to use protection keys needs to be able to
>>        function without them.  They might be unavailable because the
>>        hardware that the application runs on does not support them, the
>>        kernel code does not contain support, the kernel support has been
>>        disabled, or because the keys have all been allocated, perhaps by a
>>        library the application is using.  It is recommended that
>>        applications wanting to use protection keys should simply call
>>        pkey_alloc(2) and test whether the call succeeds, instead of
>>        attempting to detect support for the feature in any other way.
>
> Do you really not have a standard way on ppc to say whether hardware
> features are supported by the kernel?  For instance, how do you know if
> a given set of registers are known to and are being context-switched by
> the kernel?

Yes we do, we emit feature bits in the AT_HWCAP entry of the aux vector,
same as some other architectures.

But I don't see the need to use a feature bit for pkeys. If they're not
supported then pkey_alloc() will just always fail. Apps have to handle
that anyway because keys are a finite resource.
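
A minimal sketch of that "just call pkey_alloc() and handle failure"
approach.  It uses the raw syscall so that it builds without a libc
wrapper; the helper names try_pkey_alloc(), release_pkey() and
pkeys_usable() are invented for this example.  Per pkey_alloc(2),
ENOSPC covers both "all keys are taken" and "no hardware or kernel
support", so the caller cannot tell those apart and does not need to.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>	/* SYS_pkey_alloc, SYS_pkey_free if known */
#include <unistd.h>		/* syscall() */

/* Return a key (>= 0) on success, or -1 with errno set. */
int try_pkey_alloc(unsigned long access_rights)
{
#ifdef SYS_pkey_alloc
	/* The flags argument must currently be 0. */
	return (int)syscall(SYS_pkey_alloc, 0UL, access_rights);
#else
	(void)access_rights;
	errno = ENOSYS;		/* headers predate the pkey syscalls */
	return -1;
#endif
}

/* Give a key obtained from try_pkey_alloc() back to the kernel. */
void release_pkey(int pkey)
{
	assert(pkey >= 0);
#ifdef SYS_pkey_free
	syscall(SYS_pkey_free, pkey);
#endif
}

/*
 * Return 1 if a key could be allocated at this moment, 0 otherwise.
 * A 0 result means the application should take its fallback path.
 */
int pkeys_usable(void)
{
	int pkey = try_pkey_alloc(0);

	if (pkey < 0)
		return 0;
	release_pkey(pkey);
	return 1;
}
```

In real code the application would keep the key from try_pkey_alloc()
rather than probe and free it, since a later allocation may fail.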

cheers

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-19  8:31             ` paubert
  (?)
  (?)
@ 2017-12-19 16:22               ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-12-19 16:22 UTC (permalink / raw)
  To: Gabriel Paubert
  Cc: linux-arch, corbet, arnd, linux-doc, aneesh.kumar, x86,
	linux-kernel, mhocko, linux-mm, Dave Hansen, mingo, paulus,
	ebiederm, linux-kselftest, bauerman, akpm, linuxppc-dev,
	khandual

On Tue, Dec 19, 2017 at 09:31:22AM +0100, Gabriel Paubert wrote:
> On Mon, Dec 18, 2017 at 03:15:51PM -0800, Ram Pai wrote:
> > On Mon, Dec 18, 2017 at 02:28:14PM -0800, Dave Hansen wrote:
> > > On 12/18/2017 02:18 PM, Ram Pai wrote:
> > > 
....snip...
> > > > I think on x86 you look for some hardware registers to determine
> > > > which hardware features are enabled by the kernel.
> > > 
> > > No, we use CPUID.  It's a part of the ISA that's designed for
> > > enumerating CPU and (sometimes) OS support for CPU features.
> > > 
> > > > We do not have generic support for something like that on ppc.  The
> > > > kernel looks at the device tree to determine what hardware features
> > > > are available.  But it does not have a mechanism to tell the hardware
> > > > to track which of its features are currently enabled/used by the
> > > > kernel; at least not for the memory-key feature.
> > > 
> > > Bummer.  You're missing out.
> > > 
> > > But, you could still do this with a syscall.  "Hey, kernel, do you
> > > support this feature?"
> > 
> > or do a powerpc-specific sysfs interface.
> > or a debugfs interface.
> 
> getauxval(3) ?
> 
> Calling it with AT_HWCAP or AT_HWCAP2 as the parameter already gives
> information about features supported by the hardware and the kernel.
> 
> Taking one bit to expose the availability of protection keys to
> applications does not look impossible.
> 
> Am I missing something obvious?

No. I am told this is possible as well.

RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-19 10:50       ` mpe
  (?)
  (?)
@ 2017-12-19 16:32         ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-12-19 16:32 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Dave Hansen, mingo, akpm, corbet, arnd, linux-arch, ebiederm,
	linux-doc, x86, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

On Tue, Dec 19, 2017 at 09:50:24PM +1100, Michael Ellerman wrote:
> Dave Hansen <dave.hansen@intel.com> writes:
> 
> > On 11/06/2017 12:57 AM, Ram Pai wrote:
> >> Expose useful information for programs using memory protection keys.
> >> Provide implementation for powerpc and x86.
> >> 
> >> On a powerpc system with pkeys support, here is what is shown:
> >> 
> >> $ head /sys/kernel/mm/protection_keys/*
> >> ==> /sys/kernel/mm/protection_keys/disable_access_supported <==
> >> true
> >
> > This is cute, but I don't think it should be part of the ABI.  Put it in
> > debugfs if you want it for cute tests.  The stuff that this tells you
> > can and should come from pkey_alloc() for the ABI.
> 
> Yeah I agree this is not sysfs material.
> 
> In particular the total/usable numbers are completely useless vs other
> threads allocating pkeys out from under you.

The usable number is the minimum number of keys available for use by the
application, not the number of keys **currently** available.  It's a
static number.

I am dropping this patch. We can revisit this when a clear request for
such a feature emerges.

> 
> >
> >>        Any application wanting to use protection keys needs to be able to
> >>        function without them.  They might be unavailable because the
> >>        hardware that the application runs on does not support them, the
> >>        kernel code does not contain support, the kernel support has been
> >>        disabled, or because the keys have all been allocated, perhaps by a
> >>        library the application is using.  It is recommended that
> >>        applications wanting to use protection keys should simply call
> >>        pkey_alloc(2) and test whether the call succeeds, instead of
> >>        attempting to detect support for the feature in any other way.
> >
> > > Do you really not have a standard way on ppc to say whether hardware
> > features are supported by the kernel?  For instance, how do you know if
> > a given set of registers are known to and are being context-switched by
> > the kernel?
> 
> Yes we do, we emit feature bits in the AT_HWCAP entry of the aux vector,
> same as some other architectures.

Ah. I was not aware of this.
Thanks,
RP

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
@ 2017-12-19 16:32         ` linuxram
  0 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-12-19 16:32 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Dave Hansen, mingo, akpm, corbet, arnd, linux-arch, ebiederm,
	linux-doc, x86, linux-kernel, mhocko, linux-mm, paulus,
	aneesh.kumar, linux-kselftest, bauerman, linuxppc-dev, khandual

On Tue, Dec 19, 2017 at 09:50:24PM +1100, Michael Ellerman wrote:
> Dave Hansen <dave.hansen@intel.com> writes:
> 
> > On 11/06/2017 12:57 AM, Ram Pai wrote:
> >> Expose useful information for programs using memory protection keys.
> >> Provide implementation for powerpc and x86.
> >> 
> >> On a powerpc system with pkeys support, here is what is shown:
> >> 
> >> $ head /sys/kernel/mm/protection_keys/*
> >> ==> /sys/kernel/mm/protection_keys/disable_access_supported <==
> >> true
> >
> > This is cute, but I don't think it should be part of the ABI.  Put it in
> > debugfs if you want it for cute tests.  The stuff that this tells you
> > can and should come from pkey_alloc() for the ABI.
> 
> Yeah I agree this is not sysfs material.
> 
> In particular the total/usable numbers are completely useless vs other
> threads allocating pkeys out from under you.

The usable number is the minimum number of keys available for use by the
application, not the number of keys **currently** available.  It's a
static number.

I am dropping this patch. We can revisit this when a clear request for
such a feature emerges.

> 
> >
> >>        Any application wanting to use protection keys needs to be able to
> >>        function without them.  They might be unavailable because the
> >>        hardware that the application runs on does not support them, the
> >>        kernel code does not contain support, the kernel support has been
> >>        disabled, or because the keys have all been allocated, perhaps by a
> >>        library the application is using.  It is recommended that
> >>        applications wanting to use protection keys should simply call
> >>        pkey_alloc(2) and test whether the call succeeds, instead of
> >>        attempting to detect support for the feature in any other way.
> >
> > Do you really not have a standard way on ppc to say whether hardware
> > features are supported by the kernel?  For instance, how do you know if
> > a given set of registers are known to and are being context-switched by
> > the kernel?
> 
> Yes we do, we emit feature bits in the AT_HWCAP entry of the aux vector,
> same as some other architectures.

Ah. I was not aware of this.
Thanks,
RP
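
A minimal sketch of the probe-by-allocation approach that the quoted
pkey_alloc(2) text recommends. It assumes a Linux libc whose
<sys/syscall.h> defines SYS_pkey_alloc and SYS_pkey_free; the helper name
pkeys_usable() is made up for illustration and is not part of this patch
series.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a protection key could be allocated
 * (the key is freed again before returning), 0 otherwise.
 * On failure *err receives the errno value; on success it is set to 0.
 */
static int pkeys_usable(int *err)
{
#if defined(SYS_pkey_alloc) && defined(SYS_pkey_free)
	/* flags must be 0; access_rights 0 asks for no restrictions */
	long pkey = syscall(SYS_pkey_alloc, 0UL, 0UL);

	if (pkey < 0) {
		*err = errno;	/* e.g. ENOSYS or ENOSPC */
		return 0;
	}
	syscall(SYS_pkey_free, pkey);
	*err = 0;
	return 1;
#else
	/* libc headers predate the pkey syscalls */
	*err = ENOSYS;
	return 0;
#endif
}
```

A caller would typically run this once at startup and fall back to plain
mprotect() when it returns 0. Per pkey_alloc(2), ENOSPC covers both "all
keys are taken" and "no hardware or kernel support", so the errno alone
may not tell those cases apart.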

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-18 22:28         ` dave.hansen
  (?)
  (?)
@ 2017-12-19 21:34           ` benh
  -1 siblings, 0 replies; 197+ messages in thread
From: Benjamin Herrenschmidt @ 2017-12-19 21:34 UTC (permalink / raw)
  To: Dave Hansen, Ram Pai
  Cc: mpe, mingo, akpm, corbet, arnd, linuxppc-dev, linux-mm, x86,
	linux-arch, linux-doc, linux-kselftest, linux-kernel, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm

On Mon, 2017-12-18 at 14:28 -0800, Dave Hansen wrote:
> > We do not have generic support for something like that on ppc.
> > The kernel looks at the device tree to determine what hardware features
> > are available. But it does not have a mechanism to tell the hardware to track
> > which of its features are currently enabled/used by the kernel; at least
> > not for the memory-key feature.
> 
> Bummer.  You're missing out.
> 
> But, you could still do this with a syscall.  "Hey, kernel, do you
> support this feature?"

I'm not sure I understand Ram's original (quoted) point, but informing
userspace of CPU features is what the AT_HWCAP bits are about.

Ben.
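
For illustration, this is roughly how userspace reads the AT_HWCAP bits
mentioned above, using getauxval(3). The thread does not say which bit, if
any, would advertise protection keys, so the mask is left as a parameter;
hwcap_has() is a hypothetical helper name.

```c
#include <sys/auxv.h>

/*
 * Hypothetical helper: returns 1 if every bit in `mask` is set in the
 * AT_HWCAP entry of the auxiliary vector, 0 otherwise.
 */
static int hwcap_has(unsigned long mask)
{
	/* getauxval() returns 0 when the entry is not present */
	unsigned long hwcap = getauxval(AT_HWCAP);

	return (hwcap & mask) == mask;
}
```

A feature bit only says that the CPU and kernel support something. It
cannot carry counts or per-key details, which is the limitation raised
later in this thread.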

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-19 21:34           ` benh
  (?)
  (?)
@ 2017-12-20 17:50             ` linuxram
  -1 siblings, 0 replies; 197+ messages in thread
From: Ram Pai @ 2017-12-20 17:50 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Dave Hansen, mpe, mingo, akpm, corbet, arnd, linuxppc-dev,
	linux-mm, x86, linux-arch, linux-doc, linux-kselftest,
	linux-kernel, paulus, khandual, aneesh.kumar, bsingharora, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Dec 20, 2017 at 08:34:56AM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2017-12-18 at 14:28 -0800, Dave Hansen wrote:
> > > We do not have generic support for something like that on ppc.
> > > The kernel looks at the device tree to determine what hardware features
> > > are available. But it does not have a mechanism to tell the hardware to track
> > > which of its features are currently enabled/used by the kernel; at least
> > > not for the memory-key feature.
> > 
> > Bummer.  You're missing out.
> > 
> > But, you could still do this with a syscall.  "Hey, kernel, do you
> > support this feature?"
> 
> I'm not sure I understand Ram's original (quoted) point, but informing
> userspace of CPU features is what the AT_HWCAP bits are about.

Ben, my original point was -- we developed this patch to satisfy a concern
you raised back on July 11th;  cut-n-pasted below.

-------------------------------------------------------------------
That leads to the question... How do you tell userspace.

(apologies if I missed that in an existing patch in the series)

How do we inform userspace of the key capabilities? There are
at least two things userspace may want to know already:

	 - What protection bits are supported for a key

	 - How many keys exist

	 - Which keys are available for use by userspace. On PowerPC,
	 the kernel can reserve some keys for itself, so can the
	 hypervisor. In fact, they do.
--------------------------------------------------------------------


The argument against this patch is --  it should not be baked into
the ABI as yet, since we do not have clarity on what applications need.

As it stands today the only way to figure out the information from
userspace is by probing the kernel through calls to sys_pkey_alloc().

AT_HWCAP can be used, but that will certainly not be capable of
providing all the information that userspace might expect.

Your thoughts?
RP
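
As a rough sketch of what "probing the kernel through calls to
sys_pkey_alloc()" can look like, the helper below counts how many keys the
calling process can allocate right now. count_free_pkeys() and MAX_PROBE
are names invented for this example; it assumes <sys/syscall.h> provides
SYS_pkey_alloc and SYS_pkey_free.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

#define MAX_PROBE 64	/* illustrative upper bound, not an architectural limit */

/*
 * Hypothetical helper: allocates keys until pkey_alloc() fails, frees
 * every key it obtained, and returns how many it managed to allocate.
 */
static int count_free_pkeys(void)
{
#if defined(SYS_pkey_alloc) && defined(SYS_pkey_free)
	long keys[MAX_PROBE];
	int n = 0;

	while (n < MAX_PROBE) {
		long pkey = syscall(SYS_pkey_alloc, 0UL, 0UL);

		if (pkey < 0)
			break;		/* out of keys, or pkeys unsupported */
		keys[n++] = pkey;
	}
	/* give everything back so the probe has no lasting effect */
	for (int i = 0; i < n; i++)
		syscall(SYS_pkey_free, keys[i]);
	return n;
#else
	return 0;
#endif
}
```

The result is only a snapshot: as noted earlier in the thread, other
threads may allocate or free keys concurrently. It also says nothing about
which access-right bits a key supports.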

^ permalink raw reply	[flat|nested] 197+ messages in thread

* Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
  2017-12-20 17:50             ` linuxram
  (?)
  (?)
@ 2017-12-20 22:49               ` benh
  -1 siblings, 0 replies; 197+ messages in thread
From: Benjamin Herrenschmidt @ 2017-12-20 22:49 UTC (permalink / raw)
  To: Ram Pai
  Cc: Dave Hansen, mpe, mingo, akpm, corbet, arnd, linuxppc-dev,
	linux-mm, x86, linux-arch, linux-doc, linux-kselftest,
	linux-kernel, paulus, khandual, aneesh.kumar, bsingharora, hbabu,
	mhocko, bauerman, ebiederm

On Wed, 2017-12-20 at 09:50 -0800, Ram Pai wrote:
> The argument against this patch is --  it should not be baked into
> the ABI as yet, since we do not have clarity on what applications need.
> 
> As it stands today the only way to figure out the information from
> userspace is by probing the kernel through calls to sys_pkey_alloc().
> 
> AT_HWCAP can be used, but that will certainly not be capable of
> providing all the information that userspace might expect.
> 
> Your thoughts?

Well, there's one well-known application wanting that whole keys
business, so why not ask them what works for them?

In the meantime, that shouldn't block the rest of the patches.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 197+ messages in thread

end of thread, other threads:[~2017-12-21  0:30 UTC | newest]

Thread overview: 197+ messages
2017-11-06  8:56 [PATCH v9 00/51] powerpc, mm: Memory Protection Keys Ram Pai
2017-11-06  8:56 ` Ram Pai
2017-11-06  8:56 ` [PATCH v9 01/51] mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS is enabled Ram Pai
2017-11-06  8:56   ` Ram Pai
2017-11-06  8:56 ` [PATCH v9 02/51] mm, powerpc, x86: introduce an additional vma bit for powerpc pkey Ram Pai
2017-11-06  8:56   ` Ram Pai
2017-11-06  8:56 ` [PATCH v9 03/51] powerpc: initial pkey plumbing Ram Pai
2017-11-06  8:56   ` Ram Pai
2017-11-06  8:56 ` [PATCH v9 04/51] powerpc: track allocation status of all pkeys Ram Pai
2017-11-06  8:56   ` Ram Pai
2017-11-06  8:56 ` [PATCH v9 05/51] powerpc: helper function to read,write AMR,IAMR,UAMOR registers Ram Pai
2017-11-06  8:56   ` [PATCH v9 05/51] powerpc: helper function to read, write AMR, IAMR, UAMOR registers Ram Pai
2017-11-06  8:56   ` [PATCH v9 05/51] powerpc: helper function to read,write AMR,IAMR,UAMOR registers Ram Pai
2017-11-06  8:56 ` [PATCH v9 06/51] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers Ram Pai
2017-11-06  8:56   ` Ram Pai
2017-11-06  8:56 ` [PATCH v9 07/51] powerpc: cleanup AMR, IAMR when a key is allocated or freed Ram Pai
2017-11-06  8:56   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 08/51] powerpc: implementation for arch_set_user_pkey_access() Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 09/51] powerpc: ability to create execute-disabled pkeys Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 10/51] powerpc: store and restore the pkey state across context switches Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 11/51] powerpc: introduce execute-only pkey Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 12/51] powerpc: ability to associate pkey to a vma Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 13/51] powerpc: implementation for arch_override_mprotect_pkey() Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 14/51] powerpc: map vma key-protection bits to pte key bits Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 15/51] powerpc: Program HPTE key protection bits Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 16/51] powerpc: helper to validate key-access permissions of a pte Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 17/51] powerpc: check key protection for user page access Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 18/51] powerpc: implementation for arch_vma_access_permitted() Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 19/51] powerpc: Handle exceptions caused by pkey violation Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 20/51] powerpc: introduce get_mm_addr_key() helper Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 21/51] powerpc: Deliver SEGV signal on pkey violation Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 22/51] powerpc/ptrace: Add memory protection key regset Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 23/51] powerpc: Enable pkey subsystem Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-13  0:54   ` Ram Pai
2017-11-13  0:54     ` Ram Pai
2017-11-13  0:54     ` [Linux-kselftest-mirror] " Ram Pai
2017-11-13  0:54     ` linuxram
2017-11-06  8:57 ` [PATCH v9 24/51] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 25/51] powerpc: sys_pkey_mprotect() system call Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 26/51] powerpc: add sys_pkey_modify() " Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 27/51] mm, x86 : introduce arch_pkeys_enabled() Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 28/51] mm: display pkey in smaps if arch_pkeys_enabled() is true Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-12-18 18:54   ` Dave Hansen
2017-12-18 18:54     ` Dave Hansen
2017-12-18 18:54     ` [Linux-kselftest-mirror] " Dave Hansen
2017-12-18 18:54     ` dave.hansen
2017-12-18 22:18     ` Ram Pai
2017-12-18 22:18       ` Ram Pai
2017-12-18 22:18       ` [Linux-kselftest-mirror] " Ram Pai
2017-12-18 22:18       ` linuxram
2017-12-18 22:28       ` Dave Hansen
2017-12-18 22:28         ` Dave Hansen
2017-12-18 22:28         ` [Linux-kselftest-mirror] " Dave Hansen
2017-12-18 22:28         ` dave.hansen
2017-12-18 23:15         ` Ram Pai
2017-12-18 23:15           ` Ram Pai
2017-12-18 23:15           ` [Linux-kselftest-mirror] " Ram Pai
2017-12-18 23:15           ` linuxram
2017-12-19  8:31           ` Gabriel Paubert
2017-12-19  8:31             ` Gabriel Paubert
2017-12-19  8:31             ` [Linux-kselftest-mirror] " Gabriel Paubert
2017-12-19  8:31             ` paubert
2017-12-19 16:22             ` Ram Pai
2017-12-19 16:22               ` Ram Pai
2017-12-19 16:22               ` [Linux-kselftest-mirror] " Ram Pai
2017-12-19 16:22               ` linuxram
2017-12-19 21:34         ` Benjamin Herrenschmidt
2017-12-19 21:34           ` Benjamin Herrenschmidt
2017-12-19 21:34           ` [Linux-kselftest-mirror] " Benjamin Herrenschmidt
2017-12-19 21:34           ` benh
2017-12-20 17:50           ` Ram Pai
2017-12-20 17:50             ` Ram Pai
2017-12-20 17:50             ` [Linux-kselftest-mirror] " Ram Pai
2017-12-20 17:50             ` linuxram
2017-12-20 22:49             ` Benjamin Herrenschmidt
2017-12-20 22:49               ` Benjamin Herrenschmidt
2017-12-20 22:49               ` [Linux-kselftest-mirror] " Benjamin Herrenschmidt
2017-12-20 22:49               ` benh
2017-12-19 10:50     ` Michael Ellerman
2017-12-19 10:50       ` Michael Ellerman
2017-12-19 10:50       ` [Linux-kselftest-mirror] " Michael Ellerman
2017-12-19 10:50       ` mpe
2017-12-19 16:32       ` Ram Pai
2017-12-19 16:32         ` Ram Pai
2017-12-19 16:32         ` [Linux-kselftest-mirror] " Ram Pai
2017-12-19 16:32         ` linuxram
2017-11-06  8:57 ` [PATCH v9 30/51] Documentation/x86: Move protecton key documentation to arch neutral directory Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 31/51] Documentation/vm: PowerPC specific updates to memory protection keys Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 32/51] selftest/x86: Move protecton key selftest to arch neutral directory Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 33/51] selftest/vm: rename all references to pkru to a generic name Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 34/51] selftest/vm: move generic definitions to header file Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 35/51] selftest/vm: typecast the pkey register Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 36/51] selftest/vm: generic function to handle shadow key register Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 37/51] selftest/vm: fix the wrong assert in pkey_disable_set() Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 38/51] selftest/vm: fixed bugs in pkey_disable_clear() Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 39/51] selftest/vm: clear the bits in shadow reg when a pkey is freed Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 40/51] selftest/vm: fix alloc_random_pkey() to make it really random Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 41/51] selftest/vm: introduce two arch independent abstraction Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 42/51] selftest/vm: pkey register should match shadow pkey Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 43/51] selftest/vm: generic cleanup Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 44/51] selftest/vm: powerpc implementation for generic abstraction Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-09 18:47   ` Breno Leitao
2017-11-09 18:47     ` Breno Leitao
2017-11-09 18:47     ` [Linux-kselftest-mirror] " Breno Leitao
2017-11-09 18:47     ` leitao
2017-11-09 23:37     ` Ram Pai
2017-11-09 23:37       ` Ram Pai
2017-11-09 23:37       ` [Linux-kselftest-mirror] " Ram Pai
2017-11-09 23:37       ` linuxram
2017-11-10 11:36       ` Breno Leitao
2017-11-10 11:36         ` Breno Leitao
2017-11-10 11:36         ` [Linux-kselftest-mirror] " Breno Leitao
2017-11-10 11:36         ` leitao
2017-11-06  8:57 ` [PATCH v9 45/51] selftest/vm: fix an assertion in test_pkey_alloc_exhaust() Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 46/51] selftest/vm: associate key on a mapped page and detect access violation Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 47/51] selftest/vm: associate key on a mapped page and detect write violation Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 48/51] selftest/vm: detect write violation on a mapped access-denied-key page Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 49/51] selftest/vm: sub-page allocator Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 50/51] selftests/powerpc: Add ptrace tests for Protection Key register Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06  8:57 ` [PATCH v9 51/51] selftests/powerpc: Add core file test " Ram Pai
2017-11-06  8:57   ` Ram Pai
2017-11-06 21:28 ` [PATCH v9 00/51] powerpc, mm: Memory Protection Keys Florian Weimer
2017-11-06 21:28   ` Florian Weimer
2017-11-07  1:22   ` Ram Pai
2017-11-07  1:22     ` Ram Pai
2017-11-07  7:32     ` Florian Weimer
2017-11-07  7:32       ` Florian Weimer
2017-11-07  7:32       ` Florian Weimer
2017-11-07 22:39       ` Ram Pai
2017-11-07 22:39         ` Ram Pai
2017-11-07 22:39         ` Ram Pai
2017-11-07 22:39         ` Ram Pai
2017-11-07 22:39         ` [Linux-kselftest-mirror] " Ram Pai
2017-11-07 22:39         ` linuxram
2017-11-07 22:47         ` Dave Hansen
2017-11-07 22:47           ` Dave Hansen
2017-11-07 22:47           ` [Linux-kselftest-mirror] " Dave Hansen
2017-11-07 22:47           ` dave.hansen
2017-11-07 23:44           ` Ram Pai
2017-11-07 23:44             ` Ram Pai
2017-11-09 22:23     ` Ram Pai
2017-11-09 22:23       ` Ram Pai
2017-11-09 22:23       ` [Linux-kselftest-mirror] " Ram Pai
2017-11-09 22:23       ` linuxram
2017-11-10 18:10 ` Christophe LEROY
2017-11-10 18:10   ` Christophe LEROY
2017-11-10 18:10   ` Christophe LEROY
2017-11-10 18:10   ` [Linux-kselftest-mirror] " Christophe LEROY
2017-11-10 18:10   ` christophe.leroy
2017-11-12 20:45   ` Ram Pai
2017-11-12 20:45     ` Ram Pai
2017-11-12 20:45     ` [Linux-kselftest-mirror] " Ram Pai
2017-11-12 20:45     ` linuxram
