From: Dave Hansen <dave@sr71.net>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, Dave Hansen <dave@sr71.net>,
	dave.hansen@linux.intel.com, linux-api@vger.kernel.org
Subject: [PATCH 32/34] x86, pkeys: add pkey set/get syscalls
Date: Thu, 03 Dec 2015 17:15:08 -0800
Message-ID: <20151204011508.0275A2E4@viggo.jf.intel.com>
In-Reply-To: <20151204011424.8A36E365@viggo.jf.intel.com>


From: Dave Hansen <dave.hansen@linux.intel.com>

This establishes two more system calls for protection key management:

	unsigned long pkey_get(int pkey);
	int pkey_set(int pkey, unsigned long access_rights);

The return value from pkey_get() and the 'access_rights' passed
to pkey_set() use the same format: a bitmask containing
PKEY_DISABLE_WRITE and/or PKEY_DISABLE_ACCESS, or nothing set at all.

These are meant to replace userspace's direct use of rdpkru/wrpkru.

With current hardware, the kernel cannot enforce that it has
control over a given key.  But this at least allows the kernel
to indicate to userspace that userspace does not control a given
protection key.

The kernel does _not_ enforce that this interface must be used for
changes to PKRU, even for keys it has not "allocated".

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-api@vger.kernel.org
---
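
As an illustration of the 'access_rights' format described above, here
is a minimal userspace model of the conversion between a PKRU value and
the per-key bitmask.  It is a sketch, not the kernel code.  The
numeric values of PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE, the
function names pkru_to_access_rights() and pkru_with_access_rights(),
and NR_PKEYS are assumptions made for the example.  It also assumes
that an access-disabled key reports write-disabled as well, which is
how the __pkru_allows_write() helper (not shown in this patch) is
believed to behave.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: this patch does not show the real definitions. */
#define PKEY_DISABLE_ACCESS	0x1UL
#define PKEY_DISABLE_WRITE	0x2UL

/* PKRU holds two bits per key: access-disable, then write-disable. */
#define PKRU_AD_BIT		0x1u
#define PKRU_WD_BIT		0x2u
#define PKRU_BITS_PER_PKEY	2
#define NR_PKEYS		16	/* hypothetical name for the example */

/* "get" direction: PKRU register format -> PKEY_DISABLE_* bitmask. */
static unsigned long pkru_to_access_rights(uint32_t pkru, int pkey)
{
	unsigned long rights = 0;
	int shift;

	assert(pkey >= 0 && pkey < NR_PKEYS);
	shift = pkey * PKRU_BITS_PER_PKEY;

	if (pkru & (PKRU_AD_BIT << shift))
		rights |= PKEY_DISABLE_ACCESS;
	/* Assumption: access-disable implies writes are blocked too. */
	if (pkru & ((PKRU_AD_BIT | PKRU_WD_BIT) << shift))
		rights |= PKEY_DISABLE_WRITE;

	return rights;
}

/* "set" direction: return 'pkru' with the rights for 'pkey' replaced. */
static uint32_t pkru_with_access_rights(uint32_t pkru, int pkey,
					unsigned long rights)
{
	uint32_t bits = 0;
	int shift;

	assert(pkey >= 0 && pkey < NR_PKEYS);
	shift = pkey * PKRU_BITS_PER_PKEY;

	if (rights & PKEY_DISABLE_ACCESS)
		bits |= PKRU_AD_BIT;
	if (rights & PKEY_DISABLE_WRITE)
		bits |= PKRU_WD_BIT;

	/* Clear the key's old two bits, then install the new ones. */
	pkru &= ~((PKRU_AD_BIT | PKRU_WD_BIT) << shift);
	return pkru | (bits << shift);
}
```

In this model only the two bits belonging to 'pkey' change on a set,
so the rights of every other key are preserved.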

 b/arch/x86/entry/syscalls/syscall_32.tbl |    2 +
 b/arch/x86/entry/syscalls/syscall_64.tbl |    2 +
 b/arch/x86/include/asm/mmu_context.h     |    2 +
 b/arch/x86/include/asm/pkeys.h           |    2 -
 b/arch/x86/kernel/fpu/xstate.c           |   55 +++++++++++++++++++++++++++++--
 b/include/linux/pkeys.h                  |    8 ++++
 b/mm/mprotect.c                          |   34 +++++++++++++++++++
 7 files changed, 102 insertions(+), 3 deletions(-)

diff -puN arch/x86/entry/syscalls/syscall_32.tbl~pkey-syscalls-set-get arch/x86/entry/syscalls/syscall_32.tbl
--- a/arch/x86/entry/syscalls/syscall_32.tbl~pkey-syscalls-set-get	2015-12-03 16:21:33.139012003 -0800
+++ b/arch/x86/entry/syscalls/syscall_32.tbl	2015-12-03 16:21:33.151012548 -0800
@@ -386,3 +386,5 @@
 377	i386	pkey_mprotect		sys_pkey_mprotect
 378	i386	pkey_alloc		sys_pkey_alloc
 379	i386	pkey_free		sys_pkey_free
+380	i386	pkey_get		sys_pkey_get
+381	i386	pkey_set		sys_pkey_set
diff -puN arch/x86/entry/syscalls/syscall_64.tbl~pkey-syscalls-set-get arch/x86/entry/syscalls/syscall_64.tbl
--- a/arch/x86/entry/syscalls/syscall_64.tbl~pkey-syscalls-set-get	2015-12-03 16:21:33.141012094 -0800
+++ b/arch/x86/entry/syscalls/syscall_64.tbl	2015-12-03 16:21:33.152012593 -0800
@@ -335,6 +335,8 @@
 326	common	pkey_mprotect		sys_pkey_mprotect
 327	common	pkey_alloc		sys_pkey_alloc
 328	common	pkey_free		sys_pkey_free
+329	common	pkey_get		sys_pkey_get
+330	common	pkey_set		sys_pkey_set
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff -puN arch/x86/include/asm/mmu_context.h~pkey-syscalls-set-get arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~pkey-syscalls-set-get	2015-12-03 16:21:33.142012139 -0800
+++ b/arch/x86/include/asm/mmu_context.h	2015-12-03 16:21:33.152012593 -0800
@@ -340,5 +340,7 @@ static inline bool arch_pte_access_permi
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
+extern unsigned long arch_get_user_pkey_access(struct task_struct *tsk,
+		int pkey);
 
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff -puN arch/x86/include/asm/pkeys.h~pkey-syscalls-set-get arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~pkey-syscalls-set-get	2015-12-03 16:21:33.144012230 -0800
+++ b/arch/x86/include/asm/pkeys.h	2015-12-03 16:21:33.152012593 -0800
@@ -16,7 +16,7 @@
 } while (0)
 
 static inline
-bool mm_pkey_is_allocated(struct mm_struct *mm, unsigned long pkey)
+bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
 	if (!arch_validate_pkey(pkey))
 		return true;
diff -puN arch/x86/kernel/fpu/xstate.c~pkey-syscalls-set-get arch/x86/kernel/fpu/xstate.c
--- a/arch/x86/kernel/fpu/xstate.c~pkey-syscalls-set-get	2015-12-03 16:21:33.145012275 -0800
+++ b/arch/x86/kernel/fpu/xstate.c	2015-12-03 16:21:33.153012638 -0800
@@ -687,7 +687,7 @@ void fpu__resume_cpu(void)
  *
  * Note: does not work for compacted buffers.
  */
-void *__raw_xsave_addr(struct xregs_state *xsave, int xstate_feature_mask)
+static void *__raw_xsave_addr(struct xregs_state *xsave, int xstate_feature_mask)
 {
 	int feature_nr = fls64(xstate_feature_mask) - 1;
 
@@ -862,6 +862,7 @@ out:
 
 #define NR_VALID_PKRU_BITS (CONFIG_NR_PROTECTION_KEYS * 2)
 #define PKRU_VALID_MASK (NR_VALID_PKRU_BITS - 1)
+#define PKRU_INIT_STATE	0
 
 /*
  * This will go out and modify the XSAVE buffer so that PKRU is
@@ -880,6 +881,9 @@ int arch_set_user_pkey_access(struct tas
 	int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
 	u32 new_pkru_bits = 0;
 
+	/* Only support manipulating current task for now */
+	if (tsk != current)
+		return -EINVAL;
 	if (!arch_validate_pkey(pkey))
 		return -EINVAL;
 	/*
@@ -907,7 +911,7 @@ int arch_set_user_pkey_access(struct tas
 	 * state.
 	 */
 	if (!old_pkru_state)
-		new_pkru_state.pkru = 0;
+		new_pkru_state.pkru = PKRU_INIT_STATE;
 	else
 		new_pkru_state.pkru = old_pkru_state->pkru;
 
@@ -932,4 +936,51 @@ int arch_set_user_pkey_access(struct tas
 
 	return 0;
 }
+
+/*
+ * Figures out what the rights are currently for 'pkey'.
+ * Converts from PKRU's format to the user-visible PKEY_DISABLE_*
+ * format.
+ */
+unsigned long arch_get_user_pkey_access(struct task_struct *tsk, int pkey)
+{
+	struct fpu *fpu = &current->thread.fpu;
+	u32 pkru_reg;
+	int ret = 0;
+
+	/* Only support querying the current task for now */
+	if (tsk != current)
+		return -1;
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return -1;
+	/*
+	 * The contents of PKRU itself are invalid.  Consult the
+	 * task's XSAVE buffer for PKRU contents.  This is much
+	 * more expensive than reading PKRU directly, but should
+	 * be rare or impossible with eagerfpu mode.
+	 */
+	if (!fpu->fpregs_active) {
+		struct xregs_state *xsave = &fpu->state.xsave;
+		struct pkru_state *pkru_state =
+			get_xsave_addr(xsave, XFEATURE_MASK_PKRU);
+		/*
+		 * PKRU is in its init state and not present in
+		 * the buffer in a saved form.
+		 */
+		if (!pkru_state)
+			return PKRU_INIT_STATE;
+
+		return pkru_state->pkru;
+	}
+	/*
+	 * Consult the user register directly.
+	 */
+	pkru_reg = read_pkru();
+	if (!__pkru_allows_read(pkru_reg, pkey))
+		ret |= PKEY_DISABLE_ACCESS;
+	if (!__pkru_allows_write(pkru_reg, pkey))
+		ret |= PKEY_DISABLE_WRITE;
+
+	return ret;
+}
 #endif /* CONFIG_ARCH_HAS_PKEYS */
diff -puN include/linux/pkeys.h~pkey-syscalls-set-get include/linux/pkeys.h
--- a/include/linux/pkeys.h~pkey-syscalls-set-get	2015-12-03 16:21:33.147012366 -0800
+++ b/include/linux/pkeys.h	2015-12-03 16:21:33.153012638 -0800
@@ -43,6 +43,14 @@ static inline int mm_pkey_free(struct mm
 static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 			unsigned long init_val)
 {
+	return -EINVAL;
+}
+
+static inline
+unsigned long arch_get_user_pkey_access(struct task_struct *tsk, int pkey)
+{
+	if (pkey)
+		return -1;
 	return 0;
 }
 
diff -puN mm/mprotect.c~pkey-syscalls-set-get mm/mprotect.c
--- a/mm/mprotect.c~pkey-syscalls-set-get	2015-12-03 16:21:33.148012412 -0800
+++ b/mm/mprotect.c	2015-12-03 16:21:33.154012684 -0800
@@ -531,3 +531,37 @@ SYSCALL_DEFINE1(pkey_free, int, pkey)
 	 */
 	return ret;
 }
+
+SYSCALL_DEFINE1(pkey_get, int, pkey)
+{
+	unsigned long ret = 0;
+
+	down_write(&current->mm->mmap_sem);
+	if (!mm_pkey_is_allocated(current->mm, pkey))
+		ret = -EBADF;
+	up_write(&current->mm->mmap_sem);
+
+	if (ret)
+		return ret;
+
+	ret = arch_get_user_pkey_access(current, pkey);
+
+	return ret;
+}
+
+SYSCALL_DEFINE2(pkey_set, int, pkey, unsigned long, access_rights)
+{
+	unsigned long ret = 0;
+
+	down_write(&current->mm->mmap_sem);
+	if (!mm_pkey_is_allocated(current->mm, pkey))
+		ret = -EBADF;
+	up_write(&current->mm->mmap_sem);
+
+	if (ret)
+		return ret;
+
+	ret = arch_set_user_pkey_access(current, pkey, access_rights);
+
+	return ret;
+}
_
