From: Dave Hansen <dave@sr71.net>
To: linux-kernel@vger.kernel.org
Cc: x86@kernel.org, Dave Hansen <dave@sr71.net>,
	dave.hansen@linux.intel.com, linux-api@vger.kernel.org
Subject: [PATCH 35/37] x86, pkeys: add pkey set/get syscalls
Date: Mon, 16 Nov 2015 19:36:00 -0800
Message-ID: <20151117033600.D9061172@viggo.jf.intel.com>
In-Reply-To: <20151117033511.BFFA1440@viggo.jf.intel.com>


From: Dave Hansen <dave.hansen@linux.intel.com>

This establishes two more system calls for protection key management:

	unsigned long pkey_get(int pkey);
	int pkey_set(int pkey, unsigned long access_rights);

The return value from pkey_get() and the 'access_rights' passed
to pkey_set() use the same format: a bitmask containing
PKEY_DISABLE_WRITE and/or PKEY_DISABLE_ACCESS, or nothing set at all.
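
As a rough illustration of that bitmask format, here is a minimal
userspace sketch of how one key's two PKRU bits could map onto it.
The flag names follow the PKEY_DISABLE_* spelling used by the code
in this patch; their numeric values, the 16-key count and the helper
pkru_to_rights() are assumptions made for the example and are not
taken from this series.  It is a plain bit-for-bit mapping and may
differ from what the __pkru_allows_*() helpers do in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the uapi header defining these is not part of this patch. */
#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2

/* Illustrative layout: 16 keys, two PKRU bits per key (AD low, WD high). */
#define NR_PKEYS		16
#define PKRU_BITS_PER_PKEY	2
#define PKRU_AD_BIT		0x1u
#define PKRU_WD_BIT		0x2u

/* Hypothetical helper: rights mask for 'pkey' from a raw PKRU value. */
static unsigned long pkru_to_rights(uint32_t pkru, int pkey)
{
	unsigned long rights = 0;
	uint32_t bits;

	assert(pkey >= 0 && pkey < NR_PKEYS);
	/* Move this key's two bits down to bit positions 0 and 1. */
	bits = pkru >> (pkey * PKRU_BITS_PER_PKEY);

	if (bits & PKRU_AD_BIT)
		rights |= PKEY_DISABLE_ACCESS;
	if (bits & PKRU_WD_BIT)
		rights |= PKEY_DISABLE_WRITE;
	return rights;
}
```

With these assumptions, a PKRU value of 0 yields an empty mask for
every key, which is the "nothing set at all" case.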

These replace userspace's direct use of rdpkru/wrpkru.
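
For comparison, the read-modify-write that userspace would otherwise
do by hand around rdpkru/wrpkru can be sketched as below.  This is
only a model operating on a plain integer, not the kernel
implementation: the helper name pkru_set_rights(), the flag values
and the 16-key layout are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the uapi header defining these is not part of this patch. */
#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2

/* Illustrative layout: 16 keys, two PKRU bits per key (AD low, WD high). */
#define NR_PKEYS		16
#define PKRU_BITS_PER_PKEY	2
#define PKRU_AD_BIT		0x1u
#define PKRU_WD_BIT		0x2u

/* Hypothetical helper: return 'pkru' with the rights for 'pkey' replaced. */
static uint32_t pkru_set_rights(uint32_t pkru, int pkey, unsigned long rights)
{
	uint32_t new_bits = 0;
	int shift;

	assert(pkey >= 0 && pkey < NR_PKEYS);
	shift = pkey * PKRU_BITS_PER_PKEY;

	if (rights & PKEY_DISABLE_ACCESS)
		new_bits |= PKRU_AD_BIT;
	if (rights & PKEY_DISABLE_WRITE)
		new_bits |= PKRU_WD_BIT;

	/* Clear this key's old bits, then install the new ones. */
	pkru &= ~((PKRU_AD_BIT | PKRU_WD_BIT) << shift);
	pkru |= new_bits << shift;
	return pkru;
}
```

Bits belonging to other keys are left untouched, so the sketch can
be applied once per key that needs changing.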

With current hardware, the kernel cannot enforce that it has
control over a given key.  But this at least allows the kernel
to indicate to userspace that userspace does not control a given
protection key.

The kernel does _not_ enforce that this interface must be used for
changes to PKRU, even for keys it has not "allocated".

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-api@vger.kernel.org
---

 b/arch/x86/entry/syscalls/syscall_32.tbl |    2 +
 b/arch/x86/entry/syscalls/syscall_64.tbl |    2 +
 b/arch/x86/include/asm/mmu_context.h     |    2 +
 b/arch/x86/include/asm/pkeys.h           |    2 -
 b/arch/x86/kernel/fpu/xstate.c           |   55 +++++++++++++++++++++++++++++--
 b/include/linux/pkeys.h                  |    8 ++++
 b/mm/mprotect.c                          |   34 +++++++++++++++++++
 7 files changed, 102 insertions(+), 3 deletions(-)

diff -puN arch/x86/entry/syscalls/syscall_32.tbl~pkey-syscalls-set-get arch/x86/entry/syscalls/syscall_32.tbl
--- a/arch/x86/entry/syscalls/syscall_32.tbl~pkey-syscalls-set-get	2015-11-16 12:36:34.851873071 -0800
+++ b/arch/x86/entry/syscalls/syscall_32.tbl	2015-11-16 12:36:34.863873616 -0800
@@ -386,3 +386,5 @@
 377	i386	pkey_mprotect		sys_pkey_mprotect
 378	i386	pkey_alloc		sys_pkey_alloc
 379	i386	pkey_free		sys_pkey_free
+380	i386	pkey_get		sys_pkey_get
+381	i386	pkey_set		sys_pkey_set
diff -puN arch/x86/entry/syscalls/syscall_64.tbl~pkey-syscalls-set-get arch/x86/entry/syscalls/syscall_64.tbl
--- a/arch/x86/entry/syscalls/syscall_64.tbl~pkey-syscalls-set-get	2015-11-16 12:36:34.852873116 -0800
+++ b/arch/x86/entry/syscalls/syscall_64.tbl	2015-11-16 12:36:34.864873661 -0800
@@ -335,6 +335,8 @@
 326	common	pkey_mprotect		sys_pkey_mprotect
 327	common	pkey_alloc		sys_pkey_alloc
 328	common	pkey_free		sys_pkey_free
+329	common	pkey_get		sys_pkey_get
+330	common	pkey_set		sys_pkey_set
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff -puN arch/x86/include/asm/mmu_context.h~pkey-syscalls-set-get arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~pkey-syscalls-set-get	2015-11-16 12:36:34.854873207 -0800
+++ b/arch/x86/include/asm/mmu_context.h	2015-11-16 12:36:34.864873661 -0800
@@ -340,5 +340,7 @@ static inline bool arch_pte_access_permi
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
+extern unsigned long arch_get_user_pkey_access(struct task_struct *tsk,
+		int pkey);
 
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff -puN arch/x86/include/asm/pkeys.h~pkey-syscalls-set-get arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~pkey-syscalls-set-get	2015-11-16 12:36:34.855873253 -0800
+++ b/arch/x86/include/asm/pkeys.h	2015-11-16 19:14:09.231117816 -0800
@@ -16,7 +16,7 @@
 } while (0)
 
 static inline
-bool mm_pkey_is_allocated(struct mm_struct *mm, unsigned long pkey)
+bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
 	if (!arch_validate_pkey(pkey))
 		return true;
diff -puN arch/x86/kernel/fpu/xstate.c~pkey-syscalls-set-get arch/x86/kernel/fpu/xstate.c
--- a/arch/x86/kernel/fpu/xstate.c~pkey-syscalls-set-get	2015-11-16 12:36:34.857873343 -0800
+++ b/arch/x86/kernel/fpu/xstate.c	2015-11-16 12:36:34.865873706 -0800
@@ -687,7 +687,7 @@ void fpu__resume_cpu(void)
  *
  * Note: does not work for compacted buffers.
  */
-void *__raw_xsave_addr(struct xregs_state *xsave, int xstate_feature_mask)
+static void *__raw_xsave_addr(struct xregs_state *xsave, int xstate_feature_mask)
 {
 	int feature_nr = fls64(xstate_feature_mask) - 1;
 
@@ -871,6 +871,7 @@ static void fpu__xfeature_set_state(int
 
 #define NR_VALID_PKRU_BITS (CONFIG_NR_PROTECTION_KEYS * 2)
 #define PKRU_VALID_MASK (NR_VALID_PKRU_BITS - 1)
+#define PKRU_INIT_STATE	0
 
 /*
  * This will go out and modify the XSAVE buffer so that PKRU is
@@ -889,6 +890,9 @@ int arch_set_user_pkey_access(struct tas
 	int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
 	u32 new_pkru_bits = 0;
 
+	/* Only support manipulating current task for now */
+	if (tsk != current)
+		return -EINVAL;
 	if (!arch_validate_pkey(pkey))
 		return -EINVAL;
 	/*
@@ -916,7 +920,7 @@ int arch_set_user_pkey_access(struct tas
 	 * state.
 	 */
 	if (!old_pkru_state)
-		new_pkru_state.pkru = 0;
+		new_pkru_state.pkru = PKRU_INIT_STATE;
 	else
 		new_pkru_state.pkru = old_pkru_state->pkru;
 
@@ -941,4 +945,51 @@ int arch_set_user_pkey_access(struct tas
 
 	return 0;
 }
+
+/*
+ * Figures out what the rights are currently for 'pkey'.
+ * Converts from PKRU's format to the user-visible PKEY_DISABLE_*
+ * format.
+ */
+unsigned long arch_get_user_pkey_access(struct task_struct *tsk, int pkey)
+{
+	struct fpu *fpu = &current->thread.fpu;
+	u32 pkru_reg;
+	int ret = 0;
+
+	/* Only support querying the current task for now */
+	if (tsk != current)
+		return -1;
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return -1;
+	/*
+	 * The contents of PKRU itself are invalid.  Consult the
+	 * task's XSAVE buffer for PKRU contents.  This is much
+	 * more expensive than reading PKRU directly, but should
+	 * be rare or impossible with eagerfpu mode.
+	 */
+	if (!fpu->fpregs_active) {
+		struct xregs_state *xsave = &fpu->state.xsave;
+		struct pkru_state *pkru_state =
+			get_xsave_addr(xsave, XFEATURE_MASK_PKRU);
+		/*
+		 * If PKRU is in its init state, it is not present
+		 * in the buffer in a saved form.
+		 */
+		pkru_reg = pkru_state ? pkru_state->pkru : PKRU_INIT_STATE;
+	} else {
+		/*
+		 * Registers are live: consult the user register
+		 * directly.
+		 */
+		pkru_reg = read_pkru();
+	}
+	/* Convert this key's PKRU bits to PKEY_DISABLE_* format. */
+	if (!__pkru_allows_read(pkru_reg, pkey))
+		ret |= PKEY_DISABLE_ACCESS;
+	if (!__pkru_allows_write(pkru_reg, pkey))
+		ret |= PKEY_DISABLE_WRITE;
+
+	return ret;
+}
 #endif /* CONFIG_ARCH_HAS_PKEYS */
diff -puN include/linux/pkeys.h~pkey-syscalls-set-get include/linux/pkeys.h
--- a/include/linux/pkeys.h~pkey-syscalls-set-get	2015-11-16 12:36:34.858873389 -0800
+++ b/include/linux/pkeys.h	2015-11-16 12:36:34.865873706 -0800
@@ -43,6 +43,14 @@ static inline int mm_pkey_free(struct mm
 static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 			unsigned long init_val)
 {
+	return -EINVAL;
+}
+
+static inline
+unsigned long arch_get_user_pkey_access(struct task_struct *tsk, int pkey)
+{
+	if (pkey)
+		return -1;
 	return 0;
 }
 
diff -puN mm/mprotect.c~pkey-syscalls-set-get mm/mprotect.c
--- a/mm/mprotect.c~pkey-syscalls-set-get	2015-11-16 12:36:34.860873479 -0800
+++ b/mm/mprotect.c	2015-11-16 12:36:34.865873706 -0800
@@ -529,3 +529,37 @@ SYSCALL_DEFINE1(pkey_free, int, pkey)
 	 */
 	return ret;
 }
+
+SYSCALL_DEFINE1(pkey_get, int, pkey)
+{
+	unsigned long ret = 0;
+
+	down_write(&current->mm->mmap_sem);
+	if (!mm_pkey_is_allocated(current->mm, pkey))
+		ret = -EBADF;
+	up_write(&current->mm->mmap_sem);
+
+	if (ret)
+		return ret;
+
+	ret = arch_get_user_pkey_access(current, pkey);
+
+	return ret;
+}
+
+SYSCALL_DEFINE2(pkey_set, int, pkey, unsigned long, access_rights)
+{
+	unsigned long ret = 0;
+
+	down_write(&current->mm->mmap_sem);
+	if (!mm_pkey_is_allocated(current->mm, pkey))
+		ret = -EBADF;
+	up_write(&current->mm->mmap_sem);
+
+	if (ret)
+		return ret;
+
+	ret = arch_set_user_pkey_access(current, pkey, access_rights);
+
+	return ret;
+}
_
