linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ira.weiny@intel.com
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	x86@kernel.org, Dave Hansen <dave.hansen@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: [PATCH RFC V2 05/17] x86/pks: Add PKS kernel API
Date: Fri, 17 Jul 2020 00:20:44 -0700	[thread overview]
Message-ID: <20200717072056.73134-6-ira.weiny@intel.com> (raw)
In-Reply-To: <20200717072056.73134-1-ira.weiny@intel.com>

From: Fenghua Yu <fenghua.yu@intel.com>

PKS allows kernel users to define domains of page mappings which have
additional protections beyond the paging protections.

Add an API to allocate, use, and free a protection key which identifies
such a domain.

We export 2 new symbols pks_key_alloc() and pks_key_free() while
pks_update_protection() is exposed as an inline function via header
file.

pks_key_alloc() reserves pkey 0 for default kernel pages.  The other 15
keys are dynamically allocated to allow better use of the limited key
space.

This, and the fact that PKS may not be available on all arch's, means
callers of the allocator _must_ be prepared for it to fail and take
appropriate action to run without their allocated domain.  This is not
anticipated to be a problem as these protections only serve to harden
memory and users should be no worse off than before the introduction of
PKS.

PAGE_KERNEL_PKEY(key) and _PAGE_PKEY(pkey) aid in setting page table
entry bits by kernel users.  Note these defines will be used in follow
on patches but are included here for a complete interface.

pks_update_protection() is inlined for performance and allows kernel
users the ability to change the protections for the domain identified by
the pkey specified.  It is undefined behavior to call this on a pkey not
allocated by the allocator.  And will WARN_ON if called on architectures
which do not support PKS.  (Again callers are expected to check the
return of pks_key_alloc() before using this API further.)

It should also be noted that the underlying WRMSR(MSR_IA32_PKRS) is not
serializing but still maintains ordering properties similar to WRPKRU.
The current SDM section on PKRS needs updating but should be the same as
that of WRPKRU.  So to quote from the WRPKRU text:

	WRPKRU will never execute speculatively. Memory accesses
	affected by PKRU register will not execute (even speculatively)
	until all prior executions of WRPKRU have completed execution
	and updated the PKRU register.

Finally, pks_key_free() allows a user to return the key to the
allocator for use by others.

The interface maintains Access Disabled (AD=1) for all keys not
currently allocated.  Therefore, the user can depend on access being
disabled when pks_key_alloc() returns a key.

Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 arch/x86/include/asm/pgtable_types.h  |  4 ++
 arch/x86/include/asm/pkeys.h          | 30 ++++++++++
 arch/x86/include/asm/pkeys_internal.h |  4 ++
 arch/x86/mm/pkeys.c                   | 79 +++++++++++++++++++++++++++
 include/linux/pkeys.h                 | 14 +++++
 5 files changed, 131 insertions(+)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 816b31c68550..2ab45ef89c7d 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -73,6 +73,8 @@
 			 _PAGE_PKEY_BIT2 | \
 			 _PAGE_PKEY_BIT3)
 
+#define _PAGE_PKEY(pkey)	(_AT(pteval_t, pkey) << _PAGE_BIT_PKEY_BIT0)
+
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
 #define _PAGE_KNL_ERRATUM_MASK (_PAGE_DIRTY | _PAGE_ACCESSED)
 #else
@@ -229,6 +231,8 @@ enum page_cache_mode {
 #define PAGE_KERNEL_IO		__pgprot_mask(__PAGE_KERNEL_IO)
 #define PAGE_KERNEL_IO_NOCACHE	__pgprot_mask(__PAGE_KERNEL_IO_NOCACHE)
 
+#define PAGE_KERNEL_PKEY(pkey)	__pgprot_mask(__PAGE_KERNEL | _PAGE_PKEY(pkey))
+
 #endif	/* __ASSEMBLY__ */
 
 /*         xwr */
diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index 34cef29fed20..e30ea907abb6 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -138,4 +138,34 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 
 u32 get_new_pkr(u32 old_pkr, int pkey, unsigned long init_val);
 
+#ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
+int pks_key_alloc(const char *const pkey_user);
+void pks_key_free(int pkey);
+u32 get_new_pkr(u32 old_pkr, int pkey, unsigned long init_val);
+
+/*
+ * pks_update_protection - Update the protection of the specified key
+ *
+ * @pkey: Key for the domain to change
+ * @protection: protection bits to be used
+ *
+ * Protection utilizes the same protection bits specified for User pkeys
+ *     PKEY_DISABLE_ACCESS
+ *     PKEY_DISABLE_WRITE
+ *
+ * This is not a global update.  It only affects the current running thread.
+ *
+ * It is undefined and a bug for users to call this without having allocated a
+ * pkey and using it as pkey here.
+ */
+static inline void pks_update_protection(int pkey, unsigned long protection)
+{
+	current->thread.saved_pkrs = get_new_pkr(current->thread.saved_pkrs,
+						 pkey, protection);
+	preempt_disable();
+	write_pkrs(current->thread.saved_pkrs);
+	preempt_enable();
+}
+#endif /* CONFIG_ARCH_HAS_SUPERVISOR_PKEYS */
+
 #endif /*_ASM_X86_PKEYS_H */
diff --git a/arch/x86/include/asm/pkeys_internal.h b/arch/x86/include/asm/pkeys_internal.h
index 05257cdc7200..e34f380c66d1 100644
--- a/arch/x86/include/asm/pkeys_internal.h
+++ b/arch/x86/include/asm/pkeys_internal.h
@@ -22,6 +22,10 @@
 			 PKR_AD_KEY(10) | PKR_AD_KEY(11) | PKR_AD_KEY(12) | \
 			 PKR_AD_KEY(13) | PKR_AD_KEY(14) | PKR_AD_KEY(15))
 
+/*  PKS supports 16 keys. Key 0 is reserved for the kernel. */
+#define        PKS_KERN_DEFAULT_KEY    0
+#define        PKS_NUM_KEYS            16
+
 #ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
 void write_pkrs(u32 pkrs_val);
 #else
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 0f86f2374bd7..16f735c12fcd 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -3,6 +3,9 @@
  * Intel Memory Protection Keys management
  * Copyright (c) 2015, Intel Corporation.
  */
+#undef pr_fmt
+#define pr_fmt(fmt) "x86/pkeys: " fmt
+
 #include <linux/debugfs.h>		/* debugfs_create_u32()		*/
 #include <linux/mm_types.h>             /* mm_struct, vma, etc...       */
 #include <linux/pkeys.h>                /* PKEY_*                       */
@@ -249,3 +252,79 @@ void write_pkrs(u32 pkrs_val)
 	this_cpu_write(pkrs_cache, pkrs_val);
 	wrmsrl(MSR_IA32_PKRS, pkrs_val);
 }
+
+DEFINE_MUTEX(pks_lock);
+static const char pks_key_user0[] = "kernel";
+
+/* Store names of allocated keys for debug.  Key 0 is reserved for the kernel.  */
+static const char *pks_key_users[PKS_NUM_KEYS] = {
+	pks_key_user0
+};
+
+/*
+ * Each key is represented by a bit.  Bit 0 is set for key 0 and reserved for
+ * its use.  We use ulong for the bit operations but only 16 bits are used.
+ */
+static unsigned long pks_key_allocation_map = 1 << PKS_KERN_DEFAULT_KEY;
+
+/*
+ * pks_key_alloc - Allocate a PKS key
+ *
+ * @pkey_user: String stored for debugging of key exhaustion.  The caller is
+ * responsible to maintain this memory until pks_key_free().
+ */
+int pks_key_alloc(const char * const pkey_user)
+{
+	int nr, old_bit, pkey;
+
+	might_sleep();
+
+	if (!cpu_feature_enabled(X86_FEATURE_PKS))
+		return -EINVAL;
+
+	mutex_lock(&pks_lock);
+	/* Find a free bit (0) in the bit map. */
+	old_bit = 1;
+	while (old_bit) {
+		nr = ffz(pks_key_allocation_map);
+		old_bit = __test_and_set_bit(nr, &pks_key_allocation_map);
+	}
+
+	if (nr < PKS_NUM_KEYS) {
+		pkey = nr;
+		/* for debugging key exhaustion */
+		pks_key_users[pkey] = pkey_user;
+	} else {
+		pkey = -ENOSPC;
+		pr_info("Cannot allocate supervisor key for %s.\n",
+			pkey_user);
+	}
+
+	mutex_unlock(&pks_lock);
+	return pkey;
+}
+EXPORT_SYMBOL_GPL(pks_key_alloc);
+
+/*
+ * pks_key_free - Free a previously allocate PKS key
+ *
+ * @pkey: Key to be free'ed
+ */
+void pks_key_free(int pkey)
+{
+	might_sleep();
+
+	if (!cpu_feature_enabled(X86_FEATURE_PKS))
+		return;
+
+	if (pkey >= PKS_NUM_KEYS || pkey <= PKS_KERN_DEFAULT_KEY)
+		return;
+
+	mutex_lock(&pks_lock);
+	__clear_bit(pkey, &pks_key_allocation_map);
+	pks_key_users[pkey] = NULL;
+	/* Restore to default AD=1 and WD=0. */
+	pks_update_protection(pkey, PKEY_DISABLE_ACCESS);
+	mutex_unlock(&pks_lock);
+}
+EXPORT_SYMBOL_GPL(pks_key_free);
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 2955ba976048..e4bff77d7b49 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -50,4 +50,18 @@ static inline void copy_init_pkru_to_fpregs(void)
 
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
+#ifndef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
+static inline int pks_key_alloc(const char * const pkey_user)
+{
+	return -EINVAL;
+}
+static inline void pks_key_free(int pkey)
+{
+}
+static inline void pks_update_protection(int pkey, unsigned long protection)
+{
+	WARN_ON_ONCE(1);
+}
+#endif /* ! CONFIG_ARCH_HAS_SUPERVISOR_PKEYS */
+
 #endif /* _LINUX_PKEYS_H */
-- 
2.28.0.rc0.12.gb6a658bd00c9


  parent reply	other threads:[~2020-07-17  7:21 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-17  7:20 [PATCH RFC V2 00/17] PKS: Add Protection Keys Supervisor (PKS) support ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 01/17] x86/pkeys: Create pkeys_internal.h ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 02/17] x86/fpu: Refactor arch_set_user_pkey_access() for PKS support ira.weiny
2020-07-17  8:54   ` Peter Zijlstra
2020-07-17 20:52     ` Ira Weiny
2020-07-20  9:14       ` Peter Zijlstra
2020-07-17 22:36     ` Dave Hansen
2020-07-20  9:13       ` Peter Zijlstra
2020-07-17  7:20 ` [PATCH RFC V2 03/17] x86/pks: Enable Protection Keys Supervisor (PKS) ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 04/17] x86/pks: Preserve the PKRS MSR on context switch ira.weiny
2020-07-17  8:31   ` Peter Zijlstra
2020-07-17 21:39     ` Ira Weiny
2020-07-17  8:59   ` Peter Zijlstra
2020-07-17 22:34     ` Ira Weiny
2020-07-20  9:15       ` Peter Zijlstra
2020-07-20 18:35         ` Ira Weiny
2020-07-17  7:20 ` ira.weiny [this message]
2020-07-17  7:20 ` [PATCH RFC V2 06/17] x86/pks: Add a debugfs file for allocated PKS keys ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 07/17] Documentation/pkeys: Update documentation for kernel pkeys ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 08/17] x86/pks: Add PKS Test code ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 09/17] memremap: Convert devmap static branch to {inc,dec} ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 10/17] fs/dax: Remove unused size parameter ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 11/17] drivers/dax: Expand lock scope to cover the use of addresses ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 12/17] memremap: Add zone device access protection ira.weiny
2020-07-17  9:10   ` Peter Zijlstra
2020-07-18  5:06     ` Ira Weiny
2020-07-20  9:16       ` Peter Zijlstra
2020-07-17  9:17   ` Peter Zijlstra
2020-07-18  5:51     ` Ira Weiny
2020-07-17  9:20   ` Peter Zijlstra
2020-07-17  7:20 ` [PATCH RFC V2 13/17] kmap: Add stray write protection for device pages ira.weiny
2020-07-17  9:21   ` Peter Zijlstra
2020-07-19  4:13     ` Ira Weiny
2020-07-20  9:17       ` Peter Zijlstra
2020-07-21 16:31         ` Ira Weiny
2020-07-17  7:20 ` [PATCH RFC V2 14/17] dax: Stray write protection for dax_direct_access() ira.weiny
2020-07-17  9:22   ` Peter Zijlstra
2020-07-19  4:41     ` Ira Weiny
2020-07-17  7:20 ` [PATCH RFC V2 15/17] nvdimm/pmem: Stray write protection for pmem->virt_addr ira.weiny
2020-07-17  7:20 ` [PATCH RFC V2 16/17] [dax|pmem]: Enable stray write protection ira.weiny
2020-07-17  9:25   ` Peter Zijlstra
2020-07-17  7:20 ` [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions ira.weiny
2020-07-17  9:30   ` Peter Zijlstra
2020-07-21 18:01     ` Ira Weiny
2020-07-21 19:11       ` Peter Zijlstra
2020-07-17  9:34   ` Peter Zijlstra
2020-07-17 10:06   ` Peter Zijlstra
2020-07-22  5:27     ` Ira Weiny
2020-07-22  9:48       ` Peter Zijlstra
2020-07-22 21:24         ` Ira Weiny
2020-07-23 20:08       ` Thomas Gleixner
2020-07-23 20:15         ` Thomas Gleixner
2020-07-24 17:23           ` Ira Weiny
2020-07-24 17:29             ` Andy Lutomirski
2020-07-24 19:43               ` Ira Weiny
2020-07-22 16:21   ` Andy Lutomirski
2020-07-23 16:18     ` Fenghua Yu
2020-07-23 16:23       ` Dave Hansen
2020-07-23 16:52         ` Fenghua Yu
2020-07-23 17:08           ` Andy Lutomirski
2020-07-23 17:30             ` Dave Hansen
2020-07-23 20:23               ` Thomas Gleixner
2020-07-23 20:22             ` Thomas Gleixner
2020-07-23 21:30               ` Andy Lutomirski
2020-07-23 22:14                 ` Thomas Gleixner
2020-07-23 19:53   ` Thomas Gleixner
2020-07-23 22:04     ` Ira Weiny
2020-07-23 23:41       ` Thomas Gleixner
2020-07-24 21:24         ` Thomas Gleixner
2020-07-24 21:31           ` Thomas Gleixner
2020-07-25  0:09           ` Andy Lutomirski
2020-07-27 20:59           ` Ira Weiny
2020-07-24 22:19 ` [PATCH RFC V2 00/17] PKS: Add Protection Keys Supervisor (PKS) support Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200717072056.73134-6-ira.weiny@intel.com \
    --to=ira.weiny@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vishal.l.verma@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).