From: ira.weiny@intel.com
To: Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH V8 29/44] mm/pkeys: Introduce PKS fault callbacks
Date: Thu, 27 Jan 2022 09:54:50 -0800
Message-ID: <20220127175505.851391-30-ira.weiny@intel.com>
In-Reply-To: <20220127175505.851391-1-ira.weiny@intel.com>

From: Rick Edgecombe <rick.p.edgecombe@intel.com>

Some PKS keys will want special handling for accesses that violate the
Pkey permissions.  One of these is PMEM, which will want a mode that
logs the access violation, disables protection, and continues rather
than oops'ing the machine.

Provide an API to set callbacks for individual Pkeys.  Call these
through pks_handle_key_fault(), which is called from the fault handler.

Since PKS faults do not provide the key that faulted, this information
needs to be recovered by walking the page tables and extracting it from
the leaf entry.  The key can then be used to call the specific
user-defined callback.

This infrastructure could be used to implement the PKS testing code.
Unfortunately, doing so would limit the ability to test this
infrastructure itself and would restrict the testing code to a single
Pkey.  Because pks_test_callback() has zero overhead when
CONFIG_PKS_TEST is not set, it is left as a separate hook in the fault
handler.

Add documentation.

Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

---
Changes for V8:
	Add pt_regs to the callback signature so that
		pks_update_exception() can be called if needed.
	Update commit message
	Determine if page is large prior to not present
	Update commit message with more clarity as to why this was kept
		separate from pks_abandon_protections() and
		pks_test_callback()
	Embed documentation in c file.
	Move handle_pks_key_fault() to pkeys.c
		s/handle_pks_key_fault/pks_handle_key_fault/
		This consolidates the PKS code nicely
	Add feature check to pks_handle_key_fault()
	From Rick Edgecombe
		Fix key value check
	From kernel test robot
		Add static to handle_pks_key_fault

Changes for V7:
	New patch
---
 Documentation/core-api/protection-keys.rst |  9 ++-
 arch/x86/include/asm/pks.h                 |  9 +++
 arch/x86/mm/fault.c                        |  3 +
 arch/x86/mm/pkeys.c                        | 86 ++++++++++++++++++++++
 include/linux/pkeys.h                      |  3 +
 include/linux/pks-keys.h                   |  2 +
 6 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index b89308bf117e..267efa2112e7 100644
--- a/Documentation/core-api/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
@@ -115,7 +115,8 @@ Overview
 
 Similar to user space pkeys, supervisor pkeys allow additional protections to
 be defined for a supervisor mappings.  Unlike user space pkeys, violations of
-these protections result in a kernel oops.
+these protections result in a kernel oops unless a PKS fault handler is
+provided which handles the fault.
 
 Supervisor Memory Protection Keys (PKS) is a feature which is found on Intel's
 Sapphire Rapids (and later) "Scalable Processor" Server CPUs.  It will also be
@@ -150,6 +151,12 @@ Changing permissions of individual keys
 .. kernel-doc:: arch/x86/mm/pkeys.c
         :identifiers: pks_update_exception
 
+Overriding Default Fault Behavior
+---------------------------------
+
+.. kernel-doc:: arch/x86/mm/pkeys.c
+        :doc: DEFINE_PKS_FAULT_CALLBACK
+
 MSR details
 -----------
 
diff --git a/arch/x86/include/asm/pks.h b/arch/x86/include/asm/pks.h
index 065386c8bf37..55541bb64d08 100644
--- a/arch/x86/include/asm/pks.h
+++ b/arch/x86/include/asm/pks.h
@@ -9,6 +9,8 @@ void pks_write_current(void);
 void pks_save_pt_regs(struct pt_regs *regs);
 void pks_restore_pt_regs(struct pt_regs *regs);
 void pks_dump_fault_info(struct pt_regs *regs);
+bool pks_handle_key_fault(struct pt_regs *regs, unsigned long hw_error_code,
+			  unsigned long address);
 
 #else /* !CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */
 
@@ -18,6 +20,13 @@ static inline void pks_save_pt_regs(struct pt_regs *regs) { }
 static inline void pks_restore_pt_regs(struct pt_regs *regs) { }
 static inline void pks_dump_fault_info(struct pt_regs *regs) { }
 
+static inline bool pks_handle_key_fault(struct pt_regs *regs,
+					unsigned long hw_error_code,
+					unsigned long address)
+{
+	return false;
+}
+
 #endif /* CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */
 
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 697c06f08103..e378573d97a7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1162,6 +1162,9 @@ do_kern_addr_fault(struct pt_regs *regs, unsigned long hw_error_code,
 		 */
 		WARN_ON_ONCE(!cpu_feature_enabled(X86_FEATURE_PKS));
 
+		if (pks_handle_key_fault(regs, hw_error_code, address))
+			return;
+
 		/*
 		 * If a protection key exception occurs it could be because a PKS test
 		 * is running.  If so, pks_test_callback() will clear the protection
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 6723ae42732a..531cf6c74ad7 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -11,6 +11,7 @@
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
 #include <asm/pks.h>
+#include <asm/trap_pf.h>		/* X86_PF_WRITE */
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -212,6 +213,91 @@ u32 pkey_update_pkval(u32 pkval, int pkey, u32 accessbits)
 
 __static_or_pks_test DEFINE_PER_CPU(u32, pkrs_cache);
 
+/**
+ * DOC: DEFINE_PKS_FAULT_CALLBACK
+ *
+ * Users may also provide a fault handler which can handle a fault differently
+ * than an oops.  For example, if 'MY_FEATURE' wants to define a handler, it
+ * can do so by adding the corresponding entry to the pks_key_callbacks array.
+ *
+ * .. code-block:: c
+ *
+ *	#ifdef CONFIG_MY_FEATURE
+ *	bool my_feature_pks_fault_callback(struct pt_regs *regs,
+ *					   unsigned long address, bool write)
+ *	{
+ *		if (my_feature_fault_is_ok)
+ *			return true;
+ *		return false;
+ *	}
+ *	#endif
+ *
+ *	static const pks_key_callback pks_key_callbacks[PKS_KEY_NR_CONSUMERS] = {
+ *		[PKS_KEY_DEFAULT]            = NULL,
+ *	#ifdef CONFIG_MY_FEATURE
+ *		[PKS_KEY_MY_FEATURE]         = my_feature_pks_fault_callback,
+ *	#endif
+ *	};
+ */
+static const pks_key_callback pks_key_callbacks[PKS_KEY_NR_CONSUMERS] = { 0 };
+
+static bool pks_call_fault_callback(struct pt_regs *regs, unsigned long address,
+				    bool write, u16 key)
+{
+	if (key >= PKS_KEY_NR_CONSUMERS)
+		return false;
+
+	if (pks_key_callbacks[key])
+		return pks_key_callbacks[key](regs, address, write);
+
+	return false;
+}
+
+bool pks_handle_key_fault(struct pt_regs *regs, unsigned long hw_error_code,
+			  unsigned long address)
+{
+	bool write;
+	pgd_t pgd;
+	p4d_t p4d;
+	pud_t pud;
+	pmd_t pmd;
+	pte_t pte;
+
+	if (!cpu_feature_enabled(X86_FEATURE_PKS))
+		return false;
+
+	write = (hw_error_code & X86_PF_WRITE);
+
+	pgd = READ_ONCE(*(init_mm.pgd + pgd_index(address)));
+	if (!pgd_present(pgd))
+		return false;
+
+	p4d = READ_ONCE(*p4d_offset(&pgd, address));
+	if (p4d_large(p4d))
+		return pks_call_fault_callback(regs, address, write,
+					       pte_flags_pkey(p4d_val(p4d)));
+	if (!p4d_present(p4d))
+		return false;
+
+	pud = READ_ONCE(*pud_offset(&p4d, address));
+	if (pud_large(pud))
+		return pks_call_fault_callback(regs, address, write,
+					       pte_flags_pkey(pud_val(pud)));
+	if (!pud_present(pud))
+		return false;
+
+	pmd = READ_ONCE(*pmd_offset(&pud, address));
+	if (pmd_large(pmd))
+		return pks_call_fault_callback(regs, address, write,
+					       pte_flags_pkey(pmd_val(pmd)));
+	if (!pmd_present(pmd))
+		return false;
+
+	pte = READ_ONCE(*pte_offset_kernel(&pmd, address));
+	return pks_call_fault_callback(regs, address, write,
+				       pte_flags_pkey(pte_val(pte)));
+}
+
 /*
  * pks_write_pkrs() - Write the pkrs of the current CPU
  * @new_pkrs: New value to write to the current CPU register
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index c318d97f5da8..a53e4f2c41af 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -82,6 +82,9 @@ static inline void pks_mk_readwrite(int pkey)
 	pks_update_protection(pkey, PKEY_READ_WRITE);
 }
 
+typedef bool (*pks_key_callback)(struct pt_regs *regs, unsigned long address,
+				 bool write);
+
 #else /* !CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */
 
 static inline void pks_mk_noaccess(int pkey) {}
diff --git a/include/linux/pks-keys.h b/include/linux/pks-keys.h
index 69a0be979515..a3fcd8df8688 100644
--- a/include/linux/pks-keys.h
+++ b/include/linux/pks-keys.h
@@ -27,6 +27,7 @@
  *	{
  *		PKS_KEY_DEFAULT         = 0,
  *		PKS_KEY_MY_FEATURE      = 1,
+ *		PKS_KEY_NR_CONSUMERS    = 2,
  *	}
  *
  *	#define PKS_INIT_VALUE (PKR_RW_KEY(PKS_KEY_DEFAULT)		|
@@ -43,6 +44,7 @@
 enum pks_pkey_consumers {
 	PKS_KEY_DEFAULT		= 0, /* Must be 0 for default PTE values */
 	PKS_KEY_TEST		= 1,
+	PKS_KEY_NR_CONSUMERS	= 2,
 };
 
 #define PKS_INIT_VALUE (PKR_RW_KEY(PKS_KEY_DEFAULT)		| \
-- 
2.31.1

