All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ira Weiny <ira.weiny@intel.com>
To: Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH V9 00/45] PKS/PMEM: Add Stray Write Protection
Date: Thu, 31 Mar 2022 10:13:07 -0700	[thread overview]
Message-ID: <YkXhIxfqjVBRTyjS@iweiny-desk3> (raw)
In-Reply-To: <20220310172019.850939-1-ira.weiny@intel.com>

On Thu, Mar 10, 2022 at 09:19:34AM -0800, Ira wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> 
> I'm looking for Intel acks on the series prior to submitting to maintainers.
> Most of the changes from V8 to V9 was in getting the tests straightened out.
> But there are some improvements in the actual code.

Is there any feedback on this?

Ira

> 
> 
> Changes for V9
> 
> Review and update all commit messages.
> Update cover letter below
> 
> PKS Core
> 	Separate user and supervisor pkey code in the headers
> 		create linux/pks.h for supervisor calls
> 		This facilitated making the pmem code more efficient 
> 	Completely rearchitect the test code
> 		[After Dave Hansen and Rick Edgecombe found issues in the test
> 			code it was easier to rearchitect the code completely
> 			rather than attempt to fix it.]
> 		Remove pks_test_callback in favor of using fault hooks
> 			Fault hooks also isolate the fault callbacks from being
> 			false positives if non-test consumers are running
> 		Make additional PKS_TEST_RUN_ALL Kconfig option which is
> 			mutually exclusive to any non-test PKS consumer
> 			PKS_TEST_RUN_ALL takes over all pkey callbacks
> 		Ensure that each test runs within it's own context and is
> 			mutually exclusive from running while any other test is
> 			running.
> 		Ensure test session and context memory is cleaned up on file
> 			close
> 		Use pr_debug() and dynamic debug for in kernel debug messages
> 		Enhance test_pks selftest
> 			Add the ability to run all tests not just the context
> 				switch test
> 			Standardize output [PASS][FAIL][SKIP]
> 			Add '-d' option enables dynamic debug to see the kernel
> 				debug messages
> 
> 	Incorporate feedback from Rick Edgecombe
> 		Update all pkey types to u8
> 		Fix up test code barriers
> 	Move patch declaring PKS_INIT_VALUE ahead of the patch which enables
> 		PKS so that PKS_INIT_VALUE can be used when pks_setup() is
> 		first created
> 	From Dan Williams
> 		Use macros instead of an enum for a pkey allocation scheme
> 			which is predicated on the config options of consumers
> 			This almost worked perfectly.  It required a bit of
> 			tweeking to be able to allocate all of the keys.
> 
> 	From Dave Hansen
> 		Reposition some code to be near/similar to user pkeys
> 			s/pks_write_current/x86_pkrs_load
> 			s/pks_saved_pkrs/pkrs
> 		Update Documentation
> 		s/PKR_{RW,AD,WD}_KEY/PKR_{RW,AD,WD}_MASK
> 		Consistently use lower case for pkey
> 		Update commit messages
> 		Add Acks
> 
> PMEM Stray Write
> 	Building on the change to the pks_mk_*() function rename
> 		s/pgmap_mk_*/pgmap_set_*/
> 		s/dax_mk_*/dax_set_*/
> 	From Dan Williams
> 		Avoid adding new dax operations by teaching dax_device about pgmap
> 		Remove pgmap_protection_flag_invalid() patch (Just let
> 			kmap'ings fail)
> 
> 
> PKS/PMEM Stray write protection
> ===============================
> 
> This series is broken into 2 parts.
> 
> 	1) Introduce Protection Key Supervisor (PKS), testing, and
> 	   documentation
> 	2) Use PKS to protect PMEM from stray writes
> 
> Introduce Protection Key Supervisor (PKS)
> -----------------------------------------
> 
> PKS enables protections on 'domains' of supervisor pages to limit supervisor
> mode access to pages beyond the normal paging protections.  PKS works in a
> similar fashion to user space pkeys, PKU.  As with PKU, supervisor pkeys are
> checked in addition to normal paging protections.  And page mappings are
> assigned to a domain by setting a 4 bit pkey in the PTE of that mapping.
> 
> Unlike PKU, permissions are changed via a MSR update.  This update avoids TLB
> flushes making this an efficient way to alter protections vs PTE updates.
> 
> Also, unlike PTE updates PKS permission changes apply only to the current
> processor.  Therefore changing permissions apply only to that thread and not
> any other cpu/process.  This allows protections to remain in place on other
> cpus for additional protection and isolation.
> 
> Even though PKS updates are thread local, XSAVE is not supported for the PKRS
> MSR.  Therefore this implementation saves and restores the MSR across context
> switches and during exceptions within software.  Nested exceptions are
> supported by each exception getting a new PKS state.
> 
> For consistent behavior with current paging protections, pkey 0 is reserved and
> configured to allow full access via the pkey mechanism, thus preserving the
> default paging protections because PTEs naturally have a pkey value of 0.
> 
> Other keys, (1-15) are statically allocated by kernel consumers when
> configured.  This is done by adding the appropriate PKS_NEW_KEY and
> PKS_DECLARE_INIT_VALUE macros to pks-keys.h.
> 
> Two PKS consumers, PKS_TEST and PMEM stray write protection, are included in
> this series.  When the number of users grows larger the sharing of keys will
> need to be resolved depending on the needs of the users at that time.  Many
> methods have been contemplated but the number of kernel users and use cases
> envisioned is still quite small, much less than the 15 available keys.
> 
> To summarize, the following are key attributes of PKS.
> 
> 	1) Fast switching of permissions
> 		1a) Prevents access without page table manipulations
> 		1b) No TLB flushes required
> 	2) Works on a per thread basis, thus allowing protections to be
> 	   preserved on threads which are not actively accessing data through
> 	   the mapping.
> 
> PKS is available with 4 and 5 level paging.  For this and simplicity of
> implementation, the feature is restricted to x86_64.
> 
> 
> Use PKS to protect PMEM from stray writes
> -----------------------------------------
> 
> DAX leverages the direct-map to enable 'struct page' services for PMEM.  Given
> that PMEM capacity may be an order of magnitude higher capacity than System RAM
> it presents a large vulnerability surface to stray writes.  Such a stray write
> becomes a silent data corruption bug.
> 
> Stray pointers to System RAM may result in a crash or other undesirable
> behavior which, while unfortunate, are usually recoverable with a reboot.
> Stray writes to PMEM are permanent in nature and thus are more likely to result
> in permanent user data loss.  Given that PMEM access from the kernel is limited
> to a constrained set of locations (PMEM driver, Filesystem-DAX, direct-I/O, and
> any properly kmap'ed page), it is amenable to PKS protection.
> 
> Set up an infrastructure for extra device access protection. Then implement the
> protection using the new Protection Keys Supervisor (PKS) on architectures
> which support it.
> 
> Because PMEM pages are all associated with a struct dev_pagemap and flags in
> struct page are valuable the flag of protecting memory can be stored in struct
> dev_pagemap.  All PMEM is protected by the same pkey.  So a single flag is all
> that is needed in each dev_pagemap to indicate protection.
> 
> General access in the kernel is supported by modifying the kmap infrastructure
> which can detect if a page is pks protected and enable access until the
> corresponding unmap is called.
> 
> Because PKS is a thread local mechanism and because kmap was never really
> intended to create a long term mapping, this implementation does not support
> the kmap()/kunmap() calls.  Calling kmap() on a PMEM protected page is allowed
> but accessing that mapping will cause a fault.
> 
> Originally this series modified many of the kmap call sites to indicate they
> were thread local.[1]  And an attempt to support kmap()[2] was made.  But now
> that kmap_local_page() has been developed[3] and in more wide spread use,
> kmap() can safely be left unsupported.
> 
> How the fault is handled is configurable via a new module parameter
> memremap.pks_fault_mode.  Two modes are supported.
> 
> 	'relaxed' (default) -- WARN_ONCE, disable the protection and allow
> 	                       access
> 
> 	'strict' -- prevent any unguarded access to a protected dev_pagemap
> 		    range
> 
> This 'safety valve' feature has already been useful in the development of this
> feature.
> 
> 
> [1] https://lore.kernel.org/lkml/20201009195033.3208459-1-ira.weiny@intel.com/
> 
> [2] https://lore.kernel.org/lkml/87mtycqcjf.fsf@nanos.tec.linutronix.de/
> 
> [3] https://lore.kernel.org/lkml/20210128061503.1496847-1-ira.weiny@intel.com/
>     https://lore.kernel.org/lkml/20210210062221.3023586-1-ira.weiny@intel.com/
>     https://lore.kernel.org/lkml/20210205170030.856723-1-ira.weiny@intel.com/
>     https://lore.kernel.org/lkml/20210217024826.3466046-1-ira.weiny@intel.com/
> 
> 
> ----------------------------------------------------------------------------
> Changes for V8
> 
> Feedback from Thomas
> 	* clean up noinstr mess
> 	* Fix static PKEY allocation mess
> 	* Ensure all functions are consistently named.
> 	* Split up patches to do 1 thing per patch
> 	* pkey_update_pkval() implementation
> 	* Streamline the use of pks_write_pkrs() by not disabling preemption
> 		- Leave this to the callers who require it.
> 		- Use documentation and lockdep to prevent errors
> 	* Clean up commit messages to explain in detail _why_ each patch is
> 		there.
> 
> Feedback from Dave H.
> 	* Leave out pks_mk_readonly() as it is not used by the PMEM use case
> 
> Feedback from Peter Anvin
> 	* Replace pks_abandon_pkey() with pks_update_exception()
> 		This is an even greater simplification in that it no longer
> 		attempts to shield users from faults.  As the main use case for
> 		abandoning a key was to allow a system to continue running even
> 		with an error.  This should be a rare event so the performance
> 		should not be an issue.
> 
> * Simplify ARCH_ENABLE_SUPERVISOR_PKEYS
> 
> * Update PKS Test code
> 	- Add default value test
> 	- Split up the test code into patches which follow each feature
> 	  addition
> 	- simplify test code processing
> 	- ensure consistent reporting of errors.
> 
> * Ensure all entry points to the PKS code are protected by
> 	cpu_feature_enabled(X86_FEATURE_PKS)
> 	- At the same time make sure non-entry points or sub-functions to the
> 	  PKS code are not _unnecessarily_ protected by the feature check
> 
> * Update documentation
> 	- Use kernel docs to place the docs with the code for easier internal
> 	  developer use
> 
> * Adjust the PMEM use cases for the core changes
> 
> * Split the PMEM patches up to be 1 change per patch and help clarify review
> 
> * Review all header files and remove those no longer needed
> 
> * Review/update/clarify all commit messages
> 
> Fenghua Yu (1):
> mm/pkeys: Define PKS page table macros
> 
> Ira Weiny (43):
> entry: Create an internal irqentry_exit_cond_resched() call
> Documentation/protection-keys: Clean up documentation for User Space
> pkeys
> x86/pkeys: Clarify PKRU_AD_KEY macro
> x86/pkeys: Make PKRU macros generic
> x86/fpu: Refactor arch_set_user_pkey_access()
> mm/pkeys: Add Kconfig options for PKS
> x86/pkeys: Add PKS CPU feature bit
> x86/fault: Adjust WARN_ON for pkey fault
> Documentation/pkeys: Add initial PKS documentation
> mm/pkeys: Provide for PKS key allocation
> x86/pkeys: Enable PKS on cpus which support it
> mm/pkeys: PKS testing, add initial test code
> x86/selftests: Add test_pks
> x86/pkeys: Introduce pks_write_pkrs()
> x86/pkeys: Preserve the PKS MSR on context switch
> mm/pkeys: Introduce pks_set_readwrite()
> mm/pkeys: Introduce pks_set_noaccess()
> mm/pkeys: PKS testing, add a fault call back
> mm/pkeys: PKS testing, add pks_set_*() tests
> mm/pkeys: PKS testing, test context switching
> x86/entry: Add auxiliary pt_regs space
> entry: Split up irqentry_exit_cond_resched()
> entry: Add calls for save/restore auxiliary pt_regs
> x86/entry: Define arch_{save|restore}_auxiliary_pt_regs()
> x86/pkeys: Preserve PKRS MSR across exceptions
> x86/fault: Print PKS MSR on fault
> mm/pkeys: PKS testing, Add exception test
> mm/pkeys: Introduce pks_update_exception()
> mm/pkeys: PKS testing, test pks_update_exception()
> mm/pkeys: PKS testing, add test for all keys
> mm/pkeys: Add pks_available()
> memremap_pages: Add Kconfig for DEVMAP_ACCESS_PROTECTION
> memremap_pages: Introduce pgmap_protection_available()
> memremap_pages: Introduce a PGMAP_PROTECTION flag
> memremap_pages: Introduce devmap_protected()
> memremap_pages: Reserve a PKS pkey for eventual use by PMEM
> memremap_pages: Set PKS pkey in PTEs if requested
> memremap_pages: Define pgmap_set_{readwrite|noaccess}() calls
> memremap_pages: Add memremap.pks_fault_mode
> kmap: Make kmap work for devmap protected pages
> dax: Stray access protection for dax_direct_access()
> nvdimm/pmem: Enable stray access protection
> devdax: Enable stray access protection
> 
> Rick Edgecombe (1):
> mm/pkeys: Introduce PKS fault callbacks
> 
> .../admin-guide/kernel-parameters.txt | 12 +
> Documentation/core-api/protection-keys.rst | 130 ++-
> arch/x86/Kconfig | 6 +
> arch/x86/entry/calling.h | 20 +
> arch/x86/entry/common.c | 2 +-
> arch/x86/entry/entry_64.S | 22 +
> arch/x86/entry/entry_64_compat.S | 6 +
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/disabled-features.h | 8 +-
> arch/x86/include/asm/entry-common.h | 15 +
> arch/x86/include/asm/msr-index.h | 1 +
> arch/x86/include/asm/pgtable_types.h | 22 +
> arch/x86/include/asm/pkeys.h | 2 +
> arch/x86/include/asm/pkeys_common.h | 18 +
> arch/x86/include/asm/pkru.h | 20 +-
> arch/x86/include/asm/pks.h | 46 ++
> arch/x86/include/asm/processor.h | 15 +-
> arch/x86/include/asm/ptrace.h | 21 +
> arch/x86/include/uapi/asm/processor-flags.h | 2 +
> arch/x86/kernel/asm-offsets_64.c | 15 +
> arch/x86/kernel/cpu/common.c | 2 +
> arch/x86/kernel/dumpstack.c | 32 +-
> arch/x86/kernel/fpu/xstate.c | 22 +-
> arch/x86/kernel/head_64.S | 6 +
> arch/x86/kernel/process_64.c | 3 +
> arch/x86/mm/fault.c | 17 +-
> arch/x86/mm/pkeys.c | 320 +++++++-
> drivers/dax/device.c | 2 +
> drivers/dax/super.c | 59 ++
> drivers/md/dm-writecache.c | 8 +-
> drivers/nvdimm/pmem.c | 26 +
> fs/dax.c | 8 +
> fs/fuse/virtio_fs.c | 2 +
> include/linux/dax.h | 5 +
> include/linux/entry-common.h | 15 +-
> include/linux/highmem-internal.h | 4 +
> include/linux/memremap.h | 1 +
> include/linux/mm.h | 72 ++
> include/linux/pgtable.h | 4 +
> include/linux/pks-keys.h | 92 +++
> include/linux/pks.h | 73 ++
> include/linux/sched.h | 7 +
> include/uapi/asm-generic/mman-common.h | 1 +
> init/init_task.c | 3 +
> kernel/entry/common.c | 44 +-
> kernel/sched/core.c | 40 +-
> lib/Kconfig.debug | 33 +
> lib/Makefile | 3 +
> lib/pks/Makefile | 3 +
> lib/pks/pks_test.c | 755 ++++++++++++++++++
> mm/Kconfig | 32 +
> mm/memremap.c | 132 +++
> tools/testing/selftests/x86/Makefile | 2 +-
> tools/testing/selftests/x86/test_pks.c | 514 ++++++++++++
> 54 files changed, 2617 insertions(+), 109 deletions(-)
> create mode 100644 arch/x86/include/asm/pkeys_common.h
> create mode 100644 arch/x86/include/asm/pks.h
> create mode 100644 include/linux/pks-keys.h
> create mode 100644 include/linux/pks.h
> create mode 100644 lib/pks/Makefile
> create mode 100644 lib/pks/pks_test.c
> create mode 100644 tools/testing/selftests/x86/test_pks.c
> 
> --
> 2.35.1
> 

      parent reply	other threads:[~2022-03-31 17:13 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-10 17:19 [PATCH V9 00/45] PKS/PMEM: Add Stray Write Protection ira.weiny
2022-03-10 17:19 ` [PATCH V9 01/45] entry: Create an internal irqentry_exit_cond_resched() call ira.weiny
2022-04-07  2:48   ` Ira Weiny
2022-03-10 17:19 ` [PATCH V9 02/45] Documentation/protection-keys: Clean up documentation for User Space pkeys ira.weiny
2022-03-10 17:19 ` [PATCH V9 03/45] x86/pkeys: Clarify PKRU_AD_KEY macro ira.weiny
2022-03-10 17:19 ` [PATCH V9 04/45] x86/pkeys: Make PKRU macros generic ira.weiny
2022-03-10 17:19 ` [PATCH V9 05/45] x86/fpu: Refactor arch_set_user_pkey_access() ira.weiny
2022-03-10 17:19 ` [PATCH V9 06/45] mm/pkeys: Add Kconfig options for PKS ira.weiny
2022-03-10 17:19 ` [PATCH V9 07/45] x86/pkeys: Add PKS CPU feature bit ira.weiny
2022-03-10 17:19 ` [PATCH V9 08/45] x86/fault: Adjust WARN_ON for pkey fault ira.weiny
2022-03-10 17:19 ` [PATCH V9 09/45] Documentation/pkeys: Add initial PKS documentation ira.weiny
2022-03-10 17:19 ` [PATCH V9 10/45] mm/pkeys: Provide for PKS key allocation ira.weiny
2022-03-10 17:19 ` [PATCH V9 11/45] x86/pkeys: Enable PKS on cpus which support it ira.weiny
2022-03-10 17:19 ` [PATCH V9 12/45] mm/pkeys: Define PKS page table macros ira.weiny
2022-03-10 17:19 ` [PATCH V9 13/45] mm/pkeys: PKS testing, add initial test code ira.weiny
2022-03-10 17:19 ` [PATCH V9 14/45] x86/selftests: Add test_pks ira.weiny
2022-03-10 17:19 ` [PATCH V9 15/45] x86/pkeys: Introduce pks_write_pkrs() ira.weiny
2022-03-10 17:19 ` [PATCH V9 16/45] x86/pkeys: Preserve the PKS MSR on context switch ira.weiny
2022-03-10 17:19 ` [PATCH V9 17/45] mm/pkeys: Introduce pks_set_readwrite() ira.weiny
2022-03-10 17:19 ` [PATCH V9 18/45] mm/pkeys: Introduce pks_set_noaccess() ira.weiny
2022-03-10 17:19 ` [PATCH V9 19/45] mm/pkeys: Introduce PKS fault callbacks ira.weiny
2022-03-10 17:19 ` [PATCH V9 20/45] mm/pkeys: PKS testing, add a fault call back ira.weiny
2022-03-10 17:19 ` [PATCH V9 21/45] mm/pkeys: PKS testing, add pks_set_*() tests ira.weiny
2022-03-10 17:19 ` [PATCH V9 22/45] mm/pkeys: PKS testing, test context switching ira.weiny
2022-03-10 17:19 ` [PATCH V9 23/45] x86/entry: Add auxiliary pt_regs space ira.weiny
2022-03-10 17:19 ` [PATCH V9 24/45] entry: Split up irqentry_exit_cond_resched() ira.weiny
2022-04-07  2:50   ` Ira Weiny
2022-03-10 17:19 ` [PATCH V9 25/45] entry: Add calls for save/restore auxiliary pt_regs ira.weiny
2022-03-10 17:20 ` [PATCH V9 26/45] x86/entry: Define arch_{save|restore}_auxiliary_pt_regs() ira.weiny
2022-03-10 17:20 ` [PATCH V9 27/45] x86/pkeys: Preserve PKRS MSR across exceptions ira.weiny
2022-03-10 17:20 ` [PATCH V9 28/45] x86/fault: Print PKS MSR on fault ira.weiny
2022-03-10 17:20 ` [PATCH V9 29/45] mm/pkeys: PKS testing, Add exception test ira.weiny
2022-03-10 17:20 ` [PATCH V9 30/45] mm/pkeys: Introduce pks_update_exception() ira.weiny
2022-03-10 17:20 ` [PATCH V9 31/45] mm/pkeys: PKS testing, test pks_update_exception() ira.weiny
2022-03-10 17:20 ` [PATCH V9 32/45] mm/pkeys: PKS testing, add test for all keys ira.weiny
2022-03-10 17:20 ` [PATCH V9 33/45] mm/pkeys: Add pks_available() ira.weiny
2022-03-10 17:20 ` [PATCH V9 34/45] memremap_pages: Add Kconfig for DEVMAP_ACCESS_PROTECTION ira.weiny
2022-03-10 17:20 ` [PATCH V9 35/45] memremap_pages: Introduce pgmap_protection_available() ira.weiny
2022-03-10 17:20 ` [PATCH V9 36/45] memremap_pages: Introduce a PGMAP_PROTECTION flag ira.weiny
2022-03-10 17:20 ` [PATCH V9 37/45] memremap_pages: Introduce devmap_protected() ira.weiny
2022-03-10 17:20 ` [PATCH V9 38/45] memremap_pages: Reserve a PKS pkey for eventual use by PMEM ira.weiny
2022-03-10 17:20 ` [PATCH V9 39/45] memremap_pages: Set PKS pkey in PTEs if requested ira.weiny
2022-03-10 17:20 ` [PATCH V9 40/45] memremap_pages: Define pgmap_set_{readwrite|noaccess}() calls ira.weiny
2022-03-10 17:20 ` [PATCH V9 41/45] memremap_pages: Add memremap.pks_fault_mode ira.weiny
2022-03-10 17:20 ` [PATCH V9 42/45] kmap: Make kmap work for devmap protected pages ira.weiny
2022-03-10 17:20 ` [PATCH V9 43/45] dax: Stray access protection for dax_direct_access() ira.weiny
2022-03-10 17:20 ` [PATCH V9 44/45] nvdimm/pmem: Enable stray access protection ira.weiny
2022-03-10 17:20 ` [PATCH V9 45/45] devdax: " ira.weiny
2022-03-31 17:13 ` Ira Weiny [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YkXhIxfqjVBRTyjS@iweiny-desk3 \
    --to=ira.weiny@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=rick.p.edgecombe@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.