From: Randy Dunlap <rdunlap@infradead.org>
To: Igor Stoppa <igor.stoppa@gmail.com>,
Mimi Zohar <zohar@linux.vnet.ibm.com>,
Kees Cook <keescook@chromium.org>,
Matthew Wilcox <willy@infradead.org>,
Dave Chinner <david@fromorbit.com>,
James Morris <jmorris@namei.org>,
Michal Hocko <mhocko@kernel.org>,
kernel-hardening@lists.openwall.com,
linux-integrity@vger.kernel.org,
linux-security-module@vger.kernel.org
Cc: igor.stoppa@huawei.com, Dave Hansen <dave.hansen@linux.intel.com>,
Jonathan Corbet <corbet@lwn.net>,
Laura Abbott <labbott@redhat.com>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 10/17] prmem: documentation
Date: Tue, 23 Oct 2018 20:48:33 -0700
Message-ID: <b09b3feb-8e3d-5fab-4e2b-35e8c252b27c@infradead.org>
In-Reply-To: <20181023213504.28905-11-igor.stoppa@huawei.com>
Hi,
On 10/23/18 2:34 PM, Igor Stoppa wrote:
> Documentation for protected memory.
>
> Topics covered:
> * static memory allocation
> * dynamic memory allocation
> * write-rare
>
> Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com>
> CC: Jonathan Corbet <corbet@lwn.net>
> CC: Randy Dunlap <rdunlap@infradead.org>
> CC: Mike Rapoport <rppt@linux.vnet.ibm.com>
> CC: linux-doc@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> ---
> Documentation/core-api/index.rst | 1 +
> Documentation/core-api/prmem.rst | 172 +++++++++++++++++++++++++++++++
> MAINTAINERS | 1 +
> 3 files changed, 174 insertions(+)
> create mode 100644 Documentation/core-api/prmem.rst
> diff --git a/Documentation/core-api/prmem.rst b/Documentation/core-api/prmem.rst
> new file mode 100644
> index 000000000000..16d7edfe327a
> --- /dev/null
> +++ b/Documentation/core-api/prmem.rst
> @@ -0,0 +1,172 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. _prmem:
> +
> +Memory Protection
> +=================
> +
> +:Date: October 2018
> +:Author: Igor Stoppa <igor.stoppa@huawei.com>
> +
> +Foreword
> +--------
> +- In a typical system using some sort of RAM as execution environment,
> + **all** memory is initially writable.
> +
> +- It must be initialized with the appropriate content, be it code or data.
> +
> +- Said content typically undergoes modifications, i.e. relocations or
> + relocation-induced changes.
> +
> +- The present document doesn't address such transient.
s/transient/transience/
> +
> +- Kernel code is protected at system level and, unlike data, it doesn't
> + require special attention.
> +
> +Protection mechanism
> +--------------------
> +
> +- When available, the MMU can write protect memory pages that would be
> + otherwise writable.
> +
> +- The protection has page-level granularity.
> +
> +- An attempt to overwrite a protected page will trigger an exception.
> +- **Write protected data must go exclusively to write protected pages**
> +- **Writable data must go exclusively to writable pages**
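For readers who want to see the page-level behaviour described above, here
is a minimal userspace sketch. It is only an analogy: it uses mmap() and
mprotect() plus a signal handler, not anything from the prmem patches, and
the function names (write_faults, page_protection_demo) are made up for
illustration.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

/* Jump back to the probe when the write below raises a fault. */
static void on_fault(int sig)
{
	(void)sig;
	siglongjmp(fault_env, 1);
}

/* Try to write one byte at p. Return 1 if the write faulted, 0 if not. */
static int write_faults(volatile char *p)
{
	struct sigaction sa, old_segv, old_bus;
	int faulted;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_fault;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);	/* some systems report SIGBUS */

	if (sigsetjmp(fault_env, 1) == 0) {
		*p = 'x';
		faulted = 0;
	} else {
		faulted = 1;
	}

	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	return faulted;
}

/*
 * Map one page, write to it, write protect it, then write again.
 * Returns 1 if the first write succeeded and the second one faulted,
 * 0 if not, and -1 if the page could not be set up.
 */
int page_protection_demo(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	char *page;
	int before, after;

	page = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
		return -1;

	before = write_faults(page);
	if (mprotect(page, len, PROT_READ) != 0) {
		munmap(page, len);
		return -1;
	}
	/* Protection covers the whole page, so probe its last byte. */
	after = write_faults(page + len - 1);

	munmap(page, len);
	return before == 0 && after == 1;
}
```

The second probe targets a different byte than the first on purpose: the
protection applies to the page as a whole, which is why writable and write
protected data cannot share a page.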
> +
> +Available protections for kernel data
> +-------------------------------------
> +
> +- **constant**
> + Labelled as **const**, the data is never supposed to be altered.
> + It is statically allocated - if it has any memory footprint at all.
> + The compiler can even optimize it away, where possible, by replacing
> + references to a **const** with its actual value.
> +
> +- **read only after init**
> + By tagging an otherwise ordinary statically allocated variable with
> + **__ro_after_init**, it is placed in a special segment that will
> + become write protected, at the end of the kernel init phase.
> + The compiler has no notion of this restriction and it will treat any
> + write operation on such variable as legal. However, assignments that
> + are attempted after the write protection is in place, will cause
No comma before "will cause".
> + exceptions.
> +
> +- **write rare after init**
> + This can be seen as variant of read only after init, which uses the
> + tag **__wr_after_init**. It is also limited to statically allocated
> + memory. It is still possible to alter this type of variables, after
> + the kernel init phase is complete, however it can be done exclusively
> + with special functions, instead of the assignment operator. Using the
> + assignment operator after conclusion of the init phase will still
> + trigger an exception. It is not possible to transition a certain
> + variable from __wr_after_init to a permanent read-only status, at
> + runtime.
> +
> +- **dynamically allocated write-rare / read-only**
> + After defining a pool, memory can be obtained through it, primarily
> + through the **pmalloc()** allocator. The exact writability state of the
> + memory obtained from **pmalloc()** and friends can be configured when
> + creating the pool. At any point it is possible to transition to a less
> + permissive write status the memory currently associated to the pool.
> + Once memory has become read-only, the only valid operation, besides
> + reading, is to release it, by destroying the pool it belongs to.
> +
> +
> +Protecting dynamically allocated memory
> +---------------------------------------
> +
> +When dealing with dynamically allocated memory, three options are
> + available for configuring its writability state:
> +
> +- **Options selected when creating a pool**
> + When creating the pool, it is possible to choose one of the following:
> + - **PMALLOC_MODE_RO**
> + - Writability at allocation time: *WRITABLE*
> + - Writability at protection time: *NONE*
> + - **PMALLOC_MODE_WR**
> + - Writability at allocation time: *WRITABLE*
> + - Writability at protection time: *WRITE-RARE*
> + - **PMALLOC_MODE_AUTO_RO**
> + - Writability at allocation time:
> + - the latest allocation: *WRITABLE*
> + - every other allocation: *NONE*
> + - Writability at protection time: *NONE*
> + - **PMALLOC_MODE_AUTO_WR**
> + - Writability at allocation time:
> + - the latest allocation: *WRITABLE*
> + - every other allocation: *WRITE-RARE*
> + - Writability at protection time: *WRITE-RARE*
> + - **PMALLOC_MODE_START_WR**
> + - Writability at allocation time: *WRITE-RARE*
> + - Writability at protection time: *WRITE-RARE*
> +
> + **Remarks:**
> + - The "AUTO" modes perform automatic protection of the content, whenever
> + the current vmap_area is used up and a new one is allocated.
> + - At that point, the vmap_area being phased out is protected.
> + - The size of the vmap_area depends on various parameters.
> + - It might not be possible to know for sure *when* certain data will
> + be protected.
> + - The functionality is provided as tradeoff between hardening and speed.
> + - Its usefulness depends on the specific use case at hand
Please end the sentence above with a period, like all of the others above it.
> + - The "START_WR" mode is the only one which provides immediate protection, at the cost of speed.
Please try to keep the line above and a few below to < 80 characters in length.
(because some of us read rst files as text files, with a text editor, and line
wrap is ugly)
> +
> +- **Protecting the pool**
> + This is achieved with **pmalloc_protect_pool()**
> + - Any vmap_area currently in the pool is write-protected according to its initial configuration.
> + - Any residual space still available from the current vmap_area is lost, as the area is protected.
> + - **protecting a pool after every allocation will likely be very wasteful**
> + - Using PMALLOC_MODE_START_WR is likely a better choice.
> +
> +- **Upgrading the protection level**
> + This is achieved with **pmalloc_make_pool_ro()**
> + - it turns the present content of a write-rare pool into read-only
> + - can be useful when the content of the memory has settled
> +
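To check my reading of the pool modes listed in this section, here is the
same information as a small lookup table in plain C. This is a model only:
the enum and function names are local to the sketch and are not the
PMALLOC_MODE_* definitions or any API from the patch set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Writability states, named after the terms used in the document. */
enum writability { W_NONE, W_WRITE_RARE, W_WRITABLE };

/* Local stand-ins for the five pool modes. */
enum pool_mode {
	MODE_RO,
	MODE_WR,
	MODE_AUTO_RO,
	MODE_AUTO_WR,
	MODE_START_WR,
	MODE_COUNT
};

struct mode_desc {
	enum writability latest_alloc;	/* most recent allocation */
	enum writability older_allocs;	/* every other allocation */
	enum writability after_protect;	/* once the pool is protected */
};

static const struct mode_desc mode_table[MODE_COUNT] = {
	[MODE_RO]       = { W_WRITABLE,   W_WRITABLE,   W_NONE },
	[MODE_WR]       = { W_WRITABLE,   W_WRITABLE,   W_WRITE_RARE },
	[MODE_AUTO_RO]  = { W_WRITABLE,   W_NONE,       W_NONE },
	[MODE_AUTO_WR]  = { W_WRITABLE,   W_WRITE_RARE, W_WRITE_RARE },
	[MODE_START_WR] = { W_WRITE_RARE, W_WRITE_RARE, W_WRITE_RARE },
};

/* Return the table row for a mode, or NULL for an out-of-range value. */
const struct mode_desc *mode_lookup(enum pool_mode mode)
{
	if ((unsigned int)mode >= MODE_COUNT)
		return NULL;
	return &mode_table[mode];
}

/* True if even the newest allocation is never plainly writable. */
bool mode_protects_immediately(enum pool_mode mode)
{
	const struct mode_desc *d = mode_lookup(mode);

	return d && d->latest_alloc != W_WRITABLE;
}
```

With this table, mode_protects_immediately() is true only for the
START_WR row, which matches the remark that START_WR is the only mode
providing immediate protection.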
> +
> +Caveats
> +-------
> +- Freeing of memory is not supported. Pages will be returned to the
> + system upon destruction of their memory pool.
> +
> +- The address range available for vmalloc (and thus for pmalloc too) is
> + limited, on 32-bit systems. However it shouldn't be an issue, since not
> + much data is expected to be dynamically allocated and turned into
> + write-protected.
> +
> +- Regarding SMP systems, changing state of pages and altering mappings
> + requires performing cross-processor synchronizations of page tables.
> + This is an additional reason for limiting the use of write rare.
> +
> +- Not only the pmalloc memory must be protected, but also any reference to
> + it that might become the target for an attack. The attack would replace
> + a reference to the protected memory with a reference to some other,
> + unprotected, memory.
> +
> +- The users of rare write must take care of ensuring the atomicity of the
s/rare write/write rare/ ?
> + action, respect to the way they use the data being altered; for example,
This .. "respect to the way" is awkward, but I don't know what to
change it to.
> + take a lock before making a copy of the value to modify (if it's
> + relevant), then alter it, issue the call to rare write and finally
> + release the lock. Some special scenario might be exempt from the need
> + for locking, but in general rare-write must be treated as an operation
It seemed to me that "write-rare" (or write rare) was the going name, but now
it's being called "rare write" (or rare-write). Just be consistent, please.
> + that can incur into races.
> +
> +- pmalloc relies on virtual memory areas and will therefore use more
> + tlb entries. It still does a better job of it, compared to invoking
s/tlb/TLB/
> + vmalloc for each allocation, but it is undeniably less optimized wrt to
s/wrt to/with respect to/
> + TLB use than using the physmap directly, through kmalloc or similar.
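The locking caveat in this section (take a lock, copy the value, alter the
copy, issue the write rare call, release the lock) might be easier to
follow with an example. Below is a userspace sketch of that sequence. It
makes several assumptions: wr_copy() is a hypothetical stand-in that
briefly lifts the protection with mprotect(), which is not necessarily how
the patch set implements write rare, and struct config and the cfg_*
helpers are invented for the example.

```c
#define _DEFAULT_SOURCE
#include <pthread.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical data that is written rarely after setup. */
struct config {
	int limit;
	int enabled;
};

static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;
static struct config *cfg;	/* lives alone in its own page */
static size_t cfg_len;

/* Allocate the page, fill in initial values, then write protect it. */
int cfg_init(void)
{
	cfg_len = (size_t)sysconf(_SC_PAGESIZE);
	cfg = mmap(NULL, cfg_len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (cfg == MAP_FAILED) {
		cfg = NULL;
		return -1;
	}
	cfg->limit = 10;
	cfg->enabled = 1;
	return mprotect(cfg, cfg_len, PROT_READ);
}

/* Stand-in for a write rare primitive: unprotect, copy, protect again. */
static int wr_copy(void *dst, const void *src, size_t n)
{
	if (mprotect(cfg, cfg_len, PROT_READ | PROT_WRITE) != 0)
		return -1;
	memcpy(dst, src, n);
	return mprotect(cfg, cfg_len, PROT_READ);
}

/* The sequence from the caveat: lock, copy, alter, write rare, unlock. */
int cfg_set_limit(int new_limit)
{
	struct config tmp;
	int ret;

	pthread_mutex_lock(&cfg_lock);
	tmp = *cfg;			/* copy the current value */
	tmp.limit = new_limit;		/* alter the copy */
	ret = wr_copy(cfg, &tmp, sizeof(tmp));
	pthread_mutex_unlock(&cfg_lock);
	return ret;
}

/* Plain reads need no special call; the page stays readable. */
int cfg_get_limit(void)
{
	return cfg->limit;
}
```

The lock is what makes the copy, modify and write-back steps appear as one
update to other writers; the write rare call by itself only covers the
final store.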
> +
> +
> +Utilization
> +-----------
> +
> +**add examples here**
> +
> +API
> +---
> +
> +.. kernel-doc:: include/linux/prmem.h
> +.. kernel-doc:: mm/prmem.c
> +.. kernel-doc:: include/linux/prmemextra.h
Thanks for the documentation.
--
~Randy