linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yu-cheng Yu <yu-cheng.yu@intel.com>
To: x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org, Arnd Bergmann <arnd@arndb.de>,
	Andy Lutomirski <luto@amacapital.net>,
	Balbir Singh <bsingharora@gmail.com>,
	Cyrill Gorcunov <gorcunov@gmail.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Florian Weimer <fweimer@redhat.com>,
	"H.J. Lu" <hjl.tools@gmail.com>, Jann Horn <jannh@google.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Kees Cook <keescook@chromiun.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>, Pavel Machek <pavel@ucw.cz>,
	Peter Zijlstra <peterz@infradead.org>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	Vedvyas Shanbhogue <vedvyas.shanbhogue@intel.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Subject: [RFC PATCH v2 05/27] Documentation/x86: Add CET description
Date: Tue, 10 Jul 2018 15:26:17 -0700	[thread overview]
Message-ID: <20180710222639.8241-6-yu-cheng.yu@intel.com> (raw)
In-Reply-To: <20180710222639.8241-1-yu-cheng.yu@intel.com>

Explain how CET works and the no_cet_shstk/no_cet_ibt kernel
parameters.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 .../admin-guide/kernel-parameters.txt         |   6 +
 Documentation/x86/intel_cet.txt               | 250 ++++++++++++++++++
 2 files changed, 256 insertions(+)
 create mode 100644 Documentation/x86/intel_cet.txt

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index efc7aa7a0670..dc787facdcde 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2661,6 +2661,12 @@
 			noexec=on: enable non-executable mappings (default)
 			noexec=off: disable non-executable mappings
 
+	no_cet_ibt	[X86-64] Disable indirect branch tracking for user-mode
+			applications
+
+	no_cet_shstk	[X86-64] Disable shadow stack support for user-mode
+			applications
+
 	nosmap		[X86]
 			Disable SMAP (Supervisor Mode Access Prevention)
 			even if it is supported by processor.
diff --git a/Documentation/x86/intel_cet.txt b/Documentation/x86/intel_cet.txt
new file mode 100644
index 000000000000..974bb8262146
--- /dev/null
+++ b/Documentation/x86/intel_cet.txt
@@ -0,0 +1,250 @@
+=========================================
+Control Flow Enforcement Technology (CET)
+=========================================
+
+[1] Overview
+============
+
+Control Flow Enforcement Technology (CET) provides protection against
+return/jump-oriented programing (ROP) attacks.  It can be implemented
+to protect both the kernel and applications.  In the first phase,
+only the user-mode protection is implemented for the 64-bit kernel.
+Thirty-two bit applications are supported under the compatibility
+mode.
+
+CET includes shadow stack (SHSTK) and indirect branch tracking (IBT)
+and they are enabled from two kernel configuration options:
+
+    INTEL_X86_SHADOW_STACK_USER, and
+    INTEL_X86_BRANCH_TRACKING_USER.
+
+To build a CET-enabled kernel, Binutils v2.30 and GCC v8.1 or later
+are required.  To build a CET-enabled application, GLIBC v2.29 or
+later is also requried.
+
+There are two command-line options for disabling CET features:
+
+    no_cet_shstk - disables SHSTK, and
+    no_cet_ibt - disables IBT.
+
+At run time, /proc/cpuinfo shows the availability of SHSTK and IBT.
+
+[2] CET assembly instructions
+=============================
+
+RDSSP %r
+    Read the SHSTK pointer into %r.
+
+INCSSP %r
+    Unwind (increment) the SHSTK pointer (0 ~ 255) steps as indicated
+    in the operand register.  The GLIBC longjmp uses INCSSP to unwind
+    the SHSTK until that matches the program stack.  When it is
+    necessary to unwind beyond 255 steps, longjmp divides and repeats
+    the process.
+
+RSTORSSP (%r)
+    Switch to the SHSTK indicated in the 'restore token' pointed by
+    the operand register and replace the 'restore token' with a new
+    token to be saved (with SAVEPREVSSP) for the outgoing SHSTK.
+
+                               Before RSTORSSP
+
+             Incoming SHSTK                   Current/Outgoing SHSTK
+
+        |----------------------|             |----------------------|
+ addr=x |                      |       ssp-> |                      |
+        |----------------------|             |----------------------|
+ (%r)-> | rstor_token=(x|Lg)   |    addr=y-8 |                      |
+        |----------------------|             |----------------------|
+
+                               After RSTORSSP
+
+        |----------------------|             |----------------------|
+  ssp-> |                      |             |                      |
+        |----------------------|             |----------------------|
+        | rstor_token=(y|Bz|Lg)|    addr=y-8 |                      |
+        |----------------------|             |----------------------|
+
+    note:
+        1. Only valid addresses and restore tokens can be on the
+           user-mode SHSTK.
+        2. A token is always of type u64 and must align to u64.
+        3. The incoming SHSTK pointer in a rstor_token must point to
+           immediately above the token.
+        4. 'Lg' is bit[0] of a rstor_token indicating a 64-bit SHSTK.
+        5. 'Bz' is bit[1] of a rstor_token indicating the token is to
+           be used only for the next SAVEPREVSSP and invalid for the
+           RSTORSSP.
+
+SAVEPREVSSP
+    Store the SHSTK 'restore token' pointed by
+        (current_SHSTK_pointer + 8).
+
+                             After SAVEPREVSSP
+
+        |----------------------|             |----------------------|
+  ssp-> |                      |             |                      |
+        |----------------------|             |----------------------|
+        | rstor_token=(y|Bz|Lg)|    addr=y-8 | rstor_token(y|Lg)    |
+        |----------------------|             |----------------------|
+
+WRUSS %r0, (%r1)
+    Write the value in %r0 to the SHSTK address pointed by (%r1).
+    This is a kernel-mode only instruction.
+
+ENDBR
+    The compiler inserts an ENDBR at all valid branch targets.  Any
+    CALL/JMP to a target without an ENDBR triggers a control
+    protection fault.
+
+[3] Application Enabling
+========================
+
+An application's CET capability is marked in its ELF header and can
+be verified from the following command output, in the
+NT_GNU_PROPERTY_TYPE_0 field:
+
+    readelf -n <application>
+
+If an application supports CET and is statically linked, it will run
+with CET protection.  If the application needs any shared libraries,
+the loader checks all dependencies and enables CET only when all
+requirements are met.
+
+[4] Legacy Libraries
+====================
+
+GLIBC provides a few tunables for backward compatibility.
+
+GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-IBT
+    Turn off SHSTK/IBT for the current shell.
+
+GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>
+    This controls how dlopen() handles SHSTK legacy libraries:
+        on: continue with SHSTK enabled;
+        permissive: continue with SHSTK off.
+
+[5] CET system calls
+====================
+
+The following arch_prctl() system calls are added for CET:
+
+arch_prctl(ARCH_CET_STATUS, unsigned long *addr)
+    Return CET feature status.
+
+    The parameter 'addr' is a pointer to a user buffer.
+    On returning to the caller, the kernel fills the following
+    information:
+
+    *addr = SHSTK/IBT status
+    *(addr + 1) = SHSTK base address
+    *(addr + 2) = SHSTK size
+
+arch_prctl(ARCH_CET_DISABLE, unsigned long features)
+    Disable SHSTK and/or IBT specified in 'features'.  Return -EPERM
+    if CET is locked out.
+
+arch_prctl(ARCH_CET_LOCK)
+    Lock out CET feature.
+
+arch_prctl(ARCH_CET_ALLOC_SHSTK, unsigned long *addr)
+    Allocate a new SHSTK.
+
+    The parameter 'addr' is a pointer to a user buffer and indicates
+    the desired SHSTK size to allocate.  On returning to the caller
+    the buffer contains the address of the new SHSTK.
+
+arch_prctl(ARCH_CET_LEGACY_BITMAP, unsigned long *addr)
+    Allocate an IBT legacy code bitmap if the current task does not
+    have one.
+
+    The parameter 'addr' is a pointer to a user buffer.
+    On returning to the caller, the kernel fills the following
+    information:
+
+    *addr = IBT bitmap base address
+    *(addr + 1) = IBT bitmap size
+
+[6] The implementation of the SHSTK
+===================================
+
+SHSTK size
+----------
+
+A task's SHSTK is allocated from memory to a fixed size that can
+support 32 KB nested function calls; that is 256 KB for a 64-bit
+application and 128 KB for a 32-bit application.  The system admin
+can change the default size.
+
+Signal
+------
+
+The main program and its signal handlers use the same SHSTK.  Because
+the SHSTK stores only return addresses, we can estimate a large
+enough SHSTK to cover the condition that both the program stack and
+the sigaltstack run out.
+
+The kernel creates a restore token at the SHSTK restoring address and
+verifies that token when restoring from the signal handler.
+
+Fork
+----
+
+The SHSTK's vma has VM_SHSTK flag set; its PTEs are required to be
+read-only and dirty.  When a SHSTK PTE is not present, RO, and dirty,
+a SHSTK access triggers a page fault with an additional SHSTK bit set
+in the page fault error code.
+
+When a task forks a child, its SHSTK PTEs are copied and both the
+parent's and the child's SHSTK PTEs are cleared of the dirty bit.
+Upon the next SHSTK access, the resulting SHSTK page fault is handled
+by page copy/re-use.
+
+When a pthread child is created, the kernel allocates a new SHSTK for
+the new thread.
+
+Setjmp/Longjmp
+--------------
+
+Longjmp unwinds SHSTK until it matches the program stack.
+
+Ucontext
+--------
+
+In GLIBC, getcontext/setcontext is implemented in similar way as
+setjmp/longjmp.
+
+When makecontext creates a new ucontext, a new SHSTK is allocated for
+that context with ARCH_CET_ALLOC_SHSTK the syscall.  The kernel
+creates a restore token at the top of the new SHSTK and the user-mode
+code switches to the new SHSTK with the RSTORSSP instruction.
+
+[7] The management of read-only & dirty PTEs for SHSTK
+======================================================
+
+A RO and dirty PTE exists in the following cases:
+
+(a) A page is modified and then shared with a fork()'ed child;
+(b) A R/O page that has been COW'ed;
+(c) A SHSTK page.
+
+The processor only checks the dirty bit for (c).  To prevent the use
+of non-SHSTK memory as SHSTK, we use a spare bit of the 64-bit PTE as
+DIRTY_SW for (a) and (b) above.  This results to the following PTE
+settings:
+
+Modified PTE:             (R/W + DIRTY_HW)
+Modified and shared PTE:  (R/O + DIRTY_SW)
+R/O PTE, COW'ed:          (R/O + DIRTY_SW)
+SHSTK PTE:                (R/O + DIRTY_HW)
+SHSTK PTE, COW'ed:        (R/O + DIRTY_HW)
+SHSTK PTE, shared:        (R/O + DIRTY_SW)
+
+Note that DIRTY_SW is only used in R/O PTEs but not R/W PTEs.
+
+[8] The implementation of IBT
+=============================
+
+The kernel provides IBT support in mmap() of the legacy code bit map.
+However, the management of the bitmap is done in the GLIBC or the
+application.
-- 
2.17.1


  parent reply	other threads:[~2018-07-10 22:33 UTC|newest]

Thread overview: 123+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-10 22:26 [RFC PATCH v2 00/27] Control Flow Enforcement (CET) Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 01/27] x86/cpufeatures: Add CPUIDs for Control-flow Enforcement Technology (CET) Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 02/27] x86/fpu/xstate: Change some names to separate XSAVES system and user states Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 03/27] x86/fpu/xstate: Enable XSAVES system states Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 04/27] x86/fpu/xstate: Add XSAVES system states for shadow stack Yu-cheng Yu
2018-07-10 22:26 ` Yu-cheng Yu [this message]
2018-07-11  8:27   ` [RFC PATCH v2 05/27] Documentation/x86: Add CET description Pavel Machek
2018-07-11 15:25     ` Yu-cheng Yu
2018-07-11  9:57   ` Florian Weimer
2018-07-11 13:47     ` H.J. Lu
2018-07-11 14:53       ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 06/27] x86/cet: Control protection exception handler Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 07/27] x86/cet/shstk: Add Kconfig option for user-mode shadow stack Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 08/27] mm: Introduce VM_SHSTK for shadow stack memory Yu-cheng Yu
2018-07-11  8:34   ` Peter Zijlstra
2018-07-11 16:15     ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 09/27] x86/mm: Change _PAGE_DIRTY to _PAGE_DIRTY_HW Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 10/27] x86/mm: Introduce _PAGE_DIRTY_SW Yu-cheng Yu
2018-07-11  8:45   ` Peter Zijlstra
2018-07-11  9:21   ` Peter Zijlstra
2018-07-10 22:26 ` [RFC PATCH v2 11/27] x86/mm: Modify ptep_set_wrprotect and pmdp_set_wrprotect for _PAGE_DIRTY_SW Yu-cheng Yu
2018-07-10 22:44   ` Dave Hansen
2018-07-10 23:23     ` Nadav Amit
2018-07-10 23:52       ` Dave Hansen
2018-07-11  8:48     ` Peter Zijlstra
2018-07-10 22:26 ` [RFC PATCH v2 12/27] x86/mm: Shadow stack page fault error checking Yu-cheng Yu
2018-07-10 22:52   ` Dave Hansen
2018-07-11 17:28     ` Yu-cheng Yu
2018-07-10 23:24   ` Dave Hansen
2018-07-10 22:26 ` [RFC PATCH v2 13/27] mm: Handle shadow stack page fault Yu-cheng Yu
2018-07-10 23:06   ` Dave Hansen
2018-07-11  9:06     ` Peter Zijlstra
2018-08-14 21:28       ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 14/27] mm: Handle THP/HugeTLB " Yu-cheng Yu
2018-07-10 23:08   ` Dave Hansen
2018-07-11  9:10   ` Peter Zijlstra
2018-07-11 16:11     ` Yu-cheng Yu
2018-07-20 14:20   ` Dave Hansen
2018-07-20 14:58     ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 15/27] mm/mprotect: Prevent mprotect from changing shadow stack Yu-cheng Yu
2018-07-10 23:10   ` Dave Hansen
2018-07-11  9:12     ` Peter Zijlstra
2018-07-11 16:07       ` Yu-cheng Yu
2018-07-11 16:22         ` Dave Hansen
2018-07-10 22:26 ` [RFC PATCH v2 16/27] mm: Modify can_follow_write_pte/pmd for " Yu-cheng Yu
2018-07-10 23:37   ` Dave Hansen
2018-07-11 17:05     ` Yu-cheng Yu
2018-07-13 18:26       ` Dave Hansen
2018-07-17 23:03         ` Yu-cheng Yu
2018-07-17 23:11           ` Dave Hansen
2018-07-17 23:15           ` Dave Hansen
2018-07-18 20:14             ` Yu-cheng Yu
2018-07-18 21:45               ` Dave Hansen
2018-07-18 23:10                 ` Yu-cheng Yu
2018-07-19  0:06                   ` Dave Hansen
2018-07-19 17:06                     ` Yu-cheng Yu
2018-07-19 19:31                       ` Dave Hansen
2018-07-11  9:29   ` Peter Zijlstra
2018-07-17 23:00     ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 17/27] x86/cet/shstk: User-mode shadow stack support Yu-cheng Yu
2018-07-10 23:40   ` Dave Hansen
2018-07-11  9:34   ` Peter Zijlstra
2018-07-11 15:45     ` Yu-cheng Yu
2018-07-11  9:36   ` Peter Zijlstra
2018-07-11 21:10   ` Jann Horn
2018-07-11 21:34     ` Andy Lutomirski
2018-07-11 21:51       ` Jann Horn
2018-07-11 22:21         ` Andy Lutomirski
2018-07-13 18:03           ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 18/27] x86/cet/shstk: Introduce WRUSS instruction Yu-cheng Yu
2018-07-10 23:48   ` Dave Hansen
2018-07-12 22:59     ` Yu-cheng Yu
2018-07-12 23:49       ` Dave Hansen
2018-07-13  1:50         ` Dave Hansen
2018-07-13  2:21           ` Andy Lutomirski
2018-07-13  4:16             ` Dave Hansen
2018-07-13  4:18               ` Dave Hansen
2018-07-13 17:39                 ` Yu-cheng Yu
2018-07-13  5:55               ` Andy Lutomirski
2018-07-11  9:44   ` Peter Zijlstra
2018-07-11 15:06     ` Yu-cheng Yu
2018-07-11 15:30       ` Peter Zijlstra
2018-07-11  9:45   ` Peter Zijlstra
2018-07-11 14:58     ` Yu-cheng Yu
2018-07-11 15:27       ` Peter Zijlstra
2018-07-11 15:41         ` Yu-cheng Yu
2018-07-13 12:12   ` Dave Hansen
2018-07-13 17:37     ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 19/27] x86/cet/shstk: Signal handling for shadow stack Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 20/27] x86/cet/shstk: ELF header parsing of CET Yu-cheng Yu
2018-07-11 11:12   ` Florian Weimer
2018-07-11 19:37   ` Jann Horn
2018-07-11 20:53     ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 21/27] x86/cet/ibt: Add Kconfig option for user-mode Indirect Branch Tracking Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 22/27] x86/cet/ibt: User-mode indirect branch tracking support Yu-cheng Yu
2018-07-11  0:11   ` Dave Hansen
2018-07-11 22:10     ` Yu-cheng Yu
2018-07-11 22:40       ` Dave Hansen
2018-07-11 23:00         ` Yu-cheng Yu
2018-07-11 23:16           ` Dave Hansen
2018-07-13 17:56             ` Yu-cheng Yu
2018-07-13 18:05               ` Dave Hansen
2018-07-11 21:07   ` Jann Horn
2018-07-10 22:26 ` [RFC PATCH v2 23/27] mm/mmap: Add IBT bitmap size to address space limit check Yu-cheng Yu
2018-07-10 23:57   ` Dave Hansen
2018-07-11 16:56     ` Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 24/27] x86: Insert endbr32/endbr64 to vDSO Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 25/27] x86/cet: Add PTRACE interface for CET Yu-cheng Yu
2018-07-11 10:20   ` Ingo Molnar
2018-07-11 15:40     ` Yu-cheng Yu
2018-07-12 14:03       ` Ingo Molnar
2018-07-12 22:37         ` Yu-cheng Yu
2018-07-12 23:08           ` Thomas Gleixner
2018-07-13 16:07             ` Yu-cheng Yu
2018-07-13  6:28         ` Pavel Machek
2018-07-13 13:33           ` Ingo Molnar
2018-07-14  6:27             ` Pavel Machek
2018-07-10 22:26 ` [RFC PATCH v2 26/27] x86/cet/shstk: Handle thread shadow stack Yu-cheng Yu
2018-07-10 22:26 ` [RFC PATCH v2 27/27] x86/cet: Add arch_prctl functions for CET Yu-cheng Yu
2018-07-11 12:19   ` Florian Weimer
2018-07-11 21:02     ` Yu-cheng Yu
2018-07-11 19:45   ` Jann Horn
2018-07-11 20:55     ` Yu-cheng Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180710222639.8241-6-yu-cheng.yu@intel.com \
    --to=yu-cheng.yu@intel.com \
    --cc=arnd@arndb.de \
    --cc=bsingharora@gmail.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=fweimer@redhat.com \
    --cc=gorcunov@gmail.com \
    --cc=hjl.tools@gmail.com \
    --cc=hpa@zytor.com \
    --cc=jannh@google.com \
    --cc=keescook@chromiun.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@amacapital.net \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=oleg@redhat.com \
    --cc=pavel@ucw.cz \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=vedvyas.shanbhogue@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).