All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yu-cheng Yu <yu-cheng.yu@intel.com>
To: x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org, Arnd Bergmann <arnd@arndb.de>,
	Andy Lutomirski <luto@amacapital.net>,
	Balbir Singh <bsingharora@gmail.com>,
	Borislav Petkov <bp@alien8.de>,
	Cyrill Gorcunov <gorcunov@gmail.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Eugene Syromiatnikov <esyr@redhat.com>,
	Florian Weimer <fweimer@redhat.com>,
	"H.J. Lu" <hjl.tools@gmail.com>, Jann Horn <jannh@google.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Kees Cook <keescook@chromium.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>, Pavel Machek <pavel@ucw.cz>,
	Peter Zijlstra <peterz@infradead.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	Vedvyas Shanbhogue <vedvyas.shanbhogue@intel.com>,
	Dave Martin <Dave.Martin@arm.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Subject: [PATCH v7 01/27] Documentation/x86: Add CET description
Date: Thu,  6 Jun 2019 13:06:20 -0700	[thread overview]
Message-ID: <20190606200646.3951-2-yu-cheng.yu@intel.com> (raw)
In-Reply-To: <20190606200646.3951-1-yu-cheng.yu@intel.com>

Explain how CET works and the no_cet_shstk/no_cet_ibt kernel
parameters.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 .../admin-guide/kernel-parameters.txt         |   6 +
 Documentation/x86/index.rst                   |   1 +
 Documentation/x86/intel_cet.rst               | 268 ++++++++++++++++++
 3 files changed, 275 insertions(+)
 create mode 100644 Documentation/x86/intel_cet.rst

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 138f6664b2e2..b9b054f6c732 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2906,6 +2906,12 @@
 			noexec=on: enable non-executable mappings (default)
 			noexec=off: disable non-executable mappings
 
+	no_cet_ibt	[X86-64] Disable indirect branch tracking for user-mode
+			applications
+
+	no_cet_shstk	[X86-64] Disable shadow stack support for user-mode
+			applications
+
 	nosmap		[X86,PPC]
 			Disable SMAP (Supervisor Mode Access Prevention)
 			even if it is supported by processor.
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index ae36fc5fc649..47822ac02fe0 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -21,6 +21,7 @@ x86-specific Documentation
    pat
    protection-keys
    intel_mpx
+   intel_cet
    amd-memory-encryption
    pti
    mds
diff --git a/Documentation/x86/intel_cet.rst b/Documentation/x86/intel_cet.rst
new file mode 100644
index 000000000000..dac83bbf8a24
--- /dev/null
+++ b/Documentation/x86/intel_cet.rst
@@ -0,0 +1,268 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+Control-flow Enforcement Technology (CET)
+=========================================
+
+[1] Overview
+============
+
+Control-flow Enforcement Technology (CET) provides protection against
+return/jump-oriented programming (ROP) attacks.  It can be setup to
+protect both the kernel and applications.  In the first phase,
+only the user-mode protection is implemented in 64-bit mode; 32-bit
+applications are supported in compatibility mode.
+
+CET introduces shadow stack (SHSTK) and indirect branch tracking
+(IBT).  SHSTK is a secondary stack allocated from memory and cannot
+be directly modified by applications.  When executing a CALL, the
+processor pushes a copy of the return address to SHSTK.  Upon
+function return, the processor pops the SHSTK copy and compares it
+to the one from the program stack.  If the two copies differ, the
+processor raises a control-protection exception.  IBT verifies all
+indirect CALL/JMP targets are intended as marked by the compiler
+with 'ENDBR' opcodes (see CET instructions below).
+
+There are two kernel configuration options:
+
+    INTEL_X86_SHADOW_STACK_USER, and
+    INTEL_X86_BRANCH_TRACKING_USER.
+
+To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or later
+are required.  To build a CET-enabled application, GLIBC v2.28 or
+later is also required.
+
+There are two command-line options for disabling CET features:
+
+    no_cet_shstk - disables SHSTK, and
+    no_cet_ibt - disables IBT.
+
+At run time, /proc/cpuinfo shows the availability of SHSTK and IBT.
+
+[2] CET assembly instructions
+=============================
+
+RDSSP %r
+    Read the SHSTK pointer into %r.
+
+INCSSP %r
+    Unwind (increment) the SHSTK pointer (0 ~ 255) steps as indicated
+    in the operand register.  The GLIBC longjmp uses INCSSP to unwind
+    the SHSTK until that matches the program stack.  When it is
+    necessary to unwind beyond 255 steps, longjmp divides and repeats
+    the process.
+
+RSTORSSP (%r)
+    Switch to the SHSTK indicated in the 'restore token' pointed by
+    the operand register and replace the 'restore token' with a new
+    token to be saved (with SAVEPREVSSP) for the outgoing SHSTK.
+
+::
+
+                               Before RSTORSSP
+
+             Incoming SHSTK                   Current/Outgoing SHSTK
+
+        |----------------------|             |----------------------|
+ addr=x |                      |       ssp-> |                      |
+        |----------------------|             |----------------------|
+ (%r)-> | rstor_token=(x|Lg)   |    addr=y-8 |                      |
+        |----------------------|             |----------------------|
+
+                               After RSTORSSP
+
+        |----------------------|             |----------------------|
+        |                      |             |                      |
+        |----------------------|             |----------------------|
+  ssp-> | rstor_token=(y|Bz|Lg)|    addr=y-8 |                      |
+        |----------------------|             |----------------------|
+
+    note:
+        1. Only valid addresses and restore tokens can be on the
+           user-mode SHSTK.
+        2. A token is always of type u64 and must align to u64.
+        3. The incoming SHSTK pointer in a rstor_token must point to
+           immediately above the token.
+        4. 'Lg' is bit[0] of a rstor_token indicating a 64-bit SHSTK.
+        5. 'Bz' is bit[1] of a rstor_token indicating the token is to
+           be used only for the next SAVEPREVSSP and invalid for the
+           RSTORSSP.
+
+SAVEPREVSSP
+    Store the SHSTK 'restore token' pointed by
+        (current_SHSTK_pointer + 8).
+
+::
+
+                             After SAVEPREVSSP
+
+        |----------------------|             |----------------------|
+  ssp-> |                      |             |                      |
+        |----------------------|             |----------------------|
+        | rstor_token=(y|Bz|Lg)|    addr=y-8 | rstor_token(y|Lg)    |
+        |----------------------|             |----------------------|
+
+WRUSS %r0, (%r1)
+    Write the value in %r0 to the SHSTK address pointed by (%r1).
+    This is a kernel-mode only instruction.
+
+ENDBR
+    The compiler inserts an ENDBR at all valid branch targets.  Any
+    CALL/JMP to a target without an ENDBR triggers a control
+    protection fault.
+
+[3] Application Enabling
+========================
+
+An application's CET capability is marked in its ELF header and can
+be verified from the following command output, in the
+NT_GNU_PROPERTY_TYPE_0 field:
+
+    readelf -n <application>
+
+If an application supports CET and is statically linked, it will run
+with CET protection.  If the application needs any shared libraries,
+the loader checks all dependencies and enables CET only when all
+requirements are met.
+
+[4] Legacy Libraries
+====================
+
+GLIBC provides a few tunables for backward compatibility.
+
+GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-IBT
+    Turn off SHSTK/IBT for the current shell.
+
+GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>
+    This controls how dlopen() handles SHSTK legacy libraries:
+        on: continue with SHSTK enabled;
+        permissive: continue with SHSTK off.
+
+[5] CET system calls
+====================
+
+The following arch_prctl() system calls are added for CET:
+
+arch_prctl(ARCH_X86_CET_STATUS, unsigned long *addr)
+    Return CET feature status.
+
+    The parameter 'addr' is a pointer to a user buffer.
+    On returning to the caller, the kernel fills the following
+    information:
+
+    *addr = SHSTK/IBT status
+    *(addr + 1) = SHSTK base address
+    *(addr + 2) = SHSTK size
+
+arch_prctl(ARCH_X86_CET_DISABLE, unsigned long features)
+    Disable SHSTK and/or IBT specified in 'features'.  Return -EPERM
+    if CET is locked.
+
+arch_prctl(ARCH_X86_CET_LOCK)
+    Lock in CET feature.
+
+arch_prctl(ARCH_X86_CET_ALLOC_SHSTK, unsigned long *addr)
+    Allocate a new SHSTK and put a restore token at top.
+
+    The parameter 'addr' is a pointer to a user buffer and indicates
+    the desired SHSTK size to allocate.  On returning to the caller,
+    the kernel fills *addr with the base address of the new SHSTK.
+
+arch_prctl(ARCH_X86_CET_SET_LEGACY_BITMAP, unsigned long *addr)
+    Setup an IBT legacy code bitmap.
+
+    The parameter 'addr' is a pointer to a user buffer that has the
+    following information:
+
+    *addr = IBT bitmap base address
+    *(addr + 1) = IBT bitmap size
+
+Note:
+  There is no CET enabling arch_prctl function.  By design, CET is
+  enabled automatically if the binary and the system can support it.
+
+  The parameters passed are always unsigned 64-bit.  When an ia32
+  application passing pointers, it should only use the lower 32 bits.
+
+[6] The implementation of the SHSTK
+===================================
+
+SHSTK size
+----------
+
+A task's SHSTK is allocated from memory to a fixed size of
+RLIMIT_STACK.  A compat-mode thread's SHSTK size is 1/4 of
+RLIMIT_STACK.  The smaller 32-bit thread SHSTK allows more threads to
+share a 32-bit address space.
+
+Signal
+------
+
+The main program and its signal handlers use the same SHSTK.  Because
+the SHSTK stores only return addresses, a large SHSTK will cover the
+condition that both the program stack and the sigaltstack run out.
+
+The kernel creates a restore token at the SHSTK restoring address and
+verifies that token when restoring from the signal handler.
+
+Fork
+----
+
+The SHSTK's vma has VM_SHSTK flag set; its PTEs are required to be
+read-only and dirty.  When a SHSTK PTE is not present, RO, and dirty,
+a SHSTK access triggers a page fault with an additional SHSTK bit set
+in the page fault error code.
+
+When a task forks a child, its SHSTK PTEs are copied and both the
+parent's and the child's SHSTK PTEs are cleared of the dirty bit.
+Upon the next SHSTK access, the resulting SHSTK page fault is handled
+by page copy/re-use.
+
+When a pthread child is created, the kernel allocates a new SHSTK for
+the new thread.
+
+Setjmp/Longjmp
+--------------
+
+Longjmp unwinds SHSTK until it matches the program stack.
+
+Ucontext
+--------
+
+In GLIBC, getcontext/setcontext is implemented in similar way as
+setjmp/longjmp.
+
+When makecontext creates a new ucontext, a new SHSTK is allocated for
+that context with ARCH_X86_CET_ALLOC_SHSTK the syscall.  The kernel
+creates a restore token at the top of the new SHSTK and the user-mode
+code switches to the new SHSTK with the RSTORSSP instruction.
+
+[7] The management of read-only & dirty PTEs for SHSTK
+======================================================
+
+A RO and dirty PTE exists in the following cases:
+
+(a) A page is modified and then shared with a fork()'ed child;
+(b) A R/O page that has been COW'ed;
+(c) A SHSTK page.
+
+The processor only checks the dirty bit for (c).  To prevent the use
+of non-SHSTK memory as SHSTK, we use a spare bit of the 64-bit PTE as
+DIRTY_SW for (a) and (b) above.  This results to the following PTE
+settings:
+
+Modified PTE:             (R/W + DIRTY_HW)
+Modified and shared PTE:  (R/O + DIRTY_SW)
+R/O PTE, COW'ed:          (R/O + DIRTY_SW)
+SHSTK PTE:                (R/O + DIRTY_HW)
+SHSTK PTE, COW'ed:        (R/O + DIRTY_HW)
+SHSTK PTE, shared:        (R/O + DIRTY_SW)
+
+Note that DIRTY_SW is only used in R/O PTEs but not R/W PTEs.
+
+[8] The implementation of IBT
+=============================
+
+The kernel provides IBT support in mmap() of the legacy code bit map.
+However, the management of the bitmap is done in the GLIBC or the
+application.
-- 
2.17.1


WARNING: multiple messages have this Message-ID (diff)
From: Yu-cheng Yu <yu-cheng.yu@intel.com>
To: x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org, Arnd Bergmann <arnd@arndb.de>,
	Andy Lutomirski <luto@amacapital.net>,
	Balbir Singh <bsingharora@gmail.com>,
	Borislav Petkov <bp@alien8.de>,
	Cyrill Gorcunov <gorcunov@gmail.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Eugene Syromiatnikov <esyr@redhat.com>,
	Florian Weimer <fweimer@redhat.com>,
	"H.J. Lu" <hjl.tools@gmail.com>, Jann Horn <jannh@google.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Kees Cook <keescook@chromium.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Nadav Amit <nadav.amit@gmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Subject: [PATCH v7 01/27] Documentation/x86: Add CET description
Date: Thu,  6 Jun 2019 13:06:20 -0700	[thread overview]
Message-ID: <20190606200646.3951-2-yu-cheng.yu@intel.com> (raw)
In-Reply-To: <20190606200646.3951-1-yu-cheng.yu@intel.com>

Explain how CET works and the no_cet_shstk/no_cet_ibt kernel
parameters.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 .../admin-guide/kernel-parameters.txt         |   6 +
 Documentation/x86/index.rst                   |   1 +
 Documentation/x86/intel_cet.rst               | 268 ++++++++++++++++++
 3 files changed, 275 insertions(+)
 create mode 100644 Documentation/x86/intel_cet.rst

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 138f6664b2e2..b9b054f6c732 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2906,6 +2906,12 @@
 			noexec=on: enable non-executable mappings (default)
 			noexec=off: disable non-executable mappings
 
+	no_cet_ibt	[X86-64] Disable indirect branch tracking for user-mode
+			applications
+
+	no_cet_shstk	[X86-64] Disable shadow stack support for user-mode
+			applications
+
 	nosmap		[X86,PPC]
 			Disable SMAP (Supervisor Mode Access Prevention)
 			even if it is supported by processor.
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index ae36fc5fc649..47822ac02fe0 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -21,6 +21,7 @@ x86-specific Documentation
    pat
    protection-keys
    intel_mpx
+   intel_cet
    amd-memory-encryption
    pti
    mds
diff --git a/Documentation/x86/intel_cet.rst b/Documentation/x86/intel_cet.rst
new file mode 100644
index 000000000000..dac83bbf8a24
--- /dev/null
+++ b/Documentation/x86/intel_cet.rst
@@ -0,0 +1,268 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+Control-flow Enforcement Technology (CET)
+=========================================
+
+[1] Overview
+============
+
+Control-flow Enforcement Technology (CET) provides protection against
+return/jump-oriented programming (ROP) attacks.  It can be setup to
+protect both the kernel and applications.  In the first phase,
+only the user-mode protection is implemented in 64-bit mode; 32-bit
+applications are supported in compatibility mode.
+
+CET introduces shadow stack (SHSTK) and indirect branch tracking
+(IBT).  SHSTK is a secondary stack allocated from memory and cannot
+be directly modified by applications.  When executing a CALL, the
+processor pushes a copy of the return address to SHSTK.  Upon
+function return, the processor pops the SHSTK copy and compares it
+to the one from the program stack.  If the two copies differ, the
+processor raises a control-protection exception.  IBT verifies all
+indirect CALL/JMP targets are intended as marked by the compiler
+with 'ENDBR' opcodes (see CET instructions below).
+
+There are two kernel configuration options:
+
+    INTEL_X86_SHADOW_STACK_USER, and
+    INTEL_X86_BRANCH_TRACKING_USER.
+
+To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or later
+are required.  To build a CET-enabled application, GLIBC v2.28 or
+later is also required.
+
+There are two command-line options for disabling CET features:
+
+    no_cet_shstk - disables SHSTK, and
+    no_cet_ibt - disables IBT.
+
+At run time, /proc/cpuinfo shows the availability of SHSTK and IBT.
+
+[2] CET assembly instructions
+=============================
+
+RDSSP %r
+    Read the SHSTK pointer into %r.
+
+INCSSP %r
+    Unwind (increment) the SHSTK pointer (0 ~ 255) steps as indicated
+    in the operand register.  The GLIBC longjmp uses INCSSP to unwind
+    the SHSTK until that matches the program stack.  When it is
+    necessary to unwind beyond 255 steps, longjmp divides and repeats
+    the process.
+
+RSTORSSP (%r)
+    Switch to the SHSTK indicated in the 'restore token' pointed by
+    the operand register and replace the 'restore token' with a new
+    token to be saved (with SAVEPREVSSP) for the outgoing SHSTK.
+
+::
+
+                               Before RSTORSSP
+
+             Incoming SHSTK                   Current/Outgoing SHSTK
+
+        |----------------------|             |----------------------|
+ addr=x |                      |       ssp-> |                      |
+        |----------------------|             |----------------------|
+ (%r)-> | rstor_token=(x|Lg)   |    addr=y-8 |                      |
+        |----------------------|             |----------------------|
+
+                               After RSTORSSP
+
+        |----------------------|             |----------------------|
+        |                      |             |                      |
+        |----------------------|             |----------------------|
+  ssp-> | rstor_token=(y|Bz|Lg)|    addr=y-8 |                      |
+        |----------------------|             |----------------------|
+
+    note:
+        1. Only valid addresses and restore tokens can be on the
+           user-mode SHSTK.
+        2. A token is always of type u64 and must align to u64.
+        3. The incoming SHSTK pointer in a rstor_token must point to
+           immediately above the token.
+        4. 'Lg' is bit[0] of a rstor_token indicating a 64-bit SHSTK.
+        5. 'Bz' is bit[1] of a rstor_token indicating the token is to
+           be used only for the next SAVEPREVSSP and invalid for the
+           RSTORSSP.
+
+SAVEPREVSSP
+    Store the SHSTK 'restore token' pointed by
+        (current_SHSTK_pointer + 8).
+
+::
+
+                             After SAVEPREVSSP
+
+        |----------------------|             |----------------------|
+  ssp-> |                      |             |                      |
+        |----------------------|             |----------------------|
+        | rstor_token=(y|Bz|Lg)|    addr=y-8 | rstor_token(y|Lg)    |
+        |----------------------|             |----------------------|
+
+WRUSS %r0, (%r1)
+    Write the value in %r0 to the SHSTK address pointed by (%r1).
+    This is a kernel-mode only instruction.
+
+ENDBR
+    The compiler inserts an ENDBR at all valid branch targets.  Any
+    CALL/JMP to a target without an ENDBR triggers a control
+    protection fault.
+
+[3] Application Enabling
+========================
+
+An application's CET capability is marked in its ELF header and can
+be verified from the following command output, in the
+NT_GNU_PROPERTY_TYPE_0 field:
+
+    readelf -n <application>
+
+If an application supports CET and is statically linked, it will run
+with CET protection.  If the application needs any shared libraries,
+the loader checks all dependencies and enables CET only when all
+requirements are met.
+
+[4] Legacy Libraries
+====================
+
+GLIBC provides a few tunables for backward compatibility.
+
+GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-IBT
+    Turn off SHSTK/IBT for the current shell.
+
+GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>
+    This controls how dlopen() handles SHSTK legacy libraries:
+        on: continue with SHSTK enabled;
+        permissive: continue with SHSTK off.
+
+[5] CET system calls
+====================
+
+The following arch_prctl() system calls are added for CET:
+
+arch_prctl(ARCH_X86_CET_STATUS, unsigned long *addr)
+    Return CET feature status.
+
+    The parameter 'addr' is a pointer to a user buffer.
+    On returning to the caller, the kernel fills the following
+    information:
+
+    *addr = SHSTK/IBT status
+    *(addr + 1) = SHSTK base address
+    *(addr + 2) = SHSTK size
+
+arch_prctl(ARCH_X86_CET_DISABLE, unsigned long features)
+    Disable SHSTK and/or IBT specified in 'features'.  Return -EPERM
+    if CET is locked.
+
+arch_prctl(ARCH_X86_CET_LOCK)
+    Lock in CET feature.
+
+arch_prctl(ARCH_X86_CET_ALLOC_SHSTK, unsigned long *addr)
+    Allocate a new SHSTK and put a restore token at top.
+
+    The parameter 'addr' is a pointer to a user buffer and indicates
+    the desired SHSTK size to allocate.  On returning to the caller,
+    the kernel fills *addr with the base address of the new SHSTK.
+
+arch_prctl(ARCH_X86_CET_SET_LEGACY_BITMAP, unsigned long *addr)
+    Setup an IBT legacy code bitmap.
+
+    The parameter 'addr' is a pointer to a user buffer that has the
+    following information:
+
+    *addr = IBT bitmap base address
+    *(addr + 1) = IBT bitmap size
+
+Note:
+  There is no CET enabling arch_prctl function.  By design, CET is
+  enabled automatically if the binary and the system can support it.
+
+  The parameters passed are always unsigned 64-bit.  When an ia32
+  application passing pointers, it should only use the lower 32 bits.
+
+[6] The implementation of the SHSTK
+===================================
+
+SHSTK size
+----------
+
+A task's SHSTK is allocated from memory to a fixed size of
+RLIMIT_STACK.  A compat-mode thread's SHSTK size is 1/4 of
+RLIMIT_STACK.  The smaller 32-bit thread SHSTK allows more threads to
+share a 32-bit address space.
+
+Signal
+------
+
+The main program and its signal handlers use the same SHSTK.  Because
+the SHSTK stores only return addresses, a large SHSTK will cover the
+condition that both the program stack and the sigaltstack run out.
+
+The kernel creates a restore token at the SHSTK restoring address and
+verifies that token when restoring from the signal handler.
+
+Fork
+----
+
+The SHSTK's vma has VM_SHSTK flag set; its PTEs are required to be
+read-only and dirty.  When a SHSTK PTE is not present, RO, and dirty,
+a SHSTK access triggers a page fault with an additional SHSTK bit set
+in the page fault error code.
+
+When a task forks a child, its SHSTK PTEs are copied and both the
+parent's and the child's SHSTK PTEs are cleared of the dirty bit.
+Upon the next SHSTK access, the resulting SHSTK page fault is handled
+by page copy/re-use.
+
+When a pthread child is created, the kernel allocates a new SHSTK for
+the new thread.
+
+Setjmp/Longjmp
+--------------
+
+Longjmp unwinds SHSTK until it matches the program stack.
+
+Ucontext
+--------
+
+In GLIBC, getcontext/setcontext is implemented in similar way as
+setjmp/longjmp.
+
+When makecontext creates a new ucontext, a new SHSTK is allocated for
+that context with ARCH_X86_CET_ALLOC_SHSTK the syscall.  The kernel
+creates a restore token at the top of the new SHSTK and the user-mode
+code switches to the new SHSTK with the RSTORSSP instruction.
+
+[7] The management of read-only & dirty PTEs for SHSTK
+======================================================
+
+A RO and dirty PTE exists in the following cases:
+
+(a) A page is modified and then shared with a fork()'ed child;
+(b) A R/O page that has been COW'ed;
+(c) A SHSTK page.
+
+The processor only checks the dirty bit for (c).  To prevent the use
+of non-SHSTK memory as SHSTK, we use a spare bit of the 64-bit PTE as
+DIRTY_SW for (a) and (b) above.  This results to the following PTE
+settings:
+
+Modified PTE:             (R/W + DIRTY_HW)
+Modified and shared PTE:  (R/O + DIRTY_SW)
+R/O PTE, COW'ed:          (R/O + DIRTY_SW)
+SHSTK PTE:                (R/O + DIRTY_HW)
+SHSTK PTE, COW'ed:        (R/O + DIRTY_HW)
+SHSTK PTE, shared:        (R/O + DIRTY_SW)
+
+Note that DIRTY_SW is only used in R/O PTEs but not R/W PTEs.
+
+[8] The implementation of IBT
+=============================
+
+The kernel provides IBT support in mmap() of the legacy code bit map.
+However, the management of the bitmap is done in the GLIBC or the
+application.
-- 
2.17.1

  reply	other threads:[~2019-06-06 20:15 UTC|newest]

Thread overview: 161+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-06 20:06 [PATCH v7 00/27] Control-flow Enforcement: Shadow Stack Yu-cheng Yu
2019-06-06 20:06 ` Yu-cheng Yu
2019-06-06 20:06 ` Yu-cheng Yu [this message]
2019-06-06 20:06   ` [PATCH v7 01/27] Documentation/x86: Add CET description Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 02/27] x86/cpufeatures: Add CET CPU feature flags for Control-flow Enforcement Technology (CET) Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 03/27] x86/fpu/xstate: Change names to separate XSAVES system and user states Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 04/27] x86/fpu/xstate: Introduce XSAVES system states Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 21:18   ` Dave Hansen
2019-06-06 21:18     ` Dave Hansen
2019-06-06 22:04     ` Andy Lutomirski
2019-06-06 22:04       ` Andy Lutomirski
2019-06-06 22:08       ` Dave Hansen
2019-06-06 22:08         ` Dave Hansen
2019-06-06 22:10         ` Yu-cheng Yu
2019-06-06 22:10           ` Yu-cheng Yu
2019-06-06 22:10           ` Yu-cheng Yu
2019-06-07  1:54         ` Andy Lutomirski
2019-06-07  1:54           ` Andy Lutomirski
2019-06-06 20:06 ` [PATCH v7 05/27] x86/fpu/xstate: Add XSAVES system states for shadow stack Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-07  7:07   ` Peter Zijlstra
2019-06-07  7:07     ` Peter Zijlstra
2019-06-07 16:14     ` Yu-cheng Yu
2019-06-07 16:14       ` Yu-cheng Yu
2019-06-07 16:14       ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 06/27] x86/cet: Add control protection exception handler Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 07/27] x86/cet/shstk: Add Kconfig option for user-mode shadow stack Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 08/27] mm: Introduce VM_SHSTK for shadow stack memory Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 09/27] mm/mmap: Prevent Shadow Stack VMA merges Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 10/27] x86/mm: Change _PAGE_DIRTY to _PAGE_DIRTY_HW Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 11/27] x86/mm: Introduce _PAGE_DIRTY_SW Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 12/27] drm/i915/gvt: Update _PAGE_DIRTY to _PAGE_DIRTY_BITS Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 13/27] x86/mm: Modify ptep_set_wrprotect and pmdp_set_wrprotect for _PAGE_DIRTY_SW Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 14/27] x86/mm: Shadow stack page fault error checking Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 15/27] mm: Handle shadow stack page fault Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-07  7:30   ` Peter Zijlstra
2019-06-07  7:30     ` Peter Zijlstra
2019-06-06 20:06 ` [PATCH v7 16/27] mm: Handle THP/HugeTLB " Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 17/27] mm: Update can_follow_write_pte/pmd for shadow stack Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 18/27] mm: Introduce do_mmap_locked() Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-07  7:43   ` Peter Zijlstra
2019-06-07  7:43     ` Peter Zijlstra
2019-06-07  7:47     ` Peter Zijlstra
2019-06-07  7:47       ` Peter Zijlstra
2019-06-07 16:16       ` Yu-cheng Yu
2019-06-07 16:16         ` Yu-cheng Yu
2019-06-07 16:16         ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 19/27] x86/cet/shstk: User-mode shadow stack support Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 20/27] x86/cet/shstk: Introduce WRUSS instruction Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 21/27] x86/cet/shstk: Handle signals for shadow stack Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-07  7:58   ` Peter Zijlstra
2019-06-07  7:58     ` Peter Zijlstra
2019-06-07 16:17     ` Yu-cheng Yu
2019-06-07 16:17       ` Yu-cheng Yu
2019-06-07 16:17       ` Yu-cheng Yu
2019-06-07 18:01   ` Dave Martin
2019-06-07 18:01     ` Dave Martin
2019-06-10 16:29     ` Yu-cheng Yu
2019-06-10 16:29       ` Yu-cheng Yu
2019-06-10 16:29       ` Yu-cheng Yu
2019-06-10 16:57       ` Dave Martin
2019-06-10 16:57         ` Dave Martin
2019-06-10 17:24       ` Florian Weimer
2019-06-10 17:24         ` Florian Weimer
2019-06-10 17:24         ` Florian Weimer
2019-06-11 11:41         ` Dave Martin
2019-06-11 11:41           ` Dave Martin
2019-06-11 19:31           ` Yu-cheng Yu
2019-06-11 19:31             ` Yu-cheng Yu
2019-06-11 19:31             ` Yu-cheng Yu
2019-06-12  9:32             ` Dave Martin
2019-06-12  9:32               ` Dave Martin
2019-06-12 19:04               ` Yu-cheng Yu
2019-06-12 19:04                 ` Yu-cheng Yu
2019-06-12 19:04                 ` Yu-cheng Yu
2019-06-13 13:26                 ` Dave Martin
2019-06-13 13:26                   ` Dave Martin
2019-06-17 11:08               ` Florian Weimer
2019-06-17 11:08                 ` Florian Weimer
2019-06-17 11:08                 ` Florian Weimer
2019-06-17 12:20                 ` Thomas Gleixner
2019-06-17 12:20                   ` Thomas Gleixner
2019-06-17 12:20                   ` Thomas Gleixner
2019-06-18  9:12                   ` Dave Martin
2019-06-18  9:12                     ` Dave Martin
2019-06-18 12:41                     ` Peter Zijlstra
2019-06-18 12:41                       ` Peter Zijlstra
2019-06-18 12:47                       ` Florian Weimer
2019-06-18 12:47                         ` Florian Weimer
2019-06-18 12:47                         ` Florian Weimer
2019-06-18 12:55                         ` Peter Zijlstra
2019-06-18 12:55                           ` Peter Zijlstra
2019-06-18 13:32                           ` Dave Martin
2019-06-18 13:32                             ` Dave Martin
2019-06-18 13:32                             ` Dave Martin
2019-06-18 13:32                             ` Dave Martin
2019-06-18 14:58                             ` Yu-cheng Yu
2019-06-18 14:58                               ` Yu-cheng Yu
2019-06-18 14:58                               ` Yu-cheng Yu
2019-06-18 15:49                               ` Florian Weimer
2019-06-18 15:49                                 ` Florian Weimer
2019-06-18 15:49                                 ` Florian Weimer
2019-06-18 15:53                                 ` Yu-cheng Yu
2019-06-18 15:53                                   ` Yu-cheng Yu
2019-06-18 15:53                                   ` Yu-cheng Yu
2019-06-18 16:05                                   ` Florian Weimer
2019-06-18 16:05                                     ` Florian Weimer
2019-06-18 16:05                                     ` Florian Weimer
2019-06-18 16:00                                     ` Yu-cheng Yu
2019-06-18 16:00                                       ` Yu-cheng Yu
2019-06-18 16:00                                       ` Yu-cheng Yu
2019-06-18 16:20                                       ` Dave Martin
2019-06-18 16:20                                         ` Dave Martin
2019-06-18 16:25                                         ` Florian Weimer
2019-06-18 16:25                                           ` Florian Weimer
2019-06-18 16:25                                           ` Florian Weimer
2019-06-18 16:50                                           ` Dave Martin
2019-06-18 16:50                                             ` Dave Martin
2019-06-06 20:06 ` [PATCH v7 23/27] x86/cet/shstk: ELF header parsing of Shadow Stack Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-07  7:54   ` Peter Zijlstra
2019-06-07  7:54     ` Peter Zijlstra
2019-06-06 20:06 ` [PATCH v7 24/27] x86/cet/shstk: Handle thread shadow stack Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 25/27] mm/mmap: Add Shadow stack pages to memory accounting Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-11 17:55   ` Dave Hansen
2019-06-11 17:55     ` Dave Hansen
2019-06-11 19:22     ` Yu-cheng Yu
2019-06-11 19:22       ` Yu-cheng Yu
2019-06-11 19:22       ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 26/27] x86/cet/shstk: Add arch_prctl functions for Shadow Stack Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-06-06 20:06 ` [PATCH v7 27/27] x86/cet/shstk: Add Shadow Stack instructions to opcode map Yu-cheng Yu
2019-06-06 20:06   ` Yu-cheng Yu
2019-11-01 14:03   ` Adrian Hunter
2019-11-01 14:03     ` Adrian Hunter
2019-11-01 14:17     ` Yu-cheng Yu
2019-11-01 14:17       ` Yu-cheng Yu
2019-11-01 14:17       ` Yu-cheng Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190606200646.3951-2-yu-cheng.yu@intel.com \
    --to=yu-cheng.yu@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=bsingharora@gmail.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=esyr@redhat.com \
    --cc=fweimer@redhat.com \
    --cc=gorcunov@gmail.com \
    --cc=hjl.tools@gmail.com \
    --cc=hpa@zytor.com \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@amacapital.net \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=oleg@redhat.com \
    --cc=pavel@ucw.cz \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=rdunlap@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vedvyas.shanbhogue@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.