* [PATCH v4 0/3] Intel MPX support
@ 2014-02-12 18:36 Qiaowei Ren
  2014-02-12 18:36 ` [PATCH v4 1/3] x86, mpx: add documentation on Intel MPX Qiaowei Ren
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Qiaowei Ren @ 2014-02-12 18:36 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar
  Cc: x86, linux-kernel, Qiaowei Ren

This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, catching references whose compile-time intent is subverted
at runtime by buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms for legacy
software components. The MPX architecture is designed to allow a machine
to run both MPX-enabled software and legacy software that is MPX-unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To take advantage of MPX, changes are required in the OS kernel,
binutils, the compiler, and the system libraries.

A new GCC option, -fmpx, is introduced to utilize MPX instructions.
GCC sources with MPX support are currently available in a separate
branch of the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To get full protection, we had to add MPX instrumentation to all the
necessary Glibc routines (e.g. memcpy) written in assembler, and
compile Glibc with the MPX-enabled GCC compiler. MPX-enabled Glibc
sources can currently be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates, but some runtime code, responsible for configuring and
enabling MPX, is needed in order to make use of it. For most
applications this runtime support will be available by linking to a
library supplied by the compiler, or it may come directly from the OS
once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main responsibilities:
providing handlers for bounds faults (#BR), and managing bounds memory.

No hardware with the MPX ISA is available yet, but it is always
possible to use the SDE (Intel(R) Software Development Emulator)
instead, which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator


Changes since v1:
  * check whether #BR occurred in user space or kernel space.
  * use generic structures and macros as much as possible when
    decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors in the documentation, and document
    the extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Qiaowei Ren (3):
  x86, mpx: add documentation on Intel MPX
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
    information

 Documentation/x86/intel_mpx.txt    |  239 ++++++++++++++++++++++++++
 arch/x86/include/asm/mpx.h         |   54 ++++++
 arch/x86/kernel/Makefile           |    1 +
 arch/x86/kernel/mpx.c              |  333 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/traps.c            |   61 +++++++-
 include/uapi/asm-generic/siginfo.h |    9 +-
 kernel/signal.c                    |    4 +
 7 files changed, 699 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c


* [PATCH v4 1/3] x86, mpx: add documentation on Intel MPX
  2014-02-12 18:36 [PATCH v4 0/3] Intel MPX support Qiaowei Ren
@ 2014-02-12 18:36 ` Qiaowei Ren
  2014-02-12 18:36 ` [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables Qiaowei Ren
  2014-02-12 18:36 ` [PATCH v4 3/3] x86, mpx: extend siginfo structure to include bound violation information Qiaowei Ren
  2 siblings, 0 replies; 7+ messages in thread
From: Qiaowei Ren @ 2014-02-12 18:36 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar
  Cc: x86, linux-kernel, Qiaowei Ren

This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 Documentation/x86/intel_mpx.txt |  239 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 239 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 0000000..9af8636
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,239 @@
+1. Intel(R) MPX Overview
+========================
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, catching references whose
+compile-time intent is subverted at runtime by buffer overflow
+or underflow.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. The MPX architecture is designed to
+allow a machine (i.e., the processor(s) and the OS software)
+to run both MPX-enabled software and legacy software that
+is MPX-unaware. In such a case, the legacy software does not
+benefit from MPX, but it also does not experience any change
+in functionality or reduction in performance.
+
+Intel(R) MPX Programming Model
+------------------------------
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+--------------------------------------------
+
+Intel MPX defines two instructions (BNDSTX and BNDLDX) for storing
+and loading the linear address of a pointer to a buffer, along with
+the bounds of that buffer, into a paging-like structure of extended
+bounds. Specifically, when storing extended bounds, the processor
+performs address translation from the address where the pointer is
+stored to an address in the Bound Table (BT) in order to determine
+the store location of the extended bounds. Loading an extended
+bounds entry performs the reverse sequence.
+
+The in-memory structure used to load/store an extended bound is a
+4-tuple consisting of a lower bound, an upper bound, the pointer
+value and a reserved field. Bound loads and stores use a 32-bit
+or 64-bit operand size according to the operating mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
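+
+As an illustration only (no such structure is defined by this patch
+set or by the hardware; it merely pictures the 64-bit layout), a
+single bound table entry can be thought of as:
+
+	struct bt_entry {
+		unsigned long lower_bound;
+		unsigned long upper_bound;
+		unsigned long pointer_value;
+		unsigned long reserved;
+	};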
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. The linear address of the bound directory
+is derived from either the BNDCFGU or the BNDCFGS register.
+Bounds in memory are stored in Bound Tables (BT) as extended
+bounds, which are accessed via the Bound Directory (BD) and the
+address translation performed by the BNDLDX/BNDSTX instructions.
+
+The Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in the case
+of kernel use, the structures will be in kernel memory). The bound
+directory and each bound table instance occupy contiguous linear
+memory.
+
+XSAVE/XRSTOR Support of Intel MPX State
+---------------------------------------
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+    (BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 are
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
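+
+As a small illustration (not part of this patch set), user code could
+check whether the OS has enabled both MPX state components before
+relying on them. This sketch assumes the BNDREGS and BNDCSR bits are
+bits 3 and 4 of XCR0, as described in the ISA extensions reference:
+
+	static int mpx_xstate_enabled(void)
+	{
+		unsigned int eax, edx;
+
+		/* xgetbv with ecx = 0 reads XCR0 */
+		asm volatile (".byte 0x0f,0x01,0xd0"
+				: "=a" (eax), "=d" (edx) : "c" (0));
+		return (eax & 0x18) == 0x18;	/* BNDREGS | BNDCSR */
+	}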
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+
+2. How to get the advantage of MPX
+==================================
+
+
+To get the advantage of MPX, changes are required in
+the OS kernel, binutils, the compiler, and the system libraries.
+
+MPX support in the GNU toolchain
+--------------------------------
+
+This section describes changes in GNU Binutils, GCC and Glibc
+to support MPX.
+
+The first step of MPX support is to implement support for the new
+hardware features in binutils and GCC.
+
+The second step is the implementation of an MPX instrumentation
+pass in GCC, which is responsible for instrumenting all memory
+accesses with pointer checks. Compiler changes for runtime bound
+checks include:
+
+  * Bounds creation for statically allocated objects, objects
+    allocated on the stack and statically initialized pointers.
+
+  * MPX support in the ABI: the ABI extension allows passing bounds
+    along with pointers passed as function arguments and returning
+    bounds along with returned pointers.
+
+  * Bounds table content management: each pointer that is stored
+    into memory should have its bounds stored in the corresponding
+    row of the bounds table; the compiler generates appropriate code
+    to keep the bounds table in a consistent state.
+
+  * Memory access instrumentation: the compiler analyzes data flow
+    to compute the bounds corresponding to each memory access and
+    inserts code to check the used address against the computed bounds.
+
+Objects created dynamically on the heap need to have their bounds
+set by the memory allocator at allocation time. So the next step
+is to add MPX support to the standard memory allocators in Glibc.
+
+To get full protection, an application has to use libraries
+compiled with MPX instrumentation. This means we had to compile
+Glibc with the MPX-enabled GCC compiler, because it is used by
+most applications. We also had to add MPX instrumentation to all
+the necessary Glibc routines (e.g. memcpy) written in assembler.
+
+A new GCC option, -fmpx, is introduced to utilize MPX instructions.
+An MPX-enabled binutils should also be used to get binaries with
+memory protection.
+
+Consider the following simple test for MPX compiled program:
+
+	int main(int argc, char *argv[])
+	{
+		int buf[100];
+		return buf[argc];
+	}
+
+Snippet of the original assembler output (compiled with -O2):
+
+	movslq  %edi, %rdi
+	movl    -120(%rsp,%rdi,4), %eax  // memory access buf[argc]
+
+Compile test as follows: mpx-gcc/gcc test.c -fmpx -O2
+
+Resulting assembler snippet:
+
+        movl    $399, %edx
+        movslq  %edi, %rdi	// rdi contains value of argc
+        leaq    -104(%rsp), %rax	// load start address of buf to rax
+        bndmk   (%rax,%rdx), %bnd0	//  create bounds for buf
+        bndcl   (%rax,%rdi,4), %bnd0	// check that memory access doesn't
+					// violate buf's low bound
+        bndcu   3(%rax,%rdi,4), %bnd0	// check that memory access doesn't
+					// violate buf's upper bound
+        movl    -104(%rsp,%rdi,4), %eax	// original memory access
+
+The code is fairly clear. Note only that a displacement of 3 is added
+for the upper-bound check, since this is a 4-byte (integer) access.
+
+Several MPX-specific compiler options besides -fmpx were introduced
+in the compiler. Most of them, like -fmpx-check-read and
+-fmpx-check-write, control the number of inserted runtime bound
+checks. Developers can also always use intrinsics to insert MPX
+instructions manually.
+
+GCC sources with MPX support are currently available in a
+separate branch of the common GCC SVN repository. See the GCC SVN
+page (http://gcc.gnu.org/svn.html) for details.
+
+No hardware with the MPX ISA is available yet, but it is always
+possible to use the SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+MPX runtime support
+-------------------
+
+Enabling an application to use MPX will generally not require source
+code updates, but some runtime code is needed in order to make use
+of MPX. For most applications this runtime support will be available
+by linking to a library supplied by the compiler, or it may come
+directly from the OS once OS versions that support MPX are available.
+
+The runtime is responsible for configuring and enabling MPX. The
+configuration and enabling of MPX consist of the runtime writing
+the base address of the Bound Directory (BD) to the BNDCFGU register
+and setting the enable bit.
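+
+For illustration only (the exact enabling mechanism is left to the
+runtime), the BNDCFGU value could be composed as in this sketch, which
+assumes bit 0 is the enable bit and borrows the address-mask layout
+used by this patch set:
+
+	#define MPX_ENABLE_BIT		0x1UL
+
+	unsigned long bndcfgu = (bd_base & MPX_BNDCFG_ADDR_MASK) |
+				MPX_ENABLE_BIT;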
+
+MPX kernel support
+------------------
+
+The MPX kernel code has the following main responsibilities:
+
+1) Providing handlers for bounds faults (#BR).
+
+When MPX is enabled, there are two new situations that can generate
+a #BR fault. If a bounds violation occurs, a #BR is generated. The
+fault handler will decode the faulting MPX instruction to get the
+violation address and place this address into the extended struct
+siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+		/* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+		struct {
+			void __user *_addr; /* faulting insn/memory ref. */
+#ifdef __ARCH_SI_TRAPNO
+			int _trapno;	/* TRAP # which caused the signal */
+#endif
+			short _addr_lsb; /* LSB of the reported address */
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
+		} _sigfault;
+
+The '_addr' field refers to the violation address, and the new
+'_addr_bnd' field contains the lower/upper bounds when a #BR is caused.
+
+The other case that generates a #BR is when a BNDSTX instruction
+attempts to save bounds to a BD entry marked as invalid. This is
+an indication that no BT exists for this entry. In this case the
+fault handler will allocate a new BT.
+
+2) Managing bounds memory.
+
+MPX defines four bounds registers (BND0-BND3). When an application
+needs more than four sets of bounds, it uses the BNDSTX instruction
+to save the additional bounds out to memory. The kernel dynamically
+allocates the memory used to store these bounds. The bounds memory
+is organized into a 2-level structure consisting of a BD, which
+contains pointers to a set of Bound Tables (BT), which in turn
+contain the actual bound information. In order to minimize Intel MPX
+memory usage, the BTs are allocated on demand (in this patchset, by
+the kernel's #BR fault handler).
-- 
1.7.1


* [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables
  2014-02-12 18:36 [PATCH v4 0/3] Intel MPX support Qiaowei Ren
  2014-02-12 18:36 ` [PATCH v4 1/3] x86, mpx: add documentation on Intel MPX Qiaowei Ren
@ 2014-02-12 18:36 ` Qiaowei Ren
  2014-02-12 20:19   ` Andy Lutomirski
  2014-02-12 18:36 ` [PATCH v4 3/3] x86, mpx: extend siginfo structure to include bound violation information Qiaowei Ren
  2 siblings, 1 reply; 7+ messages in thread
From: Qiaowei Ren @ 2014-02-12 18:36 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar
  Cc: x86, linux-kernel, Qiaowei Ren

An access to an invalid bound directory entry will cause a #BR
exception. This patch hooks the #BR exception handler to allocate
one bound table and bind it to that bound directory entry.

This avoids the need to forward the #BR exception to user space
when the bound directory has an invalid entry.
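
For reference, the 64-bit translation from a pointer's linear address
(ptr below) to its bound directory and bound table entries can be
sketched with the constants this patch adds; this is a sketch only,
using assumed bd_base/bt_base variables, and the actual walk is
performed in hardware by the BNDLDX/BNDSTX instructions, not by the
kernel:

	/* BD entries are 8 bytes, BT entries are 32 bytes (64-bit). */
	unsigned long bd_index = (ptr >> (MPX_IGN_BITS + MPX_L2_BITS)) &
					((1UL << MPX_L1_BITS) - 1);
	unsigned long bd_entry = bd_base + (bd_index << MPX_L1_SHIFT);

	unsigned long bt_index = (ptr >> MPX_IGN_BITS) &
					((1UL << MPX_L2_BITS) - 1);
	unsigned long bt_entry = bt_base + (bt_index << MPX_L2_SHIFT);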

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 arch/x86/include/asm/mpx.h |   35 ++++++++++++++++++++++++++++
 arch/x86/kernel/Makefile   |    1 +
 arch/x86/kernel/mpx.c      |   44 +++++++++++++++++++++++++++++++++++
 arch/x86/kernel/traps.c    |   55 +++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 134 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 0000000..d074153
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,35 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+#define MPX_L1_BITS	28
+#define MPX_L1_SHIFT	3
+#define MPX_L2_BITS	17
+#define MPX_L2_SHIFT	5
+#define MPX_IGN_BITS	3
+#define MPX_L2_NODE_ADDR_MASK	0xfffffffffffffff8UL
+
+#define MPX_BNDSTA_ADDR_MASK	0xfffffffffffffffcUL
+#define MPX_BNDCFG_ADDR_MASK	0xfffffffffffff000UL
+
+#else
+
+#define MPX_L1_BITS	20
+#define MPX_L1_SHIFT	2
+#define MPX_L2_BITS	10
+#define MPX_L2_SHIFT	4
+#define MPX_IGN_BITS	2
+#define MPX_L2_NODE_ADDR_MASK	0xfffffffcUL
+
+#define MPX_BNDSTA_ADDR_MASK	0xfffffffcUL
+#define MPX_BNDCFG_ADDR_MASK	0xfffff000UL
+
+#endif
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index cb648c8..becb970 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT)	+= preempt.o
 
 obj-y				+= process.o
 obj-y				+= i387.o xsave.o
+obj-y				+= mpx.o
 obj-y				+= ptrace.o
 obj-$(CONFIG_X86_32)		+= tls.o
 obj-$(CONFIG_IA32_EMULATION)	+= tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 0000000..e055e0e
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,44 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/processor.h>
+#include <asm/mpx.h>
+#include <asm/mman.h>
+#include <asm/i387.h>
+#include <asm/fpu-internal.h>
+#include <asm/alternative.h>
+
+static bool allocate_bt(unsigned long bd_entry)
+{
+	unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+	unsigned long bt_addr, old_val = 0;
+
+	bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+			MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
+	if (bt_addr == -1) {
+		pr_err("L2 Node Allocation Failed at L1 addr %lx\n",
+				bd_entry);
+		return false;
+	}
+	bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01;
+
+	user_atomic_cmpxchg_inatomic(&old_val,
+			(long __user *)bd_entry, 0, bt_addr);
+	if (old_val)
+		vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size);
+
+	return true;
+}
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+	unsigned long status;
+	unsigned long bd_entry, bd_base;
+	unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
+
+	bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+	status = xsave_buf->bndcsr.status_reg;
+
+	bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+	if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+		allocate_bt(bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 57409f6..fe09b3d 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include <asm/fixmap.h>
 #include <asm/mach_traps.h>
 #include <asm/alternative.h>
+#include <asm/mpx.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/x86_init.h>
@@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code)	\
 
 DO_ERROR_INFO(X86_TRAP_DE,     SIGFPE,  "divide error",			divide_error,		     FPE_INTDIV, regs->ip )
 DO_ERROR     (X86_TRAP_OF,     SIGSEGV, "overflow",			overflow					  )
-DO_ERROR     (X86_TRAP_BR,     SIGSEGV, "bounds",			bounds						  )
 DO_ERROR_INFO(X86_TRAP_UD,     SIGILL,  "invalid opcode",		invalid_op,		     ILL_ILLOPN, regs->ip )
 DO_ERROR     (X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment overrun",	coprocessor_segment_overrun			  )
 DO_ERROR     (X86_TRAP_TS,     SIGSEGV, "invalid TSS",			invalid_TSS					  )
@@ -263,6 +263,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 }
 #endif
 
+dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
+{
+	enum ctx_state prev_state;
+	unsigned long status;
+	struct xsave_struct *xsave_buf;
+	struct task_struct *tsk = current;
+
+	prev_state = exception_enter();
+	if (notify_die(DIE_TRAP, "bounds", regs, error_code,
+			X86_TRAP_BR, SIGSEGV) == NOTIFY_STOP)
+		goto exit;
+	conditional_sti(regs);
+
+	if (!user_mode(regs))
+		die("bounds", regs, error_code);
+
+	if (!boot_cpu_has(X86_FEATURE_MPX)) {
+		/* The exception is not from Intel MPX */
+		do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
+		goto exit;
+	}
+
+	fpu_xsave(&tsk->thread.fpu);
+	xsave_buf = &(tsk->thread.fpu.state->xsave);
+	status = xsave_buf->bndcsr.status_reg;
+
+	/*
+	 * The error code field of the BNDSTATUS register communicates status
+	 * information of a bound range exception #BR or operation involving
+	 * bound directory.
+	 */
+	switch (status & 0x3) {
+	case 2:
+		/*
+		 * Bound directory has invalid entry.
+		 * No signal will be sent to the user space.
+		 */
+		do_mpx_bt_fault(xsave_buf);
+		break;
+
+	case 1: /* Bound violation. */
+	case 0: /* No exception caused by Intel MPX operations. */
+		do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
+		break;
+
+	default:
+		die("bounds", regs, error_code);
+	}
+
+exit:
+	exception_exit(prev_state);
+}
+
 dotraplinkage void __kprobes
 do_general_protection(struct pt_regs *regs, long error_code)
 {
-- 
1.7.1


* [PATCH v4 3/3] x86, mpx: extend siginfo structure to include bound violation information
  2014-02-12 18:36 [PATCH v4 0/3] Intel MPX support Qiaowei Ren
  2014-02-12 18:36 ` [PATCH v4 1/3] x86, mpx: add documentation on Intel MPX Qiaowei Ren
  2014-02-12 18:36 ` [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables Qiaowei Ren
@ 2014-02-12 18:36 ` Qiaowei Ren
  2 siblings, 0 replies; 7+ messages in thread
From: Qiaowei Ren @ 2014-02-12 18:36 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar
  Cc: x86, linux-kernel, Qiaowei Ren

This patch adds new fields describing a bound violation to the
siginfo structure. si_lower and si_upper are, respectively, the
lower and upper bounds in effect when the bound violation occurred.

These fields will be set in the #BR exception handler by decoding
the user instruction and constructing the faulting pointer.
A userspace application can obtain the violation address and the
lower and upper bounds of the violation from this new siginfo structure.
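
As an illustration of how an application might consume the new fields,
a handler along these lines could be used (a sketch only; it assumes
headers that expose SEGV_BNDERR and the si_lower/si_upper members added
by this patch, and it prints from a signal handler purely for brevity):

	#include <signal.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	static void br_handler(int sig, siginfo_t *info, void *ctx)
	{
		if (info->si_code == SEGV_BNDERR)
			fprintf(stderr, "bound violation at %p, bounds [%p, %p]\n",
				info->si_addr, info->si_lower, info->si_upper);
		_exit(1);
	}

	/* registration, e.g. early in main() */
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = br_handler;
	sa.sa_flags = SA_SIGINFO;
	sigaction(SIGSEGV, &sa, NULL);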

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 arch/x86/include/asm/mpx.h         |   19 +++
 arch/x86/kernel/mpx.c              |  289 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/traps.c            |    6 +
 include/uapi/asm-generic/siginfo.h |    9 +-
 kernel/signal.c                    |    4 +
 5 files changed, 326 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index d074153..3129b1e 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <asm/ptrace.h>
+#include <asm/insn.h>
 
 #ifdef CONFIG_X86_64
 
@@ -30,6 +31,24 @@
 
 #endif
 
+struct mpx_insn {
+	struct insn_field rex_prefix;	/* REX prefix */
+	struct insn_field modrm;
+	struct insn_field sib;
+	struct insn_field displacement;
+
+	unsigned char addr_bytes;	/* effective address size */
+	unsigned char limit;
+	unsigned char x86_64;
+
+	const unsigned char *kaddr;	/* kernel address of insn to analyze */
+	const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE	15
+
 void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+		struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index e055e0e..f95abc2 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -7,6 +7,270 @@
 #include <asm/fpu-internal.h>
 #include <asm/alternative.h>
 
+typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+			     reg_type_t type)
+{
+	int regno = 0;
+	unsigned char modrm = (unsigned char)insn->modrm.value;
+	unsigned char sib = (unsigned char)insn->sib.value;
+
+	static const int regoff[] = {
+		offsetof(struct pt_regs, ax),
+		offsetof(struct pt_regs, cx),
+		offsetof(struct pt_regs, dx),
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, sp),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+		offsetof(struct pt_regs, r8),
+		offsetof(struct pt_regs, r9),
+		offsetof(struct pt_regs, r10),
+		offsetof(struct pt_regs, r11),
+		offsetof(struct pt_regs, r12),
+		offsetof(struct pt_regs, r13),
+		offsetof(struct pt_regs, r14),
+		offsetof(struct pt_regs, r15),
+#endif
+	};
+
+	switch (type) {
+	case REG_TYPE_RM:
+		regno = X86_MODRM_RM(modrm);
+		if (X86_REX_B(insn->rex_prefix.value) == 1)
+			regno += 8;
+		break;
+
+	case REG_TYPE_INDEX:
+		regno = X86_SIB_INDEX(sib);
+		if (X86_REX_X(insn->rex_prefix.value) == 1)
+			regno += 8;
+		break;
+
+	case REG_TYPE_BASE:
+		regno = X86_SIB_BASE(sib);
+		if (X86_REX_B(insn->rex_prefix.value) == 1)
+			regno += 8;
+		break;
+
+	default:
+		break;
+	}
+
+	return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address being referenced by the instruction:
+ * for rm == 3, return the content of the rm register;
+ * for rm != 3, calculate the address using the SIB byte and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+	unsigned long addr;
+	unsigned long base;
+	unsigned long indx;
+	unsigned char modrm = (unsigned char)insn->modrm.value;
+	unsigned char sib = (unsigned char)insn->sib.value;
+
+	if (X86_MODRM_MOD(modrm) == 3) {
+		addr = get_reg(insn, regs, REG_TYPE_RM);
+	} else {
+		if (insn->sib.nbytes) {
+			base = get_reg(insn, regs, REG_TYPE_BASE);
+			indx = get_reg(insn, regs, REG_TYPE_INDEX);
+			addr = base + indx * (1 << X86_SIB_SCALE(sib));
+		} else {
+			addr = get_reg(insn, regs, REG_TYPE_RM);
+		}
+		addr += insn->displacement.value;
+	}
+
+	return addr;
+}
+
+/* Verify next sizeof(t) bytes can be on the same instruction */
+#define validate_next(t, insn, n)	\
+	((insn)->next_byte + sizeof(t) + n - (insn)->kaddr <= (insn)->limit)
+
+#define __get_next(t, insn)		\
+({					\
+	t r = *(t *)insn->next_byte;	\
+	insn->next_byte += sizeof(t);	\
+	r;				\
+})
+
+#define __peek_next(t, insn)		\
+({					\
+	t r = *(t *)insn->next_byte;	\
+	r;				\
+})
+
+#define get_next(t, insn)		\
+({					\
+	if (unlikely(!validate_next(t, insn, 0)))	\
+		goto err_out;		\
+	__get_next(t, insn);		\
+})
+
+#define peek_next(t, insn)		\
+({					\
+	if (unlikely(!validate_next(t, insn, 0)))	\
+		goto err_out;		\
+	__peek_next(t, insn);		\
+})
+
+static void mpx_insn_get_prefixes(struct mpx_insn *insn)
+{
+	unsigned char b;
+
+	/* Decode legacy prefix and REX prefix */
+	b = peek_next(unsigned char, insn);
+	while (b != 0x0f) {
+		/*
+		 * look for a rex prefix
+		 * a REX prefix cannot be followed by a legacy prefix.
+		 */
+		if (insn->x86_64 && ((b&0xf0) == 0x40)) {
+			insn->rex_prefix.value = b;
+			insn->rex_prefix.nbytes = 1;
+			insn->next_byte++;
+			break;
+		}
+
+		/* check the other legacy prefixes */
+		switch (b) {
+		case 0xf2:
+		case 0xf3:
+		case 0xf0:
+		case 0x64:
+		case 0x65:
+		case 0x2e:
+		case 0x3e:
+		case 0x26:
+		case 0x36:
+		case 0x66:
+		case 0x67:
+			insn->next_byte++;
+			break;
+		default: /* everything else is garbage */
+			goto err_out;
+		}
+		b = peek_next(unsigned char, insn);
+	}
+
+err_out:
+	return;
+}
+
+static void mpx_insn_get_modrm(struct mpx_insn *insn)
+{
+	insn->modrm.value = get_next(unsigned char, insn);
+	insn->modrm.nbytes = 1;
+
+err_out:
+	return;
+}
+
+static void mpx_insn_get_sib(struct mpx_insn *insn)
+{
+	unsigned char modrm = (unsigned char)insn->modrm.value;
+
+	if (X86_MODRM_MOD(modrm) != 3 && X86_MODRM_RM(modrm) == 4) {
+		insn->sib.value = get_next(unsigned char, insn);
+		insn->sib.nbytes = 1;
+	}
+
+err_out:
+	return;
+}
+
+static void mpx_insn_get_displacement(struct mpx_insn *insn)
+{
+	unsigned char mod, rm, base;
+
+	/*
+	 * Interpreting the modrm byte:
+	 * mod = 00 - no displacement fields (exceptions below)
+	 * mod = 01 - 1-byte displacement field
+	 * mod = 10 - displacement field is 4 bytes
+	 * mod = 11 - no memory operand
+	 *
+	 * mod != 11, r/m = 100 - SIB byte exists
+	 * mod = 00, SIB base = 101 - displacement field is 4 bytes
+	 * mod = 00, r/m = 101 - rip-relative addressing, displacement
+	 *	field is 4 bytes
+	 */
+	mod = X86_MODRM_MOD(insn->modrm.value);
+	rm = X86_MODRM_RM(insn->modrm.value);
+	base = X86_SIB_BASE(insn->sib.value);
+	if (mod == 3)
+		return;
+	if (mod == 1) {
+		insn->displacement.value = get_next(unsigned char, insn);
+		insn->displacement.nbytes = 1;
+	} else if ((mod == 0 && rm == 5) || mod == 2 ||
+			(mod == 0 && base == 5)) {
+		insn->displacement.value = get_next(int, insn);
+		insn->displacement.nbytes = 4;
+	}
+
+err_out:
+	return;
+}
+
+static void mpx_insn_init(struct mpx_insn *insn, struct pt_regs *regs)
+{
+	unsigned char buf[MAX_MPX_INSN_SIZE];
+	int bytes;
+
+	memset(insn, 0, sizeof(*insn));
+
+	bytes = copy_from_user(buf, (void __user *)regs->ip, MAX_MPX_INSN_SIZE);
+	insn->limit = MAX_MPX_INSN_SIZE - bytes;
+	insn->kaddr = buf;
+	insn->next_byte = buf;
+
+	/*
+	 * In 64-bit Mode, all Intel MPX instructions use 64-bit
+	 * operands for bounds and 64 bit addressing, i.e. REX.W &
+	 * 67H have no effect on data or address size.
+	 *
+	 * In compatibility and legacy modes (including 16-bit code
+	 * segments, real and virtual 8086 modes) all Intel MPX
+	 * instructions use 32-bit operands for bounds and 32 bit
+	 * addressing.
+	 */
+#ifdef CONFIG_X86_64
+	insn->x86_64 = 1;
+	insn->addr_bytes = 8;
+#else
+	insn->x86_64 = 0;
+	insn->addr_bytes = 4;
+#endif
+}
+
+static unsigned long mpx_insn_decode(struct mpx_insn *insn,
+				     struct pt_regs *regs)
+{
+	mpx_insn_init(insn, regs);
+
+	/*
+	 * In this case, we only need to decode bndcl/bndcn/bndcu,
+	 * so we can use private disassembly interfaces to get the
+	 * prefixes, modrm, sib, displacement, etc.
+	 */
+	mpx_insn_get_prefixes(insn);
+	insn->next_byte += 2; /* ignore opcode */
+	mpx_insn_get_modrm(insn);
+	mpx_insn_get_sib(insn);
+	mpx_insn_get_displacement(insn);
+
+	return get_addr_ref(insn, regs);
+}
+
 static bool allocate_bt(unsigned long bd_entry)
 {
 	unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
@@ -42,3 +306,28 @@ void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 	if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
 		allocate_bt(bd_entry);
 }
+
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+		struct xsave_struct *xsave_buf)
+{
+	struct mpx_insn insn;
+	uint8_t bndregno;
+	unsigned long addr_vio;
+
+	addr_vio = mpx_insn_decode(&insn, regs);
+
+	bndregno = X86_MODRM_REG(insn.modrm.value);
+	if (bndregno > 3)
+		return;
+
+	/* Note: the upper 32 bits are ignored in 32-bit mode. */
+	info->si_lower = (void __user *)(unsigned long)
+		(xsave_buf->bndregs.bndregs[2*bndregno]);
+	info->si_upper = (void __user *)(unsigned long)
+		(~xsave_buf->bndregs.bndregs[2*bndregno+1]);
+	info->si_addr_lsb = 0;
+	info->si_signo = SIGSEGV;
+	info->si_errno = 0;
+	info->si_code = SEGV_BNDERR;
+	info->si_addr = (void __user *)addr_vio;
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index fe09b3d..9b1aa19 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -269,6 +269,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 	unsigned long status;
 	struct xsave_struct *xsave_buf;
 	struct task_struct *tsk = current;
+	siginfo_t info;
 
 	prev_state = exception_enter();
 	if (notify_die(DIE_TRAP, "bounds", regs, error_code,
@@ -304,6 +305,11 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 		break;
 
 	case 1: /* Bound violation. */
+		do_mpx_bounds(regs, &info, xsave_buf);
+		do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs,
+				error_code, &info);
+		break;
+
 	case 0: /* No exception caused by Intel MPX operations. */
 		do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
 		break;
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
 			int _trapno;	/* TRAP # which caused the signal */
 #endif
 			short _addr_lsb; /* LSB of the reported address */
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
 		} _sigfault;
 
 		/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno	_sifields._sigfault._trapno
 #endif
 #define si_addr_lsb	_sifields._sigfault._addr_lsb
+#define si_lower	_sifields._sigfault._addr_bnd._lower
+#define si_upper	_sifields._sigfault._addr_bnd._upper
 #define si_band		_sifields._sigpoll._band
 #define si_fd		_sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR	(__SI_FAULT|1)	/* address not mapped to object */
 #define SEGV_ACCERR	(__SI_FAULT|2)	/* invalid permissions for mapped object */
-#define NSIGSEGV	2
+#define SEGV_BNDERR	(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV	3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 52f881d..b9ea074 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2771,6 +2771,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from)
 		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
 			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+		err |= __put_user(from->si_lower, &to->si_lower);
+		err |= __put_user(from->si_upper, &to->si_upper);
+#endif
 		break;
 	case __SI_CHLD:
 		err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1


* Re: [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables
  2014-02-12 18:36 ` [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables Qiaowei Ren
@ 2014-02-12 20:19   ` Andy Lutomirski
  2014-02-21  2:44     ` Ren Qiaowei
  0 siblings, 1 reply; 7+ messages in thread
From: Andy Lutomirski @ 2014-02-12 20:19 UTC (permalink / raw)
  To: Qiaowei Ren, H. Peter Anvin, Thomas Gleixner, Ingo Molnar
  Cc: x86, linux-kernel

On 02/12/2014 10:36 AM, Qiaowei Ren wrote:
> An access to an invalid bound directory entry will cause a #BR
> exception. This patch hook #BR exception handler to allocate
> one bound table and bind it with that buond directory entry.
> 
> This will avoid the need of forwarding the #BR exception
> to the user space when bound directory has invalid entry.
> 
> Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
> ---
> +void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
> +{
> +	unsigned long status;
> +	unsigned long bd_entry, bd_base;
> +	unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
> +
> +	bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
> +	status = xsave_buf->bndcsr.status_reg;
> +
> +	bd_entry = status & MPX_BNDSTA_ADDR_MASK;
> +	if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
> +		allocate_bt(bd_entry);
> +}

This still just loops on failure, right?

--Andy

* Re: [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables
  2014-02-12 20:19   ` Andy Lutomirski
@ 2014-02-21  2:44     ` Ren Qiaowei
  2014-02-26 19:00       ` Andy Lutomirski
  0 siblings, 1 reply; 7+ messages in thread
From: Ren Qiaowei @ 2014-02-21  2:44 UTC (permalink / raw)
  To: Andy Lutomirski, H. Peter Anvin, Thomas Gleixner, Ingo Molnar
  Cc: x86, linux-kernel

On 02/13/2014 04:19 AM, Andy Lutomirski wrote:
> On 02/12/2014 10:36 AM, Qiaowei Ren wrote:
>> An access to an invalid bound directory entry will cause a #BR
>> exception. This patch hook #BR exception handler to allocate
>> one bound table and bind it with that buond directory entry.
>>
>> This will avoid the need of forwarding the #BR exception
>> to the user space when bound directory has invalid entry.
>>
>> Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
>> ---
>> +void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
>> +{
>> +	unsigned long status;
>> +	unsigned long bd_entry, bd_base;
>> +	unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
>> +
>> +	bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
>> +	status = xsave_buf->bndcsr.status_reg;
>> +
>> +	bd_entry = status & MPX_BNDSTA_ADDR_MASK;
>> +	if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
>> +		allocate_bt(bd_entry);
>> +}
>
> This still just loops on failure, right?
>
Seems like SIGBUS should be raised if the allocation fails.

	if (!do_mpx_bt_fault(xsave_buf))
		force_sig(SIGBUS, tsk);

Thanks,
Qiaowei


* Re: [PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables
  2014-02-21  2:44     ` Ren Qiaowei
@ 2014-02-26 19:00       ` Andy Lutomirski
  0 siblings, 0 replies; 7+ messages in thread
From: Andy Lutomirski @ 2014-02-26 19:00 UTC (permalink / raw)
  To: Ren Qiaowei
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel

On Thu, Feb 20, 2014 at 6:44 PM, Ren Qiaowei <qiaowei.ren@intel.com> wrote:
> On 02/13/2014 04:19 AM, Andy Lutomirski wrote:
>>
>> On 02/12/2014 10:36 AM, Qiaowei Ren wrote:
>>>
>>> An access to an invalid bound directory entry will cause a #BR
>>> exception. This patch hook #BR exception handler to allocate
>>> one bound table and bind it with that buond directory entry.
>>>
>>> This will avoid the need of forwarding the #BR exception
>>> to the user space when bound directory has invalid entry.
>>>
>>> Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
>>> ---
>>> +void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
>>> +{
>>> +       unsigned long status;
>>> +       unsigned long bd_entry, bd_base;
>>> +       unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
>>> +
>>> +       bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
>>> +       status = xsave_buf->bndcsr.status_reg;
>>> +
>>> +       bd_entry = status & MPX_BNDSTA_ADDR_MASK;
>>> +       if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
>>> +               allocate_bt(bd_entry);
>>> +}
>>
>>
>> This still just loops on failure, right?
>>
> Seems like that SIGBUS should be raised if the allocation fail.
>
>         if (!do_mpx_bt_fault(xsave_buf))
>                 force_sig(SIGBUS, tsk);

I wonder if this should go through the force_sig_info path.

--Andy

>
> Thanks,
> Qiaowei
>



-- 
Andy Lutomirski
AMA Capital Management, LLC
