linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention
@ 2017-10-27 23:51 Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 01/13] x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit Ricardo Neri
                   ` (12 more replies)
  0 siblings, 13 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri

This the 10th revision of a patchset to enable User-Mode Instruction
Prevention (UMIP) in Linux. This is the second part of two series. Instead
of submitting a single huge series. It would be better to split the series
for easier review and handling. The first part of the series can be found
here [1]. In this series, support is added to handle 32-bit and 16-bit
addresses in protected and virtual-8086 modes plus effectively enabling
UMIP. See below for a description on why this support is needed.

=== What is UMIP?

User-Mode Instruction Prevention (UMIP) is a security feature present in
new Intel Processors. If enabled, it prevents the execution of certain
instructions if the Current Privilege Level (CPL) is greater than 0. If
these instructions were executed while in CPL > 0, user space applications
could have access to system-wide settings such as the global and local
descriptor tables, the segment selectors to the current task state and the
local descriptor table. Hiding these system resources reduces the tools
available to craft privilege escalation attacks such as [2].

These are the instructions covered by UMIP:
* SGDT - Store Global Descriptor Table
* SIDT - Store Interrupt Descriptor Table
* SLDT - Store Local Descriptor Table
* SMSW - Store Machine Status Word
* STR - Store Task Register

If any of these instructions is executed with CPL > 0, a general protection
exception is issued when UMIP is enabled. This means that any process that
attempts to use the aforementioned instructions would see a SIGSEGV signal.

=== How does it impact applications?

When enabled, however, UMIP will change the behavior that certain
applications expect from the operating system. For instance, programs
running on WineHQ and DOSEMU2 rely on some of these instructions to
function. Stas Sergeev found that Microsoft Windows 3.1 and dos4gw use the
instruction SMSW when running in virtual-8086 mode [3]. SGDT and SIDT can
also be used on virtual-8086 mode.

In order to not change the behavior of the system (i.e., a SIGSEGV signal
should not be generated when using these instructions), this implementation
traps the #GP fault generated by the CPU and emulates SGDT, SIDT and SMSW.
with dummy returned values. This should be sufficient to not break the
applications mentioned above. Regarding the two remaining instructions,
STR and SLDT, the WineHQ team has shown interest catching the general
protection fault and use it as a vehicle to fix broken applications[4].

Thus, emulation is only provided for protected and virtual-8086 modes. No
emulation is implemented for processes running in long mode. Also the
instructions SLDT and STR are not emulated in any case.

DOSEMU2 emulates virtual-8086 mode via KVM. No applications will be broken
unless DOSEMU2 decides to enable the CR4.UMIP bit in platforms that support
it. Also, this should not pose a security risk as no system resources would
be revealed. Instead, code running inside the KVM would only see the KVM's
GDT, IDT and MSW.

=== How is this series laid out?

++ Extend the insn-eval library
This library is extended to also support 32-bit and 16 bit addresses. This
also implies to add functionality to enforce segment limits in protected
mode and linear address sizes in virtual-8086 mode as described in the
section 20.1.1 of the Intel 64 and IA-32 Architectures Software Development
Manual Vol. 3.

++ Emulate UMIP instructions
A new fixup_umip_exception() functions inspect the instruction at the
instruction pointer. If it is an UMIP-protected instruction, it executes
the emulation code.

++ Add self-tests
Lastly, self-tests are added to entry_from_v86.c to exercise the most
typical use cases of UMIP-protected instructions in a virtual-8086 mode.

++ Extensive tests
Extensive tests were performed to test all the combinations of ModRM,
SiB and displacements for 16-bit and 32-bit encodings for the SS, DS,
ES, FS and GS segments. Tests also include a 64-bit program that uses
segmentation via FS and GS. For this purpose, I temporarily enabled UMIP
support for 64-bit process. This change is not part of this patchset.
The intention is to test the computations of linear addresses in 64-bit
mode, including the extra R8-R15 registers. Extensive test is also
implemented for virtual-8086 tasks. Code of these tests can be found here
[5] and here [6].

Thanks and BR,
Ricardo

[1]. https://www.spinics.net/lists/kernel/msg2635138.html
[2]. http://timetobleed.com/a-closer-look-at-a-recent-privilege-escalation-bug-in-linux-cve-2013-2094/
[3]. https://www.winehq.org/pipermail/wine-devel/2017-April/117159.html
[4]. https://marc.info/?l=linux-kernel&m=147876798717927&w=2
[5]. https://github.com/01org/luv-yocto/tree/rneri/umip/meta-luv/recipes-core/umip/files
[6]. https://github.com/01org/luv-yocto/commit/a72a7fe7d68693c0f4100ad86de6ecabde57334f#diff-3860c136a63add269bce4ea50222c248R1

Changes since V9:
*All the changes described in [1], plus:
*Created new utility functions utility functions to handle each x86
 memory addressing mode separately. Also, such functions are extended to
 support 32-bit addresses. A separate set of function is used for 16-bit
 addresses.
*Extended and rename the function get_seg_base_addr() as
 get_seg_base_addr() to also return the limit of the segment associated
 with the instruction operand.
 
Ricardo Neri (13):
  x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit
  x86/insn-eval: Compute linear address in several utility functions
  x86/insn-eval: Add support to resolve 32-bit address encodings
  x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
  x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
  x86/insn-eval: Add support to resolve 16-bit address encodings
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
  x86: Add emulation code for UMIP instructions
  x86/umip: Force a page fault when unable to copy emulated result to
    user
  x86: Enable User-Mode Instruction Prevention
  x86/traps: Fixup general protection faults caused by UMIP
  selftests/x86: Add tests for User-Mode Instruction Prevention
  selftests/x86: Add tests for instruction str and sldt

 arch/x86/Kconfig                              |  10 +
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/disabled-features.h      |   8 +-
 arch/x86/include/asm/umip.h                   |  12 +
 arch/x86/include/uapi/asm/processor-flags.h   |   2 +
 arch/x86/kernel/Makefile                      |   1 +
 arch/x86/kernel/cpu/common.c                  |  25 +-
 arch/x86/kernel/traps.c                       |   5 +
 arch/x86/kernel/umip.c                        | 353 ++++++++++++++
 arch/x86/lib/insn-eval.c                      | 637 +++++++++++++++++++++++---
 tools/testing/selftests/x86/entry_from_vm86.c |  89 +++-
 11 files changed, 1078 insertions(+), 65 deletions(-)
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v10 01/13] x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-11-02  9:37   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions Ricardo Neri
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

In protected mode, it is common to want to obtain the limit of a segment
along with its base address. This is useful, for instance, to verify that
an effective address lies within a segment before computing a linear
address.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Up to this point, this library only computes linear addresses in long
mode. Subsequent patches will include support for protected mode. Support
to verify the segment limit will be needed.
---
 arch/x86/lib/insn-eval.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 1c23ec0..91f08aa 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -729,25 +729,29 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
 }
 
 /**
- * get_seg_base_addr() - obtain base address of a segment
+ * get_seg_base_limit() - obtain base address and limit of a segment
  * @insn:	Instruction. Must be valid.
  * @regs:	Register values as seen when entering kernel mode
  * @regoff:	Operand offset, in pt_regs, used to resolve segment descriptor
  * @base:	Obtained segment base
+ * @limit:	Obtained segment limit
  *
- * Obtain the base address of the segment associated with the operand @regoff
- * and, if any or allowed, override prefixes in @insn. This function is
+ * Obtain the base address and limit of the segment associated with the operand
+ * @regoff and, if any or allowed, override prefixes in @insn. This function is
  * different from insn_get_seg_base() as the latter does not resolve the segment
- * associated with the instruction operand.
+ * associated with the instruction operand. If a limit is not needed (e.g.,
+ * when running in long mode), @limit can be NULL.
  *
  * Returns:
  *
- * 0 on success. @base will contain the base address of the resolved segment.
+ * 0 on success. @base and @limit will contain the base address and of the
+ * resolved segment, respectively.
  *
  * -EINVAL on error.
  */
-static int get_seg_base_addr(struct insn *insn, struct pt_regs *regs,
-			     int regoff, unsigned long *base)
+static int get_seg_base_limit(struct insn *insn, struct pt_regs *regs,
+			      int regoff, unsigned long *base,
+			      unsigned long *limit)
 {
 	int seg_reg_idx;
 
@@ -762,6 +766,13 @@ static int get_seg_base_addr(struct insn *insn, struct pt_regs *regs,
 	if (*base == -1L)
 		return -EINVAL;
 
+	if (!limit)
+		return 0;
+
+	*limit = get_seg_limit(regs, seg_reg_idx);
+	if (!(*limit))
+		return -EINVAL;
+
 	return 0;
 }
 
@@ -843,7 +854,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		eff_addr += insn->displacement.value;
 	}
 
-	ret = get_seg_base_addr(insn, regs, addr_offset, &seg_base);
+	ret = get_seg_base_limit(insn, regs, addr_offset, &seg_base, NULL);
 	if (ret)
 		goto out;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 01/13] x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-11-02  8:51   ` Ingo Molnar
  2017-10-27 23:51 ` [PATCH v10 03/13] x86/insn-eval: Add support to resolve 32-bit address encodings Ricardo Neri
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Computing a linear address involves several steps. The first step is to
compute the effective address. This involves determining the addressing
mode in use and perform arithmetic operations on the operands. Plus, each
addressing mode has special cases that must be handled.

Once the effective address is known, the base address of the applicable
segment is added to obtain the linear address.

Clearly, this is too much work for a single function. Instead, handle each
addressing mode in a separate utility function. This improves readability
and gives us the opportunity to handler errors better.

At the moment, arithmetic to compute the effective address uses 8-byte
variables. Thus, limit support to 64-bit addresses.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 243 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 186 insertions(+), 57 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 91f08aa..4aa3c48 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -776,6 +776,182 @@ static int get_seg_base_limit(struct insn *insn, struct pt_regs *regs,
 	return 0;
 }
 
+/**
+ * get_eff_addr_reg() - Obtain effective address from register operand
+ * @insn:	Instruction. Must be valid.
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Obtained operand offset, in pt_regs, with the effective address
+ * @eff_addr:	Obtained effective address
+ *
+ * Obtain the effective address stored in the register operand as indicated by
+ * the ModRM byte. This function is to be used only with register addressing
+ * (i.e.,  ModRM.mod is 3). The effective address is saved in @eff_addr. The
+ * register operand, as an offset from the base of pt_regs, is saved in @regoff;
+ * such offset can then be used to resolve the segment associated with the
+ * operand. This function can be used with any of the supported address sizes
+ * in x86.
+ *
+ * Returns:
+ *
+ * 0 on success. @eff_addr will have the effective address stored in the
+ * operand indicated by ModRM. @regoff will have such operand as an offset from
+ * the base of pt_regs.
+ *
+ * -EINVAL on error.
+ */
+static int get_eff_addr_reg(struct insn *insn, struct pt_regs *regs,
+			    int *regoff, long *eff_addr)
+{
+	insn_get_modrm(insn);
+
+	if (!insn->modrm.nbytes)
+		return -EINVAL;
+
+	if (X86_MODRM_MOD(insn->modrm.value) != 3)
+		return -EINVAL;
+
+	*regoff = get_reg_offset(insn, regs, REG_TYPE_RM);
+	if (*regoff < 0)
+		return -EINVAL;
+
+	*eff_addr = (long)regs_get_register(regs, *regoff);
+
+	return 0;
+}
+
+/**
+ * get_eff_addr_modrm() - Obtain referenced effective address via ModRM
+ * @insn:	Instruction. Must be valid.
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Obtained operand offset, in pt_regs, associated with segment
+ * @eff_addr:	Obtained effective address
+ *
+ * Obtain the effective address referenced by the ModRM byte of @insn. After
+ * identifying the registers involved in the register-indirect memory reference,
+ * its value is obtained from the operands in @regs. The computed address is
+ * stored @eff_addr. Also, the register operand that indicates the associated
+ * segment is stored in @regoff, this parameter can later be used to determine
+ * such segment. This function can be used for both 32-bit and 64-bit effective
+ * addresses.
+ *
+ * Returns:
+ *
+ * 0 on success. @eff_addr will have the referenced effective address. @regoff
+ * will have a register, as an offset from the base of pt_regs, that can be used
+ * to resolve the associated segment.
+ *
+ * -EINVAL on error.
+ */
+static int get_eff_addr_modrm(struct insn *insn, struct pt_regs *regs,
+			      int *regoff, long *eff_addr)
+{
+	long tmp;
+
+	if (insn->addr_bytes != 8)
+		return -EINVAL;
+
+	insn_get_modrm(insn);
+
+	if (!insn->modrm.nbytes)
+		return -EINVAL;
+
+	if (X86_MODRM_MOD(insn->modrm.value) > 2)
+		return -EINVAL;
+
+	*regoff = get_reg_offset(insn, regs, REG_TYPE_RM);
+	/*
+	 * -EDOM means that we must ignore the address_offset. In such a case,
+	 * in 64-bit mode the effective address relative to the RIP of the
+	 * following instruction.
+	 */
+	if (*regoff == -EDOM) {
+		if (user_64bit_mode(regs))
+			tmp = (long)regs->ip + insn->length;
+		else
+			tmp = 0;
+	} else if (*regoff < 0) {
+		return -EINVAL;
+	} else {
+		tmp = (long)regs_get_register(regs, *regoff);
+	}
+
+	*eff_addr = tmp + insn->displacement.value;
+
+	return 0;
+}
+
+/**
+ * get_eff_addr_sib() - Obtain referenced effective address via SIB
+ * @insn:	Instruction. Must be valid.
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Obtained operand offset, in pt_regs, associated with segment
+ * @eff_addr:	Obtained effective address
+ *
+ * Obtain the effective address referenced by the SIB byte of @insn. After
+ * identifying the registers involved in the indexed, register-indirect memory
+ * reference, its value is obtained from the operands in @regs. The computed
+ * address is stored @eff_addr. Also, the register operand that indicates the
+ * associated segment is stored in @regoff, this parameter can later be used to
+ * determine such segment. This function can be used for both 32-bit and 64-bit
+ * effective addresses.
+ *
+ * Returns:
+ *
+ * 0 on success. @eff_addr will have the referenced effective address. @regoff
+ * will have a register, as an offset from the base of pt_regs, that can be used
+ * to resolve the associated segment.
+ *
+ * -EINVAL on error.
+ */
+static int get_eff_addr_sib(struct insn *insn, struct pt_regs *regs,
+			    int *base_offset, long *eff_addr)
+{
+	long base, indx;
+	int indx_offset;
+
+	if (insn->addr_bytes != 8)
+		return -EINVAL;
+
+	insn_get_modrm(insn);
+
+	if (!insn->modrm.nbytes)
+		return -EINVAL;
+
+	if (X86_MODRM_MOD(insn->modrm.value) > 2)
+		return -EINVAL;
+
+	insn_get_sib(insn);
+
+	if (!insn->sib.nbytes)
+		return -EINVAL;
+
+	*base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
+	indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
+	/*
+	 * Negative values in the base and index offset means an error when
+	 * decoding the SIB byte. Except -EDOM, which means that the registers
+	 * should not be used in the address computation.
+	 */
+	if (*base_offset == -EDOM)
+		base = 0;
+	else if (*base_offset < 0)
+		return -EINVAL;
+	else
+		base = (long)regs_get_register(regs, *base_offset);
+
+	if (indx_offset == -EDOM)
+		indx = 0;
+	else if (indx_offset < 0)
+		return -EINVAL;
+	else
+		indx = (long)regs_get_register(regs, indx_offset);
+
+	*eff_addr = base + indx * (1 << X86_SIB_SCALE(insn->sib.value));
+
+	*eff_addr += insn->displacement.value;
+
+	return 0;
+}
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
@@ -783,75 +959,28 @@ static int get_seg_base_limit(struct insn *insn, struct pt_regs *regs,
  */
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	int addr_offset, base_offset, indx_offset, ret;
 	unsigned long linear_addr = -1L, seg_base;
-	long eff_addr, base, indx;
-	insn_byte_t sib;
-
-	insn_get_modrm(insn);
-	insn_get_sib(insn);
-	sib = insn->sib.value;
+	int addr_offset, ret;
+	long eff_addr;
 
 	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
-		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-		if (addr_offset < 0)
+		ret = get_eff_addr_reg(insn, regs, &addr_offset, &eff_addr);
+		if (ret)
 			goto out;
 
-		eff_addr = regs_get_register(regs, addr_offset);
-
 	} else {
 		if (insn->sib.nbytes) {
-			/*
-			 * Negative values in the base and index offset means
-			 * an error when decoding the SIB byte. Except -EDOM,
-			 * which means that the registers should not be used
-			 * in the address computation.
-			 */
-			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-			if (base_offset == -EDOM)
-				base = 0;
-			else if (base_offset < 0)
+			ret = get_eff_addr_sib(insn, regs, &addr_offset,
+					       &eff_addr);
+			if (ret)
 				goto out;
-			else
-				base = regs_get_register(regs, base_offset);
-
-			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-
-			if (indx_offset == -EDOM)
-				indx = 0;
-			else if (indx_offset < 0)
-				goto out;
-			else
-				indx = regs_get_register(regs, indx_offset);
-
-			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
-
-			/*
-			 * The base determines the segment used to compute
-			 * the linear address.
-			 */
-			addr_offset = base_offset;
-
 		} else {
-			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-			/*
-			 * -EDOM means that we must ignore the address_offset.
-			 * In such a case, in 64-bit mode the effective address
-			 * relative to the RIP of the following instruction.
-			 */
-			if (addr_offset == -EDOM) {
-				if (user_64bit_mode(regs))
-					eff_addr = (long)regs->ip + insn->length;
-				else
-					eff_addr = 0;
-			} else if (addr_offset < 0) {
+			ret = get_eff_addr_modrm(insn, regs, &addr_offset,
+						 &eff_addr);
+			if (ret)
 				goto out;
-			} else {
-				eff_addr = regs_get_register(regs, addr_offset);
-			}
 		}
 
-		eff_addr += insn->displacement.value;
 	}
 
 	ret = get_seg_base_limit(insn, regs, addr_offset, &seg_base, NULL);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 03/13] x86/insn-eval: Add support to resolve 32-bit address encodings
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 01/13] x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 04/13] x86/insn-eval: Add wrapper function for 32 and 64-bit addresses Ricardo Neri
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

32-bit and 64-bit address encodings are identical. Thus, the same logic
could be used to resolve the effective address. However, there are two key
differences: address size and enforcement of segment limits.

If running a 32-bit process on a 64-bit kernel, it is best to perform
the address calculation using 32-bit data types. In this manner hardware
is used for the arithmetic, including handling of signs and overflows.

32-bit addresses are generally used in protected mode; segment limits are
enforced in this mode. This implementation obtains the limit of the
segment associated with the instruction operands and prefixes. If the
computed address is outside the segment limits, an error is returned. It
is also possible to use 32-bit address in long mode and virtual-8086 mode
by using an address override prefix. In such cases, segment limits are not
enforced.

Add support to use 32-bit arithmetic to the utility functions that compute
effective addresses. Once the effective address is computed, ignore
the bytes that are outside the address size as given by the instruction
structure.

The new function get_addr_ref_32() is almost identical to the existing
function insn_get_addr_ref() (used for 64-bit addresses). The only
difference is that it verifies that the effective address is within the
limits of the segment.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 113 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 107 insertions(+), 6 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 4aa3c48..7133a47 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -814,7 +814,12 @@ static int get_eff_addr_reg(struct insn *insn, struct pt_regs *regs,
 	if (*regoff < 0)
 		return -EINVAL;
 
-	*eff_addr = (long)regs_get_register(regs, *regoff);
+	/* Ignore bytes that are outside the address size */
+	if (insn->addr_bytes == 4)
+		*eff_addr = (long)(regs_get_register(regs, *regoff) &
+				   0xffffffff);
+	else /* 64-bit address */
+		*eff_addr = (long)regs_get_register(regs, *regoff);
 
 	return 0;
 }
@@ -847,7 +852,7 @@ static int get_eff_addr_modrm(struct insn *insn, struct pt_regs *regs,
 {
 	long tmp;
 
-	if (insn->addr_bytes != 8)
+	if (insn->addr_bytes != 8 && insn->addr_bytes != 4)
 		return -EINVAL;
 
 	insn_get_modrm(insn);
@@ -875,7 +880,13 @@ static int get_eff_addr_modrm(struct insn *insn, struct pt_regs *regs,
 		tmp = (long)regs_get_register(regs, *regoff);
 	}
 
-	*eff_addr = tmp + insn->displacement.value;
+	if (insn->addr_bytes == 4) {
+		int addr32 = (int)(tmp & 0xffffffff) + insn->displacement.value;
+
+		*eff_addr = (long)(addr32 & 0xffffffff);
+	} else {
+		*eff_addr = tmp + insn->displacement.value;
+	}
 
 	return 0;
 }
@@ -909,7 +920,7 @@ static int get_eff_addr_sib(struct insn *insn, struct pt_regs *regs,
 	long base, indx;
 	int indx_offset;
 
-	if (insn->addr_bytes != 8)
+	if (insn->addr_bytes != 8 && insn->addr_bytes != 4)
 		return -EINVAL;
 
 	insn_get_modrm(insn);
@@ -946,12 +957,102 @@ static int get_eff_addr_sib(struct insn *insn, struct pt_regs *regs,
 	else
 		indx = (long)regs_get_register(regs, indx_offset);
 
-	*eff_addr = base + indx * (1 << X86_SIB_SCALE(insn->sib.value));
+	if (insn->addr_bytes == 4) {
+		int addr32, base32, idx32;
+
+		base32 = (int)(base & 0xffffffff);
+		idx32 = (int)(indx & 0xffffffff);
 
-	*eff_addr += insn->displacement.value;
+		addr32 = base32 + idx32 * (1 << X86_SIB_SCALE(insn->sib.value));
+		addr32 += insn->displacement.value;
+
+		*eff_addr = (long)(addr32 & 0xffffffff);
+	} else {
+		*eff_addr = base + indx * (1 << X86_SIB_SCALE(insn->sib.value));
+		*eff_addr += insn->displacement.value;
+	}
 
 	return 0;
 }
+
+/**
+ * get_addr_ref_32() - Obtain a 32-bit linear address
+ * @insn:	Instruction with ModRM, SIB bytes and displacement
+ * @regs:	Register values as seen when entering kernel mode
+ *
+ * This function is to be used with 32-bit address encodings to obtain the
+ * linear memory address referred by the instruction's ModRM, SIB,
+ * displacement bytes and segment base address, as applicable. If in protected
+ * mode, segment limits are enforced.
+ *
+ * Returns:
+ *
+ * Linear address referenced by instruction and registers on success.
+ *
+ * -1L on error.
+ */
+static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs)
+{
+	unsigned long linear_addr = -1L, seg_base, seg_limit;
+	int eff_addr, regoff;
+	long tmp;
+	int ret;
+
+	if (insn->addr_bytes != 4)
+		goto out;
+
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		ret = get_eff_addr_reg(insn, regs, &regoff, &tmp);
+		if (ret)
+			goto out;
+
+		eff_addr = (int)tmp;
+
+	} else {
+		if (insn->sib.nbytes) {
+			ret = get_eff_addr_sib(insn, regs, &regoff, &tmp);
+			if (ret)
+				goto out;
+
+			eff_addr = (int)tmp;
+		} else {
+			ret = get_eff_addr_modrm(insn, regs, &regoff, &tmp);
+			if (ret)
+				goto out;
+
+			eff_addr = (int)tmp;
+		}
+	}
+
+	ret = get_seg_base_limit(insn, regs, regoff, &seg_base,
+				 &seg_limit);
+	if (ret)
+		goto out;
+	/*
+	 * In protected mode, before computing the linear address, make sure
+	 * the effective address is within the limits of the segment.
+	 * 32-bit addresses can be used in long and virtual-8086 modes if an
+	 * address override prefix is used. In such cases, segment limits are
+	 * not enforced. When in virtual-8086 mode, the segment limit is -1L
+	 * to reflect this situation.
+	 *
+	 * After computed, the effective address is treated as an unsigned
+	 * quantity.
+	 */
+	if (!user_64bit_mode(regs) && ((unsigned int)eff_addr > seg_limit))
+		goto out;
+
+	/*
+	 * Data type long could be 64 bits in size. Ensure that our 32-bit
+	 * effective address is not sign-extended when computing the linear
+	 * address.
+	 */
+	linear_addr = (unsigned long)(eff_addr & 0xffffffff) + seg_base;
+
+out:
+	return (void __user *)linear_addr;
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 04/13] x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (2 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 03/13] x86/insn-eval: Add support to resolve 32-bit address encodings Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 05/13] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode Ricardo Neri
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

The function insn_get_addr_ref() is capable of handling only 64-bit
addresses. A previous commit introduced a function to handle 32-bit
addresses. Invoke these two functions from a third wrapper function that
calls the appropriate routine based on the address size specified in the
instruction structure (obtained by looking at the code segment default
address size and the address override prefix, if present).

While doing this, rename the original function insn_get_addr_ref() with
the more appropriate name get_addr_ref_64(), ensure it is only used
for 64-bit addresses.

Also, since 64-bit addresses are not possible in 32-bit builds, provide
a dummy function such case.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 60 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 55 insertions(+), 5 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 7133a47..d5618ee 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -1053,17 +1053,36 @@ static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs)
 	return (void __user *)linear_addr;
 }
 
-/*
- * return the address being referenced be instruction
- * for rm=3 returning the content of the rm reg
- * for rm!=3 calculates the address using SIB and Disp
+/**
+ * get_addr_ref_64() - Obtain a 64-bit linear address
+ * @insn:	Instruction struct with ModRM and SIB bytes and displacement
+ * @regs:	Structure with register values as seen when entering kernel mode
+ *
+ * This function is to be used with 64-bit address encodings to obtain the
+ * linear memory address referred by the instruction's ModRM, SIB,
+ * displacement bytes and segment base address, as applicable.
+ *
+ * Returns:
+ *
+ * Linear address referenced by instruction and registers on success.
+ *
+ * -1L on error.
  */
-void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+#ifndef CONFIG_X86_64
+static void __user *get_addr_ref_64(struct insn *insn, struct pt_regs *regs)
+{
+	return (void __user *)-1L;
+}
+#else
+static void __user *get_addr_ref_64(struct insn *insn, struct pt_regs *regs)
 {
 	unsigned long linear_addr = -1L, seg_base;
 	int addr_offset, ret;
 	long eff_addr;
 
+	if (insn->addr_bytes != 8)
+		goto out;
+
 	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
 		ret = get_eff_addr_reg(insn, regs, &addr_offset, &eff_addr);
 		if (ret)
@@ -1093,3 +1112,34 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 out:
 	return (void __user *)linear_addr;
 }
+#endif /* CONFIG_X86_64 */
+
+/**
+ * insn_get_addr_ref() - Obtain the linear address referred by instruction
+ * @insn:	Instruction structure containing ModRM byte and displacement
+ * @regs:	Structure with register values as seen when entering kernel mode
+ *
+ * Obtain the linear address referred by the instruction's ModRM, SIB and
+ * displacement bytes, and segment base, as applicable. In protected mode,
+ * segment limits are enforced.
+ *
+ * Returns:
+ *
+ * Linear address referenced by instruction and registers on success.
+ *
+ * -1L on error.
+ */
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+{
+	if (!insn || !regs)
+		return (void __user *)-1L;
+
+	switch (insn->addr_bytes) {
+	case 4:
+		return get_addr_ref_32(insn, regs);
+	case 8:
+		return get_addr_ref_64(insn, regs);
+	default:
+		return (void __user *)-1L;
+	}
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 05/13] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (3 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 04/13] x86/insn-eval: Add wrapper function for 32 and 64-bit addresses Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 06/13] x86/insn-eval: Add support to resolve 16-bit address encodings Ricardo Neri
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

It is possible to utilize 32-bit address encodings in virtual-8086 mode via
an address override instruction prefix. However, the range of the
effective address is still limited to [0x-0xffff]. In such a case, return
error.

Also, linear addresses in virtual-8086 mode are limited to 20 bits. Enforce
such limit by truncating the most significant bytes of the computed linear
address.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index d5618ee..66d597d 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -1043,12 +1043,23 @@ static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs)
 		goto out;
 
 	/*
+	 * Even though 32-bit address encodings are allowed in virtual-8086
+	 * mode, the address range is still limited to [0x-0xffff].
+	 */
+	if (v8086_mode(regs) && (eff_addr & ~0xffff))
+		goto out;
+
+	/*
 	 * Data type long could be 64 bits in size. Ensure that our 32-bit
 	 * effective address is not sign-extended when computing the linear
 	 * address.
 	 */
 	linear_addr = (unsigned long)(eff_addr & 0xffffffff) + seg_base;
 
+	/* Limit linear address to 20 bits */
+	if (v8086_mode(regs))
+		linear_addr &= 0xfffff;
+
 out:
 	return (void __user *)linear_addr;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 06/13] x86/insn-eval: Add support to resolve 16-bit address encodings
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (4 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 05/13] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 07/13] x86/cpufeature: Add User-Mode Instruction Prevention definitions Ricardo Neri
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Tasks running in virtual-8086 mode, in protected mode with code segment
descriptors that specify 16-bit default address sizes via the
D bit, or via an address override prefix will use 16-bit addressing form
encodings as described in the Intel 64 and IA-32 Architecture Software
Developer's Manual Volume 2A Section 2.1.5, Table 2-1.

16-bit addressing encodings differ in several ways from the 32-bit/64-bit
addressing form encodings: ModRM.rm points to different registers and, in
some cases, effective addresses are indicated by the addition of the value
of two registers. Also, there is no support for SIB bytes. Thus, a
separate function is needed to parse this form of addressing.

Three functions are introduced. get_reg_offset_16() obtains the
offset from the base of pt_regs of the registers indicated by the ModRM
byte of the address encoding. get_eff_addr_modrm_16() computes the
effective address from the value of the register operands.
get_addr_ref_16() computes the linear address using the obtained effective
address and the base address of the segment.

Segment limits are enforced when running in protected mode.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 213 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 212 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 66d597d..cee168b 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -481,6 +481,80 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_reg_offset_16() - Obtain offset of register indicated by instruction
+ * @insn:	Instruction containing ModRM byte
+ * @regs:	Register values as seen when entering kernel mode
+ * @offs1:	Offset of the first operand register
+ * @offs2:	Offset of the second opeand register, if applicable
+ *
+ * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
+ * in @insn. This function is to be used with 16-bit address encodings. The
+ * @offs1 and @offs2 will be written with the offset of the two registers
+ * indicated by the instruction. In cases where any of the registers is not
+ * referenced by the instruction, the value will be set to -EDOM.
+ *
+ * Returns:
+ *
+ * 0 on success, -EINVAL on error.
+ */
+static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
+			     int *offs1, int *offs2)
+{
+	/*
+	 * 16-bit addressing can use one or two registers. Specifics of
+	 * encodings are given in Table 2-1. "16-Bit Addressing Forms with the
+	 * ModR/M Byte" of the Intel Software Development Manual.
+	 */
+	static const int regoff1[] = {
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, bx),
+	};
+
+	static const int regoff2[] = {
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		-EDOM,
+		-EDOM,
+		-EDOM,
+		-EDOM,
+	};
+
+	if (!offs1 || !offs2)
+		return -EINVAL;
+
+	/* Operand is a register, use the generic function. */
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		*offs1 = insn_get_modrm_rm_off(insn, regs);
+		*offs2 = -EDOM;
+		return 0;
+	}
+
+	*offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
+	*offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
+
+	/*
+	 * If ModRM.mod is 0 and ModRM.rm is 110b, then we use displacement-
+	 * only addressing. This means that no registers are involved in
+	 * computing the effective address. Thus, ensure that the first
+	 * register offset is invalild. The second register offset is already
+	 * invalid under the aforementioned conditions.
+	 */
+	if ((X86_MODRM_MOD(insn->modrm.value) == 0) &&
+	    (X86_MODRM_RM(insn->modrm.value) == 6))
+		*offs1 = -EDOM;
+
+	return 0;
+}
+
+/**
  * get_desc() - Obtain pointer to a segment descriptor
  * @sel:	Segment selector
  *
@@ -815,7 +889,9 @@ static int get_eff_addr_reg(struct insn *insn, struct pt_regs *regs,
 		return -EINVAL;
 
 	/* Ignore bytes that are outside the address size */
-	if (insn->addr_bytes == 4)
+	if (insn->addr_bytes == 2)
+		*eff_addr = (long)(regs_get_register(regs, *regoff) & 0xffff);
+	else if (insn->addr_bytes == 4)
 		*eff_addr = (long)(regs_get_register(regs, *regoff) &
 				   0xffffffff);
 	else /* 64-bit address */
@@ -892,6 +968,74 @@ static int get_eff_addr_modrm(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_eff_addr_modrm_16() - Obtain referenced effective address via ModRM
+ * @insn:	Instruction. Must be valid.
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Obtained operand offset, in pt_regs, associated with segment
+ * @eff_addr:	Obtained effective address
+ *
+ * Obtain the 16-bit effective address referenced by the ModRM byte of @insn.
+ * After identifying the registers involved in the register-indirect memory
+ * reference, its value is obtained from the operands in @regs. The computed
+ * address is stored @eff_addr. Also, the register operand that indicates
+ * the associated segment is stored in @regoff, this parameter can later be used
+ * to determine such segment.
+ *
+ * Returns:
+ *
+ * 0 on success. @eff_addr will have the referenced effective address. @regoff
+ * will have a register, as an offset from the base of pt_regs, that can be used
+ * to resolve the associated segment.
+ *
+ * -EINVAL on error.
+ */
+static int get_eff_addr_modrm_16(struct insn *insn, struct pt_regs *regs,
+				 int *regoff, short *eff_addr)
+{
+	int addr_offset1, addr_offset2, ret;
+	short addr1 = 0, addr2 = 0;
+
+	if (insn->addr_bytes != 2)
+		return -EINVAL;
+
+	insn_get_modrm(insn);
+
+	if (!insn->modrm.nbytes)
+		return -EINVAL;
+
+	if (X86_MODRM_MOD(insn->modrm.value) > 2)
+		return -EINVAL;
+
+	ret = get_reg_offset_16(insn, regs, &addr_offset1,
+				&addr_offset2);
+	if (ret < 0)
+		return -EINVAL;
+
+	/*
+	 * Don't fail on invalid offset values. They might be invalid because
+	 * they cannot be used for this particular value of ModRM. Instead, use
+	 * them in the computation only if they contain a valid value.
+	 */
+	if (addr_offset1 != -EDOM)
+		addr1 = (short)(0xffff & regs_get_register(regs, addr_offset1));
+
+	if (addr_offset2 != -EDOM)
+		addr2 = (short)(0xffff & regs_get_register(regs, addr_offset2));
+
+	*eff_addr = addr1 + addr2 + (short)(insn->displacement.value & 0xffff);
+
+	/*
+	 * The first operand register could indicate to use of either SS or DS
+	 * registers to obtain the segment selector.  The second operand
+	 * register can only indicate the use of DS. Thus, the first operand
+	 * will be used to obtain the segment selector.
+	 */
+	*regoff = addr_offset1;
+
+	return 0;
+}
+
+/**
  * get_eff_addr_sib() - Obtain referenced effective address via SIB
  * @insn:	Instruction. Must be valid.
  * @regs:	Register values as seen when entering kernel mode
@@ -976,6 +1120,71 @@ static int get_eff_addr_sib(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_addr_ref_16() - Obtain the 16-bit address referred by instruction
+ * @insn:	Instruction containing ModRM byte and displacement
+ * @regs:	Register values as seen when entering kernel mode
+ *
+ * This function is to be used with 16-bit address encodings. Obtain the memory
+ * address referred by the instruction's ModRM and displacement bytes. Also, the
+ * segment used as base is determined by either any segment override prefixes in
+ * @insn or the default segment of the registers involved in the address
+ * computation. In protected mode, segment limits are enforced.
+ *
+ * Returns:
+ *
+ * Linear address referenced by the instruction operands on success.
+ *
+ * -1L on error.
+ */
+static void __user *get_addr_ref_16(struct insn *insn, struct pt_regs *regs)
+{
+	unsigned long linear_addr = -1L, seg_base, seg_limit;
+	int ret, regoff;
+	short eff_addr;
+	long tmp;
+
+	insn_get_modrm(insn);
+	insn_get_displacement(insn);
+
+	if (insn->addr_bytes != 2)
+		goto out;
+
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		ret = get_eff_addr_reg(insn, regs, &regoff, &tmp);
+		if (ret)
+			goto out;
+
+		eff_addr = (short)tmp;
+	} else {
+		ret = get_eff_addr_modrm_16(insn, regs, &regoff, &eff_addr);
+		if (ret)
+			goto out;
+	}
+
+	ret = get_seg_base_limit(insn, regs, regoff, &seg_base, &seg_limit);
+	if (ret)
+		goto out;
+
+	/*
+	 * Before computing the linear address, make sure the effective address
+	 * is within the limits of the segment. In virtual-8086 mode, segment
+	 * limits are not enforced. In such a case, the segment limit is -1L to
+	 * reflect this fact.
+	 */
+	if ((unsigned long)(eff_addr & 0xffff) > seg_limit)
+		goto out;
+
+	linear_addr = (unsigned long)(eff_addr & 0xffff) + seg_base;
+
+	/* Limit linear address to 20 bits */
+	if (v8086_mode(regs))
+		linear_addr &= 0xfffff;
+
+out:
+	return (void __user *)linear_addr;
+}
+
+/**
  * get_addr_ref_32() - Obtain a 32-bit linear address
  * @insn:	Instruction with ModRM, SIB bytes and displacement
  * @regs:	Register values as seen when entering kernel mode
@@ -1146,6 +1355,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		return (void __user *)-1L;
 
 	switch (insn->addr_bytes) {
+	case 2:
+		return get_addr_ref_16(insn, regs);
 	case 4:
 		return get_addr_ref_32(insn, regs);
 	case 8:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 07/13] x86/cpufeature: Add User-Mode Instruction Prevention definitions
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (5 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 06/13] x86/insn-eval: Add support to resolve 16-bit address encodings Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 08/13] x86: Add emulation code for UMIP instructions Ricardo Neri
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Fenghua Yu,
	Tony Luck

User-Mode Instruction Prevention is a security feature present in new
Intel processors that, when set, prevents the execution of a subset of
instructions if such instructions are executed in user mode (CPL > 0).
Attempting to execute such instructions causes a general protection
exception.

The subset of instructions comprises:

 * SGDT - Store Global Descriptor Table
 * SIDT - Store Interrupt Descriptor Table
 * SLDT - Store Local Descriptor Table
 * SMSW - Store Machine Status Word
 * STR  - Store Task Register

This feature is also added to the list of disabled-features to allow
a cleaner handling of build-time configuration.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/cpufeatures.h          | 1 +
 arch/x86/include/asm/disabled-features.h    | 8 +++++++-
 arch/x86/include/uapi/asm/processor-flags.h | 2 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 401a709..ca0cc2d 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -297,6 +297,7 @@
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
 #define X86_FEATURE_AVX512VBMI  (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
+#define X86_FEATURE_UMIP	(16*32+ 2) /* User Mode Instruction Protection */
 #define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
 #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index c10c912..14d6d50 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -16,6 +16,12 @@
 # define DISABLE_MPX	(1<<(X86_FEATURE_MPX & 31))
 #endif
 
+#ifdef CONFIG_X86_INTEL_UMIP
+# define DISABLE_UMIP	0
+#else
+# define DISABLE_UMIP	(1<<(X86_FEATURE_UMIP & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME		(1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR	(1<<(X86_FEATURE_K6_MTRR & 31))
@@ -63,7 +69,7 @@
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
-#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57)
+#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index 39946d0..cf4c876 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -104,6 +104,8 @@
 #define X86_CR4_OSFXSR		_BITUL(X86_CR4_OSFXSR_BIT)
 #define X86_CR4_OSXMMEXCPT_BIT	10 /* enable unmasked SSE exceptions */
 #define X86_CR4_OSXMMEXCPT	_BITUL(X86_CR4_OSXMMEXCPT_BIT)
+#define X86_CR4_UMIP_BIT	11 /* enable UMIP support */
+#define X86_CR4_UMIP		_BITUL(X86_CR4_UMIP_BIT)
 #define X86_CR4_LA57_BIT	12 /* enable 5-level page tables */
 #define X86_CR4_LA57		_BITUL(X86_CR4_LA57_BIT)
 #define X86_CR4_VMXE_BIT	13 /* enable VMX virtualization */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 08/13] x86: Add emulation code for UMIP instructions
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (6 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 07/13] x86/cpufeature: Add User-Mode Instruction Prevention definitions Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 09/13] x86/umip: Force a page fault when unable to copy emulated result to user Ricardo Neri
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Fenghua Yu,
	Tony Luck

The feature User-Mode Instruction Prevention present in recent Intel
processor prevents a group of instructions (sgdt, sidt, sldt, smsw, and
str) from being executed with CPL > 0. Otherwise, a general protection
fault is issued.

Rather than relaying to the user space the general protection fault caused
by the UMIP-protected instructions (in the form of a SIGSEGV signal), it
can be trapped and emulate the result of such instructions to provide dummy
values. This allows to both conserve the current kernel behavior and not
reveal the system resources that UMIP intends to protect (i.e., the
locations of the global descriptor and interrupt descriptor tables, the
segment selectors of the local descriptor table, the value of the task
state register and the contents of the CR0 register).

This emulation is needed because certain applications (e.g., WineHQ and
DOSEMU2) rely on this subset of instructions to function. Given that sldt
and str are not commonly used in programs that run on WineHQ or DOSEMU2,
they are not emulated. Also, emulation is provided only for 32-bit
processes; 64-bit processes that attempt to use the instructions that UMIP
protects will receive the SIGSEGV signal issued as a consequence of the
general protection fault.

The instructions protected by UMIP can be split in two groups. Those which
return a kernel memory address (sgdt and sidt) and those which return a
value (sldt, str and smsw).

For the instructions that return a kernel memory address, applications such
as WineHQ rely on the result being located in the kernel memory space, not
the actual location of the table. The result is emulated as a hard-coded
value that lies close to the top of the kernel memory. The limit for the
GDT and the IDT are set to zero.

The instruction smsw is emulated to return the value that the register CR0
has at boot time as set in the head_32.

Care is taken to appropriately emulate the results when segmentation is
used. That is, rather than relying on USER_DS and USER_CS, the function
insn_get_addr_ref() inspects the segment descriptor pointed by the
registers in pt_regs. This ensures that we correctly obtain the segment
base address and the address and operand sizes even if the user space
application uses a local descriptor table.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/umip.h |  12 ++
 arch/x86/kernel/Makefile    |   1 +
 arch/x86/kernel/umip.c      | 312 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 325 insertions(+)
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c

diff --git a/arch/x86/include/asm/umip.h b/arch/x86/include/asm/umip.h
new file mode 100644
index 0000000..db43f2a
--- /dev/null
+++ b/arch/x86/include/asm/umip.h
@@ -0,0 +1,12 @@
+#ifndef _ASM_X86_UMIP_H
+#define _ASM_X86_UMIP_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_INTEL_UMIP
+bool fixup_umip_exception(struct pt_regs *regs);
+#else
+static inline bool fixup_umip_exception(struct pt_regs *regs) { return false; }
+#endif  /* CONFIG_X86_INTEL_UMIP */
+#endif  /* _ASM_X86_UMIP_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index d449c5a..4855dc4 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -126,6 +126,7 @@ obj-$(CONFIG_EFI)			+= sysfb_efi.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o
 obj-$(CONFIG_TRACING)			+= tracepoint.o
 obj-$(CONFIG_SCHED_MC_PRIO)		+= itmt.o
+obj-$(CONFIG_X86_INTEL_UMIP)		+= umip.o
 
 obj-$(CONFIG_UNWINDER_ORC)		+= unwind_orc.o
 obj-$(CONFIG_UNWINDER_FRAME_POINTER)	+= unwind_frame.o
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
new file mode 100644
index 0000000..31cf9e9
--- /dev/null
+++ b/arch/x86/kernel/umip.c
@@ -0,0 +1,312 @@
+/*
+ * umip.c Emulation for instruction protected by the Intel User-Mode
+ * Instruction Prevention feature
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ * Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
+ */
+
+#include <linux/uaccess.h>
+#include <asm/umip.h>
+#include <asm/traps.h>
+#include <asm/insn.h>
+#include <asm/insn-eval.h>
+#include <linux/ratelimit.h>
+
+/** DOC: Emulation for User-Mode Instruction Prevention (UMIP)
+ *
+ * The feature User-Mode Instruction Prevention present in recent Intel
+ * processor prevents a group of instructions (sgdt, sidt, sldt, smsw, and str)
+ * from being executed with CPL > 0. Otherwise, a general protection fault is
+ * issued.
+ *
+ * Rather than relaying to the user space the general protection fault caused by
+ * the UMIP-protected instructions (in the form of a SIGSEGV signal), it can be
+ * trapped and emulate the result of such instructions to provide dummy values.
+ * This allows to both conserve the current kernel behavior and not reveal the
+ * system resources that UMIP intends to protect (i.e., the locations of the
+ * global descriptor and interrupt descriptor tables, the segment selectors of
+ * the local descriptor table, the value of the task state register and the
+ * contents of the CR0 register).
+ *
+ * This emulation is needed because certain applications (e.g., WineHQ and
+ * DOSEMU2) rely on this subset of instructions to function.
+ *
+ * The instructions protected by UMIP can be split in two groups. Those which
+ * return a kernel memory address (sgdt and sidt) and those which return a
+ * value (sldt, str and smsw).
+ *
+ * For the instructions that return a kernel memory address, applications
+ * such as WineHQ rely on the result being located in the kernel memory space,
+ * not the actual location of the table. The result is emulated as a hard-coded
+ * value that, lies close to the top of the kernel memory. The limit for the GDT
+ * and the IDT are set to zero.
+ *
+ * Given that sldt and str are not commonly used in programs that run on WineHQ
+ * or DOSEMU2, they are not emulated.
+ *
+ * The instruction smsw is emulated to return the value that the register CR0
+ * has at boot time as set in the head_32.
+ *
+ * Also, emulation is provided only for 32-bit processes; 64-bit processes
+ * that attempt to use the instructions that UMIP protects will receive the
+ * SIGSEGV signal issued as a consequence of the general protection fault.
+ *
+ * Care is taken to appropriately emulate the results when segmentation is
+ * used. That is, rather than relying on USER_DS and USER_CS, the function
+ * insn_get_addr_ref() inspects the segment descriptor pointed by the
+ * registers in pt_regs. This ensures that we correctly obtain the segment
+ * base address and the address and operand sizes even if the user space
+ * application uses a local descriptor table.
+ */
+
+#define UMIP_DUMMY_GDT_BASE 0xfffe0000
+#define UMIP_DUMMY_IDT_BASE 0xffff0000
+
+/*
+ * The SGDT and SIDT instructions store the contents of the global descriptor
+ * table and interrupt table registers, respectively. The destination is a
+ * memory operand of X+2 bytes. X bytes are used to store the base address of
+ * the table and 2 bytes are used to store the limit. In 32-bit processes, the
+ * only processes for which emulation is provided, X has a value of 4.
+ */
+#define UMIP_GDT_IDT_BASE_SIZE 4
+#define UMIP_GDT_IDT_LIMIT_SIZE 2
+
+#define	UMIP_INST_SGDT	0	/* 0F 01 /0 */
+#define	UMIP_INST_SIDT	1	/* 0F 01 /1 */
+#define	UMIP_INST_SMSW	3	/* 0F 01 /4 */
+
+/**
+ * identify_insn() - Identify a UMIP-protected instruction
+ * @insn:	Instruction structure with opcode and ModRM byte.
+ *
+ * From the instruction opcode and the reg part of the ModRM byte, identify,
+ * if any, a UMIP-protected instruction.
+ *
+ * Return: a constant that identifies a specific UMIP-protected instruction.
+ * -EINVAL when not an UMIP-protected instruction.
+ */
+static int identify_insn(struct insn *insn)
+{
+	/* By getting modrm we also get the opcode. */
+	insn_get_modrm(insn);
+
+	/* All the instructions of interest start with 0x0f. */
+	if (insn->opcode.bytes[0] != 0xf)
+		return -EINVAL;
+
+	if (insn->opcode.bytes[1] == 0x1) {
+		switch (X86_MODRM_REG(insn->modrm.value)) {
+		case 0:
+			return UMIP_INST_SGDT;
+		case 1:
+			return UMIP_INST_SIDT;
+		case 4:
+			return UMIP_INST_SMSW;
+		default:
+			return -EINVAL;
+		}
+	}
+	/* SLDT AND STR are not emulated */
+	return -EINVAL;
+}
+
+/**
+ * emulate_umip_insn() - Emulate UMIP instructions with dummy values
+ * @insn:	Instruction structure with operands
+ * @umip_inst:	Instruction to emulate
+ * @data:	Buffer into which the dummy values will be copied
+ * @data_size:	Size of the emulated result
+ *
+ * Emulate an instruction protected by UMIP. The result of the emulation
+ * is saved in the provided buffer. The size of the results depends on both
+ * the instruction and type of operand (register vs memory address). Thus,
+ * the size of the result needs to be updated.
+ *
+ * Returns:
+ *
+ * 0 if success, -EINVAL on error while emulating.
+ */
+static int emulate_umip_insn(struct insn *insn, int umip_inst,
+			     unsigned char *data, int *data_size)
+{
+	unsigned long dummy_base_addr, dummy_value;
+	unsigned short dummy_limit = 0;
+
+	if (!data || !data_size || !insn)
+		return -EINVAL;
+	/*
+	 * These two instructions return the base address and limit of the
+	 * global and interrupt descriptor table, respectively. According to the
+	 * Intel Software Development manual, the base address can be 24-bit,
+	 * 32-bit or 64-bit. Limit is always 16-bit. If the operand size is
+	 * 16-bit, the returned value of the base address is supposed to be a
+	 * zero-extended 24-byte number. However, it seems that a 32-byte number
+	 * is always returned irrespective of the operand size.
+	 */
+
+	if (umip_inst == UMIP_INST_SGDT || umip_inst == UMIP_INST_SIDT) {
+		/* SGDT and SIDT do not use registers operands. */
+		if (X86_MODRM_MOD(insn->modrm.value) == 3)
+			return -EINVAL;
+
+		if (umip_inst == UMIP_INST_SGDT)
+			dummy_base_addr = UMIP_DUMMY_GDT_BASE;
+		else
+			dummy_base_addr = UMIP_DUMMY_IDT_BASE;
+
+		*data_size = UMIP_GDT_IDT_LIMIT_SIZE + UMIP_GDT_IDT_BASE_SIZE;
+
+		memcpy(data + 2, &dummy_base_addr, UMIP_GDT_IDT_BASE_SIZE);
+		memcpy(data, &dummy_limit, UMIP_GDT_IDT_LIMIT_SIZE);
+
+	} else if (umip_inst == UMIP_INST_SMSW) {
+		dummy_value = CR0_STATE;
+
+		/*
+		 * Even though the CR0 register has 4 bytes, the number
+		 * of bytes to be copied in the result buffer is determined
+		 * by whether the operand is a register or a memory location.
+		 * If operand is a register, return as many bytes as the operand
+		 * size. If operand is memory, return only the two least
+		 * siginificant bytes of CR0.
+		 */
+		if (X86_MODRM_MOD(insn->modrm.value) == 3)
+			*data_size = insn->opnd_bytes;
+		else
+			*data_size = 2;
+
+		memcpy(data, &dummy_value, *data_size);
+	/* STR and SLDT  are not emulated */
+	} else {
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/**
+ * fixup_umip_exception() - Fixup #GP faults caused by UMIP
+ * @regs:	Registers as saved when entering the #GP trap
+ *
+ * The instructions sgdt, sidt, str, smsw, sldt cause a general protection
+ * fault if executed with CPL > 0 (i.e., from user space). If the offending
+ * user-space process is 32-bit, this function fixes the exception up and
+ * provides dummy values for the sgdt, sidt and smsw; str and sldt are not
+ * fixed up. Also 64-bit user-space processes are not fixed up.
+ *
+ * If operands are memory addresses, results are copied to user-
+ * space memory as indicated by the instruction pointed by EIP using the
+ * registers indicated in the instruction operands. If operands are registers,
+ * results are copied into the context that was saved when entering kernel mode.
+ *
+ * Returns:
+ *
+ * True if emulation was successful; false if not.
+ */
+bool fixup_umip_exception(struct pt_regs *regs)
+{
+	int not_copied, nr_copied, reg_offset, dummy_data_size, umip_inst;
+	unsigned long seg_base = 0, *reg_addr;
+	/* 10 bytes is the maximum size of the result of UMIP instructions */
+	unsigned char dummy_data[10] = { 0 };
+	unsigned char buf[MAX_INSN_SIZE];
+	void __user *uaddr;
+	struct insn insn;
+	char seg_defs;
+
+	if (!regs)
+		return false;
+
+	/* Do not emulate 64-bit processes. */
+	if (user_64bit_mode(regs))
+		return false;
+
+	/*
+	 * Use the segment base in case user space used a different code
+	 * segment, either in protected (e.g., from an LDT), virtual-8086
+	 * or long (via the FS or GS registers) modes. In most of the cases
+	 * seg_base will be zero as in USER_CS.
+	 */
+	if (!user_64bit_mode(regs))
+		seg_base = insn_get_seg_base(regs, INAT_SEG_REG_CS);
+
+	if (seg_base == -1L)
+		return false;
+
+	not_copied = copy_from_user(buf, (void __user *)(seg_base + regs->ip),
+				    sizeof(buf));
+	nr_copied = sizeof(buf) - not_copied;
+
+	/*
+	 * The copy_from_user above could have failed if user code is protected
+	 * by a memory protection key. Give up on emulation in such a case.
+	 * Should we issue a page fault?
+	 */
+	if (!nr_copied)
+		return false;
+
+	insn_init(&insn, buf, nr_copied, user_64bit_mode(regs));
+
+	/*
+	 * Override the default operand and address sizes with what is specified
+	 * in the code segment descriptor. The instruction decoder only sets
+	 * the address size it to either 4 or 8 address bytes and does nothing
+	 * for the operand bytes. This OK for most of the cases, but we could
+	 * have special cases where, for instance, a 16-bit code segment
+	 * descriptor is used.
+	 * If there is an address override prefix, the instruction decoder
+	 * correctly updates these values, even for 16-bit defaults.
+	 */
+	seg_defs = insn_get_code_seg_params(regs);
+	if (seg_defs == -EINVAL)
+		return false;
+
+	insn.addr_bytes = (unsigned char)INSN_CODE_SEG_ADDR_SZ(seg_defs);
+	insn.opnd_bytes = (unsigned char)INSN_CODE_SEG_OPND_SZ(seg_defs);
+
+	insn_get_length(&insn);
+	if (nr_copied < insn.length)
+		return false;
+
+	umip_inst = identify_insn(&insn);
+	if (umip_inst < 0)
+		return false;
+
+	if (emulate_umip_insn(&insn, umip_inst, dummy_data, &dummy_data_size))
+		return false;
+
+	/*
+	 * If operand is a register, write result to the copy of the register
+	 * value that was pushed to the stack when entering into kernel mode.
+	 * Upon exit, the value we write will be restored to the actual hardware
+	 * register.
+	 */
+	if (X86_MODRM_MOD(insn.modrm.value) == 3) {
+		reg_offset = insn_get_modrm_rm_off(&insn, regs);
+
+		/*
+		 * Negative values are usually errors. In memory addressing,
+		 * the exception is -EDOM. Since we expect a register operand,
+		 * all negative values are errors.
+		 */
+		if (reg_offset < 0)
+			return false;
+
+		reg_addr = (unsigned long *)((unsigned long)regs + reg_offset);
+		memcpy(reg_addr, dummy_data, dummy_data_size);
+	} else {
+		uaddr = insn_get_addr_ref(&insn, regs);
+		if ((unsigned long)uaddr == -1L)
+			return false;
+
+		nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
+		if (nr_copied  > 0)
+			return false;
+	}
+
+	/* increase IP to let the program keep going */
+	regs->ip += insn.length;
+	return true;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 09/13] x86/umip: Force a page fault when unable to copy emulated result to user
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (7 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 08/13] x86: Add emulation code for UMIP instructions Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 10/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Fenghua Yu,
	Tony Luck

fixup_umip_exception() will be called from do_general_protection(). If the
former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
However, when emulation is successful but the emulated result cannot be
copied to user space memory, it is more accurate to issue a SIGSEGV with
SEGV_MAPERR with the offending address. A new function, inspired in
force_sig_info_fault(), is introduced to model the page fault.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/kernel/umip.c | 45 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 31cf9e9..12da83a 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -187,6 +187,41 @@ static int emulate_umip_insn(struct insn *insn, int umip_inst,
 }
 
 /**
+ * force_sig_info_umip_fault() - Force a SIGSEGV with SEGV_MAPERR
+ * @addr:	Address that caused the signal
+ * @regs:	Register set containing the instruction pointer
+ *
+ * Force a SIGSEGV signal with SEGV_MAPERR as the error code. This function is
+ * intended to be used to provide a segmentation fault when the result of the
+ * UMIP emulation could not be copied to the user space memory.
+ *
+ * Returns: none
+ */
+static void force_sig_info_umip_fault(void __user *addr, struct pt_regs *regs)
+{
+	siginfo_t info;
+	struct task_struct *tsk = current;
+
+	tsk->thread.cr2		= (unsigned long)addr;
+	tsk->thread.error_code	= X86_PF_USER | X86_PF_WRITE;
+	tsk->thread.trap_nr	= X86_TRAP_PF;
+
+	info.si_signo	= SIGSEGV;
+	info.si_errno	= 0;
+	info.si_code	= SEGV_MAPERR;
+	info.si_addr	= addr;
+	force_sig_info(SIGSEGV, &info, tsk);
+
+	if (!(show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)))
+		return;
+
+	pr_err_ratelimited("%s[%d] umip emulation segfault ip:%lx sp:%lx error:%x in %lx\n",
+			   tsk->comm, task_pid_nr(tsk), regs->ip,
+			   regs->sp, X86_PF_USER | X86_PF_WRITE,
+			   regs->ip);
+}
+
+/**
  * fixup_umip_exception() - Fixup #GP faults caused by UMIP
  * @regs:	Registers as saved when entering the #GP trap
  *
@@ -302,8 +337,14 @@ bool fixup_umip_exception(struct pt_regs *regs)
 			return false;
 
 		nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
-		if (nr_copied  > 0)
-			return false;
+		if (nr_copied  > 0) {
+			/*
+			 * If copy fails, send a signal and tell caller that
+			 * fault was fixed up.
+			 */
+			force_sig_info_umip_fault(uaddr, regs);
+			return true;
+		}
 	}
 
 	/* increase IP to let the program keep going */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 10/13] x86: Enable User-Mode Instruction Prevention
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (8 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 09/13] x86/umip: Force a page fault when unable to copy emulated result to user Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 11/13] x86/traps: Fixup general protection faults caused by UMIP Ricardo Neri
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Fenghua Yu,
	Tony Luck

User-Mode Instruction Prevention (UMIP) is enabled by setting/clearing a
bit in %cr4.

It makes sense to enable UMIP at some point while booting, before user
spaces come up. Like SMAP and SMEP, is not critical to have it enabled
very early during boot. This is because UMIP is relevant only when there is
a userspace to be protected from. Given the similarities in relevance, it
makes sense to enable UMIP along with SMAP and SMEP.

At the moment, UMIP is disabled by default. It can be enabled at build
time by selecting CONFIG_X86_INTEL_UMIP. If enabled at build time, it can
be disabled at run time by adding clearcpuid=514 to the kernel parameters.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/Kconfig             | 10 ++++++++++
 arch/x86/kernel/cpu/common.c | 25 ++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ecf2cf3..1579a71 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1803,6 +1803,16 @@ config X86_SMAP
 
 	  If unsure, say Y.
 
+config X86_INTEL_UMIP
+	def_bool n
+	depends on CPU_SUP_INTEL
+	prompt "Intel User Mode Instruction Prevention" if EXPERT
+	---help---
+	  The User Mode Instruction Prevention (UMIP) is a security
+	  feature in newer Intel processors. If enabled, a general
+	  protection fault is issued if the instructions SGDT, SLDT,
+	  SIDT, SMSW and STR are executed in user mode.
+
 config X86_INTEL_MPX
 	prompt "Intel MPX (Memory Protection Extensions)"
 	def_bool n
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 09b8a99..3e6b9ca 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -329,6 +329,28 @@ static __always_inline void setup_smap(struct cpuinfo_x86 *c)
 	}
 }
 
+static __always_inline void setup_umip(struct cpuinfo_x86 *c)
+{
+	/* Check the boot processor, plus build option for UMIP. */
+	if (!cpu_feature_enabled(X86_FEATURE_UMIP))
+		goto out;
+
+	/* Check the current processor's cpuid bits. */
+	if (!cpu_has(c, X86_FEATURE_UMIP))
+		goto out;
+
+	cr4_set_bits(X86_CR4_UMIP);
+
+	return;
+
+out:
+	/*
+	 * Make sure UMIP is disabled in case it was enabled in a
+	 * previous boot (e.g., via kexec).
+	 */
+	cr4_clear_bits(X86_CR4_UMIP);
+}
+
 /*
  * Protection Keys are not available in 32-bit mode.
  */
@@ -1147,9 +1169,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	/* Disable the PN if appropriate */
 	squash_the_stupid_serial_number(c);
 
-	/* Set up SMEP/SMAP */
+	/* Set up SMEP/SMAP/UMIP */
 	setup_smep(c);
 	setup_smap(c);
+	setup_umip(c);
 
 	/*
 	 * The vendor-specific functions might have changed features.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 11/13] x86/traps: Fixup general protection faults caused by UMIP
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (9 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 10/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 12/13] selftests/x86: Add tests for User-Mode Instruction Prevention Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 13/13] selftests/x86: Add tests for instruction str and sldt Ricardo Neri
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Fenghua Yu,
	Tony Luck

If the User-Mode Instruction Prevention CPU feature is available and
enabled, a general protection fault will be issued if the instructions
sgdt, sldt, sidt, str or smsw are executed from user-mode context
(CPL > 0). If the fault was caused by any of the instructions protected
by UMIP, fixup_umip_exception() will emulate dummy results for these
instructions as follows: if running a 32-bit process, sgdt, sidt and smsw
are emulated; str and sldt are not emulated. No emulation is done for
64-bit processes.

If emulation is successful, the result is passed to the user space program
and no SIGSEGV signal is emitted.

Please note that fixup_umip_exception() also caters for the case when
the fault originated while running in virtual-8086 mode.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/kernel/traps.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index a5791f3..4c0aa6c 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -60,6 +60,7 @@
 #include <asm/trace/mpx.h>
 #include <asm/mpx.h>
 #include <asm/vm86.h>
+#include <asm/umip.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/x86_init.h>
@@ -514,6 +515,10 @@ do_general_protection(struct pt_regs *regs, long error_code)
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
 	cond_local_irq_enable(regs);
 
+	if (static_cpu_has(X86_FEATURE_UMIP))
+		if (user_mode(regs) && fixup_umip_exception(regs))
+			return;
+
 	if (v8086_mode(regs)) {
 		local_irq_enable();
 		handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 12/13] selftests/x86: Add tests for User-Mode Instruction Prevention
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (10 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 11/13] x86/traps: Fixup general protection faults caused by UMIP Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  2017-10-27 23:51 ` [PATCH v10 13/13] selftests/x86: Add tests for instruction str and sldt Ricardo Neri
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Fenghua Yu

Certain user space programs that run on virtual-8086 mode may utilize
instructions protected by the User-Mode Instruction Prevention (UMIP)
security feature present in new Intel processors: SGDT, SIDT and SMSW. In
such a case, a general protection fault is issued if UMIP is enabled. When
such a fault happens, the kernel traps it and emulates the results of
these instructions with dummy values. The purpose of this new
test is to verify whether the impacted instructions can be executed
without causing such #GP. If no #GP exceptions occur, we expect to exit
virtual-8086 mode from INT3.

The instructions protected by UMIP are executed in representative use
cases:
 a) displacement-only memory addressing
 b) register-indirect memory addressing
 c) results stored directly in operands

Unfortunately, it is not possible to check the results against a set of
expected values because no emulation will occur in systems that do not
have the UMIP feature. Instead, results are printed for verification. A
simple verification is done to ensure that results of all tests are
identical.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 tools/testing/selftests/x86/entry_from_vm86.c | 73 ++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index d075ea0..f7d9cea 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -95,6 +95,22 @@ asm (
 	"int3\n\t"
 	"vmcode_int80:\n\t"
 	"int $0x80\n\t"
+	"vmcode_umip:\n\t"
+	/* addressing via displacements */
+	"smsw (2052)\n\t"
+	"sidt (2054)\n\t"
+	"sgdt (2060)\n\t"
+	/* addressing via registers */
+	"mov $2066, %bx\n\t"
+	"smsw (%bx)\n\t"
+	"mov $2068, %bx\n\t"
+	"sidt (%bx)\n\t"
+	"mov $2074, %bx\n\t"
+	"sgdt (%bx)\n\t"
+	/* register operands, only for smsw */
+	"smsw %ax\n\t"
+	"mov %ax, (2080)\n\t"
+	"int3\n\t"
 	".size vmcode, . - vmcode\n\t"
 	"end_vmcode:\n\t"
 	".code32\n\t"
@@ -103,7 +119,7 @@ asm (
 
 extern unsigned char vmcode[], end_vmcode[];
 extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
-	vmcode_sti[], vmcode_int3[], vmcode_int80[];
+	vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[];
 
 /* Returns false if the test was skipped. */
 static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -160,6 +176,58 @@ static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
 	return true;
 }
 
+void do_umip_tests(struct vm86plus_struct *vm86, unsigned char *test_mem)
+{
+	struct table_desc {
+		unsigned short limit;
+		unsigned long base;
+	} __attribute__((packed));
+
+	/* Initialize variables with arbitrary values */
+	struct table_desc gdt1 = { .base = 0x3c3c3c3c, .limit = 0x9999 };
+	struct table_desc gdt2 = { .base = 0x1a1a1a1a, .limit = 0xaeae };
+	struct table_desc idt1 = { .base = 0x7b7b7b7b, .limit = 0xf1f1 };
+	struct table_desc idt2 = { .base = 0x89898989, .limit = 0x1313 };
+	unsigned short msw1 = 0x1414, msw2 = 0x2525, msw3 = 3737;
+
+	/* UMIP -- exit with INT3 unless kernel emulation did not trap #GP */
+	do_test(vm86, vmcode_umip - vmcode, VM86_TRAP, 3, "UMIP tests");
+
+	/* Results from displacement-only addressing */
+	msw1 = *(unsigned short *)(test_mem + 2052);
+	memcpy(&idt1, test_mem + 2054, sizeof(idt1));
+	memcpy(&gdt1, test_mem + 2060, sizeof(gdt1));
+
+	/* Results from register-indirect addressing */
+	msw2 = *(unsigned short *)(test_mem + 2066);
+	memcpy(&idt2, test_mem + 2068, sizeof(idt2));
+	memcpy(&gdt2, test_mem + 2074, sizeof(gdt2));
+
+	/* Results when using register operands */
+	msw3 = *(unsigned short *)(test_mem + 2080);
+
+	printf("[INFO]\tResult from SMSW:[0x%04x]\n", msw1);
+	printf("[INFO]\tResult from SIDT: limit[0x%04x]base[0x%08lx]\n",
+	       idt1.limit, idt1.base);
+	printf("[INFO]\tResult from SGDT: limit[0x%04x]base[0x%08lx]\n",
+	       gdt1.limit, gdt1.base);
+
+	if (msw1 != msw2 || msw1 != msw3)
+		printf("[FAIL]\tAll the results of SMSW should be the same.\n");
+	else
+		printf("[PASS]\tAll the results from SMSW are identical.\n");
+
+	if (memcmp(&gdt1, &gdt2, sizeof(gdt1)))
+		printf("[FAIL]\tAll the results of SGDT should be the same.\n");
+	else
+		printf("[PASS]\tAll the results from SGDT are identical.\n");
+
+	if (memcmp(&idt1, &idt2, sizeof(idt1)))
+		printf("[FAIL]\tAll the results of SIDT should be the same.\n");
+	else
+		printf("[PASS]\tAll the results from SIDT are identical.\n");
+}
+
 int main(void)
 {
 	struct vm86plus_struct v86;
@@ -218,6 +286,9 @@ int main(void)
 	v86.regs.eax = (unsigned int)-1;
 	do_test(&v86, vmcode_int80 - vmcode, VM86_INTx, 0x80, "int80");
 
+	/* UMIP -- should exit with INTx 0x80 unless UMIP was not disabled */
+	do_umip_tests(&v86, addr);
+
 	/* Execute a null pointer */
 	v86.regs.cs = 0;
 	v86.regs.ss = 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v10 13/13] selftests/x86: Add tests for instruction str and sldt
  2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (11 preceding siblings ...)
  2017-10-27 23:51 ` [PATCH v10 12/13] selftests/x86: Add tests for User-Mode Instruction Prevention Ricardo Neri
@ 2017-10-27 23:51 ` Ricardo Neri
  12 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-10-27 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Fenghua Yu

The instructions str and sldt are not recognized when running on virtual-
8086 mode and generate an invalid operand exception. These two
instructions are protected by the Intel User-Mode Instruction Prevention
(UMIP) security feature. In protected mode, if UMIP is enabled, these
instructions generate a general protection fault if called from CPL > 0.
Linux traps the general protection fault and emulates the instructions
sgdt, sidt and smsw; but not str and sldt.

These tests are added to verify that the emulation code does not emulate
these two instructions but the expected invalid operand exception is
seen.

Tests fallback to exit with int3 in case emulation does happen.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 tools/testing/selftests/x86/entry_from_vm86.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index f7d9cea..361466a 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -111,6 +111,11 @@ asm (
 	"smsw %ax\n\t"
 	"mov %ax, (2080)\n\t"
 	"int3\n\t"
+	"vmcode_umip_str:\n\t"
+	"str %eax\n\t"
+	"vmcode_umip_sldt:\n\t"
+	"sldt %eax\n\t"
+	"int3\n\t"
 	".size vmcode, . - vmcode\n\t"
 	"end_vmcode:\n\t"
 	".code32\n\t"
@@ -119,7 +124,8 @@ asm (
 
 extern unsigned char vmcode[], end_vmcode[];
 extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
-	vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[];
+	vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[],
+	vmcode_umip_str[], vmcode_umip_sldt[];
 
 /* Returns false if the test was skipped. */
 static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -226,6 +232,16 @@ void do_umip_tests(struct vm86plus_struct *vm86, unsigned char *test_mem)
 		printf("[FAIL]\tAll the results of SIDT should be the same.\n");
 	else
 		printf("[PASS]\tAll the results from SIDT are identical.\n");
+
+	sethandler(SIGILL, sighandler, 0);
+	do_test(vm86, vmcode_umip_str - vmcode, VM86_SIGNAL, 0,
+		"STR instruction");
+	clearhandler(SIGILL);
+
+	sethandler(SIGILL, sighandler, 0);
+	do_test(vm86, vmcode_umip_sldt - vmcode, VM86_SIGNAL, 0,
+		"SLDT instruction");
+	clearhandler(SIGILL);
 }
 
 int main(void)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions
  2017-10-27 23:51 ` [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions Ricardo Neri
@ 2017-11-02  8:51   ` Ingo Molnar
  2017-11-02 11:07     ` Thomas Gleixner
  2017-11-03  2:44     ` Ricardo Neri
  0 siblings, 2 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-11-02  8:51 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov


* Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:

> +	/*
> +	 * -EDOM means that we must ignore the address_offset. In such a case,
> +	 * in 64-bit mode the effective address relative to the RIP of the
> +	 * following instruction.
> +	 */
> +	if (*regoff == -EDOM) {
> +		if (user_64bit_mode(regs))
> +			tmp = (long)regs->ip + insn->length;
> +		else
> +			tmp = 0;
> +	} else if (*regoff < 0) {
> +		return -EINVAL;
> +	} else {
> +		tmp = (long)regs_get_register(regs, *regoff);
> +	}

> +	else
> +		indx = (long)regs_get_register(regs, indx_offset);

This and subsequent patches include a disgustly insane amount of type casts - why?

For example here 'tmp' is 'long', while regs_get_register() returns
'unsigned long', but no type cast is necessary for that.

> +			ret = get_eff_addr_modrm(insn, regs, &addr_offset,
> +						 &eff_addr);

Also, please don't break lines slightly longer than 80 cols just to pacify 
checkpatch (and this holds for other patches as well) - the cure is worse than the 
illness!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [tip:x86/mpx] x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit
  2017-10-27 23:51 ` [PATCH v10 01/13] x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit Ricardo Neri
@ 2017-11-02  9:37   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-02  9:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: adam.buchbinder, adrian.hunter, linux-kernel, bp, qiaowei.ren,
	dave.hansen, slaoub, colin.king, paul.gortmaker, shuah,
	ricardo.neri-calderon, torvalds, dvyukov, lstoakes, keescook,
	acme, tglx, cmetcalf, luto, pbonzini, mhiramat, akpm, mst, hpa,
	ravi.v.shankar, peterz, mingo, vbabka, jslaby, thgarnie, corbet,
	ray.huang, brgerst

Commit-ID:  71271269ef9a997fb4416b2f8ef3558dd846c7cb
Gitweb:     https://git.kernel.org/tip/71271269ef9a997fb4416b2f8ef3558dd846c7cb
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 16:51:38 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 2 Nov 2017 09:55:14 +0100

x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit

In protected mode, it is common to want to obtain the limit of a segment
along with its base address. This is useful, for instance, to verify that
an effective address lies within a segment before computing a linear
address.

Up to this point, this library only computes linear addresses in long
mode. Subsequent patches will include support for protected mode. Support
to verify the segment limit will be needed.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: ricardo.neri@intel.com
Link: http://lkml.kernel.org/r/1509148310-30862-2-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/lib/insn-eval.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 1c23ec0..91f08aa 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -729,25 +729,29 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
 }
 
 /**
- * get_seg_base_addr() - obtain base address of a segment
+ * get_seg_base_limit() - obtain base address and limit of a segment
  * @insn:	Instruction. Must be valid.
  * @regs:	Register values as seen when entering kernel mode
  * @regoff:	Operand offset, in pt_regs, used to resolve segment descriptor
  * @base:	Obtained segment base
+ * @limit:	Obtained segment limit
  *
- * Obtain the base address of the segment associated with the operand @regoff
- * and, if any or allowed, override prefixes in @insn. This function is
+ * Obtain the base address and limit of the segment associated with the operand
+ * @regoff and, if any or allowed, override prefixes in @insn. This function is
  * different from insn_get_seg_base() as the latter does not resolve the segment
- * associated with the instruction operand.
+ * associated with the instruction operand. If a limit is not needed (e.g.,
+ * when running in long mode), @limit can be NULL.
  *
  * Returns:
  *
- * 0 on success. @base will contain the base address of the resolved segment.
+ * 0 on success. @base and @limit will contain the base address and of the
+ * resolved segment, respectively.
  *
  * -EINVAL on error.
  */
-static int get_seg_base_addr(struct insn *insn, struct pt_regs *regs,
-			     int regoff, unsigned long *base)
+static int get_seg_base_limit(struct insn *insn, struct pt_regs *regs,
+			      int regoff, unsigned long *base,
+			      unsigned long *limit)
 {
 	int seg_reg_idx;
 
@@ -762,6 +766,13 @@ static int get_seg_base_addr(struct insn *insn, struct pt_regs *regs,
 	if (*base == -1L)
 		return -EINVAL;
 
+	if (!limit)
+		return 0;
+
+	*limit = get_seg_limit(regs, seg_reg_idx);
+	if (!(*limit))
+		return -EINVAL;
+
 	return 0;
 }
 
@@ -843,7 +854,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		eff_addr += insn->displacement.value;
 	}
 
-	ret = get_seg_base_addr(insn, regs, addr_offset, &seg_base);
+	ret = get_seg_base_limit(insn, regs, addr_offset, &seg_base, NULL);
 	if (ret)
 		goto out;
 

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions
  2017-11-02  8:51   ` Ingo Molnar
@ 2017-11-02 11:07     ` Thomas Gleixner
  2017-11-03  2:46       ` Ricardo Neri
  2017-11-03  2:44     ` Ricardo Neri
  1 sibling, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2017-11-02 11:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Ricardo Neri, Ingo Molnar, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

On Thu, 2 Nov 2017, Ingo Molnar wrote:
 
> * Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:
> 
> > +	/*
> > +	 * -EDOM means that we must ignore the address_offset. In such a case,
> > +	 * in 64-bit mode the effective address relative to the RIP of the
> > +	 * following instruction.
> > +	 */
> > +	if (*regoff == -EDOM) {
> > +		if (user_64bit_mode(regs))
> > +			tmp = (long)regs->ip + insn->length;
> > +		else
> > +			tmp = 0;
> > +	} else if (*regoff < 0) {
> > +		return -EINVAL;
> > +	} else {
> > +		tmp = (long)regs_get_register(regs, *regoff);
> > +	}
> 
> > +	else
> > +		indx = (long)regs_get_register(regs, indx_offset);
> 
> This and subsequent patches include a disgustly insane amount of type casts - why?
> 
> For example here 'tmp' is 'long', while regs_get_register() returns
> 'unsigned long', but no type cast is necessary for that.
> 
> > +			ret = get_eff_addr_modrm(insn, regs, &addr_offset,
> > +						 &eff_addr);
> 
> Also, please don't break lines slightly longer than 80 cols just to pacify 
> checkpatch (and this holds for other patches as well) - the cure is worse than the 
> illness!

The right thing to do here is to split out stuff into a helper function
which removes the indentation levels or restructure the code to avoid them.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions
  2017-11-02  8:51   ` Ingo Molnar
  2017-11-02 11:07     ` Thomas Gleixner
@ 2017-11-03  2:44     ` Ricardo Neri
  2017-11-03 10:17       ` Ingo Molnar
  1 sibling, 1 reply; 22+ messages in thread
From: Ricardo Neri @ 2017-11-03  2:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

On Thu, Nov 02, 2017 at 09:51:08AM +0100, Ingo Molnar wrote:
> 
> * Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:
> 
> > +	/*
> > +	 * -EDOM means that we must ignore the address_offset. In such a case,
> > +	 * in 64-bit mode the effective address relative to the RIP of the
> > +	 * following instruction.
> > +	 */
> > +	if (*regoff == -EDOM) {
> > +		if (user_64bit_mode(regs))
> > +			tmp = (long)regs->ip + insn->length;
> > +		else
> > +			tmp = 0;
> > +	} else if (*regoff < 0) {
> > +		return -EINVAL;
> > +	} else {
> > +		tmp = (long)regs_get_register(regs, *regoff);
> > +	}
> 
> > +	else
> > +		indx = (long)regs_get_register(regs, indx_offset);
> 
> This and subsequent patches include a disgustly insane amount of type casts - why?
> 
> For example here 'tmp' is 'long', while regs_get_register() returns
> 'unsigned long', but no type cast is necessary for that.
> 
> > +			ret = get_eff_addr_modrm(insn, regs, &addr_offset,
> > +						 &eff_addr);

One of the goals of this series is to have the ability to compute 16-bit, 32-bit
and 64-bit addresses. I put lost of casts, between signed and unsigned types,
between 64-bit and 32-bit and 16-bit casts. After seeing your comment I have gone
through the code and I have removed most of the casts. Instead I will use masks.
I will also inspect the resulting assembly code to make sure the arithmetic is
performed in the address size pertinent to each case.

> 
> Also, please don't break lines slightly longer than 80 cols just to pacify 
> checkpatch (and this holds for other patches as well) - the cure is worse than the 
> illness!

I will look into these two cases and reorganize the code.

Thanks and BR,
Ricardo 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions
  2017-11-02 11:07     ` Thomas Gleixner
@ 2017-11-03  2:46       ` Ricardo Neri
  0 siblings, 0 replies; 22+ messages in thread
From: Ricardo Neri @ 2017-11-03  2:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Ingo Molnar, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

On Thu, Nov 02, 2017 at 12:07:13PM +0100, Thomas Gleixner wrote:
> On Thu, 2 Nov 2017, Ingo Molnar wrote:
>  
> > * Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:
> > 
> > > +	/*
> > > +	 * -EDOM means that we must ignore the address_offset. In such a case,
> > > +	 * in 64-bit mode the effective address relative to the RIP of the
> > > +	 * following instruction.
> > > +	 */
> > > +	if (*regoff == -EDOM) {
> > > +		if (user_64bit_mode(regs))
> > > +			tmp = (long)regs->ip + insn->length;
> > > +		else
> > > +			tmp = 0;
> > > +	} else if (*regoff < 0) {
> > > +		return -EINVAL;
> > > +	} else {
> > > +		tmp = (long)regs_get_register(regs, *regoff);
> > > +	}
> > 
> > > +	else
> > > +		indx = (long)regs_get_register(regs, indx_offset);
> > 
> > This and subsequent patches include a disgustly insane amount of type casts - why?
> > 
> > For example here 'tmp' is 'long', while regs_get_register() returns
> > 'unsigned long', but no type cast is necessary for that.
> > 
> > > +			ret = get_eff_addr_modrm(insn, regs, &addr_offset,
> > > +						 &eff_addr);
> > 
> > Also, please don't break lines slightly longer than 80 cols just to pacify 
> > checkpatch (and this holds for other patches as well) - the cure is worse than the 
> > illness!
> 
> The right thing to do here is to split out stuff into a helper function
> which removes the indentation levels or restructure the code to avoid them.

This patch introduce this several helper function. Perhaps I can add some more. For
this particular case, I think that using shorter variable names will avoid this
problem.

Thanks and BR,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions
  2017-11-03  2:44     ` Ricardo Neri
@ 2017-11-03 10:17       ` Ingo Molnar
  2017-11-04  0:40         ` Ricardo Neri
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2017-11-03 10:17 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov


* Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:

> On Thu, Nov 02, 2017 at 09:51:08AM +0100, Ingo Molnar wrote:
> > 
> > * Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:
> > 
> > > +	/*
> > > +	 * -EDOM means that we must ignore the address_offset. In such a case,
> > > +	 * in 64-bit mode the effective address relative to the RIP of the
> > > +	 * following instruction.
> > > +	 */
> > > +	if (*regoff == -EDOM) {
> > > +		if (user_64bit_mode(regs))
> > > +			tmp = (long)regs->ip + insn->length;
> > > +		else
> > > +			tmp = 0;
> > > +	} else if (*regoff < 0) {
> > > +		return -EINVAL;
> > > +	} else {
> > > +		tmp = (long)regs_get_register(regs, *regoff);
> > > +	}
> > 
> > > +	else
> > > +		indx = (long)regs_get_register(regs, indx_offset);
> > 
> > This and subsequent patches include a disgustly insane amount of type casts - why?
> > 
> > For example here 'tmp' is 'long', while regs_get_register() returns
> > 'unsigned long', but no type cast is necessary for that.
> > 
> > > +			ret = get_eff_addr_modrm(insn, regs, &addr_offset,
> > > +						 &eff_addr);
> 
> One of the goals of this series is to have the ability to compute 16-bit, 32-bit
> and 64-bit addresses. I put lost of casts, between signed and unsigned types,
> between 64-bit and 32-bit and 16-bit casts. After seeing your comment I have gone
> through the code and I have removed most of the casts. Instead I will use masks.
> I will also inspect the resulting assembly code to make sure the arithmetic is
> performed in the address size pertinent to each case.

Well, casts are probably fine when the goal is to zero out high bits - but the 
ones I quoted converted types of the same with.

For register values it would also probably be cleaner to use the u8, u16, u32 and 
u64 types instead of char/short/int/long - this goes hand in hand with how the 
instructions are documented in the SDMs.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions
  2017-11-03 10:17       ` Ingo Molnar
@ 2017-11-04  0:40         ` Ricardo Neri
  2017-11-04  8:26           ` Ingo Molnar
  0 siblings, 1 reply; 22+ messages in thread
From: Ricardo Neri @ 2017-11-04  0:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

On Fri, Nov 03, 2017 at 11:17:49AM +0100, Ingo Molnar wrote:
> 
> * Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:
> 
> > On Thu, Nov 02, 2017 at 09:51:08AM +0100, Ingo Molnar wrote:
> > > 
> > > * Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:
> > > 
> > > > +	/*
> > > > +	 * -EDOM means that we must ignore the address_offset. In such a case,
> > > > +	 * in 64-bit mode the effective address relative to the RIP of the
> > > > +	 * following instruction.
> > > > +	 */
> > > > +	if (*regoff == -EDOM) {
> > > > +		if (user_64bit_mode(regs))
> > > > +			tmp = (long)regs->ip + insn->length;
> > > > +		else
> > > > +			tmp = 0;
> > > > +	} else if (*regoff < 0) {
> > > > +		return -EINVAL;
> > > > +	} else {
> > > > +		tmp = (long)regs_get_register(regs, *regoff);
> > > > +	}
> > > 
> > > > +	else
> > > > +		indx = (long)regs_get_register(regs, indx_offset);
> > > 
> > > This and subsequent patches include a disgustly insane amount of type casts - why?
> > > 
> > > For example here 'tmp' is 'long', while regs_get_register() returns
> > > 'unsigned long', but no type cast is necessary for that.
> > > 
> > > > +			ret = get_eff_addr_modrm(insn, regs, &addr_offset,
> > > > +						 &eff_addr);
> > 
> > One of the goals of this series is to have the ability to compute 16-bit, 32-bit
> > and 64-bit addresses. I put lost of casts, between signed and unsigned types,
> > between 64-bit and 32-bit and 16-bit casts. After seeing your comment I have gone
> > through the code and I have removed most of the casts. Instead I will use masks.
> > I will also inspect the resulting assembly code to make sure the arithmetic is
> > performed in the address size pertinent to each case.
> 
> Well, casts are probably fine when the goal is to zero out high bits

I was able to remove the majority of casts and used masks. I see many other parts
of Linux doing similarly. For instance, in arch/x86/kernel/kexec-bzimage64.c I see

    params->hdr.ramdisk_image = initrd_load_addr & 0xffffffffUL;

ramdisk_image is of type __u32 while initrd_load_addr is of type unsigned long.

I guess that in this example doing

    params->hdr.ramdisk_image = (__u32)(initrd_load_addr & 0xffffffffUL);

would be redundant? The mask would indicate better what is going on.

> but the ones I quoted converted types of the same with.

I made sure I removed these.

> 
> For register values it would also probably be cleaner to use the u8, u16, u32 and 
> u64 types instead of char/short/int/long - this goes hand in hand with how the 
> instructions are documented in the SDMs.

In the rest of the functions I have used char/short/int/long. Would it be OK to have
a mixture of type styles? Perhaps I can rewrite only the functions that deal with
variables of different width.

Plus, one more advantage of using char/short/int/long is that when building a 32-bit
kernel long will be a 32-bit type. Thus, all the aritmetic would be naturally done
with variables of the appropriate width. Perhaps I could use u8/u16/u32/long? It
looks white odd, though.

Thanks and BR,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions
  2017-11-04  0:40         ` Ricardo Neri
@ 2017-11-04  8:26           ` Ingo Molnar
  0 siblings, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-11-04  8:26 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov


* Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:

> Plus, one more advantage of using char/short/int/long is that when building a 
> 32-bit kernel long will be a 32-bit type. Thus, all the aritmetic would be 
> naturally done with variables of the appropriate width. Perhaps I could use 
> u8/u16/u32/long? It looks white odd, though.

Ok, I agree that this aspect is important - and mixing u8/u16/u32 with 'long' 
would look a bit weird.

Let's keep the char/short/int/long types for now.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-11-04  8:26 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-27 23:51 [PATCH v10 00/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 01/13] x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limit Ricardo Neri
2017-11-02  9:37   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions Ricardo Neri
2017-11-02  8:51   ` Ingo Molnar
2017-11-02 11:07     ` Thomas Gleixner
2017-11-03  2:46       ` Ricardo Neri
2017-11-03  2:44     ` Ricardo Neri
2017-11-03 10:17       ` Ingo Molnar
2017-11-04  0:40         ` Ricardo Neri
2017-11-04  8:26           ` Ingo Molnar
2017-10-27 23:51 ` [PATCH v10 03/13] x86/insn-eval: Add support to resolve 32-bit address encodings Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 04/13] x86/insn-eval: Add wrapper function for 32 and 64-bit addresses Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 05/13] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 06/13] x86/insn-eval: Add support to resolve 16-bit address encodings Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 07/13] x86/cpufeature: Add User-Mode Instruction Prevention definitions Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 08/13] x86: Add emulation code for UMIP instructions Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 09/13] x86/umip: Force a page fault when unable to copy emulated result to user Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 10/13] x86: Enable User-Mode Instruction Prevention Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 11/13] x86/traps: Fixup general protection faults caused by UMIP Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 12/13] selftests/x86: Add tests for User-Mode Instruction Prevention Ricardo Neri
2017-10-27 23:51 ` [PATCH v10 13/13] selftests/x86: Add tests for instruction str and sldt Ricardo Neri

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).