* [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX
@ 2017-10-27 20:25 Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 01/18] x86/mm: Relocate page fault error codes to traps.h Ricardo Neri
                   ` (17 more replies)
  0 siblings, 18 replies; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri

This is a shortened version of the series "x86: Enable User-Mode Instruction
Prevention (UMIP)". This series only includes the code that is used to
compute 64-bit linear addresses with and without segmentation plus handling
the special cases of each addressing mode.

The purpose of this series is to gather all the patches that have been
reviewed during the last nine versions of the series and have it merged in
the tip tree (hopefully!).

Thus far, this code is used by MPX. It will also be used by UMIP once
enabled. A separate series will deal with the UMIP emulation code as
well as 32-bit and 16-bit addresses. A discussion on UMIP and the need for
emulation code can be found here [9].

For reference, the nine previous submissions can be found here [1], here [2],
here [3], here [4], here [5], here [6], here [7], here [8] and here [9].

This version addresses the feedback comments from Borislav Petkov received on
v9. Please see details in the change log.

=== How is this series laid out?

++ Preparatory work
As per suggestions from Andy Lutomirski and Borislav Petkov, I moved
the x86 page fault error codes to a header. Also, I made user_64bit_mode
available to x86_32 builds. This helps to reuse code and reduce the number
of #ifdef's in these patches. Borislav also suggested that uprobes should
use the existing definitions in arch/x86/include/asm/inat.h instead of
hard-coded values when checking instruction prefixes. I included this change
in the series.

++ Fix bugs in MPX address decoder
The code that Intel MPX (Memory Protection Extensions) uses to parse
opcodes and the memory locations contained in the general purpose
registers used as operands proved very useful. I put this code in a
separate library file that MPX, UMIP and potentially others can access,
which avoids code duplication.

Before creating the new library, I fixed several bugs that I found in
corner cases of how MPX determines the address contained in the
instruction and operands.

++ Provide a new x86 instruction evaluating library
With the bugs fixed, the MPX evaluating code is relocated to a new
insn-eval.c library. The basic functionality of this library is extended to
obtain the segment descriptor selected by either segment override prefixes
or the default segment implied by the registers involved in the calculation
of the effective address. It was also extended to obtain the default address and
operand sizes as well as the segment base address. Armed with this arsenal,
it is now possible to determine the linear address indicated by the
operands of an instruction structure. This new library relies on and
extends the capabilities of the existing instruction decoder in
arch/x86/lib/insn.c.
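
As a rough illustration of the computation described above, here is a
minimal user-space sketch. It is not the code in insn-eval.c: the helper
names effective_addr() and linear_addr() are hypothetical, and the operand
values are assumed to have already been read from the registers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of this series: compute an effective
 * address from operand values that have already been read from registers.
 * scale is the 2-bit SIB.scale field, so the index is multiplied by
 * 1, 2, 4 or 8.
 */
static inline uint64_t effective_addr(uint64_t base, uint64_t index,
				      unsigned int scale, int32_t disp)
{
	assert(scale <= 3);

	/* The displacement is signed: sign-extend it before adding. */
	return base + (index << scale) + (uint64_t)(int64_t)disp;
}

/* Hypothetical helper: linear address = segment base + effective address. */
static inline uint64_t linear_addr(uint64_t seg_base, uint64_t eff_addr)
{
	return seg_base + eff_addr;
}
```

For instance, a base of 0x1000, an index of 4, a scale field of 3 and a
displacement of -8 yield the effective address 0x1018. Adding an FS or GS
base to that value gives the linear address.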

++ Extensive tests
Extensive tests were performed to test all the combinations of ModRM,
SiB and displacements for 64-bit addresses; including segmentation via the
FS and GS segment registers. For this purpose, I relied on a CPU that
features UMIP support for 64-bit processes. Emulation could also be tested
by using instructions that cause a #GP in readily available systems (e.g.,
use lgdt instead of sgdt). This change is not part of this patchset. The
code for these tests can be found here [13].

++ Merging this series?
As stated, this series contains code that has been reasonably reviewed
through 9 versions. It should be in good condition to be merged [14].
 
[1]. https://lwn.net/Articles/705877/
[2]. https://lkml.org/lkml/2016/12/23/265
[3]. https://lkml.org/lkml/2017/1/25/622
[4]. https://lkml.org/lkml/2017/2/23/40
[5]. https://lkml.org/lkml/2017/3/3/678
[6]. https://lkml.org/lkml/2017/3/7/866
[7]. https://lkml.org/lkml/2017/5/5/398
[8]. https://lkml.org/lkml/2017/8/18/992
[9]. https://lkml.org/lkml/2017/10/3/1066
[13]. https://github.com/01org/luv-yocto/tree/rneri/umip/meta-luv/recipes-core/umip/files
[14]. https://lkml.org/lkml/2017/10/20/763

Thanks and BR,
Ricardo

Changes since V9:
*Shortened the series to group the patches that have been reviewed thus far.
 This comprises the code to resolve 64-bit linear addresses with segmentation.
*Reworked the handling of segment resolution for rIP. This is the only case
 in which we don't have a valid instruction structure.
*Added a new function get_seg_base_addr() that resolves the segment register
 and finds its associated address.
*Added a new function resolve_default_seg() to determine the default segment
 associated with a given register. This is to simplify further the function
 resolve_seg_reg().
*Renamed function get_overridden_seg_reg_idx() as get_seg_reg_override_idx() and
 several automatic variables of such function.
*Renamed function allow_seg_reg_overrides() as check_seg_overrides().
*Renamed the function insn_get_code_seg_defaults() as
 insn_get_code_seg_params().

Changes since V8:
*Simplified error handling in the family of get_addr_ref_xx functions
 by initializing linear address to -1L.
*Reworded commit that #define's an initial state of CR0 and removed unneeded
 comment.
*Reworked get_desc() to get rid of one mutex_unlock(). Used a new local variable
 to improve readability.
*Reworked the utility functions used to obtain the segment selector:
  + get_overridden_seg_reg_idx() now only inspects the instruction to find
    segment override prefixes.
  + A new function allow_seg_reg_overrides() determines if segment override
    prefixes can be used based on the register operand in use and the nature of
    the instruction (i.e., string instructions vs not).
  + resolve_seg_reg() uses the two functions above, along with user_64bit_mode()
    to resolve the segment register index: overridden, default or ignored.
*Renamed local variables to reflect the fact that our segment registers are
 indexes and not the actual hardware registers.
*Reworded function documentation for improved readability.

Changes since V7:
*UMIP is not enabled by default.
*Relocated definition of the initial state of CR0 into processor-flags.h
*Updated uprobes to use the autogenerated INAT_PFX_xS definitions instead of
 hard-coded values.
*In insn-eval.c, refer to segment override prefixes using the autogenerated
 INAT_PFX_XS definitions.
*Removed enumeration for segment registers that reused the segment override
 instruction prefixes. Instead, a new, separate, set of #defines is used in
 arch/x86/include/asm/inat.h
*Simplified function to identify string instruction.
*Split the code used to determine the relevant segment register into two
 functions: one to inspect segment overrides and a second one to determine
 default segment registers based on the instruction and operands. A third
 function reads the segment register to obtain the segment selector.
*Reworked arithmetic to compute 32-bit and 64-bit effective addresses. Instead
 of type casts, two separate functions are used in each case.
*Removed structure to hold segment default address and operand sizes. Used
 #defines instead.
*Corrected bug when determining the limit of a segment.
*Updated various functions to use error codes from errno-base.h
*Replaced printk_ratelimited with pr_err_ratelimited.
*Corrected typos and format errors in functions' documentation.
*Fixed unimplemented handling of emulation of the SMSW instruction.
*Added documentation to file containing implementation for UMIP.
*Improved error handling in fixup_umip_exception() function.

Changes since V6:
*Reworded and added more details on the special cases of ModRM and SIB
 bytes. To avoid confusion, I omitted mentioning the involved registers
 (EBP and ESP).
*Replaced BUG() with printk_ratelimited in function get_reg_offset of
 insn-eval.c
*Removed unused utility functions that obtain a register value from pt_regs
 given a SIB base and index.
*Clarified nomenclature to call CS, DS, ES, FS, GS and SS segment registers
 and their values segment selectors.
*Reworked function resolve_seg_register to issue an error when more than
 one segment override prefix is used in the instruction.
*Added logic in resolve_seg_register to ignore segment register when in
 long mode and not using FS or GS.
*Added logic to ensure the effective address is within the limits of the
 segment in protected mode.
*Added logic to ensure segment override prefixes are ignored when resolving
 the segment of EIP and EDI with string instructions.
*Added code to make user_64bit_mode() available in CONFIG_X86_32... and
 make it return false, of course.
*Merged the two functions that obtain the default address and operand size
 of a code segment into one as they are always used together.
*Corrected logic of displacement-only addressing in long mode to make the
 displacement relative to the RIP of the next instruction.
*Reworked logic to sign-extend 32-bit memory offsets into 64-bit signed
 memory offsets. This includes more checks and puts everything together in
 a utility function.
*Removed the 'unlikely' of conditional statements as we are not in a
 critical path.
*In virtual-8086 mode, ensure that effective addresses are always less
 than 0x10000,  even when address override prefixes are used. Also, ensure
 that linear addresses have a size of 20-bits.
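
Three of the items above (sign extension of 32-bit offsets, displacements
relative to the RIP of the next instruction, and the virtual-8086 limits)
are simple arithmetic. The sketch below shows that arithmetic in user-space
C. All helper names are hypothetical and are not the functions of this
series. Truncating the effective address to 16 bits is an assumption made
for illustration; the actual code may instead reject out-of-range values.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend a 32-bit memory offset to 64 bits. */
static inline int64_t sign_extend_disp32(uint32_t disp)
{
	return (int64_t)(int32_t)disp;
}

/*
 * Hypothetical helper: in long mode, a displacement-only operand is
 * relative to the address of the next instruction, i.e. the current
 * instruction pointer plus the length of the current instruction.
 */
static inline uint64_t rip_relative_addr(uint64_t ip, uint8_t insn_len,
					 uint32_t disp)
{
	return ip + insn_len + (uint64_t)sign_extend_disp32(disp);
}

/*
 * Hypothetical helper: in virtual-8086 mode the segment base is the
 * selector shifted left by 4. The effective address is kept below
 * 0x10000 (assumption: by truncation) and the linear address is kept
 * to 20 bits.
 */
static inline uint32_t v8086_linear_addr(uint16_t sel, uint32_t eff_addr)
{
	uint32_t base = (uint32_t)sel << 4;

	return (base + (eff_addr & 0xffff)) & 0xfffff;
}
```

With these definitions, a 7-byte instruction at 0x400000 with a
displacement of 0xfffffff8 (that is, -8) refers to address 0x3fffff.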

Changes since V5:
* Relocate the page fault error code enumerations to traps.h

Changes since V4:
* Audited patches to use braces in all the branches of conditional
  statements, except those in which the conditional action only takes one
  line.
* Implemented support in 64-bit builds for both 32-bit and 64-bit tasks in the
  instruction evaluating library.
* Split segment selector function in the instruction evaluating library
  into two functions to resolve the segment type by instruction override
  or default and a separate function to actually read the segment selector.
* Fixed a bug when evaluating 32-bit effective addresses with 64-bit
  kernels.
* Split patches further for easier review.
* Use signed variables for computation of effective address.
* Fixed issue with a spurious static modifier in function insn_get_addr_ref
  found by kbuild test bot.
* Removed comparison between true and fixup_umip_exception.
* Reworked check logic when identifying erroneous vs invalid values of the
  SiB base and index.

Changes since V3:
* Limited emulation to 32-bit and 16-bit modes. For 64-bit mode, a general
  protection fault is still issued when UMIP-protected instructions are
  executed with CPL > 0.
* Expanded instruction-evaluating code to obtain segment descriptor along
  with their attributes such as base address and default address and
  operand sizes. Also, support for 16-bit encodings in protected mode was
  implemented.
* When getting a segment descriptor, this includes support to obtain those
  of a local descriptor table.
* Now the instruction-evaluating code returns -EDOM when the value of
  registers should not be used in calculating the effective address. The
  value -EINVAL is left for errors.
* Incorporate the value of the segment base address in the computation of
  linear addresses.
* Renamed new instruction evaluation library from insn-kernel.c to
  insn-eval.c
* Exported functions insn_get_reg_offset_* to obtain the register offset
  by ModRM r/m, SiB base and SiB index.
* Improved documentation of functions.
* Split patches further for easier review.

Changes since V2:
* Added new utility functions to decode the memory addresses contained in
  registers when the 16-bit addressing encodings are used. This includes
  code to obtain and compute memory addresses using segment selectors for
  real-mode address translation.
* Added support to emulate UMIP-protected instructions for virtual-8086
  tasks.
* Added self-tests for virtual-8086 mode that contains representative
  use cases: address represented as a displacement, address in registers
  and registers as operands.
* Instead of maintaining a static variable for the dummy base addresses
  of the IDT and GDT, a hard-coded value is used.
* The emulated SMSW instructions now return the value with which the CR0
  register is programmed in head_32/64.S This is: PE | MP | ET | NE | WP
  | AM. For x86_64, PG is also enabled.
* The new file arch/x86/lib/insn-utils.c is now renamed as arch/x86/lib/
  insn-kernel.c. It also has its own header. This helps keep in sync the
  kernel and objtool instruction decoders. Also, the new insn-kernel.c
  contains utility functions that are only relevant in a kernel context.
* Removed printed warnings for errors that occur when decoding instructions
  with invalid operands.
* Added more comments on fixes in the instruction-decoding MPX functions.
* Now user_64bit_mode(regs) is used instead of test_thread_flag(TIF_IA32)
  to determine if the task is 32-bit or 64-bit.
* Found and fixed a bug in insn-decoder in which X86_MODRM_RM was
  incorrectly used to obtain the mod part of the ModRM byte.
* Added more explanatory code in emulation and instruction decoding code.
  This includes a comment regarding that copy_from_user could fail if there
  exists a memory protection key in place.
* Tested code with CONFIG_X86_DECODER_SELFTEST=y and everything passes now.
* Prefixed get_reg_offset_rm with insn_ as this function is exposed
  via a header file. For clarity, this function was added in a separate
  patch.

Changes since V1:
* Virtual-8086 mode tasks are not treated in a special manner. All code
  for this purpose was removed.
* Instead of attempting to disable UMIP during a context switch or when
  entering virtual-8086 mode, UMIP remains enabled all the time. General
  protection faults that occur are fixed-up by returning dummy values as
  detailed above.
* Removed umip= kernel parameter in favor of using clearcpuid=514 to
  disable UMIP.
* Removed selftests designed to detect the absence of SIGSEGV signals when
  running in virtual-8086 mode.
* Reused code from MPX to decode instructions operands. For this purpose
  code was put in a common location.
* Fixed two bugs in MPX code that decodes operands.


Ricardo Neri (18):
  x86/mm: Relocate page fault error codes to traps.h
  x86/boot: Relocate definition of the initial state of CR0
  ptrace,x86: Make user_64bit_mode() available to 32-bit builds
  uprobes/x86: Use existing definitions for segment override prefixes
  x86/mpx: Simplify handling of errors when computing linear addresses
  x86/mpx: Use signed variables to compute effective addresses
  x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is
    not 11b
  x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0
  x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval
    file
  x86/insn-eval: Do not BUG on invalid register type
  x86/insn-eval: Add a utility function to get register offsets
  x86/insn-eval: Add utility function to identify string instructions
  x86/insn-eval: Add utility functions to get segment selector
  x86/insn-eval: Add utility function to get segment descriptor
  x86/insn-eval: Add utility functions to get segment descriptor base
    address and limit
  x86/insn-eval: Add function to get default params of code segment
  x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and
    ModRM.rm is 101b
  x86/insn-eval: Incorporate segment base in linear address computation

 arch/x86/include/asm/inat.h                 |  10 +
 arch/x86/include/asm/insn-eval.h            |  23 +
 arch/x86/include/asm/ptrace.h               |   6 +-
 arch/x86/include/asm/traps.h                |  18 +
 arch/x86/include/uapi/asm/processor-flags.h |   3 +
 arch/x86/kernel/head_32.S                   |   3 -
 arch/x86/kernel/head_64.S                   |   3 -
 arch/x86/kernel/uprobes.c                   |  15 +-
 arch/x86/lib/Makefile                       |   2 +-
 arch/x86/lib/insn-eval.c                    | 854 ++++++++++++++++++++++++++++
 arch/x86/mm/fault.c                         |  88 ++-
 arch/x86/mm/mpx.c                           | 120 +---
 12 files changed, 959 insertions(+), 186 deletions(-)
 create mode 100644 arch/x86/include/asm/insn-eval.h
 create mode 100644 arch/x86/lib/insn-eval.c

-- 
2.7.4


* [PATCH v10 01/18] x86/mm: Relocate page fault error codes to traps.h
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:55   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25   ` Ricardo Neri
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Kirill A. Shutemov, Josh Poimboeuf

Up to this point, only fault.c used the definitions of the page fault error
codes. Thus, it made sense to keep them within such file. Other portions of
code might be interested in those definitions too. For instance, the User-
Mode Instruction Prevention emulation code will use such definitions to
emulate a page fault when it is unable to successfully copy the results
of the emulated instructions to user space.

While relocating the error code enumeration, the prefix X86_ is used to
make it consistent with the rest of the definitions in traps.h. Of course,
code using the enumeration had to be updated as well. No functional changes
were performed.
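
To illustrate how code outside fault.c might consume the relocated
definitions, here is a minimal user-space sketch. The enumeration repeats
the values this patch adds to traps.h; the two helper functions are
hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* Same values as the enumeration this patch adds to traps.h. */
enum x86_pf_error_code {
	X86_PF_PROT	= 1 << 0,	/* 0: no page found, 1: protection fault */
	X86_PF_WRITE	= 1 << 1,	/* 0: read access, 1: write access */
	X86_PF_USER	= 1 << 2,	/* 0: kernel-mode, 1: user-mode access */
	X86_PF_RSVD	= 1 << 3,	/* use of reserved bit detected */
	X86_PF_INSTR	= 1 << 4,	/* fault was an instruction fetch */
	X86_PF_PK	= 1 << 5,	/* protection keys block access */
};

/* Hypothetical helper: did the fault come from a user-mode write? */
static inline bool pf_is_user_write(unsigned long error_code)
{
	unsigned long mask = X86_PF_USER | X86_PF_WRITE;

	return (error_code & mask) == mask;
}

/* Hypothetical helper: was the page present (protection fault)? */
static inline bool pf_is_protection_fault(unsigned long error_code)
{
	return (error_code & X86_PF_PROT) != 0;
}
```

An emulated fault for a failed write to user space could, for example,
carry the error code X86_PF_USER | X86_PF_WRITE. That choice is an
assumption here and is not defined by this patch.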

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: x86@kernel.org
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/traps.h | 18 +++++++++
 arch/x86/mm/fault.c          | 88 +++++++++++++++++---------------------------
 2 files changed, 52 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 5545f64..da3c3a3 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -144,4 +144,22 @@ enum {
 	X86_TRAP_IRET = 32,	/* 32, IRET Exception */
 };
 
+/*
+ * Page fault error code bits:
+ *
+ *   bit 0 ==	 0: no page found	1: protection fault
+ *   bit 1 ==	 0: read access		1: write access
+ *   bit 2 ==	 0: kernel-mode access	1: user-mode access
+ *   bit 3 ==				1: use of reserved bit detected
+ *   bit 4 ==				1: fault was an instruction fetch
+ *   bit 5 ==				1: protection keys block access
+ */
+enum x86_pf_error_code {
+	X86_PF_PROT	=		1 << 0,
+	X86_PF_WRITE	=		1 << 1,
+	X86_PF_USER	=		1 << 2,
+	X86_PF_RSVD	=		1 << 3,
+	X86_PF_INSTR	=		1 << 4,
+	X86_PF_PK	=		1 << 5,
+};
 #endif /* _ASM_X86_TRAPS_H */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e2baeaa..db71c73 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -29,26 +29,6 @@
 #include <asm/trace/exceptions.h>
 
 /*
- * Page fault error code bits:
- *
- *   bit 0 ==	 0: no page found	1: protection fault
- *   bit 1 ==	 0: read access		1: write access
- *   bit 2 ==	 0: kernel-mode access	1: user-mode access
- *   bit 3 ==				1: use of reserved bit detected
- *   bit 4 ==				1: fault was an instruction fetch
- *   bit 5 ==				1: protection keys block access
- */
-enum x86_pf_error_code {
-
-	PF_PROT		=		1 << 0,
-	PF_WRITE	=		1 << 1,
-	PF_USER		=		1 << 2,
-	PF_RSVD		=		1 << 3,
-	PF_INSTR	=		1 << 4,
-	PF_PK		=		1 << 5,
-};
-
-/*
  * Returns 0 if mmiotrace is disabled, or if the fault is not
  * handled by mmiotrace:
  */
@@ -149,7 +129,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
 	 * If it was a exec (instruction fetch) fault on NX page, then
 	 * do not ignore the fault:
 	 */
-	if (error_code & PF_INSTR)
+	if (error_code & X86_PF_INSTR)
 		return 0;
 
 	instr = (void *)convert_ip_to_linear(current, regs);
@@ -179,7 +159,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
  * siginfo so userspace can discover which protection key was set
  * on the PTE.
  *
- * If we get here, we know that the hardware signaled a PF_PK
+ * If we get here, we know that the hardware signaled a X86_PF_PK
  * fault and that there was a VMA once we got in the fault
  * handler.  It does *not* guarantee that the VMA we find here
  * was the one that we faulted on.
@@ -204,7 +184,7 @@ static void fill_sig_info_pkey(int si_code, siginfo_t *info, u32 *pkey)
 	/*
 	 * force_sig_info_fault() is called from a number of
 	 * contexts, some of which have a VMA and some of which
-	 * do not.  The PF_PK handing happens after we have a
+	 * do not.  The X86_PF_PK handing happens after we have a
 	 * valid VMA, so we should never reach this without a
 	 * valid VMA.
 	 */
@@ -697,7 +677,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
 	if (!oops_may_print())
 		return;
 
-	if (error_code & PF_INSTR) {
+	if (error_code & X86_PF_INSTR) {
 		unsigned int level;
 		pgd_t *pgd;
 		pte_t *pte;
@@ -779,7 +759,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 		 */
 		if (current->thread.sig_on_uaccess_err && signal) {
 			tsk->thread.trap_nr = X86_TRAP_PF;
-			tsk->thread.error_code = error_code | PF_USER;
+			tsk->thread.error_code = error_code | X86_PF_USER;
 			tsk->thread.cr2 = address;
 
 			/* XXX: hwpoison faults will set the wrong code. */
@@ -897,7 +877,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 	struct task_struct *tsk = current;
 
 	/* User mode accesses just cause a SIGSEGV */
-	if (error_code & PF_USER) {
+	if (error_code & X86_PF_USER) {
 		/*
 		 * It's possible to have interrupts off here:
 		 */
@@ -918,7 +898,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 		 * Instruction fetch faults in the vsyscall page might need
 		 * emulation.
 		 */
-		if (unlikely((error_code & PF_INSTR) &&
+		if (unlikely((error_code & X86_PF_INSTR) &&
 			     ((address & ~0xfff) == VSYSCALL_ADDR))) {
 			if (emulate_vsyscall(regs, address))
 				return;
@@ -931,7 +911,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 		 * are always protection faults.
 		 */
 		if (address >= TASK_SIZE_MAX)
-			error_code |= PF_PROT;
+			error_code |= X86_PF_PROT;
 
 		if (likely(show_unhandled_signals))
 			show_signal_msg(regs, error_code, address, tsk);
@@ -992,11 +972,11 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
 
 	if (!boot_cpu_has(X86_FEATURE_OSPKE))
 		return false;
-	if (error_code & PF_PK)
+	if (error_code & X86_PF_PK)
 		return true;
 	/* this checks permission keys on the VMA: */
-	if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
-				(error_code & PF_INSTR), foreign))
+	if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE),
+				       (error_code & X86_PF_INSTR), foreign))
 		return true;
 	return false;
 }
@@ -1024,7 +1004,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 	int code = BUS_ADRERR;
 
 	/* Kernel mode? Handle exceptions or die: */
-	if (!(error_code & PF_USER)) {
+	if (!(error_code & X86_PF_USER)) {
 		no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
 		return;
 	}
@@ -1052,14 +1032,14 @@ static noinline void
 mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 	       unsigned long address, u32 *pkey, unsigned int fault)
 {
-	if (fatal_signal_pending(current) && !(error_code & PF_USER)) {
+	if (fatal_signal_pending(current) && !(error_code & X86_PF_USER)) {
 		no_context(regs, error_code, address, 0, 0);
 		return;
 	}
 
 	if (fault & VM_FAULT_OOM) {
 		/* Kernel mode? Handle exceptions or die: */
-		if (!(error_code & PF_USER)) {
+		if (!(error_code & X86_PF_USER)) {
 			no_context(regs, error_code, address,
 				   SIGSEGV, SEGV_MAPERR);
 			return;
@@ -1084,16 +1064,16 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 
 static int spurious_fault_check(unsigned long error_code, pte_t *pte)
 {
-	if ((error_code & PF_WRITE) && !pte_write(*pte))
+	if ((error_code & X86_PF_WRITE) && !pte_write(*pte))
 		return 0;
 
-	if ((error_code & PF_INSTR) && !pte_exec(*pte))
+	if ((error_code & X86_PF_INSTR) && !pte_exec(*pte))
 		return 0;
 	/*
 	 * Note: We do not do lazy flushing on protection key
-	 * changes, so no spurious fault will ever set PF_PK.
+	 * changes, so no spurious fault will ever set X86_PF_PK.
 	 */
-	if ((error_code & PF_PK))
+	if ((error_code & X86_PF_PK))
 		return 1;
 
 	return 1;
@@ -1139,8 +1119,8 @@ spurious_fault(unsigned long error_code, unsigned long address)
 	 * change, so user accesses are not expected to cause spurious
 	 * faults.
 	 */
-	if (error_code != (PF_WRITE | PF_PROT)
-	    && error_code != (PF_INSTR | PF_PROT))
+	if (error_code != (X86_PF_WRITE | X86_PF_PROT) &&
+	    error_code != (X86_PF_INSTR | X86_PF_PROT))
 		return 0;
 
 	pgd = init_mm.pgd + pgd_index(address);
@@ -1200,19 +1180,19 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
 	 * always an unconditional error and can never result in
 	 * a follow-up action to resolve the fault, like a COW.
 	 */
-	if (error_code & PF_PK)
+	if (error_code & X86_PF_PK)
 		return 1;
 
 	/*
 	 * Make sure to check the VMA so that we do not perform
-	 * faults just to hit a PF_PK as soon as we fill in a
+	 * faults just to hit a X86_PF_PK as soon as we fill in a
 	 * page.
 	 */
-	if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
-				(error_code & PF_INSTR), foreign))
+	if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE),
+				       (error_code & X86_PF_INSTR), foreign))
 		return 1;
 
-	if (error_code & PF_WRITE) {
+	if (error_code & X86_PF_WRITE) {
 		/* write, present and write, not present: */
 		if (unlikely(!(vma->vm_flags & VM_WRITE)))
 			return 1;
@@ -1220,7 +1200,7 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
 	}
 
 	/* read, present: */
-	if (unlikely(error_code & PF_PROT))
+	if (unlikely(error_code & X86_PF_PROT))
 		return 1;
 
 	/* read, not present: */
@@ -1243,7 +1223,7 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
 	if (!static_cpu_has(X86_FEATURE_SMAP))
 		return false;
 
-	if (error_code & PF_USER)
+	if (error_code & X86_PF_USER)
 		return false;
 
 	if (!user_mode(regs) && (regs->flags & X86_EFLAGS_AC))
@@ -1296,7 +1276,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 * protection error (error_code & 9) == 0.
 	 */
 	if (unlikely(fault_in_kernel_space(address))) {
-		if (!(error_code & (PF_RSVD | PF_USER | PF_PROT))) {
+		if (!(error_code & (X86_PF_RSVD | X86_PF_USER | X86_PF_PROT))) {
 			if (vmalloc_fault(address) >= 0)
 				return;
 
@@ -1324,7 +1304,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	if (unlikely(kprobes_fault(regs)))
 		return;
 
-	if (unlikely(error_code & PF_RSVD))
+	if (unlikely(error_code & X86_PF_RSVD))
 		pgtable_bad(regs, error_code, address);
 
 	if (unlikely(smap_violation(error_code, regs))) {
@@ -1350,7 +1330,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 */
 	if (user_mode(regs)) {
 		local_irq_enable();
-		error_code |= PF_USER;
+		error_code |= X86_PF_USER;
 		flags |= FAULT_FLAG_USER;
 	} else {
 		if (regs->flags & X86_EFLAGS_IF)
@@ -1359,9 +1339,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
-	if (error_code & PF_WRITE)
+	if (error_code & X86_PF_WRITE)
 		flags |= FAULT_FLAG_WRITE;
-	if (error_code & PF_INSTR)
+	if (error_code & X86_PF_INSTR)
 		flags |= FAULT_FLAG_INSTRUCTION;
 
 	/*
@@ -1381,7 +1361,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 * space check, thus avoiding the deadlock:
 	 */
 	if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
-		if ((error_code & PF_USER) == 0 &&
+		if (!(error_code & X86_PF_USER) &&
 		    !search_exception_tables(regs->ip)) {
 			bad_area_nosemaphore(regs, error_code, address, NULL);
 			return;
@@ -1408,7 +1388,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 		bad_area(regs, error_code, address);
 		return;
 	}
-	if (error_code & PF_USER) {
+	if (error_code & X86_PF_USER) {
 		/*
 		 * Accessing the stack below %sp is always a bug.
 		 * The large cushion allows instructions like enter
-- 
2.7.4


* [PATCH v10 02/18] x86/boot: Relocate definition of the initial state of CR0
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 01/18] x86/mm: Relocate page fault error codes to traps.h Ricardo Neri
@ 2017-10-27 20:25   ` Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 03/18] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Ricardo Neri
                     ` (15 subsequent siblings)
  17 siblings, 0 replies; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Andy Lutomirski,
	Borislav Petkov, Dave Hansen, Denys Vlasenko, Josh Poimboeuf,
	Linus Torvalds, linux-arch, linux-mm

Both head_32.S and head_64.S utilize the same value to initialize the
control register CR0. Also, other parts of the kernel might want to access
this initial definition (e.g., emulation code for User-Mode Instruction
Prevention uses this state to provide a sane dummy value for CR0 when
emulating the smsw instruction). Thus, relocate this definition to a
header file from which it can be conveniently accessed.
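
For reference, the sketch below evaluates CR0_STATE in user space. The
X86_CR0_* values are written out from the architectural bit positions of
CR0 rather than taken from processor-flags.h, and the two helpers are
hypothetical; in particular, returning only the low 16 bits for a 16-bit
smsw operand is an assumption about how an emulator might use the value.

```c
#include <stdint.h>

/* Architectural CR0 bit positions, spelled out for this example. */
#define X86_CR0_PE	(UINT32_C(1) << 0)	/* Protection Enable */
#define X86_CR0_MP	(UINT32_C(1) << 1)	/* Monitor Coprocessor */
#define X86_CR0_ET	(UINT32_C(1) << 4)	/* Extension Type */
#define X86_CR0_NE	(UINT32_C(1) << 5)	/* Numeric Error */
#define X86_CR0_WP	(UINT32_C(1) << 16)	/* Write Protect */
#define X86_CR0_AM	(UINT32_C(1) << 18)	/* Alignment Mask */
#define X86_CR0_PG	(UINT32_C(1) << 31)	/* Paging */

#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
			 X86_CR0_PG)

/* Hypothetical helper: the value head_32.S loads before paging is on. */
static inline uint32_t cr0_state_without_paging(void)
{
	return CR0_STATE & ~X86_CR0_PG;
}

/* Hypothetical helper: dummy machine status word for a 16-bit smsw. */
static inline uint16_t smsw_dummy_value16(void)
{
	return (uint16_t)(CR0_STATE & 0xffff);
}
```

With these bit positions, CR0_STATE evaluates to 0x80050033, and to
0x00050033 once X86_CR0_PG is masked out.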

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Suggested-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/uapi/asm/processor-flags.h | 3 +++
 arch/x86/kernel/head_32.S                   | 3 ---
 arch/x86/kernel/head_64.S                   | 3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index 185f3d1..39946d0 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -151,5 +151,8 @@
 #define CX86_ARR_BASE	0xc4
 #define CX86_RCR_BASE	0xdc
 
+#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
+			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
+			 X86_CR0_PG)
 
 #endif /* _UAPI_ASM_X86_PROCESSOR_FLAGS_H */
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 9ed3074..c3cfc65 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -211,9 +211,6 @@ ENTRY(startup_32_smp)
 #endif
 
 .Ldefault_entry:
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl $(CR0_STATE & ~X86_CR0_PG),%eax
 	movl %eax,%cr0
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 99b1262..701f3d9 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -153,9 +153,6 @@ ENTRY(secondary_startup_64)
 1:	wrmsr				/* Make changes effective */
 
 	/* Setup cr0 */
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl	$CR0_STATE, %eax
 	/* Make changes effective */
 	movq	%rax, %cr0
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 03/18] ptrace,x86: Make user_64bit_mode() available to 32-bit builds
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 01/18] x86/mm: Relocate page fault error codes to traps.h Ricardo Neri
  2017-10-27 20:25   ` Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:55   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 04/18] uprobes/x86: Use existing definitions for segment override prefixes Ricardo Neri
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

In its current form, user_64bit_mode() can only be used when CONFIG_X86_64
is selected. This implies that code built with CONFIG_X86_64=n cannot use
it. If a piece of code needs to be built for both CONFIG_X86_64=y and
CONFIG_X86_64=n and wants to use this function, it needs to wrap it in
an #ifdef/#endif; potentially, in multiple places.

This can be easily avoided with a single #ifdef/#endif pair within
user_64bit_mode() itself.

Suggested-by: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
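The pattern described above can be sketched in user space as follows.
DEMO_CONFIG_64BIT and DEMO_USER_CS are hypothetical stand-ins for
CONFIG_X86_64 and __USER_CS; they are not kernel symbols. The macro is
left undefined here, so the helper takes the 32-bit path.

```c
#include <stdbool.h>

/* Hypothetical selector value standing in for __USER_CS. */
#define DEMO_USER_CS	0x33u

/*
 * The only #ifdef lives inside the helper. With DEMO_CONFIG_64BIT
 * undefined, the helper always reports "not 64-bit".
 */
bool demo_user_64bit_mode(unsigned long cs)
{
#ifdef DEMO_CONFIG_64BIT
	return cs == DEMO_USER_CS;
#else
	(void)cs;	/* unused on the 32-bit path */
	return false;
#endif
}

/* A caller builds for both configurations without its own #ifdef. */
int demo_address_bytes(unsigned long cs)
{
	return demo_user_64bit_mode(cs) ? 8 : 4;
}
```

Callers such as demo_address_bytes() stay free of conditional
compilation, which is the point of moving the check into the helper.
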
 arch/x86/include/asm/ptrace.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 91c04c8..e2afbf6 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -135,9 +135,9 @@ static inline int v8086_mode(struct pt_regs *regs)
 #endif
 }
 
-#ifdef CONFIG_X86_64
 static inline bool user_64bit_mode(struct pt_regs *regs)
 {
+#ifdef CONFIG_X86_64
 #ifndef CONFIG_PARAVIRT
 	/*
 	 * On non-paravirt systems, this is the only long mode CPL 3
@@ -148,8 +148,12 @@ static inline bool user_64bit_mode(struct pt_regs *regs)
 	/* Headers are too twisted for this to go in paravirt.h. */
 	return regs->cs == __USER_CS || regs->cs == pv_info.extra_user_64bit_cs;
 #endif
+#else /* !CONFIG_X86_64 */
+	return false;
+#endif
 }
 
+#ifdef CONFIG_X86_64
 #define current_user_stack_pointer()	current_pt_regs()->sp
 #define compat_user_stack_pointer()	current_pt_regs()->sp
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 04/18] uprobes/x86: Use existing definitions for segment override prefixes
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (2 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 03/18] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:56   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 05/18] x86/mpx: Simplify handling of errors when computing linear addresses Ricardo Neri
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Denys Vlasenko,
	Srikar Dronamraju

Rather than using hard-coded values of the segment override prefixes,
leverage the existing definitions provided in inat.h.

Suggested-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
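For illustration only, the idea of matching on named prefix constants
rather than on bare numbers can be sketched outside the kernel like
this. The DEMO_ names are made up; the byte values are the legacy prefix
encodings listed in the Intel SDM. The sketch matches raw bytes and does
not use the inat.h attribute lookup that the patch relies on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy prefix encodings (illustrative names, not the INAT_PFX_* ones). */
enum demo_prefix {
	DEMO_PFX_ES	= 0x26,
	DEMO_PFX_CS	= 0x2e,
	DEMO_PFX_SS	= 0x36,
	DEMO_PFX_DS	= 0x3e,
	DEMO_PFX_LOCK	= 0xf0,
};

/* Return true if any of the given prefix bytes is one of the above. */
bool demo_is_prefix_bad(const uint8_t *prefixes, size_t nbytes)
{
	for (size_t i = 0; i < nbytes; i++) {
		switch (prefixes[i]) {
		case DEMO_PFX_ES:
		case DEMO_PFX_CS:
		case DEMO_PFX_SS:
		case DEMO_PFX_DS:
		case DEMO_PFX_LOCK:
			return true;
		default:
			break;
		}
	}

	return false;
}
```

With names in the case labels, the mapping between a value and its
meaning no longer depends on a trailing comment being kept accurate.
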
 arch/x86/kernel/uprobes.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 495c776..a3755d2 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -271,12 +271,15 @@ static bool is_prefix_bad(struct insn *insn)
 	int i;
 
 	for (i = 0; i < insn->prefixes.nbytes; i++) {
-		switch (insn->prefixes.bytes[i]) {
-		case 0x26:	/* INAT_PFX_ES   */
-		case 0x2E:	/* INAT_PFX_CS   */
-		case 0x36:	/* INAT_PFX_DS   */
-		case 0x3E:	/* INAT_PFX_SS   */
-		case 0xF0:	/* INAT_PFX_LOCK */
+		insn_attr_t attr;
+
+		attr = inat_get_opcode_attribute(insn->prefixes.bytes[i]);
+		switch (attr) {
+		case INAT_MAKE_PREFIX(INAT_PFX_ES):
+		case INAT_MAKE_PREFIX(INAT_PFX_CS):
+		case INAT_MAKE_PREFIX(INAT_PFX_DS):
+		case INAT_MAKE_PREFIX(INAT_PFX_SS):
+		case INAT_MAKE_PREFIX(INAT_PFX_LOCK):
 			return true;
 		}
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 05/18] x86/mpx: Simplify handling of errors when computing linear addresses
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (3 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 04/18] uprobes/x86: Use existing definitions for segment override prefixes Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:56   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 06/18] x86/mpx: Use signed variables to compute effective addresses Ricardo Neri
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren, Nathan Howard,
	Adan Hawthorn, Joe Perches

When errors occur in the computation of the linear address, -1L is
returned. Rather than having a separate return path for errors, the
variable used to return the computed linear address can be initialized
with the error value. Hence, only one return path is needed. This makes
the function easier to read.

While here, ensure that the error value is -1L, a 64-bit value, rather
than -1, a 32-bit value.

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
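A minimal user-space sketch of the single-exit pattern follows. The
function and parameter names are hypothetical and the register file is
reduced to a plain array; only the control flow mirrors the change.

```c
#include <stddef.h>

/*
 * Hypothetical lookup: return the slot for regno, or a negative value
 * when regno is outside the register array.
 */
int demo_reg_offset(int regno, int nregs)
{
	return (regno >= 0 && regno < nregs) ? regno : -1;
}

/*
 * The result starts out as the error value, so every failure can jump
 * to the one and only return statement.
 */
unsigned long demo_get_addr(const unsigned long *regs, int nregs,
			    int regno, long disp)
{
	unsigned long addr = -1L;
	int offset;

	if (regs == NULL)
		goto out;

	offset = demo_reg_offset(regno, nregs);
	if (offset < 0)
		goto out;

	addr = regs[offset] + (unsigned long)disp;
out:
	return addr;
}
```

Every failure leaves addr at its initial value, so callers compare the
result against (unsigned long)-1L to detect an error.
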
 arch/x86/mm/mpx.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 9ceaa95..f4c48a0 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -138,7 +138,7 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
  */
 static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	unsigned long addr, base, indx;
+	unsigned long addr = -1L, base, indx;
 	int addr_offset, base_offset, indx_offset;
 	insn_byte_t sib;
 
@@ -149,17 +149,17 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
 		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 		if (addr_offset < 0)
-			goto out_err;
+			goto out;
 		addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
 			if (base_offset < 0)
-				goto out_err;
+				goto out;
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
 			if (indx_offset < 0)
-				goto out_err;
+				goto out;
 
 			base = regs_get_register(regs, base_offset);
 			indx = regs_get_register(regs, indx_offset);
@@ -167,14 +167,13 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			if (addr_offset < 0)
-				goto out_err;
+				goto out;
 			addr = regs_get_register(regs, addr_offset);
 		}
 		addr += insn->displacement.value;
 	}
+out:
 	return (void __user *)addr;
-out_err:
-	return (void __user *)-1;
 }
 
 static int mpx_insn_decode(struct insn *insn,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 06/18] x86/mpx: Use signed variables to compute effective addresses
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (4 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 05/18] x86/mpx: Simplify handling of errors when computing linear addresses Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:57   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 07/18] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b Ricardo Neri
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren, Nathan Howard,
	Adan Hawthorn, Joe Perches

Even though memory addresses are unsigned, the operands used to compute the
effective address do have a sign. This is true for ModRM.rm, SIB.base,
SIB.index as well as the displacement bytes. Thus, signed variables shall
be used when computing the effective address from these operands. Once the
signed effective address has been computed, it is cast to an unsigned
long to determine the linear address.

Variables are renamed to better reflect the type of address being
computed.

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
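To see why the sign matters, consider a one-byte displacement of 0x80,
which encodes -128. The sketch below uses made-up helper names and
assumes the base value fits in a long; it is not the kernel code.

```c
#include <stdint.h>

/* Sign-extend a raw 8-bit displacement, as the CPU does for disp8. */
long demo_disp8(uint8_t raw)
{
	return raw >= 0x80 ? (long)raw - 0x100 : (long)raw;
}

/*
 * Compute the effective address with signed arithmetic and convert to
 * an unsigned linear address only at the very end.
 */
unsigned long demo_linear_addr(unsigned long base_reg, uint8_t raw_disp8)
{
	long eff_addr = (long)base_reg;

	eff_addr += demo_disp8(raw_disp8);

	return (unsigned long)eff_addr;
}
```

With a base of 0x1000 and a raw displacement byte of 0x80, the result is
0xf80. Adding the byte as an unsigned quantity would give 0x1080
instead.
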
 arch/x86/mm/mpx.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index f4c48a0..57e5bf5 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -138,8 +138,9 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
  */
 static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	unsigned long addr = -1L, base, indx;
 	int addr_offset, base_offset, indx_offset;
+	unsigned long linear_addr = -1L;
+	long eff_addr, base, indx;
 	insn_byte_t sib;
 
 	insn_get_modrm(insn);
@@ -150,7 +151,8 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 		if (addr_offset < 0)
 			goto out;
-		addr = regs_get_register(regs, addr_offset);
+
+		eff_addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
@@ -163,17 +165,23 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 
 			base = regs_get_register(regs, base_offset);
 			indx = regs_get_register(regs, indx_offset);
-			addr = base + indx * (1 << X86_SIB_SCALE(sib));
+
+			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			if (addr_offset < 0)
 				goto out;
-			addr = regs_get_register(regs, addr_offset);
+
+			eff_addr = regs_get_register(regs, addr_offset);
 		}
-		addr += insn->displacement.value;
+
+		eff_addr += insn->displacement.value;
 	}
+
+	linear_addr = (unsigned long)eff_addr;
+
 out:
-	return (void __user *)addr;
+	return (void __user *)linear_addr;
 }
 
 static int mpx_insn_decode(struct insn *insn,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 07/18] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (5 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 06/18] x86/mpx: Use signed variables to compute effective addresses Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:57   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 08/18] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0 Ricardo Neri
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren, Nathan Howard,
	Adan Hawthorn, Joe Perches

Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod != 11b and
ModRM.rm = 100b, indexed register-indirect addressing is used. In other
words, a SIB byte follows the ModRM byte. In the specific case of
SIB.index = 100b, the scale*index portion of the computation of the
effective address is null. To signal this particular situation to
callers, get_reg_offset() can return -EDOM (-EINVAL continues to indicate
an error when decoding the SIB byte).

An example of this situation is the following instruction:

   8b 4c 23 80       mov -0x80(%rbx,%riz,1),%rcx
   ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
   SIB:              0x23 [scale:0b][index:100b][base:11b]
   Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)

The %riz 'register' indicates a null index.

In long mode, a REX prefix may be used. When a REX prefix is present,
REX.X adds a fourth bit to the register selection of SIB.index. This makes
it possible to refer to all 16 general-purpose registers. When REX.X is 1b
and SIB.index is 100b, the index register is %r12. In our example, this
would look like:

   42 8b 4c 23 80    mov -0x80(%rbx,%r12,1),%rcx
   REX:              0x42 [W:0b][R:0b][X:1b][B:0b]
   ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
   SIB:              0x23 [scale:0b][.X: 1b, index:100b][.B:0b, base:11b]
   Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)

%r12 is a valid register to use in the scale*index part of the effective
address computation.

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
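The byte values used in the two examples above can be checked with a
small user-space sketch. The DEMO_ macros and the function name are
hypothetical; the field layouts (mod/reg/rm, scale/index/base, and REX.X
in bit 1) follow the Intel SDM.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* ModRM: mod[7:6] reg[5:3] rm[2:0]. SIB: scale[7:6] index[5:3] base[2:0]. */
#define DEMO_MODRM_MOD(b)	(((b) >> 6) & 0x3)
#define DEMO_MODRM_RM(b)	((b) & 0x7)
#define DEMO_SIB_INDEX(b)	(((b) >> 3) & 0x7)
#define DEMO_SIB_BASE(b)	((b) & 0x7)
#define DEMO_REX_X(b)		(((b) >> 1) & 0x1)

/* The example bytes: ModRM 0x4c selects a SIB byte, SIB 0x23 has index 100b. */
static_assert(DEMO_MODRM_MOD(0x4c) == 1, "mod");
static_assert(DEMO_MODRM_RM(0x4c) == 4, "rm");
static_assert(DEMO_SIB_INDEX(0x23) == 4, "index");
static_assert(DEMO_SIB_BASE(0x23) == 3, "base");

/*
 * Return the index register number (0..15), or -EDOM when the encoding
 * means "no index". Pass rex as 0 when no REX prefix is present.
 */
int demo_sib_index_regno(uint8_t rex, uint8_t modrm, uint8_t sib)
{
	int regno = DEMO_SIB_INDEX(sib);

	if (DEMO_REX_X(rex))
		regno += 8;

	if (DEMO_MODRM_MOD(modrm) != 3 && regno == 4)
		return -EDOM;

	return regno;
}
```

For the first example (no REX prefix) the helper returns -EDOM; for the
second (REX = 0x42) it returns 12, that is, %r12.
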
 arch/x86/mm/mpx.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 57e5bf5..2ad1d4a 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -110,6 +110,15 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 		regno = X86_SIB_INDEX(insn->sib.value);
 		if (X86_REX_X(insn->rex_prefix.value))
 			regno += 8;
+
+		/*
+		 * If ModRM.mod != 3 and SIB.index = 4 the scale*index
+		 * portion of the address computation is null. This is
+		 * true only if REX.X is 0. In such a case, the SIB index
+		 * is used in the address computation.
+		 */
+		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
+			return -EDOM;
 		break;
 
 	case REG_TYPE_BASE:
@@ -160,11 +169,19 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 				goto out;
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-			if (indx_offset < 0)
+			/*
+			 * A negative offset generally means a error, except
+			 * -EDOM, which means that the contents of the register
+			 * should not be used as index.
+			 */
+			if (indx_offset == -EDOM)
+				indx = 0;
+			else if (indx_offset < 0)
 				goto out;
+			else
+				indx = regs_get_register(regs, indx_offset);
 
 			base = regs_get_register(regs, base_offset);
-			indx = regs_get_register(regs, indx_offset);
 
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v10 08/18] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (6 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 07/18] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:57   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 09/18] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file Ricardo Neri
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren, Nathan Howard,
	Adan Hawthorn, Joe Perches

Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that if a SIB byte is used and
SIB.base is 101b and ModRM.mod is zero, then the base part of the
effective address computation is null. To signal this situation, a -EDOM
error is returned to tell callers to ignore the base value present in the
register operand.

In this scenario, a 32-bit displacement follows the SIB byte. The
displacement is obtained when the instruction decoder parses the operands.

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
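A rough user-space sketch of how a caller might combine both special
cases is shown below. All names are hypothetical, the register file is a
plain array of 16 values, and the register numbers passed to
demo_eff_addr() are assumed to be either valid (0..15) or -EDOM.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Return the base register number (0..15), or -EDOM when ModRM.mod is 0
 * and SIB.base is 101b, in which case only a disp32 is used.
 */
int demo_sib_base_regno(uint8_t rex, uint8_t modrm, uint8_t sib)
{
	int regno = sib & 0x7;

	if (((modrm >> 6) & 0x3) == 0 && regno == 5)
		return -EDOM;

	if (rex & 0x1)		/* REX.B extends the base field */
		regno += 8;

	return regno;
}

/* base + index * 2^scale_bits + disp; an -EDOM register contributes 0. */
long demo_eff_addr(const long *regs, int base_regno, int indx_regno,
		   unsigned int scale_bits, long disp)
{
	long base = (base_regno == -EDOM) ? 0 : regs[base_regno];
	long indx = (indx_regno == -EDOM) ? 0 : regs[indx_regno];

	return base + indx * (1L << scale_bits) + disp;
}
```

As in the patch, the check for base 101b is made before REX.B is
applied, so the mod = 0 case is treated the same with or without REX.B.
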
 arch/x86/mm/mpx.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 2ad1d4a..581a960 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -123,6 +123,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 
 	case REG_TYPE_BASE:
 		regno = X86_SIB_BASE(insn->sib.value);
+		/*
+		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
+		 * register-indirect addressing is 0. In this case, a
+		 * 32-bit displacement follows the SIB byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
 		if (X86_REX_B(insn->rex_prefix.value))
 			regno += 8;
 		break;
@@ -164,16 +172,22 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		eff_addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
+			/*
+			 * Negative values in the base and index offset means
+			 * an error when decoding the SIB byte. Except -EDOM,
+			 * which means that the registers should not be used
+			 * in the address computation.
+			 */
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-			if (base_offset < 0)
+			if (base_offset == -EDOM)
+				base = 0;
+			else if (base_offset < 0)
 				goto out;
+			else
+				base = regs_get_register(regs, base_offset);
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-			/*
-			 * A negative offset generally means a error, except
-			 * -EDOM, which means that the contents of the register
-			 * should not be used as index.
-			 */
+
 			if (indx_offset == -EDOM)
 				indx = 0;
 			else if (indx_offset < 0)
@@ -181,8 +195,6 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			else
 				indx = regs_get_register(regs, indx_offset);
 
-			base = regs_get_register(regs, base_offset);
-
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-- 
2.7.4


* [PATCH v10 09/18] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (7 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 08/18] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0 Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:58   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 10/18] x86/insn-eval: Do not BUG on invalid register type Ricardo Neri
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Other kernel submodules can benefit from using the utility functions
defined in mpx.c to obtain the addresses and values of operands contained
in the general purpose registers. An instance of this is the emulation code
used for instructions protected by the Intel User-Mode Instruction
Prevention feature.

Thus, these functions are relocated to a new insn-eval.c file. The reason
to not relocate these utilities into insn.c is that the latter solely
analyses instructions given by a struct insn without any knowledge of the
meaning of the values of instruction operands. The new insn-eval.c is
meant to resolve userspace linear addresses based on the contents of the
instruction operands as well as the contents of the pt_regs structure.

These utilities come with a separate header. This avoids taking insn.c
out of sync with the instruction decoders under tools/obj and tools/perf.
It also avoids adding cumbersome #ifdef's for the #include'd files
required to decode instructions in a kernel context.

Functions are simply relocated. There are no functional or indentation
changes.
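
For readers unfamiliar with the relocated code, the SIB-based address
computation it performs can be sketched in user space as below. This is
only an illustration under stated assumptions: sib_effective_addr() is a
hypothetical name, the register file is a plain array of eight values
rather than pt_regs, and REX prefixes are not handled.

```c
#include <stdint.h>

/*
 * Hypothetical user-space sketch, not the kernel function.
 * regs[] holds ax, cx, dx, bx, sp, bp, si, di in encoding order.
 * SIB layout: scale in bits 7:6, index in bits 5:3, base in bits 2:0.
 */
long sib_effective_addr(uint8_t modrm, uint8_t sib, const long regs[8],
			long disp)
{
	unsigned int mod = (modrm >> 6) & 0x3;
	unsigned int scale = (sib >> 6) & 0x3;
	unsigned int index = (sib >> 3) & 0x7;
	unsigned int base = sib & 0x7;
	long base_val, index_val;

	/* ModRM.mod == 0 and SIB.base == 5: no base register is used. */
	base_val = (mod == 0 && base == 5) ? 0 : regs[base];

	/* SIB.index == 4 (no REX.X here): no index register is used. */
	index_val = (index == 4) ? 0 : regs[index];

	return base_val + index_val * (1L << scale) + disp;
}
```

The two special cases correspond to the places where get_reg_offset()
returns -EDOM in the relocated code.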

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
The checkpatch script issues the following warning with this
commit:

WARNING: Avoid crashing the kernel - try using WARN_ON & recovery code
rather than BUG() or BUG_ON()
+               BUG();

This warning will be fixed in a subsequent patch.
---
 arch/x86/include/asm/insn-eval.h |  16 ++++
 arch/x86/lib/Makefile            |   2 +-
 arch/x86/lib/insn-eval.c         | 163 +++++++++++++++++++++++++++++++++++++++
 arch/x86/mm/mpx.c                | 156 +------------------------------------
 4 files changed, 182 insertions(+), 155 deletions(-)
 create mode 100644 arch/x86/include/asm/insn-eval.h
 create mode 100644 arch/x86/lib/insn-eval.c

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
new file mode 100644
index 0000000..5cab1b1
--- /dev/null
+++ b/arch/x86/include/asm/insn-eval.h
@@ -0,0 +1,16 @@
+#ifndef _ASM_X86_INSN_EVAL_H
+#define _ASM_X86_INSN_EVAL_H
+/*
+ * A collection of utility functions for x86 instruction analysis to be
+ * used in a kernel context. Useful when, for instance, making sense
+ * of the registers indicated by operands.
+ */
+
+#include <linux/compiler.h>
+#include <linux/bug.h>
+#include <linux/err.h>
+#include <asm/ptrace.h>
+
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+
+#endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 34a7413..675d7b0 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -23,7 +23,7 @@ lib-y := delay.o misc.o cmdline.o cpu.o
 lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
 lib-y += memcpy_$(BITS).o
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
-lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
+lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
 lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
 
 obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
new file mode 100644
index 0000000..df9418c
--- /dev/null
+++ b/arch/x86/lib/insn-eval.c
@@ -0,0 +1,163 @@
+/*
+ * Utility functions for x86 operand and address decoding
+ *
+ * Copyright (C) Intel Corporation 2017
+ */
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <asm/inat.h>
+#include <asm/insn.h>
+#include <asm/insn-eval.h>
+
+enum reg_type {
+	REG_TYPE_RM = 0,
+	REG_TYPE_INDEX,
+	REG_TYPE_BASE,
+};
+
+static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
+			  enum reg_type type)
+{
+	int regno = 0;
+
+	static const int regoff[] = {
+		offsetof(struct pt_regs, ax),
+		offsetof(struct pt_regs, cx),
+		offsetof(struct pt_regs, dx),
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, sp),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+		offsetof(struct pt_regs, r8),
+		offsetof(struct pt_regs, r9),
+		offsetof(struct pt_regs, r10),
+		offsetof(struct pt_regs, r11),
+		offsetof(struct pt_regs, r12),
+		offsetof(struct pt_regs, r13),
+		offsetof(struct pt_regs, r14),
+		offsetof(struct pt_regs, r15),
+#endif
+	};
+	int nr_registers = ARRAY_SIZE(regoff);
+	/*
+	 * Don't possibly decode a 32-bit instructions as
+	 * reading a 64-bit-only register.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64) && !insn->x86_64)
+		nr_registers -= 8;
+
+	switch (type) {
+	case REG_TYPE_RM:
+		regno = X86_MODRM_RM(insn->modrm.value);
+		if (X86_REX_B(insn->rex_prefix.value))
+			regno += 8;
+		break;
+
+	case REG_TYPE_INDEX:
+		regno = X86_SIB_INDEX(insn->sib.value);
+		if (X86_REX_X(insn->rex_prefix.value))
+			regno += 8;
+
+		/*
+		 * If ModRM.mod != 3 and SIB.index = 4 the scale*index
+		 * portion of the address computation is null. This is
+		 * true only if REX.X is 0. In such a case, the SIB index
+		 * is used in the address computation.
+		 */
+		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
+			return -EDOM;
+		break;
+
+	case REG_TYPE_BASE:
+		regno = X86_SIB_BASE(insn->sib.value);
+		/*
+		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
+		 * register-indirect addressing is 0. In this case, a
+		 * 32-bit displacement follows the SIB byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
+		if (X86_REX_B(insn->rex_prefix.value))
+			regno += 8;
+		break;
+
+	default:
+		pr_err("invalid register type");
+		BUG();
+		break;
+	}
+
+	if (regno >= nr_registers) {
+		WARN_ONCE(1, "decoded an instruction with an invalid register");
+		return -EINVAL;
+	}
+	return regoff[regno];
+}
+
+/*
+ * return the address being referenced be instruction
+ * for rm=3 returning the content of the rm reg
+ * for rm!=3 calculates the address using SIB and Disp
+ */
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+{
+	int addr_offset, base_offset, indx_offset;
+	unsigned long linear_addr = -1L;
+	long eff_addr, base, indx;
+	insn_byte_t sib;
+
+	insn_get_modrm(insn);
+	insn_get_sib(insn);
+	sib = insn->sib.value;
+
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
+		if (addr_offset < 0)
+			goto out;
+
+		eff_addr = regs_get_register(regs, addr_offset);
+	} else {
+		if (insn->sib.nbytes) {
+			/*
+			 * Negative values in the base and index offset means
+			 * an error when decoding the SIB byte. Except -EDOM,
+			 * which means that the registers should not be used
+			 * in the address computation.
+			 */
+			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
+			if (base_offset == -EDOM)
+				base = 0;
+			else if (base_offset < 0)
+				goto out;
+			else
+				base = regs_get_register(regs, base_offset);
+
+			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
+
+			if (indx_offset == -EDOM)
+				indx = 0;
+			else if (indx_offset < 0)
+				goto out;
+			else
+				indx = regs_get_register(regs, indx_offset);
+
+			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+		} else {
+			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
+			if (addr_offset < 0)
+				goto out;
+
+			eff_addr = regs_get_register(regs, addr_offset);
+		}
+
+		eff_addr += insn->displacement.value;
+	}
+
+	linear_addr = (unsigned long)eff_addr;
+
+out:
+	return (void __user *)linear_addr;
+}
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 581a960..2878205 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -12,6 +12,7 @@
 #include <linux/sched/sysctl.h>
 
 #include <asm/insn.h>
+#include <asm/insn-eval.h>
 #include <asm/mman.h>
 #include <asm/mmu_context.h>
 #include <asm/mpx.h>
@@ -60,159 +61,6 @@ static unsigned long mpx_mmap(unsigned long len)
 	return addr;
 }
 
-enum reg_type {
-	REG_TYPE_RM = 0,
-	REG_TYPE_INDEX,
-	REG_TYPE_BASE,
-};
-
-static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
-			  enum reg_type type)
-{
-	int regno = 0;
-
-	static const int regoff[] = {
-		offsetof(struct pt_regs, ax),
-		offsetof(struct pt_regs, cx),
-		offsetof(struct pt_regs, dx),
-		offsetof(struct pt_regs, bx),
-		offsetof(struct pt_regs, sp),
-		offsetof(struct pt_regs, bp),
-		offsetof(struct pt_regs, si),
-		offsetof(struct pt_regs, di),
-#ifdef CONFIG_X86_64
-		offsetof(struct pt_regs, r8),
-		offsetof(struct pt_regs, r9),
-		offsetof(struct pt_regs, r10),
-		offsetof(struct pt_regs, r11),
-		offsetof(struct pt_regs, r12),
-		offsetof(struct pt_regs, r13),
-		offsetof(struct pt_regs, r14),
-		offsetof(struct pt_regs, r15),
-#endif
-	};
-	int nr_registers = ARRAY_SIZE(regoff);
-	/*
-	 * Don't possibly decode a 32-bit instructions as
-	 * reading a 64-bit-only register.
-	 */
-	if (IS_ENABLED(CONFIG_X86_64) && !insn->x86_64)
-		nr_registers -= 8;
-
-	switch (type) {
-	case REG_TYPE_RM:
-		regno = X86_MODRM_RM(insn->modrm.value);
-		if (X86_REX_B(insn->rex_prefix.value))
-			regno += 8;
-		break;
-
-	case REG_TYPE_INDEX:
-		regno = X86_SIB_INDEX(insn->sib.value);
-		if (X86_REX_X(insn->rex_prefix.value))
-			regno += 8;
-
-		/*
-		 * If ModRM.mod != 3 and SIB.index = 4 the scale*index
-		 * portion of the address computation is null. This is
-		 * true only if REX.X is 0. In such a case, the SIB index
-		 * is used in the address computation.
-		 */
-		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
-			return -EDOM;
-		break;
-
-	case REG_TYPE_BASE:
-		regno = X86_SIB_BASE(insn->sib.value);
-		/*
-		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
-		 * register-indirect addressing is 0. In this case, a
-		 * 32-bit displacement follows the SIB byte.
-		 */
-		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
-			return -EDOM;
-
-		if (X86_REX_B(insn->rex_prefix.value))
-			regno += 8;
-		break;
-
-	default:
-		pr_err("invalid register type");
-		BUG();
-		break;
-	}
-
-	if (regno >= nr_registers) {
-		WARN_ONCE(1, "decoded an instruction with an invalid register");
-		return -EINVAL;
-	}
-	return regoff[regno];
-}
-
-/*
- * return the address being referenced be instruction
- * for rm=3 returning the content of the rm reg
- * for rm!=3 calculates the address using SIB and Disp
- */
-static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
-{
-	int addr_offset, base_offset, indx_offset;
-	unsigned long linear_addr = -1L;
-	long eff_addr, base, indx;
-	insn_byte_t sib;
-
-	insn_get_modrm(insn);
-	insn_get_sib(insn);
-	sib = insn->sib.value;
-
-	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
-		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-		if (addr_offset < 0)
-			goto out;
-
-		eff_addr = regs_get_register(regs, addr_offset);
-	} else {
-		if (insn->sib.nbytes) {
-			/*
-			 * Negative values in the base and index offset means
-			 * an error when decoding the SIB byte. Except -EDOM,
-			 * which means that the registers should not be used
-			 * in the address computation.
-			 */
-			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-			if (base_offset == -EDOM)
-				base = 0;
-			else if (base_offset < 0)
-				goto out;
-			else
-				base = regs_get_register(regs, base_offset);
-
-			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-
-			if (indx_offset == -EDOM)
-				indx = 0;
-			else if (indx_offset < 0)
-				goto out;
-			else
-				indx = regs_get_register(regs, indx_offset);
-
-			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
-		} else {
-			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-			if (addr_offset < 0)
-				goto out;
-
-			eff_addr = regs_get_register(regs, addr_offset);
-		}
-
-		eff_addr += insn->displacement.value;
-	}
-
-	linear_addr = (unsigned long)eff_addr;
-
-out:
-	return (void __user *)linear_addr;
-}
-
 static int mpx_insn_decode(struct insn *insn,
 			   struct pt_regs *regs)
 {
@@ -325,7 +173,7 @@ siginfo_t *mpx_generate_siginfo(struct pt_regs *regs)
 	info->si_signo = SIGSEGV;
 	info->si_errno = 0;
 	info->si_code = SEGV_BNDERR;
-	info->si_addr = mpx_get_addr_ref(&insn, regs);
+	info->si_addr = insn_get_addr_ref(&insn, regs);
 	/*
 	 * We were not able to extract an address from the instruction,
 	 * probably because there was something invalid in it.
-- 
2.7.4


* [PATCH v10 10/18] x86/insn-eval: Do not BUG on invalid register type
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (8 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 09/18] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:58   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 11/18] x86/insn-eval: Add a utility function to get register offsets Ricardo Neri
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

We are not in a critical failure path. The invalid register type is caused
when trying to decode invalid instruction bytes from a user-space program.
Thus, simply print an error message and return -EINVAL. To prevent this
message from being abused by user-space programs, use the rate-limited
variant of pr_err() along with a descriptive prefix.
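
The pattern can be sketched in user space as follows. This is only an
illustration: check_reg_type() is a hypothetical stand-in, and fprintf()
takes the place of pr_err_ratelimited(), which has no user-space
equivalent.

```c
#include <errno.h>
#include <stdio.h>

/* Mirrors the enumeration in insn-eval.c. */
enum reg_type {
	REG_TYPE_RM = 0,
	REG_TYPE_INDEX,
	REG_TYPE_BASE,
};

/*
 * Hypothetical stand-in: return 0 for a known register type and
 * -EINVAL for anything else, instead of aborting the program.
 */
int check_reg_type(int type)
{
	switch (type) {
	case REG_TYPE_RM:
	case REG_TYPE_INDEX:
	case REG_TYPE_BASE:
		return 0;
	default:
		/* The kernel code uses pr_err_ratelimited() here. */
		fprintf(stderr, "insn: invalid register type: %d\n", type);
		return -EINVAL;
	}
}
```

The caller sees a negative value and can bail out, which is what
insn_get_addr_ref() already does for other decoding errors.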

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index df9418c..4931d92 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -5,10 +5,14 @@
  */
 #include <linux/kernel.h>
 #include <linux/string.h>
+#include <linux/ratelimit.h>
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
 
+#undef pr_fmt
+#define pr_fmt(fmt) "insn: " fmt
+
 enum reg_type {
 	REG_TYPE_RM = 0,
 	REG_TYPE_INDEX,
@@ -85,9 +89,8 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 		break;
 
 	default:
-		pr_err("invalid register type");
-		BUG();
-		break;
+		pr_err_ratelimited("invalid register type: %d\n", type);
+		return -EINVAL;
 	}
 
 	if (regno >= nr_registers) {
-- 
2.7.4


* [PATCH v10 11/18] x86/insn-eval: Add a utility function to get register offsets
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (9 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 10/18] x86/insn-eval: Do not BUG on invalid register type Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:59   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 12/18] x86/insn-eval: Add utility function to identify string instructions Ricardo Neri
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

The function get_reg_offset() returns the offset of the register that its
argument selects, where the argument is a value of the reg_type
enumeration. Callers of this function would need the definition of that
enumeration, which is unnecessary. Instead, add helper functions for this
purpose. These functions are useful when, for instance, the caller needs
to decide whether the operand is a register or a memory location by
looking at the rm part of the ModRM byte. As of now,
insn_get_modrm_rm_off() is the only helper function that is needed.
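
As an illustration of the register-versus-memory decision mentioned
above, a caller could inspect the ModRM fields along these lines. The
helper names below are hypothetical and are not added by this patch; the
kernel provides X86_MODRM_MOD() and X86_MODRM_RM() for the same purpose.

```c
#include <stdbool.h>
#include <stdint.h>

/* ModRM layout: mod in bits 7:6, reg in bits 5:3, r/m in bits 2:0. */
unsigned int modrm_mod(uint8_t modrm)
{
	return (modrm >> 6) & 0x3;
}

unsigned int modrm_rm(uint8_t modrm)
{
	return modrm & 0x7;
}

/* ModRM.mod == 3 means the r/m field names a register, not memory. */
bool operand_is_register(uint8_t modrm)
{
	return modrm_mod(modrm) == 3;
}
```

When the operand is a register, the offset returned by
insn_get_modrm_rm_off() identifies where its value lives in pt_regs.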

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  1 +
 arch/x86/lib/insn-eval.c         | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 5cab1b1..7e8c963 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -12,5 +12,6 @@
 #include <asm/ptrace.h>
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 4931d92..405ffeb 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -100,6 +100,23 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 	return regoff[regno];
 }
 
+/**
+ * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
+ * @insn:	Instruction containing the ModRM byte
+ * @regs:	Register values as seen when entering kernel mode
+ *
+ * Returns:
+ *
+ * The register indicated by the r/m part of the ModRM byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of ModRM does not refer to a register and shall be ignored.
+ */
+int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
+{
+	return get_reg_offset(insn, regs, REG_TYPE_RM);
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
-- 
2.7.4


* [PATCH v10 12/18] x86/insn-eval: Add utility function to identify string instructions
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (10 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 11/18] x86/insn-eval: Add a utility function to get register offsets Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 20:59   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 13/18] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

String instructions are special because, in protected mode, the linear
address is always obtained via the ES segment register in operands that
use the (E)DI register, and via the DS segment register in operands that
use the (E)SI register. Furthermore, segment override prefixes are ignored
when calculating a linear address involving the (E)DI register, whereas
they can be used when calculating linear addresses involving the (E)SI
register.

It follows that linear addresses are calculated differently for the case of
string instructions. The purpose of this utility function is to identify
such instructions for callers to determine a linear address correctly.

Note that this function only identifies string instructions; it does not
determine what segment register to use in the address computation. That is
left to callers. A subsequent commit introduces a function to determine
the segment register to use given the instruction, operands and
segment override prefixes.
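
The opcode check can be sketched outside the kernel as below. This is an
illustration only: is_string_opcode() is a hypothetical name, and it
takes the first opcode byte and the opcode length directly instead of a
struct insn. Plain comparisons replace the GNU case ranges so that the
sketch builds with any C compiler.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space equivalent of the opcode test.
 * All string instructions have a 1-byte opcode.
 */
bool is_string_opcode(uint8_t opcode, int nbytes)
{
	if (nbytes != 1)
		return false;

	return (opcode >= 0x6c && opcode <= 0x6f) ||	/* INS, OUTS */
	       (opcode >= 0xa4 && opcode <= 0xa7) ||	/* MOVS, CMPS */
	       (opcode >= 0xaa && opcode <= 0xaf);	/* STOS, LODS, SCAS */
}
```

Note that 0xa8 and 0xa9 (TEST) sit between the last two ranges and are
deliberately excluded.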

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 405ffeb..ac7b87c 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -19,6 +19,34 @@ enum reg_type {
 	REG_TYPE_BASE,
 };
 
+/**
+ * is_string_insn() - Determine if instruction is a string instruction
+ * @insn:	Instruction containing the opcode to inspect
+ *
+ * Returns:
+ *
+ * true if the instruction, determined by the opcode, is any of the
+ * string instructions as defined in the Intel Software Development manual.
+ * False otherwise.
+ */
+static bool is_string_insn(struct insn *insn)
+{
+	insn_get_opcode(insn);
+
+	/* All string instructions have a 1-byte opcode. */
+	if (insn->opcode.nbytes != 1)
+		return false;
+
+	switch (insn->opcode.bytes[0]) {
+	case 0x6c ... 0x6f:	/* INS, OUTS */
+	case 0xa4 ... 0xa7:	/* MOVS, CMPS */
+	case 0xaa ... 0xaf:	/* STOS, LODS, SCAS */
+		return true;
+	default:
+		return false;
+	}
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 			  enum reg_type type)
 {
-- 
2.7.4


* [PATCH v10 13/18] x86/insn-eval: Add utility functions to get segment selector
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (11 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 12/18] x86/insn-eval: Add utility function to identify string instructions Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 21:00   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-11-09 11:12   ` [PATCH v10 13/18] " Arnd Bergmann
  2017-10-27 20:25 ` [PATCH v10 14/18] x86/insn-eval: Add utility function to get segment descriptor Ricardo Neri
                   ` (4 subsequent siblings)
  17 siblings, 2 replies; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

When computing a linear address and segmentation is used, we need to know
the base address of the segment involved in the computation. In most
cases, the segment base address will be zero as in USER_DS/USER32_DS.
However, it may be possible that a user space program defines its own
segments via a local descriptor table. In such a case, the segment base
address may not be zero. Thus, the segment base address is needed to
correctly calculate the linear address.

If running in protected mode, the segment selector to be used when
computing a linear address is determined either by a segment override
prefix in the instruction or by the registers involved in the computation
of the effective address, in that order. Also, there are cases
when the segment override prefixes shall be ignored (i.e., code segments
are always selected by the CS segment register; string instructions always
use the ES segment register when using rDI register as operand). In long
mode, segment registers are ignored, except for FS and GS. In these two
cases, base addresses are obtained from the respective MSRs.

For clarity, this process can be split into four steps (and an equal
number of functions): determine if segment override prefixes can be used;
parse the segment override prefixes, and use them if found; if not found
or if they cannot be used, use the default segment registers associated
with the operand registers; and, once the segment register to use has been
identified, read its value to obtain the segment selector.

The method to obtain the segment selector depends on several factors. In
32-bit builds, segment selectors are saved into a pt_regs structure
when switching to kernel mode. The same is also true for virtual-8086
mode. In 64-bit builds, segmentation is mostly ignored, except when
running a program in 32-bit legacy mode. In this case, CS and SS can be
obtained from pt_regs. DS, ES, FS and GS can be read directly from
the respective segment registers.

In order to identify the segment registers, a new set of #defines is
introduced. It also includes two special identifiers. One of them
indicates when the default segment register associated with instruction
operands shall be used. Another one indicates that the contents of the
segment register shall be ignored; this identifier is used when in long
mode.
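
The selection logic described above can be sketched as follows. This is
a simplified illustration, not the code in this patch: pick_seg_reg(),
default_seg_reg() and enum operand_reg are hypothetical, the operand is
given as a register class rather than a pt_regs offset, and the final
step of reading the selector value is left out. The default-segment
rules follow the usual x86 conventions (SS for rSP/rBP, ES for rDI in
string instructions, DS otherwise).

```c
#include <stdbool.h>

/* Identifiers as introduced by this patch in asm/inat.h. */
#define INAT_SEG_REG_IGNORE	0
#define INAT_SEG_REG_DEFAULT	1
#define INAT_SEG_REG_CS		2
#define INAT_SEG_REG_SS		3
#define INAT_SEG_REG_DS		4
#define INAT_SEG_REG_ES		5
#define INAT_SEG_REG_FS		6
#define INAT_SEG_REG_GS		7

/* Hypothetical operand classes; the kernel uses pt_regs offsets. */
enum operand_reg { OP_REG_IP, OP_REG_SP, OP_REG_BP, OP_REG_DI, OP_REG_OTHER };

/* Default segment register associated with an operand register. */
int default_seg_reg(enum operand_reg reg, bool string_insn, bool long_mode)
{
	if (long_mode)
		return INAT_SEG_REG_IGNORE;

	switch (reg) {
	case OP_REG_IP:
		return INAT_SEG_REG_CS;
	case OP_REG_SP:
	case OP_REG_BP:
		return INAT_SEG_REG_SS;
	case OP_REG_DI:
		return string_insn ? INAT_SEG_REG_ES : INAT_SEG_REG_DS;
	default:
		return INAT_SEG_REG_DS;
	}
}

/* override is INAT_SEG_REG_DEFAULT when no override prefix was found. */
int pick_seg_reg(int override, enum operand_reg reg, bool string_insn,
		 bool long_mode)
{
	/* Overrides never apply to rIP or to rDI in string instructions. */
	bool can_override = reg != OP_REG_IP &&
			    !(reg == OP_REG_DI && string_insn);

	if (!can_override || override == INAT_SEG_REG_DEFAULT)
		return default_seg_reg(reg, string_insn, long_mode);

	/* In long mode only FS and GS overrides are honored. */
	if (long_mode && override != INAT_SEG_REG_FS &&
	    override != INAT_SEG_REG_GS)
		return INAT_SEG_REG_IGNORE;

	return override;
}
```

For example, an FS override on a string instruction operand that uses
rDI still resolves to ES, while the same override on any other operand
resolves to FS.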

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/inat.h |  10 ++
 arch/x86/lib/insn-eval.c    | 340 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 350 insertions(+)

diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
index 02aff08..1c78580 100644
--- a/arch/x86/include/asm/inat.h
+++ b/arch/x86/include/asm/inat.h
@@ -97,6 +97,16 @@
 #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
 #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
 
+/* Identifiers for segment registers */
+#define INAT_SEG_REG_IGNORE	0
+#define INAT_SEG_REG_DEFAULT	1
+#define INAT_SEG_REG_CS		2
+#define INAT_SEG_REG_SS		3
+#define INAT_SEG_REG_DS		4
+#define INAT_SEG_REG_ES		5
+#define INAT_SEG_REG_FS		6
+#define INAT_SEG_REG_GS		7
+
 /* Attribute search APIs */
 extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode);
 extern int inat_get_last_prefix_id(insn_byte_t last_pfx);
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index ac7b87c..6a902b1 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -9,6 +9,7 @@
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/vm86.h>
 
 #undef pr_fmt
 #define pr_fmt(fmt) "insn: " fmt
@@ -47,6 +48,345 @@ static bool is_string_insn(struct insn *insn)
 	}
 }
 
+/**
+ * get_seg_reg_override_idx() - obtain segment register override index
+ * @insn:	Valid instruction with segment override prefixes
+ *
+ * Inspect the instruction prefixes in @insn and find segment overrides, if any.
+ *
+ * Returns:
+ *
+ * A constant identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_DEFAULT is returned if no segment override
+ * prefixes were found.
+ *
+ * -EINVAL in case of error.
+ */
+static int get_seg_reg_override_idx(struct insn *insn)
+{
+	int idx = INAT_SEG_REG_DEFAULT;
+	int num_overrides = 0, i;
+
+	insn_get_prefixes(insn);
+
+	/* Look for any segment override prefixes. */
+	for (i = 0; i < insn->prefixes.nbytes; i++) {
+		insn_attr_t attr;
+
+		attr = inat_get_opcode_attribute(insn->prefixes.bytes[i]);
+		switch (attr) {
+		case INAT_MAKE_PREFIX(INAT_PFX_CS):
+			idx = INAT_SEG_REG_CS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_SS):
+			idx = INAT_SEG_REG_SS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_DS):
+			idx = INAT_SEG_REG_DS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_ES):
+			idx = INAT_SEG_REG_ES;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_FS):
+			idx = INAT_SEG_REG_FS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_GS):
+			idx = INAT_SEG_REG_GS;
+			num_overrides++;
+			break;
+		/* No default action needed. */
+		}
+	}
+
+	/* More than one segment override prefix leads to undefined behavior. */
+	if (num_overrides > 1)
+		return -EINVAL;
+
+	return idx;
+}
+
+/**
+ * check_seg_overrides() - check if segment override prefixes are allowed
+ * @insn:	Valid instruction with segment override prefixes
+ * @regoff:	Operand offset, in pt_regs, for which the check is performed
+ *
+ * For a particular register used in register-indirect addressing, determine if
+ * segment override prefixes can be used. Specifically, no overrides are allowed
+ * for rDI if used with a string instruction.
+ *
+ * Returns:
+ *
+ * True if segment override prefixes can be used with the register indicated
+ * in @regoff. False if otherwise.
+ */
+static bool check_seg_overrides(struct insn *insn, int regoff)
+{
+	if (regoff == offsetof(struct pt_regs, di) && is_string_insn(insn))
+		return false;
+
+	return true;
+}
+
+/**
+ * resolve_default_seg() - resolve default segment register index for an operand
+ * @insn:	Instruction with opcode and address size. Must be valid.
+ * @regs:	Register values as seen when entering kernel mode
+ * @off:	Operand offset, in pt_regs, for which resolution is needed
+ *
+ * Resolve the default segment register index associated with the instruction
+ * operand register indicated by @off. Such index is resolved based on defaults
+ * described in the Intel Software Development Manual.
+ *
+ * Returns:
+ *
+ * If in protected mode, a constant identifying the segment register to use,
+ * among CS, SS, ES or DS. If in long mode, INAT_SEG_REG_IGNORE.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
+{
+	if (user_64bit_mode(regs))
+		return INAT_SEG_REG_IGNORE;
+	/*
+	 * Resolve the default segment register as described in Section 3.7.4
+	 * of the Intel Software Development Manual Vol. 1:
+	 *
+	 *  + DS for all references involving r[ABCD]X, and rSI.
+	 *  + If used in a string instruction, ES for rDI. Otherwise, DS.
+	 *  + AX, CX and DX are not valid register operands in 16-bit address
+	 *    encodings but are valid for 32-bit and 64-bit encodings.
+	 *  + -EDOM is reserved to identify cases in which no register
+	 *    is used (i.e., displacement-only addressing). Use DS.
+	 *  + SS for rSP or rBP.
+	 *  + CS for rIP.
+	 */
+
+	switch (off) {
+	case offsetof(struct pt_regs, ax):
+	case offsetof(struct pt_regs, cx):
+	case offsetof(struct pt_regs, dx):
+		/* Need insn to verify address size. */
+		if (insn->addr_bytes == 2)
+			return -EINVAL;
+		/* fall through */
+	case -EDOM:
+	case offsetof(struct pt_regs, bx):
+	case offsetof(struct pt_regs, si):
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, di):
+		if (is_string_insn(insn))
+			return INAT_SEG_REG_ES;
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, bp):
+	case offsetof(struct pt_regs, sp):
+		return INAT_SEG_REG_SS;
+
+	case offsetof(struct pt_regs, ip):
+		return INAT_SEG_REG_CS;
+
+	default:
+		return -EINVAL;
+	}
+}
+
+/**
+ * resolve_seg_reg() - obtain segment register index
+ * @insn:	Instruction with operands
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Operand offset, in pt_regs, used to determine segment register
+ *
+ * Determine the segment register associated with the operands and, if
+ * applicable, prefixes and the instruction pointed by @insn.
+ *
+ * The segment register associated with an operand used in register-indirect
+ * addressing depends on:
+ *
+ * a) Whether running in long mode (in such a case segments are ignored, except
+ * if FS or GS are used).
+ *
+ * b) Whether segment override prefixes can be used. Certain instructions and
+ *    registers do not allow override prefixes.
+ *
+ * c) Whether segment override prefixes are found in the instruction prefixes.
+ *
+ * d) If there are no segment override prefixes or they cannot be used, the
+ *    default segment register associated with the operand register is used.
+ *
+ * The function checks first if segment override prefixes can be used with the
+ * operand indicated by @regoff. If allowed, obtain such overridden segment
+ * register index. Lastly, if no prefixes were found or cannot be used, resolve
+ * the segment register index to use based on the defaults described in the
+ * Intel documentation. In long mode, all segment register indexes will be
+ * ignored, except if overrides were found for FS or GS. All these operations
+ * are done using helper functions.
+ *
+ * The operand register, @regoff, is represented as the offset from the base of
+ * pt_regs.
+ *
+ * As stated, the main use of this function is to determine the segment register
+ * index based on the instruction, its operands and prefixes. Hence, @insn
+ * must be valid. However, if @regoff indicates rIP, we don't need to inspect
+ * @insn at all as in this case CS is used in all cases. This case is checked
+ * before proceeding further.
+ *
+ * Please note that this function does not return the value in the segment
+ * register (i.e., the segment selector) but our defined index. The segment
+ * selector needs to be obtained using get_segment_selector() and passing the
+ * segment register index resolved by this function.
+ *
+ * Returns:
+ *
+ * An index identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_IGNORE is returned if running in long mode.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
+{
+	int idx;
+
+	/*
+	 * In the unlikely event of having to resolve the segment register
+	 * index for rIP, do it first. Segment override prefixes should not
+	 * be used. Hence, it is not necessary to inspect the instruction,
+	 * which may be invalid at this point.
+	 */
+	if (regoff == offsetof(struct pt_regs, ip)) {
+		if (user_64bit_mode(regs))
+			return INAT_SEG_REG_IGNORE;
+		else
+			return INAT_SEG_REG_CS;
+	}
+
+	if (!insn)
+		return -EINVAL;
+
+	if (!check_seg_overrides(insn, regoff))
+		return resolve_default_seg(insn, regs, regoff);
+
+	idx = get_seg_reg_override_idx(insn);
+	if (idx < 0)
+		return idx;
+
+	if (idx == INAT_SEG_REG_DEFAULT)
+		return resolve_default_seg(insn, regs, regoff);
+
+	/*
+	 * In long mode, segment override prefixes are ignored, except for
+	 * overrides for FS and GS.
+	 */
+	if (user_64bit_mode(regs)) {
+		if (idx != INAT_SEG_REG_FS &&
+		    idx != INAT_SEG_REG_GS)
+			idx = INAT_SEG_REG_IGNORE;
+	}
+
+	return idx;
+}
+
+/**
+ * get_segment_selector() - obtain segment selector
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Segment register index to use
+ *
+ * Obtain the segment selector from any of the CS, SS, DS, ES, FS, GS segment
+ * registers. In CONFIG_X86_32, the segment is obtained from either pt_regs or
+ * kernel_vm86_regs as applicable. In CONFIG_X86_64, CS and SS are obtained
+ * from pt_regs. DS, ES, FS and GS are obtained by reading the actual CPU
+ * registers. This is done only for completeness as in CONFIG_X86_64 segment
+ * registers are ignored.
+ *
+ * Returns:
+ *
+ * Value of the segment selector, including null when running in
+ * long mode.
+ *
+ * -EINVAL on error.
+ */
+static short get_segment_selector(struct pt_regs *regs, int seg_reg_idx)
+{
+#ifdef CONFIG_X86_64
+	unsigned short sel;
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_IGNORE:
+		return 0;
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		savesegment(ds, sel);
+		return sel;
+	case INAT_SEG_REG_ES:
+		savesegment(es, sel);
+		return sel;
+	case INAT_SEG_REG_FS:
+		savesegment(fs, sel);
+		return sel;
+	case INAT_SEG_REG_GS:
+		savesegment(gs, sel);
+		return sel;
+	default:
+		return -EINVAL;
+	}
+#else /* CONFIG_X86_32 */
+	struct kernel_vm86_regs *vm86regs = (struct kernel_vm86_regs *)regs;
+
+	if (v8086_mode(regs)) {
+		switch (seg_reg_idx) {
+		case INAT_SEG_REG_CS:
+			return (unsigned short)(regs->cs & 0xffff);
+		case INAT_SEG_REG_SS:
+			return (unsigned short)(regs->ss & 0xffff);
+		case INAT_SEG_REG_DS:
+			return vm86regs->ds;
+		case INAT_SEG_REG_ES:
+			return vm86regs->es;
+		case INAT_SEG_REG_FS:
+			return vm86regs->fs;
+		case INAT_SEG_REG_GS:
+			return vm86regs->gs;
+		case INAT_SEG_REG_IGNORE:
+			/* fall through */
+		default:
+			return -EINVAL;
+		}
+	}
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		return (unsigned short)(regs->ds & 0xffff);
+	case INAT_SEG_REG_ES:
+		return (unsigned short)(regs->es & 0xffff);
+	case INAT_SEG_REG_FS:
+		return (unsigned short)(regs->fs & 0xffff);
+	case INAT_SEG_REG_GS:
+		/*
+		 * GS may or may not be in regs as per CONFIG_X86_32_LAZY_GS.
+		 * The macro below takes care of both cases.
+		 */
+		return get_user_gs(regs);
+	case INAT_SEG_REG_IGNORE:
+		/* fall through */
+	default:
+		return -EINVAL;
+	}
+#endif /* CONFIG_X86_64 */
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 			  enum reg_type type)
 {
-- 
2.7.4


* [PATCH v10 14/18] x86/insn-eval: Add utility function to get segment descriptor
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (12 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 13/18] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 21:00   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 15/18] x86/insn-eval: Add utility functions to get segment descriptor base address and limit Ricardo Neri
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

The segment descriptor contains information that is relevant to how linear
addresses need to be computed. It contains the default size of addresses
as well as the base address of the segment. Thus, given a segment
selector, we ought to look at the segment descriptor to correctly calculate
the linear address.

In protected mode, the segment selector might indicate a segment
descriptor from either the global descriptor table or a local descriptor
table. Both cases are considered in this function.
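
As an illustration only (not part of this patch), the selector fields that
drive this choice can be sketched in user-space C. The helper names below
are hypothetical; the bit layout (RPL in bits [1:0], table indicator in
bit 2, index in bits [15:3]) is the architectural one:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch; the kernel uses SEGMENT_*_MASK. */
#define SEL_RPL_MASK	0x3u	/* bits [1:0]: requested privilege level */
#define SEL_TI_MASK	0x4u	/* bit 2: table indicator, set means LDT */

/* True if the selector refers to the local descriptor table. */
static inline bool sel_is_ldt(uint16_t sel)
{
	return (sel & SEL_TI_MASK) != 0;
}

/* Requested privilege level encoded in the selector. */
static inline unsigned int sel_rpl(uint16_t sel)
{
	return sel & SEL_RPL_MASK;
}

/* Index of the descriptor within its table, bits [15:3]. */
static inline unsigned int sel_index(uint16_t sel)
{
	return sel >> 3;
}

/*
 * Byte offset of the descriptor from the table base. Descriptors are
 * 8 bytes long, so clearing bits [2:0] is the same as index * 8.
 */
static inline unsigned int sel_table_offset(uint16_t sel)
{
	return sel & ~(SEL_RPL_MASK | SEL_TI_MASK);
}
```

For example, selector 0x2b decodes to GDT index 5 with RPL 3, which is
byte offset 0x28 from the base of the GDT.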

This function is a prerequisite for functions in subsequent commits that
will obtain the aforementioned attributes of the segment descriptor.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 6a902b1..d85e840 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -6,9 +6,13 @@
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/ratelimit.h>
+#include <linux/mmu_context.h>
+#include <asm/desc_defs.h>
+#include <asm/desc.h>
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/ldt.h>
 #include <asm/vm86.h>
 
 #undef pr_fmt
@@ -469,6 +473,59 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_desc() - Obtain pointer to a segment descriptor
+ * @sel:	Segment selector
+ *
+ * Given a segment selector, obtain a pointer to the segment descriptor.
+ * Both global and local descriptor tables are supported.
+ *
+ * Returns:
+ *
+ * Pointer to segment descriptor on success.
+ *
+ * NULL on error.
+ */
+static struct desc_struct *get_desc(unsigned short sel)
+{
+	struct desc_ptr gdt_desc = {0, 0};
+	unsigned long desc_base;
+
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+	if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+		struct desc_struct *desc = NULL;
+		struct ldt_struct *ldt;
+
+		/* Bits [15:3] contain the index of the desired entry. */
+		sel >>= 3;
+
+		mutex_lock(&current->active_mm->context.lock);
+		ldt = current->active_mm->context.ldt;
+		if (ldt && sel < ldt->nr_entries)
+			desc = &ldt->entries[sel];
+
+		mutex_unlock(&current->active_mm->context.lock);
+
+		return desc;
+	}
+#endif
+	native_store_gdt(&gdt_desc);
+
+	/*
+	 * Segment descriptors have a size of 8 bytes. Thus, the index is
+	 * multiplied by 8 to obtain the memory offset of the desired descriptor
+	 * from the base of the GDT. As bits [15:3] of the segment selector
+	 * contain the index, it can be regarded as multiplied by 8 already.
+	 * All that remains is to clear bits [2:0].
+	 */
+	desc_base = sel & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
+
+	if (desc_base > gdt_desc.size)
+		return NULL;
+
+	return (struct desc_struct *)(gdt_desc.address + desc_base);
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode
-- 
2.7.4


* [PATCH v10 15/18] x86/insn-eval: Add utility functions to get segment descriptor base address and limit
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (13 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 14/18] x86/insn-eval: Add utility function to get segment descriptor Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 21:00   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 16/18] x86/insn-eval: Add function to get default params of code segment Ricardo Neri
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

With segmentation, the base address of the segment is needed to compute a
linear address. This base address is obtained from the applicable segment
descriptor. Such segment descriptor is referenced from a segment selector.
These new functions obtain the segment base and limit of the segment
selector indicated by the segment register index given as argument. This index
is any of the INAT_SEG_REG_* family of #define's.

The logic to obtain the segment selector is wrapped in the function
get_segment_selector() with the inputs described above. Once the selector
is known, the base address is determined. In protected mode, the selector
is used to obtain the segment descriptor and then its base address. In
long mode, the segment base address is zero except when FS or GS are used.
In virtual-8086 mode, the base address is computed as the value of the
segment selector shifted 4 positions to the left.
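
A rough user-space sketch of the virtual-8086 and protected mode cases,
for illustration only. The function names and the raw 64-bit descriptor
argument are assumptions of this example; the patch itself works on
struct desc_struct and uses get_desc_base():

```c
#include <stdint.h>

/*
 * Virtual-8086 mode: the base is the 16-bit segment value shifted left
 * by 4 bits. Widen before shifting so the result keeps all 20 bits.
 */
static inline uint32_t vm86_seg_base(uint16_t sel)
{
	return (uint32_t)sel << 4;
}

/*
 * Protected mode: the 32-bit base is scattered across the 8-byte
 * descriptor. Viewed as a little-endian 64-bit value, base[23:0] lives
 * in bits [39:16] and base[31:24] lives in bits [63:56].
 */
static inline uint32_t desc_base_from_raw(uint64_t raw)
{
	uint32_t low = (uint32_t)((raw >> 16) & 0xffffff);
	uint32_t high = (uint32_t)((raw >> 56) & 0xff);

	return low | (high << 24);
}
```

With these helpers, a virtual-8086 segment value of 0xb800 gives a base of
0xb8000, and the common flat code descriptor 0x00cf9a000000ffff gives a
base of zero.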

In protected mode, segment limits are enforced. Thus, a function to
determine the limit of the segment is added. Segment limits are not
enforced in long mode or virtual-8086 mode. For the latter, addresses are
limited to 20 bits; address size will be handled when computing the linear
address.
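
The effect of the granularity bit on the limit can be sketched as
follows. This is an illustration under stated assumptions: the function
name and the raw 64-bit descriptor argument are hypothetical, whereas the
patch reads the same fields through get_desc_limit() and desc->g:

```c
#include <stdint.h>

/*
 * Return the segment limit in bytes from a raw 8-byte descriptor,
 * viewed as a little-endian 64-bit value:
 *   limit[15:0]  in bits [15:0]
 *   limit[19:16] in bits [51:48]
 *   G            in bit 55
 */
static inline uint32_t desc_limit_bytes(uint64_t raw)
{
	uint32_t limit = (uint32_t)(raw & 0xffff) |
			 ((uint32_t)((raw >> 48) & 0xf) << 16);

	/*
	 * With G set, the limit counts 4096-byte units and the low
	 * 12 bits of an address are not checked, so the last valid
	 * byte is at (limit << 12) + 0xfff.
	 */
	if ((raw >> 55) & 0x1)
		limit = (limit << 12) + 0xfff;

	return limit;
}
```

For the flat descriptor 0x00cf9a000000ffff (20-bit limit 0xfffff, G set)
this yields 0xffffffff; with G clear and a 16-bit limit of 0xffff the
result stays 0xffff.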

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |   1 +
 arch/x86/lib/insn-eval.c         | 114 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 115 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 7e8c963..25d6e44 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -13,5 +13,6 @@
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
+unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index d85e840..89d5c89 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -526,6 +526,120 @@ static struct desc_struct *get_desc(unsigned short sel)
 }
 
 /**
+ * insn_get_seg_base() - Obtain base address of segment descriptor.
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Index of the segment register pointing to seg descriptor
+ *
+ * Obtain the base address of the segment as indicated by the segment descriptor
+ * pointed by the segment selector. The segment selector is obtained from the
+ * input segment register index @seg_reg_idx.
+ *
+ * Returns:
+ *
+ * In protected mode, base address of the segment. Zero in long mode,
+ * except when FS or GS are used. In virtual-8086 mode, the segment
+ * selector shifted 4 bits to the left.
+ *
+ * -1L in case of error.
+ */
+unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
+{
+	struct desc_struct *desc;
+	short sel;
+
+	sel = get_segment_selector(regs, seg_reg_idx);
+	if (sel < 0)
+		return -1L;
+
+	if (v8086_mode(regs))
+		/*
+		 * Base is simply the segment selector shifted 4
+		 * bits to the left.
+		 */
+		return (unsigned long)(sel << 4);
+
+	if (user_64bit_mode(regs)) {
+		/*
+		 * Only FS or GS will have a base address, the rest of
+		 * the segments' bases are forced to 0.
+		 */
+		unsigned long base;
+
+		if (seg_reg_idx == INAT_SEG_REG_FS)
+			rdmsrl(MSR_FS_BASE, base);
+		else if (seg_reg_idx == INAT_SEG_REG_GS)
+			/*
+			 * swapgs was called at the kernel entry point. Thus,
+			 * MSR_KERNEL_GS_BASE will have the user-space GS base.
+			 */
+			rdmsrl(MSR_KERNEL_GS_BASE, base);
+		else
+			base = 0;
+		return base;
+	}
+
+	/* In protected mode the segment selector cannot be null. */
+	if (!sel)
+		return -1L;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return -1L;
+
+	return get_desc_base(desc);
+}
+
+/**
+ * get_seg_limit() - Obtain the limit of a segment descriptor
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Index of the segment register pointing to seg descriptor
+ *
+ * Obtain the limit of the segment as indicated by the segment descriptor
+ * pointed by the segment selector. The segment selector is obtained from the
+ * input segment register index @seg_reg_idx.
+ *
+ * Returns:
+ *
+ * In protected mode, the limit of the segment descriptor in bytes.
+ * In long mode and virtual-8086 mode, segment limits are not enforced. Thus,
+ * limit is returned as -1L to imply a limit-less segment.
+ *
+ * Zero is returned on error.
+ */
+static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
+{
+	struct desc_struct *desc;
+	unsigned long limit;
+	short sel;
+
+	sel = get_segment_selector(regs, seg_reg_idx);
+	if (sel < 0)
+		return 0;
+
+	if (user_64bit_mode(regs) || v8086_mode(regs))
+		return -1L;
+
+	if (!sel)
+		return 0;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return 0;
+
+	/*
+	 * If the granularity bit is set, the limit is given in multiples
+	 * of 4096. This also means that the 12 least significant bits are
+	 * not tested when checking the segment limits. In practice,
+	 * this means that the segment ends in (limit << 12) + 0xfff.
+	 */
+	limit = get_desc_limit(desc);
+	if (desc->g)
+		limit = (limit << 12) + 0xfff;
+
+	return limit;
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode
-- 
2.7.4


* [PATCH v10 16/18] x86/insn-eval: Add function to get default params of code segment
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (14 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 15/18] x86/insn-eval: Add utility functions to get segment descriptor base address and limit Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 21:01   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 17/18] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 18/18] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Obtain the default values of the address and operand sizes as specified in
the D and L bits of the segment descriptor selected by the register
CS. The function can be used for both protected and long modes.
For virtual-8086 mode, the default address and operand sizes are always 2
bytes.

The returned parameters are encoded in a signed 8-bit data type. Auxiliary
macros are provided to encode and decode such values.
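
A self-contained sketch of the encoding, for illustration only. The macro
and function names are hypothetical stand-ins for the INSN_CODE_SEG_*
macros added by this patch, with the arguments parenthesized. Note that
encoding (4, 8) yields 0x84, which does not fit in the positive range of
a signed 8-bit type, so this sketch returns int to keep valid values
distinct from negative error codes:

```c
#include <errno.h>

/* Operand size in the low nibble, address size in the high nibble. */
#define CODE_SEG_PARAMS(oper_sz, addr_sz)	((oper_sz) | ((addr_sz) << 4))
#define CODE_SEG_ADDR_SZ(params)		(((params) >> 4) & 0xf)
#define CODE_SEG_OPND_SZ(params)		((params) & 0xf)

/*
 * Map the L and D bits of a code segment descriptor to the default
 * operand and address sizes, in bytes.
 */
static int code_seg_params_from_bits(unsigned int l, unsigned int d)
{
	switch (((l & 1) << 1) | (d & 1)) {
	case 0:	/* L=0, D=0: 16-bit address and operand size */
		return CODE_SEG_PARAMS(2, 2);
	case 1:	/* L=0, D=1: 32-bit address and operand size */
		return CODE_SEG_PARAMS(4, 4);
	case 2:	/* L=1, D=0: 64-bit address size, 32-bit operand size */
		return CODE_SEG_PARAMS(4, 8);
	default: /* L=1, D=1 is an invalid setting */
		return -EINVAL;
	}
}
```

A caller would check for a negative return value first and only then
decode the sizes with CODE_SEG_ADDR_SZ() and CODE_SEG_OPND_SZ().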

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  5 ++++
 arch/x86/lib/insn-eval.c         | 64 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 25d6e44..e1d3b4c 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -11,8 +11,13 @@
 #include <linux/err.h>
 #include <asm/ptrace.h>
 
+#define INSN_CODE_SEG_ADDR_SZ(params) ((params >> 4) & 0xf)
+#define INSN_CODE_SEG_OPND_SZ(params) (params & 0xf)
+#define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4))
+
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
 unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx);
+char insn_get_code_seg_params(struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 89d5c89..01e36bd 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -640,6 +640,70 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
 }
 
 /**
+ * insn_get_code_seg_params() - Obtain code segment parameters
+ * @regs:	Structure with register values as seen when entering kernel mode
+ *
+ * Obtain address and operand sizes of the code segment. They are obtained from
+ * the selector contained in the CS register in regs. In protected mode, the
+ * default address and operand sizes are determined by inspecting the L and D
+ * bits of the segment descriptor. In virtual-8086 mode, the default is always
+ * two bytes for both address and operand sizes.
+ *
+ * Returns:
+ *
+ * A signed 8-bit value containing the default parameters on success.
+ *
+ * -EINVAL on error.
+ */
+char insn_get_code_seg_params(struct pt_regs *regs)
+{
+	struct desc_struct *desc;
+	short sel;
+
+	if (v8086_mode(regs))
+		/* Address and operand size are both 16-bit. */
+		return INSN_CODE_SEG_PARAMS(2, 2);
+
+	sel = get_segment_selector(regs, INAT_SEG_REG_CS);
+	if (sel < 0)
+		return sel;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return -EINVAL;
+
+	/*
+	 * The most significant bit of the Type field of the segment descriptor
+	 * determines whether a segment contains data or code. If this is a data
+	 * segment, return error.
+	 */
+	if (!(desc->type & BIT(3)))
+		return -EINVAL;
+
+	switch ((desc->l << 1) | desc->d) {
+	case 0: /*
+		 * Legacy mode. CS.L=0, CS.D=0. Address and operand size are
+		 * both 16-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(2, 2);
+	case 1: /*
+		 * Legacy mode. CS.L=0, CS.D=1. Address and operand size are
+		 * both 32-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(4, 4);
+	case 2: /*
+		 * IA-32e 64-bit mode. CS.L=1, CS.D=0. Address size is 64-bit;
+		 * operand size is 32-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(4, 8);
+	case 3: /* Invalid setting. CS.L=1, CS.D=1 */
+		/* fall through */
+	default:
+		return -EINVAL;
+	}
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode
-- 
2.7.4


* [PATCH v10 17/18] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (15 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 16/18] x86/insn-eval: Add function to get default params of code segment Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 21:01   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  2017-10-27 20:25 ` [PATCH v10 18/18] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
  17 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod is zero and
ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
that none of the registers are used in the computation of the effective
address. A return value of -EDOM indicates to callers that they should not
use the value of registers when computing the effective address for the
instruction.

In long mode, the effective address is given by the 32-bit displacement
plus the location of the next instruction. In protected mode, only the
displacement is used.
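
The rule can be sketched in user-space C as follows. This is only an
illustration assuming 32-bit or 64-bit address encodings; the helper
names are hypothetical and the caller is assumed to have already decoded
the instruction length and the displacement:

```c
#include <stdbool.h>
#include <stdint.h>

/* ModRM.mod is bits [7:6] and ModRM.rm is bits [2:0] of the byte. */
static inline unsigned int modrm_mod(uint8_t modrm)
{
	return modrm >> 6;
}

static inline unsigned int modrm_rm(uint8_t modrm)
{
	return modrm & 0x7;
}

/* mod == 0 and rm == 101b: no base register, 32-bit displacement. */
static inline bool modrm_is_disp32_only(uint8_t modrm)
{
	return modrm_mod(modrm) == 0 && modrm_rm(modrm) == 5;
}

/*
 * Effective address for the displacement-only case. In long mode the
 * sign-extended displacement is relative to the address of the next
 * instruction (ip + insn_len). In protected mode the displacement
 * alone is the effective address.
 */
static uint64_t disp32_eff_addr(bool long_mode, uint64_t ip,
				unsigned int insn_len, int32_t disp)
{
	if (long_mode)
		return ip + insn_len + (uint64_t)(int64_t)disp;

	return (uint32_t)disp;
}
```

For instance, a 7-byte instruction at 0x400000 with a displacement of
0x10 resolves to 0x400017 in long mode, while in protected mode the same
displacement resolves to 0x10.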

The instruction decoder takes care of obtaining the displacement.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 01e36bd..6bf819f 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -427,6 +427,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 	switch (type) {
 	case REG_TYPE_RM:
 		regno = X86_MODRM_RM(insn->modrm.value);
+
+		/*
+		 * ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit displacement
+		 * follows the ModRM byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
 		if (X86_REX_B(insn->rex_prefix.value))
 			regno += 8;
 		break;
@@ -770,10 +778,21 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-			if (addr_offset < 0)
+			/*
+			 * -EDOM means that we must ignore the address_offset.
+			 * In such a case, in 64-bit mode the effective address
+			 * is relative to the RIP of the following instruction.
+			 */
+			if (addr_offset == -EDOM) {
+				if (user_64bit_mode(regs))
+					eff_addr = (long)regs->ip + insn->length;
+				else
+					eff_addr = 0;
+			} else if (addr_offset < 0) {
 				goto out;
-
-			eff_addr = regs_get_register(regs, addr_offset);
+			} else {
+				eff_addr = regs_get_register(regs, addr_offset);
+			}
 		}
 
 		eff_addr += insn->displacement.value;
-- 
2.7.4


* [PATCH v10 18/18] x86/insn-eval: Incorporate segment base in linear address computation
  2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
                   ` (16 preceding siblings ...)
  2017-10-27 20:25 ` [PATCH v10 17/18] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b Ricardo Neri
@ 2017-10-27 20:25 ` Ricardo Neri
  2017-11-01 17:56   ` Borislav Petkov
  2017-11-01 21:02   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  17 siblings, 2 replies; 51+ messages in thread
From: Ricardo Neri @ 2017-10-27 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Ricardo Neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

insn_get_addr_ref() returns the effective address as defined by
section 3.7.5.1, Vol. 1 of the Intel 64 and IA-32 Architectures Software
Developer's Manual. In order to compute the linear address, we must add
to the effective address the segment base address as set in the segment
descriptor. The segment descriptor to use depends on the register used as
operand and segment override prefixes, if any.

In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
segment is used or if segmentation is not used. However, the base address
is not necessarily zero if a user program defines its own segments. This
is possible by using a local descriptor table.

Since the effective address is a signed quantity, the unsigned segment
base address is saved in a separate variable and added to the final,
unsigned, effective address.
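
As a minimal sketch of the last point (linear_address is a hypothetical helper,
not a function in this patch): casting the signed effective address first makes
the addition happen on unsigned values, so a negative offset wraps around
instead of overflowing a signed type.

```c
/*
 * Hypothetical helper: combine a signed effective address with an
 * unsigned segment base. The cast is applied before the addition so
 * that the arithmetic is performed on unsigned long.
 */
static inline unsigned long linear_address(long eff_addr,
					   unsigned long seg_base)
{
	return (unsigned long)eff_addr + seg_base;
}
```

With a flat segment (base 0) the linear address equals the effective address;
with a base of 0x2000 and an effective address of -8 the result is 0x1ff8.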

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 55 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 52 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 6bf819f..1c23ec0 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -728,6 +728,43 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
 	return get_reg_offset(insn, regs, REG_TYPE_RM);
 }
 
+/**
+ * get_seg_base_addr() - obtain base address of a segment
+ * @insn:	Instruction. Must be valid.
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Operand offset, in pt_regs, used to resolve segment descriptor
+ * @base:	Obtained segment base
+ *
+ * Obtain the base address of the segment associated with the operand @regoff
+ * and, if any or allowed, override prefixes in @insn. This function is
+ * different from insn_get_seg_base() as the latter does not resolve the segment
+ * associated with the instruction operand.
+ *
+ * Returns:
+ *
+ * 0 on success. @base will contain the base address of the resolved segment.
+ *
+ * -EINVAL on error.
+ */
+static int get_seg_base_addr(struct insn *insn, struct pt_regs *regs,
+			     int regoff, unsigned long *base)
+{
+	int seg_reg_idx;
+
+	if (!base)
+		return -EINVAL;
+
+	seg_reg_idx = resolve_seg_reg(insn, regs, regoff);
+	if (seg_reg_idx < 0)
+		return seg_reg_idx;
+
+	*base = insn_get_seg_base(regs, seg_reg_idx);
+	if (*base == -1L)
+		return -EINVAL;
+
+	return 0;
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
@@ -735,8 +772,8 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
  */
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	int addr_offset, base_offset, indx_offset;
-	unsigned long linear_addr = -1L;
+	int addr_offset, base_offset, indx_offset, ret;
+	unsigned long linear_addr = -1L, seg_base;
 	long eff_addr, base, indx;
 	insn_byte_t sib;
 
@@ -750,6 +787,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			goto out;
 
 		eff_addr = regs_get_register(regs, addr_offset);
+
 	} else {
 		if (insn->sib.nbytes) {
 			/*
@@ -776,6 +814,13 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 				indx = regs_get_register(regs, indx_offset);
 
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+
+			/*
+			 * The base determines the segment used to compute
+			 * the linear address.
+			 */
+			addr_offset = base_offset;
+
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			/*
@@ -798,7 +843,11 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		eff_addr += insn->displacement.value;
 	}
 
-	linear_addr = (unsigned long)eff_addr;
+	ret = get_seg_base_addr(insn, regs, addr_offset, &seg_base);
+	if (ret)
+		goto out;
+
+	linear_addr = (unsigned long)eff_addr + seg_base;
 
 out:
 	return (void __user *)linear_addr;
-- 
2.7.4


* Re: [PATCH v10 18/18] x86/insn-eval: Incorporate segment base in linear address computation
  2017-10-27 20:25 ` [PATCH v10 18/18] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
@ 2017-11-01 17:56   ` Borislav Petkov
  2017-11-01 19:08     ` Ricardo Neri
  2017-11-01 21:02   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
  1 sibling, 1 reply; 51+ messages in thread
From: Borislav Petkov @ 2017-11-01 17:56 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 27, 2017 at 01:25:45PM -0700, Ricardo Neri wrote:
> insn_get_addr_ref() returns the effective address as defined by the
> section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual. In order to compute the linear address, we must add
> to the effective address the segment base address as set in the segment
> descriptor. The segment descriptor to use depends on the register used as
> operand and segment override prefixes, if any.
> 
> In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
> segment is used or if segmentation is not used. However, the base address
> is not necessarily zero if a user program defines its own segments. This
> is possible by using a local descriptor table.
> 
> Since the effective address is a signed quantity, the unsigned segment
> base address is saved in a separate variable and added to the final,
> unsigned, effective address.
> 
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> Cc: Colin Ian King <colin.king@canonical.com>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> Cc: x86@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 55 +++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 52 insertions(+), 3 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH v10 18/18] x86/insn-eval: Incorporate segment base in linear address computation
  2017-11-01 17:56   ` Borislav Petkov
@ 2017-11-01 19:08     ` Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: Ricardo Neri @ 2017-11-01 19:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, Nov 01, 2017 at 06:56:42PM +0100, Borislav Petkov wrote:
> On Fri, Oct 27, 2017 at 01:25:45PM -0700, Ricardo Neri wrote:
> > insn_get_addr_ref() returns the effective address as defined by the
> > section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual. In order to compute the linear address, we must add
> > to the effective address the segment base address as set in the segment
> > descriptor. The segment descriptor to use depends on the register used as
> > operand and segment override prefixes, if any.
> > 
> > In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
> > segment is used or if segmentation is not used. However, the base address
> > is not necessarily zero if a user program defines its own segments. This
> > is possible by using a local descriptor table.
> > 
> > Since the effective address is a signed quantity, the unsigned segment
> > base address is saved in a separate variable and added to the final,
> > unsigned, effective address.
> > 
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> > Cc: Colin Ian King <colin.king@canonical.com>
> > Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> > Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Thomas Garnier <thgarnie@google.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> > Cc: x86@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 55 +++++++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 52 insertions(+), 3 deletions(-)
> 
> Reviewed-by: Borislav Petkov <bp@suse.de>

Thank you, Borislav! This should complete the review of this series. As proposed
earlier [1], I guess that, if the tip maintainers are OK, this series can be merged
into the tip tree?

BR,
Ricardo

[1]. https://lkml.org/lkml/2017/10/20/851


* [tip:x86/mpx] x86/mm: Relocate page fault error codes to traps.h
  2017-10-27 20:25 ` [PATCH v10 01/18] x86/mm: Relocate page fault error codes to traps.h Ricardo Neri
@ 2017-11-01 20:55   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, cmetcalf, ray.huang, vbabka, linux-kernel, hpa, brgerst,
	shuah, ravi.v.shankar, peterz, ricardo.neri-calderon, akpm,
	jslaby, paul.gortmaker, mst, dave.hansen, corbet,
	kirill.shutemov, tglx, slaoub, luto, jpoimboe, pbonzini, bp,
	mhiramat

Commit-ID:  1067f030994c69ca1fba8c607437c8895dcf8509
Gitweb:     https://git.kernel.org/tip/1067f030994c69ca1fba8c607437c8895dcf8509
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:28 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:07 +0100

x86/mm: Relocate page fault error codes to traps.h

Up to this point, only fault.c used the definitions of the page fault error
codes. Thus, it made sense to keep them within that file. Other portions of
code might be interested in those definitions too. For instance, the User-
Mode Instruction Prevention emulation code will use such definitions to
emulate a page fault when it is unable to successfully copy the results
of the emulated instructions to user space.

While relocating the error code enumeration, the prefix X86_ is used to
make it consistent with the rest of the definitions in traps.h. Of course,
code using the enumeration had to be updated as well. No functional changes
were performed.
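
A small user-space sketch of how code outside fault.c might test these bits.
The enum values mirror the ones relocated by this patch; the helper
pf_is_user_write_protection is hypothetical and only serves as an example.

```c
#include <stdbool.h>

/* Same bit assignments as the enum relocated to traps.h by this patch. */
enum x86_pf_error_code {
	X86_PF_PROT	= 1 << 0,	/* 0: no page found, 1: protection fault */
	X86_PF_WRITE	= 1 << 1,	/* 0: read access,   1: write access */
	X86_PF_USER	= 1 << 2,	/* 0: kernel mode,   1: user mode */
	X86_PF_RSVD	= 1 << 3,	/* reserved bit detected */
	X86_PF_INSTR	= 1 << 4,	/* instruction fetch */
	X86_PF_PK	= 1 << 5,	/* protection keys block access */
};

/* Hypothetical helper: user-mode write to a present, protected page? */
static inline bool pf_is_user_write_protection(unsigned long error_code)
{
	unsigned long mask = X86_PF_PROT | X86_PF_WRITE | X86_PF_USER;

	return (error_code & mask) == mask;
}
```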

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/1509135945-13762-2-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/include/asm/traps.h | 18 +++++++++
 arch/x86/mm/fault.c          | 88 +++++++++++++++++---------------------------
 2 files changed, 52 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 5545f64..da3c3a3 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -144,4 +144,22 @@ enum {
 	X86_TRAP_IRET = 32,	/* 32, IRET Exception */
 };
 
+/*
+ * Page fault error code bits:
+ *
+ *   bit 0 ==	 0: no page found	1: protection fault
+ *   bit 1 ==	 0: read access		1: write access
+ *   bit 2 ==	 0: kernel-mode access	1: user-mode access
+ *   bit 3 ==				1: use of reserved bit detected
+ *   bit 4 ==				1: fault was an instruction fetch
+ *   bit 5 ==				1: protection keys block access
+ */
+enum x86_pf_error_code {
+	X86_PF_PROT	=		1 << 0,
+	X86_PF_WRITE	=		1 << 1,
+	X86_PF_USER	=		1 << 2,
+	X86_PF_RSVD	=		1 << 3,
+	X86_PF_INSTR	=		1 << 4,
+	X86_PF_PK	=		1 << 5,
+};
 #endif /* _ASM_X86_TRAPS_H */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e2baeaa..db71c73 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -29,26 +29,6 @@
 #include <asm/trace/exceptions.h>
 
 /*
- * Page fault error code bits:
- *
- *   bit 0 ==	 0: no page found	1: protection fault
- *   bit 1 ==	 0: read access		1: write access
- *   bit 2 ==	 0: kernel-mode access	1: user-mode access
- *   bit 3 ==				1: use of reserved bit detected
- *   bit 4 ==				1: fault was an instruction fetch
- *   bit 5 ==				1: protection keys block access
- */
-enum x86_pf_error_code {
-
-	PF_PROT		=		1 << 0,
-	PF_WRITE	=		1 << 1,
-	PF_USER		=		1 << 2,
-	PF_RSVD		=		1 << 3,
-	PF_INSTR	=		1 << 4,
-	PF_PK		=		1 << 5,
-};
-
-/*
  * Returns 0 if mmiotrace is disabled, or if the fault is not
  * handled by mmiotrace:
  */
@@ -149,7 +129,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
 	 * If it was a exec (instruction fetch) fault on NX page, then
 	 * do not ignore the fault:
 	 */
-	if (error_code & PF_INSTR)
+	if (error_code & X86_PF_INSTR)
 		return 0;
 
 	instr = (void *)convert_ip_to_linear(current, regs);
@@ -179,7 +159,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
  * siginfo so userspace can discover which protection key was set
  * on the PTE.
  *
- * If we get here, we know that the hardware signaled a PF_PK
+ * If we get here, we know that the hardware signaled a X86_PF_PK
  * fault and that there was a VMA once we got in the fault
  * handler.  It does *not* guarantee that the VMA we find here
  * was the one that we faulted on.
@@ -204,7 +184,7 @@ static void fill_sig_info_pkey(int si_code, siginfo_t *info, u32 *pkey)
 	/*
 	 * force_sig_info_fault() is called from a number of
 	 * contexts, some of which have a VMA and some of which
-	 * do not.  The PF_PK handing happens after we have a
+	 * do not.  The X86_PF_PK handing happens after we have a
 	 * valid VMA, so we should never reach this without a
 	 * valid VMA.
 	 */
@@ -697,7 +677,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
 	if (!oops_may_print())
 		return;
 
-	if (error_code & PF_INSTR) {
+	if (error_code & X86_PF_INSTR) {
 		unsigned int level;
 		pgd_t *pgd;
 		pte_t *pte;
@@ -779,7 +759,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 		 */
 		if (current->thread.sig_on_uaccess_err && signal) {
 			tsk->thread.trap_nr = X86_TRAP_PF;
-			tsk->thread.error_code = error_code | PF_USER;
+			tsk->thread.error_code = error_code | X86_PF_USER;
 			tsk->thread.cr2 = address;
 
 			/* XXX: hwpoison faults will set the wrong code. */
@@ -897,7 +877,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 	struct task_struct *tsk = current;
 
 	/* User mode accesses just cause a SIGSEGV */
-	if (error_code & PF_USER) {
+	if (error_code & X86_PF_USER) {
 		/*
 		 * It's possible to have interrupts off here:
 		 */
@@ -918,7 +898,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 		 * Instruction fetch faults in the vsyscall page might need
 		 * emulation.
 		 */
-		if (unlikely((error_code & PF_INSTR) &&
+		if (unlikely((error_code & X86_PF_INSTR) &&
 			     ((address & ~0xfff) == VSYSCALL_ADDR))) {
 			if (emulate_vsyscall(regs, address))
 				return;
@@ -931,7 +911,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 		 * are always protection faults.
 		 */
 		if (address >= TASK_SIZE_MAX)
-			error_code |= PF_PROT;
+			error_code |= X86_PF_PROT;
 
 		if (likely(show_unhandled_signals))
 			show_signal_msg(regs, error_code, address, tsk);
@@ -992,11 +972,11 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
 
 	if (!boot_cpu_has(X86_FEATURE_OSPKE))
 		return false;
-	if (error_code & PF_PK)
+	if (error_code & X86_PF_PK)
 		return true;
 	/* this checks permission keys on the VMA: */
-	if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
-				(error_code & PF_INSTR), foreign))
+	if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE),
+				       (error_code & X86_PF_INSTR), foreign))
 		return true;
 	return false;
 }
@@ -1024,7 +1004,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 	int code = BUS_ADRERR;
 
 	/* Kernel mode? Handle exceptions or die: */
-	if (!(error_code & PF_USER)) {
+	if (!(error_code & X86_PF_USER)) {
 		no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
 		return;
 	}
@@ -1052,14 +1032,14 @@ static noinline void
 mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 	       unsigned long address, u32 *pkey, unsigned int fault)
 {
-	if (fatal_signal_pending(current) && !(error_code & PF_USER)) {
+	if (fatal_signal_pending(current) && !(error_code & X86_PF_USER)) {
 		no_context(regs, error_code, address, 0, 0);
 		return;
 	}
 
 	if (fault & VM_FAULT_OOM) {
 		/* Kernel mode? Handle exceptions or die: */
-		if (!(error_code & PF_USER)) {
+		if (!(error_code & X86_PF_USER)) {
 			no_context(regs, error_code, address,
 				   SIGSEGV, SEGV_MAPERR);
 			return;
@@ -1084,16 +1064,16 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 
 static int spurious_fault_check(unsigned long error_code, pte_t *pte)
 {
-	if ((error_code & PF_WRITE) && !pte_write(*pte))
+	if ((error_code & X86_PF_WRITE) && !pte_write(*pte))
 		return 0;
 
-	if ((error_code & PF_INSTR) && !pte_exec(*pte))
+	if ((error_code & X86_PF_INSTR) && !pte_exec(*pte))
 		return 0;
 	/*
 	 * Note: We do not do lazy flushing on protection key
-	 * changes, so no spurious fault will ever set PF_PK.
+	 * changes, so no spurious fault will ever set X86_PF_PK.
 	 */
-	if ((error_code & PF_PK))
+	if ((error_code & X86_PF_PK))
 		return 1;
 
 	return 1;
@@ -1139,8 +1119,8 @@ spurious_fault(unsigned long error_code, unsigned long address)
 	 * change, so user accesses are not expected to cause spurious
 	 * faults.
 	 */
-	if (error_code != (PF_WRITE | PF_PROT)
-	    && error_code != (PF_INSTR | PF_PROT))
+	if (error_code != (X86_PF_WRITE | X86_PF_PROT) &&
+	    error_code != (X86_PF_INSTR | X86_PF_PROT))
 		return 0;
 
 	pgd = init_mm.pgd + pgd_index(address);
@@ -1200,19 +1180,19 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
 	 * always an unconditional error and can never result in
 	 * a follow-up action to resolve the fault, like a COW.
 	 */
-	if (error_code & PF_PK)
+	if (error_code & X86_PF_PK)
 		return 1;
 
 	/*
 	 * Make sure to check the VMA so that we do not perform
-	 * faults just to hit a PF_PK as soon as we fill in a
+	 * faults just to hit a X86_PF_PK as soon as we fill in a
 	 * page.
 	 */
-	if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
-				(error_code & PF_INSTR), foreign))
+	if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE),
+				       (error_code & X86_PF_INSTR), foreign))
 		return 1;
 
-	if (error_code & PF_WRITE) {
+	if (error_code & X86_PF_WRITE) {
 		/* write, present and write, not present: */
 		if (unlikely(!(vma->vm_flags & VM_WRITE)))
 			return 1;
@@ -1220,7 +1200,7 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
 	}
 
 	/* read, present: */
-	if (unlikely(error_code & PF_PROT))
+	if (unlikely(error_code & X86_PF_PROT))
 		return 1;
 
 	/* read, not present: */
@@ -1243,7 +1223,7 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
 	if (!static_cpu_has(X86_FEATURE_SMAP))
 		return false;
 
-	if (error_code & PF_USER)
+	if (error_code & X86_PF_USER)
 		return false;
 
 	if (!user_mode(regs) && (regs->flags & X86_EFLAGS_AC))
@@ -1296,7 +1276,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 * protection error (error_code & 9) == 0.
 	 */
 	if (unlikely(fault_in_kernel_space(address))) {
-		if (!(error_code & (PF_RSVD | PF_USER | PF_PROT))) {
+		if (!(error_code & (X86_PF_RSVD | X86_PF_USER | X86_PF_PROT))) {
 			if (vmalloc_fault(address) >= 0)
 				return;
 
@@ -1324,7 +1304,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	if (unlikely(kprobes_fault(regs)))
 		return;
 
-	if (unlikely(error_code & PF_RSVD))
+	if (unlikely(error_code & X86_PF_RSVD))
 		pgtable_bad(regs, error_code, address);
 
 	if (unlikely(smap_violation(error_code, regs))) {
@@ -1350,7 +1330,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 */
 	if (user_mode(regs)) {
 		local_irq_enable();
-		error_code |= PF_USER;
+		error_code |= X86_PF_USER;
 		flags |= FAULT_FLAG_USER;
 	} else {
 		if (regs->flags & X86_EFLAGS_IF)
@@ -1359,9 +1339,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
-	if (error_code & PF_WRITE)
+	if (error_code & X86_PF_WRITE)
 		flags |= FAULT_FLAG_WRITE;
-	if (error_code & PF_INSTR)
+	if (error_code & X86_PF_INSTR)
 		flags |= FAULT_FLAG_INSTRUCTION;
 
 	/*
@@ -1381,7 +1361,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 * space check, thus avoiding the deadlock:
 	 */
 	if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
-		if ((error_code & PF_USER) == 0 &&
+		if (!(error_code & X86_PF_USER) &&
 		    !search_exception_tables(regs->ip)) {
 			bad_area_nosemaphore(regs, error_code, address, NULL);
 			return;
@@ -1408,7 +1388,7 @@ retry:
 		bad_area(regs, error_code, address);
 		return;
 	}
-	if (error_code & PF_USER) {
+	if (error_code & X86_PF_USER) {
 		/*
 		 * Accessing the stack below %sp is always a bug.
 		 * The large cushion allows instructions like enter


* [tip:x86/mpx] x86/boot: Relocate definition of the initial state of CR0
  2017-10-27 20:25   ` Ricardo Neri
  (?)
  (?)
@ 2017-11-01 20:55   ` tip-bot for Ricardo Neri
  -1 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: paul.gortmaker, ravi.v.shankar, peterz, torvalds, mingo, hpa,
	akpm, mhiramat, tglx, bp, dvlasenk, luto, cmetcalf, brgerst,
	ray.huang, corbet, luto, jpoimboe, jslaby, mst, slaoub, pbonzini,
	bp, dave.hansen, shuah, vbabka, dave.hansen,
	ricardo.neri-calderon, linux-kernel

Commit-ID:  b0ce5b8c95c83a7b98c679b117e3d6ae6f97154b
Gitweb:     https://git.kernel.org/tip/b0ce5b8c95c83a7b98c679b117e3d6ae6f97154b
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:29 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:07 +0100

x86/boot: Relocate definition of the initial state of CR0

Both head_32.S and head_64.S utilize the same value to initialize the
control register CR0. Also, other parts of the kernel might want to access
this initial definition (e.g., emulation code for User-Mode Instruction
Prevention uses this state to provide a sane dummy value for CR0 when
emulating the smsw instruction). Thus, relocate this definition to a
header file from which it can be conveniently accessed.
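
To make the resulting value concrete, here is a stand-alone sketch. The
X86_CR0_* bit positions are restated locally from the architectural definition
of CR0 (the kernel defines them in processor-flags.h); cr0_state_no_paging is a
hypothetical helper that mirrors what head_32.S does before paging is enabled.

```c
/* CR0 bit positions, restated locally for this sketch. */
#define X86_CR0_PE	(1UL << 0)	/* Protection Enable */
#define X86_CR0_MP	(1UL << 1)	/* Monitor Coprocessor */
#define X86_CR0_ET	(1UL << 4)	/* Extension Type */
#define X86_CR0_NE	(1UL << 5)	/* Numeric Error */
#define X86_CR0_WP	(1UL << 16)	/* Write Protect */
#define X86_CR0_AM	(1UL << 18)	/* Alignment Mask */
#define X86_CR0_PG	(1UL << 31)	/* Paging */

#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
			 X86_CR0_PG)

/* Hypothetical helper: the initial state with paging still disabled. */
static inline unsigned long cr0_state_no_paging(void)
{
	return CR0_STATE & ~X86_CR0_PG;
}
```

Under these definitions CR0_STATE evaluates to 0x80050033, and to 0x00050033
with the paging bit masked out.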

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: linux-mm@kvack.org
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-arch@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-3-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/include/uapi/asm/processor-flags.h | 3 +++
 arch/x86/kernel/head_32.S                   | 3 ---
 arch/x86/kernel/head_64.S                   | 3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index 185f3d1..39946d0 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -151,5 +151,8 @@
 #define CX86_ARR_BASE	0xc4
 #define CX86_RCR_BASE	0xdc
 
+#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
+			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
+			 X86_CR0_PG)
 
 #endif /* _UAPI_ASM_X86_PROCESSOR_FLAGS_H */
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 9ed3074..c3cfc65 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -211,9 +211,6 @@ ENTRY(startup_32_smp)
 #endif
 
 .Ldefault_entry:
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl $(CR0_STATE & ~X86_CR0_PG),%eax
 	movl %eax,%cr0
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 513cbb0..5e1bfdd 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -149,9 +149,6 @@ ENTRY(secondary_startup_64)
 1:	wrmsr				/* Make changes effective */
 
 	/* Setup cr0 */
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl	$CR0_STATE, %eax
 	/* Make changes effective */
 	movq	%rax, %cr0


* [tip:x86/mpx] ptrace,x86: Make user_64bit_mode() available to 32-bit builds
  2017-10-27 20:25 ` [PATCH v10 03/18] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Ricardo Neri
@ 2017-11-01 20:55   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, lstoakes, vbabka, hpa, tglx, dvyukov, akpm, mingo,
	dave.hansen, ray.huang, jslaby, ricardo.neri-calderon, mhiramat,
	luto, corbet, acme, adam.buchbinder, ravi.v.shankar,
	paul.gortmaker, keescook, pbonzini, slaoub, shuah, linux-kernel,
	cmetcalf, mst, qiaowei.ren, colin.king, brgerst, peterz,
	thgarnie, adrian.hunter

Commit-ID:  e27c310af5c05cf876d9cad006928076c27f54d4
Gitweb:     https://git.kernel.org/tip/e27c310af5c05cf876d9cad006928076c27f54d4
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:30 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:08 +0100

ptrace,x86: Make user_64bit_mode() available to 32-bit builds

In its current form, user_64bit_mode() can only be used when CONFIG_X86_64
is selected. This implies that code built with CONFIG_X86_64=n cannot use
it. If a piece of code needs to be built for both CONFIG_X86_64=y and
CONFIG_X86_64=n and wants to use this function, it needs to wrap it in
an #ifdef/#endif; potentially, in multiple places.

This can be easily avoided with a single #ifdef/#endif pair within
user_64bit_mode() itself.
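
As an illustration only, the pattern can be sketched outside the kernel as
follows. All names here (demo_regs, DEMO_USER_CS, DEMO_CONFIG_64BIT,
demo_user_64bit_mode(), demo_addr_bytes()) are hypothetical stand-ins, not
kernel symbols; the point is that the caller carries no #ifdef of its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for pt_regs and __USER_CS. */
struct demo_regs {
	unsigned long cs;
};
#define DEMO_USER_CS 0x33UL

/*
 * The single #ifdef/#endif pair lives inside the helper. Build with
 * -DDEMO_CONFIG_64BIT to mimic CONFIG_X86_64=y; without it the helper
 * always reports "not 64-bit".
 */
static inline bool demo_user_64bit_mode(const struct demo_regs *regs)
{
	assert(regs != NULL);
#ifdef DEMO_CONFIG_64BIT
	return regs->cs == DEMO_USER_CS;
#else
	return false;
#endif
}

/* A caller that builds unchanged in both configurations. */
static int demo_addr_bytes(const struct demo_regs *regs)
{
	return demo_user_64bit_mode(regs) ? 8 : 4;
}
```

Without DEMO_CONFIG_64BIT defined, demo_addr_bytes() returns 4 for any
register state, which mirrors the "return false" branch added for
CONFIG_X86_64=n.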

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-4-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/include/asm/ptrace.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 91c04c8..e2afbf6 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -135,9 +135,9 @@ static inline int v8086_mode(struct pt_regs *regs)
 #endif
 }
 
-#ifdef CONFIG_X86_64
 static inline bool user_64bit_mode(struct pt_regs *regs)
 {
+#ifdef CONFIG_X86_64
 #ifndef CONFIG_PARAVIRT
 	/*
 	 * On non-paravirt systems, this is the only long mode CPL 3
@@ -148,8 +148,12 @@ static inline bool user_64bit_mode(struct pt_regs *regs)
 	/* Headers are too twisted for this to go in paravirt.h. */
 	return regs->cs == __USER_CS || regs->cs == pv_info.extra_user_64bit_cs;
 #endif
+#else /* !CONFIG_X86_64 */
+	return false;
+#endif
 }
 
+#ifdef CONFIG_X86_64
 #define current_user_stack_pointer()	current_pt_regs()->sp
 #define compat_user_stack_pointer()	current_pt_regs()->sp
 #endif


* [tip:x86/mpx] uprobes/x86: Use existing definitions for segment override prefixes
  2017-10-27 20:25 ` [PATCH v10 04/18] uprobes/x86: Use existing definitions for segment override prefixes Ricardo Neri
@ 2017-11-01 20:56   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: pbonzini, ray.huang, luto, hpa, brgerst, srikar, akpm, corbet,
	shuah, linux-kernel, dvlasenk, cmetcalf, mst, bp, ravi.v.shankar,
	vbabka, mingo, peterz, paul.gortmaker, mhiramat, dave.hansen,
	slaoub, jslaby, ricardo.neri-calderon, tglx

Commit-ID:  ed40a10431701d683bfd59f7ca01a8c97408cf67
Gitweb:     https://git.kernel.org/tip/ed40a10431701d683bfd59f7ca01a8c97408cf67
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:31 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:08 +0100

uprobes/x86: Use existing definitions for segment override prefixes

Rather than using hard-coded values of the segment override prefixes,
leverage the existing definitions provided in inat.h.
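
A rough user-space sketch of the idea follows. The enum and helper names
(demo_pfx_attr, demo_get_prefix_attr(), demo_is_prefix_bad()) are
assumptions made for illustration; they only imitate the shape of an
attribute lookup and are not the definitions from inat.h. The byte values
are the architectural prefix encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute IDs; the kernel's real ones live in inat.h. */
enum demo_pfx_attr {
	DEMO_PFX_NONE = 0,
	DEMO_PFX_ES,
	DEMO_PFX_CS,
	DEMO_PFX_SS,
	DEMO_PFX_DS,
	DEMO_PFX_LOCK,
};

/* Map a raw prefix byte to a named attribute in one place only. */
static enum demo_pfx_attr demo_get_prefix_attr(unsigned char byte)
{
	switch (byte) {
	case 0x26: return DEMO_PFX_ES;
	case 0x2e: return DEMO_PFX_CS;
	case 0x36: return DEMO_PFX_SS;
	case 0x3e: return DEMO_PFX_DS;
	case 0xf0: return DEMO_PFX_LOCK;
	default:   return DEMO_PFX_NONE;
	}
}

/* The check itself now refers to names instead of magic numbers. */
static bool demo_is_prefix_bad(const unsigned char *prefixes, size_t nbytes)
{
	size_t i;

	assert(prefixes != NULL || nbytes == 0);
	for (i = 0; i < nbytes; i++) {
		switch (demo_get_prefix_attr(prefixes[i])) {
		case DEMO_PFX_ES:
		case DEMO_PFX_CS:
		case DEMO_PFX_SS:
		case DEMO_PFX_DS:
		case DEMO_PFX_LOCK:
			return true;
		default:
			break;
		}
	}
	return false;
}
```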

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-5-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/kernel/uprobes.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 495c776..a3755d2 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -271,12 +271,15 @@ static bool is_prefix_bad(struct insn *insn)
 	int i;
 
 	for (i = 0; i < insn->prefixes.nbytes; i++) {
-		switch (insn->prefixes.bytes[i]) {
-		case 0x26:	/* INAT_PFX_ES   */
-		case 0x2E:	/* INAT_PFX_CS   */
-		case 0x36:	/* INAT_PFX_DS   */
-		case 0x3E:	/* INAT_PFX_SS   */
-		case 0xF0:	/* INAT_PFX_LOCK */
+		insn_attr_t attr;
+
+		attr = inat_get_opcode_attribute(insn->prefixes.bytes[i]);
+		switch (attr) {
+		case INAT_MAKE_PREFIX(INAT_PFX_ES):
+		case INAT_MAKE_PREFIX(INAT_PFX_CS):
+		case INAT_MAKE_PREFIX(INAT_PFX_DS):
+		case INAT_MAKE_PREFIX(INAT_PFX_SS):
+		case INAT_MAKE_PREFIX(INAT_PFX_LOCK):
 			return true;
 		}
 	}


* [tip:x86/mpx] x86/mpx: Simplify handling of errors when computing linear addresses
  2017-10-27 20:25 ` [PATCH v10 05/18] x86/mpx: Simplify handling of errors when computing linear addresses Ricardo Neri
@ 2017-11-01 20:56   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ricardo.neri-calderon, dave.hansen, mhiramat, vbabka, shuah,
	slaoub, ray.huang, corbet, luto, jslaby, adam.buchbinder,
	liverlint, adanhawthorn, bp, mst, mingo, lstoakes, cmetcalf,
	tglx, joe, pbonzini, hpa, qiaowei.ren, akpm, peterz,
	paul.gortmaker, brgerst, colin.king, linux-kernel,
	ravi.v.shankar

Commit-ID:  b15d70df6e685912be8bbcb7557d277d48aa942c
Gitweb:     https://git.kernel.org/tip/b15d70df6e685912be8bbcb7557d277d48aa942c
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:32 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:08 +0100

x86/mpx: Simplify handling of errors when computing linear addresses

When errors occur in the computation of the linear address, -1L is
returned. Rather than having a separate return path for errors, the
variable used to return the computed linear address can be initialized
with the error value. Hence, only one return path is needed. This makes
the function easier to read.

While here, ensure that the error value is -1L, a 64-bit value, rather
than -1, a 32-bit value.
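
The single-return-path shape can be sketched as below. This is not the
mpx.c code; demo_resolve() and its parameters are hypothetical, and a
plain array stands in for pt_regs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return regs[idx] + disp, or (unsigned long)-1L on error. The result
 * variable starts out holding the error value, so every early exit can
 * share the one "out" label.
 */
static unsigned long demo_resolve(const unsigned long *regs, size_t nregs,
				  int idx, long disp)
{
	unsigned long addr = -1L;	/* error value unless overwritten */

	assert(regs != NULL);

	if (idx < 0 || (size_t)idx >= nregs)
		goto out;

	addr = regs[idx] + (unsigned long)disp;
out:
	return addr;
}
```

Callers compare the result against (unsigned long)-1L to detect failure,
exactly as they would with a dedicated error path.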

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-6-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/mm/mpx.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 9ceaa95..f4c48a0 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -138,7 +138,7 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
  */
 static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	unsigned long addr, base, indx;
+	unsigned long addr = -1L, base, indx;
 	int addr_offset, base_offset, indx_offset;
 	insn_byte_t sib;
 
@@ -149,17 +149,17 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
 		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 		if (addr_offset < 0)
-			goto out_err;
+			goto out;
 		addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
 			if (base_offset < 0)
-				goto out_err;
+				goto out;
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
 			if (indx_offset < 0)
-				goto out_err;
+				goto out;
 
 			base = regs_get_register(regs, base_offset);
 			indx = regs_get_register(regs, indx_offset);
@@ -167,14 +167,13 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			if (addr_offset < 0)
-				goto out_err;
+				goto out;
 			addr = regs_get_register(regs, addr_offset);
 		}
 		addr += insn->displacement.value;
 	}
+out:
 	return (void __user *)addr;
-out_err:
-	return (void __user *)-1;
 }
 
 static int mpx_insn_decode(struct insn *insn,


* [tip:x86/mpx] x86/mpx: Use signed variables to compute effective addresses
  2017-10-27 20:25 ` [PATCH v10 06/18] x86/mpx: Use signed variables to compute effective addresses Ricardo Neri
@ 2017-11-01 20:57   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, corbet, slaoub, colin.king, cmetcalf, joe, hpa,
	dave.hansen, vbabka, luto, bp, mhiramat, mingo, pbonzini, akpm,
	lstoakes, adam.buchbinder, ravi.v.shankar, liverlint, tglx,
	ray.huang, adanhawthorn, brgerst, shuah, paul.gortmaker,
	qiaowei.ren, mst, jslaby, linux-kernel, ricardo.neri-calderon

Commit-ID:  b8d2eff3b1c6e46238a5fb3f56843e9974b4889f
Gitweb:     https://git.kernel.org/tip/b8d2eff3b1c6e46238a5fb3f56843e9974b4889f
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:33 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:09 +0100

x86/mpx: Use signed variables to compute effective addresses

Even though memory addresses are unsigned, the operands used to compute the
effective address do have a sign. This is true for ModRM.rm, SIB.base,
SIB.index as well as the displacement bytes. Thus, signed variables shall
be used when computing the effective address from these operands. Once the
signed effective address has been computed, it is cast to an unsigned
long to determine the linear address.
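
A small worked sketch of why the sign matters, assuming a 1-byte
displacement and ignoring segmentation. demo_linear_addr() is a
hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute base + index * 2^scale + disp8 with signed arithmetic, then
 * convert the result to an unsigned linear address at the very end.
 */
static unsigned long demo_linear_addr(long base, long indx,
				      unsigned int scale, int8_t disp8)
{
	long eff_addr;

	assert(scale <= 3);	/* the SIB scale field is only two bits */

	eff_addr = base + indx * (1L << scale) + disp8;

	return (unsigned long)eff_addr;
}
```

With base 0x1000, no index and the displacement byte 0x80 (that is, -128
as a signed value), the result is 0xf80. Treating the same byte as an
unsigned 128 would instead give 0x1080.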

Variables are renamed to better reflect the type of address being
computed.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-7-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/mm/mpx.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index f4c48a0..57e5bf5 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -138,8 +138,9 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
  */
 static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	unsigned long addr = -1L, base, indx;
 	int addr_offset, base_offset, indx_offset;
+	unsigned long linear_addr = -1L;
+	long eff_addr, base, indx;
 	insn_byte_t sib;
 
 	insn_get_modrm(insn);
@@ -150,7 +151,8 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 		if (addr_offset < 0)
 			goto out;
-		addr = regs_get_register(regs, addr_offset);
+
+		eff_addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
@@ -163,17 +165,23 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 
 			base = regs_get_register(regs, base_offset);
 			indx = regs_get_register(regs, indx_offset);
-			addr = base + indx * (1 << X86_SIB_SCALE(sib));
+
+			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			if (addr_offset < 0)
 				goto out;
-			addr = regs_get_register(regs, addr_offset);
+
+			eff_addr = regs_get_register(regs, addr_offset);
 		}
-		addr += insn->displacement.value;
+
+		eff_addr += insn->displacement.value;
 	}
+
+	linear_addr = (unsigned long)eff_addr;
+
 out:
-	return (void __user *)addr;
+	return (void __user *)linear_addr;
 }
 
 static int mpx_insn_decode(struct insn *insn,


* [tip:x86/mpx] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b
  2017-10-27 20:25 ` [PATCH v10 07/18] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b Ricardo Neri
@ 2017-11-01 20:57   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, adam.buchbinder, qiaowei.ren, tglx, ray.huang, cmetcalf,
	dave.hansen, shuah, lstoakes, colin.king, slaoub, corbet, bp,
	mhiramat, ravi.v.shankar, joe, adanhawthorn, akpm, liverlint,
	peterz, luto, vbabka, paul.gortmaker, mingo, pbonzini, jslaby,
	brgerst, mst, ricardo.neri-calderon, linux-kernel

Commit-ID:  ff9d78025c519046cfbc212b34f09116685402fc
Gitweb:     https://git.kernel.org/tip/ff9d78025c519046cfbc212b34f09116685402fc
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:34 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:09 +0100

x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b

Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod != 11b and
ModRM.rm = 100b, indexed register-indirect addressing is used. In other
words, a SIB byte follows the ModRM byte. In the specific case of
SIB.index = 100b, the scale*index portion of the computation of the
effective address is null. To signal this particular situation to callers,
get_reg_offset() can return -EDOM (-EINVAL continues to indicate an
error when decoding the SIB byte).

An example of this situation can be the following instruction:

   8b 4c 23 80       mov -0x80(%rbx,%riz,1),%rcx
   ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
   SIB:              0x23 [scale:0b][index:100b][base:11b]
   Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)

The %riz 'register' indicates a null index.

In long mode, a REX prefix may be used. When a REX prefix is present,
REX.X adds a fourth bit to the register selection of SIB.index. This gives
the ability to refer to all the 16 general purpose registers. When REX.X is
1b and SIB.index is 100b, the index is indicated in %r12. In our example,
this would look like:

   42 8b 4c 23 80    mov -0x80(%rbx,%r12,1),%rcx
   REX:              0x42 [W:0b][R:0b][X:1b][B:0b]
   ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
   SIB:              0x23 [scale:0b][.X: 1b, index:100b][.B:0b, base:11b]
   Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)

%r12 is a valid register to use in the scale*index part of the effective
address computation.
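
The two encodings above can be checked with a short stand-alone sketch.
demo_sib_index_regno() is a hypothetical helper that works on raw bytes
rather than on struct insn; a REX value of 0 is assumed to mean "no REX
prefix present".

```c
#include <assert.h>
#include <errno.h>

/*
 * Return the index register number selected by the SIB byte, or -EDOM
 * when the encoding means "no index" (ModRM.mod != 3, SIB.index = 100b
 * and REX.X = 0).
 */
static int demo_sib_index_regno(unsigned char modrm, unsigned char sib,
				unsigned char rex)
{
	int mod = (modrm >> 6) & 0x3;
	int regno = (sib >> 3) & 0x7;

	/* rex is either absent (0) or a byte in the 0x40..0x4f range */
	assert(rex == 0 || (rex & 0xf0) == 0x40);

	if (rex & 0x2)		/* REX.X extends SIB.index to four bits */
		regno += 8;

	if (mod != 3 && regno == 4)
		return -EDOM;

	return regno;
}
```

For ModRM 0x4c and SIB 0x23 the helper returns -EDOM without a REX prefix
and 12 (%r12) with REX 0x42.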

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-8-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/mm/mpx.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 57e5bf5..2ad1d4a 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -110,6 +110,15 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 		regno = X86_SIB_INDEX(insn->sib.value);
 		if (X86_REX_X(insn->rex_prefix.value))
 			regno += 8;
+
+		/*
+		 * If ModRM.mod != 3 and SIB.index = 4 the scale*index
+		 * portion of the address computation is null. This is
+		 * true only if REX.X is 0. In such a case, the SIB index
+		 * is used in the address computation.
+		 */
+		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
+			return -EDOM;
 		break;
 
 	case REG_TYPE_BASE:
@@ -160,11 +169,19 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 				goto out;
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-			if (indx_offset < 0)
+			/*
+			 * A negative offset generally means a error, except
+			 * -EDOM, which means that the contents of the register
+			 * should not be used as index.
+			 */
+			if (indx_offset == -EDOM)
+				indx = 0;
+			else if (indx_offset < 0)
 				goto out;
+			else
+				indx = regs_get_register(regs, indx_offset);
 
 			base = regs_get_register(regs, base_offset);
-			indx = regs_get_register(regs, indx_offset);
 
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {


* [tip:x86/mpx] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0
  2017-10-27 20:25 ` [PATCH v10 08/18] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0 Ricardo Neri
@ 2017-11-01 20:57   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: colin.king, vbabka, adanhawthorn, dave.hansen, shuah, cmetcalf,
	slaoub, tglx, ravi.v.shankar, liverlint, akpm,
	ricardo.neri-calderon, adam.buchbinder, joe, ray.huang, mst, bp,
	pbonzini, luto, mingo, hpa, brgerst, jslaby, lstoakes,
	qiaowei.ren, paul.gortmaker, mhiramat, linux-kernel, corbet,
	peterz

Commit-ID:  4578f06fc93fb73c9c644ed838f4cdabbfdc4df1
Gitweb:     https://git.kernel.org/tip/4578f06fc93fb73c9c644ed838f4cdabbfdc4df1
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:35 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:10 +0100

x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0

Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that if a SIB byte is used and
SIB.base is 101b and ModRM.mod is zero, then the base part of the
effective address computation is null. To signal this situation, a -EDOM
error is returned to tell callers to ignore the base value present in the
register operand.

In this scenario, a 32-bit displacement follows the SIB byte. Displacement
is obtained when the instruction decoder parses the operands.
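
As a hedged illustration, the base-register rule can be written against
raw bytes. demo_sib_base_regno() is a hypothetical helper, and a REX value
of 0 is assumed to mean "no REX prefix present". The no-base check is done
before REX.B is applied, in the same order as in the patch.

```c
#include <assert.h>
#include <errno.h>

/*
 * Return the base register number selected by the SIB byte, or -EDOM
 * when ModRM.mod = 0 and SIB.base = 101b, in which case there is no
 * base register and a 32-bit displacement follows the SIB byte.
 */
static int demo_sib_base_regno(unsigned char modrm, unsigned char sib,
			       unsigned char rex)
{
	int mod = (modrm >> 6) & 0x3;
	int regno = sib & 0x7;

	/* rex is either absent (0) or a byte in the 0x40..0x4f range */
	assert(rex == 0 || (rex & 0xf0) == 0x40);

	if (mod == 0 && regno == 5)
		return -EDOM;

	if (rex & 0x1)		/* REX.B extends SIB.base to four bits */
		regno += 8;

	return regno;
}
```

For example, ModRM 0x04 with SIB 0x25 (an absolute 32-bit address with no
base and no index) yields -EDOM, while ModRM 0x44 with the same SIB byte
yields 5 (%rbp), or 13 (%r13) when REX.B is set.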

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-9-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/mm/mpx.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 2ad1d4a..581a960 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -123,6 +123,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 
 	case REG_TYPE_BASE:
 		regno = X86_SIB_BASE(insn->sib.value);
+		/*
+		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
+		 * register-indirect addressing is 0. In this case, a
+		 * 32-bit displacement follows the SIB byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
 		if (X86_REX_B(insn->rex_prefix.value))
 			regno += 8;
 		break;
@@ -164,16 +172,22 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		eff_addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
+			/*
+			 * Negative values in the base and index offset means
+			 * an error when decoding the SIB byte. Except -EDOM,
+			 * which means that the registers should not be used
+			 * in the address computation.
+			 */
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-			if (base_offset < 0)
+			if (base_offset == -EDOM)
+				base = 0;
+			else if (base_offset < 0)
 				goto out;
+			else
+				base = regs_get_register(regs, base_offset);
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-			/*
-			 * A negative offset generally means a error, except
-			 * -EDOM, which means that the contents of the register
-			 * should not be used as index.
-			 */
+
 			if (indx_offset == -EDOM)
 				indx = 0;
 			else if (indx_offset < 0)
@@ -181,8 +195,6 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			else
 				indx = regs_get_register(regs, indx_offset);
 
-			base = regs_get_register(regs, base_offset);
-
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);


* [tip:x86/mpx] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file
  2017-10-27 20:25 ` [PATCH v10 09/18] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file Ricardo Neri
@ 2017-11-01 20:58   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, qiaowei.ren, keescook, ray.huang, paul.gortmaker,
	mst, acme, tglx, dvyukov, cmetcalf, corbet, lstoakes, slaoub,
	mhiramat, ricardo.neri-calderon, colin.king, luto,
	adam.buchbinder, hpa, peterz, adrian.hunter, ravi.v.shankar,
	jslaby, akpm, bp, brgerst, pbonzini, vbabka, thgarnie, shuah,
	mingo, dave.hansen

Commit-ID:  32542ee295bec38e5e1608f8c9d6d28e5a7e6112
Gitweb:     https://git.kernel.org/tip/32542ee295bec38e5e1608f8c9d6d28e5a7e6112
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:36 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:10 +0100

x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file

Other kernel submodules can benefit from using the utility functions
defined in mpx.c to obtain the addresses and values of operands contained
in the general purpose registers. An instance of this is the emulation code
used for instructions protected by the Intel User-Mode Instruction
Prevention feature.

Thus, these functions are relocated to a new insn-eval.c file. The reason
to not relocate these utilities into insn.c is that the latter solely
analyses instructions given by a struct insn without any knowledge of the
meaning of the values of instruction operands. This new utility,
insn-eval.c, aims to be used to resolve userspace linear addresses based on
the contents of the instruction operands as well as the contents of pt_regs
structure.

These utilities come with a separate header. This is to avoid taking insn.c
out of sync with the instruction decoders under tools/obj and tools/perf.
This also avoids adding cumbersome #ifdef's for the #include'd files
required to decode instructions in a kernel context.

Functions are simply relocated. There are no functional or indentation
changes.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-10-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/include/asm/insn-eval.h |  16 ++++
 arch/x86/lib/Makefile            |   2 +-
 arch/x86/lib/insn-eval.c         | 163 +++++++++++++++++++++++++++++++++++++++
 arch/x86/mm/mpx.c                | 156 +------------------------------------
 4 files changed, 182 insertions(+), 155 deletions(-)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
new file mode 100644
index 0000000..5cab1b1
--- /dev/null
+++ b/arch/x86/include/asm/insn-eval.h
@@ -0,0 +1,16 @@
+#ifndef _ASM_X86_INSN_EVAL_H
+#define _ASM_X86_INSN_EVAL_H
+/*
+ * A collection of utility functions for x86 instruction analysis to be
+ * used in a kernel context. Useful when, for instance, making sense
+ * of the registers indicated by operands.
+ */
+
+#include <linux/compiler.h>
+#include <linux/bug.h>
+#include <linux/err.h>
+#include <asm/ptrace.h>
+
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+
+#endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 34a7413..675d7b0 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -23,7 +23,7 @@ lib-y := delay.o misc.o cmdline.o cpu.o
 lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
 lib-y += memcpy_$(BITS).o
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
-lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
+lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
 lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
 
 obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
new file mode 100644
index 0000000..df9418c
--- /dev/null
+++ b/arch/x86/lib/insn-eval.c
@@ -0,0 +1,163 @@
+/*
+ * Utility functions for x86 operand and address decoding
+ *
+ * Copyright (C) Intel Corporation 2017
+ */
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <asm/inat.h>
+#include <asm/insn.h>
+#include <asm/insn-eval.h>
+
+enum reg_type {
+	REG_TYPE_RM = 0,
+	REG_TYPE_INDEX,
+	REG_TYPE_BASE,
+};
+
+static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
+			  enum reg_type type)
+{
+	int regno = 0;
+
+	static const int regoff[] = {
+		offsetof(struct pt_regs, ax),
+		offsetof(struct pt_regs, cx),
+		offsetof(struct pt_regs, dx),
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, sp),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+		offsetof(struct pt_regs, r8),
+		offsetof(struct pt_regs, r9),
+		offsetof(struct pt_regs, r10),
+		offsetof(struct pt_regs, r11),
+		offsetof(struct pt_regs, r12),
+		offsetof(struct pt_regs, r13),
+		offsetof(struct pt_regs, r14),
+		offsetof(struct pt_regs, r15),
+#endif
+	};
+	int nr_registers = ARRAY_SIZE(regoff);
+	/*
+	 * Don't possibly decode a 32-bit instructions as
+	 * reading a 64-bit-only register.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64) && !insn->x86_64)
+		nr_registers -= 8;
+
+	switch (type) {
+	case REG_TYPE_RM:
+		regno = X86_MODRM_RM(insn->modrm.value);
+		if (X86_REX_B(insn->rex_prefix.value))
+			regno += 8;
+		break;
+
+	case REG_TYPE_INDEX:
+		regno = X86_SIB_INDEX(insn->sib.value);
+		if (X86_REX_X(insn->rex_prefix.value))
+			regno += 8;
+
+		/*
+		 * If ModRM.mod != 3 and SIB.index = 4 the scale*index
+		 * portion of the address computation is null. This is
+		 * true only if REX.X is 0. In such a case, the SIB index
+		 * is used in the address computation.
+		 */
+		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
+			return -EDOM;
+		break;
+
+	case REG_TYPE_BASE:
+		regno = X86_SIB_BASE(insn->sib.value);
+		/*
+		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
+		 * register-indirect addressing is 0. In this case, a
+		 * 32-bit displacement follows the SIB byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
+		if (X86_REX_B(insn->rex_prefix.value))
+			regno += 8;
+		break;
+
+	default:
+		pr_err("invalid register type");
+		BUG();
+		break;
+	}
+
+	if (regno >= nr_registers) {
+		WARN_ONCE(1, "decoded an instruction with an invalid register");
+		return -EINVAL;
+	}
+	return regoff[regno];
+}
+
+/*
+ * return the address being referenced be instruction
+ * for rm=3 returning the content of the rm reg
+ * for rm!=3 calculates the address using SIB and Disp
+ */
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+{
+	int addr_offset, base_offset, indx_offset;
+	unsigned long linear_addr = -1L;
+	long eff_addr, base, indx;
+	insn_byte_t sib;
+
+	insn_get_modrm(insn);
+	insn_get_sib(insn);
+	sib = insn->sib.value;
+
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
+		if (addr_offset < 0)
+			goto out;
+
+		eff_addr = regs_get_register(regs, addr_offset);
+	} else {
+		if (insn->sib.nbytes) {
+			/*
+			 * Negative values in the base and index offset means
+			 * an error when decoding the SIB byte. Except -EDOM,
+			 * which means that the registers should not be used
+			 * in the address computation.
+			 */
+			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
+			if (base_offset == -EDOM)
+				base = 0;
+			else if (base_offset < 0)
+				goto out;
+			else
+				base = regs_get_register(regs, base_offset);
+
+			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
+
+			if (indx_offset == -EDOM)
+				indx = 0;
+			else if (indx_offset < 0)
+				goto out;
+			else
+				indx = regs_get_register(regs, indx_offset);
+
+			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+		} else {
+			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
+			if (addr_offset < 0)
+				goto out;
+
+			eff_addr = regs_get_register(regs, addr_offset);
+		}
+
+		eff_addr += insn->displacement.value;
+	}
+
+	linear_addr = (unsigned long)eff_addr;
+
+out:
+	return (void __user *)linear_addr;
+}
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 581a960..2878205 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -12,6 +12,7 @@
 #include <linux/sched/sysctl.h>
 
 #include <asm/insn.h>
+#include <asm/insn-eval.h>
 #include <asm/mman.h>
 #include <asm/mmu_context.h>
 #include <asm/mpx.h>
@@ -60,159 +61,6 @@ static unsigned long mpx_mmap(unsigned long len)
 	return addr;
 }
 
-enum reg_type {
-	REG_TYPE_RM = 0,
-	REG_TYPE_INDEX,
-	REG_TYPE_BASE,
-};
-
-static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
-			  enum reg_type type)
-{
-	int regno = 0;
-
-	static const int regoff[] = {
-		offsetof(struct pt_regs, ax),
-		offsetof(struct pt_regs, cx),
-		offsetof(struct pt_regs, dx),
-		offsetof(struct pt_regs, bx),
-		offsetof(struct pt_regs, sp),
-		offsetof(struct pt_regs, bp),
-		offsetof(struct pt_regs, si),
-		offsetof(struct pt_regs, di),
-#ifdef CONFIG_X86_64
-		offsetof(struct pt_regs, r8),
-		offsetof(struct pt_regs, r9),
-		offsetof(struct pt_regs, r10),
-		offsetof(struct pt_regs, r11),
-		offsetof(struct pt_regs, r12),
-		offsetof(struct pt_regs, r13),
-		offsetof(struct pt_regs, r14),
-		offsetof(struct pt_regs, r15),
-#endif
-	};
-	int nr_registers = ARRAY_SIZE(regoff);
-	/*
-	 * Don't possibly decode a 32-bit instructions as
-	 * reading a 64-bit-only register.
-	 */
-	if (IS_ENABLED(CONFIG_X86_64) && !insn->x86_64)
-		nr_registers -= 8;
-
-	switch (type) {
-	case REG_TYPE_RM:
-		regno = X86_MODRM_RM(insn->modrm.value);
-		if (X86_REX_B(insn->rex_prefix.value))
-			regno += 8;
-		break;
-
-	case REG_TYPE_INDEX:
-		regno = X86_SIB_INDEX(insn->sib.value);
-		if (X86_REX_X(insn->rex_prefix.value))
-			regno += 8;
-
-		/*
-		 * If ModRM.mod != 3 and SIB.index = 4 the scale*index
-		 * portion of the address computation is null. This is
-		 * true only if REX.X is 0. In such a case, the SIB index
-		 * is used in the address computation.
-		 */
-		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
-			return -EDOM;
-		break;
-
-	case REG_TYPE_BASE:
-		regno = X86_SIB_BASE(insn->sib.value);
-		/*
-		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
-		 * register-indirect addressing is 0. In this case, a
-		 * 32-bit displacement follows the SIB byte.
-		 */
-		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
-			return -EDOM;
-
-		if (X86_REX_B(insn->rex_prefix.value))
-			regno += 8;
-		break;
-
-	default:
-		pr_err("invalid register type");
-		BUG();
-		break;
-	}
-
-	if (regno >= nr_registers) {
-		WARN_ONCE(1, "decoded an instruction with an invalid register");
-		return -EINVAL;
-	}
-	return regoff[regno];
-}
-
-/*
- * return the address being referenced be instruction
- * for rm=3 returning the content of the rm reg
- * for rm!=3 calculates the address using SIB and Disp
- */
-static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
-{
-	int addr_offset, base_offset, indx_offset;
-	unsigned long linear_addr = -1L;
-	long eff_addr, base, indx;
-	insn_byte_t sib;
-
-	insn_get_modrm(insn);
-	insn_get_sib(insn);
-	sib = insn->sib.value;
-
-	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
-		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-		if (addr_offset < 0)
-			goto out;
-
-		eff_addr = regs_get_register(regs, addr_offset);
-	} else {
-		if (insn->sib.nbytes) {
-			/*
-			 * Negative values in the base and index offset means
-			 * an error when decoding the SIB byte. Except -EDOM,
-			 * which means that the registers should not be used
-			 * in the address computation.
-			 */
-			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-			if (base_offset == -EDOM)
-				base = 0;
-			else if (base_offset < 0)
-				goto out;
-			else
-				base = regs_get_register(regs, base_offset);
-
-			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-
-			if (indx_offset == -EDOM)
-				indx = 0;
-			else if (indx_offset < 0)
-				goto out;
-			else
-				indx = regs_get_register(regs, indx_offset);
-
-			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
-		} else {
-			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-			if (addr_offset < 0)
-				goto out;
-
-			eff_addr = regs_get_register(regs, addr_offset);
-		}
-
-		eff_addr += insn->displacement.value;
-	}
-
-	linear_addr = (unsigned long)eff_addr;
-
-out:
-	return (void __user *)linear_addr;
-}
-
 static int mpx_insn_decode(struct insn *insn,
 			   struct pt_regs *regs)
 {
@@ -325,7 +173,7 @@ siginfo_t *mpx_generate_siginfo(struct pt_regs *regs)
 	info->si_signo = SIGSEGV;
 	info->si_errno = 0;
 	info->si_code = SEGV_BNDERR;
-	info->si_addr = mpx_get_addr_ref(&insn, regs);
+	info->si_addr = insn_get_addr_ref(&insn, regs);
 	/*
 	 * We were not able to extract an address from the instruction,
 	 * probably because there was something invalid in it.
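
For readers following the relocated insn_get_addr_ref(), the SIB branch
reduces to base + index * 2^scale + displacement. Below is a minimal
user-space sketch of that arithmetic only. It is not the kernel code:
struct sib_operand and sketch_eff_addr() are hypothetical names, and the
register values are passed in directly rather than read from pt_regs. A
register that must be left out of the computation (the -EDOM cases above)
is represented by a zero value.

```c
#include <stdint.h>

/* Hypothetical stand-in for the decoded operand; not struct insn. */
struct sib_operand {
	int64_t base;	/* value of the base register, 0 if unused */
	int64_t index;	/* value of the index register, 0 if unused */
	uint8_t sib;	/* raw SIB byte: scale[7:6] index[5:3] base[2:0] */
	int32_t disp;	/* sign-extended displacement */
};

/* Effective address: base + index * (1 << scale) + displacement. */
static int64_t sketch_eff_addr(const struct sib_operand *op)
{
	unsigned int scale = (op->sib >> 6) & 0x3;

	return op->base + op->index * ((int64_t)1 << scale) + op->disp;
}
```

With base = 0x1000, index = 0x10, a scale field of 2 (multiply by 4) and a
displacement of -8, the sketch yields 0x1038.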


* [tip:x86/mpx] x86/insn-eval: Do not BUG on invalid register type
  2017-10-27 20:25 ` [PATCH v10 10/18] x86/insn-eval: Do not BUG on invalid register type Ricardo Neri
@ 2017-11-01 20:58   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jslaby, acme, shuah, pbonzini, dave.hansen, peterz, hpa,
	keescook, ray.huang, slaoub, adrian.hunter, qiaowei.ren, mst,
	ravi.v.shankar, akpm, cmetcalf, dvyukov, mhiramat, linux-kernel,
	brgerst, mingo, lstoakes, bp, colin.king, vbabka, tglx,
	ricardo.neri-calderon, adam.buchbinder, thgarnie, corbet, luto,
	paul.gortmaker

Commit-ID:  ed594e4ba5bfe268d63d7cee3c1a827e3dd5056f
Gitweb:     https://git.kernel.org/tip/ed594e4ba5bfe268d63d7cee3c1a827e3dd5056f
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:37 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:10 +0100

x86/insn-eval: Do not BUG on invalid register type

We are not in a critical failure path. The invalid register type is caused
when trying to decode invalid instruction bytes from a user-space program.
Thus, simply print an error message. To prevent this warning from being
abused by user-space programs, use the rate-limited variant of pr_err()
along with a descriptive prefix.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-11-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/lib/insn-eval.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index df9418c..4931d92 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -5,10 +5,14 @@
  */
 #include <linux/kernel.h>
 #include <linux/string.h>
+#include <linux/ratelimit.h>
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
 
+#undef pr_fmt
+#define pr_fmt(fmt) "insn: " fmt
+
 enum reg_type {
 	REG_TYPE_RM = 0,
 	REG_TYPE_INDEX,
@@ -85,9 +89,8 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 		break;
 
 	default:
-		pr_err("invalid register type");
-		BUG();
-		break;
+		pr_err_ratelimited("invalid register type: %d\n", type);
+		return -EINVAL;
 	}
 
 	if (regno >= nr_registers) {


* [tip:x86/mpx] x86/insn-eval: Add a utility function to get register offsets
  2017-10-27 20:25 ` [PATCH v10 11/18] x86/insn-eval: Add a utility function to get register offsets Ricardo Neri
@ 2017-11-01 20:59   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: thgarnie, mingo, acme, qiaowei.ren, ricardo.neri-calderon, tglx,
	pbonzini, hpa, corbet, vbabka, adrian.hunter, colin.king, mst,
	slaoub, jslaby, luto, brgerst, keescook, akpm, bp, lstoakes,
	paul.gortmaker, ray.huang, dave.hansen, peterz, shuah,
	adam.buchbinder, ravi.v.shankar, cmetcalf, mhiramat,
	linux-kernel, dvyukov

Commit-ID:  e5e45f11110191740ecb365fa8c7a25814ce8ac8
Gitweb:     https://git.kernel.org/tip/e5e45f11110191740ecb365fa8c7a25814ce8ac8
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:38 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:11 +0100

x86/insn-eval: Add a utility function to get register offsets

The function get_reg_offset() returns the offset, within pt_regs, of the
register selected by an argument of the enumeration type reg_type. Callers
of this function would need the definition of that enumeration, which is
unnecessary. Instead, add helper functions for this purpose. These
functions are useful when, for instance, the caller needs to decide whether
the operand is a register or a memory location by looking at the rm part
of the ModRM byte. As of now, this is the only helper function that is
needed.
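
As an illustration of the decision a caller makes, here is a minimal
user-space sketch of splitting a ModRM byte into its fields. It is not the
kernel code (which uses the X86_MODRM_* macros and struct insn); the
sketch_* names are hypothetical. Per the x86 encoding, mod == 3 means the
r/m field names a register; any other mod value means a memory operand.

```c
#include <stdbool.h>
#include <stdint.h>

/* ModRM layout: mod[7:6] reg[5:3] rm[2:0]. */
static unsigned int sketch_modrm_mod(uint8_t modrm) { return (modrm >> 6) & 0x3; }
static unsigned int sketch_modrm_reg(uint8_t modrm) { return (modrm >> 3) & 0x7; }
static unsigned int sketch_modrm_rm(uint8_t modrm)  { return modrm & 0x7; }

/* True if the r/m field refers to a register rather than memory. */
static bool sketch_rm_is_register(uint8_t modrm)
{
	return sketch_modrm_mod(modrm) == 3;
}

/* Register number in r/m, extended by REX.B to reach r8-r15. */
static unsigned int sketch_rm_regno(uint8_t modrm, bool rex_b)
{
	return sketch_modrm_rm(modrm) + (rex_b ? 8 : 0);
}
```

For example, a ModRM byte of 0xc3 has mod = 3 and rm = 3, so the operand is
register number 3 (bx in the regoff[] ordering used by get_reg_offset()),
whereas 0x03 has mod = 0 and therefore denotes a memory operand.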

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-12-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/include/asm/insn-eval.h |  1 +
 arch/x86/lib/insn-eval.c         | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 5cab1b1..7e8c963 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -12,5 +12,6 @@
 #include <asm/ptrace.h>
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 4931d92..405ffeb 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -100,6 +100,23 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 	return regoff[regno];
 }
 
+/**
+ * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
+ * @insn:	Instruction containing the ModRM byte
+ * @regs:	Register values as seen when entering kernel mode
+ *
+ * Returns:
+ *
+ * The register indicated by the r/m part of the ModRM byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of ModRM does not refer to a register and shall be ignored.
+ */
+int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
+{
+	return get_reg_offset(insn, regs, REG_TYPE_RM);
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg


* [tip:x86/mpx] x86/insn-eval: Add utility function to identify string instructions
  2017-10-27 20:25 ` [PATCH v10 12/18] x86/insn-eval: Add utility function to identify string instructions Ricardo Neri
@ 2017-11-01 20:59   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 20:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ravi.v.shankar, peterz, lstoakes, ricardo.neri-calderon,
	colin.king, qiaowei.ren, mingo, slaoub, mhiramat, keescook, bp,
	paul.gortmaker, corbet, tglx, vbabka, adam.buchbinder, luto, hpa,
	dvyukov, thgarnie, ray.huang, mst, dave.hansen, shuah, cmetcalf,
	akpm, adrian.hunter, jslaby, acme, brgerst, pbonzini,
	linux-kernel

Commit-ID:  536b815388f7f4d2a7cd1418939902fb037ea370
Gitweb:     https://git.kernel.org/tip/536b815388f7f4d2a7cd1418939902fb037ea370
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:39 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:11 +0100

x86/insn-eval: Add utility function to identify string instructions

String instructions are special because, in protected mode, the linear
address is always obtained via the ES segment register in operands that
use the (E)DI register, and via the DS segment register in operands that
use the (E)SI register. Furthermore, segment override prefixes are ignored
when calculating a linear address involving the (E)DI register, whereas
they can be used when calculating linear addresses involving the (E)SI
register.

It follows that linear addresses are calculated differently for the case of
string instructions. The purpose of this utility function is to identify
such instructions for callers to determine a linear address correctly.

Note that this function only identifies string instructions; it does not
determine what segment register to use in the address computation. That is
left to callers. A subsequent commit introduces a function to determine
the segment register to use given the instruction, operands and
segment override prefixes.
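
For illustration, the opcode check can be sketched in portable C as below.
This is a user-space approximation, not the function added by this patch:
it takes the raw opcode bytes instead of a struct insn, avoids the GNU
case-range extension, and sketch_is_string_opcode() is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the opcode bytes encode INS/OUTS/MOVS/CMPS/STOS/LODS/SCAS. */
static bool sketch_is_string_opcode(const uint8_t *opcode, size_t nbytes)
{
	uint8_t op;

	/* All string instructions have a 1-byte opcode. */
	if (nbytes != 1)
		return false;

	op = opcode[0];

	return (op >= 0x6c && op <= 0x6f) ||	/* INS, OUTS */
	       (op >= 0xa4 && op <= 0xa7) ||	/* MOVS, CMPS */
	       (op >= 0xaa && op <= 0xaf);	/* STOS, LODS, SCAS */
}
```

Note that 0xa8 and 0xa9 (TEST with an immediate) fall between the MOVS/CMPS
and STOS/LODS/SCAS ranges and are deliberately excluded.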

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-13-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/lib/insn-eval.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 405ffeb..ac7b87c 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -19,6 +19,34 @@ enum reg_type {
 	REG_TYPE_BASE,
 };
 
+/**
+ * is_string_insn() - Determine if instruction is a string instruction
+ * @insn:	Instruction containing the opcode to inspect
+ *
+ * Returns:
+ *
+ * true if the instruction, determined by the opcode, is any of the
+ * string instructions as defined in the Intel Software Development manual.
+ * False otherwise.
+ */
+static bool is_string_insn(struct insn *insn)
+{
+	insn_get_opcode(insn);
+
+	/* All string instructions have a 1-byte opcode. */
+	if (insn->opcode.nbytes != 1)
+		return false;
+
+	switch (insn->opcode.bytes[0]) {
+	case 0x6c ... 0x6f:	/* INS, OUTS */
+	case 0xa4 ... 0xa7:	/* MOVS, CMPS */
+	case 0xaa ... 0xaf:	/* STOS, LODS, SCAS */
+		return true;
+	default:
+		return false;
+	}
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 			  enum reg_type type)
 {


* [tip:x86/mpx] x86/insn-eval: Add utility functions to get segment selector
  2017-10-27 20:25 ` [PATCH v10 13/18] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
@ 2017-11-01 21:00   ` tip-bot for Ricardo Neri
  2017-11-09 11:12   ` [PATCH v10 13/18] " Arnd Bergmann
  1 sibling, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 21:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, akpm, cmetcalf, colin.king, adrian.hunter, shuah,
	corbet, adam.buchbinder, luto, mingo, dvyukov, brgerst,
	ravi.v.shankar, bp, tglx, paul.gortmaker, ray.huang, dave.hansen,
	thgarnie, peterz, qiaowei.ren, ricardo.neri-calderon, keescook,
	pbonzini, vbabka, slaoub, mst, acme, mhiramat, hpa, lstoakes,
	jslaby

Commit-ID:  32d0b95300db03c2b23b2ea2c94769a4a138e79d
Gitweb:     https://git.kernel.org/tip/32d0b95300db03c2b23b2ea2c94769a4a138e79d
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:40 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:11 +0100

x86/insn-eval: Add utility functions to get segment selector

When computing a linear address and segmentation is used, we need to know
the base address of the segment involved in the computation. In most
cases, the segment base address will be zero as in USER_DS/USER32_DS.
However, a user space program may define its own segments via a local
descriptor table. In such a case, the segment base address may not be
zero. Thus, the segment base address is needed to correctly calculate the
linear address.

If running in protected mode, the segment selector to be used when
computing a linear address is determined either by any of the segment
override prefixes in the instruction or inferred from the registers
involved in the computation of the effective address; in that order. Also,
there are cases when the segment override prefixes shall be ignored (e.g.,
code segments are always selected by the CS segment register; string
instructions always use the ES segment register when using the rDI register
as operand). In long mode, segment registers are ignored, except for FS and
GS. In these two cases, base addresses are obtained from the respective
MSRs.

For clarity, this process can be split into four steps (and an equal
number of functions): determine if segment override prefixes can be used;
parse the segment override prefixes, and use them if found; if not found
or they cannot be used, use the default segment register associated with
the operand registers; and, once the segment register to use has been
identified, read its value to obtain the segment selector.
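
To illustrate the prefix-parsing step, a minimal user-space sketch follows.
It is not the kernel code: it matches the raw legacy prefix bytes (0x2e CS,
0x36 SS, 0x3e DS, 0x26 ES, 0x64 FS, 0x65 GS) instead of going through
inat_get_opcode_attribute(), and the SKETCH_* constants and
sketch_seg_override() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	SKETCH_SEG_DEFAULT,	/* no override prefix present */
	SKETCH_SEG_CS,
	SKETCH_SEG_SS,
	SKETCH_SEG_DS,
	SKETCH_SEG_ES,
	SKETCH_SEG_FS,
	SKETCH_SEG_GS,
};

/*
 * Scan the prefix bytes for segment overrides. Returns one of the
 * SKETCH_SEG_* constants, or -1 if more than one override is present.
 */
static int sketch_seg_override(const uint8_t *prefixes, size_t nbytes)
{
	int seg = SKETCH_SEG_DEFAULT;
	int num_overrides = 0;
	size_t i;

	for (i = 0; i < nbytes; i++) {
		switch (prefixes[i]) {
		case 0x2e: seg = SKETCH_SEG_CS; num_overrides++; break;
		case 0x36: seg = SKETCH_SEG_SS; num_overrides++; break;
		case 0x3e: seg = SKETCH_SEG_DS; num_overrides++; break;
		case 0x26: seg = SKETCH_SEG_ES; num_overrides++; break;
		case 0x64: seg = SKETCH_SEG_FS; num_overrides++; break;
		case 0x65: seg = SKETCH_SEG_GS; num_overrides++; break;
		default: break;	/* not a segment override prefix */
		}
	}

	if (num_overrides > 1)
		return -1;

	return seg;
}
```

An operand-size prefix (0x66) followed by an FS override (0x64) resolves to
FS in this sketch, while a CS override followed by an SS override is
rejected.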

The method to obtain the segment selector depends on several factors. In
32-bit builds, segment selectors are saved into a pt_regs structure
when switching to kernel mode. The same is also true for virtual-8086
mode. In 64-bit builds, segmentation is mostly ignored, except when
running a program in 32-bit legacy mode. In this case, CS and SS can be
obtained from pt_regs. DS, ES, FS and GS can be read directly from
the respective segment registers.

In order to identify the segment registers, a new set of #defines is
introduced. It also includes two special identifiers. One of them
indicates when the default segment register associated with instruction
operands shall be used. Another one indicates that the contents of the
segment register shall be ignored; this identifier is used when in long
mode.
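
As a rough sketch of how the two special identifiers interact with the
others, the resolution order described above can be written as follows.
This is an illustration under simplifying assumptions, not the code in the
patch: the caller supplies the parsed override and the operand's default
segment as plain integers, and the SK_* constants and sketch_resolve_seg()
are hypothetical names that merely mirror the INAT_SEG_REG_* ordering.

```c
#include <stdbool.h>

enum {
	SK_IGNORE,	/* long mode: segment register contents ignored */
	SK_DEFAULT,	/* no override: use the operand's default segment */
	SK_CS, SK_SS, SK_DS, SK_ES, SK_FS, SK_GS,
};

/*
 * Pick the segment register identifier for an operand.
 * @long_mode:         true if running in 64-bit (long) mode
 * @overrides_allowed: false for, e.g., rDI in a string instruction
 * @override:          parsed override prefix, or SK_DEFAULT if none
 * @operand_default:   default segment for the operand register
 */
static int sketch_resolve_seg(bool long_mode, bool overrides_allowed,
			      int override, int operand_default)
{
	int seg;

	if (!overrides_allowed || override == SK_DEFAULT)
		seg = operand_default;
	else
		seg = override;

	/* In long mode only FS and GS remain meaningful. */
	if (long_mode && seg != SK_FS && seg != SK_GS)
		return SK_IGNORE;

	return seg;
}
```

For instance, a DS override on the rDI operand of a string instruction is
disregarded and ES is used, and in long mode a CS override resolves to the
"ignore" identifier while a GS override is kept.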

Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-14-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/include/asm/inat.h |  10 ++
 arch/x86/lib/insn-eval.c    | 340 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 350 insertions(+)

diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
index 02aff08..1c78580 100644
--- a/arch/x86/include/asm/inat.h
+++ b/arch/x86/include/asm/inat.h
@@ -97,6 +97,16 @@
 #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
 #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
 
+/* Identifiers for segment registers */
+#define INAT_SEG_REG_IGNORE	0
+#define INAT_SEG_REG_DEFAULT	1
+#define INAT_SEG_REG_CS		2
+#define INAT_SEG_REG_SS		3
+#define INAT_SEG_REG_DS		4
+#define INAT_SEG_REG_ES		5
+#define INAT_SEG_REG_FS		6
+#define INAT_SEG_REG_GS		7
+
 /* Attribute search APIs */
 extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode);
 extern int inat_get_last_prefix_id(insn_byte_t last_pfx);
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index ac7b87c..6a902b1 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -9,6 +9,7 @@
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/vm86.h>
 
 #undef pr_fmt
 #define pr_fmt(fmt) "insn: " fmt
@@ -47,6 +48,345 @@ static bool is_string_insn(struct insn *insn)
 	}
 }
 
+/**
+ * get_seg_reg_override_idx() - obtain segment register override index
+ * @insn:	Valid instruction with segment override prefixes
+ *
+ * Inspect the instruction prefixes in @insn and find segment overrides, if any.
+ *
+ * Returns:
+ *
+ * A constant identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_DEFAULT is returned if no segment override
+ * prefixes were found.
+ *
+ * -EINVAL in case of error.
+ */
+static int get_seg_reg_override_idx(struct insn *insn)
+{
+	int idx = INAT_SEG_REG_DEFAULT;
+	int num_overrides = 0, i;
+
+	insn_get_prefixes(insn);
+
+	/* Look for any segment override prefixes. */
+	for (i = 0; i < insn->prefixes.nbytes; i++) {
+		insn_attr_t attr;
+
+		attr = inat_get_opcode_attribute(insn->prefixes.bytes[i]);
+		switch (attr) {
+		case INAT_MAKE_PREFIX(INAT_PFX_CS):
+			idx = INAT_SEG_REG_CS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_SS):
+			idx = INAT_SEG_REG_SS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_DS):
+			idx = INAT_SEG_REG_DS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_ES):
+			idx = INAT_SEG_REG_ES;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_FS):
+			idx = INAT_SEG_REG_FS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_GS):
+			idx = INAT_SEG_REG_GS;
+			num_overrides++;
+			break;
+		/* No default action needed. */
+		}
+	}
+
+	/* More than one segment override prefix leads to undefined behavior. */
+	if (num_overrides > 1)
+		return -EINVAL;
+
+	return idx;
+}
+
+/**
+ * check_seg_overrides() - check if segment override prefixes are allowed
+ * @insn:	Valid instruction with segment override prefixes
+ * @regoff:	Operand offset, in pt_regs, for which the check is performed
+ *
+ * For a particular register used in register-indirect addressing, determine if
+ * segment override prefixes can be used. Specifically, no overrides are allowed
+ * for rDI if used with a string instruction.
+ *
+ * Returns:
+ *
+ * True if segment override prefixes can be used with the register indicated
+ * in @regoff. False if otherwise.
+ */
+static bool check_seg_overrides(struct insn *insn, int regoff)
+{
+	if (regoff == offsetof(struct pt_regs, di) && is_string_insn(insn))
+		return false;
+
+	return true;
+}
+
+/**
+ * resolve_default_seg() - resolve default segment register index for an operand
+ * @insn:	Instruction with opcode and address size. Must be valid.
+ * @regs:	Register values as seen when entering kernel mode
+ * @off:	Operand offset, in pt_regs, for which resolution is needed
+ *
+ * Resolve the default segment register index associated with the instruction
+ * operand register indicated by @off. Such index is resolved based on defaults
+ * described in the Intel Software Development Manual.
+ *
+ * Returns:
+ *
+ * If in protected mode, a constant identifying the segment register to use,
+ * among CS, SS, ES or DS. If in long mode, INAT_SEG_REG_IGNORE.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
+{
+	if (user_64bit_mode(regs))
+		return INAT_SEG_REG_IGNORE;
+	/*
+	 * Resolve the default segment register as described in Section 3.7.4
+	 * of the Intel Software Development Manual Vol. 1:
+	 *
+	 *  + DS for all references involving r[ABCD]X, and rSI.
+	 *  + If used in a string instruction, ES for rDI. Otherwise, DS.
+	 *  + AX, CX and DX are not valid register operands in 16-bit address
+	 *    encodings but are valid for 32-bit and 64-bit encodings.
+	 *  + -EDOM is reserved to identify cases in which no register
+	 *    is used (i.e., displacement-only addressing). Use DS.
+	 *  + SS for rSP or rBP.
+	 *  + CS for rIP.
+	 */
+
+	switch (off) {
+	case offsetof(struct pt_regs, ax):
+	case offsetof(struct pt_regs, cx):
+	case offsetof(struct pt_regs, dx):
+		/* Need insn to verify address size. */
+		if (insn->addr_bytes == 2)
+			return -EINVAL;
+
+	case -EDOM:
+	case offsetof(struct pt_regs, bx):
+	case offsetof(struct pt_regs, si):
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, di):
+		if (is_string_insn(insn))
+			return INAT_SEG_REG_ES;
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, bp):
+	case offsetof(struct pt_regs, sp):
+		return INAT_SEG_REG_SS;
+
+	case offsetof(struct pt_regs, ip):
+		return INAT_SEG_REG_CS;
+
+	default:
+		return -EINVAL;
+	}
+}
+
+/**
+ * resolve_seg_reg() - obtain segment register index
+ * @insn:	Instruction with operands
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Operand offset, in pt_regs, used to determine segment register
+ *
+ * Determine the segment register associated with the operands and, if
+ * applicable, prefixes and the instruction pointed by @insn.
+ *
+ * The segment register associated to an operand used in register-indirect
+ * addressing depends on:
+ *
+ * a) Whether running in long mode (in such a case segments are ignored, except
+ * if FS or GS are used).
+ *
+ * b) Whether segment override prefixes can be used. Certain instructions and
+ *    registers do not allow override prefixes.
+ *
+ * c) Whether segment override prefixes are found in the instruction prefixes.
+ *
+ * d) If there are no segment override prefixes or they cannot be used, the
+ *    default segment register associated with the operand register is used.
+ *
+ * The function checks first if segment override prefixes can be used with the
+ * operand indicated by @regoff. If allowed, obtain such overridden segment
+ * register index. Lastly, if no prefixes were found or cannot be used, resolve
+ * the segment register index to use based on the defaults described in the
+ * Intel documentation. In long mode, all segment register indexes will be
+ * ignored, except if overrides were found for FS or GS. All these operations
+ * are done using helper functions.
+ *
+ * The operand register, @regoff, is represented as the offset from the base of
+ * pt_regs.
+ *
+ * As stated, the main use of this function is to determine the segment register
+ * index based on the instruction, its operands and prefixes. Hence, @insn
+ * must be valid. However, if @regoff indicates rIP, we don't need to inspect
+ * @insn at all as in this case CS is used in all cases. This case is checked
+ * before proceeding further.
+ *
+ * Please note that this function does not return the value in the segment
+ * register (i.e., the segment selector) but our defined index. The segment
+ * selector needs to be obtained using get_segment_selector() and passing the
+ * segment register index resolved by this function.
+ *
+ * Returns:
+ *
+ * An index identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_IGNORE is returned if running in long mode.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
+{
+	int idx;
+
+	/*
+	 * In the unlikely event of having to resolve the segment register
+	 * index for rIP, do it first. Segment override prefixes should not
+	 * be used. Hence, it is not necessary to inspect the instruction,
+	 * which may be invalid at this point.
+	 */
+	if (regoff == offsetof(struct pt_regs, ip)) {
+		if (user_64bit_mode(regs))
+			return INAT_SEG_REG_IGNORE;
+		else
+			return INAT_SEG_REG_CS;
+	}
+
+	if (!insn)
+		return -EINVAL;
+
+	if (!check_seg_overrides(insn, regoff))
+		return resolve_default_seg(insn, regs, regoff);
+
+	idx = get_seg_reg_override_idx(insn);
+	if (idx < 0)
+		return idx;
+
+	if (idx == INAT_SEG_REG_DEFAULT)
+		return resolve_default_seg(insn, regs, regoff);
+
+	/*
+	 * In long mode, segment override prefixes are ignored, except for
+	 * overrides for FS and GS.
+	 */
+	if (user_64bit_mode(regs)) {
+		if (idx != INAT_SEG_REG_FS &&
+		    idx != INAT_SEG_REG_GS)
+			idx = INAT_SEG_REG_IGNORE;
+	}
+
+	return idx;
+}
+
+/**
+ * get_segment_selector() - obtain segment selector
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Segment register index to use
+ *
+ * Obtain the segment selector from any of the CS, SS, DS, ES, FS, GS segment
+ * registers. In CONFIG_X86_32, the segment is obtained from either pt_regs or
+ * kernel_vm86_regs as applicable. In CONFIG_X86_64, CS and SS are obtained
+ * from pt_regs. DS, ES, FS and GS are obtained by reading the actual CPU
+ * registers. This is done only for completeness as in CONFIG_X86_64 segment
+ * registers are ignored.
+ *
+ * Returns:
+ *
+ * Value of the segment selector, including null when running in
+ * long mode.
+ *
+ * -EINVAL on error.
+ */
+static short get_segment_selector(struct pt_regs *regs, int seg_reg_idx)
+{
+#ifdef CONFIG_X86_64
+	unsigned short sel;
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_IGNORE:
+		return 0;
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		savesegment(ds, sel);
+		return sel;
+	case INAT_SEG_REG_ES:
+		savesegment(es, sel);
+		return sel;
+	case INAT_SEG_REG_FS:
+		savesegment(fs, sel);
+		return sel;
+	case INAT_SEG_REG_GS:
+		savesegment(gs, sel);
+		return sel;
+	default:
+		return -EINVAL;
+	}
+#else /* CONFIG_X86_32 */
+	struct kernel_vm86_regs *vm86regs = (struct kernel_vm86_regs *)regs;
+
+	if (v8086_mode(regs)) {
+		switch (seg_reg_idx) {
+		case INAT_SEG_REG_CS:
+			return (unsigned short)(regs->cs & 0xffff);
+		case INAT_SEG_REG_SS:
+			return (unsigned short)(regs->ss & 0xffff);
+		case INAT_SEG_REG_DS:
+			return vm86regs->ds;
+		case INAT_SEG_REG_ES:
+			return vm86regs->es;
+		case INAT_SEG_REG_FS:
+			return vm86regs->fs;
+		case INAT_SEG_REG_GS:
+			return vm86regs->gs;
+		case INAT_SEG_REG_IGNORE:
+			/* fall through */
+		default:
+			return -EINVAL;
+		}
+	}
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		return (unsigned short)(regs->ds & 0xffff);
+	case INAT_SEG_REG_ES:
+		return (unsigned short)(regs->es & 0xffff);
+	case INAT_SEG_REG_FS:
+		return (unsigned short)(regs->fs & 0xffff);
+	case INAT_SEG_REG_GS:
+		/*
+		 * GS may or may not be in regs as per CONFIG_X86_32_LAZY_GS.
+		 * The macro below takes care of both cases.
+		 */
+		return get_user_gs(regs);
+	case INAT_SEG_REG_IGNORE:
+		/* fall through */
+	default:
+		return -EINVAL;
+	}
+#endif /* CONFIG_X86_64 */
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 			  enum reg_type type)
 {
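
For readers following resolve_default_seg() in the patch above, the
default-segment rules can be sketched outside the kernel as shown
below. This is only an illustration under stated assumptions: the enum
names and the default_seg() helper are hypothetical and replace the
pt_regs offsets and INAT_SEG_REG_* constants used by the real code.

```c
/* Hypothetical stand-ins for the INAT_SEG_REG_* indexes. */
enum seg { SEG_IGNORE, SEG_CS, SEG_SS, SEG_DS, SEG_ES };

/* Hypothetical stand-ins for pt_regs offsets; REG_NONE models -EDOM. */
enum reg { REG_NONE, REG_AX, REG_BX, REG_CX, REG_DX,
	   REG_SI, REG_DI, REG_BP, REG_SP, REG_IP };

/* Returns a segment index, or -1 when the operand is not valid. */
static int default_seg(enum reg r, int addr_bytes, int is_string_insn,
		       int long_mode)
{
	/* In long mode the default segment is ignored. */
	if (long_mode)
		return SEG_IGNORE;

	switch (r) {
	case REG_AX:
	case REG_CX:
	case REG_DX:
		/* Not valid base registers in 16-bit address encodings. */
		if (addr_bytes == 2)
			return -1;
		return SEG_DS;
	case REG_NONE:	/* displacement-only addressing */
	case REG_BX:
	case REG_SI:
		return SEG_DS;
	case REG_DI:
		/* String instructions use ES for rDI. */
		return is_string_insn ? SEG_ES : SEG_DS;
	case REG_BP:
	case REG_SP:
		return SEG_SS;
	case REG_IP:
		return SEG_CS;
	}

	return -1;
}
```

For example, default_seg(REG_BP, 4, 0, 0) yields SEG_SS, while
default_seg(REG_AX, 2, 0, 0) is rejected.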


* [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-10-27 20:25 ` [PATCH v10 14/18] x86/insn-eval: Add utility function to get segment descriptor Ricardo Neri
@ 2017-11-01 21:00   ` tip-bot for Ricardo Neri
  2017-12-05 17:48     ` Peter Zijlstra
  0 siblings, 1 reply; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 21:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ravi.v.shankar, slaoub, dvyukov, ray.huang, tglx, corbet,
	ricardo.neri-calderon, adrian.hunter, peterz, thgarnie, keescook,
	paul.gortmaker, bp, hpa, lstoakes, acme, shuah, brgerst,
	cmetcalf, akpm, vbabka, colin.king, linux-kernel, mingo,
	pbonzini, jslaby, mst, mhiramat, dave.hansen, qiaowei.ren,
	adam.buchbinder, luto

Commit-ID:  670f928ba09b06712da34a3c44be6c8fa561fb19
Gitweb:     https://git.kernel.org/tip/670f928ba09b06712da34a3c44be6c8fa561fb19
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:41 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:12 +0100

x86/insn-eval: Add utility function to get segment descriptor

The segment descriptor contains information that is relevant to how linear
addresses need to be computed. It contains the default size of addresses
as well as the base address of the segment. Thus, given a segment
selector, we ought to look at the segment descriptor to correctly calculate
the linear address.

In protected mode, the segment selector might indicate a segment
descriptor from either the global descriptor table or a local descriptor
table. Both cases are considered in this function.

This function is a prerequisite for functions in subsequent commits that
will obtain the aforementioned attributes of the segment descriptor.
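
As a rough illustration of how a selector maps to a descriptor, the
sketch below splits a selector into its architectural fields. It is a
user-space approximation, not the kernel code: the struct, the macro
names and decode_selector() are hypothetical, while the bit layout
(RPL in bits [1:0], table indicator in bit 2, index in bits [15:3]) is
the one the patch relies on.

```c
#include <stdint.h>

#define SEL_RPL_MASK	0x3u	/* bits [1:0]: requested privilege level */
#define SEL_TI_MASK	0x4u	/* bit 2: 0 selects the GDT, 1 the LDT */

struct sel_fields {
	unsigned int index;	/* bits [15:3]: entry number in the table */
	unsigned int is_ldt;	/* nonzero if the LDT is selected */
	unsigned int rpl;	/* requested privilege level */
	unsigned int byte_off;	/* offset of the 8-byte entry in the table */
};

static struct sel_fields decode_selector(uint16_t sel)
{
	struct sel_fields f;

	f.index = sel >> 3;
	f.is_ldt = (sel & SEL_TI_MASK) != 0;
	f.rpl = sel & SEL_RPL_MASK;
	/* The index is already scaled by 8 once bits [2:0] are cleared. */
	f.byte_off = sel & ~(SEL_RPL_MASK | SEL_TI_MASK);

	return f;
}
```

With selector 0x2b, this gives index 5 in the GDT, RPL 3 and a byte
offset of 0x28 from the table base.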

Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-15-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/lib/insn-eval.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 6a902b1..d85e840 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -6,9 +6,13 @@
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/ratelimit.h>
+#include <linux/mmu_context.h>
+#include <asm/desc_defs.h>
+#include <asm/desc.h>
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/ldt.h>
 #include <asm/vm86.h>
 
 #undef pr_fmt
@@ -469,6 +473,59 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_desc() - Obtain pointer to a segment descriptor
+ * @sel:	Segment selector
+ *
+ * Given a segment selector, obtain a pointer to the segment descriptor.
+ * Both global and local descriptor tables are supported.
+ *
+ * Returns:
+ *
+ * Pointer to segment descriptor on success.
+ *
+ * NULL on error.
+ */
+static struct desc_struct *get_desc(unsigned short sel)
+{
+	struct desc_ptr gdt_desc = {0, 0};
+	unsigned long desc_base;
+
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+	if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+		struct desc_struct *desc = NULL;
+		struct ldt_struct *ldt;
+
+		/* Bits [15:3] contain the index of the desired entry. */
+		sel >>= 3;
+
+		mutex_lock(&current->active_mm->context.lock);
+		ldt = current->active_mm->context.ldt;
+		if (ldt && sel < ldt->nr_entries)
+			desc = &ldt->entries[sel];
+
+		mutex_unlock(&current->active_mm->context.lock);
+
+		return desc;
+	}
+#endif
+	native_store_gdt(&gdt_desc);
+
+	/*
+	 * Segment descriptors have a size of 8 bytes. Thus, the index is
+	 * multiplied by 8 to obtain the memory offset of the desired descriptor
+	 * from the base of the GDT. As bits [15:3] of the segment selector
+	 * contain the index, it can be regarded as multiplied by 8 already.
+	 * All that remains is to clear bits [2:0].
+	 */
+	desc_base = sel & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
+
+	if (desc_base > gdt_desc.size)
+		return NULL;
+
+	return (struct desc_struct *)(gdt_desc.address + desc_base);
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode


* [tip:x86/mpx] x86/insn-eval: Add utility functions to get segment descriptor base address and limit
  2017-10-27 20:25 ` [PATCH v10 15/18] x86/insn-eval: Add utility functions to get segment descriptor base address and limit Ricardo Neri
@ 2017-11-01 21:00   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 21:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jslaby, tglx, corbet, brgerst, thgarnie, colin.king, mst, akpm,
	adrian.hunter, luto, cmetcalf, pbonzini, vbabka, dave.hansen,
	dvyukov, acme, keescook, slaoub, hpa, paul.gortmaker, ray.huang,
	ravi.v.shankar, peterz, ricardo.neri-calderon, mhiramat, bp,
	qiaowei.ren, linux-kernel, lstoakes, mingo, adam.buchbinder,
	shuah

Commit-ID:  bd5a410a5de3a6893eaacc749e706b85506dc908
Gitweb:     https://git.kernel.org/tip/bd5a410a5de3a6893eaacc749e706b85506dc908
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:42 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:12 +0100

x86/insn-eval: Add utility functions to get segment descriptor base address and limit

With segmentation, the base address of the segment is needed to compute a
linear address. This base address is obtained from the applicable segment
descriptor. Such segment descriptor is referenced from a segment selector.
These new functions obtain the segment base and limit of the segment
selector indicated by the segment register index given as argument. This
index is any of the INAT_SEG_REG_* family of #define's.

The logic to obtain the segment selector is wrapped in the function
get_segment_selector() with the inputs described above. Once the selector
is known, the base address is determined. In protected mode, the selector
is used to obtain the segment descriptor and then its base address. In
long mode, the segment base address is zero except when FS or GS are used.
In virtual-8086 mode, the base address is computed as the value of the
segment selector shifted 4 positions to the left.
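
The virtual-8086 case can be sketched as follows. The helper names are
hypothetical and the code only models the arithmetic described above;
it does not truncate the result to any particular address size.

```c
#include <stdint.h>

/* Segment base in virtual-8086 mode: selector shifted left by 4 bits. */
static uint32_t vm86_seg_base(uint16_t sel)
{
	return (uint32_t)sel << 4;
}

/* Linear address for a selector:offset pair in virtual-8086 mode. */
static uint32_t vm86_linear(uint16_t sel, uint16_t off)
{
	return vm86_seg_base(sel) + off;
}
```

For instance, selector 0xb800 has base 0xb8000, and 0x1234:0x0010
resolves to 0x12350.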

In protected mode, segment limits are enforced. Thus, a function to
determine the limit of the segment is added. Segment limits are not
enforced in long or virtual-8086 mode. For the latter, addresses are limited
to 20 bits; address size will be handled when computing the linear
address.
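
The effect of the granularity bit on the limit can be sketched as
below. seg_limit_bytes() is a hypothetical helper that takes the raw
20-bit limit and the G bit as plain integers instead of reading them
from a struct desc_struct.

```c
#include <stdint.h>

/*
 * Convert the raw 20-bit limit of a descriptor into the offset of the
 * last valid byte. With G set, the limit counts 4096-byte units and the
 * low 12 bits of an offset are not checked, hence the trailing 0xfff.
 */
static uint32_t seg_limit_bytes(uint32_t raw_limit, int g)
{
	raw_limit &= 0xfffff;

	if (g)
		return (raw_limit << 12) + 0xfff;

	return raw_limit;
}
```

A flat 4 GiB segment, with raw limit 0xfffff and G set, therefore ends
at 0xffffffff.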

Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-16-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/include/asm/insn-eval.h |   1 +
 arch/x86/lib/insn-eval.c         | 114 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 115 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 7e8c963..25d6e44 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -13,5 +13,6 @@
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
+unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index d85e840..89d5c89 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -526,6 +526,120 @@ static struct desc_struct *get_desc(unsigned short sel)
 }
 
 /**
+ * insn_get_seg_base() - Obtain base address of segment descriptor.
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Index of the segment register pointing to seg descriptor
+ *
+ * Obtain the base address of the segment as indicated by the segment descriptor
+ * pointed by the segment selector. The segment selector is obtained from the
+ * input segment register index @seg_reg_idx.
+ *
+ * Returns:
+ *
+ * In protected mode, base address of the segment. Zero in long mode,
+ * except when FS or GS are used. In virtual-8086 mode, the segment
+ * selector shifted 4 bits to the left.
+ *
+ * -1L in case of error.
+ */
+unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
+{
+	struct desc_struct *desc;
+	short sel;
+
+	sel = get_segment_selector(regs, seg_reg_idx);
+	if (sel < 0)
+		return -1L;
+
+	if (v8086_mode(regs))
+		/*
+		 * Base is simply the segment selector shifted 4
+		 * bits to the left.
+		 */
+		return (unsigned long)(sel << 4);
+
+	if (user_64bit_mode(regs)) {
+		/*
+		 * Only FS or GS will have a base address, the rest of
+		 * the segments' bases are forced to 0.
+		 */
+		unsigned long base;
+
+		if (seg_reg_idx == INAT_SEG_REG_FS)
+			rdmsrl(MSR_FS_BASE, base);
+		else if (seg_reg_idx == INAT_SEG_REG_GS)
+			/*
+			 * swapgs was called at the kernel entry point. Thus,
+			 * MSR_KERNEL_GS_BASE will have the user-space GS base.
+			 */
+			rdmsrl(MSR_KERNEL_GS_BASE, base);
+		else
+			base = 0;
+		return base;
+	}
+
+	/* In protected mode the segment selector cannot be null. */
+	if (!sel)
+		return -1L;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return -1L;
+
+	return get_desc_base(desc);
+}
+
+/**
+ * get_seg_limit() - Obtain the limit of a segment descriptor
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Index of the segment register pointing to seg descriptor
+ *
+ * Obtain the limit of the segment as indicated by the segment descriptor
+ * pointed by the segment selector. The segment selector is obtained from the
+ * input segment register index @seg_reg_idx.
+ *
+ * Returns:
+ *
+ * In protected mode, the limit of the segment descriptor in bytes.
+ * In long mode and virtual-8086 mode, segment limits are not enforced. Thus,
+ * limit is returned as -1L to imply a limit-less segment.
+ *
+ * Zero is returned on error.
+ */
+static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
+{
+	struct desc_struct *desc;
+	unsigned long limit;
+	short sel;
+
+	sel = get_segment_selector(regs, seg_reg_idx);
+	if (sel < 0)
+		return 0;
+
+	if (user_64bit_mode(regs) || v8086_mode(regs))
+		return -1L;
+
+	if (!sel)
+		return 0;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return 0;
+
+	/*
+	 * If the granularity bit is set, the limit is given in multiples
+	 * of 4096. This also means that the 12 least significant bits are
+	 * not tested when checking the segment limits. In practice,
+	 * this means that the segment ends in (limit << 12) + 0xfff.
+	 */
+	limit = get_desc_limit(desc);
+	if (desc->g)
+		limit = (limit << 12) + 0xfff;
+
+	return limit;
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode


* [tip:x86/mpx] x86/insn-eval: Add function to get default params of code segment
  2017-10-27 20:25 ` [PATCH v10 16/18] x86/insn-eval: Add function to get default params of code segment Ricardo Neri
@ 2017-11-01 21:01   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 21:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: akpm, keescook, linux-kernel, thgarnie, adrian.hunter,
	ravi.v.shankar, dave.hansen, bp, adam.buchbinder, pbonzini,
	brgerst, hpa, corbet, peterz, mingo, colin.king, paul.gortmaker,
	mst, acme, ricardo.neri-calderon, ray.huang, tglx, shuah,
	dvyukov, cmetcalf, mhiramat, slaoub, jslaby, luto, vbabka,
	qiaowei.ren, lstoakes

Commit-ID:  4efea85fb56fa1691b79af1eea4c1425660cf4e3
Gitweb:     https://git.kernel.org/tip/4efea85fb56fa1691b79af1eea4c1425660cf4e3
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:43 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:12 +0100

x86/insn-eval: Add function to get default params of code segment

Obtain the default values of the address and operand sizes as specified in
the D and L bits of the segment descriptor selected by the register
CS. The function can be used for both protected and long modes.
For virtual-8086 mode, the default address and operand sizes are always 2
bytes.

The returned parameters are encoded in a signed 8-bit data type. Auxiliary
macros are provided to encode and decode such values.
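
To illustrate the encoding, the sketch below reuses the three macros
from this patch (with extra parentheses around the arguments) and a
hypothetical code_seg_params() helper that takes the L and D bits as
plain integers. Note one deliberate difference, which is an assumption
of this sketch and not what the patch does: the helper returns int, so
that the 64-bit encoding, 0x84, stays positive.

```c
#include <errno.h>

#define INSN_CODE_SEG_ADDR_SZ(params) (((params) >> 4) & 0xf)
#define INSN_CODE_SEG_OPND_SZ(params) ((params) & 0xf)
#define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) ((oper_sz) | ((addr_sz) << 4))

/* Map the L and D bits of a code segment descriptor to default sizes. */
static int code_seg_params(unsigned int l, unsigned int d)
{
	switch ((l << 1) | d) {
	case 0:	/* CS.L=0, CS.D=0: 16-bit address and operand size */
		return INSN_CODE_SEG_PARAMS(2, 2);
	case 1:	/* CS.L=0, CS.D=1: 32-bit address and operand size */
		return INSN_CODE_SEG_PARAMS(4, 4);
	case 2:	/* CS.L=1, CS.D=0: 64-bit address, 32-bit operand size */
		return INSN_CODE_SEG_PARAMS(4, 8);
	default: /* CS.L=1, CS.D=1 is an invalid setting */
		return -EINVAL;
	}
}
```

A caller would decode the result with INSN_CODE_SEG_ADDR_SZ() and
INSN_CODE_SEG_OPND_SZ(); for L=1, D=0 these give 8 and 4 bytes.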

Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-17-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/include/asm/insn-eval.h |  5 ++++
 arch/x86/lib/insn-eval.c         | 64 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 25d6e44..e1d3b4c 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -11,8 +11,13 @@
 #include <linux/err.h>
 #include <asm/ptrace.h>
 
+#define INSN_CODE_SEG_ADDR_SZ(params) ((params >> 4) & 0xf)
+#define INSN_CODE_SEG_OPND_SZ(params) (params & 0xf)
+#define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4))
+
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
 unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx);
+char insn_get_code_seg_params(struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 89d5c89..01e36bd 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -640,6 +640,70 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
 }
 
 /**
+ * insn_get_code_seg_params() - Obtain code segment parameters
+ * @regs:	Structure with register values as seen when entering kernel mode
+ *
+ * Obtain address and operand sizes of the code segment. They are obtained from
+ * the selector contained in the CS register in regs. In protected mode, the
+ * default sizes are determined by inspecting the L and D bits of the segment
+ * descriptor. In virtual-8086 mode, the default is always two bytes for both
+ * address and operand sizes.
+ *
+ * Returns:
+ *
+ * A signed 8-bit value containing the default parameters on success.
+ *
+ * -EINVAL on error.
+ */
+char insn_get_code_seg_params(struct pt_regs *regs)
+{
+	struct desc_struct *desc;
+	short sel;
+
+	if (v8086_mode(regs))
+		/* Address and operand size are both 16-bit. */
+		return INSN_CODE_SEG_PARAMS(2, 2);
+
+	sel = get_segment_selector(regs, INAT_SEG_REG_CS);
+	if (sel < 0)
+		return sel;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return -EINVAL;
+
+	/*
+	 * The most significant bit of the Type field of the segment descriptor
+	 * determines whether a segment contains data or code. If this is a data
+	 * segment, return error.
+	 */
+	if (!(desc->type & BIT(3)))
+		return -EINVAL;
+
+	switch ((desc->l << 1) | desc->d) {
+	case 0: /*
+		 * Legacy mode. CS.L=0, CS.D=0. Address and operand size are
+		 * both 16-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(2, 2);
+	case 1: /*
+		 * Legacy mode. CS.L=0, CS.D=1. Address and operand size are
+		 * both 32-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(4, 4);
+	case 2: /*
+		 * IA-32e 64-bit mode. CS.L=1, CS.D=0. Address size is 64-bit;
+		 * operand size is 32-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(4, 8);
+	case 3: /* Invalid setting. CS.L=1, CS.D=1 */
+		/* fall through */
+	default:
+		return -EINVAL;
+	}
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode


* [tip:x86/mpx] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b
  2017-10-27 20:25 ` [PATCH v10 17/18] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b Ricardo Neri
@ 2017-11-01 21:01   ` tip-bot for Ricardo Neri
  0 siblings, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 21:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dave.hansen, mst, cmetcalf, paul.gortmaker, slaoub, mhiramat,
	corbet, vbabka, dvyukov, qiaowei.ren, luto, keescook, bp, hpa,
	lstoakes, acme, shuah, brgerst, adam.buchbinder, ravi.v.shankar,
	jslaby, tglx, thgarnie, pbonzini, akpm, linux-kernel,
	adrian.hunter, colin.king, ray.huang, peterz, mingo,
	ricardo.neri-calderon

Commit-ID:  e526a302e425ab11111efc5f59e52449bbcc768e
Gitweb:     https://git.kernel.org/tip/e526a302e425ab11111efc5f59e52449bbcc768e
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:44 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:13 +0100

x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b

Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod is zero and
ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
that none of the registers are used in the computation of the effective
address. A return value of -EDOM indicates to callers that they should not
use the value of registers when computing the effective address for the
instruction.

In long mode, the effective address is given by the 32-bit displacement
plus the location of the next instruction. In protected mode, only the
displacement is used.

The instruction decoder takes care of obtaining the displacement.
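
The special case can be sketched in isolation as shown below. The macro
and function names are hypothetical; the sketch takes the ModRM byte,
the instruction pointer, the instruction length and the displacement as
plain arguments and does not model truncation to the address size.

```c
#include <stdint.h>

#define MODRM_MOD(b)	(((b) & 0xc0) >> 6)
#define MODRM_RM(b)	((b) & 0x07)

/* ModRM.mod == 0 and ModRM.rm == 101b: only a 32-bit displacement. */
static int modrm_is_disp32_only(uint8_t modrm)
{
	return MODRM_MOD(modrm) == 0 && MODRM_RM(modrm) == 5;
}

/*
 * Effective address for the displacement-only case. In long mode the
 * displacement is relative to the address of the next instruction; in
 * protected mode the displacement is used on its own.
 */
static uint64_t disp32_eff_addr(int long_mode, uint64_t ip,
				unsigned int insn_len, int32_t disp)
{
	uint64_t base = long_mode ? ip + insn_len : 0;

	return base + (uint64_t)(int64_t)disp;
}
```

For a 7-byte instruction at 0x1000 with displacement -16, the long mode
result is 0xff7; in protected mode a displacement of 0x2000 yields
0x2000 regardless of the instruction pointer.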

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-18-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/lib/insn-eval.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 01e36bd..6bf819f 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -427,6 +427,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 	switch (type) {
 	case REG_TYPE_RM:
 		regno = X86_MODRM_RM(insn->modrm.value);
+
+		/*
+		 * ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit displacement
+		 * follows the ModRM byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
 		if (X86_REX_B(insn->rex_prefix.value))
 			regno += 8;
 		break;
@@ -770,10 +778,21 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-			if (addr_offset < 0)
+			/*
+			 * -EDOM means that we must ignore the addr_offset.
+			 * In such a case, in 64-bit mode the effective address is
+			 * relative to the RIP of the following instruction.
+			 */
+			if (addr_offset == -EDOM) {
+				if (user_64bit_mode(regs))
+					eff_addr = (long)regs->ip + insn->length;
+				else
+					eff_addr = 0;
+			} else if (addr_offset < 0) {
 				goto out;
-
-			eff_addr = regs_get_register(regs, addr_offset);
+			} else {
+				eff_addr = regs_get_register(regs, addr_offset);
+			}
 		}
 
 		eff_addr += insn->displacement.value;


* [tip:x86/mpx] x86/insn-eval: Incorporate segment base in linear address computation
  2017-10-27 20:25 ` [PATCH v10 18/18] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
  2017-11-01 17:56   ` Borislav Petkov
@ 2017-11-01 21:02   ` tip-bot for Ricardo Neri
  1 sibling, 0 replies; 51+ messages in thread
From: tip-bot for Ricardo Neri @ 2017-11-01 21:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dave.hansen, adam.buchbinder, vbabka, lstoakes, cmetcalf, corbet,
	mingo, tglx, hpa, qiaowei.ren, linux-kernel, jslaby, bp, akpm,
	paul.gortmaker, ray.huang, mhiramat, brgerst, peterz, mst,
	slaoub, pbonzini, ravi.v.shankar, dvyukov, luto, keescook,
	colin.king, ricardo.neri-calderon, acme, shuah, adrian.hunter,
	thgarnie

Commit-ID:  108904442850c2884679f81121df3ef42d88cb9c
Gitweb:     https://git.kernel.org/tip/108904442850c2884679f81121df3ef42d88cb9c
Author:     Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
AuthorDate: Fri, 27 Oct 2017 13:25:45 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:50:13 +0100

x86/insn-eval: Incorporate segment base in linear address computation

insn_get_addr_ref() returns the effective address as defined in
Section 3.7.5.1, Vol. 1 of the Intel 64 and IA-32 Architectures Software
Developer's Manual. In order to compute the linear address, we must add
to the effective address the segment base address as set in the segment
descriptor. The segment descriptor to use depends on the register used as
operand and segment override prefixes, if any.

In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
segment is used or if segmentation is not used. However, the base address
is not necessarily zero if a user program defines its own segments. This
is possible by using a local descriptor table.

Since the effective address is a signed quantity, the unsigned segment
base address is saved in a separate variable and added to the final,
unsigned, effective address.
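
As an illustration of the arithmetic described above, here is a minimal
user-space sketch. It is not the kernel code; the function name
linear_address() is hypothetical and fixed-width types stand in for the
kernel's long and unsigned long on a 64-bit build.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the final step in insn_get_addr_ref():
 * the effective address is signed (a negative displacement can dominate),
 * while the segment base is unsigned.
 */
static uint64_t linear_address(int64_t eff_addr, uint64_t seg_base)
{
	/*
	 * Convert the signed effective address first, then add the base.
	 * Unsigned arithmetic wraps modulo 2^64, so a negative effective
	 * address simply moves the result below the segment base.
	 */
	return (uint64_t)eff_addr + seg_base;
}
```

With a zero base, as in the common USER_DS case, the result is just the
effective address; with a non-zero base taken from an LDT entry, the base
offsets it.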

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-19-git-send-email-ricardo.neri-calderon@linux.intel.com

---
 arch/x86/lib/insn-eval.c | 55 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 52 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 6bf819f..1c23ec0 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -728,6 +728,43 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
 	return get_reg_offset(insn, regs, REG_TYPE_RM);
 }
 
+/**
+ * get_seg_base_addr() - obtain base address of a segment
+ * @insn:	Instruction. Must be valid.
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Operand offset, in pt_regs, used to resolve segment descriptor
+ * @base:	Obtained segment base
+ *
+ * Obtain the base address of the segment associated with the operand @regoff
+ * and, if any or allowed, override prefixes in @insn. This function is
+ * different from insn_get_seg_base() as the latter does not resolve the segment
+ * associated with the instruction operand.
+ *
+ * Returns:
+ *
+ * 0 on success. @base will contain the base address of the resolved segment.
+ *
+ * -EINVAL on error.
+ */
+static int get_seg_base_addr(struct insn *insn, struct pt_regs *regs,
+			     int regoff, unsigned long *base)
+{
+	int seg_reg_idx;
+
+	if (!base)
+		return -EINVAL;
+
+	seg_reg_idx = resolve_seg_reg(insn, regs, regoff);
+	if (seg_reg_idx < 0)
+		return seg_reg_idx;
+
+	*base = insn_get_seg_base(regs, seg_reg_idx);
+	if (*base == -1L)
+		return -EINVAL;
+
+	return 0;
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
@@ -735,8 +772,8 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
  */
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	int addr_offset, base_offset, indx_offset;
-	unsigned long linear_addr = -1L;
+	int addr_offset, base_offset, indx_offset, ret;
+	unsigned long linear_addr = -1L, seg_base;
 	long eff_addr, base, indx;
 	insn_byte_t sib;
 
@@ -750,6 +787,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			goto out;
 
 		eff_addr = regs_get_register(regs, addr_offset);
+
 	} else {
 		if (insn->sib.nbytes) {
 			/*
@@ -776,6 +814,13 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 				indx = regs_get_register(regs, indx_offset);
 
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+
+			/*
+			 * The base determines the segment used to compute
+			 * the linear address.
+			 */
+			addr_offset = base_offset;
+
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			/*
@@ -798,7 +843,11 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		eff_addr += insn->displacement.value;
 	}
 
-	linear_addr = (unsigned long)eff_addr;
+	ret = get_seg_base_addr(insn, regs, addr_offset, &seg_base);
+	if (ret)
+		goto out;
+
+	linear_addr = (unsigned long)eff_addr + seg_base;
 
 out:
 	return (void __user *)linear_addr;


* Re: [PATCH v10 13/18] x86/insn-eval: Add utility functions to get segment selector
  2017-10-27 20:25 ` [PATCH v10 13/18] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
  2017-11-01 21:00   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
@ 2017-11-09 11:12   ` Arnd Bergmann
  2017-11-09 13:50     ` Ingo Molnar
  1 sibling, 1 reply; 51+ messages in thread
From: Arnd Bergmann @ 2017-11-09 11:12 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, Linux Kernel Mailing List, the arch/x86 maintainers,
	ricardo.neri, Adam Buchbinder, Colin Ian King, Lorenzo Stoakes,
	Qiaowei Ren, Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov, Josh Poimboeuf

On Fri, Oct 27, 2017 at 10:25 PM, Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:

> diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
> index 02aff08..1c78580 100644
> --- a/arch/x86/include/asm/inat.h
> +++ b/arch/x86/include/asm/inat.h
> @@ -97,6 +97,16 @@
>  #define INAT_MAKE_GROUP(grp)   ((grp << INAT_GRP_OFFS) | INAT_MODRM)
>  #define INAT_MAKE_IMM(imm)     (imm << INAT_IMM_OFFS)
>
> +/* Identifiers for segment registers */
> +#define INAT_SEG_REG_IGNORE    0
> +#define INAT_SEG_REG_DEFAULT   1
> +#define INAT_SEG_REG_CS                2
> +#define INAT_SEG_REG_SS                3
> +#define INAT_SEG_REG_DS                4
> +#define INAT_SEG_REG_ES                5
> +#define INAT_SEG_REG_FS                6
> +#define INAT_SEG_REG_GS                7
> +

linux-next still reports a warning because of this change:

Warning: synced file at 'tools/objtool/arch/x86/include/asm/inat.h'
differs from latest kernel version at 'arch/x86/include/asm/inat.h'

Should the same change be applied to the objtool file in the
tip:x86/mpx branch?

       Arnd


* Re: [PATCH v10 13/18] x86/insn-eval: Add utility functions to get segment selector
  2017-11-09 11:12   ` [PATCH v10 13/18] " Arnd Bergmann
@ 2017-11-09 13:50     ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2017-11-09 13:50 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ricardo Neri, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andy Lutomirski, Borislav Petkov, Peter Zijlstra, Andrew Morton,
	Brian Gerst, Chris Metcalf, Dave Hansen, Paolo Bonzini,
	Masami Hiramatsu, Huang Rui, Jiri Slaby, Jonathan Corbet,
	Michael S. Tsirkin, Paul Gortmaker, Vlastimil Babka, Chen Yucong,
	Ravi V. Shankar, Shuah Khan, Linux Kernel Mailing List,
	the arch/x86 maintainers, ricardo.neri, Adam Buchbinder,
	Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov, Josh Poimboeuf


* Arnd Bergmann <arnd@arndb.de> wrote:

> On Fri, Oct 27, 2017 at 10:25 PM, Ricardo Neri
> <ricardo.neri-calderon@linux.intel.com> wrote:
> 
> > diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
> > index 02aff08..1c78580 100644
> > --- a/arch/x86/include/asm/inat.h
> > +++ b/arch/x86/include/asm/inat.h
> > @@ -97,6 +97,16 @@
> >  #define INAT_MAKE_GROUP(grp)   ((grp << INAT_GRP_OFFS) | INAT_MODRM)
> >  #define INAT_MAKE_IMM(imm)     (imm << INAT_IMM_OFFS)
> >
> > +/* Identifiers for segment registers */
> > +#define INAT_SEG_REG_IGNORE    0
> > +#define INAT_SEG_REG_DEFAULT   1
> > +#define INAT_SEG_REG_CS                2
> > +#define INAT_SEG_REG_SS                3
> > +#define INAT_SEG_REG_DS                4
> > +#define INAT_SEG_REG_ES                5
> > +#define INAT_SEG_REG_FS                6
> > +#define INAT_SEG_REG_GS                7
> > +
> 
> linux-next still reports a warning because of this change:
> 
> Warning: synced file at 'tools/objtool/arch/x86/include/asm/inat.h'
> differs from latest kernel version at 'arch/x86/include/asm/inat.h'
> 
> Should the same change be applied to the objtool file in the
> tip:x86/mpx branch?

No, the best flow is that the headers will be synced once both components are 
upstream.

This is an integration artifact - and we don't want to 'pre sync', because what 
happens if Linus has to reject the arch/x86/include/asm/inat.h changes for 
whatever reason, and the tooling changes go upstream first?

Thanks,

	Ingo


* Re: [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-11-01 21:00   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
@ 2017-12-05 17:48     ` Peter Zijlstra
  2017-12-05 18:14       ` Borislav Petkov
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2017-12-05 17:48 UTC (permalink / raw)
  To: qiaowei.ren, luto, adam.buchbinder, mst, mhiramat, dave.hansen,
	mingo, linux-kernel, colin.king, jslaby, pbonzini, cmetcalf,
	akpm, vbabka, acme, brgerst, shuah, bp, paul.gortmaker, lstoakes,
	hpa, thgarnie, keescook, adrian.hunter, ricardo.neri-calderon,
	ray.huang, dvyukov, ravi.v.shankar, slaoub, tglx, corbet
  Cc: linux-tip-commits

On Wed, Nov 01, 2017 at 02:00:28PM -0700, tip-bot for Ricardo Neri wrote:
> +static struct desc_struct *get_desc(unsigned short sel)
> +{
> +	struct desc_ptr gdt_desc = {0, 0};
> +	unsigned long desc_base;
> +
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
> +	if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
> +		struct desc_struct *desc = NULL;
> +		struct ldt_struct *ldt;
> +
> +		/* Bits [15:3] contain the index of the desired entry. */
> +		sel >>= 3;
> +
> +		mutex_lock(&current->active_mm->context.lock);
> +		ldt = current->active_mm->context.ldt;
> +		if (ldt && sel < ldt->nr_entries)
> +			desc = &ldt->entries[sel];
> +
> +		mutex_unlock(&current->active_mm->context.lock);
> +
> +		return desc;
> +	}
> +#endif

This is broken right? You unlock and then return @desc, which afaict can
at that point get freed by free_ldt_struct().

Something like the below ought to cure it, although it's not entirely
pretty either.

---

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index e664058c4491..c234ef2b4430 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -572,6 +572,11 @@ static struct desc_struct *get_desc(unsigned short sel)
 	struct desc_ptr gdt_desc = {0, 0};
 	unsigned long desc_base;
 
+	/*
+	 * Relies on IRQs being disabled to serialize against the LDT.
+	 */
+	lockdep_assert_irqs_disabled();
+
 #ifdef CONFIG_MODIFY_LDT_SYSCALL
 	if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
 		struct desc_struct *desc = NULL;
@@ -580,13 +585,10 @@ static struct desc_struct *get_desc(unsigned short sel)
 		/* Bits [15:3] contain the index of the desired entry. */
 		sel >>= 3;
 
-		mutex_lock(&current->active_mm->context.lock);
 		ldt = current->active_mm->context.ldt;
 		if (ldt && sel < ldt->nr_entries)
 			desc = &ldt->entries_va[sel];
 
-		mutex_unlock(&current->active_mm->context.lock);
-
 		return desc;
 	}
 #endif
@@ -626,6 +628,7 @@ static struct desc_struct *get_desc(unsigned short sel)
  */
 unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
 {
+	unsigned long base, flags;
 	struct desc_struct *desc;
 	short sel;
 
@@ -664,11 +667,15 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
 	if (!sel)
 		return -1L;
 
+	base = -1;
+
+	local_irq_save(flags);
 	desc = get_desc(sel);
-	if (!desc)
-		return -1L;
+	if (desc)
+		base = get_desc_base(desc);
+	local_irq_restore(flags);
 
-	return get_desc_base(desc);
+	return base;
 }
 
 /**
@@ -690,8 +697,8 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
  */
 static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
 {
+	unsigned long flags, limit = 0;
 	struct desc_struct *desc;
-	unsigned long limit;
 	short sel;
 
 	sel = get_segment_selector(regs, seg_reg_idx);
@@ -704,19 +711,20 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
 	if (!sel)
 		return 0;
 
+	local_irq_save(flags);
 	desc = get_desc(sel);
-	if (!desc)
-		return 0;
-
-	/*
-	 * If the granularity bit is set, the limit is given in multiples
-	 * of 4096. This also means that the 12 least significant bits are
-	 * not tested when checking the segment limits. In practice,
-	 * this means that the segment ends in (limit << 12) + 0xfff.
-	 */
-	limit = get_desc_limit(desc);
-	if (desc->g)
-		limit = (limit << 12) + 0xfff;
+	if (desc) {
+		/*
+		 * If the granularity bit is set, the limit is given in multiples
+		 * of 4096. This also means that the 12 least significant bits are
+		 * not tested when checking the segment limits. In practice,
+		 * this means that the segment ends in (limit << 12) + 0xfff.
+		 */
+		limit = get_desc_limit(desc);
+		if (desc->g)
+			limit = (limit << 12) + 0xfff;
+	}
+	local_irq_restore(flags);
 
 	return limit;
 }
@@ -740,19 +748,23 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
 int insn_get_code_seg_params(struct pt_regs *regs)
 {
 	struct desc_struct *desc;
+	unsigned long flags;
+	int ret = -EINVAL;
 	short sel;
 
-	if (v8086_mode(regs))
+	if (v8086_mode(regs)) {
 		/* Address and operand size are both 16-bit. */
 		return INSN_CODE_SEG_PARAMS(2, 2);
+	}
 
 	sel = get_segment_selector(regs, INAT_SEG_REG_CS);
 	if (sel < 0)
 		return sel;
 
+	local_irq_save(flags);
 	desc = get_desc(sel);
 	if (!desc)
-		return -EINVAL;
+		goto out;
 
 	/*
 	 * The most significant byte of the Type field of the segment descriptor
@@ -760,29 +772,37 @@ int insn_get_code_seg_params(struct pt_regs *regs)
 	 * segment, return error.
 	 */
 	if (!(desc->type & BIT(3)))
-		return -EINVAL;
+		goto out;
 
 	switch ((desc->l << 1) | desc->d) {
 	case 0: /*
 		 * Legacy mode. CS.L=0, CS.D=0. Address and operand size are
 		 * both 16-bit.
 		 */
-		return INSN_CODE_SEG_PARAMS(2, 2);
+		ret = INSN_CODE_SEG_PARAMS(2, 2);
+		break;
 	case 1: /*
 		 * Legacy mode. CS.L=0, CS.D=1. Address and operand size are
 		 * both 32-bit.
 		 */
-		return INSN_CODE_SEG_PARAMS(4, 4);
+		ret = INSN_CODE_SEG_PARAMS(4, 4);
+		break;
 	case 2: /*
 		 * IA-32e 64-bit mode. CS.L=1, CS.D=0. Address size is 64-bit;
 		 * operand size is 32-bit.
 		 */
-		return INSN_CODE_SEG_PARAMS(4, 8);
+		ret = INSN_CODE_SEG_PARAMS(4, 8);
+		break;
+
 	case 3: /* Invalid setting. CS.L=1, CS.D=1 */
 		/* fall through */
 	default:
-		return -EINVAL;
+		break;
 	}
+out:
+	local_irq_restore(flags);
+
+	return ret;
 }
 
 /**

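The granularity rule described in the get_seg_limit() comment can be
checked in isolation. The following is a user-space sketch only; the
function name seg_limit_bytes() is hypothetical, and the 20-bit mask
reflects the width of the limit field in a segment descriptor.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn the raw 20-bit descriptor limit into the
 * offset of the last valid byte in the segment.
 */
static uint64_t seg_limit_bytes(uint32_t raw_limit, bool granularity)
{
	uint64_t limit = raw_limit & 0xfffff;

	/*
	 * With G=1 the limit counts 4096-byte units and the low 12 bits
	 * are not checked, so the segment ends at (limit << 12) + 0xfff.
	 */
	if (granularity)
		limit = (limit << 12) + 0xfff;

	return limit;
}
```

For a flat 4 GiB segment (raw limit 0xfffff, G=1) this yields 0xffffffff;
with G=0 the same raw limit describes a 1 MiB segment.
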

* Re: [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-12-05 17:48     ` Peter Zijlstra
@ 2017-12-05 18:14       ` Borislav Petkov
  2017-12-05 18:38         ` Peter Zijlstra
  2017-12-07  7:26         ` Ricardo Neri
  0 siblings, 2 replies; 51+ messages in thread
From: Borislav Petkov @ 2017-12-05 18:14 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner
  Cc: qiaowei.ren, luto, adam.buchbinder, mst, mhiramat, dave.hansen,
	mingo, linux-kernel, colin.king, jslaby, pbonzini, cmetcalf,
	akpm, vbabka, acme, brgerst, shuah, paul.gortmaker, lstoakes,
	hpa, thgarnie, keescook, adrian.hunter, ricardo.neri-calderon,
	ray.huang, dvyukov, ravi.v.shankar, slaoub, tglx, corbet,
	linux-tip-commits

On Tue, Dec 05, 2017 at 06:48:44PM +0100, Peter Zijlstra wrote:
> This is broken right? You unlock and then return @desc, which afaict can
> at that point get freed by free_ldt_struct().
> 
> Something like the below ought to cure it, although it's not entirely
> pretty either.

Right.

Or, instead of introducing all the locking, we could also not do
anything because all that code runs inside fixup_umip_exception() so the
desc will be valid there.

But, if other code is going to use those functions - and I believe
that's the idea - otherwise they wouldn't be in arch/x86/lib/ - we
should convert all those functions to return directly the desc field
which is requested by the respective caller.

I.e., get_desc() will be called by a wrapper which returns desc base or
desc limit or whatever...

In the case where desc has been freed, it should return error, of
course.

How does that sound?

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-12-05 18:14       ` Borislav Petkov
@ 2017-12-05 18:38         ` Peter Zijlstra
  2017-12-05 21:29           ` Borislav Petkov
  2017-12-07  7:26         ` Ricardo Neri
  1 sibling, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2017-12-05 18:38 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, qiaowei.ren, luto, adam.buchbinder, mst,
	mhiramat, dave.hansen, mingo, linux-kernel, colin.king, jslaby,
	pbonzini, cmetcalf, akpm, vbabka, acme, brgerst, shuah,
	paul.gortmaker, lstoakes, hpa, thgarnie, keescook, adrian.hunter,
	ricardo.neri-calderon, ray.huang, dvyukov, ravi.v.shankar,
	slaoub, corbet, linux-tip-commits

On Tue, Dec 05, 2017 at 07:14:56PM +0100, Borislav Petkov wrote:
> On Tue, Dec 05, 2017 at 06:48:44PM +0100, Peter Zijlstra wrote:
> > This is broken right? You unlock and then return @desc, which afaict can
> > at that point get freed by free_ldt_struct().
> > 
> > Something like the below ought to cure it, although it's not entirely
> > pretty either.
> 
> Right.
> 
> Or, instead of introducing all the locking, we could also not do
> anything because all that code runs inside fixup_umip_exception() so the
> desc will be valid there.

Sorry what? So either this code is broken because it has IRQs enabled,
or it's broken because it's trying to acquire a mutex with IRQs disabled.
Which is it?


* Re: [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-12-05 18:38         ` Peter Zijlstra
@ 2017-12-05 21:29           ` Borislav Petkov
  2017-12-07  7:23             ` Ricardo Neri
  0 siblings, 1 reply; 51+ messages in thread
From: Borislav Petkov @ 2017-12-05 21:29 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, ricardo.neri-calderon
  Cc: luto, adam.buchbinder, mst, mhiramat, dave.hansen, mingo,
	linux-kernel, colin.king, jslaby, pbonzini, cmetcalf, akpm,
	vbabka, acme, brgerst, shuah, paul.gortmaker, lstoakes, hpa,
	thgarnie, keescook, adrian.hunter, ray.huang, dvyukov,
	ravi.v.shankar, slaoub, corbet, linux-tip-commits

On Tue, Dec 05, 2017 at 07:38:45PM +0100, Peter Zijlstra wrote:
> Sorry what? So either this code is broken because it has IRQs enabled,
> or it's broken because it's trying to acquire a mutex with IRQs disabled.
> Which is it?

Well, lemme try to sum up what Peter, Thomas and I discussed on IRC:

The problem is that there's no guarantee userspace won't change the LDT
from under us while the UMIP code runs in the insn decoder.

So, we need a way to be able to query the desc fields the insn decoder
needs *and* when the LDT changes through the syscall, to detect that
case and handle it gracefully in the decoder.

So Thomas' idea is to keep a mm->context.ldt_seq sequence number which
gets incremented (and wraps around) every time the LDT changes.

That sequence number, i.e., cookie, gets handed down into the decoder
and it uses it during desc lookup. If the sequence number changes, the
decoder and the UMIP code must abort the emulation.

The lookup code needs to do that with IRQs disabled, of course, to
protect itself from IPIs which could change the LDT.
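
To make the cookie idea concrete, here is a rough single-threaded
user-space sketch. Everything in it is hypothetical (struct ldt_model,
ldt_model_write(), ldt_model_lookup()); it only models the "compare the
sequence number before trusting the descriptor" step and says nothing
about how the real code would disable IRQs or store the counter.

```c
#include <stdbool.h>
#include <stdint.h>

#define LDT_MODEL_ENTRIES 8

/* Hypothetical stand-in for the LDT plus its sequence number. */
struct ldt_model {
	uint32_t seq;			/* bumped on every modification, may wrap */
	uint32_t nr_entries;
	uint64_t base[LDT_MODEL_ENTRIES];
};

/* Models the modify_ldt() side: change an entry and bump the cookie. */
static void ldt_model_write(struct ldt_model *ldt, uint32_t idx, uint64_t base)
{
	if (idx >= LDT_MODEL_ENTRIES)
		return;

	ldt->base[idx] = base;
	if (idx >= ldt->nr_entries)
		ldt->nr_entries = idx + 1;
	ldt->seq++;
}

/*
 * Models the decoder side: the caller passes the cookie it sampled when
 * emulation started. A mismatch means the LDT changed in the meantime,
 * so the lookup fails and the caller must abort the emulation.
 */
static bool ldt_model_lookup(const struct ldt_model *ldt, uint32_t cookie,
			     uint32_t idx, uint64_t *base)
{
	if (ldt->seq != cookie)
		return false;
	if (idx >= ldt->nr_entries)
		return false;

	*base = ldt->base[idx];
	return true;
}
```

A caller would sample seq once, pass the same value to every lookup it
performs for one instruction, and give up as soon as any lookup fails.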

I *think* this is the gist of what we talked about, tglx, please correct
me if I missed something.

So, Ricardo, please take a look at fixing that as otherwise the UMIP
code would choke and possibly rely on wrong data. If there are any
questions, don't hesitate to ask.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-12-05 21:29           ` Borislav Petkov
@ 2017-12-07  7:23             ` Ricardo Neri
  2017-12-07  8:03               ` Borislav Petkov
  0 siblings, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-12-07  7:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, Thomas Gleixner, luto, adam.buchbinder, mst,
	mhiramat, dave.hansen, mingo, linux-kernel, colin.king, jslaby,
	pbonzini, cmetcalf, akpm, vbabka, acme, brgerst, shuah,
	paul.gortmaker, lstoakes, hpa, thgarnie, keescook, adrian.hunter,
	ray.huang, dvyukov, ravi.v.shankar, slaoub, corbet,
	linux-tip-commits

On Tue, Dec 05, 2017 at 10:29:33PM +0100, Borislav Petkov wrote:
> On Tue, Dec 05, 2017 at 07:38:45PM +0100, Peter Zijlstra wrote:
> > Sorry what? So either this code is broken because it has IRQs enabled,
> > or it's broken because it's trying to acquire a mutex with IRQs disabled.
> > Which is it?
> 
> Well, lemme try to sum up what Peter, Thomas and I discussed on IRC:
> 
> The problem is that there's no guarantee userspace won't change the LDT
> from under us while the UMIP code runs in the insn decoder.

Yes, I see the problem now.
> 
> So, we need a way to be able to query the desc fields the insn decoder
> needs *and* when the LDT changes through the syscall, to detect that
> case and handle it gracefully in the decoder.
> 
> So Thomas' idea is to keep a mm->context.ldt_seq sequence number which
> gets incremented (and wraps around) every time the LDT changes.
> 
> That sequence number, i.e., cookie, gets handed down into the decoder
> and it uses it during desc lookup. If the sequence number changes, the
> decoder and the UMIP code must abort the emulation.

In UMIP emulation we can potentially access the LDT twice. Once when
determining the base address of the code segment and again when determining
the base address and limit of the segment in which the result of the
emulation is written. I guess that mm->context.ldt_seq needs to not change
not only while decoding a particular linear address but across these two
linear address decodings.
> 
> The lookup code needs to do that with IRQs disabled, of course, to
> protect itself from IPIs which could change the LDT.
> 
> I *think* this is the gist of what we talked about, tglx, please correct
> me if I missed something.
> 
> So, Ricardo, please take a look at fixing that as otherwise the UMIP
> code would choke and possibly rely on wrong data. If there are any
> questions, don't hesitate to ask.

Sure, I will look into implementing this idea and post patches for it.

Thanks and BR,
Ricardo


* Re: [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-12-05 18:14       ` Borislav Petkov
  2017-12-05 18:38         ` Peter Zijlstra
@ 2017-12-07  7:26         ` Ricardo Neri
  2017-12-07  8:01           ` Borislav Petkov
  1 sibling, 1 reply; 51+ messages in thread
From: Ricardo Neri @ 2017-12-07  7:26 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, Thomas Gleixner, qiaowei.ren, luto,
	adam.buchbinder, mst, mhiramat, dave.hansen, mingo, linux-kernel,
	colin.king, jslaby, pbonzini, cmetcalf, akpm, vbabka, acme,
	brgerst, shuah, paul.gortmaker, lstoakes, hpa, thgarnie,
	keescook, adrian.hunter, ray.huang, dvyukov, ravi.v.shankar,
	slaoub, corbet, linux-tip-commits

On Tue, Dec 05, 2017 at 07:14:56PM +0100, Borislav Petkov wrote:
> 
> But, if other code is going to use those functions - and I believe
> that's the idea - otherwise they wouldn't be in arch/x86/lib/ 

At the moment MPX and UMIP are using the insn-eval decoder to determine
linear addresses.

Thanks and BR,
Ricardo


* Re: [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-12-07  7:26         ` Ricardo Neri
@ 2017-12-07  8:01           ` Borislav Petkov
  0 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2017-12-07  8:01 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Peter Zijlstra, Thomas Gleixner, qiaowei.ren, luto,
	adam.buchbinder, mst, mhiramat, dave.hansen, mingo, linux-kernel,
	colin.king, jslaby, pbonzini, cmetcalf, akpm, vbabka, acme,
	brgerst, shuah, paul.gortmaker, lstoakes, hpa, thgarnie,
	keescook, adrian.hunter, ray.huang, dvyukov, ravi.v.shankar,
	slaoub, corbet, linux-tip-commits

On Wed, Dec 06, 2017 at 11:26:05PM -0800, Ricardo Neri wrote:
> At the moment MPX and UMIP are using the insn-eval decoder to determine
> linear addresses.

If we're keeping a whole instruction decoder in the kernel, it better
be designed generically enough and usable (and used) by everything that
needs it.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [tip:x86/mpx] x86/insn-eval: Add utility function to get segment descriptor
  2017-12-07  7:23             ` Ricardo Neri
@ 2017-12-07  8:03               ` Borislav Petkov
  0 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2017-12-07  8:03 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Peter Zijlstra, Thomas Gleixner, luto, adam.buchbinder, mst,
	mhiramat, dave.hansen, mingo, linux-kernel, colin.king, jslaby,
	pbonzini, cmetcalf, akpm, vbabka, acme, brgerst, shuah,
	paul.gortmaker, lstoakes, hpa, thgarnie, keescook, adrian.hunter,
	ray.huang, dvyukov, ravi.v.shankar, slaoub, corbet,
	linux-tip-commits

On Wed, Dec 06, 2017 at 11:23:59PM -0800, Ricardo Neri wrote:
> In UMIP emulation we can potentially access the LDT twice. Once when
> determining the base address of the code segment and again when determining
> the base address and limit of the segment in which the result of the
> emulation is written. I guess that mm->context.ldt_seq needs to not change
> not only while decoding a particular linear address but across these two
> linear address decodings.

Yap, stuff which needs to see an *unchanged* LDT should use the cookie
to verify that and the LDT code should change the cookie when the LDT
is modified.

> Sure, I will look into implementing this idea and post patches for it.

Thanks!

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


end of thread, other threads:[~2017-12-07  8:03 UTC | newest]

Thread overview: 51+ messages
2017-10-27 20:25 [PATCH v10 00/18] x86: Add address resolution code for UMIP and MPX Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 01/18] x86/mm: Relocate page fault error codes to traps.h Ricardo Neri
2017-11-01 20:55   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 02/18] x86/boot: Relocate definition of the initial state of CR0 Ricardo Neri
2017-10-27 20:25   ` Ricardo Neri
2017-10-27 20:25   ` Ricardo Neri
2017-11-01 20:55   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 03/18] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Ricardo Neri
2017-11-01 20:55   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 04/18] uprobes/x86: Use existing definitions for segment override prefixes Ricardo Neri
2017-11-01 20:56   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 05/18] x86/mpx: Simplify handling of errors when computing linear addresses Ricardo Neri
2017-11-01 20:56   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 06/18] x86/mpx: Use signed variables to compute effective addresses Ricardo Neri
2017-11-01 20:57   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 07/18] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b Ricardo Neri
2017-11-01 20:57   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 08/18] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0 Ricardo Neri
2017-11-01 20:57   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 09/18] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file Ricardo Neri
2017-11-01 20:58   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 10/18] x86/insn-eval: Do not BUG on invalid register type Ricardo Neri
2017-11-01 20:58   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 11/18] x86/insn-eval: Add a utility function to get register offsets Ricardo Neri
2017-11-01 20:59   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 12/18] x86/insn-eval: Add utility function to identify string instructions Ricardo Neri
2017-11-01 20:59   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 13/18] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
2017-11-01 21:00   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-11-09 11:12   ` [PATCH v10 13/18] " Arnd Bergmann
2017-11-09 13:50     ` Ingo Molnar
2017-10-27 20:25 ` [PATCH v10 14/18] x86/insn-eval: Add utility function to get segment descriptor Ricardo Neri
2017-11-01 21:00   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-12-05 17:48     ` Peter Zijlstra
2017-12-05 18:14       ` Borislav Petkov
2017-12-05 18:38         ` Peter Zijlstra
2017-12-05 21:29           ` Borislav Petkov
2017-12-07  7:23             ` Ricardo Neri
2017-12-07  8:03               ` Borislav Petkov
2017-12-07  7:26         ` Ricardo Neri
2017-12-07  8:01           ` Borislav Petkov
2017-10-27 20:25 ` [PATCH v10 15/18] x86/insn-eval: Add utility functions to get segment descriptor base address and limit Ricardo Neri
2017-11-01 21:00   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 16/18] x86/insn-eval: Add function to get default params of code segment Ricardo Neri
2017-11-01 21:01   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 17/18] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b Ricardo Neri
2017-11-01 21:01   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
2017-10-27 20:25 ` [PATCH v10 18/18] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
2017-11-01 17:56   ` Borislav Petkov
2017-11-01 19:08     ` Ricardo Neri
2017-11-01 21:02   ` [tip:x86/mpx] " tip-bot for Ricardo Neri
