* [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention
@ 2017-10-04  3:54 Ricardo Neri
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri

This is v9 of this series. The eight previous submissions can be found
here [1], here [2], here [3], here [4], here [5], here [6], here [7] and here [8].
This version addresses the feedback comments from Borislav Petkov received on
v7. Please see details in the change log.

=== What is UMIP?

User-Mode Instruction Prevention (UMIP) is a security feature present in
new Intel Processors. If enabled, it prevents the execution of certain
instructions if the Current Privilege Level (CPL) is greater than 0. If
these instructions were executed while in CPL > 0, user space applications
could have access to system-wide settings such as the global and local
descriptor tables, the segment selectors to the current task state and the
local descriptor table. Hiding these system resources reduces the tools
available to craft privilege escalation attacks such as [9].

These are the instructions covered by UMIP:
* SGDT - Store Global Descriptor Table
* SIDT - Store Interrupt Descriptor Table
* SLDT - Store Local Descriptor Table
* SMSW - Store Machine Status Word
* STR - Store Task Register

If any of these instructions is executed with CPL > 0, a general protection
exception is issued when UMIP is enabled.

=== How does it impact applications?

When enabled, however, UMIP changes the behavior that certain
applications expect from the operating system. For instance, programs
running on WineHQ and DOSEMU2 rely on some of these instructions to
function. Stas Sergeev found that Microsoft Windows 3.1 and dos4gw use the
instruction SMSW when running in virtual-8086 mode[10]. SGDT and SIDT can
also be used in virtual-8086 mode.

In order to not change the behavior of the system, this patchset emulates
SGDT, SIDT and SMSW. This should be sufficient to not break the
applications mentioned above. Regarding the two remaining instructions, STR
and SLDT, the WineHQ team has shown interest in catching the general
protection fault and using it as a vehicle to fix broken applications[11].
Furthermore, STR and SLDT can only run in protected and long modes.

DOSEMU2 emulates virtual-8086 mode via KVM. No applications will be broken
unless DOSEMU2 decides to enable the CR4.UMIP bit in platforms that support
it. Also, this should not pose a security risk as no system resources would
be revealed. Instead, code running inside the KVM would only see the KVM's
GDT, IDT and MSW.

Please note that UMIP is always enabled for both 64-bit and 32-bit Linux
builds. However, emulation of the UMIP-protected instructions is not done
for 64-bit processes. 64-bit user space applications will receive the
SIGSEGV signal when a UMIP-protected instruction causes a general
protection fault.

=== How are UMIP-protected instructions emulated?

UMIP is kept enabled at all times when the CONFIG_X86_INTEL_UMIP option is
selected. If a general protection fault caused by the instructions
protected by UMIP is detected, such fault will be trapped and fixed-up. The
return values will be dummy as follows:
 
 * SGDT and SIDT return hard-coded dummy values as the base of the global
   descriptor and interrupt descriptor tables. These hard-coded values
   correspond to memory addresses that are near the end of the kernel
   memory map. This is also the case for virtual-8086 mode tasks. In all
   my experiments with 32-bit processes, the base of GDT and IDT was always
   a 4-byte address, even for 16-bit operands. Thus, my emulation code does
   the same. In all cases, the limit of the table is set to 0.
 * SMSW returns the value with which the CR0 register is programmed in
   head_32/64.S at boot time. That is, the following bits are enabled:
   CR0.0 for Protection Enable, CR0.1 for Monitor Coprocessor, CR0.4 for
   Extension Type, which will always be 1 in recent processors with UMIP;
   CR0.5 for Numeric Error, CR0.16 for Write Protect, CR0.18 for Alignment
   Mask and CR0.31 for Paging. As per the Intel 64 and IA-32 Architectures
   Software Developer's Manual, SMSW returns a 16-bit result for memory
   operands. However, when the operand is a register, the result can be up
   to CR0[63:0]. Since the emulation code only kicks in for 32-bit
   processes, we return up to CR0[31:0].
 * The proposed emulation code handles faults that happen in both
   protected and virtual-8086 mode.
 * Again, STR and SLDT are not emulated.

=== How is this series laid out?

++ Preparatory work
As per suggestions from Andy Lutomirski and Borislav Petkov, I moved
the x86 page fault error codes to a header. Also, I made user_64bit_mode
available to x86_32 builds. This helps to reuse code and reduce the number
of #ifdef's in these patches. Borislav also suggested that uprobes should
use the existing definitions in arch/x86/include/asm/inat.h instead of
hard-coded values when checking instruction prefixes. I included this
change in the series.

++ Fix bugs in MPX address decoder
I found the code for Intel MPX (Memory Protection Extensions) very useful.
It is used to parse opcodes and the memory locations contained in the
general purpose registers when used as operands. I put this code in a
separate library file that MPX, UMIP and potentially others can access,
avoiding code duplication.

Before creating the new library, I fixed a couple of bugs that I found in
corner cases of how MPX determines the address contained in the
instruction and operands.

++ Provide a new x86 instruction evaluating library
With bugs fixed, the MPX evaluating code is relocated in a new insn-eval.c
library. The basic functionality of this library is extended to obtain the
segment descriptor selected by either segment override prefixes or the
default segment by the involved registers in the calculation of the
effective address. It was also extended to obtain the default address and
operand sizes as well as the segment base address. Support to process
16-bit address encodings was also added. Armed with this arsenal, it is now
possible to determine the linear address onto which the emulated results
shall be copied. Furthermore, this new library relies on and extends the
capabilities of the existing instruction decoder in arch/x86/lib/insn.c.

This code supports long mode with 32 and 64 bit addresses, protected mode
with 16 and 32 bit addresses and virtual-8086 mode with 16 and 32 bit
addresses. Both global and local descriptor tables are supported.
Segmentation is supported in protected mode; in long mode, it is supported
via the FS and GS registers.
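
As a rough illustration of the linear address computation, the following
sketch covers two of the modes mentioned above. It is not taken from
insn-eval.c: the function names are hypothetical, and rejecting an
out-of-range effective address with -1 is an assumption of this sketch.
The virtual-8086 rules (effective address below 0x10000, 20-bit linear
address) follow the V6 change log.

```c
#include <stdint.h>

/*
 * Virtual-8086 mode: the segment base is the selector shifted left by 4.
 * The effective address must fit in 16 bits and the resulting linear
 * address is truncated to 20 bits.
 */
static inline int sketch_v8086_linear_addr(uint16_t selector,
					   uint32_t eff_addr,
					   uint32_t *linear)
{
	uint32_t seg_base = (uint32_t)selector << 4;

	if (eff_addr > 0xffff)
		return -1;

	*linear = (seg_base + eff_addr) & 0xfffff;
	return 0;
}

/*
 * Protected mode: the base and limit come from the segment descriptor
 * (GDT or LDT). The effective address must be within the segment limit.
 */
static inline int sketch_protected_linear_addr(uint32_t seg_base,
					       uint32_t seg_limit,
					       uint32_t eff_addr,
					       uint32_t *linear)
{
	if (eff_addr > seg_limit)
		return -1;

	*linear = seg_base + eff_addr;
	return 0;
}
```

For example, selector 0x1234 with effective address 0x0010 yields the
linear address 0x12350 in virtual-8086 mode.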

++ Emulate UMIP instructions
A new fixup_umip_exception() function inspects the instruction at the
instruction pointer. If it is a UMIP-protected instruction, it executes
the emulation code. This uses all the address-computing code of the
previous section.
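
The identification step can be sketched as below. This is an illustrative
stand-alone helper, not the code in umip.c: the names are hypothetical and
the sketch assumes that any instruction prefixes have already been skipped,
so op1 and op2 are the two opcode bytes. The encodings are those given in
the Intel SDM: 0F 00 /0 (SLDT), 0F 00 /1 (STR), 0F 01 /0 (SGDT), 0F 01 /1
(SIDT) and 0F 01 /4 (SMSW).

```c
#include <stdint.h>

enum sketch_umip_insn {
	SKETCH_NOT_UMIP = 0,
	SKETCH_SGDT,
	SKETCH_SIDT,
	SKETCH_SLDT,
	SKETCH_SMSW,
	SKETCH_STR,
};

#define SKETCH_MODRM_MOD(modrm)	(((modrm) >> 6) & 0x3)
#define SKETCH_MODRM_REG(modrm)	(((modrm) >> 3) & 0x7)

/* Classify a two-byte opcode plus ModRM byte as a UMIP-protected insn. */
static inline enum sketch_umip_insn
sketch_identify_umip_insn(uint8_t op1, uint8_t op2, uint8_t modrm)
{
	unsigned int mod = SKETCH_MODRM_MOD(modrm);
	unsigned int reg = SKETCH_MODRM_REG(modrm);

	if (op1 != 0x0f)
		return SKETCH_NOT_UMIP;

	if (op2 == 0x01) {
		switch (reg) {
		case 0:
		case 1:
			/*
			 * SGDT and SIDT only take memory operands; with
			 * mod == 11b these encodings are other instructions.
			 */
			if (mod == 3)
				return SKETCH_NOT_UMIP;
			return reg == 0 ? SKETCH_SGDT : SKETCH_SIDT;
		case 4:
			/* SMSW accepts both memory and register operands. */
			return SKETCH_SMSW;
		default:
			return SKETCH_NOT_UMIP;
		}
	}

	if (op2 == 0x00) {
		if (reg == 0)
			return SKETCH_SLDT;
		if (reg == 1)
			return SKETCH_STR;
	}

	return SKETCH_NOT_UMIP;
}
```

In this series only SGDT, SIDT and SMSW would then be emulated; SLDT and
STR are identified but left to fault.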

++ Add self-tests
Lastly, self-tests are added to entry_from_vm86.c to exercise the most
typical use cases of UMIP-protected instructions in virtual-8086 mode.

++ Extensive tests
Extensive tests were performed to test all the combinations of ModRM,
SiB and displacements for 16-bit and 32-bit encodings for the SS, DS,
ES, FS and GS segments. Tests also include a 64-bit program that uses
segmentation via FS and GS. For this purpose, I temporarily enabled UMIP
support for 64-bit processes. This change is not part of this patchset.
The intention is to test the computations of linear addresses in 64-bit
mode, including the extra R8-R15 registers. Extensive tests are also
implemented for virtual-8086 tasks. The code of these tests can be found
here [12] and here [13].

++ Merging this series?
Eight versions of this series have been submitted. Am I any closer to
seeing these patches merged? :)
 
[1]. https://lwn.net/Articles/705877/
[2]. https://lkml.org/lkml/2016/12/23/265
[3]. https://lkml.org/lkml/2017/1/25/622
[4]. https://lkml.org/lkml/2017/2/23/40
[5]. https://lkml.org/lkml/2017/3/3/678
[6]. https://lkml.org/lkml/2017/3/7/866
[7]. https://lkml.org/lkml/2017/5/5/398
[8]. https://lkml.org/lkml/2017/8/18/992
[9]. http://timetobleed.com/a-closer-look-at-a-recent-privilege-escalation-bug-in-linux-cve-2013-2094/
[10]. https://www.winehq.org/pipermail/wine-devel/2017-April/117159.html
[11]. https://marc.info/?l=linux-kernel&m=147876798717927&w=2
[12]. https://github.com/01org/luv-yocto/tree/rneri/umip/meta-luv/recipes-core/umip/files
[13]. https://github.com/01org/luv-yocto/commit/a72a7fe7d68693c0f4100ad86de6ecabde57334f#diff-3860c136a63add269bce4ea50222c248R1

Thanks and BR,
Ricardo

Changes since V8:
*Simplified error handling in the family of get_addr_ref_xx functions
 by initializing linear address to -1L.
*Reworded commit that #define's an initial state of CR0 and removed unneeded
 comment.
*Reworked get_desc() to get rid of one mutex_unlock(). Used a new local variable
 to improve readability.
*Reworked the utility functions used to obtain the segment selector:
  + get_overridden_seg_reg_idx() now only inspects the instruction to find
    segment override prefixes.
  + A new function allow_seg_reg_overrides() determines if segment override
    prefixes can be used based on the register operand in use and the nature of
    the instruction (i.e., string instructions vs not).
  + resolve_seg_reg() uses the two functions above, along with user_64bit_mode()
    to resolve the segment register index: overridden, default or ignored.
*Renamed local variables to reflect the fact that our segment registers are
 indexes and not the actual hardware registers.
*Reworded function documentation for improved readability.

Changes since V7:
*UMIP is not enabled by default.
*Relocated definition of the initial state of CR0 into processor-flags.h
*Updated uprobes to use the autogenerated INAT_PFX_xS definitions instead of
 hard-coded values.
*In insn-eval.c, refer to segment override prefixes using the autogenerated
 INAT_PFX_XS definitions.
*Removed enumeration for segment registers that reused the segment override
 instruction prefixes. Instead, a new, separate, set of #defines is used in
 arch/x86/include/asm/inat.h
*Simplified function to identify string instruction.
*Split the code used to determine the relevant segment register into two
 functions: one to inspect segment overrides and a second one to determine
 default segment registers based on the instruction and operands. A third
 function reads the segment register to obtain the segment selector.
*Reworked arithmetic to compute 32-bit and 64-bit effective addresses. Instead
 of type casts, two separate functions are used in each case.
*Removed structure to hold segment default address and operand sizes. Used
 #defines instead.
*Corrected bug when determining the limit of a segment.
*Updated various functions to use error codes from errno-base.h
*Replaced printk_ratelimited with pr_err_ratelimited.
*Corrected typos and format errors in functions' documentation.
*Fixed unimplemented handling of emulation of the SMSW instruction.
*Added documentation to file containing implementation for UMIP.
*Improved error handling in fixup_umip_exception() function.

Changes since V6:
*Reworded and added more details on the special cases of ModRM and SIB
 bytes. To avoid confusion, I omitted mentioning the involved registers
 (EBP and ESP).
*Replaced BUG() with printk_ratelimited in function get_reg_offset of
 insn-eval.c
*Removed unused utility functions that obtain a register value from pt_regs
 given a SIB base and index.
*Clarified nomenclature to call CS, DS, ES, FS, GS and SS segment registers
 and their values segment selectors.
*Reworked function resolve_seg_register to issue an error when more than
 one segment override prefix is used in the instruction.
*Added logic in resolve_seg_register to ignore segment register when in
 long mode and not using FS or GS.
*Added logic to ensure the effective address is within the limits of the
 segment in protected mode.
*Added logic to ensure segment override prefixes are ignored when resolving
 the segment of EIP and EDI with string instructions.
*Added code to make user_64bit_mode() available in CONFIG_X86_32... and
 make it return false, of course.
*Merged the two functions that obtain the default address and operand size
 of a code segment into one as they are always used together.
*Corrected logic of displacement-only addressing in long mode to make the
 displacement relative to the RIP of the next instruction.
*Reworked logic to sign-extend 32-bit memory offsets into 64-bit signed
 memory offsets. This includes more checks and putting it all together in
 a utility function.
*Removed the 'unlikely' of conditional statements as we are not in a
 critical path.
*In virtual-8086 mode, ensure that effective addresses are always less
 than 0x10000,  even when address override prefixes are used. Also, ensure
 that linear addresses have a size of 20-bits.
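
The ModRM and SIB special cases mentioned in this change log can be
sketched as follows. This is a simplified illustration, not the decoder in
insn-eval.c: the structure and function names are hypothetical, and the
sketch assumes 32-bit address encodings without REX prefixes. It only
reports which components take part in the effective address.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_addr_parts {
	bool is_memory;		/* false when ModRM.mod is 11b */
	bool has_sib;		/* ModRM.rm is 100b: a SIB byte follows */
	bool use_base;		/* a base register contributes */
	bool use_index;		/* an index register contributes */
	bool disp32_only;	/* ModRM.mod is 0 and ModRM.rm is 101b */
	unsigned int scale;	/* 1, 2, 4 or 8; meaningful with a SIB byte */
};

/* The sib argument is ignored unless ModRM.rm selects a SIB byte. */
static inline struct sketch_addr_parts
sketch_decode_modrm_sib(uint8_t modrm, uint8_t sib)
{
	struct sketch_addr_parts p = { .scale = 1 };
	unsigned int mod = (modrm >> 6) & 0x3;
	unsigned int rm = modrm & 0x7;

	/* Register operand: there is no memory address to compute. */
	if (mod == 3)
		return p;

	p.is_memory = true;

	if (rm == 4) {
		unsigned int index = (sib >> 3) & 0x7;
		unsigned int base = sib & 0x7;

		p.has_sib = true;
		p.scale = 1u << ((sib >> 6) & 0x3);
		/* SIB.index of 100b means that no index is used. */
		p.use_index = (index != 4);
		/* SIB.base of 101b with mod == 0 means that no base is used. */
		p.use_base = !(mod == 0 && base == 5);
	} else if (mod == 0 && rm == 5) {
		/*
		 * A 32-bit displacement alone; in long mode it is relative
		 * to the RIP of the next instruction.
		 */
		p.disp32_only = true;
	} else {
		p.use_base = true;
	}

	return p;
}
```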

Changes since V5:
* Relocate the page fault error code enumerations to traps.h

Changes since V4:
* Audited patches to use braces in all the branches of conditional
  statements, except those in which the conditional action only takes one
  line.
* Implemented support in 64-bit builds for both 32-bit and 64-bit tasks in
  the instruction evaluating library.
* Split segment selector function in the instruction evaluating library
  into two functions to resolve the segment type by instruction override
  or default and a separate function to actually read the segment selector.
* Fixed a bug when evaluating 32-bit effective addresses with 64-bit
  kernels.
* Split patches further for easier review.
* Use signed variables for computation of effective address.
* Fixed issue with a spurious static modifier in function insn_get_addr_ref
  found by kbuild test bot.
* Removed comparison between true and fixup_umip_exception.
* Reworked check logic when identifying erroneous vs invalid values of the
  SiB base and index.

Changes since V3:
* Limited emulation to 32-bit and 16-bit modes. For 64-bit mode, a general
  protection fault is still issued when UMIP-protected instructions are
  executed with CPL > 0.
* Expanded instruction-evaluating code to obtain segment descriptor along
  with their attributes such as base address and default address and
  operand sizes. Also, support for 16-bit encodings in protected mode was
  implemented.
* When getting a segment descriptor, this includes support to obtain those
  of a local descriptor table.
* Now the instruction-evaluating code returns -EDOM when the value of
  registers should not be used in calculating the effective address. The
  value -EINVAL is left for errors.
* Incorporate the value of the segment base address in the computation of
  linear addresses.
* Renamed new instruction evaluation library from insn-kernel.c to
  insn-eval.c
* Exported functions insn_get_reg_offset_* to obtain the register offset
  by ModRM r/m, SiB base and SiB index.
* Improved documentation of functions.
* Split patches further for easier review.

Changes since V2:
* Added new utility functions to decode the memory addresses contained in
  registers when the 16-bit addressing encodings are used. This includes
  code to obtain and compute memory addresses using segment selectors for
  real-mode address translation.
* Added support to emulate UMIP-protected instructions for virtual-8086
  tasks.
* Added self-tests for virtual-8086 mode that contains representative
  use cases: address represented as a displacement, address in registers
  and registers as operands.
* Instead of maintaining a static variable for the dummy base addresses
  of the IDT and GDT, a hard-coded value is used.
* The emulated SMSW instructions now return the value with which the CR0
  register is programmed in head_32/64.S This is: PE | MP | ET | NE | WP
  | AM. For x86_64, PG is also enabled.
* The new file arch/x86/lib/insn-utils.c is now renamed as arch/x86/lib/
  insn-kernel.c. It also has its own header. This helps keep in sync the
  kernel and objtool instruction decoders. Also, the new insn-kernel.c
  contains utility functions that are only relevant in a kernel context.
* Removed printed warnings for errors that occur when decoding instructions
  with invalid operands.
* Added more comments on fixes in the instruction-decoding MPX functions.
* Now user_64bit_mode(regs) is used instead of test_thread_flag(TIF_IA32)
  to determine if the task is 32-bit or 64-bit.
* Found and fixed a bug in insn-decoder in which X86_MODRM_RM was
  incorrectly used to obtain the mod part of the ModRM byte.
* Added more explanatory code in emulation and instruction decoding code.
  This includes a comment regarding that copy_from_user could fail if there
  exists a memory protection key in place.
* Tested code with CONFIG_X86_DECODER_SELFTEST=y and everything passes now.
* Prefixed get_reg_offset_rm with insn_ as this function is exposed
  via a header file. For clarity, this function was added in a separate
  patch.

Changes since V1:
* Virtual-8086 mode tasks are not treated in a special manner. All code
  for this purpose was removed.
* Instead of attempting to disable UMIP during a context switch or when
  entering virtual-8086 mode, UMIP remains enabled all the time. General
  protection faults that occur are fixed-up by returning dummy values as
  detailed above.
* Removed umip= kernel parameter in favor of using clearcpuid=514 to
  disable UMIP.
* Removed selftests designed to detect the absence of SIGSEGV signals when
  running in virtual-8086 mode.
* Reused code from MPX to decode instructions operands. For this purpose
  code was put in a common location.
* Fixed two bugs in MPX code that decodes operands.

Ricardo Neri (29):
  x86/mm: Relocate page fault error codes to traps.h
  x86/boot: Relocate definition of the initial state of CR0
  ptrace,x86: Make user_64bit_mode() available to 32-bit builds
  uprobes/x86: Use existing definitions for segment override prefixes
  x86/mpx: Simplify handling of errors when computing linear addresses
  x86/mpx: Use signed variables to compute effective addresses
  x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is
    not 11b
  x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0
  x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval
    file
  x86/insn-eval: Do not BUG on invalid register type
  x86/insn-eval: Add a utility function to get register offsets
  x86/insn-eval: Add utility function to identify string instructions
  x86/insn-eval: Add utility functions to get segment selector
  x86/insn-eval: Add utility function to get segment descriptor
  x86/insn-eval: Add utility functions to get segment descriptor base
    address and limit
  x86/insn-eval: Add function to get default params of code segment
  x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and
    ModRM.rm is 101b
  x86/insn-eval: Incorporate segment base in linear address computation
  x86/insn-eval: Add support to resolve 32-bit address encodings
  x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
  x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
  x86/insn-eval: Add support to resolve 16-bit addressing encodings
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
  x86: Add emulation code for UMIP instructions
  x86/umip: Force a page fault when unable to copy emulated result to
    user
  x86: Enable User-Mode Instruction Prevention
  x86/traps: Fixup general protection faults caused by UMIP
  selftests/x86: Add tests for User-Mode Instruction Prevention
  selftests/x86: Add tests for instruction str and sldt

 arch/x86/Kconfig                              |   10 +
 arch/x86/include/asm/cpufeatures.h            |    1 +
 arch/x86/include/asm/disabled-features.h      |    8 +-
 arch/x86/include/asm/inat.h                   |   10 +
 arch/x86/include/asm/insn-eval.h              |   23 +
 arch/x86/include/asm/ptrace.h                 |    6 +-
 arch/x86/include/asm/traps.h                  |   18 +
 arch/x86/include/asm/umip.h                   |   12 +
 arch/x86/include/uapi/asm/processor-flags.h   |    5 +
 arch/x86/kernel/Makefile                      |    1 +
 arch/x86/kernel/cpu/common.c                  |   25 +-
 arch/x86/kernel/head_32.S                     |    3 -
 arch/x86/kernel/head_64.S                     |    3 -
 arch/x86/kernel/traps.c                       |    5 +
 arch/x86/kernel/umip.c                        |  350 +++++++
 arch/x86/kernel/uprobes.c                     |   15 +-
 arch/x86/lib/Makefile                         |    2 +-
 arch/x86/lib/insn-eval.c                      | 1213 +++++++++++++++++++++++++
 arch/x86/mm/fault.c                           |   88 +-
 arch/x86/mm/mpx.c                             |  120 +--
 tools/testing/selftests/x86/entry_from_vm86.c |   89 +-
 21 files changed, 1818 insertions(+), 189 deletions(-)
 create mode 100644 arch/x86/include/asm/insn-eval.h
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c
 create mode 100644 arch/x86/lib/insn-eval.c

-- 
2.7.4

* [PATCH v9 01/29] x86/mm: Relocate page fault error codes to traps.h
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Kirill A. Shutemov, Josh Poimboeuf

Up to this point, only fault.c used the definitions of the page fault error
codes. Thus, it made sense to keep them within such file. Other portions of
code might be interested in those definitions too. For instance, the User-
Mode Instruction Prevention emulation code will use such definitions to
emulate a page fault when it is unable to successfully copy the results
of the emulated instructions to user space.

While relocating the error code enumeration, the prefix X86_ is used to
make it consistent with the rest of the definitions in traps.h. Of course,
code using the enumeration had to be updated as well. No functional changes
were performed.
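
As a hedged illustration of how code outside fault.c could use the
relocated definitions, the sketch below composes an error code for a failed
write to user space. The enumeration mirrors the one added to traps.h by
this patch; the sketch_ helpers are hypothetical, and the choice of
X86_PF_USER | X86_PF_WRITE is an assumption of this example rather than a
statement about the emulation code.

```c
#include <stdbool.h>

/* Mirrors the enumeration this patch adds to traps.h. */
enum x86_pf_error_code {
	X86_PF_PROT	=		1 << 0,
	X86_PF_WRITE	=		1 << 1,
	X86_PF_USER	=		1 << 2,
	X86_PF_RSVD	=		1 << 3,
	X86_PF_INSTR	=		1 << 4,
	X86_PF_PK	=		1 << 5,
};

/*
 * Hypothetical helper: error code for a user-mode write to a page that
 * was not found (the protection bit is left clear).
 */
static inline unsigned long sketch_user_write_fault_code(void)
{
	return X86_PF_USER | X86_PF_WRITE;
}

/* Hypothetical helper: true if the fault was a user-mode write access. */
static inline bool sketch_is_user_write(unsigned long error_code)
{
	return (error_code & X86_PF_USER) && (error_code & X86_PF_WRITE);
}
```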

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: x86@kernel.org
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/traps.h | 18 +++++++++
 arch/x86/mm/fault.c          | 88 +++++++++++++++++---------------------------
 2 files changed, 52 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 5545f64..da3c3a3 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -144,4 +144,22 @@ enum {
 	X86_TRAP_IRET = 32,	/* 32, IRET Exception */
 };
 
+/*
+ * Page fault error code bits:
+ *
+ *   bit 0 ==	 0: no page found	1: protection fault
+ *   bit 1 ==	 0: read access		1: write access
+ *   bit 2 ==	 0: kernel-mode access	1: user-mode access
+ *   bit 3 ==				1: use of reserved bit detected
+ *   bit 4 ==				1: fault was an instruction fetch
+ *   bit 5 ==				1: protection keys block access
+ */
+enum x86_pf_error_code {
+	X86_PF_PROT	=		1 << 0,
+	X86_PF_WRITE	=		1 << 1,
+	X86_PF_USER	=		1 << 2,
+	X86_PF_RSVD	=		1 << 3,
+	X86_PF_INSTR	=		1 << 4,
+	X86_PF_PK	=		1 << 5,
+};
 #endif /* _ASM_X86_TRAPS_H */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e2baeaa..db71c73 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -29,26 +29,6 @@
 #include <asm/trace/exceptions.h>
 
 /*
- * Page fault error code bits:
- *
- *   bit 0 ==	 0: no page found	1: protection fault
- *   bit 1 ==	 0: read access		1: write access
- *   bit 2 ==	 0: kernel-mode access	1: user-mode access
- *   bit 3 ==				1: use of reserved bit detected
- *   bit 4 ==				1: fault was an instruction fetch
- *   bit 5 ==				1: protection keys block access
- */
-enum x86_pf_error_code {
-
-	PF_PROT		=		1 << 0,
-	PF_WRITE	=		1 << 1,
-	PF_USER		=		1 << 2,
-	PF_RSVD		=		1 << 3,
-	PF_INSTR	=		1 << 4,
-	PF_PK		=		1 << 5,
-};
-
-/*
  * Returns 0 if mmiotrace is disabled, or if the fault is not
  * handled by mmiotrace:
  */
@@ -149,7 +129,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
 	 * If it was a exec (instruction fetch) fault on NX page, then
 	 * do not ignore the fault:
 	 */
-	if (error_code & PF_INSTR)
+	if (error_code & X86_PF_INSTR)
 		return 0;
 
 	instr = (void *)convert_ip_to_linear(current, regs);
@@ -179,7 +159,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
  * siginfo so userspace can discover which protection key was set
  * on the PTE.
  *
- * If we get here, we know that the hardware signaled a PF_PK
+ * If we get here, we know that the hardware signaled a X86_PF_PK
  * fault and that there was a VMA once we got in the fault
  * handler.  It does *not* guarantee that the VMA we find here
  * was the one that we faulted on.
@@ -204,7 +184,7 @@ static void fill_sig_info_pkey(int si_code, siginfo_t *info, u32 *pkey)
 	/*
 	 * force_sig_info_fault() is called from a number of
 	 * contexts, some of which have a VMA and some of which
-	 * do not.  The PF_PK handing happens after we have a
+	 * do not.  The X86_PF_PK handing happens after we have a
 	 * valid VMA, so we should never reach this without a
 	 * valid VMA.
 	 */
@@ -697,7 +677,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
 	if (!oops_may_print())
 		return;
 
-	if (error_code & PF_INSTR) {
+	if (error_code & X86_PF_INSTR) {
 		unsigned int level;
 		pgd_t *pgd;
 		pte_t *pte;
@@ -779,7 +759,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 		 */
 		if (current->thread.sig_on_uaccess_err && signal) {
 			tsk->thread.trap_nr = X86_TRAP_PF;
-			tsk->thread.error_code = error_code | PF_USER;
+			tsk->thread.error_code = error_code | X86_PF_USER;
 			tsk->thread.cr2 = address;
 
 			/* XXX: hwpoison faults will set the wrong code. */
@@ -897,7 +877,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 	struct task_struct *tsk = current;
 
 	/* User mode accesses just cause a SIGSEGV */
-	if (error_code & PF_USER) {
+	if (error_code & X86_PF_USER) {
 		/*
 		 * It's possible to have interrupts off here:
 		 */
@@ -918,7 +898,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 		 * Instruction fetch faults in the vsyscall page might need
 		 * emulation.
 		 */
-		if (unlikely((error_code & PF_INSTR) &&
+		if (unlikely((error_code & X86_PF_INSTR) &&
 			     ((address & ~0xfff) == VSYSCALL_ADDR))) {
 			if (emulate_vsyscall(regs, address))
 				return;
@@ -931,7 +911,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 		 * are always protection faults.
 		 */
 		if (address >= TASK_SIZE_MAX)
-			error_code |= PF_PROT;
+			error_code |= X86_PF_PROT;
 
 		if (likely(show_unhandled_signals))
 			show_signal_msg(regs, error_code, address, tsk);
@@ -992,11 +972,11 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
 
 	if (!boot_cpu_has(X86_FEATURE_OSPKE))
 		return false;
-	if (error_code & PF_PK)
+	if (error_code & X86_PF_PK)
 		return true;
 	/* this checks permission keys on the VMA: */
-	if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
-				(error_code & PF_INSTR), foreign))
+	if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE),
+				       (error_code & X86_PF_INSTR), foreign))
 		return true;
 	return false;
 }
@@ -1024,7 +1004,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 	int code = BUS_ADRERR;
 
 	/* Kernel mode? Handle exceptions or die: */
-	if (!(error_code & PF_USER)) {
+	if (!(error_code & X86_PF_USER)) {
 		no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
 		return;
 	}
@@ -1052,14 +1032,14 @@ static noinline void
 mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 	       unsigned long address, u32 *pkey, unsigned int fault)
 {
-	if (fatal_signal_pending(current) && !(error_code & PF_USER)) {
+	if (fatal_signal_pending(current) && !(error_code & X86_PF_USER)) {
 		no_context(regs, error_code, address, 0, 0);
 		return;
 	}
 
 	if (fault & VM_FAULT_OOM) {
 		/* Kernel mode? Handle exceptions or die: */
-		if (!(error_code & PF_USER)) {
+		if (!(error_code & X86_PF_USER)) {
 			no_context(regs, error_code, address,
 				   SIGSEGV, SEGV_MAPERR);
 			return;
@@ -1084,16 +1064,16 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 
 static int spurious_fault_check(unsigned long error_code, pte_t *pte)
 {
-	if ((error_code & PF_WRITE) && !pte_write(*pte))
+	if ((error_code & X86_PF_WRITE) && !pte_write(*pte))
 		return 0;
 
-	if ((error_code & PF_INSTR) && !pte_exec(*pte))
+	if ((error_code & X86_PF_INSTR) && !pte_exec(*pte))
 		return 0;
 	/*
 	 * Note: We do not do lazy flushing on protection key
-	 * changes, so no spurious fault will ever set PF_PK.
+	 * changes, so no spurious fault will ever set X86_PF_PK.
 	 */
-	if ((error_code & PF_PK))
+	if ((error_code & X86_PF_PK))
 		return 1;
 
 	return 1;
@@ -1139,8 +1119,8 @@ spurious_fault(unsigned long error_code, unsigned long address)
 	 * change, so user accesses are not expected to cause spurious
 	 * faults.
 	 */
-	if (error_code != (PF_WRITE | PF_PROT)
-	    && error_code != (PF_INSTR | PF_PROT))
+	if (error_code != (X86_PF_WRITE | X86_PF_PROT) &&
+	    error_code != (X86_PF_INSTR | X86_PF_PROT))
 		return 0;
 
 	pgd = init_mm.pgd + pgd_index(address);
@@ -1200,19 +1180,19 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
 	 * always an unconditional error and can never result in
 	 * a follow-up action to resolve the fault, like a COW.
 	 */
-	if (error_code & PF_PK)
+	if (error_code & X86_PF_PK)
 		return 1;
 
 	/*
 	 * Make sure to check the VMA so that we do not perform
-	 * faults just to hit a PF_PK as soon as we fill in a
+	 * faults just to hit a X86_PF_PK as soon as we fill in a
 	 * page.
 	 */
-	if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
-				(error_code & PF_INSTR), foreign))
+	if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE),
+				       (error_code & X86_PF_INSTR), foreign))
 		return 1;
 
-	if (error_code & PF_WRITE) {
+	if (error_code & X86_PF_WRITE) {
 		/* write, present and write, not present: */
 		if (unlikely(!(vma->vm_flags & VM_WRITE)))
 			return 1;
@@ -1220,7 +1200,7 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
 	}
 
 	/* read, present: */
-	if (unlikely(error_code & PF_PROT))
+	if (unlikely(error_code & X86_PF_PROT))
 		return 1;
 
 	/* read, not present: */
@@ -1243,7 +1223,7 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
 	if (!static_cpu_has(X86_FEATURE_SMAP))
 		return false;
 
-	if (error_code & PF_USER)
+	if (error_code & X86_PF_USER)
 		return false;
 
 	if (!user_mode(regs) && (regs->flags & X86_EFLAGS_AC))
@@ -1296,7 +1276,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 * protection error (error_code & 9) == 0.
 	 */
 	if (unlikely(fault_in_kernel_space(address))) {
-		if (!(error_code & (PF_RSVD | PF_USER | PF_PROT))) {
+		if (!(error_code & (X86_PF_RSVD | X86_PF_USER | X86_PF_PROT))) {
 			if (vmalloc_fault(address) >= 0)
 				return;
 
@@ -1324,7 +1304,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	if (unlikely(kprobes_fault(regs)))
 		return;
 
-	if (unlikely(error_code & PF_RSVD))
+	if (unlikely(error_code & X86_PF_RSVD))
 		pgtable_bad(regs, error_code, address);
 
 	if (unlikely(smap_violation(error_code, regs))) {
@@ -1350,7 +1330,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 */
 	if (user_mode(regs)) {
 		local_irq_enable();
-		error_code |= PF_USER;
+		error_code |= X86_PF_USER;
 		flags |= FAULT_FLAG_USER;
 	} else {
 		if (regs->flags & X86_EFLAGS_IF)
@@ -1359,9 +1339,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
-	if (error_code & PF_WRITE)
+	if (error_code & X86_PF_WRITE)
 		flags |= FAULT_FLAG_WRITE;
-	if (error_code & PF_INSTR)
+	if (error_code & X86_PF_INSTR)
 		flags |= FAULT_FLAG_INSTRUCTION;
 
 	/*
@@ -1381,7 +1361,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 * space check, thus avoiding the deadlock:
 	 */
 	if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
-		if ((error_code & PF_USER) == 0 &&
+		if (!(error_code & X86_PF_USER) &&
 		    !search_exception_tables(regs->ip)) {
 			bad_area_nosemaphore(regs, error_code, address, NULL);
 			return;
@@ -1408,7 +1388,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 		bad_area(regs, error_code, address);
 		return;
 	}
-	if (error_code & PF_USER) {
+	if (error_code & X86_PF_USER) {
 		/*
 		 * Accessing the stack below %sp is always a bug.
 		 * The large cushion allows instructions like enter
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 02/29] x86/boot: Relocate definition of the initial state of CR0
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 01/29] x86/mm: Relocate page fault error codes to traps.h Ricardo Neri
@ 2017-10-04  3:54   ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 03/29] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Ricardo Neri
                     ` (26 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Andy Lutomirski, Borislav Petkov, Dave Hansen, Denys Vlasenko,
	Josh Poimboeuf, Linus Torvalds, linux-arch, linux-mm

Both head_32.S and head_64.S utilize the same value to initialize the
control register CR0. Also, other parts of the kernel might want to access
this initial definition (e.g., emulation code for User-Mode Instruction
Prevention uses this state to provide a sane dummy value for CR0 when
emulating the smsw instruction). Thus, relocate this definition to a
header file from which it can be conveniently accessed.
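
As a rough illustration of what this definition evaluates to, the sketch
below rebuilds the same bit mask in plain C. The DEMO_* names are local to
this example and are not kernel identifiers; the bit positions are the CR0
bit assignments documented in the Intel SDM.

```c
#include <stdint.h>

/* CR0 bit positions per the Intel SDM; DEMO_* names are hypothetical. */
#define DEMO_CR0_PE	(UINT32_C(1) << 0)	/* Protection Enable */
#define DEMO_CR0_MP	(UINT32_C(1) << 1)	/* Monitor Coprocessor */
#define DEMO_CR0_ET	(UINT32_C(1) << 4)	/* Extension Type */
#define DEMO_CR0_NE	(UINT32_C(1) << 5)	/* Numeric Error */
#define DEMO_CR0_WP	(UINT32_C(1) << 16)	/* Write Protect */
#define DEMO_CR0_AM	(UINT32_C(1) << 18)	/* Alignment Mask */
#define DEMO_CR0_PG	(UINT32_C(1) << 31)	/* Paging */

/* Same combination of flags as CR0_STATE in the patch. */
#define DEMO_CR0_STATE	(DEMO_CR0_PE | DEMO_CR0_MP | DEMO_CR0_ET | \
			 DEMO_CR0_NE | DEMO_CR0_WP | DEMO_CR0_AM | \
			 DEMO_CR0_PG)

/* Value with paging enabled, as loaded by head_64.S. */
uint32_t demo_cr0_state(void)
{
	return DEMO_CR0_STATE;
}

/* Value with the paging bit masked off, as loaded by head_32.S. */
uint32_t demo_cr0_state_no_paging(void)
{
	return DEMO_CR0_STATE & ~DEMO_CR0_PG;
}
```

With these bit positions, demo_cr0_state() yields 0x80050033 and
demo_cr0_state_no_paging() yields 0x00050033.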

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Suggested-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/uapi/asm/processor-flags.h | 3 +++
 arch/x86/kernel/head_32.S                   | 3 ---
 arch/x86/kernel/head_64.S                   | 3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index 185f3d1..39946d0 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -151,5 +151,8 @@
 #define CX86_ARR_BASE	0xc4
 #define CX86_RCR_BASE	0xdc
 
+#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
+			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
+			 X86_CR0_PG)
 
 #endif /* _UAPI_ASM_X86_PROCESSOR_FLAGS_H */
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 9ed3074..c3cfc65 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -211,9 +211,6 @@ ENTRY(startup_32_smp)
 #endif
 
 .Ldefault_entry:
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl $(CR0_STATE & ~X86_CR0_PG),%eax
 	movl %eax,%cr0
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 42e32c2..205dabc 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -152,9 +152,6 @@ ENTRY(secondary_startup_64)
 1:	wrmsr				/* Make changes effective */
 
 	/* Setup cr0 */
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl	$CR0_STATE, %eax
 	/* Make changes effective */
 	movq	%rax, %cr0
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 03/29] ptrace,x86: Make user_64bit_mode() available to 32-bit builds
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 01/29] x86/mm: Relocate page fault error codes to traps.h Ricardo Neri
  2017-10-04  3:54   ` Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 04/29] uprobes/x86: Use existing definitions for segment override prefixes Ricardo Neri
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

In its current form, user_64bit_mode() can only be used when CONFIG_X86_64
is selected. This implies that code built with CONFIG_X86_64=n cannot use
it. If a piece of code needs to be built for both CONFIG_X86_64=y and
CONFIG_X86_64=n and wants to use this function, it needs to wrap each call
in an #ifdef/#endif pair, potentially in multiple places.

This can be easily avoided with a single #ifdef/#endif pair within
user_64bit_mode() itself.
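
The pattern can be sketched outside the kernel as follows. This is only a
model of the idea: struct demo_regs, DEMO_USER_CS and the DEMO_CONFIG_64BIT
macro are hypothetical stand-ins for pt_regs, __USER_CS and CONFIG_X86_64,
and the paravirt case is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct pt_regs; only the field used here. */
struct demo_regs {
	unsigned long cs;
};

/* Hypothetical stand-in for __USER_CS. */
#define DEMO_USER_CS	0x33UL

/*
 * The single #ifdef lives inside the helper. Build with
 * -DDEMO_CONFIG_64BIT to model CONFIG_X86_64=y; without it the
 * helper always reports false, as in the patch.
 */
bool demo_user_64bit_mode(const struct demo_regs *regs)
{
#ifdef DEMO_CONFIG_64BIT
	return regs->cs == DEMO_USER_CS;
#else /* !DEMO_CONFIG_64BIT */
	(void)regs;	/* unused in the 32-bit model */
	return false;
#endif
}

/* A caller that compiles unchanged in both configurations. */
int demo_default_address_bytes(const struct demo_regs *regs)
{
	return demo_user_64bit_mode(regs) ? 8 : 4;
}
```

The caller carries no conditional compilation of its own; in the 32-bit
model the compiler can fold the helper to a constant false.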

Suggested-by: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/ptrace.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 91c04c8..e2afbf6 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -135,9 +135,9 @@ static inline int v8086_mode(struct pt_regs *regs)
 #endif
 }
 
-#ifdef CONFIG_X86_64
 static inline bool user_64bit_mode(struct pt_regs *regs)
 {
+#ifdef CONFIG_X86_64
 #ifndef CONFIG_PARAVIRT
 	/*
 	 * On non-paravirt systems, this is the only long mode CPL 3
@@ -148,8 +148,12 @@ static inline bool user_64bit_mode(struct pt_regs *regs)
 	/* Headers are too twisted for this to go in paravirt.h. */
 	return regs->cs == __USER_CS || regs->cs == pv_info.extra_user_64bit_cs;
 #endif
+#else /* !CONFIG_X86_64 */
+	return false;
+#endif
 }
 
+#ifdef CONFIG_X86_64
 #define current_user_stack_pointer()	current_pt_regs()->sp
 #define compat_user_stack_pointer()	current_pt_regs()->sp
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 04/29] uprobes/x86: Use existing definitions for segment override prefixes
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (2 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 03/29] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 05/29] x86/mpx: Simplify handling of errors when computing linear addresses Ricardo Neri
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Denys Vlasenko, Srikar Dronamraju

Rather than using hard-coded values of the segment override prefixes,
leverage the existing definitions provided in inat.h.
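
For readers unfamiliar with the encodings, here is a standalone sketch of
the same check. It does not use the kernel's inat tables: the enum and both
functions are hypothetical, and the byte values are the legacy prefix
encodings from the Intel SDM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefix classes; the kernel uses INAT_PFX_* attributes. */
enum demo_prefix {
	DEMO_PFX_NONE,
	DEMO_PFX_ES,
	DEMO_PFX_CS,
	DEMO_PFX_SS,
	DEMO_PFX_DS,
	DEMO_PFX_LOCK,
};

/* Map a raw prefix byte to a named class instead of a magic number. */
enum demo_prefix demo_classify_prefix(uint8_t byte)
{
	switch (byte) {
	case 0x26: return DEMO_PFX_ES;
	case 0x2e: return DEMO_PFX_CS;
	case 0x36: return DEMO_PFX_SS;
	case 0x3e: return DEMO_PFX_DS;
	case 0xf0: return DEMO_PFX_LOCK;
	default:   return DEMO_PFX_NONE;
	}
}

/* True if any prefix is ES, CS, SS, DS or LOCK, mirroring the patch. */
bool demo_is_prefix_bad(const uint8_t *prefixes, size_t nbytes)
{
	size_t i;

	for (i = 0; i < nbytes; i++) {
		switch (demo_classify_prefix(prefixes[i])) {
		case DEMO_PFX_ES:
		case DEMO_PFX_CS:
		case DEMO_PFX_SS:
		case DEMO_PFX_DS:
		case DEMO_PFX_LOCK:
			return true;
		default:
			break;
		}
	}

	return false;
}
```

As in the patch, FS (0x64) and GS (0x65) overrides are not rejected.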

Suggested-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/kernel/uprobes.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 495c776..a3755d2 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -271,12 +271,15 @@ static bool is_prefix_bad(struct insn *insn)
 	int i;
 
 	for (i = 0; i < insn->prefixes.nbytes; i++) {
-		switch (insn->prefixes.bytes[i]) {
-		case 0x26:	/* INAT_PFX_ES   */
-		case 0x2E:	/* INAT_PFX_CS   */
-		case 0x36:	/* INAT_PFX_DS   */
-		case 0x3E:	/* INAT_PFX_SS   */
-		case 0xF0:	/* INAT_PFX_LOCK */
+		insn_attr_t attr;
+
+		attr = inat_get_opcode_attribute(insn->prefixes.bytes[i]);
+		switch (attr) {
+		case INAT_MAKE_PREFIX(INAT_PFX_ES):
+		case INAT_MAKE_PREFIX(INAT_PFX_CS):
+		case INAT_MAKE_PREFIX(INAT_PFX_DS):
+		case INAT_MAKE_PREFIX(INAT_PFX_SS):
+		case INAT_MAKE_PREFIX(INAT_PFX_LOCK):
 			return true;
 		}
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 05/29] x86/mpx: Simplify handling of errors when computing linear addresses
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (3 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 04/29] uprobes/x86: Use existing definitions for segment override prefixes Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 06/29] x86/mpx: Use signed variables to compute effective addresses Ricardo Neri
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Nathan Howard, Adan Hawthorn, Joe Perches

When errors occur in the computation of the linear address, -1L is
returned. Rather than having a separate return path for errors, the
variable used to return the computed linear address can be initialized
with the error value. Hence, only one return path is needed. This makes
the function easier to read.

While here, ensure that the error value is -1L, a 64-bit value, rather
than -1, a 32-bit value.
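
A minimal sketch of the single-exit idea, reduced to one register lookup.
demo_reg_offset() and demo_get_addr() are hypothetical helpers written for
this illustration; they only imitate the shape of get_reg_offset() and
mpx_get_addr_ref().

```c
/*
 * Hypothetical register lookup: returns an index into a 4-entry
 * register array, or a negative value for an unknown register.
 */
int demo_reg_offset(int regno)
{
	return (regno >= 0 && regno < 4) ? regno : -1;
}

/*
 * The result starts out as the error value, so every failure can jump
 * to the one exit label without a dedicated error path.
 */
unsigned long demo_get_addr(const unsigned long *regs, int regno, long disp)
{
	unsigned long addr = -1L;
	int offset;

	offset = demo_reg_offset(regno);
	if (offset < 0)
		goto out;

	addr = regs[offset] + disp;
out:
	return addr;
}
```

Callers compare the result against (unsigned long)-1L to detect a failure.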

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/mm/mpx.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 9ceaa95..f4c48a0 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -138,7 +138,7 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
  */
 static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	unsigned long addr, base, indx;
+	unsigned long addr = -1L, base, indx;
 	int addr_offset, base_offset, indx_offset;
 	insn_byte_t sib;
 
@@ -149,17 +149,17 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
 		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 		if (addr_offset < 0)
-			goto out_err;
+			goto out;
 		addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
 			if (base_offset < 0)
-				goto out_err;
+				goto out;
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
 			if (indx_offset < 0)
-				goto out_err;
+				goto out;
 
 			base = regs_get_register(regs, base_offset);
 			indx = regs_get_register(regs, indx_offset);
@@ -167,14 +167,13 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			if (addr_offset < 0)
-				goto out_err;
+				goto out;
 			addr = regs_get_register(regs, addr_offset);
 		}
 		addr += insn->displacement.value;
 	}
+out:
 	return (void __user *)addr;
-out_err:
-	return (void __user *)-1;
 }
 
 static int mpx_insn_decode(struct insn *insn,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 06/29] x86/mpx: Use signed variables to compute effective addresses
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (4 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 05/29] x86/mpx: Simplify handling of errors when computing linear addresses Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-05  9:41   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 07/29] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b Ricardo Neri
                   ` (22 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Nathan Howard, Adan Hawthorn, Joe Perches

Even though memory addresses are unsigned, the operands used to compute the
effective address do have a sign. This is true for ModRM.rm, SIB.base,
SIB.index as well as the displacement bytes. Thus, signed variables shall
be used when computing the effective address from these operands. Once the
signed effective address has been computed, it is cast to an unsigned
long to determine the linear address.

Variables are renamed to better reflect the type of address being
computed.
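
The effect of the signedness can be seen in a small standalone sketch.
demo_linear_addr() is a hypothetical function that follows the same
arithmetic as the patch, with an 8-bit displacement for brevity.

```c
#include <stdint.h>

/*
 * Compute base + index * 2^scale_bits + displacement using signed
 * operands, then convert the result to an unsigned linear address.
 */
unsigned long demo_linear_addr(long base, long indx,
			       unsigned int scale_bits, int8_t disp8)
{
	long eff_addr;

	eff_addr = base + indx * (1L << scale_bits);
	eff_addr += disp8;	/* sign-extended: 0x80 means -128 */

	return (unsigned long)eff_addr;
}
```

With a base of 0x1000 and a displacement of -128, the result is 0xf80. If
the displacement byte 0x80 were treated as unsigned, the result would be
0x1080 instead.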

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/mm/mpx.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index f4c48a0..57e5bf5 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -138,8 +138,9 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
  */
 static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	unsigned long addr = -1L, base, indx;
 	int addr_offset, base_offset, indx_offset;
+	unsigned long linear_addr = -1L;
+	long eff_addr, base, indx;
 	insn_byte_t sib;
 
 	insn_get_modrm(insn);
@@ -150,7 +151,8 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 		if (addr_offset < 0)
 			goto out;
-		addr = regs_get_register(regs, addr_offset);
+
+		eff_addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
@@ -163,17 +165,23 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 
 			base = regs_get_register(regs, base_offset);
 			indx = regs_get_register(regs, indx_offset);
-			addr = base + indx * (1 << X86_SIB_SCALE(sib));
+
+			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			if (addr_offset < 0)
 				goto out;
-			addr = regs_get_register(regs, addr_offset);
+
+			eff_addr = regs_get_register(regs, addr_offset);
 		}
-		addr += insn->displacement.value;
+
+		eff_addr += insn->displacement.value;
 	}
+
+	linear_addr = (unsigned long)eff_addr;
+
 out:
-	return (void __user *)addr;
+	return (void __user *)linear_addr;
 }
 
 static int mpx_insn_decode(struct insn *insn,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 07/29] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (5 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 06/29] x86/mpx: Use signed variables to compute effective addresses Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 08/29] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0 Ricardo Neri
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Nathan Howard, Adan Hawthorn, Joe Perches

Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod != 11b and
ModRM.rm = 100b, indexed register-indirect addressing is used. In other
words, a SIB byte follows the ModRM byte. In the specific case of
SIB.index = 100b, the scale*index portion of the computation of the
effective address is null. To signal this particular situation to callers,
get_reg_offset() can return -EDOM (-EINVAL continues to indicate an
error when decoding the SIB byte).

An example of this situation can be the following instruction:

   8b 4c 23 80       mov -0x80(%rbx,%riz,1),%ecx
   ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
   SIB:              0x23 [scale:0b][index:100b][base:11b]
   Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)

The %riz 'register' indicates a null index.

In long mode, a REX prefix may be used. When a REX prefix is present,
REX.X adds a fourth bit to the register selection of SIB.index. This gives
the ability to refer to all 16 general purpose registers. When REX.X is
1b and SIB.index is 100b, the index register is %r12. In our example,
this would look like:

   42 8b 4c 23 80    mov -0x80(%rbx,%r12,1),%ecx
   REX:              0x42 [W:0b][R:0b][X:1b][B:0b]
   ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
   SIB:              0x23 [scale:0b][.X: 1b, index:100b][.B:0b, base:11b]
   Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)

%r12 is a valid register to use in the scale*index part of the effective
address computation.
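
The field extraction used in the two examples can be sketched in plain C.
The demo_* structures and functions below are hypothetical and only mirror
the logic of get_reg_offset() for the index register; pass 0 as the REX
byte when no REX prefix is present.

```c
#include <errno.h>
#include <stdint.h>

struct demo_modrm {
	unsigned int mod, reg, rm;
};

struct demo_sib {
	unsigned int scale, index, base;
};

/* ModRM layout: mod in bits 7:6, reg in bits 5:3, rm in bits 2:0. */
struct demo_modrm demo_decode_modrm(uint8_t byte)
{
	struct demo_modrm m = {
		.mod = (byte >> 6) & 0x3,
		.reg = (byte >> 3) & 0x7,
		.rm  = byte & 0x7,
	};

	return m;
}

/* SIB layout: scale in bits 7:6, index in bits 5:3, base in bits 2:0. */
struct demo_sib demo_decode_sib(uint8_t byte)
{
	struct demo_sib s = {
		.scale = (byte >> 6) & 0x3,
		.index = (byte >> 3) & 0x7,
		.base  = byte & 0x7,
	};

	return s;
}

/*
 * Return the index register number (0..15), or -EDOM when the
 * scale*index term is null (mod != 11b, index = 100b, REX.X = 0).
 */
int demo_index_regno(uint8_t rex, uint8_t modrm, uint8_t sib)
{
	int regno = demo_decode_sib(sib).index;

	if (rex & 0x02)		/* REX.X is bit 1 of the REX byte */
		regno += 8;

	if (demo_decode_modrm(modrm).mod != 3 && regno == 4)
		return -EDOM;

	return regno;
}
```

For the first example above (no REX, ModRM 0x4c, SIB 0x23) this returns
-EDOM; for the second (REX 0x42) it returns 12, that is, %r12.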

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/mm/mpx.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 57e5bf5..2ad1d4a 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -110,6 +110,15 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 		regno = X86_SIB_INDEX(insn->sib.value);
 		if (X86_REX_X(insn->rex_prefix.value))
 			regno += 8;
+
+		/*
+		 * If ModRM.mod != 3 and SIB.index = 4, the scale*index
+		 * portion of the address computation is null. This is
+		 * true only if REX.X is 0. If REX.X is 1, regno is 12
+		 * and %r12 is used as the index in the address computation.
+		 */
+		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
+			return -EDOM;
 		break;
 
 	case REG_TYPE_BASE:
@@ -160,11 +169,19 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 				goto out;
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-			if (indx_offset < 0)
+			/*
+			 * A negative offset generally means a error, except
+			 * -EDOM, which means that the contents of the register
+			 * should not be used as index.
+			 */
+			if (indx_offset == -EDOM)
+				indx = 0;
+			else if (indx_offset < 0)
 				goto out;
+			else
+				indx = regs_get_register(regs, indx_offset);
 
 			base = regs_get_register(regs, base_offset);
-			indx = regs_get_register(regs, indx_offset);
 
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
-- 
2.7.4


* [PATCH v9 08/29] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (6 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 07/29] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 09/29] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file Ricardo Neri
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Nathan Howard, Adan Hawthorn, Joe Perches

Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that if a SIB byte is used and
SIB.base is 101b and ModRM.mod is zero, then the base part of the
effective address computation is null. To signal this situation, a
-EDOM error is returned to tell callers to ignore the base value
present in the register operand.

In this scenario, a 32-bit displacement follows the SIB byte. The
displacement is obtained when the instruction decoder parses the operands.
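
A minimal user-space sketch of the resulting address computation, under
stated assumptions: the helper names and the gpr[] array (indexed by
hardware register number, 0 = rAX ... 15 = r15) are hypothetical and only
stand in for get_reg_offset() and pt_regs.

```c
#include <errno.h>
#include <stdint.h>

#define NR_GPRS 16

/* Index register number, or -EDOM when scale*index must be dropped. */
static int ea_index_regno(uint8_t rex, uint8_t modrm, uint8_t sib)
{
	int regno = (sib >> 3) & 0x7;

	if ((rex >> 1) & 0x1)		/* REX.X */
		regno += 8;
	if (((modrm >> 6) & 0x3) != 3 && regno == 4)
		return -EDOM;
	return regno;
}

/* Base register number, or -EDOM when the base must be dropped. */
static int ea_base_regno(uint8_t rex, uint8_t modrm, uint8_t sib)
{
	int regno = sib & 0x7;

	/* Checked before REX.B is applied, as in the patch. */
	if (((modrm >> 6) & 0x3) == 0 && regno == 5)
		return -EDOM;
	if (rex & 0x1)			/* REX.B */
		regno += 8;
	return regno;
}

/* Effective address of a SIB operand: base + index * 2^scale + disp. */
static long sib_effective_address(const long gpr[NR_GPRS], uint8_t rex,
				  uint8_t modrm, uint8_t sib, int32_t disp)
{
	int base_no = ea_base_regno(rex, modrm, sib);
	int indx_no = ea_index_regno(rex, modrm, sib);
	long base = (base_no == -EDOM) ? 0 : gpr[base_no];
	long indx = (indx_no == -EDOM) ? 0 : gpr[indx_no];

	return base + indx * (1L << ((sib >> 6) & 0x3)) + disp;
}
```

When ModRM.mod is 0 and SIB.base is 101b, only the scaled index (if any)
and the 32-bit displacement contribute to the result.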

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nathan Howard <liverlint@gmail.com>
Cc: Adan Hawthorn <adanhawthorn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/mm/mpx.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 2ad1d4a..581a960 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -123,6 +123,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 
 	case REG_TYPE_BASE:
 		regno = X86_SIB_BASE(insn->sib.value);
+		/*
+		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
+		 * register-indirect addressing is 0. In this case, a
+		 * 32-bit displacement follows the SIB byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
 		if (X86_REX_B(insn->rex_prefix.value))
 			regno += 8;
 		break;
@@ -164,16 +172,22 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 		eff_addr = regs_get_register(regs, addr_offset);
 	} else {
 		if (insn->sib.nbytes) {
+			/*
+			 * Negative values in the base and index offset means
+			 * an error when decoding the SIB byte. Except -EDOM,
+			 * which means that the registers should not be used
+			 * in the address computation.
+			 */
 			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-			if (base_offset < 0)
+			if (base_offset == -EDOM)
+				base = 0;
+			else if (base_offset < 0)
 				goto out;
+			else
+				base = regs_get_register(regs, base_offset);
 
 			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-			/*
-			 * A negative offset generally means a error, except
-			 * -EDOM, which means that the contents of the register
-			 * should not be used as index.
-			 */
+
 			if (indx_offset == -EDOM)
 				indx = 0;
 			else if (indx_offset < 0)
@@ -181,8 +195,6 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			else
 				indx = regs_get_register(regs, indx_offset);
 
-			base = regs_get_register(regs, base_offset);
-
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-- 
2.7.4


* [PATCH v9 09/29] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (7 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 08/29] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0 Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 10/29] x86/insn-eval: Do not BUG on invalid register type Ricardo Neri
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Other kernel submodules can benefit from using the utility functions
defined in mpx.c to obtain the addresses and values of operands contained
in the general purpose registers. An instance of this is the emulation code
used for instructions protected by the Intel User-Mode Instruction
Prevention feature.

Thus, these functions are relocated to a new insn-eval.c file. The reason
for not relocating these utilities into insn.c is that the latter solely
analyses instructions given by a struct insn without any knowledge of the
meaning of the values of instruction operands. The new insn-eval.c aims
to be used to resolve userspace linear addresses based on the contents of
the instruction operands as well as the contents of the pt_regs
structure.

These utilities come with a separate header. This is to avoid taking insn.c
out of sync with the instruction decoders under tools/obj and tools/perf.
This also avoids adding cumbersome #ifdef's for the #include'd files
required to decode instructions in a kernel context.

Functions are simply relocated. There are no functional or indentation
changes.

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
The checkpatch script issues the following warning with this
commit:

WARNING: Avoid crashing the kernel - try using WARN_ON & recovery code
rather than BUG() or BUG_ON()
+               BUG();

This warning will be fixed in a subsequent patch.
---
 arch/x86/include/asm/insn-eval.h |  16 ++++
 arch/x86/lib/Makefile            |   2 +-
 arch/x86/lib/insn-eval.c         | 163 +++++++++++++++++++++++++++++++++++++++
 arch/x86/mm/mpx.c                | 156 +------------------------------------
 4 files changed, 182 insertions(+), 155 deletions(-)
 create mode 100644 arch/x86/include/asm/insn-eval.h
 create mode 100644 arch/x86/lib/insn-eval.c

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
new file mode 100644
index 0000000..5cab1b1
--- /dev/null
+++ b/arch/x86/include/asm/insn-eval.h
@@ -0,0 +1,16 @@
+#ifndef _ASM_X86_INSN_EVAL_H
+#define _ASM_X86_INSN_EVAL_H
+/*
+ * A collection of utility functions for x86 instruction analysis to be
+ * used in a kernel context. Useful when, for instance, making sense
+ * of the registers indicated by operands.
+ */
+
+#include <linux/compiler.h>
+#include <linux/bug.h>
+#include <linux/err.h>
+#include <asm/ptrace.h>
+
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+
+#endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 34a7413..675d7b0 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -23,7 +23,7 @@ lib-y := delay.o misc.o cmdline.o cpu.o
 lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
 lib-y += memcpy_$(BITS).o
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
-lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
+lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
 lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
 
 obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
new file mode 100644
index 0000000..df9418c
--- /dev/null
+++ b/arch/x86/lib/insn-eval.c
@@ -0,0 +1,163 @@
+/*
+ * Utility functions for x86 operand and address decoding
+ *
+ * Copyright (C) Intel Corporation 2017
+ */
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <asm/inat.h>
+#include <asm/insn.h>
+#include <asm/insn-eval.h>
+
+enum reg_type {
+	REG_TYPE_RM = 0,
+	REG_TYPE_INDEX,
+	REG_TYPE_BASE,
+};
+
+static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
+			  enum reg_type type)
+{
+	int regno = 0;
+
+	static const int regoff[] = {
+		offsetof(struct pt_regs, ax),
+		offsetof(struct pt_regs, cx),
+		offsetof(struct pt_regs, dx),
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, sp),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+		offsetof(struct pt_regs, r8),
+		offsetof(struct pt_regs, r9),
+		offsetof(struct pt_regs, r10),
+		offsetof(struct pt_regs, r11),
+		offsetof(struct pt_regs, r12),
+		offsetof(struct pt_regs, r13),
+		offsetof(struct pt_regs, r14),
+		offsetof(struct pt_regs, r15),
+#endif
+	};
+	int nr_registers = ARRAY_SIZE(regoff);
+	/*
+	 * Don't possibly decode a 32-bit instructions as
+	 * reading a 64-bit-only register.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64) && !insn->x86_64)
+		nr_registers -= 8;
+
+	switch (type) {
+	case REG_TYPE_RM:
+		regno = X86_MODRM_RM(insn->modrm.value);
+		if (X86_REX_B(insn->rex_prefix.value))
+			regno += 8;
+		break;
+
+	case REG_TYPE_INDEX:
+		regno = X86_SIB_INDEX(insn->sib.value);
+		if (X86_REX_X(insn->rex_prefix.value))
+			regno += 8;
+
+		/*
+		 * If ModRM.mod != 3 and SIB.index = 4 the scale*index
+		 * portion of the address computation is null. This is
+		 * true only if REX.X is 0. In such a case, the SIB index
+		 * is used in the address computation.
+		 */
+		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
+			return -EDOM;
+		break;
+
+	case REG_TYPE_BASE:
+		regno = X86_SIB_BASE(insn->sib.value);
+		/*
+		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
+		 * register-indirect addressing is 0. In this case, a
+		 * 32-bit displacement follows the SIB byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
+		if (X86_REX_B(insn->rex_prefix.value))
+			regno += 8;
+		break;
+
+	default:
+		pr_err("invalid register type");
+		BUG();
+		break;
+	}
+
+	if (regno >= nr_registers) {
+		WARN_ONCE(1, "decoded an instruction with an invalid register");
+		return -EINVAL;
+	}
+	return regoff[regno];
+}
+
+/*
+ * return the address being referenced be instruction
+ * for rm=3 returning the content of the rm reg
+ * for rm!=3 calculates the address using SIB and Disp
+ */
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+{
+	int addr_offset, base_offset, indx_offset;
+	unsigned long linear_addr = -1L;
+	long eff_addr, base, indx;
+	insn_byte_t sib;
+
+	insn_get_modrm(insn);
+	insn_get_sib(insn);
+	sib = insn->sib.value;
+
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
+		if (addr_offset < 0)
+			goto out;
+
+		eff_addr = regs_get_register(regs, addr_offset);
+	} else {
+		if (insn->sib.nbytes) {
+			/*
+			 * Negative values in the base and index offset means
+			 * an error when decoding the SIB byte. Except -EDOM,
+			 * which means that the registers should not be used
+			 * in the address computation.
+			 */
+			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
+			if (base_offset == -EDOM)
+				base = 0;
+			else if (base_offset < 0)
+				goto out;
+			else
+				base = regs_get_register(regs, base_offset);
+
+			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
+
+			if (indx_offset == -EDOM)
+				indx = 0;
+			else if (indx_offset < 0)
+				goto out;
+			else
+				indx = regs_get_register(regs, indx_offset);
+
+			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+		} else {
+			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
+			if (addr_offset < 0)
+				goto out;
+
+			eff_addr = regs_get_register(regs, addr_offset);
+		}
+
+		eff_addr += insn->displacement.value;
+	}
+
+	linear_addr = (unsigned long)eff_addr;
+
+out:
+	return (void __user *)linear_addr;
+}
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 581a960..2878205 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -12,6 +12,7 @@
 #include <linux/sched/sysctl.h>
 
 #include <asm/insn.h>
+#include <asm/insn-eval.h>
 #include <asm/mman.h>
 #include <asm/mmu_context.h>
 #include <asm/mpx.h>
@@ -60,159 +61,6 @@ static unsigned long mpx_mmap(unsigned long len)
 	return addr;
 }
 
-enum reg_type {
-	REG_TYPE_RM = 0,
-	REG_TYPE_INDEX,
-	REG_TYPE_BASE,
-};
-
-static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
-			  enum reg_type type)
-{
-	int regno = 0;
-
-	static const int regoff[] = {
-		offsetof(struct pt_regs, ax),
-		offsetof(struct pt_regs, cx),
-		offsetof(struct pt_regs, dx),
-		offsetof(struct pt_regs, bx),
-		offsetof(struct pt_regs, sp),
-		offsetof(struct pt_regs, bp),
-		offsetof(struct pt_regs, si),
-		offsetof(struct pt_regs, di),
-#ifdef CONFIG_X86_64
-		offsetof(struct pt_regs, r8),
-		offsetof(struct pt_regs, r9),
-		offsetof(struct pt_regs, r10),
-		offsetof(struct pt_regs, r11),
-		offsetof(struct pt_regs, r12),
-		offsetof(struct pt_regs, r13),
-		offsetof(struct pt_regs, r14),
-		offsetof(struct pt_regs, r15),
-#endif
-	};
-	int nr_registers = ARRAY_SIZE(regoff);
-	/*
-	 * Don't possibly decode a 32-bit instructions as
-	 * reading a 64-bit-only register.
-	 */
-	if (IS_ENABLED(CONFIG_X86_64) && !insn->x86_64)
-		nr_registers -= 8;
-
-	switch (type) {
-	case REG_TYPE_RM:
-		regno = X86_MODRM_RM(insn->modrm.value);
-		if (X86_REX_B(insn->rex_prefix.value))
-			regno += 8;
-		break;
-
-	case REG_TYPE_INDEX:
-		regno = X86_SIB_INDEX(insn->sib.value);
-		if (X86_REX_X(insn->rex_prefix.value))
-			regno += 8;
-
-		/*
-		 * If ModRM.mod != 3 and SIB.index = 4 the scale*index
-		 * portion of the address computation is null. This is
-		 * true only if REX.X is 0. In such a case, the SIB index
-		 * is used in the address computation.
-		 */
-		if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
-			return -EDOM;
-		break;
-
-	case REG_TYPE_BASE:
-		regno = X86_SIB_BASE(insn->sib.value);
-		/*
-		 * If ModRM.mod is 0 and SIB.base == 5, the base of the
-		 * register-indirect addressing is 0. In this case, a
-		 * 32-bit displacement follows the SIB byte.
-		 */
-		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
-			return -EDOM;
-
-		if (X86_REX_B(insn->rex_prefix.value))
-			regno += 8;
-		break;
-
-	default:
-		pr_err("invalid register type");
-		BUG();
-		break;
-	}
-
-	if (regno >= nr_registers) {
-		WARN_ONCE(1, "decoded an instruction with an invalid register");
-		return -EINVAL;
-	}
-	return regoff[regno];
-}
-
-/*
- * return the address being referenced be instruction
- * for rm=3 returning the content of the rm reg
- * for rm!=3 calculates the address using SIB and Disp
- */
-static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
-{
-	int addr_offset, base_offset, indx_offset;
-	unsigned long linear_addr = -1L;
-	long eff_addr, base, indx;
-	insn_byte_t sib;
-
-	insn_get_modrm(insn);
-	insn_get_sib(insn);
-	sib = insn->sib.value;
-
-	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
-		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-		if (addr_offset < 0)
-			goto out;
-
-		eff_addr = regs_get_register(regs, addr_offset);
-	} else {
-		if (insn->sib.nbytes) {
-			/*
-			 * Negative values in the base and index offset means
-			 * an error when decoding the SIB byte. Except -EDOM,
-			 * which means that the registers should not be used
-			 * in the address computation.
-			 */
-			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-			if (base_offset == -EDOM)
-				base = 0;
-			else if (base_offset < 0)
-				goto out;
-			else
-				base = regs_get_register(regs, base_offset);
-
-			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-
-			if (indx_offset == -EDOM)
-				indx = 0;
-			else if (indx_offset < 0)
-				goto out;
-			else
-				indx = regs_get_register(regs, indx_offset);
-
-			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
-		} else {
-			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-			if (addr_offset < 0)
-				goto out;
-
-			eff_addr = regs_get_register(regs, addr_offset);
-		}
-
-		eff_addr += insn->displacement.value;
-	}
-
-	linear_addr = (unsigned long)eff_addr;
-
-out:
-	return (void __user *)linear_addr;
-}
-
 static int mpx_insn_decode(struct insn *insn,
 			   struct pt_regs *regs)
 {
@@ -325,7 +173,7 @@ siginfo_t *mpx_generate_siginfo(struct pt_regs *regs)
 	info->si_signo = SIGSEGV;
 	info->si_errno = 0;
 	info->si_code = SEGV_BNDERR;
-	info->si_addr = mpx_get_addr_ref(&insn, regs);
+	info->si_addr = insn_get_addr_ref(&insn, regs);
 	/*
 	 * We were not able to extract an address from the instruction,
 	 * probably because there was something invalid in it.
-- 
2.7.4


* [PATCH v9 10/29] x86/insn-eval: Do not BUG on invalid register type
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (8 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 09/29] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-07 16:22   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 11/29] x86/insn-eval: Add a utility function to get register offsets Ricardo Neri
                   ` (18 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

We are not in a critical failure path. The invalid register type is caused
when trying to decode invalid instruction bytes from a user-space program.
Thus, simply print an error message. To prevent this warning from being
abused by user-space programs, use the rate-limited variant of pr_err()
along with a descriptive prefix.

Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index df9418c..4931d92 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -5,10 +5,14 @@
  */
 #include <linux/kernel.h>
 #include <linux/string.h>
+#include <linux/ratelimit.h>
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
 
+#undef pr_fmt
+#define pr_fmt(fmt) "insn: " fmt
+
 enum reg_type {
 	REG_TYPE_RM = 0,
 	REG_TYPE_INDEX,
@@ -85,9 +89,8 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 		break;
 
 	default:
-		pr_err("invalid register type");
-		BUG();
-		break;
+		pr_err_ratelimited("invalid register type: %d\n", type);
+		return -EINVAL;
 	}
 
 	if (regno >= nr_registers) {
-- 
2.7.4


* [PATCH v9 11/29] x86/insn-eval: Add a utility function to get register offsets
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (9 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 10/29] x86/insn-eval: Do not BUG on invalid register type Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 12/29] x86/insn-eval: Add utility function to identify string instructions Ricardo Neri
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

The function get_reg_offset() returns the offset of the register that its
argument selects, where the argument is a value of the reg_type
enumeration. Callers outside insn-eval.c would then need the definition of
that enumeration, which is unnecessary. Instead, add a helper function for
this purpose. Such a helper is useful when, for instance, the caller needs
to decide whether the operand is a register or a memory location by
looking at the rm part of the ModRM byte. As of now, this is the only
helper function that is needed.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  1 +
 arch/x86/lib/insn-eval.c         | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 5cab1b1..7e8c963 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -12,5 +12,6 @@
 #include <asm/ptrace.h>
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 4931d92..405ffeb 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -100,6 +100,23 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 	return regoff[regno];
 }
 
+/**
+ * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
+ * @insn:	Instruction containing the ModRM byte
+ * @regs:	Register values as seen when entering kernel mode
+ *
+ * Returns:
+ *
+ * The register indicated by the r/m part of the ModRM byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of ModRM does not refer to a register and shall be ignored.
+ */
+int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
+{
+	return get_reg_offset(insn, regs, REG_TYPE_RM);
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
-- 
2.7.4


* [PATCH v9 12/29] x86/insn-eval: Add utility function to identify string instructions
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (10 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 11/29] x86/insn-eval: Add a utility function to get register offsets Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

String instructions are special because, in protected mode, the linear
address is always obtained via the ES segment register in operands that
use the (E)DI register, and by default via the DS segment register in
operands that use the (E)SI register. Furthermore, segment override
prefixes are ignored when calculating a linear address involving the
(E)DI register, whereas they can be used when calculating linear
addresses involving the (E)SI register.

It follows that linear addresses are calculated differently for the case of
string instructions. The purpose of this utility function is to identify
such instructions for callers to determine a linear address correctly.

Note that this function only identifies string instructions; it does not
determine what segment register to use in the address computation. That is
left to callers. A subsequent commit introduces a function to determine
the segment register to use given the instruction, operands and
segment override prefixes.
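
To make the segment rule for string instructions concrete, here is a
small user-space sketch. It is illustrative only: the enumerations and
both helper names are hypothetical and are not part of this patch, which
only adds the opcode check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identifiers; SEG_NONE means "no override prefix found". */
enum seg_reg { SEG_NONE, SEG_CS, SEG_SS, SEG_DS, SEG_ES, SEG_FS, SEG_GS };
enum str_operand { STR_OPERAND_SI, STR_OPERAND_DI };

/* Same opcode ranges as is_string_insn(), on a raw opcode buffer. */
static bool opcode_is_string_insn(const uint8_t *opcode, size_t nbytes)
{
	uint8_t op;

	/* All string instructions have a 1-byte opcode. */
	if (nbytes != 1)
		return false;

	op = opcode[0];
	return (op >= 0x6c && op <= 0x6f) ||	/* INS, OUTS */
	       (op >= 0xa4 && op <= 0xa7) ||	/* MOVS, CMPS */
	       (op >= 0xaa && op <= 0xaf);	/* STOS, LODS, SCAS */
}

/* Segment register used by one operand of a string instruction. */
static enum seg_reg string_operand_seg(enum str_operand operand,
				       enum seg_reg override)
{
	/* (E)DI operands always use ES; override prefixes are ignored. */
	if (operand == STR_OPERAND_DI)
		return SEG_ES;

	/* (E)SI operands default to DS but honor an override prefix. */
	return override == SEG_NONE ? SEG_DS : override;
}
```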

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 405ffeb..ac7b87c 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -19,6 +19,34 @@ enum reg_type {
 	REG_TYPE_BASE,
 };
 
+/**
+ * is_string_insn() - Determine if instruction is a string instruction
+ * @insn:	Instruction containing the opcode to inspect
+ *
+ * Returns:
+ *
+ * true if the instruction, determined by the opcode, is any of the
+ * string instructions as defined in the Intel Software Development manual.
+ * False otherwise.
+ */
+static bool is_string_insn(struct insn *insn)
+{
+	insn_get_opcode(insn);
+
+	/* All string instructions have a 1-byte opcode. */
+	if (insn->opcode.nbytes != 1)
+		return false;
+
+	switch (insn->opcode.bytes[0]) {
+	case 0x6c ... 0x6f:	/* INS, OUTS */
+	case 0xa4 ... 0xa7:	/* MOVS, CMPS */
+	case 0xaa ... 0xaf:	/* STOS, LODS, SCAS */
+		return true;
+	default:
+		return false;
+	}
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 			  enum reg_type type)
 {
-- 
2.7.4


* [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (11 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 12/29] x86/insn-eval: Add utility function to identify string instructions Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-10 22:41   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 14/29] x86/insn-eval: Add utility function to get segment descriptor Ricardo Neri
                   ` (15 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

When computing a linear address and segmentation is used, we need to know
the base address of the segment involved in the computation. In most
cases, the segment base address will be zero as in USER_DS/USER32_DS.
However, it may be possible that a user space program defines its own
segments via a local descriptor table. In such a case, the segment base
address may not be zero. Thus, the segment base address is needed to
correctly calculate the linear address.

If running in protected mode, the segment selector to be used when
computing a linear address is determined either by a segment override
prefix in the instruction or, absent one, inferred from the registers
involved in the computation of the effective address; in that order. Also,
there are cases when the segment override prefixes shall be ignored (e.g.,
code segments are always selected by the CS segment register; string
instructions always use the ES segment register when using the rDI register
as operand). In long mode, segment registers are ignored, except for FS and
GS. In these two cases, base addresses are obtained from the respective
MSRs.

For clarity, this process can be split into four steps (and an equal
number of functions): determine if segment override prefixes can be used;
parse the segment override prefixes, and use them if found; if none are
found or they cannot be used, use the default segment registers associated
with the
operand registers. Once the segment register to use has been identified,
read its value to obtain the segment selector.
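
The first three steps can be sketched in plain user-space C as shown below.
This is only an illustration under stated assumptions: the enum names and the
functions overrides_allowed(), default_seg() and resolve() are hypothetical,
operand registers are modelled with an enum instead of pt_regs offsets, and
the check that rejects AX, CX and DX in 16-bit addressing is left out.

```c
#include <stdbool.h>

/* Hypothetical identifiers mirroring the INAT_SEG_REG_* values. */
enum seg_reg {
	SEG_IGNORE, SEG_DEFAULT, SEG_CS, SEG_SS, SEG_DS, SEG_ES, SEG_FS, SEG_GS
};

/* Hypothetical operand registers; the patch uses pt_regs offsets instead. */
enum op_reg {
	OP_NONE, OP_AX, OP_BX, OP_CX, OP_DX, OP_SI, OP_DI, OP_BP, OP_SP, OP_IP
};

/* Step 1: overrides are not honoured for rIP, nor for rDI in string insns. */
static bool overrides_allowed(enum op_reg reg, bool string_insn)
{
	if (reg == OP_IP)
		return false;

	return !(reg == OP_DI && string_insn);
}

/* Step 3: default segment register associated with an operand register. */
static enum seg_reg default_seg(enum op_reg reg, bool string_insn)
{
	switch (reg) {
	case OP_BP:
	case OP_SP:
		return SEG_SS;
	case OP_IP:
		return SEG_CS;
	case OP_DI:
		return string_insn ? SEG_ES : SEG_DS;
	default:
		return SEG_DS;
	}
}

/*
 * Steps 1 to 3 combined. 'override' is the outcome of step 2 (prefix
 * parsing): SEG_DEFAULT when no segment override prefix was found.
 */
static enum seg_reg resolve(enum op_reg reg, enum seg_reg override,
			    bool string_insn, bool long_mode)
{
	if (overrides_allowed(reg, string_insn) && override != SEG_DEFAULT) {
		/* In long mode only FS and GS overrides are meaningful. */
		if (long_mode && override != SEG_FS && override != SEG_GS)
			return SEG_IGNORE;
		return override;
	}

	return long_mode ? SEG_IGNORE : default_seg(reg, string_insn);
}
```

For instance, resolve(OP_DI, SEG_FS, true, false) yields SEG_ES because the FS
override is not honoured for rDI in a string instruction.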

The method to obtain the segment selector depends on several factors. In
32-bit builds, segment selectors are saved into a pt_regs structure
when switching to kernel mode. The same is also true for virtual-8086
mode. In 64-bit builds, segmentation is mostly ignored, except when
running a program in 32-bit legacy mode. In this case, CS and SS can be
obtained from pt_regs. DS, ES, FS and GS can be read directly from
the respective segment registers.

In order to identify the segment registers, a new set of #defines is
introduced. It also includes two special identifiers. One of them
indicates when the default segment register associated with instruction
operands shall be used. Another one indicates that the contents of the
segment register shall be ignored; this identifier is used when in long
mode.
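
As an illustration of the prefix-parsing step, the following user-space sketch
maps the architectural segment override prefix bytes to identifiers that carry
the same numeric values as the INAT_SEG_REG_* definitions. The names
SEG_REG_*, prefix_to_seg_reg() and find_override() are hypothetical and not
part of this patch, which relies on inat_get_opcode_attribute() rather than on
raw byte values. The sketch returns -1 where the patch returns -EINVAL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; values follow the INAT_SEG_REG_* order. */
enum {
	SEG_REG_IGNORE, SEG_REG_DEFAULT, SEG_REG_CS, SEG_REG_SS,
	SEG_REG_DS, SEG_REG_ES, SEG_REG_FS, SEG_REG_GS
};

/* Map one prefix byte to a segment register identifier. */
static int prefix_to_seg_reg(uint8_t byte)
{
	switch (byte) {
	case 0x2e: return SEG_REG_CS;
	case 0x36: return SEG_REG_SS;
	case 0x3e: return SEG_REG_DS;
	case 0x26: return SEG_REG_ES;
	case 0x64: return SEG_REG_FS;
	case 0x65: return SEG_REG_GS;
	default:   return SEG_REG_DEFAULT; /* not a segment override */
	}
}

/*
 * Scan all prefix bytes. Returns SEG_REG_DEFAULT if no override is present,
 * the overriding register if exactly one is present, and -1 if more than
 * one segment override prefix is found.
 */
static int find_override(const uint8_t *prefixes, size_t n)
{
	int idx = SEG_REG_DEFAULT;
	int found = 0;

	for (size_t i = 0; i < n; i++) {
		int r = prefix_to_seg_reg(prefixes[i]);

		if (r != SEG_REG_DEFAULT) {
			idx = r;
			found++;
		}
	}

	return found > 1 ? -1 : idx;
}
```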

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/inat.h |  10 ++
 arch/x86/lib/insn-eval.c    | 321 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 331 insertions(+)

diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
index 02aff08..1c78580 100644
--- a/arch/x86/include/asm/inat.h
+++ b/arch/x86/include/asm/inat.h
@@ -97,6 +97,16 @@
 #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
 #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
 
+/* Identifiers for segment registers */
+#define INAT_SEG_REG_IGNORE	0
+#define INAT_SEG_REG_DEFAULT	1
+#define INAT_SEG_REG_CS		2
+#define INAT_SEG_REG_SS		3
+#define INAT_SEG_REG_DS		4
+#define INAT_SEG_REG_ES		5
+#define INAT_SEG_REG_FS		6
+#define INAT_SEG_REG_GS		7
+
 /* Attribute search APIs */
 extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode);
 extern int inat_get_last_prefix_id(insn_byte_t last_pfx);
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index ac7b87c..77b48f9 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -9,6 +9,7 @@
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/vm86.h>
 
 #undef pr_fmt
 #define pr_fmt(fmt) "insn: " fmt
@@ -47,6 +48,326 @@ static bool is_string_insn(struct insn *insn)
 	}
 }
 
+/**
+ * get_overridden_seg_reg_idx() - obtain segment register override index
+ * @insn:	Instruction with segment override prefixes
+ *
+ * Inspect the instruction prefixes and find segment overrides, if any.
+ *
+ * Returns:
+ *
+ * A constant identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_DEFAULT is returned if no segment override
+ * prefixes were found.
+ *
+ * -EINVAL in case of error.
+ */
+static int get_overridden_seg_reg_idx(struct insn *insn)
+{
+	int idx = INAT_SEG_REG_DEFAULT;
+	int sel_overrides = 0, i;
+
+	if (!insn)
+		return -EINVAL;
+
+	insn_get_prefixes(insn);
+
+	/* Look for any segment override prefixes. */
+	for (i = 0; i < insn->prefixes.nbytes; i++) {
+		insn_attr_t attr;
+
+		attr = inat_get_opcode_attribute(insn->prefixes.bytes[i]);
+		switch (attr) {
+		case INAT_MAKE_PREFIX(INAT_PFX_CS):
+			idx = INAT_SEG_REG_CS;
+			sel_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_SS):
+			idx = INAT_SEG_REG_SS;
+			sel_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_DS):
+			idx = INAT_SEG_REG_DS;
+			sel_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_ES):
+			idx = INAT_SEG_REG_ES;
+			sel_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_FS):
+			idx = INAT_SEG_REG_FS;
+			sel_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_GS):
+			idx = INAT_SEG_REG_GS;
+			sel_overrides++;
+			break;
+		/* No default action needed. */
+		}
+	}
+
+	/* More than one segment override prefix leads to undefined behavior. */
+	if (sel_overrides > 1)
+		return -EINVAL;
+
+	return idx;
+}
+
+/**
+ * allow_seg_reg_overrides() - check if segment override prefixes are allowed
+ * @insn:	Instruction with segment override prefixes
+ * @regoff:	Operand offset, in pt_regs, for which the check is performed
+ *
+ * Determine if for a particular register used in register-indirect addressing
+ * segment override prefixes can be used. Specifically, no overrides are allowed
+ * for rIP as well as for rDI if used in a string instruction.
+ *
+ * Returns:
+ *
+ * 1 if segment override prefixes can be used with the register indicated
+ * in regoff. 0 if otherwise.
+ *
+ * -EINVAL in case of error.
+ */
+static int allow_seg_reg_overrides(struct insn *insn, int regoff)
+{
+	/*
+	 * Segment override prefixes should not be used for rIP. It is not
+	 * necessary to inspect the instruction structure.
+	 */
+	if (regoff == offsetof(struct pt_regs, ip))
+		return 0;
+
+	/* Subsequent checks require a valid insn. */
+	if (!insn)
+		return -EINVAL;
+
+	if (regoff == offsetof(struct pt_regs, di) && is_string_insn(insn))
+		return 0;
+
+	return 1;
+}
+
+/**
+ * resolve_seg_reg() - obtain segment register index
+ * @insn:	Instruction with operands
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Operand offset, in pt_regs, used to determine segment register
+ *
+ * Determine the segment register associated with the operands and, if
+ * applicable, prefixes and the instruction pointed by insn.
+ *
+ * The segment register associated to an operand used in register-indirect
+ * addressing depends on:
+ *
+ * a) Whether running in long mode (in such a case segments are ignored, except
+ * if FS or GS are used).
+ *
+ * b) Whether segment override prefixes can be used. Certain instructions and
+ *    registers do not allow override prefixes.
+ *
+ * c) Whether segment override prefixes are found in the instruction prefixes.
+ *
+ * d) The default segment register associated with the operand register.
+ *
+ * The function checks first if segment override prefixes can be used with the
+ * register indicated by regoff. If allowed, obtain such overridden segment.
+ * Lastly, if no prefixes were found, resolve the segment register index to use
+ * based on the defaults described in the Intel documentation. All segment
+ * register indexes will be ignored, except if overrides were found for FS or
+ * GS.
+ *
+ * The operand register, regoff, is represented as the offset from the base of
+ * pt_regs.
+ *
+ * Please note that this function does not return the value in the segment
+ * register (i.e., the segment selector) but our defined index. The segment
+ * selector needs to be obtained using get_segment_selector() and passing the
+ * segment register index resolved by this function.
+ *
+ * Returns:
+ *
+ * An index identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_IGNORE is returned if running in long mode.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
+{
+	int use_pfx_overrides, idx;
+
+	use_pfx_overrides = allow_seg_reg_overrides(insn, regoff);
+	if (use_pfx_overrides < 0)
+		return use_pfx_overrides;
+
+	if (use_pfx_overrides == 0)
+		goto resolve_default_idx;
+
+	if (!insn)
+		return -EINVAL;
+
+	idx = get_overridden_seg_reg_idx(insn);
+	if (idx < 0)
+		return idx;
+
+	if (idx == INAT_SEG_REG_DEFAULT)
+		goto resolve_default_idx;
+
+	/*
+	 * In long mode, segment override prefixes are ignored, except for
+	 * overrides for FS and GS.
+	 */
+	if (user_64bit_mode(regs)) {
+		if (idx != INAT_SEG_REG_FS &&
+		    idx != INAT_SEG_REG_GS)
+			idx = INAT_SEG_REG_IGNORE;
+	}
+
+	return idx;
+
+resolve_default_idx:
+
+	if (user_64bit_mode(regs))
+		return INAT_SEG_REG_IGNORE;
+	/*
+	 * If we are here, we use the default segment register as described
+	 * in the Intel documentation:
+	 *
+	 *  + DS for all references involving r[ABCD]X, and rSI.
+	 *  + If used in a string instruction, ES for rDI. Otherwise, DS.
+	 *  + AX, CX and DX are not valid register operands in 16-bit address
+	 *    encodings but are valid for 32-bit and 64-bit encodings.
+	 *  + -EDOM is reserved to identify cases in which no register
+	 *    is used (i.e., displacement-only addressing). Use DS.
+	 *  + SS for (E)SP or (E)BP.
+	 *  + CS for (E)IP.
+	 */
+
+	switch (regoff) {
+	case offsetof(struct pt_regs, ax):
+	case offsetof(struct pt_regs, cx):
+	case offsetof(struct pt_regs, dx):
+		/* Need insn to verify address size. */
+		if (insn->addr_bytes == 2)
+			return -EINVAL;
+		/* fall through */
+	case -EDOM:
+	case offsetof(struct pt_regs, bx):
+	case offsetof(struct pt_regs, si):
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, di):
+		if (is_string_insn(insn))
+			return INAT_SEG_REG_ES;
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, bp):
+	case offsetof(struct pt_regs, sp):
+		return INAT_SEG_REG_SS;
+
+	case offsetof(struct pt_regs, ip):
+		return INAT_SEG_REG_CS;
+
+	default:
+		return -EINVAL;
+	}
+}
+
+/**
+ * get_segment_selector() - obtain segment selector
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Segment register index to use
+ *
+ * Obtain the segment selector from any of the CS, SS, DS, ES, FS, GS segment
+ * registers. In CONFIG_X86_32, the segment is obtained from either pt_regs or
+ * kernel_vm86_regs as applicable. In CONFIG_X86_64, CS and SS are obtained
+ * from pt_regs. DS, ES, FS and GS are obtained by reading the actual CPU
+ * registers. This is done only for completeness as in CONFIG_X86_64 segment
+ * registers are ignored.
+ *
+ * Returns:
+ *
+ * Value of the segment selector, including null when running in
+ * long mode.
+ *
+ * -EINVAL on error.
+ */
+static short get_segment_selector(struct pt_regs *regs, int seg_reg_idx)
+{
+#ifdef CONFIG_X86_64
+	unsigned short sel;
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_IGNORE:
+		return 0;
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		savesegment(ds, sel);
+		return sel;
+	case INAT_SEG_REG_ES:
+		savesegment(es, sel);
+		return sel;
+	case INAT_SEG_REG_FS:
+		savesegment(fs, sel);
+		return sel;
+	case INAT_SEG_REG_GS:
+		savesegment(gs, sel);
+		return sel;
+	default:
+		return -EINVAL;
+	}
+#else /* CONFIG_X86_32 */
+	struct kernel_vm86_regs *vm86regs = (struct kernel_vm86_regs *)regs;
+
+	if (v8086_mode(regs)) {
+		switch (seg_reg_idx) {
+		case INAT_SEG_REG_CS:
+			return (unsigned short)(regs->cs & 0xffff);
+		case INAT_SEG_REG_SS:
+			return (unsigned short)(regs->ss & 0xffff);
+		case INAT_SEG_REG_DS:
+			return vm86regs->ds;
+		case INAT_SEG_REG_ES:
+			return vm86regs->es;
+		case INAT_SEG_REG_FS:
+			return vm86regs->fs;
+		case INAT_SEG_REG_GS:
+			return vm86regs->gs;
+		case INAT_SEG_REG_IGNORE:
+			/* fall through */
+		default:
+			return -EINVAL;
+		}
+	}
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		return (unsigned short)(regs->ds & 0xffff);
+	case INAT_SEG_REG_ES:
+		return (unsigned short)(regs->es & 0xffff);
+	case INAT_SEG_REG_FS:
+		return (unsigned short)(regs->fs & 0xffff);
+	case INAT_SEG_REG_GS:
+		/*
+		 * GS may or may not be in regs as per CONFIG_X86_32_LAZY_GS.
+		 * The macro below takes care of both cases.
+		 */
+		return get_user_gs(regs);
+	case INAT_SEG_REG_IGNORE:
+		/* fall through */
+	default:
+		return -EINVAL;
+	}
+#endif /* CONFIG_X86_64 */
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 			  enum reg_type type)
 {
-- 
2.7.4


* [PATCH v9 14/29] x86/insn-eval: Add utility function to get segment descriptor
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (12 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-11 14:57   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit Ricardo Neri
                   ` (14 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

The segment descriptor contains information that is relevant to how linear
addresses need to be computed. It contains the default size of addresses
as well as the base address of the segment. Thus, given a segment
selector, we ought to look at the segment descriptor to correctly calculate
the linear address.

In protected mode, the segment selector might indicate a segment
descriptor from either the global descriptor table or a local descriptor
table. Both cases are considered in this function.
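
For reference, a segment selector packs three fields: the requested privilege
level in bits [1:0], the table indicator in bit 2 (0 for the GDT, 1 for the
LDT) and the descriptor index in bits [15:3]. The helpers below are a
user-space sketch with hypothetical names; they only decode the selector and
do not touch any descriptor table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits [1:0]: requested privilege level. */
static unsigned int sel_rpl(uint16_t sel)
{
	return sel & 0x3;
}

/* Bit 2: table indicator, set when the selector refers to the LDT. */
static bool sel_is_ldt(uint16_t sel)
{
	return (sel & 0x4) != 0;
}

/* Bits [15:3]: index of the descriptor within the selected table. */
static unsigned int sel_index(uint16_t sel)
{
	return sel >> 3;
}

/*
 * Byte offset of the descriptor from the table base. Descriptors are
 * 8 bytes long, so clearing bits [2:0] equals index * 8.
 */
static unsigned int sel_table_offset(uint16_t sel)
{
	return sel & ~0x7u;
}
```

With the selector 0x2b, for example, the index is 5, the table is the GDT, the
RPL is 3 and the descriptor sits 40 bytes from the table base.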

This function is a prerequisite for functions in subsequent commits that
will obtain the aforementioned attributes of the segment descriptor.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 77b48f9..d599dc3 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -6,9 +6,13 @@
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/ratelimit.h>
+#include <linux/mmu_context.h>
+#include <asm/desc_defs.h>
+#include <asm/desc.h>
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/ldt.h>
 #include <asm/vm86.h>
 
 #undef pr_fmt
@@ -450,6 +454,59 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_desc() - Obtain address of segment descriptor
+ * @sel:	Segment selector
+ *
+ * Given a segment selector, obtain a pointer to the segment descriptor.
+ * Both global and local descriptor tables are supported.
+ *
+ * Returns:
+ *
+ * Pointer to segment descriptor on success.
+ *
+ * NULL on error.
+ */
+static struct desc_struct *get_desc(unsigned short sel)
+{
+	struct desc_ptr gdt_desc = {0, 0};
+	unsigned long desc_base;
+
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+	struct desc_struct *desc = NULL;
+	struct ldt_struct *ldt;
+
+	if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+		/* Bits [15:3] contain the index of the desired entry. */
+		sel >>= 3;
+
+		mutex_lock(&current->active_mm->context.lock);
+		ldt = current->active_mm->context.ldt;
+		if (ldt && sel < ldt->nr_entries)
+			desc = &ldt->entries[sel];
+
+		mutex_unlock(&current->active_mm->context.lock);
+
+		return desc;
+	}
+#endif
+	native_store_gdt(&gdt_desc);
+
+	/*
+	 * Segment descriptors have a size of 8 bytes. Thus, the index is
+	 * multiplied by 8 to obtain the memory offset of the desired descriptor
+	 * from the base of the GDT. As bits [15:3] of the segment selector
+	 * contain the index, it can be regarded as multiplied by 8 already.
+	 * All that remains is to clear bits [2:0].
+	 */
+	desc_base = sel & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
+
+	if (desc_base > gdt_desc.size)
+		return NULL;
+
+	return (struct desc_struct *)(gdt_desc.address + desc_base);
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode
-- 
2.7.4


* [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (13 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 14/29] x86/insn-eval: Add utility function to get segment descriptor Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-11 15:15   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 16/29] x86/insn-eval: Add function to get default params of code segment Ricardo Neri
                   ` (13 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

With segmentation, the base address of the segment is needed to compute a
linear address. This base address is obtained from the applicable segment
descriptor. This segment descriptor is referenced by a segment selector.
These new functions obtain the base and limit of the segment whose
selector is held in the segment register indicated by the index given as
argument. This index is any of the INAT_SEG_REG_* family of #defines.

The logic to obtain the segment selector is wrapped in the function
get_segment_selector() with the inputs described above. Once the selector is
known, the base address is determined. In protected mode, the selector is
used to obtain the segment descriptor and then its base address. In 64-bit
user mode, the segment base address is zero except when FS or GS are used.
In virtual-8086 mode, the base address is computed as the value of the
segment selector shifted 4 positions to the left.
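
The virtual-8086 case is plain arithmetic and can be sketched as follows. The
function names are hypothetical and the code is a user-space illustration, not
the kernel implementation.

```c
#include <stdint.h>

/* Segment base in virtual-8086 mode: the selector shifted left by 4. */
static uint32_t v86_seg_base(uint16_t sel)
{
	return (uint32_t)sel << 4;
}

/* Linear address of sel:off before any address-size truncation. */
static uint32_t v86_linear_addr(uint16_t sel, uint16_t off)
{
	return v86_seg_base(sel) + off;
}
```

A selector of 0x1234 therefore gives a base of 0x12340, and 0x1234:0x0010
resolves to 0x12350.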

In protected mode, segment limits are enforced. Thus, a function to
determine the limit of the segment is added. Segment limits are not
enforced in long mode or virtual-8086 mode. For the latter, addresses are
limited to 20 bits; address size will be handled when computing the linear
address.
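
In protected mode, the limit stored in the descriptor is 20 bits wide and is
scaled when the granularity bit is set, as get_seg_limit() in this patch does.
The helper below is a stand-alone sketch of that scaling; effective_limit() is
a hypothetical name and takes the raw fields as arguments instead of reading a
struct desc_struct.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert the 20-bit limit field of a descriptor into the offset of the
 * last valid byte. With the granularity bit set, the field counts 4 KiB
 * units and the low 12 bits of an offset are not checked.
 */
static uint32_t effective_limit(uint32_t raw_limit, bool granularity)
{
	raw_limit &= 0xfffff;	/* only 20 bits are stored */

	if (granularity)
		return (raw_limit << 12) | 0xfff;

	return raw_limit;
}
```

A flat 4 GiB segment has a raw limit of 0xfffff with the granularity bit set,
which expands to 0xffffffff.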

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |   1 +
 arch/x86/lib/insn-eval.c         | 117 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 7e8c963..25d6e44 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -13,5 +13,6 @@
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
+unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index d599dc3..02c4498 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -507,6 +507,123 @@ static struct desc_struct *get_desc(unsigned short sel)
 }
 
 /**
+ * insn_get_seg_base() - Obtain base address of segment descriptor.
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Index of the segment register pointing to seg descriptor
+ *
+ * Obtain the base address of the segment as indicated by the segment descriptor
+ * pointed by the segment selector. The segment selector is obtained from the
+ * input segment register index seg_reg_idx.
+ *
+ * Returns:
+ *
+ * In protected mode, base address of the segment. Zero in long mode,
+ * except when FS or GS are used. In virtual-8086 mode, the segment
+ * selector shifted 4 positions to the left.
+ *
+ * -1L in case of error.
+ */
+unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
+{
+	struct desc_struct *desc;
+	short sel;
+
+	sel = get_segment_selector(regs, seg_reg_idx);
+	if (sel < 0)
+		return -1L;
+
+	if (v8086_mode(regs))
+		/*
+		 * Base is simply the segment selector shifted 4
+		 * positions to the left.
+		 */
+		return (unsigned long)(sel << 4);
+
+	if (user_64bit_mode(regs)) {
+		/*
+		 * Only FS or GS will have a base address, the rest of
+		 * the segments' bases are forced to 0.
+		 */
+		unsigned long base;
+
+		if (seg_reg_idx == INAT_SEG_REG_FS)
+			rdmsrl(MSR_FS_BASE, base);
+		else if (seg_reg_idx == INAT_SEG_REG_GS)
+			/*
+			 * swapgs was called at the kernel entry point. Thus,
+			 * MSR_KERNEL_GS_BASE will have the user-space GS base.
+			 */
+			rdmsrl(MSR_KERNEL_GS_BASE, base);
+		else if (seg_reg_idx != INAT_SEG_REG_IGNORE)
+			/* We should ignore the rest of segment registers. */
+			base = -1L;
+		else
+			base = 0;
+		return base;
+	}
+
+	/* In protected mode the segment selector cannot be null. */
+	if (!sel)
+		return -1L;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return -1L;
+
+	return get_desc_base(desc);
+}
+
+/**
+ * get_seg_limit() - Obtain the limit of a segment descriptor
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Index of the segment register pointing to seg descriptor
+ *
+ * Obtain the limit of the segment as indicated by the segment descriptor
+ * pointed by the segment selector. The segment selector is obtained from the
+ * input segment register index seg_reg_idx.
+ *
+ * Returns:
+ *
+ * In protected mode, the limit of the segment descriptor in bytes.
+ * In long mode and virtual-8086 mode, segment limits are not enforced. Thus,
+ * limit is returned as -1L to imply a limit-less segment.
+ *
+ * Zero is returned on error.
+ */
+static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
+{
+	struct desc_struct *desc;
+	unsigned long limit;
+	short sel;
+
+	sel = get_segment_selector(regs, seg_reg_idx);
+	if (sel < 0)
+		return 0;
+
+	if (user_64bit_mode(regs) || v8086_mode(regs))
+		return -1L;
+
+	if (!sel)
+		return 0;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return 0;
+
+	/*
+	 * If the granularity bit is set, the limit is given in multiples
+	 * of 4096. This also means that the 12 least significant bits are
+	 * not tested when checking the segment limits. In practice,
+	 * this means that the segment ends in (limit << 12) + 0xfff.
+	 */
+	limit = get_desc_limit(desc);
+	if (desc->g)
+		limit = (limit << 12) + 0xfff;
+
+	return limit;
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode
-- 
2.7.4


* [PATCH v9 16/29] x86/insn-eval: Add function to get default params of code segment
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (14 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-12 16:31   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 17/29] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b Ricardo Neri
                   ` (12 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Obtain the default values of the address and operand sizes as specified in
the D and L bits of the segment descriptor selected by the register
CS. The function can be used for both protected and long modes.
For virtual-8086 mode, the default address and operand sizes are always 2
bytes.

The returned parameters are encoded in a signed 8-bit data type. Auxiliary
macros are provided to encode and decode such values.
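
The packing can be tried out in user space with the sketch below. The
CODE_SEG_* macros are local, hypothetical copies of the ones added by this
patch with their arguments parenthesised, and code_seg_defaults() takes the L
and D bits directly instead of reading a descriptor. It returns -1 where the
patch returns -EINVAL. The sketch returns int because the packed value for
64-bit mode is 0x84, which has bit 7 set and would read as negative in a
signed 8-bit type.

```c
/* Operand size lives in bits [3:0], address size in bits [7:4]. */
#define CODE_SEG_PARAMS(oper_sz, addr_sz) ((oper_sz) | ((addr_sz) << 4))
#define CODE_SEG_ADDR_SZ(params) (((params) >> 4) & 0xf)
#define CODE_SEG_OPND_SZ(params) ((params) & 0xf)

/*
 * Map the L and D bits of a code segment descriptor to the default
 * operand and address sizes, in bytes. Returns -1 for the reserved
 * combination L=1, D=1.
 */
static int code_seg_defaults(unsigned int l, unsigned int d)
{
	switch ((l << 1) | d) {
	case 0:	/* 16-bit code segment */
		return CODE_SEG_PARAMS(2, 2);
	case 1:	/* 32-bit code segment */
		return CODE_SEG_PARAMS(4, 4);
	case 2:	/* 64-bit mode: 32-bit operands, 64-bit addresses */
		return CODE_SEG_PARAMS(4, 8);
	default:
		return -1;
	}
}
```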

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  5 ++++
 arch/x86/lib/insn-eval.c         | 64 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 25d6e44..a5886ecc 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -11,8 +11,13 @@
 #include <linux/err.h>
 #include <asm/ptrace.h>
 
+#define INSN_CODE_SEG_ADDR_SZ(params) ((params >> 4) & 0xf)
+#define INSN_CODE_SEG_OPND_SZ(params) (params & 0xf)
+#define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4))
+
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
 unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx);
+char insn_get_code_seg_defaults(struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 02c4498..cb2734a 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -624,6 +624,70 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
 }
 
 /**
+ * insn_get_code_seg_defaults() - Obtain code segment default parameters
+ * @regs:	Structure with register values as seen when entering kernel mode
+ *
+ * Obtain the default parameters of the code segment: address and operand sizes.
+ * The code segment is obtained from the selector contained in the CS register
+ * in regs. In protected mode, the default sizes are determined by inspecting
+ * the L and D bits of the segment descriptor. In virtual-8086 mode, the default
+ * is always two bytes for both address and operand sizes.
+ *
+ * Returns:
+ *
+ * A signed 8-bit value containing the default parameters on success.
+ *
+ * -EINVAL on error.
+ */
+char insn_get_code_seg_defaults(struct pt_regs *regs)
+{
+	struct desc_struct *desc;
+	short sel;
+
+	if (v8086_mode(regs))
+		/* Address and operand size are both 16-bit. */
+		return INSN_CODE_SEG_PARAMS(2, 2);
+
+	sel = get_segment_selector(regs, INAT_SEG_REG_CS);
+	if (sel < 0)
+		return -EINVAL;
+
+	desc = get_desc(sel);
+	if (!desc)
+		return -EINVAL;
+
+	/*
+	 * The most significant bit of the Type field of the segment descriptor
+	 * determines whether a segment contains data or code. If this is a data
+	 * segment, return error.
+	 */
+	if (!(desc->type & BIT(3)))
+		return -EINVAL;
+
+	switch ((desc->l << 1) | desc->d) {
+	case 0: /*
+		 * Legacy mode. CS.L=0, CS.D=0. Address and operand size are
+		 * both 16-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(2, 2);
+	case 1: /*
+		 * Legacy mode. CS.L=0, CS.D=1. Address and operand size are
+		 * both 32-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(4, 4);
+	case 2: /*
+		 * IA-32e 64-bit mode. CS.L=1, CS.D=0. Address size is 64-bit;
+		 * operand size is 32-bit.
+		 */
+		return INSN_CODE_SEG_PARAMS(4, 8);
+	case 3: /* Invalid setting. CS.L=1, CS.D=1 */
+		/* fall through */
+	default:
+		return -EINVAL;
+	}
+}
+
+/**
  * insn_get_modrm_rm_off() - Obtain register in r/m part of the ModRM byte
  * @insn:	Instruction containing the ModRM byte
  * @regs:	Register values as seen when entering kernel mode
-- 
2.7.4


* [PATCH v9 17/29] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (15 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 16/29] x86/insn-eval: Add function to get default params of code segment Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-20 15:44   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 18/29] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
                   ` (11 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod is zero and
ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
that none of the registers are used in the computation of the effective
address. A return value of -EDOM indicates to callers that they should not
use the value of registers when computing the effective address for the
instruction.

In long mode, the effective address is given by the 32-bit displacement
plus the location of the next instruction. In protected mode, only the
displacement is used.

The instruction decoder takes care of obtaining the displacement.
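
As an illustration only, the displacement-only case can be sketched in
user-space C. This is not code from the patch: modrm_mod(), modrm_rm(),
modrm_is_disp32_only() and disp32_eff_addr() are hypothetical names, and the
is_64bit_mode flag stands in for the user_64bit_mode() check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's X86_MODRM_MOD()/X86_MODRM_RM(). */
static unsigned int modrm_mod(uint8_t modrm)
{
	return (modrm >> 6) & 0x3;
}

static unsigned int modrm_rm(uint8_t modrm)
{
	return modrm & 0x7;
}

/* Non-zero when no register takes part in the address (mod=00b, rm=101b). */
static int modrm_is_disp32_only(uint8_t modrm)
{
	return modrm_mod(modrm) == 0 && modrm_rm(modrm) == 5;
}

/*
 * Effective address of the displacement-only form. next_ip is the address
 * of the instruction that follows the decoded one (ip + instruction length).
 */
static int64_t disp32_eff_addr(uint8_t modrm, int32_t disp, uint64_t next_ip,
			       int is_64bit_mode)
{
	assert(modrm_is_disp32_only(modrm));

	/* 64-bit mode: the displacement is relative to the next instruction. */
	if (is_64bit_mode)
		return (int64_t)next_ip + disp;

	/* Otherwise only the displacement is used. */
	return disp;
}
```

For example, the ModRM byte 0x05 (mod = 00b, rm = 101b) with a displacement
of -16 and a next-instruction address of 0x1000 resolves to 0xff0 in 64-bit
mode and to the bare displacement otherwise.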

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index cb2734a..dd84819 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -408,6 +408,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 	switch (type) {
 	case REG_TYPE_RM:
 		regno = X86_MODRM_RM(insn->modrm.value);
+
+		/*
+		 * ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit displacement
+		 * follows the ModRM byte.
+		 */
+		if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+			return -EDOM;
+
 		if (X86_REX_B(insn->rex_prefix.value))
 			regno += 8;
 		break;
@@ -754,10 +762,21 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-			if (addr_offset < 0)
+			/*
+			 * -EDOM means that we must ignore the address_offset.
+			 * In such a case, in 64-bit mode the effective address
+			 * is relative to the RIP of the following instruction.
+			 */
+			if (addr_offset == -EDOM) {
+				if (user_64bit_mode(regs))
+					eff_addr = (long)regs->ip + insn->length;
+				else
+					eff_addr = 0;
+			} else if (addr_offset < 0) {
 				goto out;
-
-			eff_addr = regs_get_register(regs, addr_offset);
+			} else {
+				eff_addr = regs_get_register(regs, addr_offset);
+			}
 		}
 
 		eff_addr += insn->displacement.value;
-- 
2.7.4

* [PATCH v9 18/29] x86/insn-eval: Incorporate segment base in linear address computation
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (16 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 17/29] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-20 16:08   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings Ricardo Neri
                   ` (10 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

insn_get_addr_ref() returns the effective address as defined in
Section 3.7.5.1, Volume 1 of the Intel 64 and IA-32 Architectures Software
Developer's Manual. In order to compute the linear address, we must add
to the effective address the segment base address as set in the segment
descriptor. The segment descriptor to use depends on the register used as
operand and segment override prefixes, if any.

In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
segment is used or if segmentation is not used. However, the base address
is not necessarily zero if a user program defines its own segments. This
is possible by using a local descriptor table.

Since the effective address is a signed quantity, the unsigned segment
base address is saved in a separate variable and added to the final,
unsigned, effective address.
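
A minimal sketch of that last step, assuming fixed-width types in place of
the kernel's long and unsigned long; linear_address() is a hypothetical
helper and not part of this patch:

```c
#include <stdint.h>

/*
 * Combine a signed effective address with an unsigned segment base. The
 * cast makes the conversion to an unsigned quantity explicit before the
 * addition, so a negative effective address wraps instead of being
 * sign-mixed with the base.
 */
static uint64_t linear_address(int64_t eff_addr, uint64_t seg_base)
{
	return (uint64_t)eff_addr + seg_base;
}
```

With a zero base the linear address equals the effective address. With a
non-zero base, such as one set up through a local descriptor table, the base
is simply added on top.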

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index dd84819..b3aa891 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -719,8 +719,8 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
  */
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-	int addr_offset, base_offset, indx_offset;
-	unsigned long linear_addr = -1L;
+	int addr_offset, base_offset, indx_offset, seg_reg_indx;
+	unsigned long linear_addr = -1L, seg_base_addr;
 	long eff_addr, base, indx;
 	insn_byte_t sib;
 
@@ -734,6 +734,14 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			goto out;
 
 		eff_addr = regs_get_register(regs, addr_offset);
+
+		seg_reg_indx = resolve_seg_reg(insn, regs, addr_offset);
+		if (seg_reg_indx < 0)
+			goto out;
+
+		seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
+		if (seg_base_addr == -1L)
+			goto out;
 	} else {
 		if (insn->sib.nbytes) {
 			/*
@@ -760,6 +768,14 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 				indx = regs_get_register(regs, indx_offset);
 
 			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+
+			seg_reg_indx = resolve_seg_reg(insn, regs, base_offset);
+			if (seg_reg_indx < 0)
+				goto out;
+
+			seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
+			if (seg_base_addr == -1L)
+				goto out;
 		} else {
 			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 			/*
@@ -777,12 +793,20 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 			} else {
 				eff_addr = regs_get_register(regs, addr_offset);
 			}
+
+			seg_reg_indx = resolve_seg_reg(insn, regs, addr_offset);
+			if (seg_reg_indx < 0)
+				goto out;
+
+			seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
+			if (seg_base_addr == -1L)
+				goto out;
 		}
 
 		eff_addr += insn->displacement.value;
 	}
 
-	linear_addr = (unsigned long)eff_addr;
+	linear_addr = (unsigned long)eff_addr + seg_base_addr;
 
 out:
 	return (void __user *)linear_addr;
-- 
2.7.4

* [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (17 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 18/29] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-20 17:12   ` Borislav Petkov
  2017-10-04  3:54 ` [PATCH v9 20/29] x86/insn-eval: Add wrapper function for 32 and 64-bit addresses Ricardo Neri
                   ` (9 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

32-bit and 64-bit address encodings are identical. Thus, the same logic
could be used to resolve the effective address. However, there are two key
differences: address size and enforcement of segment limits.

If running a 32-bit process on a 64-bit kernel, it is best to perform
the address calculation using 32-bit data types. In this manner hardware
is used for the arithmetic, including handling of signs and overflows.

32-bit addresses are generally used in protected mode; segment limits are
enforced in this mode. This implementation obtains the limit of the
segment associated with the instruction operands and prefixes. If the
computed address is outside the segment limits, an error is returned. It
is also possible to use 32-bit addresses in long mode and virtual-8086 mode
by using an address override prefix. In such cases, segment limits are not
enforced.

The new function get_addr_ref_32() is almost identical to the existing
function insn_get_addr_ref() (used for 64-bit addresses), except for the
differences mentioned above. For the sake of simplicity and readability,
it is better to use two separate functions.
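
The two differences can be sketched in user-space C as follows. This is an
illustration under stated assumptions, not the code in this patch:
linear_addr_32() and ADDR32_ERROR are hypothetical names, the register
values are passed in directly, and the arithmetic uses uint32_t (rather than
int, as the patch does) so that 32-bit wraparound is well defined in
standard C.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR32_ERROR UINT64_MAX	/* hypothetical marker, plays the role of -1L */

static uint64_t linear_addr_32(int32_t base, int32_t index,
			       unsigned int sib_scale, int32_t disp,
			       uint64_t seg_base, uint64_t seg_limit,
			       int is_64bit_mode)
{
	uint32_t eff_addr;

	/* The SIB scale field is two bits wide. */
	assert(sib_scale <= 3);

	/* All arithmetic is done modulo 2^32, as for a 32-bit address. */
	eff_addr = (uint32_t)base + ((uint32_t)index << sib_scale) +
		   (uint32_t)disp;

	/* Segment limits are only enforced outside of 64-bit mode. */
	if (!is_64bit_mode && eff_addr > seg_limit)
		return ADDR32_ERROR;

	/* Zero-extend, never sign-extend, before adding the segment base. */
	return (uint64_t)eff_addr + seg_base;
}
```

A base of 0x2000 with a segment limit of 0x1fff yields an error outside of
64-bit mode, while the same operands in 64-bit mode resolve to 0x2000 because
the limit is not checked.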

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 160 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 160 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index b3aa891..945c9b7 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -712,6 +712,166 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
 	return get_reg_offset(insn, regs, REG_TYPE_RM);
 }
 
+/**
+ * get_addr_ref_32() - Obtain a 32-bit linear address
+ * @insn:	Instruction with ModRM, SIB bytes and displacement
+ * @regs:	Register values as seen when entering kernel mode
+ *
+ * This function is to be used with 32-bit address encodings to obtain the
+ * linear memory address referred to by the instruction's ModRM, SIB,
+ * displacement bytes and segment base address, as applicable. If in protected
+ * mode, segment limits are enforced.
+ *
+ * Returns:
+ *
+ * Linear address referenced by instruction and registers on success.
+ *
+ * -1L on error.
+ */
+static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs)
+{
+	int eff_addr, base, indx, addr_offset, base_offset, indx_offset;
+	unsigned long linear_addr = -1L, seg_base_addr, seg_limit, tmp;
+	int seg_reg_indx;
+	insn_byte_t sib;
+
+	insn_get_modrm(insn);
+	insn_get_sib(insn);
+	sib = insn->sib.value;
+
+	if (insn->addr_bytes != 4)
+		goto out;
+
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
+		if (addr_offset < 0)
+			goto out;
+
+		tmp = regs_get_register(regs, addr_offset);
+		/* The 4 most significant bytes must be zero. */
+		if (tmp & ~0xffffffffL)
+			goto out;
+
+		eff_addr = (int)(tmp & 0xffffffff);
+
+		seg_reg_indx = resolve_seg_reg(insn, regs, addr_offset);
+		if (seg_reg_indx < 0)
+			goto out;
+
+		seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
+		if (seg_base_addr == -1L)
+			goto out;
+
+		seg_limit = get_seg_limit(regs, seg_reg_indx);
+	} else {
+		if (insn->sib.nbytes) {
+			/*
+			 * Negative values in the base and index offsets mean
+			 * an error when decoding the SIB byte. Except -EDOM,
+			 * which means that the registers should not be used
+			 * in the address computation.
+			 */
+			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
+			if (base_offset == -EDOM) {
+				base = 0;
+			} else if (base_offset < 0) {
+				goto out;
+			} else {
+				tmp = regs_get_register(regs, base_offset);
+				/* The 4 most significant bytes must be zero. */
+				if (tmp & ~0xffffffffL)
+					goto out;
+
+				base = (int)(tmp & 0xffffffff);
+			}
+
+			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
+			if (indx_offset == -EDOM) {
+				indx = 0;
+			} else if (indx_offset < 0) {
+				goto out;
+			} else {
+				tmp = regs_get_register(regs, indx_offset);
+				/* The 4 most significant bytes must be zero. */
+				if (tmp & ~0xffffffffL)
+					goto out;
+
+				indx = (int)(tmp & 0xffffffff);
+			}
+
+			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+
+			seg_reg_indx = resolve_seg_reg(insn, regs, base_offset);
+			if (seg_reg_indx < 0)
+				goto out;
+
+			seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
+			if (seg_base_addr == -1L)
+				goto out;
+
+			seg_limit = get_seg_limit(regs, seg_reg_indx);
+		} else {
+			addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
+
+			/*
+			 * -EDOM means that we must ignore the address_offset.
+			 * In such a case, in 64-bit mode the effective address
+			 * is relative to the RIP of the following instruction.
+			 */
+			if (addr_offset == -EDOM) {
+				if (user_64bit_mode(regs))
+					eff_addr = (long)regs->ip + insn->length;
+				else
+					eff_addr = 0;
+			} else if (addr_offset < 0) {
+				goto out;
+			} else {
+				tmp = regs_get_register(regs, addr_offset);
+				/* The 4 most significant bytes must be zero. */
+				if (tmp & ~0xffffffffL)
+					goto out;
+
+				eff_addr = (int)(tmp & 0xffffffff);
+			}
+
+			seg_reg_indx = resolve_seg_reg(insn, regs, addr_offset);
+			if (seg_reg_indx < 0)
+				goto out;
+
+			seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
+			if (seg_base_addr == -1L)
+				goto out;
+
+			seg_limit = get_seg_limit(regs, seg_reg_indx);
+		}
+		eff_addr += insn->displacement.value;
+	}
+
+	/*
+	 * In protected mode, before computing the linear address, make sure
+	 * the effective address is within the limits of the segment.
+	 * 32-bit addresses can be used in long and virtual-8086 modes if an
+	 * address override prefix is used. In such cases, segment limits are
+	 * not enforced. When in virtual-8086 mode, the segment limit is -1L
+	 * to reflect this situation.
+	 *
+	 * Once computed, the effective address is treated as an unsigned
+	 * quantity.
+	 */
+	if (!user_64bit_mode(regs) && ((unsigned int)eff_addr > seg_limit))
+		goto out;
+
+	/*
+	 * Data type long could be 64 bits in size. Ensure that our 32-bit
+	 * effective address is not sign-extended when computing the linear
+	 * address.
+	 */
+	linear_addr = (unsigned long)(eff_addr & 0xffffffff) + seg_base_addr;
+
+out:
+	return (void __user *)linear_addr;
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
-- 
2.7.4

* [PATCH v9 20/29] x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (18 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 21/29] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode Ricardo Neri
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

The function insn_get_addr_ref() is capable of handling only 64-bit
addresses. A previous commit introduced a function to handle 32-bit
addresses. Invoke these two functions from a third wrapper function that
calls the appropriate routine based on the address size specified in the
instruction structure (obtained by looking at the code segment default
address size and the address override prefix, if present).

While doing this, rename the original function insn_get_addr_ref() to
the more appropriate name get_addr_ref_64() and ensure it is only used
for 64-bit addresses.

Also, since 64-bit addresses are not possible in 32-bit builds, provide
a dummy function for such a case.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 57 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 52 insertions(+), 5 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 945c9b7..1d510a6 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -872,12 +872,28 @@ static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs)
 	return (void __user *)linear_addr;
 }
 
-/*
- * return the address being referenced be instruction
- * for rm=3 returning the content of the rm reg
- * for rm!=3 calculates the address using SIB and Disp
+/**
+ * get_addr_ref_64() - Obtain a 64-bit linear address
+ * @insn:	Instruction struct with ModRM and SIB bytes and displacement
+ * @regs:	Structure with register values as seen when entering kernel mode
+ *
+ * This function is to be used with 64-bit address encodings to obtain the
+ * linear memory address referred to by the instruction's ModRM, SIB,
+ * displacement bytes and segment base address, as applicable.
+ *
+ * Returns:
+ *
+ * Linear address referenced by instruction and registers on success.
+ *
+ * -1L on error.
  */
-void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+#ifndef CONFIG_X86_64
+static void __user *get_addr_ref_64(struct insn *insn, struct pt_regs *regs)
+{
+	return (void __user *)-1L;
+}
+#else
+static void __user *get_addr_ref_64(struct insn *insn, struct pt_regs *regs)
 {
 	int addr_offset, base_offset, indx_offset, seg_reg_indx;
 	unsigned long linear_addr = -1L, seg_base_addr;
@@ -888,6 +904,9 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 	insn_get_sib(insn);
 	sib = insn->sib.value;
 
+	if (insn->addr_bytes != 8)
+		goto out;
+
 	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
 		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
 		if (addr_offset < 0)
@@ -971,3 +990,31 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 out:
 	return (void __user *)linear_addr;
 }
+#endif /* CONFIG_X86_64 */
+
+/**
+ * insn_get_addr_ref() - Obtain the linear address referred to by instruction
+ * @insn:	Instruction structure containing ModRM byte and displacement
+ * @regs:	Structure with register values as seen when entering kernel mode
+ *
+ * Obtain the linear address referred to by the instruction's ModRM, SIB and
+ * displacement bytes, and segment base, as applicable. In protected mode,
+ * segment limits are enforced.
+ *
+ * Returns:
+ *
+ * Linear address referenced by instruction and registers on success.
+ *
+ * -1L on error.
+ */
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+{
+	switch (insn->addr_bytes) {
+	case 4:
+		return get_addr_ref_32(insn, regs);
+	case 8:
+		return get_addr_ref_64(insn, regs);
+	default:
+		return (void __user *)-1L;
+	}
+}
-- 
2.7.4

* [PATCH v9 21/29] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (19 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 20/29] x86/insn-eval: Add wrapper function for 32 and 64-bit addresses Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 22/29] x86/insn-eval: Add support to resolve 16-bit addressing encodings Ricardo Neri
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

It is possible to utilize 32-bit address encodings in virtual-8086 mode via
an address override instruction prefix. However, the range of the
effective address is still limited to [0x0-0xffff]. If the computed
effective address falls outside this range, return an error.

Also, linear addresses in virtual-8086 mode are limited to 20 bits. Enforce
such limit by truncating the most significant bits of the computed linear
address.
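
Both checks can be sketched in user-space C. The names v8086_linear_addr()
and V8086_ADDR_ERROR are hypothetical, and the sketch assumes the segment
base is the selector shifted left by 4, as is the case in virtual-8086 mode:

```c
#include <stdint.h>

#define V8086_ADDR_ERROR UINT32_MAX	/* hypothetical error marker */

static uint32_t v8086_linear_addr(uint32_t eff_addr, uint16_t selector)
{
	/* In virtual-8086 mode the segment base is the selector times 16. */
	uint32_t seg_base = (uint32_t)selector << 4;

	/* The effective address must fit in 16 bits. */
	if (eff_addr & ~0xffffu)
		return V8086_ADDR_ERROR;

	/* Keep only the 20 least significant bits of the linear address. */
	return (seg_base + eff_addr) & 0xfffff;
}
```

With selector 0xffff and effective address 0x10, the sum 0x100000 does not
fit in 20 bits and is truncated to 0.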

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 1d510a6..d43808c 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -862,12 +862,23 @@ static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs)
 		goto out;
 
 	/*
+	 * Even though 32-bit address encodings are allowed in virtual-8086
+	 * mode, the address range is still limited to [0x0-0xffff].
+	 */
+	if (v8086_mode(regs) && (eff_addr & ~0xffff))
+		goto out;
+
+	/*
 	 * Data type long could be 64 bits in size. Ensure that our 32-bit
 	 * effective address is not sign-extended when computing the linear
 	 * address.
 	 */
 	linear_addr = (unsigned long)(eff_addr & 0xffffffff) + seg_base_addr;
 
+	/* Limit linear address to 20 bits */
+	if (v8086_mode(regs))
+		linear_addr &= 0xfffff;
+
 out:
 	return (void __user *)linear_addr;
 }
-- 
2.7.4

* [PATCH v9 22/29] x86/insn-eval: Add support to resolve 16-bit addressing encodings
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (20 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 21/29] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 23/29] x86/cpufeature: Add User-Mode Instruction Prevention definitions Ricardo Neri
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Adam Buchbinder, Colin Ian King, Lorenzo Stoakes, Qiaowei Ren,
	Arnaldo Carvalho de Melo, Adrian Hunter, Kees Cook,
	Thomas Garnier, Dmitry Vyukov

Tasks running in virtual-8086 mode, tasks running in protected mode with
code segment descriptors that specify a 16-bit default address size via
the D bit, and instructions that carry an address override prefix use the
16-bit addressing form encodings described in the Intel 64 and IA-32
Architectures Software Developer's Manual Volume 2A Section 2.1.5, Table 2-1.

16-bit addressing encodings differ in several ways from the 32-bit/64-bit
addressing form encodings: ModRM.rm points to different registers and, in
some cases, effective addresses are indicated by the addition of the value
of two registers. Also, there is no support for SIB bytes. Thus, a
separate function is needed to parse this form of addressing.

A couple of functions are introduced. get_reg_offset_16() obtains the
offset from the base of pt_regs of the registers indicated by the ModRM
byte of the address encoding. get_addr_ref_16() computes the linear
address indicated by the instructions using the value of the registers
given by ModRM and the base address of the applicable segment.
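
As a rough user-space illustration of Table 2-1 (not the code added by this
patch), the effective address for a memory operand can be computed as below.
struct regs16 and eff_addr_16() are hypothetical names. The caller is
assumed to pass the displacement already sign-extended, or 0 when the
encoding carries none.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the registers usable in 16-bit addressing. */
struct regs16 {
	uint16_t bx, bp, si, di;
};

static uint16_t eff_addr_16(const struct regs16 *r, uint8_t modrm,
			    int16_t disp)
{
	unsigned int mod = (modrm >> 6) & 0x3;
	unsigned int rm = modrm & 0x7;
	uint16_t sum = 0;

	/* mod == 3 encodes a register operand, not a memory reference. */
	assert(mod != 3);

	switch (rm) {
	case 0: sum = r->bx + r->si; break;
	case 1: sum = r->bx + r->di; break;
	case 2: sum = r->bp + r->si; break;
	case 3: sum = r->bp + r->di; break;
	case 4: sum = r->si; break;
	case 5: sum = r->di; break;
	/* mod == 0 with rm == 110b is displacement-only addressing. */
	case 6: sum = (mod == 0) ? 0 : r->bp; break;
	case 7: sum = r->bx; break;
	}

	/* 16-bit effective addresses wrap around at 64 KiB. */
	return (uint16_t)(sum + (uint16_t)disp);
}
```

For instance, with BX = 0x1000 and SI = 0x0010, the ModRM byte 0x00 selects
BX + SI and yields 0x1010, whereas 0x06 ignores all registers and yields the
displacement alone.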

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 182 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 182 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index d43808c..2f859a1 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -462,6 +462,80 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_reg_offset_16() - Obtain offset of register indicated by instruction
+ * @insn:	Instruction containing ModRM byte
+ * @regs:	Register values as seen when entering kernel mode
+ * @offs1:	Offset of the first operand register
+ * @offs2:	Offset of the second operand register, if applicable
+ *
+ * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
+ * within insn. This function is to be used with 16-bit address encodings. The
+ * offs1 and offs2 will be written with the offset of the two registers
+ * indicated by the instruction. In cases where any of the registers is not
+ * referenced by the instruction, the value will be set to -EDOM.
+ *
+ * Returns:
+ *
+ * 0 on success, -EINVAL on error.
+ */
+static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
+			     int *offs1, int *offs2)
+{
+	/*
+	 * 16-bit addressing can use one or two registers. Specifics of
+	 * encodings are given in Table 2-1. "16-Bit Addressing Forms with the
+	 * ModR/M Byte" of the Intel Software Developer's Manual.
+	 */
+	static const int regoff1[] = {
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, bx),
+	};
+
+	static const int regoff2[] = {
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		-EDOM,
+		-EDOM,
+		-EDOM,
+		-EDOM,
+	};
+
+	if (!offs1 || !offs2)
+		return -EINVAL;
+
+	/* Operand is a register, use the generic function. */
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		*offs1 = insn_get_modrm_rm_off(insn, regs);
+		*offs2 = -EDOM;
+		return 0;
+	}
+
+	*offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
+	*offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
+
+	/*
+	 * If ModRM.mod is 0 and ModRM.rm is 110b, then we use displacement-
+	 * only addressing. This means that no registers are involved in
+	 * computing the effective address. Thus, ensure that the first
+	 * register offset is invalid. The second register offset is already
+	 * invalid under the aforementioned conditions.
+	 */
+	if ((X86_MODRM_MOD(insn->modrm.value) == 0) &&
+	    (X86_MODRM_RM(insn->modrm.value) == 6))
+		*offs1 = -EDOM;
+
+	return 0;
+}
+
+/**
  * get_desc() - Obtain address of segment descriptor
  * @sel:	Segment selector
  *
@@ -713,6 +787,112 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
 }
 
 /**
+ * get_addr_ref_16() - Obtain the 16-bit address referred to by instruction
+ * @insn:	Instruction containing ModRM byte and displacement
+ * @regs:	Register values as seen when entering kernel mode
+ *
+ * This function is to be used with 16-bit address encodings. Obtain the memory
+ * address referred to by the instruction's ModRM and displacement bytes. Also,
+ * the segment used as base is determined by either any segment override
+ * prefixes in insn or the default segment of the registers involved in the
+ * address computation. In protected mode, segment limits are enforced.
+ *
+ * Returns:
+ *
+ * Linear address referenced by the instruction operands on success.
+ *
+ * -1L on error.
+ */
+static void __user *get_addr_ref_16(struct insn *insn, struct pt_regs *regs)
+{
+	unsigned long linear_addr = -1L, seg_base_addr, seg_limit;
+	int addr_offset1, addr_offset2, seg_reg_indx, ret;
+	short eff_addr, addr1 = 0, addr2 = 0;
+
+	insn_get_modrm(insn);
+	insn_get_displacement(insn);
+
+	if (insn->addr_bytes != 2)
+		goto out;
+
+	/*
+	 * If operand is a register, the layout is the same as in
+	 * 32-bit and 64-bit addressing.
+	 */
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		addr_offset1 = get_reg_offset(insn, regs, REG_TYPE_RM);
+		if (addr_offset1 < 0)
+			goto out;
+
+		eff_addr = regs_get_register(regs, addr_offset1);
+
+		seg_reg_indx = resolve_seg_reg(insn, regs, addr_offset1);
+		if (seg_reg_indx < 0)
+			goto out;
+
+		seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
+		if (seg_base_addr == -1L)
+			goto out;
+
+		seg_limit = get_seg_limit(regs, seg_reg_indx);
+	} else {
+		ret = get_reg_offset_16(insn, regs, &addr_offset1,
+					&addr_offset2);
+		if (ret < 0)
+			goto out;
+
+		/*
+		 * Don't fail on invalid offset values. They might be invalid
+		 * because they cannot be used for this particular value of
+		 * the ModRM. Instead, use them in the computation only if
+		 * they contain a valid value.
+		 */
+		if (addr_offset1 != -EDOM)
+			addr1 = 0xffff & regs_get_register(regs, addr_offset1);
+		if (addr_offset2 != -EDOM)
+			addr2 = 0xffff & regs_get_register(regs, addr_offset2);
+
+		eff_addr = addr1 + addr2;
+
+		/*
+		 * The first operand register could indicate the use of either SS
+		 * or DS registers to obtain the segment selector.  The second
+		 * operand register can only indicate the use of DS. Thus, use
+		 * the first operand to obtain the segment selector.
+		 */
+		seg_reg_indx = resolve_seg_reg(insn, regs, addr_offset1);
+		if (seg_reg_indx < 0)
+			goto out;
+
+		seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
+		if (seg_base_addr == -1L)
+			goto out;
+
+		seg_limit = get_seg_limit(regs, seg_reg_indx);
+
+		eff_addr += (insn->displacement.value & 0xffff);
+	}
+
+	/*
+	 * Before computing the linear address, make sure the effective address
+	 * is within the limits of the segment. In virtual-8086 mode, segment
+	 * limits are not enforced. In such a case, the segment limit is -1L to
+	 * reflect this fact.
+	 */
+	if ((unsigned long)(eff_addr & 0xffff) > seg_limit)
+		goto out;
+
+	linear_addr = (unsigned long)(eff_addr & 0xffff) + seg_base_addr;
+
+	/* Limit linear address to 20 bits */
+	if (v8086_mode(regs))
+		linear_addr &= 0xfffff;
+
+out:
+	return (void __user *)linear_addr;
+}
+
+/**
  * get_addr_ref_32() - Obtain a 32-bit linear address
  * @insn:	Instruction with ModRM, SIB bytes and displacement
  * @regs:	Register values as seen when entering kernel mode
@@ -1021,6 +1201,8 @@ static void __user *get_addr_ref_64(struct insn *insn, struct pt_regs *regs)
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
 	switch (insn->addr_bytes) {
+	case 2:
+		return get_addr_ref_16(insn, regs);
 	case 4:
 		return get_addr_ref_32(insn, regs);
 	case 8:
-- 
2.7.4

* [PATCH v9 23/29] x86/cpufeature: Add User-Mode Instruction Prevention definitions
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (21 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 22/29] x86/insn-eval: Add support to resolve 16-bit addressing encodings Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 24/29] x86: Add emulation code for UMIP instructions Ricardo Neri
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Fenghua Yu, Tony Luck

User-Mode Instruction Prevention is a security feature present in new
Intel processors that, when enabled, prevents the execution of a subset of
instructions in user mode (CPL > 0). Attempting to execute such
instructions causes a general protection exception.

The subset of instructions comprises:

 * SGDT - Store Global Descriptor Table
 * SIDT - Store Interrupt Descriptor Table
 * SLDT - Store Local Descriptor Table
 * SMSW - Store Machine Status Word
 * STR  - Store Task Register

This feature is also added to the list of disabled-features to allow
a cleaner handling of build-time configuration.
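
As a rough illustration of the disabled-features mechanism described above,
the sketch below shows how a per-feature bit ends up in the word-16 mask and
how that mask can be queried. The names EX_FEATURE_UMIP, disable_bit() and
compiled_out() are hypothetical and exist only for this example; only the
feature number (16*32+2) is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define EX_FEATURE_UMIP	(16 * 32 + 2)	/* value from this patch */

/*
 * Bit contributed to the word-16 disabled mask. It is zero when the build
 * option is on, mirroring the DISABLE_UMIP definition.
 */
static uint32_t disable_bit(unsigned int feature, int config_on)
{
	assert(feature / 32 == 16);	/* this sketch only handles word 16 */
	return config_on ? 0 : UINT32_C(1) << (feature & 31);
}

/* Nonzero if the mask says the feature was compiled out. */
static int compiled_out(uint32_t disabled_mask16, unsigned int feature)
{
	assert(feature / 32 == 16);
	return (disabled_mask16 >> (feature & 31)) & 1;
}
```

With the option off, disable_bit(EX_FEATURE_UMIP, 0) yields 0x4, which is
the bit that DISABLED_MASK16 would carry for UMIP.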

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/cpufeatures.h          | 1 +
 arch/x86/include/asm/disabled-features.h    | 8 +++++++-
 arch/x86/include/uapi/asm/processor-flags.h | 2 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 2519c6c..d1f18f2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -292,6 +292,7 @@
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
 #define X86_FEATURE_AVX512VBMI  (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
+#define X86_FEATURE_UMIP	(16*32+ 2) /* User Mode Instruction Prevention */
 #define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
 #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index c10c912..14d6d50 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -16,6 +16,12 @@
 # define DISABLE_MPX	(1<<(X86_FEATURE_MPX & 31))
 #endif
 
+#ifdef CONFIG_X86_INTEL_UMIP
+# define DISABLE_UMIP	0
+#else
+# define DISABLE_UMIP	(1<<(X86_FEATURE_UMIP & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME		(1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR	(1<<(X86_FEATURE_K6_MTRR & 31))
@@ -63,7 +69,7 @@
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
-#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57)
+#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index 39946d0..cf4c876 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -104,6 +104,8 @@
 #define X86_CR4_OSFXSR		_BITUL(X86_CR4_OSFXSR_BIT)
 #define X86_CR4_OSXMMEXCPT_BIT	10 /* enable unmasked SSE exceptions */
 #define X86_CR4_OSXMMEXCPT	_BITUL(X86_CR4_OSXMMEXCPT_BIT)
+#define X86_CR4_UMIP_BIT	11 /* enable UMIP support */
+#define X86_CR4_UMIP		_BITUL(X86_CR4_UMIP_BIT)
 #define X86_CR4_LA57_BIT	12 /* enable 5-level page tables */
 #define X86_CR4_LA57		_BITUL(X86_CR4_LA57_BIT)
 #define X86_CR4_VMXE_BIT	13 /* enable VMX virtualization */
-- 
2.7.4

* [PATCH v9 24/29] x86: Add emulation code for UMIP instructions
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (22 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 23/29] x86/cpufeature: Add User-Mode Instruction Prevention definitions Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 25/29] x86/umip: Force a page fault when unable to copy emulated result to user Ricardo Neri
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Fenghua Yu, Tony Luck

The User-Mode Instruction Prevention feature present in recent Intel
processors prevents a group of instructions (sgdt, sidt, sldt, smsw, and
str) from being executed with CPL > 0. Attempting to execute any of them
at that privilege level raises a general protection fault.

Rather than relaying the general protection fault caused by the
UMIP-protected instructions to user space (in the form of a SIGSEGV signal),
the kernel can trap it and emulate the result of such instructions with
dummy values. This both preserves the current kernel behavior and avoids
revealing the system resources that UMIP intends to protect (i.e., the
locations of the global descriptor and interrupt descriptor tables, the
segment selectors of the local descriptor table, the value of the task
state register and the contents of the CR0 register).

This emulation is needed because certain applications (e.g., WineHQ and
DOSEMU2) rely on this subset of instructions to function. Given that sldt
and str are not commonly used in programs that run on WineHQ or DOSEMU2,
they are not emulated. Also, emulation is provided only for 32-bit
processes; 64-bit processes that attempt to use the instructions that UMIP
protects will receive the SIGSEGV signal issued as a consequence of the
general protection fault.

The instructions protected by UMIP can be split into two groups: those which
return a kernel memory address (sgdt and sidt) and those which return a
value (sldt, str and smsw).
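
The three emulated instructions share the 0F 01 opcode and differ only in
the reg field of the ModRM byte. A minimal user-space sketch of that
classification follows. It assumes the byte stream carries no prefixes, and
the enum and the classify() helper are hypothetical names for illustration,
not the kernel's decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum umip_insn { UMIP_NONE = -1, UMIP_SGDT, UMIP_SIDT, UMIP_SMSW };

/* Bits 5:3 of the ModRM byte select the instruction within group 0F 01. */
#define MODRM_REG(m)	(((m) >> 3) & 7)

/* Classify a prefix-less instruction; anything else maps to UMIP_NONE. */
static enum umip_insn classify(const uint8_t *bytes, size_t len)
{
	assert(bytes);

	if (len < 3 || bytes[0] != 0x0f || bytes[1] != 0x01)
		return UMIP_NONE;	/* includes sldt and str (0F 00) */

	switch (MODRM_REG(bytes[2])) {
	case 0:
		return UMIP_SGDT;	/* 0F 01 /0 */
	case 1:
		return UMIP_SIDT;	/* 0F 01 /1 */
	case 4:
		return UMIP_SMSW;	/* 0F 01 /4 */
	default:
		return UMIP_NONE;
	}
}
```

For instance, the bytes 0f 01 e0 (smsw with a register operand) have
ModRM.reg equal to 4 and are classified as UMIP_SMSW.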

For the instructions that return a kernel memory address, applications such
as WineHQ rely on the result being located in the kernel memory space, not
the actual location of the table. The result is emulated as a hard-coded
value that lies close to the top of the kernel memory. The limits for the
GDT and the IDT are set to zero.
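
The layout of the emulated sgdt/sidt result can be sketched as below: a
2-byte limit followed by a 4-byte base, 6 bytes in total for 32-bit
processes. The base constants are the ones this patch defines; the
fill_table_result() helper and the EX_ prefixes are hypothetical and only
serve the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_DUMMY_GDT_BASE	0xfffe0000u	/* values from this patch */
#define EX_DUMMY_IDT_BASE	0xffff0000u
#define EX_LIMIT_SIZE		2
#define EX_BASE_SIZE		4

/* Fill buf with limit (zero) then base; returns the number of bytes used. */
static size_t fill_table_result(unsigned char *buf, size_t len, uint32_t base)
{
	uint16_t limit = 0;

	assert(buf && len >= EX_LIMIT_SIZE + EX_BASE_SIZE);

	memcpy(buf, &limit, EX_LIMIT_SIZE);
	memcpy(buf + EX_LIMIT_SIZE, &base, EX_BASE_SIZE);

	return EX_LIMIT_SIZE + EX_BASE_SIZE;
}
```

A caller would pass a buffer of at least 6 bytes and then copy the returned
number of bytes to the memory operand of the instruction.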

The instruction smsw is emulated to return the value that the register CR0
has at boot time as set in head_32.

Care is taken to appropriately emulate the results when segmentation is
used. That is, rather than relying on USER_DS and USER_CS, the function
insn_get_addr_ref() inspects the segment descriptor pointed to by the
registers in pt_regs. This ensures that we correctly obtain the segment
base address and the address and operand sizes even if the user space
application uses a local descriptor table.
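
For 16-bit address encodings, the arithmetic behind the segment handling
can be sketched as follows: the effective address wraps at 64 KiB, the
segment base is added, and in virtual-8086 mode the linear address is
truncated to 20 bits. This is a simplified illustration that skips the
segment limit check; linear_addr_16() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute a linear address from 16-bit operands. In virtual-8086 mode the
 * segment base is selector << 4, so it cannot exceed 0xffff0.
 */
static uint32_t linear_addr_16(uint16_t base_reg, uint16_t index_reg,
			       uint16_t disp, uint32_t seg_base, int v8086)
{
	uint16_t eff_addr;
	uint32_t linear;

	assert(!v8086 || seg_base <= 0xffff0);

	/* The sum is truncated to 16 bits, i.e. it wraps at 64 KiB. */
	eff_addr = (uint16_t)(base_reg + index_reg + disp);

	linear = seg_base + eff_addr;

	/* Limit the linear address to 20 bits in virtual-8086 mode. */
	if (v8086)
		linear &= 0xfffff;

	return linear;
}
```

With BX = 0x1000, SI = 0x0020, a displacement of 4 and a segment base of
0x12340, the effective address is 0x1024 and the linear address 0x13364.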

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/umip.h |  12 ++
 arch/x86/kernel/Makefile    |   1 +
 arch/x86/kernel/umip.c      | 309 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 322 insertions(+)
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c

diff --git a/arch/x86/include/asm/umip.h b/arch/x86/include/asm/umip.h
new file mode 100644
index 0000000..db43f2a
--- /dev/null
+++ b/arch/x86/include/asm/umip.h
@@ -0,0 +1,12 @@
+#ifndef _ASM_X86_UMIP_H
+#define _ASM_X86_UMIP_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_INTEL_UMIP
+bool fixup_umip_exception(struct pt_regs *regs);
+#else
+static inline bool fixup_umip_exception(struct pt_regs *regs) { return false; }
+#endif  /* CONFIG_X86_INTEL_UMIP */
+#endif  /* _ASM_X86_UMIP_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index d8e2b70..bafeee1 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -125,6 +125,7 @@ obj-$(CONFIG_EFI)			+= sysfb_efi.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o
 obj-$(CONFIG_TRACING)			+= tracepoint.o
 obj-$(CONFIG_SCHED_MC_PRIO)		+= itmt.o
+obj-$(CONFIG_X86_INTEL_UMIP)		+= umip.o
 
 obj-$(CONFIG_ORC_UNWINDER)		+= unwind_orc.o
 obj-$(CONFIG_FRAME_POINTER_UNWINDER)	+= unwind_frame.o
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
new file mode 100644
index 0000000..1f338cb
--- /dev/null
+++ b/arch/x86/kernel/umip.c
@@ -0,0 +1,309 @@
+/*
+ * umip.c Emulation for instruction protected by the Intel User-Mode
+ * Instruction Prevention feature
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ * Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
+ */
+
+#include <linux/uaccess.h>
+#include <asm/umip.h>
+#include <asm/traps.h>
+#include <asm/insn.h>
+#include <asm/insn-eval.h>
+#include <linux/ratelimit.h>
+
+/** DOC: Emulation for User-Mode Instruction Prevention (UMIP)
+ *
+ * The User-Mode Instruction Prevention feature present in recent Intel
+ * processors prevents a group of instructions (sgdt, sidt, sldt, smsw and
+ * str) from being executed with CPL > 0. Attempting to execute any of them
+ * there raises a general protection fault.
+ *
+ * Rather than relaying the general protection fault caused by the
+ * UMIP-protected instructions to user space (in the form of a SIGSEGV signal),
+ * the kernel can trap it and emulate the result of such instructions with dummy
+ * values. This both preserves the current kernel behavior and avoids revealing
+ * the system resources that UMIP intends to protect (i.e., the locations of the
+ * global descriptor and interrupt descriptor tables, the segment selectors of
+ * the local descriptor table, the value of the task state register and the
+ * contents of the CR0 register).
+ *
+ * This emulation is needed because certain applications (e.g., WineHQ and
+ * DOSEMU2) rely on this subset of instructions to function.
+ *
+ * The instructions protected by UMIP can be split into two groups: those which
+ * return a kernel memory address (sgdt and sidt) and those which return a
+ * value (sldt, str and smsw).
+ *
+ * For the instructions that return a kernel memory address, applications
+ * such as WineHQ rely on the result being located in the kernel memory space,
+ * not the actual location of the table. The result is emulated as a hard-coded
+ * value that lies close to the top of the kernel memory. The limits for the GDT
+ * and the IDT are set to zero.
+ *
+ * Given that sldt and str are not commonly used in programs that run on WineHQ
+ * or DOSEMU2, they are not emulated.
+ *
+ * The instruction smsw is emulated to return the value that the register CR0
+ * has at boot time as set in head_32.
+ *
+ * Also, emulation is provided only for 32-bit processes; 64-bit processes
+ * that attempt to use the instructions that UMIP protects will receive the
+ * SIGSEGV signal issued as a consequence of the general protection fault.
+ *
+ * Care is taken to appropriately emulate the results when segmentation is
+ * used. That is, rather than relying on USER_DS and USER_CS, the function
+ * insn_get_addr_ref() inspects the segment descriptor pointed to by the
+ * registers in pt_regs. This ensures that we correctly obtain the segment
+ * base address and the address and operand sizes even if the user space
+ * application uses a local descriptor table.
+ */
+
+#define UMIP_DUMMY_GDT_BASE 0xfffe0000
+#define UMIP_DUMMY_IDT_BASE 0xffff0000
+
+/*
+ * The SGDT and SIDT instructions store the contents of the global descriptor
+ * table and interrupt table registers, respectively. The destination is a
+ * memory operand of X+2 bytes. X bytes are used to store the base address of
+ * the table and 2 bytes are used to store the limit. In 32-bit processes, the
+ * only processes for which emulation is provided, X has a value of 4.
+ */
+#define UMIP_GDT_IDT_BASE_SIZE 4
+#define UMIP_GDT_IDT_LIMIT_SIZE 2
+
+#define	UMIP_INST_SGDT	0	/* 0F 01 /0 */
+#define	UMIP_INST_SIDT	1	/* 0F 01 /1 */
+#define	UMIP_INST_SMSW	3	/* 0F 01 /4 */
+
+/**
+ * identify_insn() - Identify a UMIP-protected instruction
+ * @insn:	Instruction structure with opcode and ModRM byte.
+ *
+ * From the instruction opcode and the reg part of the ModRM byte, identify,
+ * if any, a UMIP-protected instruction.
+ *
+ * Return: a constant that identifies a specific UMIP-protected instruction.
+ * -EINVAL when not an UMIP-protected instruction.
+ */
+static int identify_insn(struct insn *insn)
+{
+	/* By getting modrm we also get the opcode. */
+	insn_get_modrm(insn);
+
+	/* All the instructions of interest start with 0x0f. */
+	if (insn->opcode.bytes[0] != 0xf)
+		return -EINVAL;
+
+	if (insn->opcode.bytes[1] == 0x1) {
+		switch (X86_MODRM_REG(insn->modrm.value)) {
+		case 0:
+			return UMIP_INST_SGDT;
+		case 1:
+			return UMIP_INST_SIDT;
+		case 4:
+			return UMIP_INST_SMSW;
+		default:
+			return -EINVAL;
+		}
+	}
+	/* SLDT and STR are not emulated */
+	return -EINVAL;
+}
+
+/**
+ * emulate_umip_insn() - Emulate UMIP instructions with dummy values
+ * @insn:	Instruction structure with operands
+ * @umip_inst:	Instruction to emulate
+ * @data:	Buffer into which the dummy values will be copied
+ * @data_size:	Size of the emulated result
+ *
+ * Emulate an instruction protected by UMIP. The result of the emulation
+ * is saved in the provided buffer. The size of the results depends on both
+ * the instruction and type of operand (register vs memory address). Thus,
+ * the size of the result needs to be updated.
+ *
+ * Returns:
+ *
+ * 0 if success, -EINVAL on error while emulating.
+ */
+static int emulate_umip_insn(struct insn *insn, int umip_inst,
+			     unsigned char *data, int *data_size)
+{
+	unsigned long dummy_base_addr, dummy_value;
+	unsigned short dummy_limit = 0;
+
+	if (!data || !data_size || !insn)
+		return -EINVAL;
+	/*
+	 * These two instructions return the base address and limit of the
+	 * global and interrupt descriptor table, respectively. According to the
+	 * Intel Software Development manual, the base address can be 24-bit,
+	 * 32-bit or 64-bit. Limit is always 16-bit. If the operand size is
+	 * 16-bit, the returned value of the base address is supposed to be a
+	 * zero-extended 24-bit number. However, it seems that a 32-bit number
+	 * is always returned irrespective of the operand size.
+	 */
+
+	if (umip_inst == UMIP_INST_SGDT || umip_inst == UMIP_INST_SIDT) {
+		/* SGDT and SIDT do not use register operands. */
+		if (X86_MODRM_MOD(insn->modrm.value) == 3)
+			return -EINVAL;
+
+		if (umip_inst == UMIP_INST_SGDT)
+			dummy_base_addr = UMIP_DUMMY_GDT_BASE;
+		else
+			dummy_base_addr = UMIP_DUMMY_IDT_BASE;
+
+		*data_size = UMIP_GDT_IDT_LIMIT_SIZE + UMIP_GDT_IDT_BASE_SIZE;
+
+		memcpy(data + 2, &dummy_base_addr, UMIP_GDT_IDT_BASE_SIZE);
+		memcpy(data, &dummy_limit, UMIP_GDT_IDT_LIMIT_SIZE);
+
+	} else if (umip_inst == UMIP_INST_SMSW) {
+		dummy_value = CR0_STATE;
+
+		/*
+		 * Even though the CR0 register has 4 bytes, the number
+		 * of bytes to be copied in the result buffer is determined
+		 * by whether the operand is a register or a memory location.
+		 * If operand is a register, return as many bytes as the operand
+		 * size. If operand is memory, return only the two least
+		 * significant bytes of CR0.
+		 */
+		if (X86_MODRM_MOD(insn->modrm.value) == 3)
+			*data_size = insn->opnd_bytes;
+		else
+			*data_size = 2;
+
+		memcpy(data, &dummy_value, *data_size);
+	/* STR and SLDT are not emulated */
+	} else {
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/**
+ * fixup_umip_exception() - Fixup #GP faults caused by UMIP
+ * @regs:	Registers as saved when entering the #GP trap
+ *
+ * The instructions sgdt, sidt, str, smsw, sldt cause a general protection
+ * fault if executed with CPL > 0 (i.e., from user space). If the offending
+ * user-space process is 32-bit, this function fixes the exception up and
+ * provides dummy values for the sgdt, sidt and smsw; str and sldt are not
+ * fixed up. Also 64-bit user-space processes are not fixed up.
+ *
+ * If operands are memory addresses, results are copied to user-
+ * space memory as indicated by the instruction pointed by EIP using the
+ * registers indicated in the instruction operands. If operands are registers,
+ * results are copied into the context that was saved when entering kernel mode.
+ *
+ * Returns:
+ *
+ * True if emulation was successful; false if not.
+ */
+bool fixup_umip_exception(struct pt_regs *regs)
+{
+	int not_copied, nr_copied, reg_offset, dummy_data_size, umip_inst;
+	unsigned long seg_base = 0, *reg_addr;
+	/* 10 bytes is the maximum size of the result of UMIP instructions */
+	unsigned char dummy_data[10] = { 0 };
+	unsigned char buf[MAX_INSN_SIZE];
+	void __user *uaddr;
+	struct insn insn;
+	char seg_defs;
+
+	/* Do not emulate 64-bit processes. */
+	if (user_64bit_mode(regs))
+		return false;
+
+	/*
+	 * Use the segment base in case user space used a different code
+	 * segment, either in protected (e.g., from an LDT), virtual-8086
+	 * or long (via the FS or GS registers) modes. In most of the cases
+	 * seg_base will be zero as in USER_CS.
+	 */
+	if (!user_64bit_mode(regs))
+		seg_base = insn_get_seg_base(regs, INAT_SEG_REG_CS);
+
+	if (seg_base == -1L)
+		return false;
+
+	not_copied = copy_from_user(buf, (void __user *)(seg_base + regs->ip),
+				    sizeof(buf));
+	nr_copied = sizeof(buf) - not_copied;
+
+	/*
+	 * The copy_from_user above could have failed if user code is protected
+	 * by a memory protection key. Give up on emulation in such a case.
+	 * Should we issue a page fault?
+	 */
+	if (!nr_copied)
+		return false;
+
+	insn_init(&insn, buf, nr_copied, user_64bit_mode(regs));
+
+	/*
+	 * Override the default operand and address sizes with what is specified
+	 * in the code segment descriptor. The instruction decoder only sets
+	 * the address size to either 4 or 8 address bytes and does nothing
+	 * for the operand bytes. This is OK for most cases, but we could
+	 * have special cases where, for instance, a 16-bit code segment
+	 * descriptor is used.
+	 * If there is an address override prefix, the instruction decoder
+	 * correctly updates these values, even for 16-bit defaults.
+	 */
+	seg_defs = insn_get_code_seg_defaults(regs);
+	if (seg_defs == -EINVAL)
+		return false;
+
+	insn.addr_bytes = (unsigned char)INSN_CODE_SEG_ADDR_SZ(seg_defs);
+	insn.opnd_bytes = (unsigned char)INSN_CODE_SEG_OPND_SZ(seg_defs);
+
+	insn_get_length(&insn);
+	if (nr_copied < insn.length)
+		return false;
+
+	umip_inst = identify_insn(&insn);
+	if (umip_inst < 0)
+		return false;
+
+	if (emulate_umip_insn(&insn, umip_inst, dummy_data, &dummy_data_size))
+		return false;
+
+	/*
+	 * If operand is a register, write result to the copy of the register
+	 * value that was pushed to the stack when entering into kernel mode.
+	 * Upon exit, the value we write will be restored to the actual hardware
+	 * register.
+	 */
+	if (X86_MODRM_MOD(insn.modrm.value) == 3) {
+		reg_offset = insn_get_modrm_rm_off(&insn, regs);
+
+		/*
+		 * Negative values are usually errors. In memory addressing,
+		 * the exception is -EDOM. Since we expect a register operand,
+		 * all negative values are errors.
+		 */
+		if (reg_offset < 0)
+			return false;
+
+		reg_addr = (unsigned long *)((unsigned long)regs + reg_offset);
+		memcpy(reg_addr, dummy_data, dummy_data_size);
+	} else {
+		uaddr = insn_get_addr_ref(&insn, regs);
+		if ((unsigned long)uaddr == -1L)
+			return false;
+
+		nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
+		if (nr_copied  > 0)
+			return false;
+	}
+
+	/* increase IP to let the program keep going */
+	regs->ip += insn.length;
+	return true;
+}
-- 
2.7.4

* [PATCH v9 25/29] x86/umip: Force a page fault when unable to copy emulated result to user
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (23 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 24/29] x86: Add emulation code for UMIP instructions Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-26  7:59   ` Andy Lutomirski
  2017-10-04  3:54 ` [PATCH v9 26/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (3 subsequent siblings)
  28 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Fenghua Yu, Tony Luck

fixup_umip_exception() will be called from do_general_protection(). If the
former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
However, when emulation is successful but the emulated result cannot be
copied to user space memory, it is more accurate to issue a SIGSEGV with
SEGV_MAPERR and the offending address. A new function, inspired by
force_sig_info_fault(), is introduced to model the page fault.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/kernel/umip.c | 45 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 1f338cb..6d07eb5 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -187,6 +187,41 @@ static int emulate_umip_insn(struct insn *insn, int umip_inst,
 }
 
 /**
+ * force_sig_info_umip_fault() - Force a SIGSEGV with SEGV_MAPERR
+ * @addr:	Address that caused the signal
+ * @regs:	Register set containing the instruction pointer
+ *
+ * Force a SIGSEGV signal with SEGV_MAPERR as the error code. This function is
+ * intended to be used to provide a segmentation fault when the result of the
+ * UMIP emulation could not be copied to the user space memory.
+ *
+ * Returns: none
+ */
+static void force_sig_info_umip_fault(void __user *addr, struct pt_regs *regs)
+{
+	siginfo_t info;
+	struct task_struct *tsk = current;
+
+	tsk->thread.cr2		= (unsigned long)addr;
+	tsk->thread.error_code	= X86_PF_USER | X86_PF_WRITE;
+	tsk->thread.trap_nr	= X86_TRAP_PF;
+
+	info.si_signo	= SIGSEGV;
+	info.si_errno	= 0;
+	info.si_code	= SEGV_MAPERR;
+	info.si_addr	= addr;
+	force_sig_info(SIGSEGV, &info, tsk);
+
+	if (!(show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)))
+		return;
+
+	pr_err_ratelimited("%s[%d] umip emulation segfault ip:%lx sp:%lx error:%x in %lx\n",
+			   tsk->comm, task_pid_nr(tsk), regs->ip,
+			   regs->sp, X86_PF_USER | X86_PF_WRITE,
+			   regs->ip);
+}
+
+/**
  * fixup_umip_exception() - Fixup #GP faults caused by UMIP
  * @regs:	Registers as saved when entering the #GP trap
  *
@@ -299,8 +334,14 @@ bool fixup_umip_exception(struct pt_regs *regs)
 			return false;
 
 		nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
-		if (nr_copied  > 0)
-			return false;
+		if (nr_copied  > 0) {
+			/*
+			 * If copy fails, send a signal and tell caller that
+			 * fault was fixed up.
+			 */
+			force_sig_info_umip_fault(uaddr, regs);
+			return true;
+		}
 	}
 
 	/* increase IP to let the program keep going */
-- 
2.7.4

* [PATCH v9 26/29] x86: Enable User-Mode Instruction Prevention
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (24 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 25/29] x86/umip: Force a page fault when unable to copy emulated result to user Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 27/29] x86/traps: Fixup general protection faults caused by UMIP Ricardo Neri
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Fenghua Yu, Tony Luck

User-Mode Instruction Prevention (UMIP) is enabled or disabled by setting or
clearing a bit in %cr4.

It makes sense to enable UMIP at some point while booting, before user
space comes up. As with SMAP and SMEP, it is not critical to have it enabled
very early during boot, because UMIP is relevant only when there is a
user space to be protected from. Given the similarities in relevance, it
makes sense to enable UMIP along with SMAP and SMEP.

UMIP is enabled by default. It can be disabled by adding clearcpuid=514
to the kernel parameters.
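
The number 514 is the feature number of X86_FEATURE_UMIP, that is, word 16
and bit 2 (16*32+2). A small sketch of that decomposition is shown below.
The struct, the decode_feature() helper and EX_NCAPINTS are hypothetical
names; the value 18 matches the NCAPINTS check visible in this series.

```c
#include <assert.h>

#define EX_NCAPINTS	18	/* number of capability words in this series */

struct feature_pos {
	unsigned int word;	/* index of the 32-bit capability word */
	unsigned int bit;	/* bit position within that word */
};

/* Split a feature number, as passed to clearcpuid=, into word and bit. */
static struct feature_pos decode_feature(unsigned int nr)
{
	struct feature_pos pos;

	assert(nr < EX_NCAPINTS * 32);

	pos.word = nr / 32;
	pos.bit = nr % 32;

	return pos;
}
```

decode_feature(514) gives word 16 and bit 2, which corresponds to the UMIP
definition in cpufeatures.h.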

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/Kconfig             | 10 ++++++++++
 arch/x86/kernel/cpu/common.c | 25 ++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5442735..b7c06d8 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1803,6 +1803,16 @@ config X86_SMAP
 
 	  If unsure, say Y.
 
+config X86_INTEL_UMIP
+	def_bool n
+	depends on CPU_SUP_INTEL
+	prompt "Intel User Mode Instruction Prevention" if EXPERT
+	---help---
+	  The User Mode Instruction Prevention (UMIP) is a security
+	  feature in newer Intel processors. If enabled, a general
+	  protection fault is issued if the instructions SGDT, SLDT,
+	  SIDT, SMSW and STR are executed in user mode.
+
 config X86_INTEL_MPX
 	prompt "Intel MPX (Memory Protection Extensions)"
 	def_bool n
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 03f9a1a..45b5ec4 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -329,6 +329,28 @@ static __always_inline void setup_smap(struct cpuinfo_x86 *c)
 	}
 }
 
+static __always_inline void setup_umip(struct cpuinfo_x86 *c)
+{
+	/* Check the boot processor, plus build option for UMIP. */
+	if (!cpu_feature_enabled(X86_FEATURE_UMIP))
+		goto out;
+
+	/* Check the current processor's cpuid bits. */
+	if (!cpu_has(c, X86_FEATURE_UMIP))
+		goto out;
+
+	cr4_set_bits(X86_CR4_UMIP);
+
+	return;
+
+out:
+	/*
+	 * Make sure UMIP is disabled in case it was enabled in a
+	 * previous boot (e.g., via kexec).
+	 */
+	cr4_clear_bits(X86_CR4_UMIP);
+}
+
 /*
  * Protection Keys are not available in 32-bit mode.
  */
@@ -1147,9 +1169,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	/* Disable the PN if appropriate */
 	squash_the_stupid_serial_number(c);
 
-	/* Set up SMEP/SMAP */
+	/* Set up SMEP/SMAP/UMIP */
 	setup_smep(c);
 	setup_smap(c);
+	setup_umip(c);
 
 	/*
 	 * The vendor-specific functions might have changed features.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 27/29] x86/traps: Fixup general protection faults caused by UMIP
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (25 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 26/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 28/29] selftests/x86: Add tests for User-Mode Instruction Prevention Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 29/29] selftests/x86: Add tests for instruction str and sldt Ricardo Neri
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Fenghua Yu, Tony Luck

If the User-Mode Instruction Prevention CPU feature is available and
enabled, a general protection fault will be issued if the instructions
sgdt, sldt, sidt, str or smsw are executed from user-mode context
(CPL > 0). If the fault was caused by any of the instructions protected
by UMIP, fixup_umip_exception() will emulate dummy results for these
instructions as follows: if running a 32-bit process, sgdt, sidt and smsw
are emulated; str and sldt are not emulated. No emulation is done for
64-bit processes.

If emulation is successful, the result is passed to the user space program
and no SIGSEGV signal is emitted.

Please note that fixup_umip_exception() also caters for the case when
the fault originated while running in virtual-8086 mode.
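
The emulation policy described above can be summarized with a small
user-space sketch. This is not the code in fixup_umip_exception(); the enum
and function names are hypothetical and only encode the rules stated in
this message: nothing is emulated for 64-bit processes, and for 32-bit
processes only sgdt, sidt and smsw are emulated.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum umip_insn {
	UMIP_SGDT,
	UMIP_SIDT,
	UMIP_SLDT,
	UMIP_SMSW,
	UMIP_STR,
};

/*
 * Returns true if a #GP caused by @insn would be fixed up with dummy
 * values, false if the fault would take the usual #GP path instead.
 */
static bool umip_would_emulate(enum umip_insn insn, bool user_64bit)
{
	/* No emulation is done for 64-bit processes. */
	if (user_64bit)
		return false;

	switch (insn) {
	case UMIP_SGDT:
	case UMIP_SIDT:
	case UMIP_SMSW:
		return true;
	default:
		/* str and sldt are never emulated. */
		return false;
	}
}
```

For instance, umip_would_emulate(UMIP_SMSW, false) is true while
umip_would_emulate(UMIP_STR, false) and umip_would_emulate(UMIP_SGDT, true)
are both false.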

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: x86@kernel.org
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/kernel/traps.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index a5791f3..4c0aa6c 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -60,6 +60,7 @@
 #include <asm/trace/mpx.h>
 #include <asm/mpx.h>
 #include <asm/vm86.h>
+#include <asm/umip.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/x86_init.h>
@@ -514,6 +515,10 @@ do_general_protection(struct pt_regs *regs, long error_code)
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
 	cond_local_irq_enable(regs);
 
+	if (static_cpu_has(X86_FEATURE_UMIP))
+		if (user_mode(regs) && fixup_umip_exception(regs))
+			return;
+
 	if (v8086_mode(regs)) {
 		local_irq_enable();
 		handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 28/29] selftests/x86: Add tests for User-Mode Instruction Prevention
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (26 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 27/29] x86/traps: Fixup general protection faults caused by UMIP Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  2017-10-04  3:54 ` [PATCH v9 29/29] selftests/x86: Add tests for instruction str and sldt Ricardo Neri
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Fenghua Yu

Certain user space programs that run in virtual-8086 mode may utilize
instructions protected by the User-Mode Instruction Prevention (UMIP)
security feature present in new Intel processors: SGDT, SIDT and SMSW. In
such a case, a general protection fault is issued if UMIP is enabled. When
such a fault happens, the kernel traps it and emulates the results of
these instructions with dummy values. The purpose of this new
test is to verify whether the impacted instructions can be executed
without causing such #GP. If no #GP exceptions occur, we expect to exit
virtual-8086 mode from INT3.

The instructions protected by UMIP are executed in representative use
cases:
 a) displacement-only memory addressing
 b) register-indirect memory addressing
 c) results stored directly in operands

Unfortunately, it is not possible to check the results against a set of
expected values because no emulation will occur in systems that do not
have the UMIP feature. Instead, results are printed for verification. A
simple verification is done to ensure that results of all tests are
identical.
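
For reference, the offsets used by the test code (2052 through 2080) are
laid out back to back: each SMSW result takes 2 bytes and each SIDT/SGDT
result takes 6 bytes. The sketch below expresses that layout as a packed
structure. The structure and macro names are hypothetical and are not part
of the test; fixed-width types are used here on the assumption that the
test is built as a 32-bit program, where unsigned long is 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit limit followed by a 32-bit base, as read back by the test. */
struct table_desc32 {
	uint16_t limit;
	uint32_t base;
} __attribute__((packed));

/* Hypothetical view of the result area inside the test memory. */
#define UMIP_RESULTS_BASE	2052

struct umip_results {
	uint16_t msw_disp;		/* 2052: smsw, displacement-only */
	struct table_desc32 idt_disp;	/* 2054: sidt, displacement-only */
	struct table_desc32 gdt_disp;	/* 2060: sgdt, displacement-only */
	uint16_t msw_reg;		/* 2066: smsw, register-indirect */
	struct table_desc32 idt_reg;	/* 2068: sidt, register-indirect */
	struct table_desc32 gdt_reg;	/* 2074: sgdt, register-indirect */
	uint16_t msw_ax;		/* 2080: smsw %ax, then stored */
} __attribute__((packed));
```

Adding UMIP_RESULTS_BASE to offsetof() of each member gives back the
offsets used in the assembly and in do_umip_tests().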

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 tools/testing/selftests/x86/entry_from_vm86.c | 73 ++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index d075ea0..f7d9cea 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -95,6 +95,22 @@ asm (
 	"int3\n\t"
 	"vmcode_int80:\n\t"
 	"int $0x80\n\t"
+	"vmcode_umip:\n\t"
+	/* addressing via displacements */
+	"smsw (2052)\n\t"
+	"sidt (2054)\n\t"
+	"sgdt (2060)\n\t"
+	/* addressing via registers */
+	"mov $2066, %bx\n\t"
+	"smsw (%bx)\n\t"
+	"mov $2068, %bx\n\t"
+	"sidt (%bx)\n\t"
+	"mov $2074, %bx\n\t"
+	"sgdt (%bx)\n\t"
+	/* register operands, only for smsw */
+	"smsw %ax\n\t"
+	"mov %ax, (2080)\n\t"
+	"int3\n\t"
 	".size vmcode, . - vmcode\n\t"
 	"end_vmcode:\n\t"
 	".code32\n\t"
@@ -103,7 +119,7 @@ asm (
 
 extern unsigned char vmcode[], end_vmcode[];
 extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
-	vmcode_sti[], vmcode_int3[], vmcode_int80[];
+	vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[];
 
 /* Returns false if the test was skipped. */
 static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -160,6 +176,58 @@ static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
 	return true;
 }
 
+void do_umip_tests(struct vm86plus_struct *vm86, unsigned char *test_mem)
+{
+	struct table_desc {
+		unsigned short limit;
+		unsigned long base;
+	} __attribute__((packed));
+
+	/* Initialize variables with arbitrary values */
+	struct table_desc gdt1 = { .base = 0x3c3c3c3c, .limit = 0x9999 };
+	struct table_desc gdt2 = { .base = 0x1a1a1a1a, .limit = 0xaeae };
+	struct table_desc idt1 = { .base = 0x7b7b7b7b, .limit = 0xf1f1 };
+	struct table_desc idt2 = { .base = 0x89898989, .limit = 0x1313 };
+	unsigned short msw1 = 0x1414, msw2 = 0x2525, msw3 = 0x3737;
+
+	/* UMIP -- exit with INT3 unless the kernel fails to emulate on #GP */
+	do_test(vm86, vmcode_umip - vmcode, VM86_TRAP, 3, "UMIP tests");
+
+	/* Results from displacement-only addressing */
+	msw1 = *(unsigned short *)(test_mem + 2052);
+	memcpy(&idt1, test_mem + 2054, sizeof(idt1));
+	memcpy(&gdt1, test_mem + 2060, sizeof(gdt1));
+
+	/* Results from register-indirect addressing */
+	msw2 = *(unsigned short *)(test_mem + 2066);
+	memcpy(&idt2, test_mem + 2068, sizeof(idt2));
+	memcpy(&gdt2, test_mem + 2074, sizeof(gdt2));
+
+	/* Results when using register operands */
+	msw3 = *(unsigned short *)(test_mem + 2080);
+
+	printf("[INFO]\tResult from SMSW:[0x%04x]\n", msw1);
+	printf("[INFO]\tResult from SIDT: limit[0x%04x]base[0x%08lx]\n",
+	       idt1.limit, idt1.base);
+	printf("[INFO]\tResult from SGDT: limit[0x%04x]base[0x%08lx]\n",
+	       gdt1.limit, gdt1.base);
+
+	if (msw1 != msw2 || msw1 != msw3)
+		printf("[FAIL]\tAll the results of SMSW should be the same.\n");
+	else
+		printf("[PASS]\tAll the results from SMSW are identical.\n");
+
+	if (memcmp(&gdt1, &gdt2, sizeof(gdt1)))
+		printf("[FAIL]\tAll the results of SGDT should be the same.\n");
+	else
+		printf("[PASS]\tAll the results from SGDT are identical.\n");
+
+	if (memcmp(&idt1, &idt2, sizeof(idt1)))
+		printf("[FAIL]\tAll the results of SIDT should be the same.\n");
+	else
+		printf("[PASS]\tAll the results from SIDT are identical.\n");
+}
+
 int main(void)
 {
 	struct vm86plus_struct v86;
@@ -218,6 +286,9 @@ int main(void)
 	v86.regs.eax = (unsigned int)-1;
 	do_test(&v86, vmcode_int80 - vmcode, VM86_INTx, 0x80, "int80");
 
+	/* UMIP -- should exit with INT3 unless the kernel fails to emulate */
+	do_umip_tests(&v86, addr);
+
 	/* Execute a null pointer */
 	v86.regs.cs = 0;
 	v86.regs.ss = 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v9 29/29] selftests/x86: Add tests for instruction str and sldt
  2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
                   ` (27 preceding siblings ...)
  2017-10-04  3:54 ` [PATCH v9 28/29] selftests/x86: Add tests for User-Mode Instruction Prevention Ricardo Neri
@ 2017-10-04  3:54 ` Ricardo Neri
  28 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-04  3:54 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov
  Cc: Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, x86, ricardo.neri, Ricardo Neri,
	Fenghua Yu

The instructions str and sldt are not recognized when running in
virtual-8086 mode and generate an invalid operand exception. These two
instructions are protected by the Intel User-Mode Instruction Prevention
(UMIP) security feature. In protected mode, if UMIP is enabled, these
instructions generate a general protection fault if called from CPL > 0.
Linux traps the general protection fault and emulates the instructions
sgdt, sidt and smsw; but not str and sldt.

These tests are added to verify that the emulation code does not emulate
these two instructions and that the expected invalid operand exception is
seen.

Tests fall back to exiting with int3 in case emulation does happen.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 tools/testing/selftests/x86/entry_from_vm86.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index f7d9cea..361466a 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -111,6 +111,11 @@ asm (
 	"smsw %ax\n\t"
 	"mov %ax, (2080)\n\t"
 	"int3\n\t"
+	"vmcode_umip_str:\n\t"
+	"str %eax\n\t"
+	"vmcode_umip_sldt:\n\t"
+	"sldt %eax\n\t"
+	"int3\n\t"
 	".size vmcode, . - vmcode\n\t"
 	"end_vmcode:\n\t"
 	".code32\n\t"
@@ -119,7 +124,8 @@ asm (
 
 extern unsigned char vmcode[], end_vmcode[];
 extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
-	vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[];
+	vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[],
+	vmcode_umip_str[], vmcode_umip_sldt[];
 
 /* Returns false if the test was skipped. */
 static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -226,6 +232,16 @@ void do_umip_tests(struct vm86plus_struct *vm86, unsigned char *test_mem)
 		printf("[FAIL]\tAll the results of SIDT should be the same.\n");
 	else
 		printf("[PASS]\tAll the results from SIDT are identical.\n");
+
+	sethandler(SIGILL, sighandler, 0);
+	do_test(vm86, vmcode_umip_str - vmcode, VM86_SIGNAL, 0,
+		"STR instruction");
+	clearhandler(SIGILL);
+
+	sethandler(SIGILL, sighandler, 0);
+	do_test(vm86, vmcode_umip_sldt - vmcode, VM86_SIGNAL, 0,
+		"SLDT instruction");
+	clearhandler(SIGILL);
 }
 
 int main(void)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 06/29] x86/mpx: Use signed variables to compute effective addresses
  2017-10-04  3:54 ` [PATCH v9 06/29] x86/mpx: Use signed variables to compute effective addresses Ricardo Neri
@ 2017-10-05  9:41   ` Borislav Petkov
  2017-10-05 17:38     ` Neri, Ricardo
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-05  9:41 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Nathan Howard, Adan Hawthorn,
	Joe Perches

On Tue, Oct 03, 2017 at 08:54:09PM -0700, Ricardo Neri wrote:
> Even though memory addresses are unsigned, the operands used to compute the
> effective address do have a sign. This is true for ModRM.rm, SIB.base,
> SIB.index as well as the displacement bytes. Thus, signed variables shall
> be used when computing the effective address from these operands. Once the
> signed effective address has been computed, it is casted to an unsigned
> long to determine the linear address.
> 
> Variables are renamed to better reflect the type of address being
> computed.
> 
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> Cc: Colin Ian King <colin.king@canonical.com>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Nathan Howard <liverlint@gmail.com>
> Cc: Adan Hawthorn <adanhawthorn@gmail.com>
> Cc: Joe Perches <joe@perches.com>
> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> Cc: x86@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
>  arch/x86/mm/mpx.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 06/29] x86/mpx: Use signed variables to compute effective addresses
  2017-10-05  9:41   ` Borislav Petkov
@ 2017-10-05 17:38     ` Neri, Ricardo
  0 siblings, 0 replies; 83+ messages in thread
From: Neri, Ricardo @ 2017-10-05 17:38 UTC (permalink / raw)
  To: bp
  Cc: corbet, linux-kernel, peterz, x86, Ren, Qiaowei, adam.buchbinder,
	colin.king, tglx, adanhawthorn, dave.hansen, ray.huang, joe,
	vbabka, mst, akpm, hpa, brgerst, mingo, luto, pbonzini, Shankar,
	Ravi V, mhiramat, jslaby, liverlint, Gortmaker, Paul (Wind River),
	cmetcalf, slaoub, shuah, lstoakes

On Thu, 2017-10-05 at 11:41 +0200, Borislav Petkov wrote:
> On Tue, Oct 03, 2017 at 08:54:09PM -0700, Ricardo Neri wrote:
> > 
> > Even though memory addresses are unsigned, the operands used to compute the
> > effective address do have a sign. This is true for ModRM.rm, SIB.base,
> > SIB.index as well as the displacement bytes. Thus, signed variables shall
> > be used when computing the effective address from these operands. Once the
> > signed effective address has been computed, it is casted to an unsigned
> > long to determine the linear address.
> > 
> > Variables are renamed to better reflect the type of address being
> > computed.
> > 
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> > Cc: Colin Ian King <colin.king@canonical.com>
> > Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> > Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Nathan Howard <liverlint@gmail.com>
> > Cc: Adan Hawthorn <adanhawthorn@gmail.com>
> > Cc: Joe Perches <joe@perches.com>
> > Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> > Cc: x86@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> >  arch/x86/mm/mpx.c | 20 ++++++++++++++------
> >  1 file changed, 14 insertions(+), 6 deletions(-)
> Reviewed-by: Borislav Petkov <bp@suse.de>

Thank you!

BR,
Ricardo

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 10/29] x86/insn-eval: Do not BUG on invalid register type
  2017-10-04  3:54 ` [PATCH v9 10/29] x86/insn-eval: Do not BUG on invalid register type Ricardo Neri
@ 2017-10-07 16:22   ` Borislav Petkov
  2017-10-09 23:56     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-07 16:22 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 03, 2017 at 08:54:13PM -0700, Ricardo Neri wrote:
> We are not in a critical failure path. The invalid register type is caused
> when trying to decode invalid instruction bytes from a user-space program.
> Thus, simply print an error message. To prevent this warning from being
> abused from user space programs, use the rate-limited variant of pr_err().
> along with a descriptive prefix.
> 
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> Cc: Colin Ian King <colin.king@canonical.com>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> Cc: x86@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 10/29] x86/insn-eval: Do not BUG on invalid register type
  2017-10-07 16:22   ` Borislav Petkov
@ 2017-10-09 23:56     ` Ricardo Neri
  0 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-09 23:56 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Sat, 2017-10-07 at 18:22 +0200, Borislav Petkov wrote:
> Reviewed-by: Borislav Petkov <bp@suse.de>

Thank you!

BR,
Ricardo

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-04  3:54 ` [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
@ 2017-10-10 22:41   ` Borislav Petkov
  2017-10-12  1:12     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-10 22:41 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 03, 2017 at 08:54:16PM -0700, Ricardo Neri wrote:
> When computing a linear address and segmentation is used, we need to know
> the base address of the segment involved in the computation. In most of
> the cases, the segment base address will be zero as in USER_DS/USER32_DS.

...

> ---
>  arch/x86/include/asm/inat.h |  10 ++
>  arch/x86/lib/insn-eval.c    | 321 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 331 insertions(+)

Ok, some more fixes on top. I carved out the code under the
resolve_default_idx: label into a separate function. This made
resolve_seg_reg() pretty much trivial to follow. Also renamed some
functions and variables to better denote what they do.

Please add

Improvements-by: Borislav Petkov <bp@suse.de>

to your commit message if you use this. Thanks.

---
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 77b48f99d73a..d02b94ace0f1 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -49,7 +49,7 @@ static bool is_string_insn(struct insn *insn)
 }
 
 /**
- * get_overridden_seg_reg_idx() - obtain segment register override index
+ * get_seg_reg_override_idx() - obtain segment register override index
  * @insn:	Instruction with segment override prefixes
  *
  * Inspect the instruction prefixes and find segment overrides, if any.
@@ -62,10 +62,10 @@ static bool is_string_insn(struct insn *insn)
  *
  * -EINVAL in case of error.
  */
-static int get_overridden_seg_reg_idx(struct insn *insn)
+static int get_seg_reg_override_idx(struct insn *insn)
 {
 	int idx = INAT_SEG_REG_DEFAULT;
-	int sel_overrides = 0, i;
+	int num_overrides = 0, i;
 
 	if (!insn)
 		return -EINVAL;
@@ -80,41 +80,41 @@ static int get_overridden_seg_reg_idx(struct insn *insn)
 		switch (attr) {
 		case INAT_MAKE_PREFIX(INAT_PFX_CS):
 			idx = INAT_SEG_REG_CS;
-			sel_overrides++;
+			num_overrides++;
 			break;
 		case INAT_MAKE_PREFIX(INAT_PFX_SS):
 			idx = INAT_SEG_REG_SS;
-			sel_overrides++;
+			num_overrides++;
 			break;
 		case INAT_MAKE_PREFIX(INAT_PFX_DS):
 			idx = INAT_SEG_REG_DS;
-			sel_overrides++;
+			num_overrides++;
 			break;
 		case INAT_MAKE_PREFIX(INAT_PFX_ES):
 			idx = INAT_SEG_REG_ES;
-			sel_overrides++;
+			num_overrides++;
 			break;
 		case INAT_MAKE_PREFIX(INAT_PFX_FS):
 			idx = INAT_SEG_REG_FS;
-			sel_overrides++;
+			num_overrides++;
 			break;
 		case INAT_MAKE_PREFIX(INAT_PFX_GS):
 			idx = INAT_SEG_REG_GS;
-			sel_overrides++;
+			num_overrides++;
 			break;
 		/* No default action needed. */
 		}
 	}
 
 	/* More than one segment override prefix leads to undefined behavior. */
-	if (sel_overrides > 1)
+	if (num_overrides > 1)
 		return -EINVAL;
 
 	return idx;
 }
 
 /**
- * allow_seg_reg_overrides() - check if segment override prefixes are allowed
+ * check_seg_overrides() - check if segment override prefixes are allowed
  * @insn:	Instruction with segment override prefixes
  * @regoff:	Operand offset, in pt_regs, for which the check is performed
  *
@@ -129,7 +129,7 @@ static int get_overridden_seg_reg_idx(struct insn *insn)
  *
  * -EINVAL in case of error.
  */
-static int allow_seg_reg_overrides(struct insn *insn, int regoff)
+static int check_seg_overrides(struct insn *insn, int regoff)
 {
 	/*
 	 * Segment override prefixes should not be used for rIP. It is not
@@ -148,6 +148,55 @@ static int allow_seg_reg_overrides(struct insn *insn, int regoff)
 	return 1;
 }
 
+static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
+{
+	if (user_64bit_mode(regs))
+		return INAT_SEG_REG_IGNORE;
+
+	/*
+	 * If we are here, we use the default segment register as described
+	 * in the Intel documentation:
+	 *
+	 *  + DS for all references involving r[ABCD]X, and rSI.
+	 *  + If used in a string instruction, ES for rDI. Otherwise, DS.
+	 *  + AX, CX and DX are not valid register operands in 16-bit address
+	 *    encodings but are valid for 32-bit and 64-bit encodings.
+	 *  + -EDOM is reserved to identify cases in which no register
+	 *    is used (i.e., displacement-only addressing). Use DS.
+	 *  + SS for (E)SP or (E)BP.
+	 *  + CS for (E)IP.
+	 */
+	switch (off) {
+	case offsetof(struct pt_regs, ax):
+	case offsetof(struct pt_regs, cx):
+	case offsetof(struct pt_regs, dx):
+		/* Need insn to verify address size. */
+		if (insn->addr_bytes == 2)
+			return -EINVAL;
+
+	case -EDOM:
+	case offsetof(struct pt_regs, bx):
+	case offsetof(struct pt_regs, si):
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, di):
+		if (is_string_insn(insn))
+			return INAT_SEG_REG_ES;
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, bp):
+	case offsetof(struct pt_regs, sp):
+		return INAT_SEG_REG_SS;
+
+	case offsetof(struct pt_regs, ip):
+		return INAT_SEG_REG_CS;
+
+	default:
+		return -EINVAL;
+	}
+}
+
+
 /**
  * resolve_seg_reg() - obtain segment register index
  * @insn:	Instruction with operands
@@ -194,24 +243,24 @@ static int allow_seg_reg_overrides(struct insn *insn, int regoff)
  */
 static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
 {
-	int use_pfx_overrides, idx;
+	int ret, idx;
 
-	use_pfx_overrides = allow_seg_reg_overrides(insn, regoff);
-	if (use_pfx_overrides < 0)
-		return use_pfx_overrides;
+	ret = check_seg_overrides(insn, regoff);
+	if (ret < 0)
+		return ret;
 
-	if (use_pfx_overrides == 0)
-		goto resolve_default_idx;
+	if (!ret)
+		return resolve_default_seg(insn, regs, regoff);
 
 	if (!insn)
 		return -EINVAL;
 
-	idx = get_overridden_seg_reg_idx(insn);
+	idx = get_seg_reg_override_idx(insn);
 	if (idx < 0)
 		return idx;
 
 	if (idx == INAT_SEG_REG_DEFAULT)
-		goto resolve_default_idx;
+		return resolve_default_seg(insn, regs, regoff);
 
 	/*
 	 * In long mode, segment override prefixes are ignored, except for
@@ -224,53 +273,6 @@ static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
 	}
 
 	return idx;
-
-resolve_default_idx:
-
-	if (user_64bit_mode(regs))
-		return INAT_SEG_REG_IGNORE;
-	/*
-	 * If we are here, we use the default segment register as described
-	 * in the Intel documentation:
-	 *
-	 *  + DS for all references involving r[ABCD]X, and rSI.
-	 *  + If used in a string instruction, ES for rDI. Otherwise, DS.
-	 *  + AX, CX and DX are not valid register operands in 16-bit addresses.
-	 *    encodings but are valid for 32-bit and 64-bit encodings.
-	 *  + -EDOM is reserved to identify for cases in which no register
-	 *    is used (i.e., displacement-only addressing). Use DS.
-	 *  + SS for (E)SP or (E)BP.
-	 *  + CS for (E)IP.
-	 */
-
-	switch (regoff) {
-	case offsetof(struct pt_regs, ax):
-	case offsetof(struct pt_regs, cx):
-	case offsetof(struct pt_regs, dx):
-		/* Need insn to verify address size. */
-		if (insn->addr_bytes == 2)
-			return -EINVAL;
-
-	case -EDOM:
-	case offsetof(struct pt_regs, bx):
-	case offsetof(struct pt_regs, si):
-		return INAT_SEG_REG_DS;
-
-	case offsetof(struct pt_regs, di):
-		if (is_string_insn(insn))
-			return INAT_SEG_REG_ES;
-		return INAT_SEG_REG_DS;
-
-	case offsetof(struct pt_regs, bp):
-	case offsetof(struct pt_regs, sp):
-		return INAT_SEG_REG_SS;
-
-	case offsetof(struct pt_regs, ip):
-		return INAT_SEG_REG_CS;
-
-	default:
-		return -EINVAL;
-	}
 }
 
 /**

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

* Re: [PATCH v9 14/29] x86/insn-eval: Add utility function to get segment descriptor
  2017-10-04  3:54 ` [PATCH v9 14/29] x86/insn-eval: Add utility function to get segment descriptor Ricardo Neri
@ 2017-10-11 14:57   ` Borislav Petkov
  2017-10-12  0:45     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-11 14:57 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 03, 2017 at 08:54:17PM -0700, Ricardo Neri wrote:
> The segment descriptor contains information that is relevant to how linear
> addresses need to be computed. It contains the default size of addresses
> as well as the base address of the segment. Thus, given a segment
> selector, we ought to look at segment descriptor to correctly calculate
> the linear address.
> 
> In protected mode, the segment selector might indicate a segment
> descriptor from either the global descriptor table or a local descriptor
> table. Both cases are considered in this function.

...

> +static struct desc_struct *get_desc(unsigned short sel)
> +{
> +	struct desc_ptr gdt_desc = {0, 0};
> +	unsigned long desc_base;
> +
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
> +	struct desc_struct *desc = NULL;
> +	struct ldt_struct *ldt;

You moved those out of the if-statement even though they're needed only
in that scope. Why?

Here's a diff that moves them back there and improves the function comment.

---
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 62975f825556..c4e82bb4c4d3 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -456,11 +456,11 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
- * get_desc() - Obtain address of segment descriptor
+ * get_desc() - Obtain pointer to a segment descriptor
  * @sel:	Segment selector
  *
- * Given a segment selector, obtain a pointer to the segment descriptor.
- * Both global and local descriptor tables are supported.
+ * Given a segment selector, obtain a pointer to the corresponding segment
+ * descriptor. Both global and local descriptor tables are supported.
  *
  * Returns:
  *
@@ -474,10 +474,10 @@ static struct desc_struct *get_desc(unsigned short sel)
 	unsigned long desc_base;
 
 #ifdef CONFIG_MODIFY_LDT_SYSCALL
-	struct desc_struct *desc = NULL;
-	struct ldt_struct *ldt;
-
 	if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+		struct desc_struct *desc = NULL;
+		struct ldt_struct *ldt;
+
 		/* Bits [15:3] contain the index of the desired entry. */
 		sel >>= 3;
 
-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

* Re: [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit
  2017-10-04  3:54 ` [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit Ricardo Neri
@ 2017-10-11 15:15   ` Borislav Petkov
  2017-10-11 19:57     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-11 15:15 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 03, 2017 at 08:54:18PM -0700, Ricardo Neri wrote:
> With segmentation, the base address of the segment is needed to compute a
> linear address. This base address is obtained from the applicable segment
> descriptor. Such segment descriptor is referenced from a segment selector.

...

> +unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
> +{
> +	struct desc_struct *desc;
> +	short sel;
> +
> +	sel = get_segment_selector(regs, seg_reg_idx);
> +	if (sel < 0)
> +		return -1L;
> +
> +	if (v8086_mode(regs))
> +		/*
> +		 * Base is simply the segment selector shifted 4
> +		 * positions to the right.
> +		 */
> +		return (unsigned long)(sel << 4);
> +
> +	if (user_64bit_mode(regs)) {
> +		/*
> +		 * Only FS or GS will have a base address, the rest of
> +		 * the segments' bases are forced to 0.
> +		 */
> +		unsigned long base;
> +
> +		if (seg_reg_idx == INAT_SEG_REG_FS)
> +			rdmsrl(MSR_FS_BASE, base);
> +		else if (seg_reg_idx == INAT_SEG_REG_GS)
> +			/*
> +			 * swapgs was called at the kernel entry point. Thus,
> +			 * MSR_KERNEL_GS_BASE will have the user-space GS base.
> +			 */
> +			rdmsrl(MSR_KERNEL_GS_BASE, base);
> +		else if (seg_reg_idx != INAT_SEG_REG_IGNORE)
> +			/* We should ignore the rest of segment registers. */
> +			base = -1L;

When is that case ever possible in long mode? You either have GS/FS
bases or 0. What's the meaning of -1L in that case?

Otherwise just minor things:

---
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 7ba5379a2923..e7e82b343bd0 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -515,13 +515,13 @@ static struct desc_struct *get_desc(unsigned short sel)
  *
  * Obtain the base address of the segment as indicated by the segment descriptor
  * pointed by the segment selector. The segment selector is obtained from the
- * input segment register index seg_reg_idx.
+ * input segment register index @seg_reg_idx.
  *
  * Returns:
  *
  * In protected mode, base address of the segment. Zero in long mode,
  * except when FS or GS are used. In virtual-8086 mode, the segment
- * selector shifted 4 positions to the right.
+ * selector shifted 4 bits to the right.
  *
  * -1L in case of error.
  */
@@ -537,7 +537,7 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
 	if (v8086_mode(regs))
 		/*
 		 * Base is simply the segment selector shifted 4
-		 * positions to the right.
+		 * bits to the right.
 		 */
 		return (unsigned long)(sel << 4);
 
-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

* Re: [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit
  2017-10-11 15:15   ` Borislav Petkov
@ 2017-10-11 19:57     ` Ricardo Neri
  2017-10-11 20:16       ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-11 19:57 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, 2017-10-11 at 17:15 +0200, Borislav Petkov wrote:
> On Tue, Oct 03, 2017 at 08:54:18PM -0700, Ricardo Neri wrote:
> > 
> > With segmentation, the base address of the segment is needed to compute a
> > linear address. This base address is obtained from the applicable segment
> > descriptor. Such segment descriptor is referenced from a segment selector.
> ...
> 
> > 
> > +unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
> > +{
> > +	struct desc_struct *desc;
> > +	short sel;
> > +
> > +	sel = get_segment_selector(regs, seg_reg_idx);
> > +	if (sel < 0)
> > +		return -1L;
> > +
> > +	if (v8086_mode(regs))
> > +		/*
> > +		 * Base is simply the segment selector shifted 4
> > +		 * positions to the right.
> > +		 */
> > +		return (unsigned long)(sel << 4);
> > +
> > +	if (user_64bit_mode(regs)) {
> > +		/*
> > +		 * Only FS or GS will have a base address, the rest of
> > +		 * the segments' bases are forced to 0.
> > +		 */
> > +		unsigned long base;
> > +
> > +		if (seg_reg_idx == INAT_SEG_REG_FS)
> > +			rdmsrl(MSR_FS_BASE, base);
> > +		else if (seg_reg_idx == INAT_SEG_REG_GS)
> > +			/*
> > +			 * swapgs was called at the kernel entry point. Thus,
> > +			 * MSR_KERNEL_GS_BASE will have the user-space GS base.
> > +			 */
> > +			rdmsrl(MSR_KERNEL_GS_BASE, base);
> > +		else if (seg_reg_idx != INAT_SEG_REG_IGNORE)
> > +			/* We should ignore the rest of segment registers. */
> > +			base = -1L;
> When is that case ever possible in long mode? You either have GS/FS
> bases or 0. What's the meaning of -1L in that case?

This is meant to be an error case. In long mode, only INAT_SEG_REG_IGNORE/FS/GS
are valid. All other indices are invalid.

Perhaps we could return -EINVAL instead?

> Otherwise just minor things:
> 
> ---
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 7ba5379a2923..e7e82b343bd0 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -515,13 +515,13 @@ static struct desc_struct *get_desc(unsigned short sel)
>   *
>   * Obtain the base address of the segment as indicated by the segment descriptor
>   * pointed by the segment selector. The segment selector is obtained from the
> - * input segment register index seg_reg_idx.
> + * input segment register index @seg_reg_idx.
>   *
>   * Returns:
>   *
>   * In protected mode, base address of the segment. Zero in long mode,
>   * except when FS or GS are used. In virtual-8086 mode, the segment
> - * selector shifted 4 positions to the right.
> + * selector shifted 4 bits to the right.
>   *
>   * -1L in case of error.
>   */
> @@ -537,7 +537,7 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
>  	if (v8086_mode(regs))
>  		/*
>  		 * Base is simply the segment selector shifted 4
> -		 * positions to the right.
> +		 * bits to the right.
>  		 */
>  		return (unsigned long)(sel << 4);

Thanks! I will incorporate these changes along with your Improvements-by tag.
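
A side note on the comment wording in the hunk above: as far as I can tell,
`sel << 4` shifts the selector to the left (it multiplies it by 16), which is
how a real-mode or virtual-8086 segment base is formed. Below is a minimal
user-space sketch of that arithmetic. The function names are made up for
illustration, and the selector is taken as an unsigned 16-bit value here (an
assumption of this sketch, not what the patch does) so that selectors with
bit 15 set are not sign-extended.

```c
#include <stdint.h>

/*
 * Virtual-8086 (and real-mode) segment base: selector * 16.
 * The selector is widened before the shift so that values with
 * bit 15 set are not sign-extended.
 */
static unsigned long v8086_seg_base(uint16_t sel)
{
	return (unsigned long)sel << 4;
}

/* Linear address of sel:off; address wraparound is deliberately ignored. */
static unsigned long v8086_linear(uint16_t sel, uint16_t off)
{
	return v8086_seg_base(sel) + off;
}
```

For example, selector 0xF000 with offset 0xFFF0 gives linear address 0xFFFF0.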

BR,
Ricardo

* Re: [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit
  2017-10-11 19:57     ` Ricardo Neri
@ 2017-10-11 20:16       ` Borislav Petkov
  2017-10-12  1:24         ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-11 20:16 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, Oct 11, 2017 at 12:57:01PM -0700, Ricardo Neri wrote:
> This is meant to be an error case. In long mode, only INAT_SEG_REG_IGNORE/FS/GS
> are valid. All other indices are invalid.
> 
> Perhaps we could return -EINVAL instead?

So, my question is, when are you ever going to have that case? What
constellation of events would ever hit this else branch for long mode?
Because it looks impossible to me. What I can imagine only is something
like this:

                else if (seg_reg != INAT_SEG_REG_IGNORE)
			WARN_ONCE(1, "This should never happen!\n");

assertion.

But you don't really need that - you can simply ignore seg_reg in that
case:

        if (user_64bit_mode(regs)) {
                /*
                 * Only FS or GS will have a base address, the rest of
                 * the segments' bases are forced to 0.
                 */
                unsigned long base;

                if (seg_reg == INAT_SEG_REG_FS)
                        rdmsrl(MSR_FS_BASE, base);
                else if (seg_reg == INAT_SEG_REG_GS)
                        /*
                         * swapgs was called at the kernel entry point. Thus,
                         * MSR_KERNEL_GS_BASE will have the user-space GS base.
                         */
                        rdmsrl(MSR_KERNEL_GS_BASE, base);
                else
                        base = 0;

                return base;
        }

Or am I missing something?
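
A user-space model of that simplification, in case it helps the discussion.
This is only a sketch: the enum values and the fake_msrs structure are
hypothetical stand-ins (the real code reads MSR_FS_BASE and MSR_KERNEL_GS_BASE
with rdmsrl(), which cannot be done from user space), and the index values
need not match the kernel's INAT_SEG_REG_* constants.

```c
#include <stdint.h>

/* Illustrative indices; the kernel's INAT_SEG_REG_* values may differ. */
enum seg_idx { SEG_IGNORE, SEG_CS, SEG_SS, SEG_DS, SEG_ES, SEG_FS, SEG_GS };

/* Stand-in for the two base MSRs that rdmsrl() would read in the kernel. */
struct fake_msrs {
	uint64_t fs_base;
	uint64_t user_gs_base;	/* held in MSR_KERNEL_GS_BASE after swapgs */
};

static uint64_t long_mode_seg_base(const struct fake_msrs *m, enum seg_idx idx)
{
	switch (idx) {
	case SEG_FS:
		return m->fs_base;
	case SEG_GS:
		return m->user_gs_base;
	default:
		/* Every other index is treated as having base 0. */
		return 0;
	}
}
```

With this shape there is no error return in the long-mode path at all: any
index other than FS or GS simply yields a base of 0.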

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

* Re: [PATCH v9 14/29] x86/insn-eval: Add utility function to get segment descriptor
  2017-10-11 14:57   ` Borislav Petkov
@ 2017-10-12  0:45     ` Ricardo Neri
  0 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-12  0:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, 2017-10-11 at 16:57 +0200, Borislav Petkov wrote:
> On Tue, Oct 03, 2017 at 08:54:17PM -0700, Ricardo Neri wrote:
> > 
> > The segment descriptor contains information that is relevant to how linear
> > addresses need to be computed. It contains the default size of addresses
> > as well as the base address of the segment. Thus, given a segment
> > selector, we ought to look at segment descriptor to correctly calculate
> > the linear address.
> > 
> > In protected mode, the segment selector might indicate a segment
> > descriptor from either the global descriptor table or a local descriptor
> > table. Both cases are considered in this function.
> ...
> 
> > 
> > +static struct desc_struct *get_desc(unsigned short sel)
> > +{
> > +	struct desc_ptr gdt_desc = {0, 0};
> > +	unsigned long desc_base;
> > +
> > +#ifdef CONFIG_MODIFY_LDT_SYSCALL
> > +	struct desc_struct *desc = NULL;
> > +	struct ldt_struct *ldt;
> You moved those out of the if-statement even though they're needed only
> in that scope. Why?

No reason, this is not correct. I will move them into the if-statement.


> Here's a diff that moves them back there and improves the function comment.
> 
> ---
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 62975f825556..c4e82bb4c4d3 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -456,11 +456,11 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
>  }
>  
>  /**
> - * get_desc() - Obtain address of segment descriptor
> + * get_desc() - Obtain pointer to a segment descriptor
>   * @sel:	Segment selector
>   *
> - * Given a segment selector, obtain a pointer to the segment descriptor.
> - * Both global and local descriptor tables are supported.
> + * Given a segment selector, obtain a pointer to the corresponding segment
> + * descriptor. Both global and local descriptor tables are supported.
>   *
>   * Returns:
>   *
> @@ -474,10 +474,10 @@ static struct desc_struct *get_desc(unsigned short sel)
>  	unsigned long desc_base;
>  
>  #ifdef CONFIG_MODIFY_LDT_SYSCALL
> -	struct desc_struct *desc = NULL;
> -	struct ldt_struct *ldt;
> -
>  	if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
> +		struct desc_struct *desc = NULL;
> +		struct ldt_struct *ldt;
> +
>  		/* Bits [15:3] contain the index of the desired entry. */
>  		sel >>= 3;
> 

I will take your diff along with your Improvements-by: tag.
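
For reference, the selector fields that get_desc() looks at in the hunk above
(the table-indicator bit and the index in bits [15:3]) can be sketched in user
space as follows. The macro, struct and function names are local stand-ins
chosen for this example, not the kernel's SEGMENT_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins mirroring the x86 selector layout; not kernel headers. */
#define SEL_RPL_MASK	0x3	/* bits [1:0]: requested privilege level */
#define SEL_TI_MASK	0x4	/* bit 2: table indicator */
#define SEL_TI_LDT	0x4	/* TI set means the selector refers to the LDT */

struct sel_fields {
	unsigned int index;	/* bits [15:3]: entry in the GDT or LDT */
	bool in_ldt;
	unsigned int rpl;
};

static struct sel_fields decode_selector(uint16_t sel)
{
	struct sel_fields f;

	f.rpl = sel & SEL_RPL_MASK;
	f.in_ldt = (sel & SEL_TI_MASK) == SEL_TI_LDT;
	f.index = sel >> 3;

	return f;
}
```

As an example, selector 0x33 decodes to GDT entry 6 with RPL 3, while 0x0f
decodes to LDT entry 1 with RPL 3.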

Thanks and BR,
Ricardo

* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-10 22:41   ` Borislav Petkov
@ 2017-10-12  1:12     ` Ricardo Neri
  2017-10-12  9:48       ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-12  1:12 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, 2017-10-11 at 00:41 +0200, Borislav Petkov wrote:
> On Tue, Oct 03, 2017 at 08:54:16PM -0700, Ricardo Neri wrote:
> > 
> > When computing a linear address and segmentation is used, we need to know
> > the base address of the segment involved in the computation. In most of
> > the cases, the segment base address will be zero as in USER_DS/USER32_DS.
> ...
> 
> > 
> > ---
> >  arch/x86/include/asm/inat.h |  10 ++
> >  arch/x86/lib/insn-eval.c    | 321 ++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 331 insertions(+)
> Ok, some more fixes ontop. I carved out the code under the
> resolve_default_idx: label into a separate function. This made
> resolve_seg_reg() pretty-much trivial to follow. Also renamed some
> functions and variables to better denote what they do.

Thanks! I will take your changes.
> 
> Please add
> 
> Improvements-by: Borislav Petkov <bp@suse.de>
> 
> to your commit message if you use this. Thanks.

Will do.
> 
> ---
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 77b48f99d73a..d02b94ace0f1 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -49,7 +49,7 @@ static bool is_string_insn(struct insn *insn)
>  }
>  
>  /**
> - * get_overridden_seg_reg_idx() - obtain segment register override index
> + * get_seg_reg_override_idx() - obtain segment register override index
>   * @insn:	Instruction with segment override prefixes
>   *
>   * Inspect the instruction prefixes and find segment overrides, if any.
> @@ -62,10 +62,10 @@ static bool is_string_insn(struct insn *insn)
>   *
>   * -EINVAL in case of error.
>   */
> -static int get_overridden_seg_reg_idx(struct insn *insn)
> +static int get_seg_reg_override_idx(struct insn *insn)
>  {
>  	int idx = INAT_SEG_REG_DEFAULT;
> -	int sel_overrides = 0, i;
> +	int num_overrides = 0, i;
>  
>  	if (!insn)
>  		return -EINVAL;
> @@ -80,41 +80,41 @@ static int get_overridden_seg_reg_idx(struct insn *insn)
>  		switch (attr) {
>  		case INAT_MAKE_PREFIX(INAT_PFX_CS):
>  			idx = INAT_SEG_REG_CS;
> -			sel_overrides++;
> +			num_overrides++;
>  			break;
>  		case INAT_MAKE_PREFIX(INAT_PFX_SS):
>  			idx = INAT_SEG_REG_SS;
> -			sel_overrides++;
> +			num_overrides++;
>  			break;
>  		case INAT_MAKE_PREFIX(INAT_PFX_DS):
>  			idx = INAT_SEG_REG_DS;
> -			sel_overrides++;
> +			num_overrides++;
>  			break;
>  		case INAT_MAKE_PREFIX(INAT_PFX_ES):
>  			idx = INAT_SEG_REG_ES;
> -			sel_overrides++;
> +			num_overrides++;
>  			break;
>  		case INAT_MAKE_PREFIX(INAT_PFX_FS):
>  			idx = INAT_SEG_REG_FS;
> -			sel_overrides++;
> +			num_overrides++;
>  			break;
>  		case INAT_MAKE_PREFIX(INAT_PFX_GS):
>  			idx = INAT_SEG_REG_GS;
> -			sel_overrides++;
> +			num_overrides++;
>  			break;
>  		/* No default action needed. */
>  		}
>  	}
>  
>  	/* More than one segment override prefix leads to undefined behavior. */
> -	if (sel_overrides > 1)
> +	if (num_overrides > 1)
>  		return -EINVAL;
>  
>  	return idx;
>  }
>  
>  /**
> - * allow_seg_reg_overrides() - check if segment override prefixes are allowed
> + * check_seg_overrides() - check if segment override prefixes are allowed
>   * @insn:	Instruction with segment override prefixes
>   * @regoff:	Operand offset, in pt_regs, for which the check is performed
>   *
> @@ -129,7 +129,7 @@ static int get_overridden_seg_reg_idx(struct insn *insn)
>   *
>   * -EINVAL in case of error.
>   */
> -static int allow_seg_reg_overrides(struct insn *insn, int regoff)
> +static int check_seg_overrides(struct insn *insn, int regoff)
>  {
>  	/*
>  	 * Segment override prefixes should not be used for rIP. It is not
> @@ -148,6 +148,55 @@ static int allow_seg_reg_overrides(struct insn *insn, int regoff)
>  	return 1;
>  }
>  
> +static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
> +{

Shouldn't this function check for a null insn since it is used here?

> +	if (user_64bit_mode(regs))
> +		return INAT_SEG_REG_IGNORE;
> +
> +	/*
> +	 * If we are here, we use the default segment register as described
> +	 * in the Intel documentation:
> +	 *
> +	 *  + DS for all references involving r[ABCD]X, and rSI.
> +	 *  + If used in a string instruction, ES for rDI. Otherwise, DS.
> +	 *  + AX, CX and DX are not valid register operands in 16-bit addresses.
> +	 *    encodings but are valid for 32-bit and 64-bit encodings.
> +	 *  + -EDOM is reserved to identify for cases in which no register
> +	 *    is used (i.e., displacement-only addressing). Use DS.
> +	 *  + SS for (E)SP or (E)BP.
> +	 *  + CS for (E)IP.
> +	 */
> +	switch (off) {
> +	case offsetof(struct pt_regs, ax):
> +	case offsetof(struct pt_regs, cx):
> +	case offsetof(struct pt_regs, dx):
> +		/* Need insn to verify address size. */
> +		if (insn->addr_bytes == 2)
> +			return -EINVAL;
> +
> +	case -EDOM:
> +	case offsetof(struct pt_regs, bx):
> +	case offsetof(struct pt_regs, si):
> +		return INAT_SEG_REG_DS;
> +
> +	case offsetof(struct pt_regs, di):
> +		if (is_string_insn(insn))
> +			return INAT_SEG_REG_ES;
> +		return INAT_SEG_REG_DS;
> +
> +	case offsetof(struct pt_regs, bp):
> +	case offsetof(struct pt_regs, sp):
> +		return INAT_SEG_REG_SS;
> +
> +	case offsetof(struct pt_regs, ip):
> +		return INAT_SEG_REG_CS;
> +
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +
>  /**
>   * resolve_seg_reg() - obtain segment register index
>   * @insn:	Instruction with operands
> @@ -194,24 +243,24 @@ static int allow_seg_reg_overrides(struct insn *insn, int regoff)
>   */
>  static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
>  {
> -	int use_pfx_overrides, idx;
> +	int ret, idx;
>  
> -	use_pfx_overrides = allow_seg_reg_overrides(insn, regoff);
> -	if (use_pfx_overrides < 0)
> -		return use_pfx_overrides;
> +	ret = check_seg_overrides(insn, regoff);
> +	if (ret < 0)
> +		return ret;
>  
> -	if (use_pfx_overrides == 0)
> -		goto resolve_default_idx;
> +	if (!ret)
> +		return resolve_default_seg(insn, regs, regoff);
>  
>  	if (!insn)
>  		return -EINVAL;

Could this check be removed? insn is not used for anything but passed to other
functions that do perform this check.
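
To make the NULL question concrete, here is a user-space sketch of the
default-segment table from the quoted comment, arranged so that the cases that
actually read from insn sit behind a single check. This is just one possible
arrangement, not what the patch does. The enums and struct fake_insn are
hypothetical stand-ins for the pt_regs offsets, the INAT_SEG_REG_* indices and
struct insn, and the 64-bit-mode shortcut is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for pt_regs offsets and INAT_SEG_REG_* indices. */
enum reg { REG_NONE, REG_AX, REG_BX, REG_CX, REG_DX, REG_SI, REG_DI,
	   REG_BP, REG_SP, REG_IP };
enum seg { SEG_CS = 1, SEG_SS, SEG_DS, SEG_ES };

struct fake_insn {
	int addr_bytes;		/* 2, 4 or 8 */
	bool is_string;		/* MOVS, STOS, ... */
};

static int default_seg(const struct fake_insn *insn, enum reg reg)
{
	/* These cases never look at insn. */
	switch (reg) {
	case REG_IP:
		return SEG_CS;
	case REG_BP:
	case REG_SP:
		return SEG_SS;
	case REG_NONE:		/* displacement-only addressing */
	case REG_BX:
	case REG_SI:
		return SEG_DS;
	default:
		break;
	}

	/* Every remaining case reads a field of insn. */
	if (!insn)
		return -EINVAL;

	switch (reg) {
	case REG_AX:
	case REG_CX:
	case REG_DX:
		/* Not valid register operands in 16-bit addressing. */
		return insn->addr_bytes == 2 ? -EINVAL : SEG_DS;
	case REG_DI:
		return insn->is_string ? SEG_ES : SEG_DS;
	default:
		return -EINVAL;
	}
}
```

In this sketch a NULL insn is only an error for rAX, rCX, rDX and rDI, which
are the cases where the quoted function dereferences insn (directly via
addr_bytes, or through is_string_insn()).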

Thanks and BR,
Ricardo

* Re: [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit
  2017-10-11 20:16       ` Borislav Petkov
@ 2017-10-12  1:24         ` Ricardo Neri
  2017-10-12 16:02           ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-12  1:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, 2017-10-11 at 22:16 +0200, Borislav Petkov wrote:
> On Wed, Oct 11, 2017 at 12:57:01PM -0700, Ricardo Neri wrote:
> > 
> > This is meant to be an error case. In long mode, only INAT_SEG_REG_IGNORE/FS/GS
> > are valid. All other indices are invalid.
> > 
> > Perhaps we could return -EINVAL instead?
> So, my question is, when are you ever going to have that case? What
> constellation of events would ever hit this else branch for long mode?
> Because it looks impossible to me. What I can imagine only is something
> like this:
> 
>                 else if (seg_reg != INAT_SEG_REG_IGNORE)
> 			WARN_ONCE(1, "This should never happen!\n");
> 
> assertion.

To clarify, I think you mean seg_reg_idx.

Yes, it would be impossible to hit this else branch provided that callers don't
attempt to use an invalid seg_reg_idx while in long mode. Probably this is not
critical as this is a static function and as such we control who can call it and
make sure seg_reg_idx is always valid (i.e., INAT_SEG_REG_IGNORE/FS/GS in long
mode).



> But you don't really need that - you can simply ignore seg_reg in that
> case:
> 
>         if (user_64bit_mode(regs)) {
>                 /*
>                  * Only FS or GS will have a base address, the rest of
>                  * the segments' bases are forced to 0.
>                  */
>                 unsigned long base;
> 
>                 if (seg_reg == INAT_SEG_REG_FS)
>                         rdmsrl(MSR_FS_BASE, base);
>                 else if (seg_reg == INAT_SEG_REG_GS)
>                         /*
>                          * swapgs was called at the kernel entry point. Thus,
>                          * MSR_KERNEL_GS_BASE will have the user-space GS base.
>                          */
>                         rdmsrl(MSR_KERNEL_GS_BASE, base);
>                 else
>                         base = 0;
> 
>                 return base;
>         }
> 
> Or am I missing something?

My intention is to let the caller know about the invalid seg_reg_idx instead of
silently correcting the caller's input by ignoring seg_reg_idx.

On the other hand, in long mode, hardware ignores all segment registers except FS
and GS.

Hence, I guess I can remove the check in question.

Thanks and BR,
Ricardo

* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-12  1:12     ` Ricardo Neri
@ 2017-10-12  9:48       ` Borislav Petkov
  2017-10-13  1:08         ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-12  9:48 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, Oct 11, 2017 at 06:12:30PM -0700, Ricardo Neri wrote:
> Shouldn't this function check for a null insn since it is used here?

I have to say, this whole codepath from insn_get_seg_base() with
insn==NULL is nasty but I don't see a way around it as we need to know
how many bytes to copy and from where. Can't think of a better solution
without duplicating a lot of code. :-\

So how about this?

If the patch is hard to read, you can apply it and look at the code. But
here's the gist:

* You pull up the rIP check and do that directly in resolve_seg_reg()
and return INAT_SEG_REG_CS there immediately so you don't have to call
resolve_default_seg().

This way, you get the only case out of the way where insn can be NULL.

Then you can do the if (!insn) check once and now you have a valid insn.

check_seg_overrides() can then return simply bool and you can get rid of
the remaining if (!insn) checks down the road.

But please double-check me if I missed a case - the flow is not trivial.
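
The ordering described in the gist above can be modelled in user space like
this, which may make it easier to check for missed cases. It is a simplified
sketch under stated assumptions: the enums and struct fake_insn are
hypothetical stand-ins, only three registers are modelled, and the long-mode
handling of override prefixes is omitted.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
enum reg { REG_BX, REG_DI, REG_IP };
enum seg { SEG_DEFAULT, SEG_CS, SEG_DS, SEG_ES, SEG_FS };

struct fake_insn {
	bool is_string;		/* string instruction (MOVS, STOS, ...) */
	enum seg override;	/* SEG_DEFAULT when no override prefix */
};

static int fake_default_seg(const struct fake_insn *insn, enum reg reg)
{
	if (reg == REG_DI && insn->is_string)
		return SEG_ES;
	return SEG_DS;
}

static bool overrides_allowed(const struct fake_insn *insn, enum reg reg)
{
	/* ES is fixed for rDI in string instructions. */
	return !(reg == REG_DI && insn->is_string);
}

static int resolve_seg(const struct fake_insn *insn, enum reg reg)
{
	/* 1. rIP always uses CS; the only path where insn may be NULL. */
	if (reg == REG_IP)
		return SEG_CS;

	/* 2. One NULL check covers everything below. */
	if (!insn)
		return -EINVAL;

	/* 3. Overrides not permitted: fall back to the default segment. */
	if (!overrides_allowed(insn, reg))
		return fake_default_seg(insn, reg);

	/* 4. Use the override prefix if there is one. */
	if (insn->override != SEG_DEFAULT)
		return insn->override;

	return fake_default_seg(insn, reg);
}
```

In this model an FS override is honoured for rDI in a non-string instruction
and ignored for rDI in a string instruction, and the helpers below the NULL
check can assume a valid insn.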

Thx.

---
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index e7e82b343bd0..3c65fa9178bc 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -71,9 +71,6 @@ static int get_seg_reg_override_idx(struct insn *insn)
 	int idx = INAT_SEG_REG_DEFAULT;
 	int num_overrides = 0, i;
 
-	if (!insn)
-		return -EINVAL;
-
 	insn_get_prefixes(insn);
 
 	/* Look for any segment override prefixes. */
@@ -128,28 +125,16 @@ static int get_seg_reg_override_idx(struct insn *insn)
  *
  * Returns:
  *
- * 1 if segment override prefixes can be used with the register indicated
- * in regoff. 0 if otherwise.
+ * True if segment override prefixes can be used with the register indicated
+ * in regoff. False if otherwise.
  *
- * -EINVAL in case of error.
  */
-static int check_seg_overrides(struct insn *insn, int regoff)
+static bool check_seg_overrides(struct insn *insn, int regoff)
 {
-	/*
-	 * Segment override prefixes should not be used for rIP. It is not
-	 * necessary to inspect the instruction structure.
-	 */
-	if (regoff == offsetof(struct pt_regs, ip))
-		return 0;
-
-	/* Subsequent checks require a valid insn. */
-	if (!insn)
-		return -EINVAL;
-
 	if (regoff == offsetof(struct pt_regs, di) && is_string_insn(insn))
-		return 0;
+		return false;
 
-	return 1;
+	return true;
 }
 
 static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
@@ -247,18 +232,21 @@ static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
  */
 static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
 {
-	int ret, idx;
+	int idx;
 
-	ret = check_seg_overrides(insn, regoff);
-	if (ret < 0)
-		return ret;
-
-	if (!ret)
-		return resolve_default_seg(insn, regs, regoff);
+	/*
+	 * Segment override prefixes should not be used for rIP. It is not
+	 * necessary to inspect the instruction.
+	 */
+	if (regoff == offsetof(struct pt_regs, ip))
+		return INAT_SEG_REG_CS;
 
 	if (!insn)
 		return -EINVAL;
 
+	if (!check_seg_overrides(insn, regoff))
+		return resolve_default_seg(insn, regs, regoff);
+
 	idx = get_seg_reg_override_idx(insn);
 	if (idx < 0)
 		return idx;

> Could this check be removed? insn is not used for anything but passed to other
> functions that do perform this check.

Yap, eventually. :)

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

* Re: [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit
  2017-10-12  1:24         ` Ricardo Neri
@ 2017-10-12 16:02           ` Borislav Petkov
  0 siblings, 0 replies; 83+ messages in thread
From: Borislav Petkov @ 2017-10-12 16:02 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, Oct 11, 2017 at 06:24:03PM -0700, Ricardo Neri wrote:
> On the other hand, in long mode, hardware ignore all segment registers except FS
> and GS.

Yap.

> Hence, I guess I can remove the check in question.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH v9 16/29] x86/insn-eval: Add function to get default params of code segment
  2017-10-04  3:54 ` [PATCH v9 16/29] x86/insn-eval: Add function to get default params of code segment Ricardo Neri
@ 2017-10-12 16:31   ` Borislav Petkov
  2017-10-12 18:27     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-12 16:31 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 03, 2017 at 08:54:19PM -0700, Ricardo Neri wrote:
> Obtain the default values of the address and operand sizes as specified in
> the D and L bits of the the segment descriptor selected by the register
> CS. The function can be used for both protected and long modes.
> For virtual-8086 mode, the default address and operand sizes are always 2
> bytes.
> 
> The returned parameters are encoded in a signed 8-bit data type. Auxiliar
> macros are provided to encode and decode such values.
> 
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> Cc: Colin Ian King <colin.king@canonical.com>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> Cc: x86@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
>  arch/x86/include/asm/insn-eval.h |  5 ++++
>  arch/x86/lib/insn-eval.c         | 64 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)

Some cleanups ontop:

---
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 64924b7d5fff..3352b9d5164f 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -614,14 +614,14 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
 }
 
 /**
- * insn_get_code_seg_defaults() - Obtain code segment default parameters
+ * insn_get_code_seg_params() - Obtain code segment parameters
  * @regs:	Structure with register values as seen when entering kernel mode
  *
- * Obtain the default parameters of the code segment: address and operand sizes.
- * The code segment is obtained from the selector contained in the CS register
- * in regs. In protected mode, the default address is determined by inspecting
- * the L and D bits of the segment descriptor. In virtual-8086 mode, the default
- * is always two bytes for both address and operand sizes.
+ * Obtain address and operand sizes of the code segment. It is obtained from the
+ * selector contained in the CS register in regs. In protected mode, the default
+ * address is determined by inspecting the L and D bits of the segment descriptor.
+ * In virtual-8086 mode, the default is always two bytes for both address and
+ * operand sizes.
  *
  * Returns:
  *
@@ -629,7 +629,7 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
  *
  * -EINVAL on error.
  */
-char insn_get_code_seg_defaults(struct pt_regs *regs)
+char insn_get_code_seg_params(struct pt_regs *regs)
 {
 	struct desc_struct *desc;
 	short sel;
@@ -640,7 +640,7 @@ char insn_get_code_seg_defaults(struct pt_regs *regs)
 
 	sel = get_segment_selector(regs, INAT_SEG_REG_CS);
 	if (sel < 0)
-		return -1L;
+		return sel;
 
 	desc = get_desc(sel);
 	if (!desc)

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH v9 16/29] x86/insn-eval: Add function to get default params of code segment
  2017-10-12 16:31   ` Borislav Petkov
@ 2017-10-12 18:27     ` Ricardo Neri
  0 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-12 18:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Thu, 2017-10-12 at 18:31 +0200, Borislav Petkov wrote:
> On Tue, Oct 03, 2017 at 08:54:19PM -0700, Ricardo Neri wrote:
> > 
> > Obtain the default values of the address and operand sizes as specified in
> > the D and L bits of the the segment descriptor selected by the register
> > CS. The function can be used for both protected and long modes.
> > For virtual-8086 mode, the default address and operand sizes are always 2
> > bytes.
> > 
> > The returned parameters are encoded in a signed 8-bit data type. Auxiliar
> > macros are provided to encode and decode such values.
> > 
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> > Cc: Colin Ian King <colin.king@canonical.com>
> > Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> > Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Thomas Garnier <thgarnie@google.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> > Cc: x86@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> >  arch/x86/include/asm/insn-eval.h |  5 ++++
> >  arch/x86/lib/insn-eval.c         | 64
> > ++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 69 insertions(+)
> Some cleanups ontop:
> 
> ---
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 64924b7d5fff..3352b9d5164f 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -614,14 +614,14 @@ static unsigned long get_seg_limit(struct pt_regs *regs,
> int seg_reg_idx)
>  }
>  
>  /**
> - * insn_get_code_seg_defaults() - Obtain code segment default parameters
> + * insn_get_code_seg_params() - Obtain code segment parameters
>   * @regs:	Structure with register values as seen when entering kernel
> mode
>   *
> - * Obtain the default parameters of the code segment: address and operand
> sizes.
> - * The code segment is obtained from the selector contained in the CS
> register
> - * in regs. In protected mode, the default address is determined by
> inspecting
> - * the L and D bits of the segment descriptor. In virtual-8086 mode, the
> default
> - * is always two bytes for both address and operand sizes.
> + * Obtain address and operand sizes of the code segment. It is obtained from
> the
> + * selector contained in the CS register in regs. In protected mode, the
> default
> + * address is determined by inspecting the L and D bits of the segment
> descriptor.
> + * In virtual-8086 mode, the default is always two bytes for both address and
> + * operand sizes.
>   *
>   * Returns:
>   *
> @@ -629,7 +629,7 @@ static unsigned long get_seg_limit(struct pt_regs *regs,
> int seg_reg_idx)
>   *
>   * -EINVAL on error.
>   */
> -char insn_get_code_seg_defaults(struct pt_regs *regs)
> +char insn_get_code_seg_params(struct pt_regs *regs)
>  {
>  	struct desc_struct *desc;
>  	short sel;
> @@ -640,7 +640,7 @@ char insn_get_code_seg_defaults(struct pt_regs *regs)
>  
>  	sel = get_segment_selector(regs, INAT_SEG_REG_CS);
>  	if (sel < 0)
> -		return -1L;
> +		return sel;

Thanks! I implemented these changes along with your Improvements-by.

BR,
Ricardo


* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-12  9:48       ` Borislav Petkov
@ 2017-10-13  1:08         ` Ricardo Neri
  2017-10-13 11:37           ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-13  1:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Thu, 2017-10-12 at 11:48 +0200, Borislav Petkov wrote:
> On Wed, Oct 11, 2017 at 06:12:30PM -0700, Ricardo Neri wrote:
> > 
> > Shouldn't this function check for a null insn since it is used here?
> I have to say, this whole codepath from insn_get_seg_base() with
> insn==NULL is nasty but I don't see a way around it as we need to know
> how many bytes to copy and from where. Can't think of a better solution
> without duplicating a lot of code. :-\

I have looked at your two proposals. I think I prefer the first one plus a
couple of tweaks.

> 
> So how about this?
> 
> If the patch is hard to read, you can apply it and look at the code. But
> here's the gist:
> 
> * You pull up the rIP check and do that directly in resolve_seg_reg()
> and return INAT_SEG_REG_CS there immediately so you don't have to call
> resolve_default_seg().

In my opinion it would be better to have all the checks in a single place. This
makes the code easier to read than having this special case directly
in resolve_default_seg(). Also, strictly speaking we would need to
return INAT_SEG_REG_IGNORE in long mode. Indeed, insn_get_seg_base() would
return base 0 in such a case, but I feel it is better if this logic is explicit
in resolve_default_seg().
> 
> This way, you get the only case out of the way where insn can be NULL.
> 
> Then you can do the if (!insn) check once and now you have a valid insn.

Rather than checking for null insn in resolve_seg_reg(), which does not use it,
let the functions it calls do the check if they need to.
> 
> check_seg_overrides() can then return simply bool and you can get rid of
> the remaining if (!insn) checks down the road.
> 
> But please double-check me if I missed a case - the flow is not trivial.

This is a diff based on your first proposal (I hope text does not wrap). I feel
this makes it clear how resolve_seg_reg() handles errors as well as how it uses
overridden or default segment register indices. Plus, insn is only checked when
used.

@@ -155,6 +155,16 @@ static int resolve_default_seg(struct insn *insn, struct
pt_regs *regs, int off)
 {
        if (user_64bit_mode(regs))
                return INAT_SEG_REG_IGNORE;
+
+       /*
+        * insn may be null as we may be about to copy the instruction.
+        * However, it is not needed at all here.
+        */
+       if (off == offsetof(struct pt_regs, ip))
+               return INAT_SEG_REG_CS;
+
+       if (!insn)
+               return -EINVAL;
        /*
         * If we are here, we use the default segment register as described
         * in the Intel documentation:
@@ -191,9 +201,6 @@ static int resolve_default_seg(struct insn *insn, struct
pt_regs *regs, int off)
        case offsetof(struct pt_regs, sp):
                return INAT_SEG_REG_SS;
 
-       case offsetof(struct pt_regs, ip):
-               return INAT_SEG_REG_CS;
-
        default:
                return -EINVAL;
        }
@@ -254,9 +261,6 @@ static int resolve_seg_reg(struct insn *insn, struct pt_regs
*regs, int regoff)
        if (!ret)
                return resolve_default_seg(insn, regs, regoff);
 
-       if (!insn)
-               return -EINVAL;
-
        idx = get_seg_reg_override_idx(insn);
        if (idx < 0)
                return idx;

Thanks and BR,
Ricardo


* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-13  1:08         ` Ricardo Neri
@ 2017-10-13 11:37           ` Borislav Petkov
  2017-10-13 18:43             ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-13 11:37 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Thu, Oct 12, 2017 at 06:08:17PM -0700, Ricardo Neri wrote:
> In my opinion it would be better to have all the checks in a single place. This
> makes the code easier to read that having this special case directly
> in resolve_default_seg().
> ...
> Rather than checking for null insn in resolve_seg_reg(), which does not use it,
> let the functions it calls do the check if they need to.

Of course it is using it - it is passing it down to callers.

No, this is completely backwards. You're pushing the if (!insn) check
down instead of up. What you wanna do instead is get that "strange" case
out of the way *first* where insn is NULL and then have the remaining
flow with a properly allocated struct insn.

And the only case where insn is NULL is fixup_umip_exception(). All the
other callers of insn_get_seg_base() supply a properly setup struct insn
* AFAICT.

So do the minimum work of getting the segment base either directly in
fixup_umip_exception() by calling a helper function as it matters only
there.

And IINM, you have two possible cases:

1. INAT_SEG_REG_IGNORE which makes segment base 0

2. INAT_SEG_REG_DEFAULT which maps to INAT_SEG_REG_CS for the rIP case
and then gets the selector:

	sel = (unsigned short)(regs->cs & 0xffff);

and then computes the base.

And the mapping of sel to base you can do by carving out the piece of
insn_get_seg_base() *after* you've computed @sel and you do the base
computation, i.e., the piece which starts with this:

        if (v8086_mode(regs))
                /*
                 * Base is simply the segment selector shifted 4
                 * positions to the right.
	...

into a separate function called __get_seg_base(sel, ...).

The important thing to note here is that this function won't need insn
so you can call it without one.
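
As a rough illustration of that split, here is a minimal userspace sketch,
not kernel code. struct model_regs, its desc_base[] table,
model_get_seg_base() and model_cs_base() are all made-up stand-ins; the real
descriptor lookup and error handling would differ.

```c
#include <stdbool.h>

#define MODEL_NR_DESC 4

/* Hypothetical stand-in for the bits of pt_regs and the descriptor table. */
struct model_regs {
	unsigned long cs;			/* saved CS; selector is the low 16 bits */
	bool v8086;				/* models v8086_mode(regs) */
	bool user_64bit;			/* models user_64bit_mode(regs) */
	unsigned long desc_base[MODEL_NR_DESC];	/* made-up table: base per descriptor index */
};

/* Selector-to-base step; it never looks at an instruction. */
static unsigned long model_get_seg_base(unsigned short sel,
					const struct model_regs *regs)
{
	unsigned int idx;

	if (regs->v8086)
		/* Base is the selector shifted left by 4 bits (selector * 16). */
		return (unsigned long)sel << 4;

	/* Assumption: a null selector is an error in protected mode. */
	if (!sel)
		return -1L;

	idx = sel >> 3;		/* drop the RPL and table-indicator bits */
	if (idx >= MODEL_NR_DESC)
		return -1L;

	return regs->desc_base[idx];
}

/* The rIP case: no struct insn is involved at any point. */
static unsigned long model_cs_base(const struct model_regs *regs)
{
	/* Case 1: the segment is ignored in long mode, so the base is 0. */
	if (regs->user_64bit)
		return 0;

	/* Case 2: CS is the segment for rIP; take the selector from regs. */
	return model_get_seg_base((unsigned short)(regs->cs & 0xffff), regs);
}
```

The caller that has no instruction yet would only ever use something like
model_cs_base(); everything that needs a decoded instruction stays on the
other side of that boundary.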

This way you have it nice and clean designed with a clear separation of
the cases *before* a valid struct insn * and *after*.

Right now, you have both intermixed and the code is hard to follow as
you have to pay attention at each time: how and where am I being called.

Makes sense?

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-13 11:37           ` Borislav Petkov
@ 2017-10-13 18:43             ` Ricardo Neri
  2017-10-17  9:35               ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-13 18:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, 2017-10-13 at 13:37 +0200, Borislav Petkov wrote:
> On Thu, Oct 12, 2017 at 06:08:17PM -0700, Ricardo Neri wrote:
> > 
> > In my opinion it would be better to have all the checks in a single place.
> > This
> > makes the code easier to read that having this special case directly
> > in resolve_default_seg().
> > ...
> > Rather than checking for null insn in resolve_seg_reg(), which does not use
> > it,
> > let the functions it calls do the check if they need to.
> Of course it is using it - it is passing it down to callers.
> 
> No, this is completely backwards. You're pushing the if (!insn) check
> down instead of up. What you wanna do instead is get that "strange" case
> out of the way *first* where insn is NULL and then have the remaining
> flow with a properly allocated struct insn.

Furthermore, there should be no need to call resolve_seg_reg() in that case.
It is meant to be used when decoding instructions. Callers
already know they must use CS for rIP. resolve_seg_reg() is to be used when the
segment register is not known beforehand.

Then it sounds like your second proposal addresses it:

1) It first handles the strange case
2) It checks if insn is valid
3) It checks if segment override prefixes can be used
   3.a) If yes, use them, if any.
   3.b) If no, resolve default segment register index.

resolve_default_seg() and get_seg_reg_override_idx() will assume that insn is
not null. Please find at the bottom a diff of your second proposal with minor
tweaks.
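
The ordering of those steps can be modeled in userspace as follows. This is
only a sketch under stated assumptions, not the kernel code: the model_*
names, enum model_reg and struct model_insn are hypothetical, the MODEL_SEG_*
values mirror the INAT_SEG_REG_* identifiers, the default-segment mapping
follows the usual Intel rules, and the long mode handling of override
prefixes is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the INAT_SEG_REG_* identifiers. */
enum model_seg {
	MODEL_SEG_IGNORE = 0,
	MODEL_SEG_DEFAULT = 1,
	MODEL_SEG_CS = 2,
	MODEL_SEG_SS = 3,
	MODEL_SEG_DS = 4,
	MODEL_SEG_ES = 5,
	MODEL_SEG_FS = 6,
	MODEL_SEG_GS = 7,
};

/* Hypothetical replacement for the pt_regs offsets. */
enum model_reg {
	MODEL_REG_IP, MODEL_REG_DI, MODEL_REG_SP, MODEL_REG_BP, MODEL_REG_OTHER
};

/* Hypothetical, already-decoded instruction. */
struct model_insn {
	bool is_string;		/* models is_string_insn() */
	int override_idx;	/* models get_seg_reg_override_idx() result */
};

static bool model_check_seg_overrides(const struct model_insn *insn,
				      enum model_reg reg)
{
	/* No overrides for rDI when used by a string instruction. */
	return !(reg == MODEL_REG_DI && insn->is_string);
}

static int model_resolve_default_seg(const struct model_insn *insn,
				     enum model_reg reg, bool user_64bit)
{
	if (user_64bit)
		return MODEL_SEG_IGNORE;

	switch (reg) {
	case MODEL_REG_DI:
		return insn->is_string ? MODEL_SEG_ES : MODEL_SEG_DS;
	case MODEL_REG_SP:
	case MODEL_REG_BP:
		return MODEL_SEG_SS;
	case MODEL_REG_IP:
		return MODEL_SEG_CS;
	default:
		return MODEL_SEG_DS;
	}
}

static int model_resolve_seg_reg(const struct model_insn *insn,
				 enum model_reg reg, bool user_64bit)
{
	int idx;

	/* 1) Strange case first: rIP never needs the instruction. */
	if (reg == MODEL_REG_IP)
		return user_64bit ? MODEL_SEG_IGNORE : MODEL_SEG_CS;

	/* 2) Everything below requires a valid instruction. */
	if (!insn)
		return -EINVAL;

	/* 3.b) Overrides not allowed: use the default segment register. */
	if (!model_check_seg_overrides(insn, reg))
		return model_resolve_default_seg(insn, reg, user_64bit);

	/* 3.a) Overrides allowed: use one if present. */
	idx = insn->override_idx;
	if (idx < 0)
		return idx;
	if (idx == MODEL_SEG_DEFAULT)
		return model_resolve_default_seg(insn, reg, user_64bit);

	return idx;
}
```

With this ordering only the rIP path may be entered with a null insn; the
helpers called after step 2 can assume a valid instruction.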

> 
> And the only case where insn is NULL is fixup_umip_exception(). All the
> other callers of insn_get_seg_base() supply a properly setup struct insn
> * AFAICT.

Yes, this is correct.
> 
> So do the minimum work of getting the segment base either directly in
> fixup_umip_exception() by calling a helper function as it matters only
> there.
> 
> And IINM, you have two possible cases:
> 
> 1. INAT_SEG_REG_IGNORE which makes segment base 0
> 
> 2. INAT_SEG_REG_DEFAULT which maps to INAT_SEG_REG_CS for the rIP case
> and then gets the selector:
> 
> 	sel = (unsigned short)(regs->cs & 0xffff);
> and then computes the base.

Yes, also correct; I modified your second proposal for cases 1 and 2.

> 
> And the mapping of sel to base you can do by carving out the piece of
> insn_get_seg_base() *after* you've computed @sel and you do the base
> computation, i.e., the piece which starts with this:
> 
>         if (v8086_mode(regs))
>                 /*
>                  * Base is simply the segment selector shifted 4
>                  * positions to the right.
> 	...
> 
> into a separate function called __get_seg_base(sel, ...).
> 
> The important thing to note here is that this function won't need insn
> so you can call it without one.

In v9 insn_get_seg_base() does not take an insn as argument (it used to take one
until v8). Also, fixup_umip_exception() calls insn_get_seg_base with
INAT_SEG_REG_CS, no need to resolve the segment register. It does it only for
!user_64bit_mode().

If a __get_seg_base(sel, ...) was implemented, then insn_get_seg_base() would
look like:

unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
{
	short sel;

	sel = get_segment_selector(regs, seg_reg_idx);
	if (sel < 0)
		return -1L;

	return __get_seg_base(sel, regs);
}

After having clarified that an insn is not needed in the latest iteration of
insn_get_seg_base(), I just want to double check if this is what you would like
to see.

> 
> This way you have it nice and clean designed with a clear separation of
> the cases *before* a valid struct insn * and *after*.
> 
> Right now, you have both intermixed and the code is hard to follow as
> you have to pay attention at each time: how and where am I being called.
> 
> Makes sense?

I think it does now. This is a modification of your second proposal (I hope text
does not wrap):

@@ -71,9 +71,6 @@ static int get_seg_reg_override_idx(struct insn *insn)
 	int idx = INAT_SEG_REG_DEFAULT;
 	int num_overrides = 0, i;
 
-	if (!insn)
-		return -EINVAL;
-
 	insn_get_prefixes(insn);
 
 	/* Look for any segment override prefixes. */
@@ -128,28 +125,16 @@ static int get_seg_reg_override_idx(struct insn *insn)
  *
  * Returns:
  *
- * 1 if segment override prefixes can be used with the register indicated
- * in @regoff. 0 if otherwise.
+ * True if segment override prefixes can be used with the register indicated
+ * in @regoff. False if otherwise.
  *
- * -EINVAL in case of error.
  */
-static int check_seg_overrides(struct insn *insn, int regoff)
+static bool check_seg_overrides(struct insn *insn, int regoff)
 {
-	/*
-	 * Segment override prefixes should not be used for rIP. It is not
-	 * necessary to inspect the instruction structure.
-	 */
-	if (regoff == offsetof(struct pt_regs, ip))
-		return 0;
-
-	/* Subsequent checks require a valid insn. */
-	if (!insn)
-		return -EINVAL;
-
 	if (regoff == offsetof(struct pt_regs, di) && is_string_insn(insn))
-		return 0;
+		return false;
 
-	return 1;
+	return true;
 }
 
 /**
@@ -249,18 +234,26 @@ static int resolve_default_seg(struct insn *insn, struct
pt_regs *regs, int off)
  */
 static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
 {
-	int ret, idx;
-
-	ret = check_seg_overrides(insn, regoff);
-	if (ret < 0)
-		return ret;
+	int idx;
 
-	if (!ret)
-		return resolve_default_seg(insn, regs, regoff);
+	/*
+	 * In the unlikely event of having to resolve the segment register
+	 * index for rIP, do it first. Segment override prefixes should not
+	 * be used. Hence, it is not necessary to inspect the instruction.
+	 */
+	if (regoff == offsetof(struct pt_regs, ip)) {
+		if (user_64bit_mode(regs))
+			return INAT_SEG_REG_IGNORE;
+		else
+			return INAT_SEG_REG_CS;
+	}
 
 	if (!insn)
 		return -EINVAL;
 
+	if (!check_seg_overrides(insn, regoff))
+		return resolve_default_seg(insn, regs, regoff);
+
 	idx = get_seg_reg_override_idx(insn);
 	if (idx < 0)
 		return idx;

Thanks and BR,
Ricardo


* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-13 18:43             ` Ricardo Neri
@ 2017-10-17  9:35               ` Borislav Petkov
  2017-10-17 20:31                 ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-17  9:35 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 13, 2017 at 11:43:43AM -0700, Ricardo Neri wrote:
> I think it does now. This is a modification of your second proposal (I hope text
> does not wrap):

Instead of hoping that it doesn't wrap, please fix your mail client. For
that, generate a diff and send it to yourself and try applying the mail.

And I can see the wrap already.

Also when you paste a patch, you need to add the full output of git
diff. This hunk of yours doesn't even have the file to patch so there's
no applying it from the mail.

And so on and so on.

So please send me a whole

"x86/insn-eval: Add utility functions to get segment selector"

patch how you think it should be and I can take a look at it then.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-17  9:35               ` Borislav Petkov
@ 2017-10-17 20:31                 ` Ricardo Neri
  2017-10-18 20:29                   ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-17 20:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, 2017-10-17 at 11:35 +0200, Borislav Petkov wrote:
> On Fri, Oct 13, 2017 at 11:43:43AM -0700, Ricardo Neri wrote:
> > 
> > I think it does now. This is a modification of your second proposal (I hope text
> > does not wrap):
> Instead of hoping that it doesn't wrap, please fix your mail client. For
> that, generate a diff and send it to yourself and try applying the mail.
> 
> And I can see the wrap already.
> 
> Also when you paste a patch, you need to add the full output of git
> diff. This hunk of yours doesn't even have the file to patch so there's
> no applying it from the mail.
> 
> And so on and so on.
> 
> So please send me a whole
> 
> "x86/insn-eval: Add utility functions to get segment selector"
> 
> patch how you think it should be and I can take a look at it then.

I apologize for the inconvenience. I have verified that my mail client
works properly this time. I double-checked that it did not wrap. Please
find below the whole patch, including the commit message:

When computing a linear address and segmentation is used, we need to know
the base address of the segment involved in the computation. In most
cases, the segment base address will be zero as in USER_DS/USER32_DS.
However, it may be possible that a user space program defines its own
segments via a local descriptor table. In such a case, the segment base
address may not be zero. Thus, the segment base address is needed to
correctly calculate the linear address.

If running in protected mode, the segment selector to be used when
computing a linear address is determined either by a segment override
prefix in the instruction or inferred from the registers involved in the
computation of the effective address, in that order. Also, there are cases
when the segment override prefixes shall be ignored (i.e., code segments
are always selected by the CS segment register; string instructions always
use the ES segment register when using the rDI register as operand). In long
mode, segment registers are ignored, except for FS and GS. In these two
cases, base addresses are obtained from the respective MSRs.

For clarity, this process can be split into four steps (and an equal
number of functions): determine if segment override prefixes can be used;
parse the segment override prefixes, and use them if found; if not found
or cannot be used, use the default segment registers associated with the
operand registers. Once the segment register to use has been identified,
read its value to obtain the segment selector.

The method to obtain the segment selector depends on several factors. In
32-bit builds, segment selectors are saved into a pt_regs structure
when switching to kernel mode. The same is also true for virtual-8086
mode. In 64-bit builds, segmentation is mostly ignored, except when
running a program in 32-bit legacy mode. In this case, CS and SS can be
obtained from pt_regs. DS, ES, FS and GS can be read directly from
the respective segment registers.

In order to identify the segment registers, a new set of #defines is
introduced. It also includes two special identifiers. One of them
indicates when the default segment register associated with instruction
operands shall be used. Another one indicates that the contents of the
segment register shall be ignored; this identifier is used when in long
mode.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
 arch/x86/include/asm/inat.h |  10 ++
 arch/x86/lib/insn-eval.c    | 340 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 350 insertions(+)

diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
index 02aff08..1c78580 100644
--- a/arch/x86/include/asm/inat.h
+++ b/arch/x86/include/asm/inat.h
@@ -97,6 +97,16 @@
 #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
 #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
 
+/* Identifiers for segment registers */
+#define INAT_SEG_REG_IGNORE	0
+#define INAT_SEG_REG_DEFAULT	1
+#define INAT_SEG_REG_CS		2
+#define INAT_SEG_REG_SS		3
+#define INAT_SEG_REG_DS		4
+#define INAT_SEG_REG_ES		5
+#define INAT_SEG_REG_FS		6
+#define INAT_SEG_REG_GS		7
+
 /* Attribute search APIs */
 extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode);
 extern int inat_get_last_prefix_id(insn_byte_t last_pfx);
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index ac7b87c..5f610be 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -9,6 +9,7 @@
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/vm86.h>
 
 #undef pr_fmt
 #define pr_fmt(fmt) "insn: " fmt
@@ -47,6 +48,345 @@ static bool is_string_insn(struct insn *insn)
 	}
 }
 
+/**
+ * get_seg_reg_override_idx() - obtain segment register override index
+ * @insn:	Valid instruction with segment override prefixes
+ *
+ * Inspect the instruction prefixes in @insn and find segment overrides, if any.
+ *
+ * Returns:
+ *
+ * A constant identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_DEFAULT is returned if no segment override
+ * prefixes were found.
+ *
+ * -EINVAL in case of error.
+ */
+static int get_seg_reg_override_idx(struct insn *insn)
+{
+	int idx = INAT_SEG_REG_DEFAULT;
+	int num_overrides = 0, i;
+
+	insn_get_prefixes(insn);
+
+	/* Look for any segment override prefixes. */
+	for (i = 0; i < insn->prefixes.nbytes; i++) {
+		insn_attr_t attr;
+
+		attr = inat_get_opcode_attribute(insn->prefixes.bytes[i]);
+		switch (attr) {
+		case INAT_MAKE_PREFIX(INAT_PFX_CS):
+			idx = INAT_SEG_REG_CS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_SS):
+			idx = INAT_SEG_REG_SS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_DS):
+			idx = INAT_SEG_REG_DS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_ES):
+			idx = INAT_SEG_REG_ES;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_FS):
+			idx = INAT_SEG_REG_FS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_GS):
+			idx = INAT_SEG_REG_GS;
+			num_overrides++;
+			break;
+		/* No default action needed. */
+		}
+	}
+
+	/* More than one segment override prefix leads to undefined behavior. */
+	if (num_overrides > 1)
+		return -EINVAL;
+
+	return idx;
+}
+
+/**
+ * check_seg_overrides() - check if segment override prefixes are allowed
+ * @insn:	Valid instruction with segment override prefixes
+ * @regoff:	Operand offset, in pt_regs, for which the check is performed
+ *
+ * For a particular register used in register-indirect addressing, determine if
+ * segment override prefixes can be used. Specifically, no overrides are allowed
+ * for rDI if used with a string instruction.
+ *
+ * Returns:
+ *
+ * True if segment override prefixes can be used with the register indicated
+ * in @regoff. False if otherwise.
+ */
+static bool check_seg_overrides(struct insn *insn, int regoff)
+{
+	if (regoff == offsetof(struct pt_regs, di) && is_string_insn(insn))
+		return false;
+
+	return true;
+}
+
+/**
+ * resolve_default_seg() - resolve default segment register index for an operand
+ * @insn:	Valid instruction with opcode
+ * @regs:	Register values as seen when entering kernel mode
+ * @off:	Operand offset, in pt_regs, for which resolution is needed
+ *
+ * Resolve the default segment register index associated with the instruction
+ * operand register indicated by @off. Such index is resolved based on defaults
+ * described in the Intel Software Development Manual.
+ *
+ * Returns:
+ *
+ * If in protected mode, a constant identifying the segment register to use,
+ * among CS, SS, ES or DS. If in long mode, INAT_SEG_REG_IGNORE.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
+{
+	if (user_64bit_mode(regs))
+		return INAT_SEG_REG_IGNORE;
+	/*
+	 * Resolve the default segment register as described in Section 3.7.4
+	 * of the Intel Software Development Manual Vol. 1:
+	 *
+	 *  + DS for all references involving r[ABCD]X, and rSI.
+	 *  + If used in a string instruction, ES for rDI. Otherwise, DS.
+	 *  + AX, CX and DX are not valid register operands in 16-bit address
+	 *    encodings but are valid for 32-bit and 64-bit encodings.
+	 *  + -EDOM is reserved to identify cases in which no register
+	 *    is used (i.e., displacement-only addressing). Use DS.
+	 *  + SS for rSP or rBP.
+	 *  + CS for rIP.
+	 */
+
+	switch (off) {
+	case offsetof(struct pt_regs, ax):
+	case offsetof(struct pt_regs, cx):
+	case offsetof(struct pt_regs, dx):
+		/* Need insn to verify address size. */
+		if (insn->addr_bytes == 2)
+			return -EINVAL;
+
+	case -EDOM:
+	case offsetof(struct pt_regs, bx):
+	case offsetof(struct pt_regs, si):
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, di):
+		if (is_string_insn(insn))
+			return INAT_SEG_REG_ES;
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, bp):
+	case offsetof(struct pt_regs, sp):
+		return INAT_SEG_REG_SS;
+
+	case offsetof(struct pt_regs, ip):
+		return INAT_SEG_REG_CS;
+
+	default:
+		return -EINVAL;
+	}
+}
+
+/**
+ * resolve_seg_reg() - obtain segment register index
+ * @insn:	Instruction with operands
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Operand offset, in pt_regs, used to determine segment register
+ *
+ * Determine the segment register associated with the operands and, if
+ * applicable, prefixes and the instruction pointed to by @insn.
+ *
+ * The segment register associated with an operand used in register-indirect
+ * addressing depends on:
+ *
+ * a) Whether running in long mode (in such a case segments are ignored, except
+ *    if FS or GS are used).
+ *
+ * b) Whether segment override prefixes can be used. Certain instructions and
+ *    registers do not allow override prefixes.
+ *
+ * c) Whether segment override prefixes are found in the instruction prefixes.
+ *
+ * d) If there are no segment override prefixes or they cannot be used, the
+ *    default segment register associated with the operand register is used.
+ *
+ * The function checks first if segment override prefixes can be used with the
+ * operand indicated by @regoff. If allowed, obtain such overridden segment
+ * register index. Lastly, if no prefixes were found or cannot be used, resolve
+ * the segment register index to use based on the defaults described in the
+ * Intel documentation. In long mode, all segment register indexes will be
+ * ignored, except if overrides were found for FS or GS. All these operations
+ * are done using helper functions.
+ *
+ * The operand register, @regoff, is represented as the offset from the base of
+ * pt_regs.
+ *
+ * As stated, the main use of this function is to determine the segment register
+ * index based on the instruction, its operands and prefixes. Hence, @insn
+ * must be valid. However, if @regoff indicates rIP, we don't need to inspect
+ * @insn at all as in this case CS is used in all cases. This case is checked
+ * before proceeding further.
+ *
+ * Please note that this function does not return the value in the segment
+ * register (i.e., the segment selector) but our defined index. The segment
+ * selector needs to be obtained using get_segment_selector() and passing the
+ * segment register index resolved by this function.
+ *
+ * Returns:
+ *
+ * An index identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_IGNORE is returned if running in long mode.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
+{
+	int idx;
+
+	/*
+	 * In the unlikely event of having to resolve the segment register
+	 * index for rIP, do it first. Segment override prefixes should not
+	 * be used. Hence, it is not necessary to inspect the instruction,
+	 * which may be invalid at this point.
+	 */
+	if (regoff == offsetof(struct pt_regs, ip)) {
+		if (user_64bit_mode(regs))
+			return INAT_SEG_REG_IGNORE;
+		else
+			return INAT_SEG_REG_CS;
+	}
+
+	if (!insn)
+		return -EINVAL;
+
+	if (!check_seg_overrides(insn, regoff))
+		return resolve_default_seg(insn, regs, regoff);
+
+	idx = get_seg_reg_override_idx(insn);
+	if (idx < 0)
+		return idx;
+
+	if (idx == INAT_SEG_REG_DEFAULT)
+		return resolve_default_seg(insn, regs, regoff);
+
+	/*
+	 * In long mode, segment override prefixes are ignored, except for
+	 * overrides for FS and GS.
+	 */
+	if (user_64bit_mode(regs)) {
+		if (idx != INAT_SEG_REG_FS &&
+		    idx != INAT_SEG_REG_GS)
+			idx = INAT_SEG_REG_IGNORE;
+	}
+
+	return idx;
+}
+
+/**
+ * get_segment_selector() - obtain segment selector
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Segment register index to use
+ *
+ * Obtain the segment selector from any of the CS, SS, DS, ES, FS, GS segment
+ * registers. In CONFIG_X86_32, the segment is obtained from either pt_regs or
+ * kernel_vm86_regs as applicable. In CONFIG_X86_64, CS and SS are obtained
+ * from pt_regs. DS, ES, FS and GS are obtained by reading the actual CPU
+ * registers. This is done only for completeness as in CONFIG_X86_64 segment
+ * registers are ignored.
+ *
+ * Returns:
+ *
+ * Value of the segment selector, including null when running in
+ * long mode.
+ *
+ * -EINVAL on error.
+ */
+static short get_segment_selector(struct pt_regs *regs, int seg_reg_idx)
+{
+#ifdef CONFIG_X86_64
+	unsigned short sel;
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_IGNORE:
+		return 0;
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		savesegment(ds, sel);
+		return sel;
+	case INAT_SEG_REG_ES:
+		savesegment(es, sel);
+		return sel;
+	case INAT_SEG_REG_FS:
+		savesegment(fs, sel);
+		return sel;
+	case INAT_SEG_REG_GS:
+		savesegment(gs, sel);
+		return sel;
+	default:
+		return -EINVAL;
+	}
+#else /* CONFIG_X86_32 */
+	struct kernel_vm86_regs *vm86regs = (struct kernel_vm86_regs *)regs;
+
+	if (v8086_mode(regs)) {
+		switch (seg_reg_idx) {
+		case INAT_SEG_REG_CS:
+			return (unsigned short)(regs->cs & 0xffff);
+		case INAT_SEG_REG_SS:
+			return (unsigned short)(regs->ss & 0xffff);
+		case INAT_SEG_REG_DS:
+			return vm86regs->ds;
+		case INAT_SEG_REG_ES:
+			return vm86regs->es;
+		case INAT_SEG_REG_FS:
+			return vm86regs->fs;
+		case INAT_SEG_REG_GS:
+			return vm86regs->gs;
+		case INAT_SEG_REG_IGNORE:
+			/* fall through */
+		default:
+			return -EINVAL;
+		}
+	}
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		return (unsigned short)(regs->ds & 0xffff);
+	case INAT_SEG_REG_ES:
+		return (unsigned short)(regs->es & 0xffff);
+	case INAT_SEG_REG_FS:
+		return (unsigned short)(regs->fs & 0xffff);
+	case INAT_SEG_REG_GS:
+		/*
+		 * GS may or may not be in regs as per CONFIG_X86_32_LAZY_GS.
+		 * The macro below takes care of both cases.
+		 */
+		return get_user_gs(regs);
+	case INAT_SEG_REG_IGNORE:
+		/* fall through */
+	default:
+		return -EINVAL;
+	}
+#endif /* CONFIG_X86_64 */
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 			  enum reg_type type)
 {
-- 

Thanks and BR,
Ricardo

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-17 20:31                 ` Ricardo Neri
@ 2017-10-18 20:29                   ` Borislav Petkov
  2017-10-19  6:30                     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-18 20:29 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 17, 2017 at 01:31:52PM -0700, Ricardo Neri wrote:
> I apologize for the inconvenience, I have verified that my mail client
> works properly this time. I double-checked that it did not wrap.

But did you try applying the patch which you have sent to yourself first?

Because it doesn't work here:

[boris@pd: ~/kernel/linux> test-apply.sh /tmp/ricardo.neri-calderon.13
checking file arch/x86/include/asm/inat.h
patch: **** malformed patch at line 124:  #define INAT_MAKE_GROUP(grp)  ((grp << INAT_GRP_OFFS) | INAT_MODRM)

Diffing your original patch which you've sent with git and this one
which you've sent with evolution gives:

 diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
 index 02aff08..1c78580 100644
 --- a/arch/x86/include/asm/inat.h
 +++ b/arch/x86/include/asm/inat.h
 @@ -97,6 +97,16 @@
- #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
- #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
- 
+ #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
+ #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
+ 

And already those spaces at the beginning of the line must be funky
because the rest looks identical.

They should be:

00001420  2c 31 36 20 40 40 0a 20  23 64 65 66 69 6e 65 20  |,16 @@. #define

i.e., a 0x0a for LF and 0x20 for space.

Yours are

000017b0  36 20 2b 39 37 2c 31 36  20 40 40 0a c2 a0 23 64  |6 +97,16 @@...#d|

so there's 0x0a LF, but then there's 0xc2, and then there's 0xa0.

Looking at a UTF-8 table, it says:

U+00A0	 	c2 a0	NO-BREAK SPACE

so your patch is utf-8, no wonder it doesn't apply.

So try sending the patch again. But send it to yourself first and try
applying it.
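
One way to catch this before resending is to scan the saved mail for the
c2 a0 byte pair shown in the hexdump above. A minimal sketch in plain C;
find_nbsp() is a hypothetical helper written for illustration, not part of
patch, git or any kernel script:

```c
#include <stddef.h>

/*
 * Return the offset of the first UTF-8 encoded NO-BREAK SPACE (U+00A0,
 * bytes 0xc2 0xa0) in buf, or -1 if the buffer does not contain one.
 */
static long find_nbsp(const unsigned char *buf, size_t len)
{
	size_t i;

	/* Stop one byte early: a match always needs two bytes. */
	for (i = 0; i + 1 < len; i++) {
		if (buf[i] == 0xc2 && buf[i + 1] == 0xa0)
			return (long)i;
	}

	return -1;
}
```

Running it over the bytes that follow the hunk header in the two dumps
above, the first buffer (0a 20 23 ...) yields -1 while the second one
(0a c2 a0 23 ...) reports the offset of the 0xc2 byte.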

Alternatively, git send-email supports threading with --in-reply-to=
so that is another possibility. But here you'll have to add the whole
CC-list.

Also, Documentation/process/email-clients.rst has some notes on how to
send patches with Evolution.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-18 20:29                   ` Borislav Petkov
@ 2017-10-19  6:30                     ` Ricardo Neri
  2017-10-20  7:55                       ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-19  6:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, Oct 18, 2017 at 10:29:43PM +0200, Borislav Petkov wrote:
> On Tue, Oct 17, 2017 at 01:31:52PM -0700, Ricardo Neri wrote:
> > I apologize for the inconvenience, I have verified that my mail client
> > works properly this time. I double-checked that it did not wrap.
> 
> But did you try applying the patch which you have sent to yourself first?
> 
> Because it doesn't work here:
> 
> [boris@pd: ~/kernel/linux> test-apply.sh /tmp/ricardo.neri-calderon.13
> checking file arch/x86/include/asm/inat.h
> patch: **** malformed patch at line 124:  #define INAT_MAKE_GROUP(grp)  ((grp << INAT_GRP_OFFS) | INAT_MODRM)
> 
> Diffing your original patch which you've sent with git and this one
> which you've sent with evolution gives:
> 
>  diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
>  index 02aff08..1c78580 100644
>  --- a/arch/x86/include/asm/inat.h
>  +++ b/arch/x86/include/asm/inat.h
>  @@ -97,6 +97,16 @@
> - #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
> - #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
> - 
> + #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
> + #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
> + 
> 
> And already those spaces at the beginning of the line must be funky
> because the rest looks identical.
> 
> They should be:
> 
> 00001420  2c 31 36 20 40 40 0a 20  23 64 65 66 69 6e 65 20  |,16 @@. #define
> 
> i.e., a 0x0a for LF and 0x20 for space.
> 
> Yours are
> 
> 000017b0  36 20 2b 39 37 2c 31 36  20 40 40 0a c2 a0 23 64  |6 +97,16 @@...#d|
> 
> so there's 0x0a LF, but then there's 0xc2, and then there's 0xa0.
> 
> Looking at a UTF-8 table, it says:
> 
> U+00A0	 	c2 a0	NO-BREAK SPACE
> 
> so your patch is utf-8, no wonder it doesn't apply.

I saw in Documentation/process/email-clients.rst that emailed patches
should be in ASCII or UTF-8 encoding only, but my patch in UTF-8
causes problems. Is UTF-8 not desirable, then?

> 
> So try sending the patch again. But send it to yourself first and try
> applying it.
> 
> Alternatively, git send-email supports threading with --in-reply-to=
> so that is another possibility. But here you'll have to add the whole
> CC-list.
> 
> Also, Documentation/process/email-clients.rst has some notes on how to
> send patches with Evolution.

Thanks for the detailed explanation and the pointers to fix the problem.
I am sorry for the inconvenience. Here is the patch again. I made sure
that it applies cleanly with git am and patch:

When computing a linear address and segmentation is used, we need to know
the base address of the segment involved in the computation. In most
cases, the segment base address will be zero as in USER_DS/USER32_DS.
However, it may be possible that a user space program defines its own
segments via a local descriptor table. In such a case, the segment base
address may not be zero. Thus, the segment base address is needed to
correctly calculate the linear address.

If running in protected mode, the segment selector to be used when
computing a linear address is determined either by a segment override
prefix in the instruction or, failing that, inferred from the registers
involved in the computation of the effective address. Also, there are cases
when the segment override prefixes shall be ignored (e.g., code segments
are always selected by the CS segment register; string instructions always
use the ES segment register when using the rDI register as operand). In long
mode, segment registers are ignored, except for FS and GS. In these two
cases, base addresses are obtained from the respective MSRs.

For clarity, this process can be split into four steps (and an equal
number of functions): determine if segment override prefixes can be used;
parse the segment override prefixes, and use them if found; if not found
or cannot be used, use the default segment registers associated with the
operand registers. Once the segment register to use has been identified,
read its value to obtain the segment selector.

The method to obtain the segment selector depends on several factors. In
32-bit builds, segment selectors are saved into a pt_regs structure
when switching to kernel mode. The same is also true for virtual-8086
mode. In 64-bit builds, segmentation is mostly ignored, except when
running a program in 32-bit legacy mode. In this case, CS and SS can be
obtained from pt_regs. DS, ES, FS and GS can be read directly from
the respective segment registers.

In order to identify the segment registers, a new set of #defines is
introduced. It also includes two special identifiers. One of them
indicates when the default segment register associated with instruction
operands shall be used. Another one indicates that the contents of the
segment register shall be ignored; this identifier is used when in long
mode.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
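
Note (not for the commit log): the resolution order described in the
changelog can be sketched in user space as below. This is only an
illustration under simplifying assumptions: enum seg_idx, enum operand and
the three functions are hypothetical stand-ins for the INAT_SEG_REG_*
identifiers, the pt_regs offsets and the helpers in this patch. Prefix bytes
are matched directly instead of going through inat_get_opcode_attribute(),
and the 16-bit address-size check for AX/CX/DX is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the INAT_SEG_REG_* identifiers. */
enum seg_idx {
	SEG_IGNORE = 0, SEG_DEFAULT, SEG_CS, SEG_SS,
	SEG_DS, SEG_ES, SEG_FS, SEG_GS,
};

/* Hypothetical operand identifiers standing in for pt_regs offsets. */
enum operand {
	OP_NONE, OP_AX, OP_BX, OP_CX, OP_DX,
	OP_SI, OP_DI, OP_BP, OP_SP, OP_IP,
};

/* Scan legacy prefix bytes; -1 if more than one segment override is seen. */
static int seg_from_prefixes(const unsigned char *pfx, size_t n)
{
	int idx = SEG_DEFAULT, overrides = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		switch (pfx[i]) {
		case 0x2e: idx = SEG_CS; overrides++; break;
		case 0x36: idx = SEG_SS; overrides++; break;
		case 0x3e: idx = SEG_DS; overrides++; break;
		case 0x26: idx = SEG_ES; overrides++; break;
		case 0x64: idx = SEG_FS; overrides++; break;
		case 0x65: idx = SEG_GS; overrides++; break;
		default: break;	/* Not a segment override prefix. */
		}
	}

	return overrides > 1 ? -1 : idx;
}

/* Default segment for an operand register when no override applies. */
static int default_seg(enum operand op, bool string_insn, bool long_mode)
{
	if (long_mode)
		return SEG_IGNORE;

	switch (op) {
	case OP_DI:
		return string_insn ? SEG_ES : SEG_DS;
	case OP_BP:
	case OP_SP:
		return SEG_SS;
	case OP_IP:
		return SEG_CS;
	default:
		return SEG_DS;
	}
}

/* Combine the steps in the order used by resolve_seg_reg(). */
static int resolve_seg(const unsigned char *pfx, size_t n, enum operand op,
		       bool string_insn, bool long_mode)
{
	int idx;

	/* rIP never takes an override. */
	if (op == OP_IP)
		return long_mode ? SEG_IGNORE : SEG_CS;

	/* Overrides are not allowed for rDI in string instructions. */
	if (op == OP_DI && string_insn)
		return default_seg(op, string_insn, long_mode);

	idx = seg_from_prefixes(pfx, n);
	if (idx < 0)
		return idx;

	if (idx == SEG_DEFAULT)
		return default_seg(op, string_insn, long_mode);

	/* In long mode only FS and GS overrides are honored. */
	if (long_mode && idx != SEG_FS && idx != SEG_GS)
		return SEG_IGNORE;

	return idx;
}
```

With this sketch, a lone 0x64 prefix resolves to FS in both protected and
long mode, while a lone 0x2e prefix resolves to CS in protected mode and is
ignored in long mode. Reading the selector for the resulting index is the
separate, fourth step, done by get_segment_selector() in the patch.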
 arch/x86/include/asm/inat.h |  10 ++
 arch/x86/lib/insn-eval.c    | 340 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 350 insertions(+)

diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
index 02aff08..1c78580 100644
--- a/arch/x86/include/asm/inat.h
+++ b/arch/x86/include/asm/inat.h
@@ -97,6 +97,16 @@
 #define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
 #define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
 
+/* Identifiers for segment registers */
+#define INAT_SEG_REG_IGNORE	0
+#define INAT_SEG_REG_DEFAULT	1
+#define INAT_SEG_REG_CS		2
+#define INAT_SEG_REG_SS		3
+#define INAT_SEG_REG_DS		4
+#define INAT_SEG_REG_ES		5
+#define INAT_SEG_REG_FS		6
+#define INAT_SEG_REG_GS		7
+
 /* Attribute search APIs */
 extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode);
 extern int inat_get_last_prefix_id(insn_byte_t last_pfx);
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index ac7b87c..5f610be 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -9,6 +9,7 @@
 #include <asm/inat.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/vm86.h>
 
 #undef pr_fmt
 #define pr_fmt(fmt) "insn: " fmt
@@ -47,6 +48,345 @@ static bool is_string_insn(struct insn *insn)
 	}
 }
 
+/**
+ * get_seg_reg_override_idx() - obtain segment register override index
+ * @insn:	Valid instruction with segment override prefixes
+ *
+ * Inspect the instruction prefixes in @insn and find segment overrides, if any.
+ *
+ * Returns:
+ *
+ * A constant identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_DEFAULT is returned if no segment override
+ * prefixes were found.
+ *
+ * -EINVAL in case of error.
+ */
+static int get_seg_reg_override_idx(struct insn *insn)
+{
+	int idx = INAT_SEG_REG_DEFAULT;
+	int num_overrides = 0, i;
+
+	insn_get_prefixes(insn);
+
+	/* Look for any segment override prefixes. */
+	for (i = 0; i < insn->prefixes.nbytes; i++) {
+		insn_attr_t attr;
+
+		attr = inat_get_opcode_attribute(insn->prefixes.bytes[i]);
+		switch (attr) {
+		case INAT_MAKE_PREFIX(INAT_PFX_CS):
+			idx = INAT_SEG_REG_CS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_SS):
+			idx = INAT_SEG_REG_SS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_DS):
+			idx = INAT_SEG_REG_DS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_ES):
+			idx = INAT_SEG_REG_ES;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_FS):
+			idx = INAT_SEG_REG_FS;
+			num_overrides++;
+			break;
+		case INAT_MAKE_PREFIX(INAT_PFX_GS):
+			idx = INAT_SEG_REG_GS;
+			num_overrides++;
+			break;
+		/* No default action needed. */
+		}
+	}
+
+	/* More than one segment override prefix leads to undefined behavior. */
+	if (num_overrides > 1)
+		return -EINVAL;
+
+	return idx;
+}
+
+/**
+ * check_seg_overrides() - check if segment override prefixes are allowed
+ * @insn:	Valid instruction with segment override prefixes
+ * @regoff:	Operand offset, in pt_regs, for which the check is performed
+ *
+ * For a particular register used in register-indirect addressing, determine if
+ * segment override prefixes can be used. Specifically, no overrides are allowed
+ * for rDI if used with a string instruction.
+ *
+ * Returns:
+ *
+ * True if segment override prefixes can be used with the register indicated
+ * in @regoff. False if otherwise.
+ */
+static bool check_seg_overrides(struct insn *insn, int regoff)
+{
+	if (regoff == offsetof(struct pt_regs, di) && is_string_insn(insn))
+		return false;
+
+	return true;
+}
+
+/**
+ * resolve_default_seg() - resolve default segment register index for an operand
+ * @insn:	Valid instruction with opcode
+ * @regs:	Register values as seen when entering kernel mode
+ * @off:	Operand offset, in pt_regs, for which resolution is needed
+ *
+ * Resolve the default segment register index associated with the instruction
+ * operand register indicated by @off. Such index is resolved based on defaults
+ * described in the Intel Software Development Manual.
+ *
+ * Returns:
+ *
+ * If in protected mode, a constant identifying the segment register to use,
+ * among CS, SS, ES or DS. If in long mode, INAT_SEG_REG_IGNORE.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
+{
+	if (user_64bit_mode(regs))
+		return INAT_SEG_REG_IGNORE;
+	/*
+	 * Resolve the default segment register as described in Section 3.7.4
+	 * of the Intel Software Development Manual Vol. 1:
+	 *
+	 *  + DS for all references involving r[ABCD]X, and rSI.
+	 *  + If used in a string instruction, ES for rDI. Otherwise, DS.
+	 *  + AX, CX and DX are not valid register operands in 16-bit address
+	 *    encodings but are valid for 32-bit and 64-bit encodings.
+	 *  + -EDOM is reserved to identify cases in which no register
+	 *    is used (i.e., displacement-only addressing). Use DS.
+	 *  + SS for rSP or rBP.
+	 *  + CS for rIP.
+	 */
+
+	switch (off) {
+	case offsetof(struct pt_regs, ax):
+	case offsetof(struct pt_regs, cx):
+	case offsetof(struct pt_regs, dx):
+		/* Need insn to verify address size. */
+		if (insn->addr_bytes == 2)
+			return -EINVAL;
+
+	case -EDOM:
+	case offsetof(struct pt_regs, bx):
+	case offsetof(struct pt_regs, si):
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, di):
+		if (is_string_insn(insn))
+			return INAT_SEG_REG_ES;
+		return INAT_SEG_REG_DS;
+
+	case offsetof(struct pt_regs, bp):
+	case offsetof(struct pt_regs, sp):
+		return INAT_SEG_REG_SS;
+
+	case offsetof(struct pt_regs, ip):
+		return INAT_SEG_REG_CS;
+
+	default:
+		return -EINVAL;
+	}
+}
+
+/**
+ * resolve_seg_reg() - obtain segment register index
+ * @insn:	Instruction with operands
+ * @regs:	Register values as seen when entering kernel mode
+ * @regoff:	Operand offset, in pt_regs, used to determine segment register
+ *
+ * Determine the segment register associated with the operands and, if
+ * applicable, prefixes and the instruction pointed to by @insn.
+ *
+ * The segment register associated with an operand used in register-indirect
+ * addressing depends on:
+ *
+ * a) Whether running in long mode (in such a case segments are ignored, except
+ *    if FS or GS are used).
+ *
+ * b) Whether segment override prefixes can be used. Certain instructions and
+ *    registers do not allow override prefixes.
+ *
+ * c) Whether segment override prefixes are found in the instruction prefixes.
+ *
+ * d) If there are no segment override prefixes or they cannot be used, the
+ *    default segment register associated with the operand register is used.
+ *
+ * The function checks first if segment override prefixes can be used with the
+ * operand indicated by @regoff. If allowed, obtain such overridden segment
+ * register index. Lastly, if no prefixes were found or cannot be used, resolve
+ * the segment register index to use based on the defaults described in the
+ * Intel documentation. In long mode, all segment register indexes will be
+ * ignored, except if overrides were found for FS or GS. All these operations
+ * are done using helper functions.
+ *
+ * The operand register, @regoff, is represented as the offset from the base of
+ * pt_regs.
+ *
+ * As stated, the main use of this function is to determine the segment register
+ * index based on the instruction, its operands and prefixes. Hence, @insn
+ * must be valid. However, if @regoff indicates rIP, we don't need to inspect
+ * @insn at all as in this case CS is used in all cases. This case is checked
+ * before proceeding further.
+ *
+ * Please note that this function does not return the value in the segment
+ * register (i.e., the segment selector) but our defined index. The segment
+ * selector needs to be obtained using get_segment_selector() and passing the
+ * segment register index resolved by this function.
+ *
+ * Returns:
+ *
+ * An index identifying the segment register to use, among CS, SS, DS,
+ * ES, FS, or GS. INAT_SEG_REG_IGNORE is returned if running in long mode.
+ *
+ * -EINVAL in case of error.
+ */
+static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
+{
+	int idx;
+
+	/*
+	 * In the unlikely event of having to resolve the segment register
+	 * index for rIP, do it first. Segment override prefixes should not
+	 * be used. Hence, it is not necessary to inspect the instruction,
+	 * which may be invalid at this point.
+	 */
+	if (regoff == offsetof(struct pt_regs, ip)) {
+		if (user_64bit_mode(regs))
+			return INAT_SEG_REG_IGNORE;
+		else
+			return INAT_SEG_REG_CS;
+	}
+
+	if (!insn)
+		return -EINVAL;
+
+	if (!check_seg_overrides(insn, regoff))
+		return resolve_default_seg(insn, regs, regoff);
+
+	idx = get_seg_reg_override_idx(insn);
+	if (idx < 0)
+		return idx;
+
+	if (idx == INAT_SEG_REG_DEFAULT)
+		return resolve_default_seg(insn, regs, regoff);
+
+	/*
+	 * In long mode, segment override prefixes are ignored, except for
+	 * overrides for FS and GS.
+	 */
+	if (user_64bit_mode(regs)) {
+		if (idx != INAT_SEG_REG_FS &&
+		    idx != INAT_SEG_REG_GS)
+			idx = INAT_SEG_REG_IGNORE;
+	}
+
+	return idx;
+}
+
+/**
+ * get_segment_selector() - obtain segment selector
+ * @regs:		Register values as seen when entering kernel mode
+ * @seg_reg_idx:	Segment register index to use
+ *
+ * Obtain the segment selector from any of the CS, SS, DS, ES, FS, GS segment
+ * registers. In CONFIG_X86_32, the segment is obtained from either pt_regs or
+ * kernel_vm86_regs as applicable. In CONFIG_X86_64, CS and SS are obtained
+ * from pt_regs. DS, ES, FS and GS are obtained by reading the actual CPU
+ * registers. This is done only for completeness as in CONFIG_X86_64 segment
+ * registers are ignored.
+ *
+ * Returns:
+ *
+ * Value of the segment selector, including null when running in
+ * long mode.
+ *
+ * -EINVAL on error.
+ */
+static short get_segment_selector(struct pt_regs *regs, int seg_reg_idx)
+{
+#ifdef CONFIG_X86_64
+	unsigned short sel;
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_IGNORE:
+		return 0;
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		savesegment(ds, sel);
+		return sel;
+	case INAT_SEG_REG_ES:
+		savesegment(es, sel);
+		return sel;
+	case INAT_SEG_REG_FS:
+		savesegment(fs, sel);
+		return sel;
+	case INAT_SEG_REG_GS:
+		savesegment(gs, sel);
+		return sel;
+	default:
+		return -EINVAL;
+	}
+#else /* CONFIG_X86_32 */
+	struct kernel_vm86_regs *vm86regs = (struct kernel_vm86_regs *)regs;
+
+	if (v8086_mode(regs)) {
+		switch (seg_reg_idx) {
+		case INAT_SEG_REG_CS:
+			return (unsigned short)(regs->cs & 0xffff);
+		case INAT_SEG_REG_SS:
+			return (unsigned short)(regs->ss & 0xffff);
+		case INAT_SEG_REG_DS:
+			return vm86regs->ds;
+		case INAT_SEG_REG_ES:
+			return vm86regs->es;
+		case INAT_SEG_REG_FS:
+			return vm86regs->fs;
+		case INAT_SEG_REG_GS:
+			return vm86regs->gs;
+		case INAT_SEG_REG_IGNORE:
+			/* fall through */
+		default:
+			return -EINVAL;
+		}
+	}
+
+	switch (seg_reg_idx) {
+	case INAT_SEG_REG_CS:
+		return (unsigned short)(regs->cs & 0xffff);
+	case INAT_SEG_REG_SS:
+		return (unsigned short)(regs->ss & 0xffff);
+	case INAT_SEG_REG_DS:
+		return (unsigned short)(regs->ds & 0xffff);
+	case INAT_SEG_REG_ES:
+		return (unsigned short)(regs->es & 0xffff);
+	case INAT_SEG_REG_FS:
+		return (unsigned short)(regs->fs & 0xffff);
+	case INAT_SEG_REG_GS:
+		/*
+		 * GS may or may not be in regs as per CONFIG_X86_32_LAZY_GS.
+		 * The macro below takes care of both cases.
+		 */
+		return get_user_gs(regs);
+	case INAT_SEG_REG_IGNORE:
+		/* fall through */
+	default:
+		return -EINVAL;
+	}
+#endif /* CONFIG_X86_64 */
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 			  enum reg_type type)
 {
-- 
2.7.4

Thanks and BR,
Ricardo

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-19  6:30                     ` Ricardo Neri
@ 2017-10-20  7:55                       ` Borislav Petkov
  2017-10-20 18:05                         ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-20  7:55 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Wed, Oct 18, 2017 at 11:30:54PM -0700, Ricardo Neri wrote:
> I saw in Documentation/process/email-clients.rst that emailed patches
> should be in ASCII or UTF-8 encodings only, but my patch in UTF-8
> causes problems. Is UTF-8 not desirable, then?

Well, I don't think so even though I couldn't find anything that says so
explicitly.

But I *think* source files' encoding should be ASCII, as otherwise even
the tool patch barfs:

 patch: **** malformed patch at line 124:

> Thanks for the detailed explanation and the pointers to fix the problem.
> I am sorry for the inconvenience. Here is the patch again. I made sure
> that it applies cleanly with git am and patch:

Yes, it looks good now:

Reviewed-by: Borislav Petkov <bp@suse.de>

( ... or I've been staring definitely too long at this code and don't care
  anymore :-)))

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 17/29] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b
  2017-10-04  3:54 ` [PATCH v9 17/29] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b Ricardo Neri
@ 2017-10-20 15:44   ` Borislav Petkov
  2017-10-20 18:07     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-20 15:44 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 03, 2017 at 08:54:20PM -0700, Ricardo Neri wrote:
> Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual volume 2A states that when ModRM.mod is zero and
> ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
> that none of the registers are used in the computation of the effective
> address. A return value of -EDOM indicates to callers that they should not
> use the value of registers when computing the effective address for the
> instruction.
> 
> In long mode, the effective address is given by the 32-bit displacement
> plus the location of the next instruction. In protected mode, only the
> displacement is used.
> 
> The instruction decoder takes care of obtaining the displacement.
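
For readers following along, the rule described above can be sketched in
plain C. This is only an illustration under stated assumptions: the helper
names modrm_is_disp32_only() and disp32_eff_addr() are hypothetical and do
not exist in insn-eval.c, and next_ip is assumed to be the address of the
instruction that follows the decoded one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of insn-eval.c. */

/* ModRM layout: mod = bits 7:6, reg = bits 5:3, rm = bits 2:0. */
static inline unsigned int modrm_mod(uint8_t modrm) { return modrm >> 6; }
static inline unsigned int modrm_rm(uint8_t modrm)  { return modrm & 0x7; }

/* True for the "disp32 only" encoding: mod == 00b and rm == 101b. */
static bool modrm_is_disp32_only(uint8_t modrm)
{
	return modrm_mod(modrm) == 0 && modrm_rm(modrm) == 5;
}

/*
 * Effective address for the disp32-only form. No general-purpose
 * register takes part in the computation.
 */
static uint64_t disp32_eff_addr(int32_t disp, uint64_t next_ip, bool long_mode)
{
	if (long_mode)
		return next_ip + (int64_t)disp;	/* relative to next instruction */

	return (uint32_t)disp;			/* displacement only */
}
```

A caller that sees modrm_is_disp32_only() return true would skip the
register lookup entirely, which is the situation the -EDOM return value
is meant to signal.
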
> 
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> Cc: Colin Ian King <colin.king@canonical.com>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> Cc: x86@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 18/29] x86/insn-eval: Incorporate segment base in linear address computation
  2017-10-04  3:54 ` [PATCH v9 18/29] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
@ 2017-10-20 16:08   ` Borislav Petkov
  2017-10-20 18:10     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-20 16:08 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 03, 2017 at 08:54:21PM -0700, Ricardo Neri wrote:
> insn_get_addr_ref() returns the effective address as defined by the
> section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual. In order to compute the linear address, we must add
> to the effective address the segment base address as set in the segment
> descriptor. The segment descriptor to use depends on the register used as
> operand and segment override prefixes, if any.
> 
> In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
> segment is used or if segmentation is not used. However, the base address
> is not necessarily zero if a user program defines its own segments. This
> is possible by using a local descriptor table.
> 
> Since the effective address is a signed quantity, the unsigned segment
> base address is saved in a separate variable and added to the final,
> unsigned, effective address.
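
As a rough illustration of the arithmetic described above (a sketch only;
linear_address() is a hypothetical name, not a function in this patch):
the signed effective address is converted to unsigned and the unsigned
segment base is then added to it.

```c
/*
 * Hypothetical helper for illustration; not the kernel's code.
 * eff_addr is signed because displacements and index terms may be
 * negative. seg_base is the unsigned base from the segment descriptor.
 */
static unsigned long linear_address(long eff_addr, unsigned long seg_base)
{
	return (unsigned long)eff_addr + seg_base;
}
```

With a flat segment (base 0) the linear address equals the effective
address. With a non-zero base set up through the LDT, the two differ by
exactly that base.
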
> 
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> Cc: Colin Ian King <colin.king@canonical.com>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> Cc: x86@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index dd84819..b3aa891 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -719,8 +719,8 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
>   */
>  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
>  {
> -	int addr_offset, base_offset, indx_offset;
> -	unsigned long linear_addr = -1L;
> +	int addr_offset, base_offset, indx_offset, seg_reg_indx;
> +	unsigned long linear_addr = -1L, seg_base_addr;
>  	long eff_addr, base, indx;
>  	insn_byte_t sib;
>  
> @@ -734,6 +734,14 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
>  			goto out;
>  
>  		eff_addr = regs_get_register(regs, addr_offset);
> +
> +		seg_reg_indx = resolve_seg_reg(insn, regs, addr_offset);
> +		if (seg_reg_indx < 0)
> +			goto out;
> +
> +		seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
> +		if (seg_base_addr == -1L)
> +			goto out;

Instead of replicating the same calls three times, add a
get_seg_base_addr() helper and call it where needed.
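
Something along these lines, perhaps. This is a minimal sketch and not a
tested patch: the signature of get_seg_base_addr() (returning 0 or a
negative errno and passing the base back through a pointer) is an
assumption, and the struct definitions and the two callee bodies below are
stand-in stubs so that the snippet compiles on its own.

```c
#include <errno.h>

/* Stand-in types so the sketch is self-contained. */
struct insn { int unused; };
struct pt_regs { int unused; };

/* Stub: pretend any valid register offset resolves to segment index 3. */
static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
{
	(void)insn;
	(void)regs;
	return regoff < 0 ? -EINVAL : 3;
}

/* Stub: pretend every segment is flat, with a base address of 0. */
static unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
{
	(void)regs;
	(void)seg_reg_idx;
	return 0;
}

/*
 * Assumed shape of the helper: resolve the segment register for the
 * operand at regoff and fetch its base, so callers do it in one call.
 */
static int get_seg_base_addr(struct insn *insn, struct pt_regs *regs,
			     int regoff, unsigned long *base)
{
	unsigned long seg_base_addr;
	int seg_reg_indx;

	seg_reg_indx = resolve_seg_reg(insn, regs, regoff);
	if (seg_reg_indx < 0)
		return seg_reg_indx;

	seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
	if (seg_base_addr == (unsigned long)-1L)
		return -EINVAL;

	*base = seg_base_addr;
	return 0;
}
```

Each of the three call sites would then shrink to one call plus one error
check.
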

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings
  2017-10-04  3:54 ` [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings Ricardo Neri
@ 2017-10-20 17:12   ` Borislav Petkov
  2017-10-20 18:24     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-20 17:12 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Tue, Oct 03, 2017 at 08:54:22PM -0700, Ricardo Neri wrote:
> The new function get_addr_ref_32() is almost identical to the existing
> function insn_get_addr_ref() (used for 64-bit addresses); except for the
> differences mentioned above. For the sake of simplicity and readability,
> it is better to use two separate functions.

You're kidding, right?

You're not adding another small function - this new one is just as big. And
almost identical.

So if you split the whole handling into helpers - for example, each
if-clause is doing very similar things - you can carve out the repeating
pieces into helpers and then call them each time with the respective
parameters, you can get rid of all that needless duplication.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector
  2017-10-20  7:55                       ` Borislav Petkov
@ 2017-10-20 18:05                         ` Ricardo Neri
  0 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-20 18:05 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 20, 2017 at 09:55:40AM +0200, Borislav Petkov wrote:
> On Wed, Oct 18, 2017 at 11:30:54PM -0700, Ricardo Neri wrote:
> > I saw in Documentation/process/email-clients.rst that emailed patches
> > should be in ASCII or UTF-8 encodings only, but my patch in UTF-8
> > causes problems. Is UTF-8 not desirable, then?
> 
> Well, I don't think so even though I couldn't find anything that says so
> explicitly.
> 
> But I *think* source files' encoding should be ASCII, as otherwise even
> the tool patch barfs:
> 
>  patch: **** malformed patch at line 124:

I will stay away from UTF-8.

> 
> > Thanks for the detailed explanation and the pointers to fix the problem.
> > I am sorry for the inconvenience. Here is the patch again. I made sure
> > that it applies cleanly with git am and patch:
> 
> Yes, it looks good now:
> 
> Reviewed-by: Borislav Petkov <bp@suse.de>

Thank you!
> 
> ( ... or I've been staring definitely too long at this code and don't care
>   anymore :-)))

Your guidance definitely helped to make it more readable.

BR,
Ricardo

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 17/29] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b
  2017-10-20 15:44   ` Borislav Petkov
@ 2017-10-20 18:07     ` Ricardo Neri
  0 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-20 18:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 20, 2017 at 05:44:48PM +0200, Borislav Petkov wrote:
> On Tue, Oct 03, 2017 at 08:54:20PM -0700, Ricardo Neri wrote:
> > Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual volume 2A states that when ModRM.mod is zero and
> > ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
> > that none of the registers are used in the computation of the effective
> > address. A return value of -EDOM indicates to callers that they should not
> > use the value of registers when computing the effective address for the
> > instruction.
> > 
> > In long mode, the effective address is given by the 32-bit displacement
> > plus the location of the next instruction. In protected mode, only the
> > displacement is used.
> > 
> > The instruction decoder takes care of obtaining the displacement.
> > 
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> > Cc: Colin Ian King <colin.king@canonical.com>
> > Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> > Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Thomas Garnier <thgarnie@google.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> > Cc: x86@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 25 ++++++++++++++++++++++---
> >  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> Reviewed-by: Borislav Petkov <bp@suse.de>

Thank you!

BR,
Ricardo

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 18/29] x86/insn-eval: Incorporate segment base in linear address computation
  2017-10-20 16:08   ` Borislav Petkov
@ 2017-10-20 18:10     ` Ricardo Neri
  0 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-20 18:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 20, 2017 at 06:08:41PM +0200, Borislav Petkov wrote:
> On Tue, Oct 03, 2017 at 08:54:21PM -0700, Ricardo Neri wrote:
> > insn_get_addr_ref() returns the effective address as defined by the
> > section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual. In order to compute the linear address, we must add
> > to the effective address the segment base address as set in the segment
> > descriptor. The segment descriptor to use depends on the register used as
> > operand and segment override prefixes, if any.
> > 
> > In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
> > segment is used or if segmentation is not used. However, the base address
> > is not necessarily zero if a user program defines its own segments. This
> > is possible by using a local descriptor table.
> > 
> > Since the effective address is a signed quantity, the unsigned segment
> > base address is saved in a separate variable and added to the final,
> > unsigned, effective address.
> > 
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
> > Cc: Colin Ian King <colin.king@canonical.com>
> > Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> > Cc: Qiaowei Ren <qiaowei.ren@intel.com>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Thomas Garnier <thgarnie@google.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
> > Cc: x86@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 30 +++++++++++++++++++++++++++---
> >  1 file changed, 27 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index dd84819..b3aa891 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -719,8 +719,8 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
> >   */
> >  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> >  {
> > -	int addr_offset, base_offset, indx_offset;
> > -	unsigned long linear_addr = -1L;
> > +	int addr_offset, base_offset, indx_offset, seg_reg_indx;
> > +	unsigned long linear_addr = -1L, seg_base_addr;
> >  	long eff_addr, base, indx;
> >  	insn_byte_t sib;
> >  
> > @@ -734,6 +734,14 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> >  			goto out;
> >  
> >  		eff_addr = regs_get_register(regs, addr_offset);
> > +
> > +		seg_reg_indx = resolve_seg_reg(insn, regs, addr_offset);
> > +		if (seg_reg_indx < 0)
> > +			goto out;
> > +
> > +		seg_base_addr = insn_get_seg_base(regs, seg_reg_indx);
> > +		if (seg_base_addr == -1L)
> > +			goto out;
> 
> Instead of replicating the same calls three times, add a
> get_seg_base_addr() helper and call it where needed.

I will add this function.

Thanks and BR,
Ricardo
> -- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings
  2017-10-20 17:12   ` Borislav Petkov
@ 2017-10-20 18:24     ` Ricardo Neri
  2017-10-20 18:38       ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-20 18:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 20, 2017 at 07:12:30PM +0200, Borislav Petkov wrote:
> On Tue, Oct 03, 2017 at 08:54:22PM -0700, Ricardo Neri wrote:
> > The new function get_addr_ref_32() is almost identical to the existing
> > function insn_get_addr_ref() (used for 64-bit addresses); except for the
> > differences mentioned above. For the sake of simplicity and readability,
> > it is better to use two separate functions.
> 
> You're kidding, right?
> 
> You're not adding another small function - this new one is just as big. And
> almost identical.
> 
> So if you split the whole handling into helpers - for example, each
> if-clause is doing very similar things - you can carve out the repeating
> pieces into helpers and then call them each time with the respective
> parameters, you can get rid of all that needless duplication.

I will create these helper functions. This change and your suggestion in
patch 18 will impact other patches in the series (e.g., the function
get_addr_ref_16() in patch 22). Would it make sense to submit a v10 and
resume review there?

Also, do you think I am still on time to make it to v4.15?

Thanks and BR,
Ricardo 
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> -- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings
  2017-10-20 18:24     ` Ricardo Neri
@ 2017-10-20 18:38       ` Borislav Petkov
  2017-10-20 19:16         ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2017-10-20 18:38 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 20, 2017 at 11:24:48AM -0700, Ricardo Neri wrote:
> I will create these helper functions. This change and your suggestion in
> patch 18 will impact other patches in the series (e.g., the function
> get_addr_ref_16() in patch 22). Would it make sense to submit a v10 and
> resume review there?
> 
> Also, do you think I am still on time to make it to v4.15?

Well, I've been thinking about it: handling huge patchsets is always
very cumbersome, time-consuming and error prone. So perhaps it would be
easier - maybe - I'm not saying it will definitely but only maybe - if
you would split the patchset into, say, two, pieces, or halves, if you
will.

And I think the first piece is more or less reviewed and if tip guys
don't find any booboos, it could go in now. Which would free you to deal
with the other half later.

Anyway, this is just an idea - it might not work but it is still worth
considering.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings
  2017-10-20 18:38       ` Borislav Petkov
@ 2017-10-20 19:16         ` Ricardo Neri
  2017-10-20 22:04           ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Ricardo Neri @ 2017-10-20 19:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 20, 2017 at 08:38:25PM +0200, Borislav Petkov wrote:
> On Fri, Oct 20, 2017 at 11:24:48AM -0700, Ricardo Neri wrote:
> > I will create these helper functions. This change and your suggestion in
> > patch 18 will impact other patches in the series (e.g., the function
> > get_addr_ref_16() in patch 22). Would it make sense to submit a v10 and
> > resume review there?
> > 
> > Also, do you think I am still on time to make it to v4.15?
> 
> Well, I've been thinking about it: handling huge patchsets is always
> very cumbersome, time-consuming and error prone. So perhaps it would be
> easier - maybe - I'm not saying it will definitely but only maybe - if
> you would split the patchset into, say, two, pieces, or halves, if you
> will.
> 
> And I think the first piece is more or less reviewed and if tip guys
> don't find any booboos, it could go in now. Which would free you to deal
> with the other half later.

Since MPX uses this emulation code and only cares about 64-bit addresses
(given the initial implementation on which I based my code), patches 1-18
need to be pulled together.

Perhaps I can send the v10 of patches 1-18 (or a v1 since it is a new
series?). Patches 19-29 would constitute a series of improved emulation
plus UMIP code.

Does it make sense?

Thanks and BR,
Ricardo
> -- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings
  2017-10-20 19:16         ` Ricardo Neri
@ 2017-10-20 22:04           ` Borislav Petkov
  0 siblings, 0 replies; 83+ messages in thread
From: Borislav Petkov @ 2017-10-20 22:04 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, x86, ricardo.neri, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov

On Fri, Oct 20, 2017 at 12:16:06PM -0700, Ricardo Neri wrote:
> Perhaps I can send the v10 of patches 1-18 (or a v1 since it is a new
> series?). Patches 19-29 would constitute a series of improved emulation
> plus UMIP code.
> 
> Does it make sense?

Yap, it does. It is still tip guys' final decision but I think it should
be a bit easier one this way.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 02/29] x86/boot: Relocate definition of the initial state of CR0
  2017-10-04  3:54   ` Ricardo Neri
  (?)
@ 2017-10-26  7:51     ` Andy Lutomirski
  -1 siblings, 0 replies; 83+ messages in thread
From: Andy Lutomirski @ 2017-10-26  7:51 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Liang Z Li,
	Masami Hiramatsu, Huang Rui, Jiri Slaby, Jonathan Corbet,
	Michael S. Tsirkin, Paul Gortmaker, Vlastimil Babka, Chen Yucong,
	Ravi V. Shankar, Shuah Khan, linux-kernel, X86 ML, Neri, Ricardo,
	Borislav Petkov, Dave Hansen, Denys Vlasenko, Josh Poimboeuf,
	Linus Torvalds, linux-arch, linux-mm

On Tue, Oct 3, 2017 at 8:54 PM, Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:
> Both head_32.S and head_64.S utilize the same value to initialize the
> control register CR0. Also, other parts of the kernel might want to access
> this initial definition (e.g., emulation code for User-Mode Instruction
> Prevention uses this state to provide a sane dummy value for CR0 when
> emulating the smsw instruction). Thus, relocate this definition to a
> header file from which it can be conveniently accessed.

Reviewed-by: Andy Lutomirski <luto@kernel.org>

with the slight caveat that I think it might be a wee bit better if
UMIP emulation used a separate define UMIP_REPORTED_CR0.
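
To make the suggestion concrete, a minimal sketch. Assumptions: the
CR0_STATE bit set below is my understanding of the value used by
head_32.S/head_64.S and should be checked against the patch, and
emulated_smsw16() is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/* Architectural CR0 bit positions. */
#define X86_CR0_PE	(1UL << 0)	/* Protection Enable */
#define X86_CR0_MP	(1UL << 1)	/* Monitor Coprocessor */
#define X86_CR0_ET	(1UL << 4)	/* Extension Type */
#define X86_CR0_NE	(1UL << 5)	/* Numeric Error */
#define X86_CR0_WP	(1UL << 16)	/* Write Protect */
#define X86_CR0_AM	(1UL << 18)	/* Alignment Mask */
#define X86_CR0_PG	(1UL << 31)	/* Paging */

/* Assumed to match the initial CR0 state shared by head_{32,64}.S. */
#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
			 X86_CR0_PG)

/*
 * Separate name for what UMIP emulation reports to user space. It starts
 * out equal to CR0_STATE but can diverge without touching the boot code.
 */
#define UMIP_REPORTED_CR0	CR0_STATE

/* Hypothetical: value an emulated 16-bit SMSW would store. */
static uint16_t emulated_smsw16(void)
{
	return (uint16_t)(UMIP_REPORTED_CR0 & 0xffff);
}
```

The point of the extra define is only to decouple the two users: the boot
code keeps CR0_STATE, and the emulation code has its own knob.
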

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 25/29] x86/umip: Force a page fault when unable to copy emulated result to user
  2017-10-04  3:54 ` [PATCH v9 25/29] x86/umip: Force a page fault when unable to copy emulated result to user Ricardo Neri
@ 2017-10-26  7:59   ` Andy Lutomirski
  2017-10-27 21:46     ` Ricardo Neri
  0 siblings, 1 reply; 83+ messages in thread
From: Andy Lutomirski @ 2017-10-26  7:59 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andy Lutomirski,
	Borislav Petkov, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Liang Z Li,
	Masami Hiramatsu, Huang Rui, Jiri Slaby, Jonathan Corbet,
	Michael S. Tsirkin, Paul Gortmaker, Vlastimil Babka, Chen Yucong,
	Ravi V. Shankar, Shuah Khan, linux-kernel, X86 ML, Neri, Ricardo,
	Fenghua Yu, Tony Luck

On Tue, Oct 3, 2017 at 8:54 PM, Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:
> fixup_umip_exception() will be called from do_general_protection(). If the
> former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
> However, when emulation is successful but the emulated result cannot be
> copied to user space memory, it is more accurate to issue a SIGSEGV with
> SEGV_MAPERR with the offending address. A new function, inspired in
> force_sig_info_fault(), is introduced to model the page fault.

This code is slightly buggy (with, for example, PKRU, although the
chance that anyone ever notices is about nil).  For an alternative
approach, see current->thread.sig_on_uaccess_err, used in
arch/x86/entry/vsyscall/vsyscall_64.c.  But I'm fine with this patch
as is, too.
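
For readers unfamiliar with what the patch description means by "a SIGSEGV with SEGV_MAPERR with the offending address", here is a minimal user-space sketch of how such a signal looks to the receiving process. It is not taken from the series and does not exercise UMIP at all: it provokes an ordinary page fault on an unmapped page and inspects si_code and si_addr. The names probe_unmapped_write(), on_segv(), recover_point, seen_code and seen_addr are hypothetical, and Linux/POSIX behaviour is assumed.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helpers for illustration only; none of this is kernel code. */
static sigjmp_buf recover_point;
static volatile sig_atomic_t seen_code;
static void *volatile seen_addr;

/* Record what the kernel reported, then jump back past the faulting store. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	seen_code = info->si_code;
	seen_addr = info->si_addr;
	siglongjmp(recover_point, 1);
}

/*
 * Write to a page that was mapped and then unmapped. Returns the si_code
 * delivered with SIGSEGV, or -1 on setup failure. *fault_addr receives the
 * address reported in si_addr, *expected_addr the address actually touched.
 */
static int probe_unmapped_write(void **fault_addr, void **expected_addr)
{
	struct sigaction sa, old;
	long page = sysconf(_SC_PAGESIZE);
	char *p;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	munmap(p, (size_t)page);	/* the address is now unmapped */

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_segv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old) != 0)
		return -1;

	seen_code = 0;
	seen_addr = NULL;
	if (sigsetjmp(recover_point, 1) == 0)
		*(volatile char *)p = 1;	/* faults: no mapping at p */

	sigaction(SIGSEGV, &old, NULL);	/* restore the previous handler */
	*fault_addr = seen_addr;
	*expected_addr = p;
	return seen_code;
}
```

A handler written this way can tell an unmapped address (SEGV_MAPERR) apart from other causes of SIGSEGV and can report the offending address, which is the information the patch aims to preserve when the emulated result cannot be copied to user memory.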

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 02/29] x86/boot: Relocate definition of the initial state of CR0
  2017-10-26  7:51     ` Andy Lutomirski
  (?)
@ 2017-10-26  9:00       ` Borislav Petkov
  -1 siblings, 0 replies; 83+ messages in thread
From: Borislav Petkov @ 2017-10-26  9:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ricardo Neri, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, X86 ML, Neri, Ricardo, Dave Hansen,
	Denys Vlasenko, Josh Poimboeuf, Linus Torvalds, linux-arch,
	linux-mm

On Thu, Oct 26, 2017 at 12:51:25AM -0700, Andy Lutomirski wrote:
> with the slight caveat that I think it might be a wee bit better if
> UMIP emulation used a separate define UMIP_REPORTED_CR0.

Why, do you see CR0_STATE and UMIP_REPORTED_CR0 becoming different at
some point?

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 02/29] x86/boot: Relocate definition of the initial state of CR0
  2017-10-26  9:00       ` Borislav Petkov
  (?)
@ 2017-10-26  9:02         ` Andy Lutomirski
  -1 siblings, 0 replies; 83+ messages in thread
From: Andy Lutomirski @ 2017-10-26  9:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Ricardo Neri, Ingo Molnar, Thomas Gleixner,
	H. Peter Anvin, Peter Zijlstra, Andrew Morton, Brian Gerst,
	Chris Metcalf, Dave Hansen, Paolo Bonzini, Liang Z Li,
	Masami Hiramatsu, Huang Rui, Jiri Slaby, Jonathan Corbet,
	Michael S. Tsirkin, Paul Gortmaker, Vlastimil Babka, Chen Yucong,
	Ravi V. Shankar, Shuah Khan, linux-kernel, X86 ML, Neri, Ricardo,
	Dave Hansen, Denys Vlasenko, Josh Poimboeuf, Linus Torvalds,
	linux-arch, linux-mm

On Thu, Oct 26, 2017 at 2:00 AM, Borislav Petkov <bp@suse.de> wrote:
> On Thu, Oct 26, 2017 at 12:51:25AM -0700, Andy Lutomirski wrote:
>> with the slight caveat that I think it might be a wee bit better if
>> UMIP emulation used a separate define UMIP_REPORTED_CR0.
>
> Why, do you see CR0_STATE and UMIP_REPORTED_CR0 becoming different at
> some point?

I'm assuming that UMIP_REPORTED_CR0 will never change.  If CR0 gets a
new field that we set some day, then I assume that CR0_STATE would add
that bit but UMIP_REPORTED_CR0 would not.

>
> --
> Regards/Gruss,
>     Boris.
>
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> --
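
A minimal sketch of the distinction discussed above, not taken from the patch. CR0_STATE is written out here with the bits the kernel is understood to set at boot (treat the exact bit list as an assumption), while UMIP_REPORTED_CR0 is the hypothetical separate define proposed in this thread, frozen at today's value. umip_emulated_smsw16() is likewise a hypothetical helper showing what a 16-bit SMSW emulation would hand back to user space.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR0 bit positions on x86. */
#define X86_CR0_PE UINT32_C(0x00000001)	/* Protection Enable */
#define X86_CR0_MP UINT32_C(0x00000002)	/* Monitor Coprocessor */
#define X86_CR0_ET UINT32_C(0x00000010)	/* Extension Type */
#define X86_CR0_NE UINT32_C(0x00000020)	/* Numeric Error */
#define X86_CR0_WP UINT32_C(0x00010000)	/* Write Protect */
#define X86_CR0_AM UINT32_C(0x00040000)	/* Alignment Mask */
#define X86_CR0_PG UINT32_C(0x80000000)	/* Paging */

/* Initial CR0 value used at boot (assumed bit list). */
#define CR0_STATE (X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
		   X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | X86_CR0_PG)

/*
 * Hypothetical: a frozen copy for UMIP emulation. If CR0_STATE ever gains
 * a bit, this constant would stay put, so user space keeps seeing the
 * same dummy value.
 */
#define UMIP_REPORTED_CR0 UINT32_C(0x80050033)

/* Hypothetical helper: the low 16 bits a 16-bit SMSW would store. */
static inline uint16_t umip_emulated_smsw16(void)
{
	return (uint16_t)(UMIP_REPORTED_CR0 & UINT32_C(0xffff));
}

/* Today the two values coincide, which is why one define suffices for now. */
static inline int umip_cr0_matches_boot_state(void)
{
	return UMIP_REPORTED_CR0 == CR0_STATE;
}
```

With a single shared define, any future change to CR0_STATE would silently change what emulated SMSW reports; with two defines, the reported value is decoupled from the boot value at the cost of one more constant to maintain.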

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 02/29] x86/boot: Relocate definition of the initial state of CR0
  2017-10-26  9:02         ` Andy Lutomirski
  (?)
@ 2017-10-26 12:55           ` Borislav Petkov
  -1 siblings, 0 replies; 83+ messages in thread
From: Borislav Petkov @ 2017-10-26 12:55 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ricardo Neri, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, X86 ML, Neri, Ricardo, Dave Hansen, Denys Vlasenko,
	Josh Poimboeuf, Linus Torvalds, linux-arch, linux-mm

On Thu, Oct 26, 2017 at 02:02:02AM -0700, Andy Lutomirski wrote:
> I'm assuming that UMIP_REPORTED_CR0 will never change.  If CR0 gets a
> new field that we set some day, then I assume that CR0_STATE would add
> that bit but UMIP_REPORTED_CR0 would not.

Yeah, let's do that when it is actually needed.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 02/29] x86/boot: Relocate definition of the initial state of CR0
  2017-10-26 12:55           ` Borislav Petkov
  (?)
@ 2017-10-27 19:02             ` Ricardo Neri
  -1 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-27 19:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin, Paul Gortmaker,
	Vlastimil Babka, Chen Yucong, Ravi V. Shankar, Shuah Khan,
	linux-kernel, X86 ML, Neri, Ricardo, Dave Hansen, Denys Vlasenko,
	Josh Poimboeuf, Linus Torvalds, linux-arch, linux-mm

On Thu, Oct 26, 2017 at 02:55:13PM +0200, Borislav Petkov wrote:
> On Thu, Oct 26, 2017 at 02:02:02AM -0700, Andy Lutomirski wrote:
> > I'm assuming that UMIP_REPORTED_CR0 will never change.  If CR0 gets a
> > new field that we set some day, then I assume that CR0_STATE would add
> > that bit but UMIP_REPORTED_CR0 would not.
> 
> Yeah, let's do that when it is actually needed.

Thanks Andy! I reasoned that UMIP could report CR0_STATE, a value that
is already revealed in the source code. Thus, if CR0 ever changes at run
time, an attacker could only see what is set programmatically.

BR,

Ricardo

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 25/29] x86/umip: Force a page fault when unable to copy emulated result to user
  2017-10-26  7:59   ` Andy Lutomirski
@ 2017-10-27 21:46     ` Ricardo Neri
  0 siblings, 0 replies; 83+ messages in thread
From: Ricardo Neri @ 2017-10-27 21:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Borislav Petkov,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Liang Z Li, Masami Hiramatsu,
	Huang Rui, Jiri Slaby, Jonathan Corbet, Michael S. Tsirkin,
	Paul Gortmaker, Vlastimil Babka, Chen Yucong, Ravi V. Shankar,
	Shuah Khan, linux-kernel, X86 ML, Neri, Ricardo, Fenghua Yu,
	Tony Luck

On Thu, Oct 26, 2017 at 12:59:55AM -0700, Andy Lutomirski wrote:
> On Tue, Oct 3, 2017 at 8:54 PM, Ricardo Neri
> <ricardo.neri-calderon@linux.intel.com> wrote:
> > fixup_umip_exception() will be called from do_general_protection(). If the
> > former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
> > However, when emulation is successful but the emulated result cannot be
> > copied to user space memory, it is more accurate to issue a SIGSEGV with
> > SEGV_MAPERR with the offending address. A new function, inspired in
> > force_sig_info_fault(), is introduced to model the page fault.
> 
> This code is slightly buggy (with, for example, PKRU, although the
> chance that anyone ever notices is about nil).  For an alternative
> approach, see current->thread.sig_on_uaccess_err, used in
> arch/x86/entry/vsyscall/vsyscall_64.c.  But I'm fine with this patch
> as is, too.

Thanks Andy, I will study the alternative you mention. Since you are OK
with this patch, I will submit v10 of this series to allow the review
of the series to continue.

BR,
Ricardo

^ permalink raw reply	[flat|nested] 83+ messages in thread

Thread overview: 83+ messages
2017-10-04  3:54 [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 01/29] x86/mm: Relocate page fault error codes to traps.h Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 02/29] x86/boot: Relocate definition of the initial state of CR0 Ricardo Neri
2017-10-26  7:51   ` Andy Lutomirski
2017-10-26  9:00     ` Borislav Petkov
2017-10-26  9:02       ` Andy Lutomirski
2017-10-26 12:55         ` Borislav Petkov
2017-10-27 19:02           ` Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 03/29] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 04/29] uprobes/x86: Use existing definitions for segment override prefixes Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 05/29] x86/mpx: Simplify handling of errors when computing linear addresses Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 06/29] x86/mpx: Use signed variables to compute effective addresses Ricardo Neri
2017-10-05  9:41   ` Borislav Petkov
2017-10-05 17:38     ` Neri, Ricardo
2017-10-04  3:54 ` [PATCH v9 07/29] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 08/29] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0 Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 09/29] x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval file Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 10/29] x86/insn-eval: Do not BUG on invalid register type Ricardo Neri
2017-10-07 16:22   ` Borislav Petkov
2017-10-09 23:56     ` Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 11/29] x86/insn-eval: Add a utility function to get register offsets Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 12/29] x86/insn-eval: Add utility function to identify string instructions Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 13/29] x86/insn-eval: Add utility functions to get segment selector Ricardo Neri
2017-10-10 22:41   ` Borislav Petkov
2017-10-12  1:12     ` Ricardo Neri
2017-10-12  9:48       ` Borislav Petkov
2017-10-13  1:08         ` Ricardo Neri
2017-10-13 11:37           ` Borislav Petkov
2017-10-13 18:43             ` Ricardo Neri
2017-10-17  9:35               ` Borislav Petkov
2017-10-17 20:31                 ` Ricardo Neri
2017-10-18 20:29                   ` Borislav Petkov
2017-10-19  6:30                     ` Ricardo Neri
2017-10-20  7:55                       ` Borislav Petkov
2017-10-20 18:05                         ` Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 14/29] x86/insn-eval: Add utility function to get segment descriptor Ricardo Neri
2017-10-11 14:57   ` Borislav Petkov
2017-10-12  0:45     ` Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 15/29] x86/insn-eval: Add utility functions to get segment descriptor base address and limit Ricardo Neri
2017-10-11 15:15   ` Borislav Petkov
2017-10-11 19:57     ` Ricardo Neri
2017-10-11 20:16       ` Borislav Petkov
2017-10-12  1:24         ` Ricardo Neri
2017-10-12 16:02           ` Borislav Petkov
2017-10-04  3:54 ` [PATCH v9 16/29] x86/insn-eval: Add function to get default params of code segment Ricardo Neri
2017-10-12 16:31   ` Borislav Petkov
2017-10-12 18:27     ` Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 17/29] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 101b Ricardo Neri
2017-10-20 15:44   ` Borislav Petkov
2017-10-20 18:07     ` Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 18/29] x86/insn-eval: Incorporate segment base in linear address computation Ricardo Neri
2017-10-20 16:08   ` Borislav Petkov
2017-10-20 18:10     ` Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 19/29] x86/insn-eval: Add support to resolve 32-bit address encodings Ricardo Neri
2017-10-20 17:12   ` Borislav Petkov
2017-10-20 18:24     ` Ricardo Neri
2017-10-20 18:38       ` Borislav Petkov
2017-10-20 19:16         ` Ricardo Neri
2017-10-20 22:04           ` Borislav Petkov
2017-10-04  3:54 ` [PATCH v9 20/29] x86/insn-eval: Add wrapper function for 32 and 64-bit addresses Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 21/29] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 22/29] x86/insn-eval: Add support to resolve 16-bit addressing encodings Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 23/29] x86/cpufeature: Add User-Mode Instruction Prevention definitions Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 24/29] x86: Add emulation code for UMIP instructions Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 25/29] x86/umip: Force a page fault when unable to copy emulated result to user Ricardo Neri
2017-10-26  7:59   ` Andy Lutomirski
2017-10-27 21:46     ` Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 26/29] x86: Enable User-Mode Instruction Prevention Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 27/29] x86/traps: Fixup general protection faults caused by UMIP Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 28/29] selftests/x86: Add tests for User-Mode Instruction Prevention Ricardo Neri
2017-10-04  3:54 ` [PATCH v9 29/29] selftests/x86: Add tests for instruction str and sldt Ricardo Neri
