linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/33] Compile-time stack metadata validation
@ 2016-01-21 22:49 Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 01/33] x86/stacktool: " Josh Poimboeuf
                   ` (35 more replies)
  0 siblings, 36 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	David Vrabel, Borislav Petkov, Konrad Rzeszutek Wilk,
	Boris Ostrovsky, Jeremy Fitzhardinge, Chris Wright, Alok Kataria,
	Rusty Russell, Herbert Xu, David S. Miller, Pavel Machek,
	Rafael J. Wysocki, Len Brown, Matt Fleming, Alexei Starovoitov,
	netdev, Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

This is v16 of the compile-time stack metadata validation patch set,
along with proposed fixes for most of the warnings it found.  It's based
on the tip/master branch.

v15 can be found here:

  https://lkml.kernel.org/r/cover.1450442274.git.jpoimboe@redhat.com

For more information about the motivation behind this patch set, and
more details about what it does, see the first patch changelog and
tools/stacktool/Documentation/stack-validation.txt.

Patches 1-4 add stacktool and integrate it into the kernel build.

Patches 5-28 are some proposed fixes for several of the warnings
reported by stacktool.  They've been compile-tested and boot-tested in a
VM, but I haven't attempted any meaningful testing for many of them.

Patches 29-33 add some directories, files, and functions to the
stacktool whitelist in order to silence false positive warnings.

v16:
- fix all allyesconfig warnings, except for staging
- get rid of STACKTOOL_IGNORE_INSN which is no longer needed
- remove several whitelists in favor of automatically whitelisting any
  function with a special instruction like ljmp, lret, or vmrun
- split up stacktool patch into 3 parts as suggested by Ingo
- update the global noreturn function list
- detect noreturn function fallthroughs
- skip weak functions in noreturn call detection logic
- add empty function check to noreturn logic
- allow non-section rela symbols for __ex_table sections
- support rare switch table case with jmpq *[addr](%rip)
- don't warn on frame pointer restore without save
- rearrange patch order a bit

v15:
- restructure code for a new cmdline interface "stacktool check" using
  the new subcommand framework in tools/lib/subcmd
- fix 32-bit build failure (put __sp at end) in paravirt_types.h patch
  10, which was reported by 0day

v14:
- make tools/include/linux/list.h self-sufficient
- create FRAME_OFFSET to allow 32-bit code to access function arguments
  on the stack
- add FRAME_OFFSET usage in crypto patch 14/24: "Create stack frames in
  aesni-intel_asm.S"
- rename "index" -> "idx" to fix build with some compilers

v13:
- LDFLAGS order fix from Chris J Arges
- new warning fix patches from Chris J Arges
- "--frame-pointer" -> "--check-frame-pointer"

v12:
- rename "stackvalidate" -> "stacktool"
- move from scripts/ to tools/:
  - makefile rework
  - make a copy of the x86 insn code (and warn if the code diverges)
  - use tools/include/linux/list.h
- move warning macros to a new warn.h file
- change wording: "stack validation" -> "stack metadata validation"

v11:
- attempt to answer the "why" question better in the documentation and
  commit message
- s/FP_SAVE/FRAME_BEGIN/ in documentation

v10:
- add scripts/mod to directory ignores
- remove circular dependencies for ignored objects which are built
  before stackvalidate
- fix CONFIG_MODVERSIONS incompatibility

v9:
- rename FRAME/ENDFRAME -> FRAME_BEGIN/FRAME_END
- fix jump table issue for when the original instruction is a jump
- drop paravirt thunk alignment patch
- add maintainers to CC for proposed warning fixes

v8:
- add proposed fixes for warnings
- fix all memory leaks
- process ignores earlier and add more ignore checks
- always assume POPCNT alternative is enabled
- drop hweight inline asm fix
- drop __schedule() ignore patch
- change .Ltemp_\@ to .Lstackvalidate_ignore_\@ in asm macro
- fix CONFIG_* checks in asm macros
- add C versions of ignore macros and frame macros
- change ";" to "\n" in C macros
- add ifdef CONFIG_STACK_VALIDATION checks in C ignore macros
- use numbered label in C ignore macro
- add missing break in switch case statement in arch-x86.c

v7:
- sibling call support
- document proposed solution for inline asm() frame pointer issues
- say "kernel entry/exit" instead of "context switch"
- clarify the checking of switch statement jump tables
- discard __stackvalidate_ignore_* sections in linker script
- use .Ltemp_\@ to get a unique label instead of static 3-digit number
- change STACKVALIDATE_IGNORE_FUNC variable to a static
- move STACKVALIDATE_IGNORE_INSN to arch-specific .h file

v6:
- rename asmvalidate -> stackvalidate (again)
- gcc-generated object file support
- recursive branch state analysis
- external jump support
- fixup/exception table support
- jump label support
- switch statement jump table support
- added documentation
- detection of "noreturn" dead end functions
- added a Kbuild mechanism for skipping files and dirs
- moved frame pointer macros to arch/x86/include/asm/frame.h
- moved ignore macros to include/linux/stackvalidate.h

v5:
- stackvalidate -> asmvalidate
- frame pointers only required for non-leaf functions
- check for the use of the FP_SAVE/RESTORE macros instead of manually
  analyzing code to detect frame pointer usage
- additional checks to ensure each function doesn't leave its boundaries
- make the macros simpler and more flexible
- support for analyzing ALTERNATIVE macros
- simplified the arch interfaces in scripts/asmvalidate/arch.h
- fixed some asmvalidate warnings
- rebased onto latest tip asm cleanups
- many more small changes

v4:
- Changed the default to CONFIG_STACK_VALIDATION=n, until all the asm
  code can get cleaned up.
- Fixed a stackvalidate error path exit code issue found by Michal
  Marek.

v3:
- Added a patch to make the push/pop CFI macros arch-independent, as
  suggested by H. Peter Anvin

v2:
- Fixed memory leaks reported by Petr Mladek

Cc: linux-kernel@vger.kernel.org
Cc: live-patching@vger.kernel.org
Cc: Michal Marek <mmarek@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Pedro Alves <palves@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>

Chris J Arges (1):
  x86/uaccess: Add stack frame output operand in get_user inline asm

Josh Poimboeuf (32):
  x86/stacktool: Compile-time stack metadata validation
  kbuild/stacktool: Add CONFIG_STACK_VALIDATION option
  x86/stacktool: Enable stacktool on x86_64
  x86/stacktool: Add STACKTOOL_IGNORE_FUNC macro
  x86/xen: Add stack frame dependency to hypercall inline asm calls
  x86/asm/xen: Set ELF function type for xen_adjust_exception_frame()
  x86/asm/xen: Create stack frames in xen-asm.S
  x86/paravirt: Add stack frame dependency to PVOP inline asm calls
  x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK
  x86/amd: Set ELF function type for vide()
  x86/asm/crypto: Move .Lbswap_mask data to .rodata section
  x86/asm/crypto: Move jump_table to .rodata section
  x86/asm/crypto: Simplify stack usage in sha-mb functions
  x86/asm/crypto: Don't use rbp as a scratch register
  x86/asm/crypto: Create stack frames in crypto functions
  x86/asm/entry: Create stack frames in thunk functions
  x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()
  x86/asm: Create stack frames in rwsem functions
  x86/asm/efi: Create a stack frame in efi_call()
  x86/asm/power: Create stack frames in hibernate_asm_64.S
  x86/asm/bpf: Annotate callable functions
  x86/asm/bpf: Create stack frames in bpf_jit.S
  x86/kprobes: Get rid of kretprobe_trampoline_holder()
  x86/kvm: Set ELF function type for fastop functions
  x86/kvm: Add stack frame dependency to test_cc() inline asm
  watchdog/hpwdt: Create stack frame in asminline_call()
  x86/locking: Create stack frame in PV unlock
  x86/stacktool: Add directory and file whitelists
  x86/xen: Add xen_cpuid() to stacktool whitelist
  bpf: Add __bpf_prog_run() to stacktool whitelist
  sched: Add __schedule() to stacktool whitelist
  x86/kprobes: Add kretprobe_trampoline() to stacktool whitelist

 MAINTAINERS                                        |   6 +
 Makefile                                           |   5 +-
 arch/Kconfig                                       |   6 +
 arch/x86/Kconfig                                   |   1 +
 arch/x86/boot/Makefile                             |   1 +
 arch/x86/boot/compressed/Makefile                  |   3 +-
 arch/x86/crypto/aesni-intel_asm.S                  |  75 +-
 arch/x86/crypto/camellia-aesni-avx-asm_64.S        |  15 +
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S       |  15 +
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S          |   9 +
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S          |  13 +
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S          |   8 +-
 arch/x86/crypto/ghash-clmulni-intel_asm.S          |   5 +
 arch/x86/crypto/serpent-avx-x86_64-asm_64.S        |  13 +
 arch/x86/crypto/serpent-avx2-asm_64.S              |  13 +
 arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S    |  35 +-
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S   |  36 +-
 arch/x86/crypto/twofish-avx-x86_64-asm_64.S        |  13 +
 arch/x86/entry/Makefile                            |   4 +
 arch/x86/entry/thunk_64.S                          |   4 +
 arch/x86/entry/vdso/Makefile                       |   5 +-
 arch/x86/include/asm/paravirt.h                    |   9 +-
 arch/x86/include/asm/paravirt_types.h              |  18 +-
 arch/x86/include/asm/qspinlock_paravirt.h          |   4 +
 arch/x86/include/asm/uaccess.h                     |   5 +-
 arch/x86/include/asm/xen/hypercall.h               |   5 +-
 arch/x86/kernel/Makefile                           |   5 +
 arch/x86/kernel/acpi/wakeup_64.S                   |   3 +
 arch/x86/kernel/cpu/amd.c                          |   5 +-
 arch/x86/kernel/kprobes/core.c                     |  59 +-
 arch/x86/kernel/vmlinux.lds.S                      |   5 +-
 arch/x86/kvm/emulate.c                             |  33 +-
 arch/x86/lib/rwsem.S                               |  11 +-
 arch/x86/net/bpf_jit.S                             |  48 +-
 arch/x86/platform/efi/Makefile                     |   2 +
 arch/x86/platform/efi/efi_stub_64.S                |   3 +
 arch/x86/power/hibernate_asm_64.S                  |   7 +
 arch/x86/purgatory/Makefile                        |   2 +
 arch/x86/realmode/Makefile                         |   4 +-
 arch/x86/realmode/rm/Makefile                      |   3 +-
 arch/x86/xen/enlighten.c                           |   3 +-
 arch/x86/xen/xen-asm.S                             |  10 +-
 arch/x86/xen/xen-asm_64.S                          |   1 +
 drivers/firmware/efi/libstub/Makefile              |   1 +
 drivers/watchdog/hpwdt.c                           |   8 +-
 include/linux/stacktool.h                          |  23 +
 kernel/bpf/core.c                                  |   2 +
 kernel/sched/core.c                                |   2 +
 lib/Kconfig.debug                                  |  12 +
 scripts/Makefile.build                             |  38 +-
 scripts/mod/Makefile                               |   2 +
 tools/Makefile                                     |  14 +-
 tools/stacktool/.gitignore                         |   2 +
 tools/stacktool/Build                              |  13 +
 tools/stacktool/Documentation/stack-validation.txt | 333 +++++++
 tools/stacktool/Makefile                           |  60 ++
 tools/stacktool/arch.h                             |  44 +
 tools/stacktool/arch/x86/Build                     |  12 +
 tools/stacktool/arch/x86/decode.c                  | 172 ++++
 .../stacktool/arch/x86/insn/gen-insn-attr-x86.awk  | 387 ++++++++
 tools/stacktool/arch/x86/insn/inat.c               |  97 ++
 tools/stacktool/arch/x86/insn/inat.h               | 221 +++++
 tools/stacktool/arch/x86/insn/inat_types.h         |  29 +
 tools/stacktool/arch/x86/insn/insn.c               | 594 ++++++++++++
 tools/stacktool/arch/x86/insn/insn.h               | 201 +++++
 tools/stacktool/arch/x86/insn/x86-opcode-map.txt   | 984 ++++++++++++++++++++
 tools/stacktool/builtin-check.c                    | 991 +++++++++++++++++++++
 tools/stacktool/builtin.h                          |  22 +
 tools/stacktool/elf.c                              | 403 +++++++++
 tools/stacktool/elf.h                              |  79 ++
 tools/stacktool/special.c                          | 193 ++++
 tools/stacktool/special.h                          |  42 +
 tools/stacktool/stacktool.c                        | 134 +++
 tools/stacktool/warn.h                             |  60 ++
 74 files changed, 5516 insertions(+), 189 deletions(-)
 create mode 100644 include/linux/stacktool.h
 create mode 100644 tools/stacktool/.gitignore
 create mode 100644 tools/stacktool/Build
 create mode 100644 tools/stacktool/Documentation/stack-validation.txt
 create mode 100644 tools/stacktool/Makefile
 create mode 100644 tools/stacktool/arch.h
 create mode 100644 tools/stacktool/arch/x86/Build
 create mode 100644 tools/stacktool/arch/x86/decode.c
 create mode 100644 tools/stacktool/arch/x86/insn/gen-insn-attr-x86.awk
 create mode 100644 tools/stacktool/arch/x86/insn/inat.c
 create mode 100644 tools/stacktool/arch/x86/insn/inat.h
 create mode 100644 tools/stacktool/arch/x86/insn/inat_types.h
 create mode 100644 tools/stacktool/arch/x86/insn/insn.c
 create mode 100644 tools/stacktool/arch/x86/insn/insn.h
 create mode 100644 tools/stacktool/arch/x86/insn/x86-opcode-map.txt
 create mode 100644 tools/stacktool/builtin-check.c
 create mode 100644 tools/stacktool/builtin.h
 create mode 100644 tools/stacktool/elf.c
 create mode 100644 tools/stacktool/elf.h
 create mode 100644 tools/stacktool/special.c
 create mode 100644 tools/stacktool/special.h
 create mode 100644 tools/stacktool/stacktool.c
 create mode 100644 tools/stacktool/warn.h

-- 
2.4.3


* [PATCH 01/33] x86/stacktool: Compile-time stack metadata validation
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 02/33] kbuild/stacktool: Add CONFIG_STACK_VALIDATION option Josh Poimboeuf
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo

This adds a host tool named stacktool which analyzes .o files to ensure
the validity of stack metadata.  It enforces a set of rules on asm code
and C inline assembly code so that stack traces can be reliable.

For each function, it recursively follows all possible code paths and
validates the correct frame pointer state at each instruction.

It also follows code paths involving kernel special sections, like
.altinstructions, __jump_table, and __ex_table, which can add
alternative execution paths to a given instruction (or set of
instructions).  Similarly, it knows how to follow switch statements, for
which gcc sometimes uses jump tables.
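
The recursive walk described above can be pictured with a toy model.
This is only a sketch under stated assumptions, not the code in
builtin-check.c: the instruction kinds, struct insn, and
validate_branch() here are hypothetical names, and the instructions are
a hand-built array rather than something decoded from a .o file.  The
specific warnings it counts (call without a frame, duplicate frame
setup, return with the frame still set up, mismatched state where two
paths merge) are likewise assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction kinds (hypothetical; a real tool decodes machine code). */
enum insn_type {
	INSN_OTHER,		/* no effect on the frame pointer */
	INSN_FP_SAVE,		/* frame setup: push %rbp; mov %rsp,%rbp */
	INSN_FP_RESTORE,	/* frame teardown: pop %rbp */
	INSN_CALL,		/* call to another function */
	INSN_JUMP,		/* unconditional jump to .target */
	INSN_JUMP_COND,		/* conditional jump: .target or fall through */
	INSN_RETURN,
};

struct insn {
	enum insn_type type;
	size_t target;		/* jump destination index (jumps only) */
	bool visited;		/* already reached by an earlier path? */
	bool fp_seen;		/* frame pointer state on first visit */
};

/*
 * Follow every code path starting at insns[idx], tracking whether a
 * stack frame is currently set up.  Returns the number of warnings.
 */
int validate_branch(struct insn *insns, size_t n, size_t idx, bool fp_set)
{
	int warnings = 0;

	while (idx < n) {
		struct insn *in = &insns[idx];

		/* Paths which merge here must agree on the frame state. */
		if (in->visited)
			return warnings + (in->fp_seen != fp_set);
		in->visited = true;
		in->fp_seen = fp_set;

		switch (in->type) {
		case INSN_FP_SAVE:
			warnings += fp_set;	/* duplicate frame setup */
			fp_set = true;
			break;
		case INSN_FP_RESTORE:
			fp_set = false;
			break;
		case INSN_CALL:
			warnings += !fp_set;	/* call without a frame */
			break;
		case INSN_RETURN:
			return warnings + fp_set; /* frame never torn down */
		case INSN_JUMP_COND:
			assert(in->target < n);
			/* Recurse into the taken branch, then fall through. */
			warnings += validate_branch(insns, n, in->target, fp_set);
			break;
		case INSN_JUMP:
			assert(in->target < n);
			idx = in->target;
			continue;
		case INSN_OTHER:
			break;
		}
		idx++;
	}

	return warnings + 1;	/* ran off the end of the function */
}
```

With this model, the sequence FP_SAVE, CALL, FP_RESTORE, RETURN yields
no warnings, while CALL, RETURN yields one.  The visited flag is what
keeps loops from recursing forever, and it doubles as the place where
disagreeing paths are detected.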

Here are some of the benefits of validating stack metadata:

a) More reliable stack traces for frame pointer enabled kernels

   Frame pointers are used for debugging purposes.  They allow runtime
   code and debug tools to walk the stack to determine the chain of
   function call sites that led to the currently executing code.

   For some architectures, frame pointers are enabled by
   CONFIG_FRAME_POINTER.  For some other architectures they may be
   required by the ABI (sometimes referred to as "backchain pointers").

   For C code, gcc automatically generates instructions for setting up
   frame pointers when the -fno-omit-frame-pointer option is used.

   But for asm code, the frame setup instructions have to be written by
   hand, which most people don't do.  So the end result is that
   CONFIG_FRAME_POINTER is honored for C code but not for most asm code.

   For stack traces based on frame pointers to be reliable, all
   functions which call other functions must first create a stack frame
   and update the frame pointer.  If a first function doesn't properly
   create a stack frame before calling a second function, the *caller*
   of the first function will be skipped on the stack trace.

   For example, consider the following backtrace with frame pointers
   enabled:

     [<ffffffff81812584>] dump_stack+0x4b/0x63
     [<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
     [<ffffffff8127f568>] seq_read+0x108/0x3e0
     [<ffffffff812cce62>] proc_reg_read+0x42/0x70
     [<ffffffff81256197>] __vfs_read+0x37/0x100
     [<ffffffff81256b16>] vfs_read+0x86/0x130
     [<ffffffff81257898>] SyS_read+0x58/0xd0
     [<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76

   It correctly shows that the caller of cmdline_proc_show() is
   seq_read().

   If we remove the frame pointer logic from cmdline_proc_show() by
   replacing the frame pointer related instructions with nops, here's
   what it looks like instead:

     [<ffffffff81812584>] dump_stack+0x4b/0x63
     [<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
     [<ffffffff812cce62>] proc_reg_read+0x42/0x70
     [<ffffffff81256197>] __vfs_read+0x37/0x100
     [<ffffffff81256b16>] vfs_read+0x86/0x130
     [<ffffffff81257898>] SyS_read+0x58/0xd0
     [<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76

   Notice that cmdline_proc_show()'s caller, seq_read(), has been
   skipped.  Instead the stack trace seems to show that
   cmdline_proc_show() was called by proc_reg_read().

   The benefit of stacktool here is that because it ensures that *all*
   functions honor CONFIG_FRAME_POINTER, no functions will ever[*] be
   skipped on a stack trace.

   [*] unless an interrupt or exception has occurred at the very
       beginning of a function before the stack frame has been created,
       or at the very end of the function after the stack frame has been
       destroyed.  This is an inherent limitation of frame pointers.

b) 100% reliable stack traces for DWARF enabled kernels

   This is not yet implemented.  For more details about what is planned,
   see tools/stacktool/Documentation/stack-validation.txt.

c) Higher live patching compatibility rate

   This is not yet implemented.  For more details about what is planned,
   see tools/stacktool/Documentation/stack-validation.txt.
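
The skipped-caller effect from (a) can be reproduced in a small
simulation.  This is a sketch only: struct machine, enter(), unwind(),
and trace_call_chain() are hypothetical names, the "stack" is a plain
array, and return addresses are stood in for by small function IDs.  It
models the usual x86_64 layout, where the saved frame pointer sits
directly below the return address.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 64

/* Function IDs standing in for return addresses (illustrative only). */
enum { FN_VFS_READ, FN_PROC_REG_READ, FN_SEQ_READ, FN_CMDLINE_PROC_SHOW };

struct machine {
	long stack[STACK_WORDS];
	size_t sp;	/* index of the last pushed word; grows downward */
	size_t fp;	/* index of the saved-fp slot; STACK_WORDS if none */
};

static void push(struct machine *m, long v)
{
	assert(m->sp > 0);
	m->stack[--m->sp] = v;
}

/*
 * Simulate "caller calls callee".  The call pushes the return address;
 * the callee then optionally does push %rbp; mov %rsp,%rbp.
 */
void enter(struct machine *m, long caller, int callee_creates_frame)
{
	push(m, caller);
	if (callee_creates_frame) {
		push(m, (long)m->fp);	/* push %rbp */
		m->fp = m->sp;		/* mov %rsp,%rbp */
	}
}

/* Walk the frame pointer chain, recording each return address found. */
size_t unwind(const struct machine *m, long *trace, size_t max)
{
	size_t n = 0, fp = m->fp;

	while (fp < STACK_WORDS && n < max) {
		trace[n++] = m->stack[fp + 1];	/* return address */
		fp = (size_t)m->stack[fp];	/* caller's frame */
	}
	return n;
}

/*
 * Build the chain vfs_read -> proc_reg_read -> seq_read ->
 * cmdline_proc_show -> dump_stack and unwind from inside dump_stack.
 * The flag selects whether cmdline_proc_show() creates a stack frame.
 */
size_t trace_call_chain(int show_creates_frame, long *trace, size_t max)
{
	struct machine m = { .sp = STACK_WORDS, .fp = STACK_WORDS };

	enter(&m, FN_VFS_READ, 1);		/* now in proc_reg_read */
	enter(&m, FN_PROC_REG_READ, 1);		/* now in seq_read */
	enter(&m, FN_SEQ_READ, show_creates_frame); /* cmdline_proc_show */
	enter(&m, FN_CMDLINE_PROC_SHOW, 1);	/* now in dump_stack */

	return unwind(&m, trace, max);
}
```

When cmdline_proc_show() creates its frame, the walk finds four return
addresses, including the one into seq_read().  When it doesn't, the
return address into seq_read() is still on the stack, but no frame
points at it, so the walk goes straight from cmdline_proc_show() to
proc_reg_read(), matching the second backtrace above.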

To achieve the validation, stacktool enforces the following rules:

1. Each callable function must be annotated as such with the ELF
   function type.  In asm code, this is typically done using the
   ENTRY/ENDPROC macros.  If stacktool finds a return instruction
   outside of a function, it flags an error since that usually indicates
   callable code which should be annotated accordingly.

   This rule is needed so that stacktool can properly identify each
   callable function in order to analyze its stack metadata.

2. Conversely, each section of code which is *not* callable should *not*
   be annotated as an ELF function.  The ENDPROC macro shouldn't be used
   in this case.

   This rule is needed so that stacktool can ignore non-callable code.
   Such code doesn't have to follow any of the other rules.

3. Each callable function which calls another function must have the
   correct frame pointer logic, if required by CONFIG_FRAME_POINTER or
   the architecture's back chain rules.  This can be done in asm code
   with the FRAME_BEGIN/FRAME_END macros.

   This rule ensures that frame pointer based stack traces will work as
   designed.  If function A doesn't create a stack frame before calling
   function B, the _caller_ of function A will be skipped on the stack
   trace.

4. Dynamic jumps and jumps to undefined symbols are only allowed if:

   a) the jump is part of a switch statement; or

   b) the jump matches sibling call semantics and the frame pointer has
      the same value it had on function entry.

   This rule is needed so that stacktool can reliably analyze all of a
   function's code paths.  If a function jumps to code in another file,
   and it's not a sibling call, stacktool has no way to follow the jump
   because it only analyzes a single file at a time.

5. A callable function may not execute kernel entry/exit instructions.
   The only code which needs such instructions is kernel entry code,
   which shouldn't be in callable functions anyway.

   This rule is just a sanity check to ensure that callable functions
   return normally.
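
Rules 1 and 2 hinge on the ELF symbol type.  As a rough illustration of
the kind of check involved (not the code in elf.c), the sketch below
uses the standard <elf.h> definitions to test whether a given offset
falls inside a symbol of type STT_FUNC.  The helper names
sym_is_function() and offset_in_function() are hypothetical, and the
sketch assumes all symbols passed in belong to the same section, which a
real tool would have to check via st_shndx.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* A symbol is "callable" in the sense of rule 1 if it is an ELF function. */
bool sym_is_function(const Elf64_Sym *sym)
{
	return ELF64_ST_TYPE(sym->st_info) == STT_FUNC;
}

/*
 * Return true if offset lies within [st_value, st_value + st_size) of
 * some function symbol.  Assumes every symbol is in the same section.
 */
bool offset_in_function(const Elf64_Sym *syms, size_t nsyms,
			Elf64_Addr offset)
{
	assert(syms != NULL || nsyms == 0);

	for (size_t i = 0; i < nsyms; i++) {
		const Elf64_Sym *sym = &syms[i];

		if (sym_is_function(sym) &&
		    offset >= sym->st_value &&
		    offset - sym->st_value < sym->st_size)
			return true;
	}
	return false;
}
```

A return instruction whose offset makes offset_in_function() return
false would be the rule 1 case: code that looks callable but has not
been annotated as an ELF function.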

It currently only supports x86_64.  I tried to make the code generic so
that support for other architectures can hopefully be plugged in
relatively easily.

On my Lenovo laptop with an i7-4810MQ 4-core/8-thread CPU, building the
kernel with stacktool checking every .o file adds about three seconds of
total build time.  It hasn't been optimized for performance yet, so
there are probably some opportunities for better build performance.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 MAINTAINERS                                        |   5 +
 tools/Makefile                                     |  14 +-
 tools/stacktool/.gitignore                         |   2 +
 tools/stacktool/Build                              |  13 +
 tools/stacktool/Documentation/stack-validation.txt | 333 +++++++
 tools/stacktool/Makefile                           |  60 ++
 tools/stacktool/arch.h                             |  44 +
 tools/stacktool/arch/x86/Build                     |  12 +
 tools/stacktool/arch/x86/decode.c                  | 172 ++++
 .../stacktool/arch/x86/insn/gen-insn-attr-x86.awk  | 387 ++++++++
 tools/stacktool/arch/x86/insn/inat.c               |  97 ++
 tools/stacktool/arch/x86/insn/inat.h               | 221 +++++
 tools/stacktool/arch/x86/insn/inat_types.h         |  29 +
 tools/stacktool/arch/x86/insn/insn.c               | 594 ++++++++++++
 tools/stacktool/arch/x86/insn/insn.h               | 201 +++++
 tools/stacktool/arch/x86/insn/x86-opcode-map.txt   | 984 ++++++++++++++++++++
 tools/stacktool/builtin-check.c                    | 991 +++++++++++++++++++++
 tools/stacktool/builtin.h                          |  22 +
 tools/stacktool/elf.c                              | 403 +++++++++
 tools/stacktool/elf.h                              |  79 ++
 tools/stacktool/special.c                          | 193 ++++
 tools/stacktool/special.h                          |  42 +
 tools/stacktool/stacktool.c                        | 134 +++
 tools/stacktool/warn.h                             |  60 ++
 24 files changed, 5086 insertions(+), 6 deletions(-)
 create mode 100644 tools/stacktool/.gitignore
 create mode 100644 tools/stacktool/Build
 create mode 100644 tools/stacktool/Documentation/stack-validation.txt
 create mode 100644 tools/stacktool/Makefile
 create mode 100644 tools/stacktool/arch.h
 create mode 100644 tools/stacktool/arch/x86/Build
 create mode 100644 tools/stacktool/arch/x86/decode.c
 create mode 100644 tools/stacktool/arch/x86/insn/gen-insn-attr-x86.awk
 create mode 100644 tools/stacktool/arch/x86/insn/inat.c
 create mode 100644 tools/stacktool/arch/x86/insn/inat.h
 create mode 100644 tools/stacktool/arch/x86/insn/inat_types.h
 create mode 100644 tools/stacktool/arch/x86/insn/insn.c
 create mode 100644 tools/stacktool/arch/x86/insn/insn.h
 create mode 100644 tools/stacktool/arch/x86/insn/x86-opcode-map.txt
 create mode 100644 tools/stacktool/builtin-check.c
 create mode 100644 tools/stacktool/builtin.h
 create mode 100644 tools/stacktool/elf.c
 create mode 100644 tools/stacktool/elf.h
 create mode 100644 tools/stacktool/special.c
 create mode 100644 tools/stacktool/special.h
 create mode 100644 tools/stacktool/stacktool.c
 create mode 100644 tools/stacktool/warn.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ab68d05..7ecbea9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10186,6 +10186,11 @@ L:	stable@vger.kernel.org
 S:	Supported
 F:	Documentation/stable_kernel_rules.txt
 
+STACK METADATA VALIDATION
+M:	Josh Poimboeuf <jpoimboe@redhat.com>
+S:	Supported
+F:	tools/stacktool/
+
 STAGING SUBSYSTEM
 M:	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
diff --git a/tools/Makefile b/tools/Makefile
index 6339f6a..ac4f97a 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -20,6 +20,7 @@ help:
 	@echo '  perf                   - Linux performance measurement and analysis tool'
 	@echo '  selftests              - various kernel selftests'
 	@echo '  spi                    - spi tools'
+	@echo '  stacktool              - a stack metadata validation tool'
 	@echo '  tmon                   - thermal monitoring and tuning tool'
 	@echo '  turbostat              - Intel CPU idle stats and freq reporting tool'
 	@echo '  usb                    - USB testing tools'
@@ -53,7 +54,7 @@ acpi: FORCE
 cpupower: FORCE
 	$(call descend,power/$@)
 
-cgroup firewire hv guest spi usb virtio vm net iio: FORCE
+cgroup firewire hv guest spi usb virtio vm net iio stacktool: FORCE
 	$(call descend,$@)
 
 liblockdep: FORCE
@@ -85,7 +86,7 @@ freefall: FORCE
 all: acpi cgroup cpupower hv firewire lguest \
 		perf selftests turbostat usb \
 		virtio vm net x86_energy_perf_policy \
-		tmon freefall
+		tmon freefall stacktool
 
 acpi_install:
 	$(call descend,power/$(@:_install=),install)
@@ -93,7 +94,7 @@ acpi_install:
 cpupower_install:
 	$(call descend,power/$(@:_install=),install)
 
-cgroup_install firewire_install hv_install lguest_install perf_install usb_install virtio_install vm_install net_install:
+cgroup_install firewire_install hv_install lguest_install perf_install usb_install virtio_install vm_install net_install stacktool_install:
 	$(call descend,$(@:_install=),install)
 
 selftests_install:
@@ -111,7 +112,7 @@ freefall_install:
 install: acpi_install cgroup_install cpupower_install hv_install firewire_install lguest_install \
 		perf_install selftests_install turbostat_install usb_install \
 		virtio_install vm_install net_install x86_energy_perf_policy_install \
-		tmon_install freefall_install
+		tmon_install freefall_install stacktool_install
 
 acpi_clean:
 	$(call descend,power/acpi,clean)
@@ -119,7 +120,7 @@ acpi_clean:
 cpupower_clean:
 	$(call descend,power/cpupower,clean)
 
-cgroup_clean hv_clean firewire_clean lguest_clean spi_clean usb_clean virtio_clean vm_clean net_clean iio_clean:
+cgroup_clean hv_clean firewire_clean lguest_clean spi_clean usb_clean virtio_clean vm_clean net_clean iio_clean stacktool_clean:
 	$(call descend,$(@:_clean=),clean)
 
 liblockdep_clean:
@@ -155,6 +156,7 @@ build_clean:
 clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean lguest_clean \
 		perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
 		vm_clean net_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
-		freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean
+		freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
+		stacktool_clean
 
 .PHONY: FORCE
diff --git a/tools/stacktool/.gitignore b/tools/stacktool/.gitignore
new file mode 100644
index 0000000..6023f61
--- /dev/null
+++ b/tools/stacktool/.gitignore
@@ -0,0 +1,2 @@
+arch/x86/insn/inat-tables.c
+stacktool
diff --git a/tools/stacktool/Build b/tools/stacktool/Build
new file mode 100644
index 0000000..9f697e6
--- /dev/null
+++ b/tools/stacktool/Build
@@ -0,0 +1,13 @@
+stacktool-y += arch/$(ARCH)/
+stacktool-y += builtin-check.o
+stacktool-y += elf.o
+stacktool-y += special.o
+stacktool-y += stacktool.o
+
+stacktool-y += libstring.o
+
+CFLAGS += -I$(srctree)/tools/lib
+
+$(OUTPUT)libstring.o: ../lib/string.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,cc_o_c)
diff --git a/tools/stacktool/Documentation/stack-validation.txt b/tools/stacktool/Documentation/stack-validation.txt
new file mode 100644
index 0000000..cddd600
--- /dev/null
+++ b/tools/stacktool/Documentation/stack-validation.txt
@@ -0,0 +1,333 @@
+Compile-time stack metadata validation
+======================================
+
+
+Overview
+--------
+
+The kernel CONFIG_STACK_VALIDATION option enables a host tool named
+stacktool which runs at compile time.  It analyzes every .o file and
+ensures the validity of its stack metadata.  It enforces a set of rules
+on asm code and C inline assembly code so that stack traces can be
+reliable.
+
+Currently it only checks frame pointer usage, but there are plans to add
+CFI validation for C files and CFI generation for asm files.
+
+For each function, it recursively follows all possible code paths and
+validates the correct frame pointer state at each instruction.
+
+It also follows code paths involving special sections, like
+.altinstructions, __jump_table, and __ex_table, which can add
+alternative execution paths to a given instruction (or set of
+instructions).  Similarly, it knows how to follow switch statements, for
+which gcc sometimes uses jump tables.
+
+
+Why do we need stack metadata validation?
+-----------------------------------------
+
+Here are some of the benefits of validating stack metadata:
+
+a) More reliable stack traces for frame pointer enabled kernels
+
+   Frame pointers are used for debugging purposes.  They allow runtime
+   code and debug tools to walk the stack to determine the
+   chain of function call sites that led to the currently executing
+   code.
+
+   For some architectures, frame pointers are enabled by
+   CONFIG_FRAME_POINTER.  For some other architectures they may be
+   required by the ABI (sometimes referred to as "backchain pointers").
+
+   For C code, gcc automatically generates instructions for setting up
+   frame pointers when the -fno-omit-frame-pointer option is used.
+
+   But for asm code, the frame setup instructions have to be written by
+   hand, which most people don't do.  So the end result is that
+   CONFIG_FRAME_POINTER is honored for C code but not for most asm code.
+
+   For stack traces based on frame pointers to be reliable, all
+   functions which call other functions must first create a stack frame
+   and update the frame pointer.  If a first function doesn't properly
+   create a stack frame before calling a second function, the *caller*
+   of the first function will be skipped on the stack trace.
+
+   For example, consider the following backtrace with frame
+   pointers enabled:
+
+     [<ffffffff81812584>] dump_stack+0x4b/0x63
+     [<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
+     [<ffffffff8127f568>] seq_read+0x108/0x3e0
+     [<ffffffff812cce62>] proc_reg_read+0x42/0x70
+     [<ffffffff81256197>] __vfs_read+0x37/0x100
+     [<ffffffff81256b16>] vfs_read+0x86/0x130
+     [<ffffffff81257898>] SyS_read+0x58/0xd0
+     [<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76
+
+   It correctly shows that the caller of cmdline_proc_show() is
+   seq_read().
+
+   If we remove the frame pointer logic from cmdline_proc_show() by
+   replacing the frame pointer related instructions with nops, here's
+   what it looks like instead:
+
+     [<ffffffff81812584>] dump_stack+0x4b/0x63
+     [<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
+     [<ffffffff812cce62>] proc_reg_read+0x42/0x70
+     [<ffffffff81256197>] __vfs_read+0x37/0x100
+     [<ffffffff81256b16>] vfs_read+0x86/0x130
+     [<ffffffff81257898>] SyS_read+0x58/0xd0
+     [<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76
+
+   Notice that cmdline_proc_show()'s caller, seq_read(), has been
+   skipped.  Instead the stack trace seems to show that
+   cmdline_proc_show() was called by proc_reg_read().
+
+   The benefit of stacktool here is that because it ensures that *all*
+   functions honor CONFIG_FRAME_POINTER, no functions will ever[*] be
+   skipped on a stack trace.
+
+   [*] unless an interrupt or exception has occurred at the very
+       beginning of a function before the stack frame has been created,
+       or at the very end of the function after the stack frame has been
+       destroyed.  This is an inherent limitation of frame pointers.
+
+b) 100% reliable stack traces for DWARF enabled kernels
+
+   (NOTE: This is not yet implemented)
+
+   As an alternative to frame pointers, DWARF Call Frame Information
+   (CFI) metadata can be used to walk the stack.  Unlike frame pointers,
+   CFI metadata is out of band.  So it doesn't affect runtime
+   performance and it can be reliable even when interrupts or exceptions
+   are involved.
+
+   For C code, gcc automatically generates DWARF CFI metadata.  But for
+   asm code, generating CFI is a tedious process which requires
+   manually placed .cfi assembler macros to be scattered throughout the
+   code.  It's clumsy and very easy to get wrong, and it makes the real
+   code harder to read.
+
+   Stacktool will improve this situation in several ways.  For code
+   which already has CFI annotations, it will validate them.  For code
+   which doesn't have CFI annotations, it will generate them.  So an
+   architecture can opt to strip out all the manual .cfi annotations
+   from their asm code and have stacktool generate them instead.
+
+   We might also add a runtime stack validation debug option where we
+   periodically walk the stack from schedule() and/or an NMI to ensure
+   that the stack metadata is sane and that we reach the bottom of the
+   stack.
+
+   So the benefit of stacktool here will be that external tooling should
+   always show perfect stack traces.  And the same will be true for
+   kernel warning/oops traces if the architecture has a runtime DWARF
+   unwinder.
+
+c) Higher live patching compatibility rate
+
+   (NOTE: This is not yet implemented)
+
+   Currently with CONFIG_LIVEPATCH there's a basic live patching
+   framework which is safe for roughly 85-90% of "security" fixes.  But
+   patches can't have complex features like function dependency or
+   prototype changes, or data structure changes.
+
+   There's a strong need to support patches which have the more complex
+   features so that the patch compatibility rate for security fixes can
+   eventually approach something resembling 100%.  To achieve that, a
+   "consistency model" is needed, which allows tasks to be safely
+   transitioned from an unpatched state to a patched state.
+
+   One of the key requirements of the currently proposed livepatch
+   consistency model [*] is that it needs to walk the stack of each
+   sleeping task to determine if it can be transitioned to the patched
+   state.  If stacktool can ensure that stack traces are reliable, this
+   consistency model can be used and the live patching compatibility
+   rate can be improved significantly.
+
+   [*] https://lkml.kernel.org/r/cover.1423499826.git.jpoimboe@redhat.com
+
+
+Rules
+-----
+
+To achieve the validation, stacktool enforces the following rules:
+
+1. Each callable function must be annotated as such with the ELF
+   function type.  In asm code, this is typically done using the
+   ENTRY/ENDPROC macros.  If stacktool finds a return instruction
+   outside of a function, it flags an error since that usually indicates
+   callable code which should be annotated accordingly.
+
+   This rule is needed so that stacktool can properly identify each
+   callable function in order to analyze its stack metadata.
+
+2. Conversely, each section of code which is *not* callable should *not*
+   be annotated as an ELF function.  The ENDPROC macro shouldn't be used
+   in this case.
+
+   This rule is needed so that stacktool can ignore non-callable code.
+   Such code doesn't have to follow any of the other rules.
+
+3. Each callable function which calls another function must have the
+   correct frame pointer logic, if required by CONFIG_FRAME_POINTER or
+   the architecture's back chain rules.  This can be done in asm code
+   with the FRAME_BEGIN/FRAME_END macros.
+
+   This rule ensures that frame pointer based stack traces will work as
+   designed.  If function A doesn't create a stack frame before calling
+   function B, the _caller_ of function A will be skipped on the stack
+   trace.
+
+4. Dynamic jumps and jumps to undefined symbols are only allowed if:
+
+   a) the jump is part of a switch statement; or
+
+   b) the jump matches sibling call semantics and the frame pointer has
+      the same value it had on function entry.
+
+   This rule is needed so that stacktool can reliably analyze all of a
+   function's code paths.  If a function jumps to code in another file,
+   and it's not a sibling call, stacktool has no way to follow the jump
+   because it only analyzes a single file at a time.
+
+5. A callable function may not execute kernel entry/exit instructions.
+   The only code which needs such instructions is kernel entry code,
+   which shouldn't be in callable functions anyway.
+
+   This rule is just a sanity check to ensure that callable functions
+   return normally.
+
+
+Errors in .S files
+------------------
+
+If you're getting an error in a compiled .S file which you don't
+understand, first make sure that the affected code follows the above
+rules.
+
+Here are some examples of common warnings reported by stacktool, what
+they mean, and suggestions for how to fix them.
+
+
+1. stacktool: asm_file.o: func()+0x128: call without frame pointer save/setup
+
+   The func() function made a function call without first saving and/or
+   updating the frame pointer.
+
+   If func() is indeed a callable function, add proper frame pointer
+   logic using the FRAME_BEGIN and FRAME_END macros.  Otherwise, remove
+   its ELF function annotation by changing ENDPROC to END.
+
+   If you're getting this error in a .c file, see the "Errors in .c
+   files" section.
+
+
+2. stacktool: asm_file.o: .text+0x53: return instruction outside of a callable function
+
+   A return instruction was detected, but stacktool couldn't find a way
+   for a callable function to reach the instruction.
+
+   If the return instruction is inside (or reachable from) a callable
+   function, the function needs to be annotated with the ENTRY/ENDPROC
+   macros.
+
+   If you _really_ need a return instruction outside of a function, and
+   are 100% sure that it won't affect stack traces, you can tell
+   stacktool to ignore it.  See the "Adding exceptions" section below.
+
+
+3. stacktool: asm_file.o: func()+0x9: function has unreachable instruction
+
+   The instruction lives inside of a callable function, but there's no
+   possible control flow path from the beginning of the function to the
+   instruction.
+
+   If the instruction is actually needed, and it's actually in a
+   callable function, ensure that its function is properly annotated
+   with ENTRY/ENDPROC.
+
+   If it's not actually in a callable function (e.g. kernel entry code),
+   change ENDPROC to END.
+
+
+4. stacktool: asm_file.o: func(): can't find starting instruction
+   or
+   stacktool: asm_file.o: func()+0x11dd: can't decode instruction
+
+   Did you put data in a text section?  If so, that can confuse
+   stacktool's instruction decoder.  Move the data to a more appropriate
+   section like .data or .rodata.
+
+
+5. stacktool: asm_file.o: func()+0x6: kernel entry/exit from callable instruction
+
+   This is a kernel entry/exit instruction like sysenter or sysret.
+   Such instructions aren't allowed in a callable function, and are most
+   likely part of the kernel entry code.
+
+   If the instruction isn't actually in a callable function, change
+   ENDPROC to END.
+
+
+6. stacktool: asm_file.o: func()+0x26: sibling call from callable instruction with changed frame pointer
+
+   This is a dynamic jump or a jump to an undefined symbol.  Stacktool
+   assumed it's a sibling call and detected that the frame pointer
+   wasn't first restored to its original state.
+
+   If it's not really a sibling call, you may need to move the
+   destination code to the local file.
+
+   If the instruction is not actually in a callable function (e.g.
+   kernel entry code), change ENDPROC to END.
+
+
+7. stacktool: asm_file.o: func()+0x5c: frame pointer state mismatch
+
+   The instruction's frame pointer state is inconsistent, depending on
+   which execution path was taken to reach the instruction.
+
+   Make sure the function pushes and sets up the frame pointer (for
+   x86_64, this means rbp) at the beginning of the function and pops it
+   at the end of the function.  Also make sure that no other code in the
+   function touches the frame pointer.
+
+
+Errors in .c files
+------------------
+
+If you're getting a stacktool error in a compiled .c file, chances are
+the file uses an asm() statement which has a "call" instruction.  An
+asm() statement with a call instruction must declare the use of the
+stack pointer in its output operand.  For example, on x86_64:
+
+   register void *__sp asm("rsp");
+   asm volatile("call func" : "+r" (__sp));
+
+Otherwise the stack frame may not get created before the call.
+
+Another possible cause for errors in C code is if the Makefile removes
+-fno-omit-frame-pointer or adds -fomit-frame-pointer to the gcc options.
+
+Also see the above section on .S file errors for more information about
+what the individual error messages mean.
+
+
+
+Adding exceptions
+-----------------
+
+If you _really_ need stacktool to ignore something, and are 100% sure
+that it won't affect kernel stack traces, you can tell stacktool to
+ignore it:
+
+- To skip validation of a function, use the STACKTOOL_IGNORE_FUNC macro.
+
+- To skip validation of a file, add "STACKTOOL_filename.o := n" to the
+  Makefile.
+
+- To skip validation of a directory, add "STACKTOOL := n" to the
+  Makefile.
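
The frame pointer explanation in stack-validation.txt (section a and rule
3) can be illustrated with a small toy model.  This is only a sketch and
is not part of the patch: struct frame, walk_frames() and
reported_caller() are hypothetical names, and the function names are
borrowed from the example backtraces in the document.  Each record stands
for what a function's frame setup leaves on the stack: the caller's saved
frame pointer and the return address next to it.

```c
#include <stddef.h>

/* Hypothetical model of one frame record on the stack. */
struct frame {
	const struct frame *saved_fp;	/* frame pointer value at function entry */
	const char *ret_to;		/* function the return address points into */
};

/*
 * Follow the saved frame pointers and record where each frame returns to.
 * Returns the number of entries written to out[].
 */
size_t walk_frames(const struct frame *fp, const char **out, size_t max)
{
	size_t n = 0;

	while (fp && n < max) {
		out[n++] = fp->ret_to;
		fp = fp->saved_fp;
	}
	return n;
}

/*
 * Build the call chain from the example backtrace and return the function
 * that a frame pointer walk reports as the caller of cmdline_proc_show().
 * If cmdline_proc_show() skips its frame setup, dump_stack() saves the
 * frame pointer that still belongs to seq_read().
 */
const char *reported_caller(int cmdline_has_frame)
{
	struct frame proc_reg_read = { NULL, "__vfs_read" };
	struct frame seq_read = { &proc_reg_read, "proc_reg_read" };
	struct frame cmdline = { &seq_read, "seq_read" };
	struct frame dump_stack = {
		cmdline_has_frame ? &cmdline : &seq_read,
		"cmdline_proc_show",
	};
	const char *trace[8];
	size_t n = walk_frames(&dump_stack, trace, 8);

	return n > 1 ? trace[1] : NULL;
}
```

In this model reported_caller(1) yields seq_read, while
reported_caller(0) yields proc_reg_read, which mirrors the two
backtraces shown in the document.
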
diff --git a/tools/stacktool/Makefile b/tools/stacktool/Makefile
new file mode 100644
index 0000000..613bf9d
--- /dev/null
+++ b/tools/stacktool/Makefile
@@ -0,0 +1,60 @@
+include ../scripts/Makefile.include
+
+ifndef ($(ARCH))
+ARCH ?= $(shell uname -m)
+ifeq ($(ARCH),x86_64)
+ARCH := x86
+endif
+endif
+
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+endif
+
+SUBCMD_SRCDIR	= $(srctree)/tools/lib/subcmd/
+LIBSUBCMD	= $(if $(OUTPUT),$(OUTPUT),$(SUBCMD_SRCDIR))libsubcmd.a
+
+STACKTOOL    := $(OUTPUT)stacktool
+STACKTOOL_IN := $(STACKTOOL)-in.o
+
+all: $(STACKTOOL)
+
+INCLUDES := -I$(srctree)/tools/include
+CFLAGS   += -Wall -Werror $(EXTRA_WARNINGS) -fomit-frame-pointer -O2 $(INCLUDES)
+LDFLAGS  += -lelf $(LIBSUBCMD)
+
+AWK = awk
+export srctree OUTPUT CFLAGS ARCH AWK
+include $(srctree)/tools/build/Makefile.include
+
+$(STACKTOOL_IN): fixdep FORCE
+	@$(MAKE) $(build)=stacktool
+
+$(STACKTOOL): $(LIBSUBCMD) $(STACKTOOL_IN)
+	@(test -d ../../kernel -a -d ../../tools -a -d ../stacktool && (( \
+	diff -I'^#include' arch/x86/insn/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
+	diff -I'^#include' arch/x86/insn/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
+	diff arch/x86/insn/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
+	diff arch/x86/insn/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
+	diff -I'^#include' arch/x86/insn/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
+	diff -I'^#include' arch/x86/insn/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
+	diff -I'^#include' arch/x86/insn/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
+	|| echo "Warning: stacktool: x86 instruction decoder differs from kernel" >&2 )) || true
+	$(QUIET_LINK)$(CC) $(STACKTOOL_IN) $(LDFLAGS) -o $@
+
+
+$(LIBSUBCMD): fixdep FORCE
+	$(Q)$(MAKE) -C $(SUBCMD_SRCDIR)
+
+$(LIBSUBCMD)-clean:
+	$(Q)$(MAKE) -C $(SUBCMD_SRCDIR) clean > /dev/null
+
+clean: $(LIBSUBCMD)-clean
+	$(call QUIET_CLEAN, stacktool) $(RM) $(STACKTOOL)
+	$(Q)find $(OUTPUT) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
+	$(Q)$(RM) $(OUTPUT)arch/x86/insn/inat-tables.c $(OUTPUT)fixdep
+
+FORCE:
+
+.PHONY: clean FORCE
diff --git a/tools/stacktool/arch.h b/tools/stacktool/arch.h
new file mode 100644
index 0000000..f7350fc
--- /dev/null
+++ b/tools/stacktool/arch.h
@@ -0,0 +1,44 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _ARCH_H
+#define _ARCH_H
+
+#include <stdbool.h>
+#include "elf.h"
+
+#define INSN_FP_SAVE		1
+#define INSN_FP_SETUP		2
+#define INSN_FP_RESTORE		3
+#define INSN_JUMP_CONDITIONAL	4
+#define INSN_JUMP_UNCONDITIONAL	5
+#define INSN_JUMP_DYNAMIC	6
+#define INSN_CALL		7
+#define INSN_CALL_DYNAMIC	8
+#define INSN_RETURN		9
+#define INSN_CONTEXT_SWITCH	10
+#define INSN_BUG		11
+#define INSN_NOP		12
+#define INSN_OTHER		13
+#define INSN_LAST		INSN_OTHER
+
+int arch_decode_instruction(struct elf *elf, struct section *sec,
+			    unsigned long offset, unsigned int maxlen,
+			    unsigned int *len, unsigned char *type,
+			    unsigned long *displacement);
+
+#endif /* _ARCH_H */
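
When debugging a decoder, it may help to print the INSN_* type that
arch_decode_instruction() stored through its type argument.  The
following sketch is not part of the patch: insn_type_name() and the
strings it returns are hypothetical, and only the numeric values are
copied from arch.h.

```c
#include <stddef.h>

/* Values copied from arch.h. */
#define INSN_FP_SAVE		1
#define INSN_FP_SETUP		2
#define INSN_FP_RESTORE		3
#define INSN_JUMP_CONDITIONAL	4
#define INSN_JUMP_UNCONDITIONAL	5
#define INSN_JUMP_DYNAMIC	6
#define INSN_CALL		7
#define INSN_CALL_DYNAMIC	8
#define INSN_RETURN		9
#define INSN_CONTEXT_SWITCH	10
#define INSN_BUG		11
#define INSN_NOP		12
#define INSN_OTHER		13
#define INSN_LAST		INSN_OTHER

/* Hypothetical lookup table; index 0 is unused and stays NULL. */
static const char * const insn_type_names[INSN_LAST + 1] = {
	[INSN_FP_SAVE]			= "fp-save",
	[INSN_FP_SETUP]			= "fp-setup",
	[INSN_FP_RESTORE]		= "fp-restore",
	[INSN_JUMP_CONDITIONAL]		= "jump-conditional",
	[INSN_JUMP_UNCONDITIONAL]	= "jump-unconditional",
	[INSN_JUMP_DYNAMIC]		= "jump-dynamic",
	[INSN_CALL]			= "call",
	[INSN_CALL_DYNAMIC]		= "call-dynamic",
	[INSN_RETURN]			= "return",
	[INSN_CONTEXT_SWITCH]		= "context-switch",
	[INSN_BUG]			= "bug",
	[INSN_NOP]			= "nop",
	[INSN_OTHER]			= "other",
};

/* Map a decoded type to a printable name; out-of-range values are "invalid". */
const char *insn_type_name(unsigned char type)
{
	if (type > INSN_LAST || !insn_type_names[type])
		return "invalid";
	return insn_type_names[type];
}
```

A caller could pass the decoded type straight to insn_type_name() in a
debug message; values outside 1..INSN_LAST map to "invalid".
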
diff --git a/tools/stacktool/arch/x86/Build b/tools/stacktool/arch/x86/Build
new file mode 100644
index 0000000..9b9777d
--- /dev/null
+++ b/tools/stacktool/arch/x86/Build
@@ -0,0 +1,12 @@
+stacktool-y += decode.o
+
+inat_tables_script = arch/x86/insn/gen-insn-attr-x86.awk
+inat_tables_maps = arch/x86/insn/x86-opcode-map.txt
+
+$(OUTPUT)arch/x86/insn/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
+	$(call rule_mkdir)
+	$(Q)$(call echo-cmd,gen)$(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@
+
+$(OUTPUT)arch/x86/decode.o: $(OUTPUT)arch/x86/insn/inat-tables.c
+
+CFLAGS_decode.o += -I$(OUTPUT)arch/x86/insn
diff --git a/tools/stacktool/arch/x86/decode.c b/tools/stacktool/arch/x86/decode.c
new file mode 100644
index 0000000..c0c0b26
--- /dev/null
+++ b/tools/stacktool/arch/x86/decode.c
@@ -0,0 +1,172 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#define unlikely(cond) (cond)
+#include "insn/insn.h"
+#include "insn/inat.c"
+#include "insn/insn.c"
+
+#include "../../elf.h"
+#include "../../arch.h"
+#include "../../warn.h"
+
+static int is_x86_64(struct elf *elf)
+{
+	switch (elf->ehdr.e_machine) {
+	case EM_X86_64:
+		return 1;
+	case EM_386:
+		return 0;
+	default:
+		WARN("unexpected ELF machine type %d", elf->ehdr.e_machine);
+		return -1;
+	}
+}
+
+int arch_decode_instruction(struct elf *elf, struct section *sec,
+			    unsigned long offset, unsigned int maxlen,
+			    unsigned int *len, unsigned char *type,
+			    unsigned long *immediate)
+{
+	struct insn insn;
+	int x86_64;
+	unsigned char op1, op2, ext;
+
+	x86_64 = is_x86_64(elf);
+	if (x86_64 == -1)
+		return -1;
+
+	insn_init(&insn, (void *)(sec->data + offset), maxlen, x86_64);
+	insn_get_length(&insn);
+	insn_get_opcode(&insn);
+	insn_get_modrm(&insn);
+	insn_get_immediate(&insn);
+
+	if (!insn_complete(&insn)) {
+		WARN_FUNC("can't decode instruction", sec, offset);
+		return -1;
+	}
+
+	*len = insn.length;
+	*type = INSN_OTHER;
+
+	if (insn.vex_prefix.nbytes)
+		return 0;
+
+	op1 = insn.opcode.bytes[0];
+	op2 = insn.opcode.bytes[1];
+
+	switch (op1) {
+	case 0x55:
+		if (!insn.rex_prefix.nbytes)
+			/* push rbp */
+			*type = INSN_FP_SAVE;
+		break;
+
+	case 0x5d:
+		if (!insn.rex_prefix.nbytes)
+			/* pop rbp */
+			*type = INSN_FP_RESTORE;
+		break;
+
+	case 0x70 ... 0x7f:
+		*type = INSN_JUMP_CONDITIONAL;
+		break;
+
+	case 0x89:
+		if (insn.rex_prefix.nbytes == 1 &&
+		    insn.rex_prefix.bytes[0] == 0x48 &&
+		    insn.modrm.nbytes && insn.modrm.bytes[0] == 0xe5)
+			/* mov rsp, rbp */
+			*type = INSN_FP_SETUP;
+		break;
+
+	case 0x90:
+		*type = INSN_NOP;
+		break;
+
+	case 0x0f:
+		if (op2 >= 0x80 && op2 <= 0x8f)
+			*type = INSN_JUMP_CONDITIONAL;
+		else if (op2 == 0x05 || op2 == 0x07 || op2 == 0x34 ||
+			 op2 == 0x35)
+			/* syscall, sysret, sysenter, sysexit */
+			*type = INSN_CONTEXT_SWITCH;
+		else if (op2 == 0x0b || op2 == 0xb9)
+			/* ud2, ud1 */
+			*type = INSN_BUG;
+		else if (op2 == 0x0d || op2 == 0x1f)
+			/* nopl/nopw */
+			*type = INSN_NOP;
+		else if (op2 == 0x01 && insn.modrm.nbytes &&
+			 (insn.modrm.bytes[0] == 0xc2 ||
+			  insn.modrm.bytes[0] == 0xd8))
+			/* vmlaunch, vmrun */
+			*type = INSN_CONTEXT_SWITCH;
+
+		break;
+
+	case 0xc9: /* leave */
+		*type = INSN_FP_RESTORE;
+		break;
+
+	case 0xe3: /* jecxz/jrcxz */
+		*type = INSN_JUMP_CONDITIONAL;
+		break;
+
+	case 0xe9:
+	case 0xeb:
+		*type = INSN_JUMP_UNCONDITIONAL;
+		break;
+
+	case 0xc2:
+	case 0xc3:
+		*type = INSN_RETURN;
+		break;
+
+	case 0xcf: /* iret */
+	case 0xca: /* retf */
+	case 0xcb: /* retf */
+		*type = INSN_CONTEXT_SWITCH;
+		break;
+
+	case 0xe8:
+		*type = INSN_CALL;
+		break;
+
+	case 0xff:
+		ext = X86_MODRM_REG(insn.modrm.bytes[0]);
+		if (ext == 2 || ext == 3)
+			*type = INSN_CALL_DYNAMIC;
+		else if (ext == 4)
+			*type = INSN_JUMP_DYNAMIC;
+		else if (ext == 5) /* jmpf */
+			*type = INSN_CONTEXT_SWITCH;
+
+		break;
+
+	default:
+		break;
+	}
+
+	*immediate = insn.immediate.nbytes ? insn.immediate.value : 0;
+
+	return 0;
+}
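
The 0xff case in arch_decode_instruction() depends on the reg field of
the ModRM byte (bits 5:3), because opcode 0xff is a group whose
operation is selected by that field.  The standalone sketch below shows
the same classification outside the decoder.  MODRM_REG() repeats the
bit extraction done by X86_MODRM_REG(); enum ff_kind and classify_ff()
are hypothetical names used only for illustration.

```c
/* Same bit extraction as X86_MODRM_REG(): bits 5:3 of the ModRM byte. */
#define MODRM_REG(modrm)	(((modrm) & 0x38) >> 3)

/* Hypothetical result type for this sketch. */
enum ff_kind {
	FF_OTHER,	/* inc, dec, push */
	FF_CALL,	/* indirect call, near or far */
	FF_JUMP,	/* indirect near jump */
	FF_FAR_JUMP,	/* indirect far jump */
};

/* Classify an 0xff-group instruction by the reg field of its ModRM byte. */
enum ff_kind classify_ff(unsigned char modrm)
{
	switch (MODRM_REG(modrm)) {
	case 2:
	case 3:
		return FF_CALL;
	case 4:
		return FF_JUMP;
	case 5:
		return FF_FAR_JUMP;
	default:
		return FF_OTHER;
	}
}
```

For example, "ff d0" (call *%rax) has ModRM 0xd0, whose reg field is 2,
so classify_ff(0xd0) returns FF_CALL.  "ff e0" (jmp *%rax) has ModRM
0xe0, whose reg field is 4, so classify_ff(0xe0) returns FF_JUMP.
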
diff --git a/tools/stacktool/arch/x86/insn/gen-insn-attr-x86.awk b/tools/stacktool/arch/x86/insn/gen-insn-attr-x86.awk
new file mode 100644
index 0000000..093a892
--- /dev/null
+++ b/tools/stacktool/arch/x86/insn/gen-insn-attr-x86.awk
@@ -0,0 +1,387 @@
+#!/bin/awk -f
+# gen-insn-attr-x86.awk: Instruction attribute table generator
+# Written by Masami Hiramatsu <mhiramat@redhat.com>
+#
+# Usage: awk -f gen-insn-attr-x86.awk x86-opcode-map.txt > inat-tables.c
+
+# Awk implementation sanity check
+function check_awk_implement() {
+	if (sprintf("%x", 0) != "0")
+		return "Your awk has a printf-format problem."
+	return ""
+}
+
+# Clear working vars
+function clear_vars() {
+	delete table
+	delete lptable2
+	delete lptable1
+	delete lptable3
+	eid = -1 # escape id
+	gid = -1 # group id
+	aid = -1 # AVX id
+	tname = ""
+}
+
+BEGIN {
+	# Implementation error checking
+	awkchecked = check_awk_implement()
+	if (awkchecked != "") {
+		print "Error: " awkchecked > "/dev/stderr"
+		print "Please try to use gawk." > "/dev/stderr"
+		exit 1
+	}
+
+	# Setup generating tables
+	print "/* x86 opcode map generated from x86-opcode-map.txt */"
+	print "/* Do not change this code. */\n"
+	ggid = 1
+	geid = 1
+	gaid = 0
+	delete etable
+	delete gtable
+	delete atable
+
+	opnd_expr = "^[A-Za-z/]"
+	ext_expr = "^\\("
+	sep_expr = "^\\|$"
+	group_expr = "^Grp[0-9A-Za-z]+"
+
+	imm_expr = "^[IJAOL][a-z]"
+	imm_flag["Ib"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)"
+	imm_flag["Jb"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)"
+	imm_flag["Iw"] = "INAT_MAKE_IMM(INAT_IMM_WORD)"
+	imm_flag["Id"] = "INAT_MAKE_IMM(INAT_IMM_DWORD)"
+	imm_flag["Iq"] = "INAT_MAKE_IMM(INAT_IMM_QWORD)"
+	imm_flag["Ap"] = "INAT_MAKE_IMM(INAT_IMM_PTR)"
+	imm_flag["Iz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)"
+	imm_flag["Jz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)"
+	imm_flag["Iv"] = "INAT_MAKE_IMM(INAT_IMM_VWORD)"
+	imm_flag["Ob"] = "INAT_MOFFSET"
+	imm_flag["Ov"] = "INAT_MOFFSET"
+	imm_flag["Lx"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)"
+
+	modrm_expr = "^([CDEGMNPQRSUVW/][a-z]+|NTA|T[012])"
+	force64_expr = "\\([df]64\\)"
+	rex_expr = "^REX(\\.[XRWB]+)*"
+	fpu_expr = "^ESC" # TODO
+
+	lprefix1_expr = "\\((66|!F3)\\)"
+	lprefix2_expr = "\\(F3\\)"
+	lprefix3_expr = "\\((F2|!F3|66\\&F2)\\)"
+	lprefix_expr = "\\((66|F2|F3)\\)"
+	max_lprefix = 4
+
+	# All opcodes starting with lower-case 'v' or with (v1) superscript
+	# accepts VEX prefix
+	vexok_opcode_expr = "^v.*"
+	vexok_expr = "\\(v1\\)"
+	# All opcodes with (v) superscript supports *only* VEX prefix
+	vexonly_expr = "\\(v\\)"
+
+	prefix_expr = "\\(Prefix\\)"
+	prefix_num["Operand-Size"] = "INAT_PFX_OPNDSZ"
+	prefix_num["REPNE"] = "INAT_PFX_REPNE"
+	prefix_num["REP/REPE"] = "INAT_PFX_REPE"
+	prefix_num["XACQUIRE"] = "INAT_PFX_REPNE"
+	prefix_num["XRELEASE"] = "INAT_PFX_REPE"
+	prefix_num["LOCK"] = "INAT_PFX_LOCK"
+	prefix_num["SEG=CS"] = "INAT_PFX_CS"
+	prefix_num["SEG=DS"] = "INAT_PFX_DS"
+	prefix_num["SEG=ES"] = "INAT_PFX_ES"
+	prefix_num["SEG=FS"] = "INAT_PFX_FS"
+	prefix_num["SEG=GS"] = "INAT_PFX_GS"
+	prefix_num["SEG=SS"] = "INAT_PFX_SS"
+	prefix_num["Address-Size"] = "INAT_PFX_ADDRSZ"
+	prefix_num["VEX+1byte"] = "INAT_PFX_VEX2"
+	prefix_num["VEX+2byte"] = "INAT_PFX_VEX3"
+
+	clear_vars()
+}
+
+function semantic_error(msg) {
+	print "Semantic error at " NR ": " msg > "/dev/stderr"
+	exit 1
+}
+
+function debug(msg) {
+	print "DEBUG: " msg
+}
+
+function array_size(arr,   i,c) {
+	c = 0
+	for (i in arr)
+		c++
+	return c
+}
+
+/^Table:/ {
+	print "/* " $0 " */"
+	if (tname != "")
+		semantic_error("Hit Table: before EndTable:.");
+}
+
+/^Referrer:/ {
+	if (NF != 1) {
+		# escape opcode table
+		ref = ""
+		for (i = 2; i <= NF; i++)
+			ref = ref $i
+		eid = escape[ref]
+		tname = sprintf("inat_escape_table_%d", eid)
+	}
+}
+
+/^AVXcode:/ {
+	if (NF != 1) {
+		# AVX/escape opcode table
+		aid = $2
+		if (gaid <= aid)
+			gaid = aid + 1
+		if (tname == "")	# AVX only opcode table
+			tname = sprintf("inat_avx_table_%d", $2)
+	}
+	if (aid == -1 && eid == -1)	# primary opcode table
+		tname = "inat_primary_table"
+}
+
+/^GrpTable:/ {
+	print "/* " $0 " */"
+	if (!($2 in group))
+		semantic_error("No group: " $2 )
+	gid = group[$2]
+	tname = "inat_group_table_" gid
+}
+
+function print_table(tbl,name,fmt,n)
+{
+	print "const insn_attr_t " name " = {"
+	for (i = 0; i < n; i++) {
+		id = sprintf(fmt, i)
+		if (tbl[id])
+			print "	[" id "] = " tbl[id] ","
+	}
+	print "};"
+}
+
+/^EndTable/ {
+	if (gid != -1) {
+		# print group tables
+		if (array_size(table) != 0) {
+			print_table(table, tname "[INAT_GROUP_TABLE_SIZE]",
+				    "0x%x", 8)
+			gtable[gid,0] = tname
+		}
+		if (array_size(lptable1) != 0) {
+			print_table(lptable1, tname "_1[INAT_GROUP_TABLE_SIZE]",
+				    "0x%x", 8)
+			gtable[gid,1] = tname "_1"
+		}
+		if (array_size(lptable2) != 0) {
+			print_table(lptable2, tname "_2[INAT_GROUP_TABLE_SIZE]",
+				    "0x%x", 8)
+			gtable[gid,2] = tname "_2"
+		}
+		if (array_size(lptable3) != 0) {
+			print_table(lptable3, tname "_3[INAT_GROUP_TABLE_SIZE]",
+				    "0x%x", 8)
+			gtable[gid,3] = tname "_3"
+		}
+	} else {
+		# print primary/escaped tables
+		if (array_size(table) != 0) {
+			print_table(table, tname "[INAT_OPCODE_TABLE_SIZE]",
+				    "0x%02x", 256)
+			etable[eid,0] = tname
+			if (aid >= 0)
+				atable[aid,0] = tname
+		}
+		if (array_size(lptable1) != 0) {
+			print_table(lptable1,tname "_1[INAT_OPCODE_TABLE_SIZE]",
+				    "0x%02x", 256)
+			etable[eid,1] = tname "_1"
+			if (aid >= 0)
+				atable[aid,1] = tname "_1"
+		}
+		if (array_size(lptable2) != 0) {
+			print_table(lptable2,tname "_2[INAT_OPCODE_TABLE_SIZE]",
+				    "0x%02x", 256)
+			etable[eid,2] = tname "_2"
+			if (aid >= 0)
+				atable[aid,2] = tname "_2"
+		}
+		if (array_size(lptable3) != 0) {
+			print_table(lptable3,tname "_3[INAT_OPCODE_TABLE_SIZE]",
+				    "0x%02x", 256)
+			etable[eid,3] = tname "_3"
+			if (aid >= 0)
+				atable[aid,3] = tname "_3"
+		}
+	}
+	print ""
+	clear_vars()
+}
+
+function add_flags(old,new) {
+	if (old && new)
+		return old " | " new
+	else if (old)
+		return old
+	else
+		return new
+}
+
+# convert operands to flags.
+function convert_operands(count,opnd,       i,j,imm,mod)
+{
+	imm = null
+	mod = null
+	for (j = 1; j <= count; j++) {
+		i = opnd[j]
+		if (match(i, imm_expr) == 1) {
+			if (!imm_flag[i])
+				semantic_error("Unknown imm opnd: " i)
+			if (imm) {
+				if (i != "Ib")
+					semantic_error("Second IMM error")
+				imm = add_flags(imm, "INAT_SCNDIMM")
+			} else
+				imm = imm_flag[i]
+		} else if (match(i, modrm_expr))
+			mod = "INAT_MODRM"
+	}
+	return add_flags(imm, mod)
+}
+
+/^[0-9a-f]+\:/ {
+	if (NR == 1)
+		next
+	# get index
+	idx = "0x" substr($1, 1, index($1,":") - 1)
+	if (idx in table)
+		semantic_error("Redefine " idx " in " tname)
+
+	# check if escaped opcode
+	if ("escape" == $2) {
+		if ($3 != "#")
+			semantic_error("No escaped name")
+		ref = ""
+		for (i = 4; i <= NF; i++)
+			ref = ref $i
+		if (ref in escape)
+			semantic_error("Redefine escape (" ref ")")
+		escape[ref] = geid
+		geid++
+		table[idx] = "INAT_MAKE_ESCAPE(" escape[ref] ")"
+		next
+	}
+
+	variant = null
+	# converts
+	i = 2
+	while (i <= NF) {
+		opcode = $(i++)
+		delete opnds
+		ext = null
+		flags = null
+		opnd = null
+		# parse one opcode
+		if (match($i, opnd_expr)) {
+			opnd = $i
+			count = split($(i++), opnds, ",")
+			flags = convert_operands(count, opnds)
+		}
+		if (match($i, ext_expr))
+			ext = $(i++)
+		if (match($i, sep_expr))
+			i++
+		else if (i < NF)
+			semantic_error($i " is not a separator")
+
+		# check if group opcode
+		if (match(opcode, group_expr)) {
+			if (!(opcode in group)) {
+				group[opcode] = ggid
+				ggid++
+			}
+			flags = add_flags(flags, "INAT_MAKE_GROUP(" group[opcode] ")")
+		}
+		# check force(or default) 64bit
+		if (match(ext, force64_expr))
+			flags = add_flags(flags, "INAT_FORCE64")
+
+		# check REX prefix
+		if (match(opcode, rex_expr))
+			flags = add_flags(flags, "INAT_MAKE_PREFIX(INAT_PFX_REX)")
+
+		# check coprocessor escape : TODO
+		if (match(opcode, fpu_expr))
+			flags = add_flags(flags, "INAT_MODRM")
+
+		# check VEX codes
+		if (match(ext, vexonly_expr))
+			flags = add_flags(flags, "INAT_VEXOK | INAT_VEXONLY")
+		else if (match(ext, vexok_expr) || match(opcode, vexok_opcode_expr))
+			flags = add_flags(flags, "INAT_VEXOK")
+
+		# check prefixes
+		if (match(ext, prefix_expr)) {
+			if (!prefix_num[opcode])
+				semantic_error("Unknown prefix: " opcode)
+			flags = add_flags(flags, "INAT_MAKE_PREFIX(" prefix_num[opcode] ")")
+		}
+		if (length(flags) == 0)
+			continue
+		# check if last prefix
+		if (match(ext, lprefix1_expr)) {
+			lptable1[idx] = add_flags(lptable1[idx],flags)
+			variant = "INAT_VARIANT"
+		}
+		if (match(ext, lprefix2_expr)) {
+			lptable2[idx] = add_flags(lptable2[idx],flags)
+			variant = "INAT_VARIANT"
+		}
+		if (match(ext, lprefix3_expr)) {
+			lptable3[idx] = add_flags(lptable3[idx],flags)
+			variant = "INAT_VARIANT"
+		}
+		if (!match(ext, lprefix_expr)){
+			table[idx] = add_flags(table[idx],flags)
+		}
+	}
+	if (variant)
+		table[idx] = add_flags(table[idx],variant)
+}
+
+END {
+	if (awkchecked != "")
+		exit 1
+	# print escape opcode map's array
+	print "/* Escape opcode map array */"
+	print "const insn_attr_t * const inat_escape_tables[INAT_ESC_MAX + 1]" \
+	      "[INAT_LSTPFX_MAX + 1] = {"
+	for (i = 0; i < geid; i++)
+		for (j = 0; j < max_lprefix; j++)
+			if (etable[i,j])
+				print "	["i"]["j"] = "etable[i,j]","
+	print "};\n"
+	# print group opcode map's array
+	print "/* Group opcode map array */"
+	print "const insn_attr_t * const inat_group_tables[INAT_GRP_MAX + 1]"\
+	      "[INAT_LSTPFX_MAX + 1] = {"
+	for (i = 0; i < ggid; i++)
+		for (j = 0; j < max_lprefix; j++)
+			if (gtable[i,j])
+				print "	["i"]["j"] = "gtable[i,j]","
+	print "};\n"
+	# print AVX opcode map's array
+	print "/* AVX opcode map array */"
+	print "const insn_attr_t * const inat_avx_tables[X86_VEX_M_MAX + 1]"\
+	      "[INAT_LSTPFX_MAX + 1] = {"
+	for (i = 0; i < gaid; i++)
+		for (j = 0; j < max_lprefix; j++)
+			if (atable[i,j])
+				print "	["i"]["j"] = "atable[i,j]","
+	print "};"
+}
+
diff --git a/tools/stacktool/arch/x86/insn/inat.c b/tools/stacktool/arch/x86/insn/inat.c
new file mode 100644
index 0000000..e4bf28e
--- /dev/null
+++ b/tools/stacktool/arch/x86/insn/inat.c
@@ -0,0 +1,97 @@
+/*
+ * x86 instruction attribute tables
+ *
+ * Written by Masami Hiramatsu <mhiramat@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+#include "insn.h"
+
+/* Attribute tables are generated from opcode map */
+#include "inat-tables.c"
+
+/* Attribute search APIs */
+insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode)
+{
+	return inat_primary_table[opcode];
+}
+
+int inat_get_last_prefix_id(insn_byte_t last_pfx)
+{
+	insn_attr_t lpfx_attr;
+
+	lpfx_attr = inat_get_opcode_attribute(last_pfx);
+	return inat_last_prefix_id(lpfx_attr);
+}
+
+insn_attr_t inat_get_escape_attribute(insn_byte_t opcode, int lpfx_id,
+				      insn_attr_t esc_attr)
+{
+	const insn_attr_t *table;
+	int n;
+
+	n = inat_escape_id(esc_attr);
+
+	table = inat_escape_tables[n][0];
+	if (!table)
+		return 0;
+	if (inat_has_variant(table[opcode]) && lpfx_id) {
+		table = inat_escape_tables[n][lpfx_id];
+		if (!table)
+			return 0;
+	}
+	return table[opcode];
+}
+
+insn_attr_t inat_get_group_attribute(insn_byte_t modrm, int lpfx_id,
+				     insn_attr_t grp_attr)
+{
+	const insn_attr_t *table;
+	int n;
+
+	n = inat_group_id(grp_attr);
+
+	table = inat_group_tables[n][0];
+	if (!table)
+		return inat_group_common_attribute(grp_attr);
+	if (inat_has_variant(table[X86_MODRM_REG(modrm)]) && lpfx_id) {
+		table = inat_group_tables[n][lpfx_id];
+		if (!table)
+			return inat_group_common_attribute(grp_attr);
+	}
+	return table[X86_MODRM_REG(modrm)] |
+	       inat_group_common_attribute(grp_attr);
+}
+
+insn_attr_t inat_get_avx_attribute(insn_byte_t opcode, insn_byte_t vex_m,
+				   insn_byte_t vex_p)
+{
+	const insn_attr_t *table;
+	if (vex_m > X86_VEX_M_MAX || vex_p > INAT_LSTPFX_MAX)
+		return 0;
+	/* First, check the master table */
+	table = inat_avx_tables[vex_m][0];
+	if (!table)
+		return 0;
+	if (!inat_is_group(table[opcode]) && vex_p) {
+		/* If this is not a group, get attribute directly */
+		table = inat_avx_tables[vex_m][vex_p];
+		if (!table)
+			return 0;
+	}
+	return table[opcode];
+}
+
diff --git a/tools/stacktool/arch/x86/insn/inat.h b/tools/stacktool/arch/x86/insn/inat.h
new file mode 100644
index 0000000..611645e
--- /dev/null
+++ b/tools/stacktool/arch/x86/insn/inat.h
@@ -0,0 +1,221 @@
+#ifndef _ASM_X86_INAT_H
+#define _ASM_X86_INAT_H
+/*
+ * x86 instruction attributes
+ *
+ * Written by Masami Hiramatsu <mhiramat@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+#include "inat_types.h"
+
+/*
+ * Internal bits. Don't use these bitmasks directly, because their layout is
+ * unstable. Use the attribute checking functions below instead.
+ */
+
+#define INAT_OPCODE_TABLE_SIZE 256
+#define INAT_GROUP_TABLE_SIZE 8
+
+/* Legacy last prefixes */
+#define INAT_PFX_OPNDSZ	1	/* 0x66 */ /* LPFX1 */
+#define INAT_PFX_REPE	2	/* 0xF3 */ /* LPFX2 */
+#define INAT_PFX_REPNE	3	/* 0xF2 */ /* LPFX3 */
+/* Other Legacy prefixes */
+#define INAT_PFX_LOCK	4	/* 0xF0 */
+#define INAT_PFX_CS	5	/* 0x2E */
+#define INAT_PFX_DS	6	/* 0x3E */
+#define INAT_PFX_ES	7	/* 0x26 */
+#define INAT_PFX_FS	8	/* 0x64 */
+#define INAT_PFX_GS	9	/* 0x65 */
+#define INAT_PFX_SS	10	/* 0x36 */
+#define INAT_PFX_ADDRSZ	11	/* 0x67 */
+/* x86-64 REX prefix */
+#define INAT_PFX_REX	12	/* 0x4X */
+/* AVX VEX prefixes */
+#define INAT_PFX_VEX2	13	/* 2-byte VEX prefix */
+#define INAT_PFX_VEX3	14	/* 3-byte VEX prefix */
+
+#define INAT_LSTPFX_MAX	3
+#define INAT_LGCPFX_MAX	11
+
+/* Immediate size */
+#define INAT_IMM_BYTE		1
+#define INAT_IMM_WORD		2
+#define INAT_IMM_DWORD		3
+#define INAT_IMM_QWORD		4
+#define INAT_IMM_PTR		5
+#define INAT_IMM_VWORD32	6
+#define INAT_IMM_VWORD		7
+
+/* Legacy prefix */
+#define INAT_PFX_OFFS	0
+#define INAT_PFX_BITS	4
+#define INAT_PFX_MAX    ((1 << INAT_PFX_BITS) - 1)
+#define INAT_PFX_MASK	(INAT_PFX_MAX << INAT_PFX_OFFS)
+/* Escape opcodes */
+#define INAT_ESC_OFFS	(INAT_PFX_OFFS + INAT_PFX_BITS)
+#define INAT_ESC_BITS	2
+#define INAT_ESC_MAX	((1 << INAT_ESC_BITS) - 1)
+#define INAT_ESC_MASK	(INAT_ESC_MAX << INAT_ESC_OFFS)
+/* Group opcodes (1-16) */
+#define INAT_GRP_OFFS	(INAT_ESC_OFFS + INAT_ESC_BITS)
+#define INAT_GRP_BITS	5
+#define INAT_GRP_MAX	((1 << INAT_GRP_BITS) - 1)
+#define INAT_GRP_MASK	(INAT_GRP_MAX << INAT_GRP_OFFS)
+/* Immediates */
+#define INAT_IMM_OFFS	(INAT_GRP_OFFS + INAT_GRP_BITS)
+#define INAT_IMM_BITS	3
+#define INAT_IMM_MASK	(((1 << INAT_IMM_BITS) - 1) << INAT_IMM_OFFS)
+/* Flags */
+#define INAT_FLAG_OFFS	(INAT_IMM_OFFS + INAT_IMM_BITS)
+#define INAT_MODRM	(1 << (INAT_FLAG_OFFS))
+#define INAT_FORCE64	(1 << (INAT_FLAG_OFFS + 1))
+#define INAT_SCNDIMM	(1 << (INAT_FLAG_OFFS + 2))
+#define INAT_MOFFSET	(1 << (INAT_FLAG_OFFS + 3))
+#define INAT_VARIANT	(1 << (INAT_FLAG_OFFS + 4))
+#define INAT_VEXOK	(1 << (INAT_FLAG_OFFS + 5))
+#define INAT_VEXONLY	(1 << (INAT_FLAG_OFFS + 6))
+/* Attribute making macros for attribute tables */
+#define INAT_MAKE_PREFIX(pfx)	(pfx << INAT_PFX_OFFS)
+#define INAT_MAKE_ESCAPE(esc)	(esc << INAT_ESC_OFFS)
+#define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
+#define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)
+
+/* Attribute search APIs */
+extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode);
+extern int inat_get_last_prefix_id(insn_byte_t last_pfx);
+extern insn_attr_t inat_get_escape_attribute(insn_byte_t opcode,
+					     int lpfx_id,
+					     insn_attr_t esc_attr);
+extern insn_attr_t inat_get_group_attribute(insn_byte_t modrm,
+					    int lpfx_id,
+					    insn_attr_t grp_attr);
+extern insn_attr_t inat_get_avx_attribute(insn_byte_t opcode,
+					  insn_byte_t vex_m,
+					  insn_byte_t vex_p);
+
+/* Attribute checking functions */
+static inline int inat_is_legacy_prefix(insn_attr_t attr)
+{
+	attr &= INAT_PFX_MASK;
+	return attr && attr <= INAT_LGCPFX_MAX;
+}
+
+static inline int inat_is_address_size_prefix(insn_attr_t attr)
+{
+	return (attr & INAT_PFX_MASK) == INAT_PFX_ADDRSZ;
+}
+
+static inline int inat_is_operand_size_prefix(insn_attr_t attr)
+{
+	return (attr & INAT_PFX_MASK) == INAT_PFX_OPNDSZ;
+}
+
+static inline int inat_is_rex_prefix(insn_attr_t attr)
+{
+	return (attr & INAT_PFX_MASK) == INAT_PFX_REX;
+}
+
+static inline int inat_last_prefix_id(insn_attr_t attr)
+{
+	if ((attr & INAT_PFX_MASK) > INAT_LSTPFX_MAX)
+		return 0;
+	else
+		return attr & INAT_PFX_MASK;
+}
+
+static inline int inat_is_vex_prefix(insn_attr_t attr)
+{
+	attr &= INAT_PFX_MASK;
+	return attr == INAT_PFX_VEX2 || attr == INAT_PFX_VEX3;
+}
+
+static inline int inat_is_vex3_prefix(insn_attr_t attr)
+{
+	return (attr & INAT_PFX_MASK) == INAT_PFX_VEX3;
+}
+
+static inline int inat_is_escape(insn_attr_t attr)
+{
+	return attr & INAT_ESC_MASK;
+}
+
+static inline int inat_escape_id(insn_attr_t attr)
+{
+	return (attr & INAT_ESC_MASK) >> INAT_ESC_OFFS;
+}
+
+static inline int inat_is_group(insn_attr_t attr)
+{
+	return attr & INAT_GRP_MASK;
+}
+
+static inline int inat_group_id(insn_attr_t attr)
+{
+	return (attr & INAT_GRP_MASK) >> INAT_GRP_OFFS;
+}
+
+static inline int inat_group_common_attribute(insn_attr_t attr)
+{
+	return attr & ~INAT_GRP_MASK;
+}
+
+static inline int inat_has_immediate(insn_attr_t attr)
+{
+	return attr & INAT_IMM_MASK;
+}
+
+static inline int inat_immediate_size(insn_attr_t attr)
+{
+	return (attr & INAT_IMM_MASK) >> INAT_IMM_OFFS;
+}
+
+static inline int inat_has_modrm(insn_attr_t attr)
+{
+	return attr & INAT_MODRM;
+}
+
+static inline int inat_is_force64(insn_attr_t attr)
+{
+	return attr & INAT_FORCE64;
+}
+
+static inline int inat_has_second_immediate(insn_attr_t attr)
+{
+	return attr & INAT_SCNDIMM;
+}
+
+static inline int inat_has_moffset(insn_attr_t attr)
+{
+	return attr & INAT_MOFFSET;
+}
+
+static inline int inat_has_variant(insn_attr_t attr)
+{
+	return attr & INAT_VARIANT;
+}
+
+static inline int inat_accept_vex(insn_attr_t attr)
+{
+	return attr & INAT_VEXOK;
+}
+
+static inline int inat_must_vex(insn_attr_t attr)
+{
+	return attr & INAT_VEXONLY;
+}
+#endif
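The bit layout that inat.h packs into an insn_attr_t can be sketched as
follows. This is a minimal illustration, not part of the patch: every
DEMO_*/demo_* name is hypothetical and only mirrors the offsets and widths
defined above (prefix: 4 bits, escape: 2 bits, group: 5 bits, immediate:
3 bits, then the flags). Real users should go through the inat_*() checking
functions, since the header itself marks the layout as unstable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for insn_attr_t. */
typedef uint32_t demo_attr_t;

/* Field offsets and widths, mirroring the INAT_*_OFFS/INAT_*_BITS chain. */
enum {
	DEMO_PFX_OFFS  = 0,
	DEMO_PFX_BITS  = 4,
	DEMO_ESC_OFFS  = DEMO_PFX_OFFS + DEMO_PFX_BITS,	/* 4 */
	DEMO_ESC_BITS  = 2,
	DEMO_GRP_OFFS  = DEMO_ESC_OFFS + DEMO_ESC_BITS,	/* 6 */
	DEMO_GRP_BITS  = 5,
	DEMO_IMM_OFFS  = DEMO_GRP_OFFS + DEMO_GRP_BITS,	/* 11 */
	DEMO_IMM_BITS  = 3,
	DEMO_FLAG_OFFS = DEMO_IMM_OFFS + DEMO_IMM_BITS	/* 14 */
};

/* First flag bit: the instruction has a ModRM byte. */
#define DEMO_MODRM	((demo_attr_t)1 << DEMO_FLAG_OFFS)

/* Mask covering a 'bits'-wide field that starts at bit 'offs'. */
static demo_attr_t demo_field_mask(unsigned int offs, unsigned int bits)
{
	return ((((demo_attr_t)1) << bits) - 1) << offs;
}

/* Like INAT_MAKE_GROUP(): a group attribute always implies ModRM. */
static demo_attr_t demo_make_group(unsigned int grp)
{
	assert(grp < (1u << DEMO_GRP_BITS));
	return ((demo_attr_t)grp << DEMO_GRP_OFFS) | DEMO_MODRM;
}

/* Like INAT_MAKE_IMM(): store an immediate-size code. */
static demo_attr_t demo_make_imm(unsigned int imm)
{
	assert(imm < (1u << DEMO_IMM_BITS));
	return (demo_attr_t)imm << DEMO_IMM_OFFS;
}

/* Like inat_group_id(): extract the group number. */
static unsigned int demo_group_id(demo_attr_t attr)
{
	return (attr & demo_field_mask(DEMO_GRP_OFFS, DEMO_GRP_BITS))
		>> DEMO_GRP_OFFS;
}

/* Like inat_immediate_size(): extract the immediate-size code. */
static unsigned int demo_imm_size(demo_attr_t attr)
{
	return (attr & demo_field_mask(DEMO_IMM_OFFS, DEMO_IMM_BITS))
		>> DEMO_IMM_OFFS;
}

/* Like inat_has_modrm(). */
static int demo_has_modrm(demo_attr_t attr)
{
	return (attr & DEMO_MODRM) != 0;
}
```

For example, demo_make_group(5) | demo_make_imm(1) yields an attribute for
which demo_group_id() returns 5, demo_imm_size() returns 1, and
demo_has_modrm() is true. Because each field occupies a disjoint bit range,
attributes can be combined with a plain OR.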
diff --git a/tools/stacktool/arch/x86/insn/inat_types.h b/tools/stacktool/arch/x86/insn/inat_types.h
new file mode 100644
index 0000000..cb3c20c
--- /dev/null
+++ b/tools/stacktool/arch/x86/insn/inat_types.h
@@ -0,0 +1,29 @@
+#ifndef _ASM_X86_INAT_TYPES_H
+#define _ASM_X86_INAT_TYPES_H
+/*
+ * x86 instruction attributes
+ *
+ * Written by Masami Hiramatsu <mhiramat@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+
+/* Instruction attributes */
+typedef unsigned int insn_attr_t;
+typedef unsigned char insn_byte_t;
+typedef signed int insn_value_t;
+
+#endif
diff --git a/tools/stacktool/arch/x86/insn/insn.c b/tools/stacktool/arch/x86/insn/insn.c
new file mode 100644
index 0000000..47314a6
--- /dev/null
+++ b/tools/stacktool/arch/x86/insn/insn.c
@@ -0,0 +1,594 @@
+/*
+ * x86 instruction analysis
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2002, 2004, 2009
+ */
+
+#ifdef __KERNEL__
+#include <linux/string.h>
+#else
+#include <string.h>
+#endif
+#include "inat.h"
+#include "insn.h"
+
+/* Verify that sizeof(t) bytes at offset n from next_byte fit in the buffer */
+#define validate_next(t, insn, n)	\
+	((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr)
+
+#define __get_next(t, insn)	\
+	({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; })
+
+#define __peek_nbyte_next(t, insn, n)	\
+	({ t r = *(t*)((insn)->next_byte + n); r; })
+
+#define get_next(t, insn)	\
+	({ if (unlikely(!validate_next(t, insn, 0))) goto err_out; __get_next(t, insn); })
+
+#define peek_nbyte_next(t, insn, n)	\
+	({ if (unlikely(!validate_next(t, insn, n))) goto err_out; __peek_nbyte_next(t, insn, n); })
+
+#define peek_next(t, insn)	peek_nbyte_next(t, insn, 0)
+
+/**
+ * insn_init() - initialize struct insn
+ * @insn:	&struct insn to be initialized
+ * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
+ * @x86_64:	!0 for 64-bit kernel or 64-bit app
+ */
+void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64)
+{
+	/*
+	 * Instructions longer than MAX_INSN_SIZE (15 bytes) are invalid
+	 * even if the input buffer is long enough to hold them.
+	 */
+	if (buf_len > MAX_INSN_SIZE)
+		buf_len = MAX_INSN_SIZE;
+
+	memset(insn, 0, sizeof(*insn));
+	insn->kaddr = kaddr;
+	insn->end_kaddr = kaddr + buf_len;
+	insn->next_byte = kaddr;
+	insn->x86_64 = x86_64 ? 1 : 0;
+	insn->opnd_bytes = 4;
+	if (x86_64)
+		insn->addr_bytes = 8;
+	else
+		insn->addr_bytes = 4;
+}
+
+/**
+ * insn_get_prefixes - scan x86 instruction prefix bytes
+ * @insn:	&struct insn containing instruction
+ *
+ * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
+ * to point to the (first) opcode.  No effect if @insn->prefixes.got
+ * is already set.
+ */
+void insn_get_prefixes(struct insn *insn)
+{
+	struct insn_field *prefixes = &insn->prefixes;
+	insn_attr_t attr;
+	insn_byte_t b, lb;
+	int i, nb;
+
+	if (prefixes->got)
+		return;
+
+	nb = 0;
+	lb = 0;
+	b = peek_next(insn_byte_t, insn);
+	attr = inat_get_opcode_attribute(b);
+	while (inat_is_legacy_prefix(attr)) {
+		/* Skip if same prefix */
+		for (i = 0; i < nb; i++)
+			if (prefixes->bytes[i] == b)
+				goto found;
+		if (nb == 4)
+			/* Invalid instruction */
+			break;
+		prefixes->bytes[nb++] = b;
+		if (inat_is_address_size_prefix(attr)) {
+			/* address size switches 2/4 or 4/8 */
+			if (insn->x86_64)
+				insn->addr_bytes ^= 12;
+			else
+				insn->addr_bytes ^= 6;
+		} else if (inat_is_operand_size_prefix(attr)) {
+			/* operand size switches 2/4 */
+			insn->opnd_bytes ^= 6;
+		}
+found:
+		prefixes->nbytes++;
+		insn->next_byte++;
+		lb = b;
+		b = peek_next(insn_byte_t, insn);
+		attr = inat_get_opcode_attribute(b);
+	}
+	/* Set the last prefix */
+	if (lb && lb != insn->prefixes.bytes[3]) {
+		if (unlikely(insn->prefixes.bytes[3])) {
+			/* Swap the last prefix */
+			b = insn->prefixes.bytes[3];
+			for (i = 0; i < nb; i++)
+				if (prefixes->bytes[i] == lb)
+					prefixes->bytes[i] = b;
+		}
+		insn->prefixes.bytes[3] = lb;
+	}
+
+	/* Decode REX prefix */
+	if (insn->x86_64) {
+		b = peek_next(insn_byte_t, insn);
+		attr = inat_get_opcode_attribute(b);
+		if (inat_is_rex_prefix(attr)) {
+			insn->rex_prefix.value = b;
+			insn->rex_prefix.nbytes = 1;
+			insn->next_byte++;
+			if (X86_REX_W(b))
+				/* REX.W overrides opnd_size */
+				insn->opnd_bytes = 8;
+		}
+	}
+	insn->rex_prefix.got = 1;
+
+	/* Decode VEX prefix */
+	b = peek_next(insn_byte_t, insn);
+	attr = inat_get_opcode_attribute(b);
+	if (inat_is_vex_prefix(attr)) {
+		insn_byte_t b2 = peek_nbyte_next(insn_byte_t, insn, 1);
+		if (!insn->x86_64) {
+			/*
+			 * In 32-bit mode, if the [7:6] bits (mod bits of
+			 * ModRM) on the second byte are not 11b, it is
+			 * LDS or LES.
+			 */
+			if (X86_MODRM_MOD(b2) != 3)
+				goto vex_end;
+		}
+		insn->vex_prefix.bytes[0] = b;
+		insn->vex_prefix.bytes[1] = b2;
+		if (inat_is_vex3_prefix(attr)) {
+			b2 = peek_nbyte_next(insn_byte_t, insn, 2);
+			insn->vex_prefix.bytes[2] = b2;
+			insn->vex_prefix.nbytes = 3;
+			insn->next_byte += 3;
+			if (insn->x86_64 && X86_VEX_W(b2))
+				/* VEX.W overrides opnd_size */
+				insn->opnd_bytes = 8;
+		} else {
+			/*
+			 * For VEX2, fake VEX3-like byte#2.
+			 * Makes it easier to decode vex.W, vex.vvvv,
+			 * vex.L and vex.pp. Masking with 0x7f sets vex.W == 0.
+			 */
+			insn->vex_prefix.bytes[2] = b2 & 0x7f;
+			insn->vex_prefix.nbytes = 2;
+			insn->next_byte += 2;
+		}
+	}
+vex_end:
+	insn->vex_prefix.got = 1;
+
+	prefixes->got = 1;
+
+err_out:
+	return;
+}
+
+/**
+ * insn_get_opcode - collect opcode(s)
+ * @insn:	&struct insn containing instruction
+ *
+ * Populates @insn->opcode, updates @insn->next_byte to point past the
+ * opcode byte(s), and sets @insn->attr (except for groups).
+ * If necessary, first collects any preceding (prefix) bytes.
+ * Sets @insn->opcode.value = opcode1.  No effect if @insn->opcode.got
+ * is already 1.
+ */
+void insn_get_opcode(struct insn *insn)
+{
+	struct insn_field *opcode = &insn->opcode;
+	insn_byte_t op;
+	int pfx_id;
+	if (opcode->got)
+		return;
+	if (!insn->prefixes.got)
+		insn_get_prefixes(insn);
+
+	/* Get first opcode */
+	op = get_next(insn_byte_t, insn);
+	opcode->bytes[0] = op;
+	opcode->nbytes = 1;
+
+	/* Check if there is VEX prefix or not */
+	if (insn_is_avx(insn)) {
+		insn_byte_t m, p;
+		m = insn_vex_m_bits(insn);
+		p = insn_vex_p_bits(insn);
+		insn->attr = inat_get_avx_attribute(op, m, p);
+		if (!inat_accept_vex(insn->attr) && !inat_is_group(insn->attr))
+			insn->attr = 0;	/* This instruction is bad */
+		goto end;	/* VEX has only 1 byte for opcode */
+	}
+
+	insn->attr = inat_get_opcode_attribute(op);
+	while (inat_is_escape(insn->attr)) {
+		/* Get escaped opcode */
+		op = get_next(insn_byte_t, insn);
+		opcode->bytes[opcode->nbytes++] = op;
+		pfx_id = insn_last_prefix_id(insn);
+		insn->attr = inat_get_escape_attribute(op, pfx_id, insn->attr);
+	}
+	if (inat_must_vex(insn->attr))
+		insn->attr = 0;	/* This instruction is bad */
+end:
+	opcode->got = 1;
+
+err_out:
+	return;
+}
+
+/**
+ * insn_get_modrm - collect ModRM byte, if any
+ * @insn:	&struct insn containing instruction
+ *
+ * Populates @insn->modrm and updates @insn->next_byte to point past the
+ * ModRM byte, if any.  If necessary, first collects the preceding bytes
+ * (prefixes and opcode(s)).  No effect if @insn->modrm.got is already 1.
+ */
+void insn_get_modrm(struct insn *insn)
+{
+	struct insn_field *modrm = &insn->modrm;
+	insn_byte_t pfx_id, mod;
+	if (modrm->got)
+		return;
+	if (!insn->opcode.got)
+		insn_get_opcode(insn);
+
+	if (inat_has_modrm(insn->attr)) {
+		mod = get_next(insn_byte_t, insn);
+		modrm->value = mod;
+		modrm->nbytes = 1;
+		if (inat_is_group(insn->attr)) {
+			pfx_id = insn_last_prefix_id(insn);
+			insn->attr = inat_get_group_attribute(mod, pfx_id,
+							      insn->attr);
+			if (insn_is_avx(insn) && !inat_accept_vex(insn->attr))
+				insn->attr = 0;	/* This is bad */
+		}
+	}
+
+	if (insn->x86_64 && inat_is_force64(insn->attr))
+		insn->opnd_bytes = 8;
+	modrm->got = 1;
+
+err_out:
+	return;
+}
+
+
+/**
+ * insn_rip_relative() - Does instruction use RIP-relative addressing mode?
+ * @insn:	&struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * ModRM byte.  No effect if @insn->x86_64 is 0.
+ */
+int insn_rip_relative(struct insn *insn)
+{
+	struct insn_field *modrm = &insn->modrm;
+
+	if (!insn->x86_64)
+		return 0;
+	if (!modrm->got)
+		insn_get_modrm(insn);
+	/*
+	 * For rip-relative instructions, the mod field (top 2 bits)
+	 * is zero and the r/m field (bottom 3 bits) is 0x5.
+	 */
+	return (modrm->nbytes && (modrm->value & 0xc7) == 0x5);
+}
+
+/**
+ * insn_get_sib() - Get the SIB byte of instruction
+ * @insn:	&struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * ModRM byte.
+ */
+void insn_get_sib(struct insn *insn)
+{
+	insn_byte_t modrm;
+
+	if (insn->sib.got)
+		return;
+	if (!insn->modrm.got)
+		insn_get_modrm(insn);
+	if (insn->modrm.nbytes) {
+		modrm = (insn_byte_t)insn->modrm.value;
+		if (insn->addr_bytes != 2 &&
+		    X86_MODRM_MOD(modrm) != 3 && X86_MODRM_RM(modrm) == 4) {
+			insn->sib.value = get_next(insn_byte_t, insn);
+			insn->sib.nbytes = 1;
+		}
+	}
+	insn->sib.got = 1;
+
+err_out:
+	return;
+}
+
+
+/**
+ * insn_get_displacement() - Get the displacement of instruction
+ * @insn:	&struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * SIB byte.
+ * Displacement value is sign-extended.
+ */
+void insn_get_displacement(struct insn *insn)
+{
+	insn_byte_t mod, rm, base;
+
+	if (insn->displacement.got)
+		return;
+	if (!insn->sib.got)
+		insn_get_sib(insn);
+	if (insn->modrm.nbytes) {
+		/*
+		 * Interpreting the modrm byte:
+		 * mod = 00 - no displacement fields (exceptions below)
+		 * mod = 01 - 1-byte displacement field
+		 * mod = 10 - displacement field is 4 bytes, or 2 bytes if
+		 * 	address size = 2 (0x67 prefix in 32-bit mode)
+		 * mod = 11 - no memory operand
+		 *
+		 * If address size = 2...
+		 * mod = 00, r/m = 110 - displacement field is 2 bytes
+		 *
+		 * If address size != 2...
+		 * mod != 11, r/m = 100 - SIB byte exists
+		 * mod = 00, SIB base = 101 - displacement field is 4 bytes
+		 * mod = 00, r/m = 101 - rip-relative addressing, displacement
+		 * 	field is 4 bytes
+		 */
+		mod = X86_MODRM_MOD(insn->modrm.value);
+		rm = X86_MODRM_RM(insn->modrm.value);
+		base = X86_SIB_BASE(insn->sib.value);
+		if (mod == 3)
+			goto out;
+		if (mod == 1) {
+			insn->displacement.value = get_next(char, insn);
+			insn->displacement.nbytes = 1;
+		} else if (insn->addr_bytes == 2) {
+			if ((mod == 0 && rm == 6) || mod == 2) {
+				insn->displacement.value =
+					 get_next(short, insn);
+				insn->displacement.nbytes = 2;
+			}
+		} else {
+			if ((mod == 0 && rm == 5) || mod == 2 ||
+			    (mod == 0 && base == 5)) {
+				insn->displacement.value = get_next(int, insn);
+				insn->displacement.nbytes = 4;
+			}
+		}
+	}
+out:
+	insn->displacement.got = 1;
+
+err_out:
+	return;
+}
+
+/* Decode moffset16/32/64. Return 0 if failed */
+static int __get_moffset(struct insn *insn)
+{
+	switch (insn->addr_bytes) {
+	case 2:
+		insn->moffset1.value = get_next(short, insn);
+		insn->moffset1.nbytes = 2;
+		break;
+	case 4:
+		insn->moffset1.value = get_next(int, insn);
+		insn->moffset1.nbytes = 4;
+		break;
+	case 8:
+		insn->moffset1.value = get_next(int, insn);
+		insn->moffset1.nbytes = 4;
+		insn->moffset2.value = get_next(int, insn);
+		insn->moffset2.nbytes = 4;
+		break;
+	default:	/* addr_bytes must be modified manually */
+		goto err_out;
+	}
+	insn->moffset1.got = insn->moffset2.got = 1;
+
+	return 1;
+
+err_out:
+	return 0;
+}
+
+/* Decode imm v32(Iz). Return 0 if failed */
+static int __get_immv32(struct insn *insn)
+{
+	switch (insn->opnd_bytes) {
+	case 2:
+		insn->immediate.value = get_next(short, insn);
+		insn->immediate.nbytes = 2;
+		break;
+	case 4:
+	case 8:
+		insn->immediate.value = get_next(int, insn);
+		insn->immediate.nbytes = 4;
+		break;
+	default:	/* opnd_bytes must be modified manually */
+		goto err_out;
+	}
+
+	return 1;
+
+err_out:
+	return 0;
+}
+
+/* Decode imm v64(Iv/Ov). Return 0 if failed */
+static int __get_immv(struct insn *insn)
+{
+	switch (insn->opnd_bytes) {
+	case 2:
+		insn->immediate1.value = get_next(short, insn);
+		insn->immediate1.nbytes = 2;
+		break;
+	case 4:
+		insn->immediate1.value = get_next(int, insn);
+		insn->immediate1.nbytes = 4;
+		break;
+	case 8:
+		insn->immediate1.value = get_next(int, insn);
+		insn->immediate1.nbytes = 4;
+		insn->immediate2.value = get_next(int, insn);
+		insn->immediate2.nbytes = 4;
+		break;
+	default:	/* opnd_bytes must be modified manually */
+		goto err_out;
+	}
+	insn->immediate1.got = insn->immediate2.got = 1;
+
+	return 1;
+err_out:
+	return 0;
+}
+
+/* Decode ptr16:16/32(Ap) */
+static int __get_immptr(struct insn *insn)
+{
+	switch (insn->opnd_bytes) {
+	case 2:
+		insn->immediate1.value = get_next(short, insn);
+		insn->immediate1.nbytes = 2;
+		break;
+	case 4:
+		insn->immediate1.value = get_next(int, insn);
+		insn->immediate1.nbytes = 4;
+		break;
+	case 8:
+		/* ptr16:64 does not exist (no segment) */
+		return 0;
+	default:	/* opnd_bytes must be modified manually */
+		goto err_out;
+	}
+	insn->immediate2.value = get_next(unsigned short, insn);
+	insn->immediate2.nbytes = 2;
+	insn->immediate1.got = insn->immediate2.got = 1;
+
+	return 1;
+err_out:
+	return 0;
+}
+
+/**
+ * insn_get_immediate() - Get the immediates of instruction
+ * @insn:	&struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * displacement bytes.
+ * Most immediates are sign-extended. The unsigned value can be
+ * obtained by masking with ((1 << (nbytes * 8)) - 1)
+ */
+void insn_get_immediate(struct insn *insn)
+{
+	if (insn->immediate.got)
+		return;
+	if (!insn->displacement.got)
+		insn_get_displacement(insn);
+
+	if (inat_has_moffset(insn->attr)) {
+		if (!__get_moffset(insn))
+			goto err_out;
+		goto done;
+	}
+
+	if (!inat_has_immediate(insn->attr))
+		/* no immediates */
+		goto done;
+
+	switch (inat_immediate_size(insn->attr)) {
+	case INAT_IMM_BYTE:
+		insn->immediate.value = get_next(char, insn);
+		insn->immediate.nbytes = 1;
+		break;
+	case INAT_IMM_WORD:
+		insn->immediate.value = get_next(short, insn);
+		insn->immediate.nbytes = 2;
+		break;
+	case INAT_IMM_DWORD:
+		insn->immediate.value = get_next(int, insn);
+		insn->immediate.nbytes = 4;
+		break;
+	case INAT_IMM_QWORD:
+		insn->immediate1.value = get_next(int, insn);
+		insn->immediate1.nbytes = 4;
+		insn->immediate2.value = get_next(int, insn);
+		insn->immediate2.nbytes = 4;
+		break;
+	case INAT_IMM_PTR:
+		if (!__get_immptr(insn))
+			goto err_out;
+		break;
+	case INAT_IMM_VWORD32:
+		if (!__get_immv32(insn))
+			goto err_out;
+		break;
+	case INAT_IMM_VWORD:
+		if (!__get_immv(insn))
+			goto err_out;
+		break;
+	default:
+		/* Here, insn must have an immediate, but failed */
+		goto err_out;
+	}
+	if (inat_has_second_immediate(insn->attr)) {
+		insn->immediate2.value = get_next(char, insn);
+		insn->immediate2.nbytes = 1;
+	}
+done:
+	insn->immediate.got = 1;
+
+err_out:
+	return;
+}
+
+/**
+ * insn_get_length() - Get the length of instruction
+ * @insn:	&struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * immediate bytes.
+ */
+void insn_get_length(struct insn *insn)
+{
+	if (insn->length)
+		return;
+	if (!insn->immediate.got)
+		insn_get_immediate(insn);
+	insn->length = (unsigned char)((unsigned long)insn->next_byte
+				     - (unsigned long)insn->kaddr);
+}
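The displacement rules documented in insn_get_displacement() can be
condensed into a small lookup. The sketch below is illustrative only and not
part of the patch: demo_disp_nbytes() is a hypothetical helper that takes the
raw ModRM and SIB bytes plus the effective address size. One deliberate
difference: the patch relies on insn->sib.value being zero when no SIB byte
was decoded, whereas this sketch checks r/m == 100b explicitly because the
caller passes the SIB byte in directly.

```c
#include <assert.h>

/*
 * Return the number of displacement bytes implied by a ModRM byte (and
 * the SIB byte, if one is present) for the given address size in bytes.
 * Hypothetical helper mirroring the comment table in
 * insn_get_displacement().
 */
static int demo_disp_nbytes(unsigned char modrm, unsigned char sib,
			    int addr_bytes)
{
	unsigned int mod  = (modrm & 0xc0) >> 6;	/* X86_MODRM_MOD() */
	unsigned int rm   = modrm & 0x07;		/* X86_MODRM_RM() */
	unsigned int base = sib & 0x07;			/* X86_SIB_BASE() */

	assert(addr_bytes == 2 || addr_bytes == 4 || addr_bytes == 8);

	if (mod == 3)		/* register operand, no memory access */
		return 0;
	if (mod == 1)		/* 1-byte displacement */
		return 1;

	if (addr_bytes == 2)	/* 16-bit addressing: disp16 or nothing */
		return ((mod == 0 && rm == 6) || mod == 2) ? 2 : 0;

	/* 32/64-bit addressing */
	if (mod == 2)				/* disp32 */
		return 4;
	if (mod == 0 && rm == 5)		/* disp32 or RIP-relative */
		return 4;
	if (mod == 0 && rm == 4 && base == 5)	/* SIB with no base */
		return 4;
	return 0;
}
```

As a usage example, a RIP-relative operand in 64-bit mode has ModRM 0x05
(mod = 00, r/m = 101), so demo_disp_nbytes(0x05, 0, 8) yields 4, while a
register-only form such as ModRM 0xc0 yields 0.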
diff --git a/tools/stacktool/arch/x86/insn/insn.h b/tools/stacktool/arch/x86/insn/insn.h
new file mode 100644
index 0000000..dd12da0
--- /dev/null
+++ b/tools/stacktool/arch/x86/insn/insn.h
@@ -0,0 +1,201 @@
+#ifndef _ASM_X86_INSN_H
+#define _ASM_X86_INSN_H
+/*
+ * x86 instruction analysis
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2009
+ */
+
+/* insn_attr_t is defined in inat.h */
+#include "inat.h"
+
+struct insn_field {
+	union {
+		insn_value_t value;
+		insn_byte_t bytes[4];
+	};
+	/* !0 if we've run insn_get_xxx() for this field */
+	unsigned char got;
+	unsigned char nbytes;
+};
+
+struct insn {
+	struct insn_field prefixes;	/*
+					 * Prefixes
+					 * prefixes.bytes[3]: last prefix
+					 */
+	struct insn_field rex_prefix;	/* REX prefix */
+	struct insn_field vex_prefix;	/* VEX prefix */
+	struct insn_field opcode;	/*
+					 * opcode.bytes[0]: opcode1
+					 * opcode.bytes[1]: opcode2
+					 * opcode.bytes[2]: opcode3
+					 */
+	struct insn_field modrm;
+	struct insn_field sib;
+	struct insn_field displacement;
+	union {
+		struct insn_field immediate;
+		struct insn_field moffset1;	/* for 64bit MOV */
+		struct insn_field immediate1;	/* for 64bit imm or off16/32 */
+	};
+	union {
+		struct insn_field moffset2;	/* for 64bit MOV */
+		struct insn_field immediate2;	/* for 64bit imm or seg16 */
+	};
+
+	insn_attr_t attr;
+	unsigned char opnd_bytes;
+	unsigned char addr_bytes;
+	unsigned char length;
+	unsigned char x86_64;
+
+	const insn_byte_t *kaddr;	/* kernel address of insn to analyze */
+	const insn_byte_t *end_kaddr;	/* end of the insn buffer (exclusive) */
+	const insn_byte_t *next_byte;
+};
+
+#define MAX_INSN_SIZE	15
+
+#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
+#define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3)
+#define X86_MODRM_RM(modrm) ((modrm) & 0x07)
+
+#define X86_SIB_SCALE(sib) (((sib) & 0xc0) >> 6)
+#define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3)
+#define X86_SIB_BASE(sib) ((sib) & 0x07)
+
+#define X86_REX_W(rex) ((rex) & 8)
+#define X86_REX_R(rex) ((rex) & 4)
+#define X86_REX_X(rex) ((rex) & 2)
+#define X86_REX_B(rex) ((rex) & 1)
+
+/* VEX bit flags  */
+#define X86_VEX_W(vex)	((vex) & 0x80)	/* VEX3 Byte2 */
+#define X86_VEX_R(vex)	((vex) & 0x80)	/* VEX2/3 Byte1 */
+#define X86_VEX_X(vex)	((vex) & 0x40)	/* VEX3 Byte1 */
+#define X86_VEX_B(vex)	((vex) & 0x20)	/* VEX3 Byte1 */
+#define X86_VEX_L(vex)	((vex) & 0x04)	/* VEX3 Byte2, VEX2 Byte1 */
+/* VEX bit fields */
+#define X86_VEX3_M(vex)	((vex) & 0x1f)		/* VEX3 Byte1 */
+#define X86_VEX2_M	1			/* VEX2.M always 1 */
+#define X86_VEX_V(vex)	(((vex) & 0x78) >> 3)	/* VEX3 Byte2, VEX2 Byte1 */
+#define X86_VEX_P(vex)	((vex) & 0x03)		/* VEX3 Byte2, VEX2 Byte1 */
+#define X86_VEX_M_MAX	0x1f			/* VEX3.M Maximum value */
+
+extern void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64);
+extern void insn_get_prefixes(struct insn *insn);
+extern void insn_get_opcode(struct insn *insn);
+extern void insn_get_modrm(struct insn *insn);
+extern void insn_get_sib(struct insn *insn);
+extern void insn_get_displacement(struct insn *insn);
+extern void insn_get_immediate(struct insn *insn);
+extern void insn_get_length(struct insn *insn);
+
+/* Attribute will be determined after getting ModRM (for opcode groups) */
+static inline void insn_get_attribute(struct insn *insn)
+{
+	insn_get_modrm(insn);
+}
+
+/* Instruction uses RIP-relative addressing */
+extern int insn_rip_relative(struct insn *insn);
+
+/* Init insn for kernel text */
+static inline void kernel_insn_init(struct insn *insn,
+				    const void *kaddr, int buf_len)
+{
+#ifdef CONFIG_X86_64
+	insn_init(insn, kaddr, buf_len, 1);
+#else /* CONFIG_X86_32 */
+	insn_init(insn, kaddr, buf_len, 0);
+#endif
+}
+
+static inline int insn_is_avx(struct insn *insn)
+{
+	if (!insn->prefixes.got)
+		insn_get_prefixes(insn);
+	return (insn->vex_prefix.value != 0);
+}
+
+/* Ensure this instruction is decoded completely */
+static inline int insn_complete(struct insn *insn)
+{
+	return insn->opcode.got && insn->modrm.got && insn->sib.got &&
+		insn->displacement.got && insn->immediate.got;
+}
+
+static inline insn_byte_t insn_vex_m_bits(struct insn *insn)
+{
+	if (insn->vex_prefix.nbytes == 2)	/* 2-byte VEX */
+		return X86_VEX2_M;
+	else
+		return X86_VEX3_M(insn->vex_prefix.bytes[1]);
+}
+
+static inline insn_byte_t insn_vex_p_bits(struct insn *insn)
+{
+	if (insn->vex_prefix.nbytes == 2)	/* 2-byte VEX */
+		return X86_VEX_P(insn->vex_prefix.bytes[1]);
+	else
+		return X86_VEX_P(insn->vex_prefix.bytes[2]);
+}
+
+/* Get the last prefix id from last prefix or VEX prefix */
+static inline int insn_last_prefix_id(struct insn *insn)
+{
+	if (insn_is_avx(insn))
+		return insn_vex_p_bits(insn);	/* VEX_p is a SIMD prefix id */
+
+	if (insn->prefixes.bytes[3])
+		return inat_get_last_prefix_id(insn->prefixes.bytes[3]);
+
+	return 0;
+}
+
+/* Offset of each field from kaddr */
+static inline int insn_offset_rex_prefix(struct insn *insn)
+{
+	return insn->prefixes.nbytes;
+}
+static inline int insn_offset_vex_prefix(struct insn *insn)
+{
+	return insn_offset_rex_prefix(insn) + insn->rex_prefix.nbytes;
+}
+static inline int insn_offset_opcode(struct insn *insn)
+{
+	return insn_offset_vex_prefix(insn) + insn->vex_prefix.nbytes;
+}
+static inline int insn_offset_modrm(struct insn *insn)
+{
+	return insn_offset_opcode(insn) + insn->opcode.nbytes;
+}
+static inline int insn_offset_sib(struct insn *insn)
+{
+	return insn_offset_modrm(insn) + insn->modrm.nbytes;
+}
+static inline int insn_offset_displacement(struct insn *insn)
+{
+	return insn_offset_sib(insn) + insn->sib.nbytes;
+}
+static inline int insn_offset_immediate(struct insn *insn)
+{
+	return insn_offset_displacement(insn) + insn->displacement.nbytes;
+}
+
+#endif /* _ASM_X86_INSN_H */
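How insn_vex_m_bits() and insn_vex_p_bits() pick their fields out of a 2-byte
or 3-byte VEX prefix can be shown on raw bytes. This is a minimal sketch, not
part of the patch: the demo_* helpers are hypothetical and take the prefix
bytes directly instead of a struct insn. It assumes vex[0] is the 0xC5 or
0xC4 lead byte, as stored in insn->vex_prefix.bytes[].

```c
#include <assert.h>

/*
 * VEX.pp (implied SIMD prefix): low two bits of the last prefix byte.
 * The value matches the last-prefix ids in inat.h: 1 = 0x66, 2 = 0xF3,
 * 3 = 0xF2, 0 = none.
 */
static unsigned int demo_vex_p(const unsigned char *vex, int nbytes)
{
	assert(nbytes == 2 || nbytes == 3);
	return (nbytes == 2 ? vex[1] : vex[2]) & 0x03;	/* X86_VEX_P() */
}

/*
 * VEX.mmmmm (opcode map select): only encoded in the 3-byte form; the
 * 2-byte form always implies map 1 (the 0x0f escape), cf. X86_VEX2_M.
 */
static unsigned int demo_vex_m(const unsigned char *vex, int nbytes)
{
	assert(nbytes == 2 || nbytes == 3);
	return nbytes == 2 ? 1 : (vex[1] & 0x1f);	/* X86_VEX3_M() */
}
```

For instance, the 2-byte prefix c5 f9 gives m = 1 and p = 1 (implied 0x66),
and the 3-byte prefix c4 e2 79 gives m = 2 and p = 1. These two values are
what inat_get_avx_attribute() uses to index inat_avx_tables[m][p].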
diff --git a/tools/stacktool/arch/x86/insn/x86-opcode-map.txt b/tools/stacktool/arch/x86/insn/x86-opcode-map.txt
new file mode 100644
index 0000000..d388de7
--- /dev/null
+++ b/tools/stacktool/arch/x86/insn/x86-opcode-map.txt
@@ -0,0 +1,984 @@
+# x86 Opcode Maps
+#
+# This is (mostly) based on the following documentation:
+# - Intel(R) 64 and IA-32 Architectures Software Developer's Manual Vol.2C
+#   (#326018-047US, June 2013)
+#
+#<Opcode maps>
+# Table: table-name
+# Referrer: escaped-name
+# AVXcode: avx-code
+# opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
+# (or)
+# opcode: escape # escaped-name
+# EndTable
+#
+#<group maps>
+# GrpTable: GrpXXX
+# reg:  mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
+# EndTable
+#
+# AVX Superscripts
+#  (v): this opcode requires VEX prefix.
+#  (v1): this opcode only supports 128bit VEX.
+#
+# Last Prefix Superscripts
+#  - (66): the last prefix is 0x66
+#  - (F3): the last prefix is 0xF3
+#  - (F2): the last prefix is 0xF2
+#  - (!F3) : the last prefix is not 0xF3 (including non-last prefix case)
+#  - (66&F2): Both 0x66 and 0xF2 prefixes are specified.
+
+Table: one byte opcode
+Referrer:
+AVXcode:
+# 0x00 - 0x0f
+00: ADD Eb,Gb
+01: ADD Ev,Gv
+02: ADD Gb,Eb
+03: ADD Gv,Ev
+04: ADD AL,Ib
+05: ADD rAX,Iz
+06: PUSH ES (i64)
+07: POP ES (i64)
+08: OR Eb,Gb
+09: OR Ev,Gv
+0a: OR Gb,Eb
+0b: OR Gv,Ev
+0c: OR AL,Ib
+0d: OR rAX,Iz
+0e: PUSH CS (i64)
+0f: escape # 2-byte escape
+# 0x10 - 0x1f
+10: ADC Eb,Gb
+11: ADC Ev,Gv
+12: ADC Gb,Eb
+13: ADC Gv,Ev
+14: ADC AL,Ib
+15: ADC rAX,Iz
+16: PUSH SS (i64)
+17: POP SS (i64)
+18: SBB Eb,Gb
+19: SBB Ev,Gv
+1a: SBB Gb,Eb
+1b: SBB Gv,Ev
+1c: SBB AL,Ib
+1d: SBB rAX,Iz
+1e: PUSH DS (i64)
+1f: POP DS (i64)
+# 0x20 - 0x2f
+20: AND Eb,Gb
+21: AND Ev,Gv
+22: AND Gb,Eb
+23: AND Gv,Ev
+24: AND AL,Ib
+25: AND rAx,Iz
+26: SEG=ES (Prefix)
+27: DAA (i64)
+28: SUB Eb,Gb
+29: SUB Ev,Gv
+2a: SUB Gb,Eb
+2b: SUB Gv,Ev
+2c: SUB AL,Ib
+2d: SUB rAX,Iz
+2e: SEG=CS (Prefix)
+2f: DAS (i64)
+# 0x30 - 0x3f
+30: XOR Eb,Gb
+31: XOR Ev,Gv
+32: XOR Gb,Eb
+33: XOR Gv,Ev
+34: XOR AL,Ib
+35: XOR rAX,Iz
+36: SEG=SS (Prefix)
+37: AAA (i64)
+38: CMP Eb,Gb
+39: CMP Ev,Gv
+3a: CMP Gb,Eb
+3b: CMP Gv,Ev
+3c: CMP AL,Ib
+3d: CMP rAX,Iz
+3e: SEG=DS (Prefix)
+3f: AAS (i64)
+# 0x40 - 0x4f
+40: INC eAX (i64) | REX (o64)
+41: INC eCX (i64) | REX.B (o64)
+42: INC eDX (i64) | REX.X (o64)
+43: INC eBX (i64) | REX.XB (o64)
+44: INC eSP (i64) | REX.R (o64)
+45: INC eBP (i64) | REX.RB (o64)
+46: INC eSI (i64) | REX.RX (o64)
+47: INC eDI (i64) | REX.RXB (o64)
+48: DEC eAX (i64) | REX.W (o64)
+49: DEC eCX (i64) | REX.WB (o64)
+4a: DEC eDX (i64) | REX.WX (o64)
+4b: DEC eBX (i64) | REX.WXB (o64)
+4c: DEC eSP (i64) | REX.WR (o64)
+4d: DEC eBP (i64) | REX.WRB (o64)
+4e: DEC eSI (i64) | REX.WRX (o64)
+4f: DEC eDI (i64) | REX.WRXB (o64)
+# 0x50 - 0x5f
+50: PUSH rAX/r8 (d64)
+51: PUSH rCX/r9 (d64)
+52: PUSH rDX/r10 (d64)
+53: PUSH rBX/r11 (d64)
+54: PUSH rSP/r12 (d64)
+55: PUSH rBP/r13 (d64)
+56: PUSH rSI/r14 (d64)
+57: PUSH rDI/r15 (d64)
+58: POP rAX/r8 (d64)
+59: POP rCX/r9 (d64)
+5a: POP rDX/r10 (d64)
+5b: POP rBX/r11 (d64)
+5c: POP rSP/r12 (d64)
+5d: POP rBP/r13 (d64)
+5e: POP rSI/r14 (d64)
+5f: POP rDI/r15 (d64)
+# 0x60 - 0x6f
+60: PUSHA/PUSHAD (i64)
+61: POPA/POPAD (i64)
+62: BOUND Gv,Ma (i64)
+63: ARPL Ew,Gw (i64) | MOVSXD Gv,Ev (o64)
+64: SEG=FS (Prefix)
+65: SEG=GS (Prefix)
+66: Operand-Size (Prefix)
+67: Address-Size (Prefix)
+68: PUSH Iz (d64)
+69: IMUL Gv,Ev,Iz
+6a: PUSH Ib (d64)
+6b: IMUL Gv,Ev,Ib
+6c: INS/INSB Yb,DX
+6d: INS/INSW/INSD Yz,DX
+6e: OUTS/OUTSB DX,Xb
+6f: OUTS/OUTSW/OUTSD DX,Xz
+# 0x70 - 0x7f
+70: JO Jb
+71: JNO Jb
+72: JB/JNAE/JC Jb
+73: JNB/JAE/JNC Jb
+74: JZ/JE Jb
+75: JNZ/JNE Jb
+76: JBE/JNA Jb
+77: JNBE/JA Jb
+78: JS Jb
+79: JNS Jb
+7a: JP/JPE Jb
+7b: JNP/JPO Jb
+7c: JL/JNGE Jb
+7d: JNL/JGE Jb
+7e: JLE/JNG Jb
+7f: JNLE/JG Jb
+# 0x80 - 0x8f
+80: Grp1 Eb,Ib (1A)
+81: Grp1 Ev,Iz (1A)
+82: Grp1 Eb,Ib (1A),(i64)
+83: Grp1 Ev,Ib (1A)
+84: TEST Eb,Gb
+85: TEST Ev,Gv
+86: XCHG Eb,Gb
+87: XCHG Ev,Gv
+88: MOV Eb,Gb
+89: MOV Ev,Gv
+8a: MOV Gb,Eb
+8b: MOV Gv,Ev
+8c: MOV Ev,Sw
+8d: LEA Gv,M
+8e: MOV Sw,Ew
+8f: Grp1A (1A) | POP Ev (d64)
+# 0x90 - 0x9f
+90: NOP | PAUSE (F3) | XCHG r8,rAX
+91: XCHG rCX/r9,rAX
+92: XCHG rDX/r10,rAX
+93: XCHG rBX/r11,rAX
+94: XCHG rSP/r12,rAX
+95: XCHG rBP/r13,rAX
+96: XCHG rSI/r14,rAX
+97: XCHG rDI/r15,rAX
+98: CBW/CWDE/CDQE
+99: CWD/CDQ/CQO
+9a: CALLF Ap (i64)
+9b: FWAIT/WAIT
+9c: PUSHF/D/Q Fv (d64)
+9d: POPF/D/Q Fv (d64)
+9e: SAHF
+9f: LAHF
+# 0xa0 - 0xaf
+a0: MOV AL,Ob
+a1: MOV rAX,Ov
+a2: MOV Ob,AL
+a3: MOV Ov,rAX
+a4: MOVS/B Yb,Xb
+a5: MOVS/W/D/Q Yv,Xv
+a6: CMPS/B Xb,Yb
+a7: CMPS/W/D Xv,Yv
+a8: TEST AL,Ib
+a9: TEST rAX,Iz
+aa: STOS/B Yb,AL
+ab: STOS/W/D/Q Yv,rAX
+ac: LODS/B AL,Xb
+ad: LODS/W/D/Q rAX,Xv
+ae: SCAS/B AL,Yb
+# Note: The May 2011 Intel manual shows Xv for the second parameter of the
+# next instruction but Yv is correct
+af: SCAS/W/D/Q rAX,Yv
+# 0xb0 - 0xbf
+b0: MOV AL/R8L,Ib
+b1: MOV CL/R9L,Ib
+b2: MOV DL/R10L,Ib
+b3: MOV BL/R11L,Ib
+b4: MOV AH/R12L,Ib
+b5: MOV CH/R13L,Ib
+b6: MOV DH/R14L,Ib
+b7: MOV BH/R15L,Ib
+b8: MOV rAX/r8,Iv
+b9: MOV rCX/r9,Iv
+ba: MOV rDX/r10,Iv
+bb: MOV rBX/r11,Iv
+bc: MOV rSP/r12,Iv
+bd: MOV rBP/r13,Iv
+be: MOV rSI/r14,Iv
+bf: MOV rDI/r15,Iv
+# 0xc0 - 0xcf
+c0: Grp2 Eb,Ib (1A)
+c1: Grp2 Ev,Ib (1A)
+c2: RETN Iw (f64)
+c3: RETN
+c4: LES Gz,Mp (i64) | VEX+2byte (Prefix)
+c5: LDS Gz,Mp (i64) | VEX+1byte (Prefix)
+c6: Grp11A Eb,Ib (1A)
+c7: Grp11B Ev,Iz (1A)
+c8: ENTER Iw,Ib
+c9: LEAVE (d64)
+ca: RETF Iw
+cb: RETF
+cc: INT3
+cd: INT Ib
+ce: INTO (i64)
+cf: IRET/D/Q
+# 0xd0 - 0xdf
+d0: Grp2 Eb,1 (1A)
+d1: Grp2 Ev,1 (1A)
+d2: Grp2 Eb,CL (1A)
+d3: Grp2 Ev,CL (1A)
+d4: AAM Ib (i64)
+d5: AAD Ib (i64)
+d6:
+d7: XLAT/XLATB
+d8: ESC
+d9: ESC
+da: ESC
+db: ESC
+dc: ESC
+dd: ESC
+de: ESC
+df: ESC
+# 0xe0 - 0xef
+# Note: "forced64" is Intel CPU behavior: Intel CPUs ignore the 0x66 prefix
+# in 64-bit mode. AMD CPUs accept the 0x66 prefix, which truncates RIP
+# to 16 bits. In 32-bit mode, 0x66 is accepted by both Intel and AMD.
+e0: LOOPNE/LOOPNZ Jb (f64)
+e1: LOOPE/LOOPZ Jb (f64)
+e2: LOOP Jb (f64)
+e3: JrCXZ Jb (f64)
+e4: IN AL,Ib
+e5: IN eAX,Ib
+e6: OUT Ib,AL
+e7: OUT Ib,eAX
+# With the 0x66 prefix in 64-bit mode, AMD CPUs use a 16-bit immediate offset
+# in "near" jumps and calls. For CALL, the pushed return address is 16 bits
+# wide and RSP is decremented by 2, but RSP is not truncated to 16 bits,
+# unlike RIP.
+e8: CALL Jz (f64)
+e9: JMP-near Jz (f64)
+ea: JMP-far Ap (i64)
+eb: JMP-short Jb (f64)
+ec: IN AL,DX
+ed: IN eAX,DX
+ee: OUT DX,AL
+ef: OUT DX,eAX
+# 0xf0 - 0xff
+f0: LOCK (Prefix)
+f1:
+f2: REPNE (Prefix) | XACQUIRE (Prefix)
+f3: REP/REPE (Prefix) | XRELEASE (Prefix)
+f4: HLT
+f5: CMC
+f6: Grp3_1 Eb (1A)
+f7: Grp3_2 Ev (1A)
+f8: CLC
+f9: STC
+fa: CLI
+fb: STI
+fc: CLD
+fd: STD
+fe: Grp4 (1A)
+ff: Grp5 (1A)
+EndTable
+
+Table: 2-byte opcode (0x0f)
+Referrer: 2-byte escape
+AVXcode: 1
+# 0x0f 0x00-0x0f
+00: Grp6 (1A)
+01: Grp7 (1A)
+02: LAR Gv,Ew
+03: LSL Gv,Ew
+04:
+05: SYSCALL (o64)
+06: CLTS
+07: SYSRET (o64)
+08: INVD
+09: WBINVD
+0a:
+0b: UD2 (1B)
+0c:
+# AMD's prefetch group. Intel supports prefetchw(/1) only.
+0d: GrpP
+0e: FEMMS
+# 3DNow! uses the last imm byte as opcode extension.
+0f: 3DNow! Pq,Qq,Ib
+# 0x0f 0x10-0x1f
+# NOTE: According to Intel SDM opcode map, vmovups and vmovupd have no operands
+# but they actually do have operands. Also, vmovss and vmovsd only accept 128bit.
+# MOVSS/MOVSD have too many forms (3) in the SDM. This map just shows a typical form.
+# Many AVX instructions lack the v1 superscript, according to the Intel
+# AVX-Programming Reference A.1
+10: vmovups Vps,Wps | vmovupd Vpd,Wpd (66) | vmovss Vx,Hx,Wss (F3),(v1) | vmovsd Vx,Hx,Wsd (F2),(v1)
+11: vmovups Wps,Vps | vmovupd Wpd,Vpd (66) | vmovss Wss,Hx,Vss (F3),(v1) | vmovsd Wsd,Hx,Vsd (F2),(v1)
+12: vmovlps Vq,Hq,Mq (v1) | vmovhlps Vq,Hq,Uq (v1) | vmovlpd Vq,Hq,Mq (66),(v1) | vmovsldup Vx,Wx (F3) | vmovddup Vx,Wx (F2)
+13: vmovlps Mq,Vq (v1) | vmovlpd Mq,Vq (66),(v1)
+14: vunpcklps Vx,Hx,Wx | vunpcklpd Vx,Hx,Wx (66)
+15: vunpckhps Vx,Hx,Wx | vunpckhpd Vx,Hx,Wx (66)
+16: vmovhps Vdq,Hq,Mq (v1) | vmovlhps Vdq,Hq,Uq (v1) | vmovhpd Vdq,Hq,Mq (66),(v1) | vmovshdup Vx,Wx (F3)
+17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1)
+18: Grp16 (1A)
+19:
+# Intel SDM opcode map does not list MPX instructions. For now using Gv for
+# bnd registers and Ev for everything else is OK because the instruction
+# decoder does not use the information except as an indication that there is
+# a ModR/M byte.
+1a: BNDCL Gv,Ev (F3) | BNDCU Gv,Ev (F2) | BNDMOV Gv,Ev (66) | BNDLDX Gv,Ev
+1b: BNDCN Gv,Ev (F2) | BNDMOV Ev,Gv (66) | BNDMK Gv,Ev (F3) | BNDSTX Ev,Gv
+1c:
+1d:
+1e:
+1f: NOP Ev
+# 0x0f 0x20-0x2f
+20: MOV Rd,Cd
+21: MOV Rd,Dd
+22: MOV Cd,Rd
+23: MOV Dd,Rd
+24:
+25:
+26:
+27:
+28: vmovaps Vps,Wps | vmovapd Vpd,Wpd (66)
+29: vmovaps Wps,Vps | vmovapd Wpd,Vpd (66)
+2a: cvtpi2ps Vps,Qpi | cvtpi2pd Vpd,Qpi (66) | vcvtsi2ss Vss,Hss,Ey (F3),(v1) | vcvtsi2sd Vsd,Hsd,Ey (F2),(v1)
+2b: vmovntps Mps,Vps | vmovntpd Mpd,Vpd (66)
+2c: cvttps2pi Ppi,Wps | cvttpd2pi Ppi,Wpd (66) | vcvttss2si Gy,Wss (F3),(v1) | vcvttsd2si Gy,Wsd (F2),(v1)
+2d: cvtps2pi Ppi,Wps | cvtpd2pi Qpi,Wpd (66) | vcvtss2si Gy,Wss (F3),(v1) | vcvtsd2si Gy,Wsd (F2),(v1)
+2e: vucomiss Vss,Wss (v1) | vucomisd  Vsd,Wsd (66),(v1)
+2f: vcomiss Vss,Wss (v1) | vcomisd  Vsd,Wsd (66),(v1)
+# 0x0f 0x30-0x3f
+30: WRMSR
+31: RDTSC
+32: RDMSR
+33: RDPMC
+34: SYSENTER
+35: SYSEXIT
+36:
+37: GETSEC
+38: escape # 3-byte escape 1
+39:
+3a: escape # 3-byte escape 2
+3b:
+3c:
+3d:
+3e:
+3f:
+# 0x0f 0x40-0x4f
+40: CMOVO Gv,Ev
+41: CMOVNO Gv,Ev
+42: CMOVB/C/NAE Gv,Ev
+43: CMOVAE/NB/NC Gv,Ev
+44: CMOVE/Z Gv,Ev
+45: CMOVNE/NZ Gv,Ev
+46: CMOVBE/NA Gv,Ev
+47: CMOVA/NBE Gv,Ev
+48: CMOVS Gv,Ev
+49: CMOVNS Gv,Ev
+4a: CMOVP/PE Gv,Ev
+4b: CMOVNP/PO Gv,Ev
+4c: CMOVL/NGE Gv,Ev
+4d: CMOVNL/GE Gv,Ev
+4e: CMOVLE/NG Gv,Ev
+4f: CMOVNLE/G Gv,Ev
+# 0x0f 0x50-0x5f
+50: vmovmskps Gy,Ups | vmovmskpd Gy,Upd (66)
+51: vsqrtps Vps,Wps | vsqrtpd Vpd,Wpd (66) | vsqrtss Vss,Hss,Wss (F3),(v1) | vsqrtsd Vsd,Hsd,Wsd (F2),(v1)
+52: vrsqrtps Vps,Wps | vrsqrtss Vss,Hss,Wss (F3),(v1)
+53: vrcpps Vps,Wps | vrcpss Vss,Hss,Wss (F3),(v1)
+54: vandps Vps,Hps,Wps | vandpd Vpd,Hpd,Wpd (66)
+55: vandnps Vps,Hps,Wps | vandnpd Vpd,Hpd,Wpd (66)
+56: vorps Vps,Hps,Wps | vorpd Vpd,Hpd,Wpd (66)
+57: vxorps Vps,Hps,Wps | vxorpd Vpd,Hpd,Wpd (66)
+58: vaddps Vps,Hps,Wps | vaddpd Vpd,Hpd,Wpd (66) | vaddss Vss,Hss,Wss (F3),(v1) | vaddsd Vsd,Hsd,Wsd (F2),(v1)
+59: vmulps Vps,Hps,Wps | vmulpd Vpd,Hpd,Wpd (66) | vmulss Vss,Hss,Wss (F3),(v1) | vmulsd Vsd,Hsd,Wsd (F2),(v1)
+5a: vcvtps2pd Vpd,Wps | vcvtpd2ps Vps,Wpd (66) | vcvtss2sd Vsd,Hx,Wss (F3),(v1) | vcvtsd2ss Vss,Hx,Wsd (F2),(v1)
+5b: vcvtdq2ps Vps,Wdq | vcvtps2dq Vdq,Wps (66) | vcvttps2dq Vdq,Wps (F3)
+5c: vsubps Vps,Hps,Wps | vsubpd Vpd,Hpd,Wpd (66) | vsubss Vss,Hss,Wss (F3),(v1) | vsubsd Vsd,Hsd,Wsd (F2),(v1)
+5d: vminps Vps,Hps,Wps | vminpd Vpd,Hpd,Wpd (66) | vminss Vss,Hss,Wss (F3),(v1) | vminsd Vsd,Hsd,Wsd (F2),(v1)
+5e: vdivps Vps,Hps,Wps | vdivpd Vpd,Hpd,Wpd (66) | vdivss Vss,Hss,Wss (F3),(v1) | vdivsd Vsd,Hsd,Wsd (F2),(v1)
+5f: vmaxps Vps,Hps,Wps | vmaxpd Vpd,Hpd,Wpd (66) | vmaxss Vss,Hss,Wss (F3),(v1) | vmaxsd Vsd,Hsd,Wsd (F2),(v1)
+# 0x0f 0x60-0x6f
+60: punpcklbw Pq,Qd | vpunpcklbw Vx,Hx,Wx (66),(v1)
+61: punpcklwd Pq,Qd | vpunpcklwd Vx,Hx,Wx (66),(v1)
+62: punpckldq Pq,Qd | vpunpckldq Vx,Hx,Wx (66),(v1)
+63: packsswb Pq,Qq | vpacksswb Vx,Hx,Wx (66),(v1)
+64: pcmpgtb Pq,Qq | vpcmpgtb Vx,Hx,Wx (66),(v1)
+65: pcmpgtw Pq,Qq | vpcmpgtw Vx,Hx,Wx (66),(v1)
+66: pcmpgtd Pq,Qq | vpcmpgtd Vx,Hx,Wx (66),(v1)
+67: packuswb Pq,Qq | vpackuswb Vx,Hx,Wx (66),(v1)
+68: punpckhbw Pq,Qd | vpunpckhbw Vx,Hx,Wx (66),(v1)
+69: punpckhwd Pq,Qd | vpunpckhwd Vx,Hx,Wx (66),(v1)
+6a: punpckhdq Pq,Qd | vpunpckhdq Vx,Hx,Wx (66),(v1)
+6b: packssdw Pq,Qd | vpackssdw Vx,Hx,Wx (66),(v1)
+6c: vpunpcklqdq Vx,Hx,Wx (66),(v1)
+6d: vpunpckhqdq Vx,Hx,Wx (66),(v1)
+6e: movd/q Pd,Ey | vmovd/q Vy,Ey (66),(v1)
+6f: movq Pq,Qq | vmovdqa Vx,Wx (66) | vmovdqu Vx,Wx (F3)
+# 0x0f 0x70-0x7f
+70: pshufw Pq,Qq,Ib | vpshufd Vx,Wx,Ib (66),(v1) | vpshufhw Vx,Wx,Ib (F3),(v1) | vpshuflw Vx,Wx,Ib (F2),(v1)
+71: Grp12 (1A)
+72: Grp13 (1A)
+73: Grp14 (1A)
+74: pcmpeqb Pq,Qq | vpcmpeqb Vx,Hx,Wx (66),(v1)
+75: pcmpeqw Pq,Qq | vpcmpeqw Vx,Hx,Wx (66),(v1)
+76: pcmpeqd Pq,Qq | vpcmpeqd Vx,Hx,Wx (66),(v1)
+# Note: Remove (v), because vzeroall and vzeroupper become emms without VEX.
+77: emms | vzeroupper | vzeroall
+78: VMREAD Ey,Gy
+79: VMWRITE Gy,Ey
+7a:
+7b:
+7c: vhaddpd Vpd,Hpd,Wpd (66) | vhaddps Vps,Hps,Wps (F2)
+7d: vhsubpd Vpd,Hpd,Wpd (66) | vhsubps Vps,Hps,Wps (F2)
+7e: movd/q Ey,Pd | vmovd/q Ey,Vy (66),(v1) | vmovq Vq,Wq (F3),(v1)
+7f: movq Qq,Pq | vmovdqa Wx,Vx (66) | vmovdqu Wx,Vx (F3)
+# 0x0f 0x80-0x8f
+# Note: "forced64" is Intel CPU behavior (see comment about CALL insn).
+80: JO Jz (f64)
+81: JNO Jz (f64)
+82: JB/JC/JNAE Jz (f64)
+83: JAE/JNB/JNC Jz (f64)
+84: JE/JZ Jz (f64)
+85: JNE/JNZ Jz (f64)
+86: JBE/JNA Jz (f64)
+87: JA/JNBE Jz (f64)
+88: JS Jz (f64)
+89: JNS Jz (f64)
+8a: JP/JPE Jz (f64)
+8b: JNP/JPO Jz (f64)
+8c: JL/JNGE Jz (f64)
+8d: JNL/JGE Jz (f64)
+8e: JLE/JNG Jz (f64)
+8f: JNLE/JG Jz (f64)
+# 0x0f 0x90-0x9f
+90: SETO Eb
+91: SETNO Eb
+92: SETB/C/NAE Eb
+93: SETAE/NB/NC Eb
+94: SETE/Z Eb
+95: SETNE/NZ Eb
+96: SETBE/NA Eb
+97: SETA/NBE Eb
+98: SETS Eb
+99: SETNS Eb
+9a: SETP/PE Eb
+9b: SETNP/PO Eb
+9c: SETL/NGE Eb
+9d: SETNL/GE Eb
+9e: SETLE/NG Eb
+9f: SETNLE/G Eb
+# 0x0f 0xa0-0xaf
+a0: PUSH FS (d64)
+a1: POP FS (d64)
+a2: CPUID
+a3: BT Ev,Gv
+a4: SHLD Ev,Gv,Ib
+a5: SHLD Ev,Gv,CL
+a6: GrpPDLK
+a7: GrpRNG
+a8: PUSH GS (d64)
+a9: POP GS (d64)
+aa: RSM
+ab: BTS Ev,Gv
+ac: SHRD Ev,Gv,Ib
+ad: SHRD Ev,Gv,CL
+ae: Grp15 (1A),(1C)
+af: IMUL Gv,Ev
+# 0x0f 0xb0-0xbf
+b0: CMPXCHG Eb,Gb
+b1: CMPXCHG Ev,Gv
+b2: LSS Gv,Mp
+b3: BTR Ev,Gv
+b4: LFS Gv,Mp
+b5: LGS Gv,Mp
+b6: MOVZX Gv,Eb
+b7: MOVZX Gv,Ew
+b8: JMPE (!F3) | POPCNT Gv,Ev (F3)
+b9: Grp10 (1A)
+ba: Grp8 Ev,Ib (1A)
+bb: BTC Ev,Gv
+bc: BSF Gv,Ev (!F3) | TZCNT Gv,Ev (F3)
+bd: BSR Gv,Ev (!F3) | LZCNT Gv,Ev (F3)
+be: MOVSX Gv,Eb
+bf: MOVSX Gv,Ew
+# 0x0f 0xc0-0xcf
+c0: XADD Eb,Gb
+c1: XADD Ev,Gv
+c2: vcmpps Vps,Hps,Wps,Ib | vcmppd Vpd,Hpd,Wpd,Ib (66) | vcmpss Vss,Hss,Wss,Ib (F3),(v1) | vcmpsd Vsd,Hsd,Wsd,Ib (F2),(v1)
+c3: movnti My,Gy
+c4: pinsrw Pq,Ry/Mw,Ib | vpinsrw Vdq,Hdq,Ry/Mw,Ib (66),(v1)
+c5: pextrw Gd,Nq,Ib | vpextrw Gd,Udq,Ib (66),(v1)
+c6: vshufps Vps,Hps,Wps,Ib | vshufpd Vpd,Hpd,Wpd,Ib (66)
+c7: Grp9 (1A)
+c8: BSWAP RAX/EAX/R8/R8D
+c9: BSWAP RCX/ECX/R9/R9D
+ca: BSWAP RDX/EDX/R10/R10D
+cb: BSWAP RBX/EBX/R11/R11D
+cc: BSWAP RSP/ESP/R12/R12D
+cd: BSWAP RBP/EBP/R13/R13D
+ce: BSWAP RSI/ESI/R14/R14D
+cf: BSWAP RDI/EDI/R15/R15D
+# 0x0f 0xd0-0xdf
+d0: vaddsubpd Vpd,Hpd,Wpd (66) | vaddsubps Vps,Hps,Wps (F2)
+d1: psrlw Pq,Qq | vpsrlw Vx,Hx,Wx (66),(v1)
+d2: psrld Pq,Qq | vpsrld Vx,Hx,Wx (66),(v1)
+d3: psrlq Pq,Qq | vpsrlq Vx,Hx,Wx (66),(v1)
+d4: paddq Pq,Qq | vpaddq Vx,Hx,Wx (66),(v1)
+d5: pmullw Pq,Qq | vpmullw Vx,Hx,Wx (66),(v1)
+d6: vmovq Wq,Vq (66),(v1) | movq2dq Vdq,Nq (F3) | movdq2q Pq,Uq (F2)
+d7: pmovmskb Gd,Nq | vpmovmskb Gd,Ux (66),(v1)
+d8: psubusb Pq,Qq | vpsubusb Vx,Hx,Wx (66),(v1)
+d9: psubusw Pq,Qq | vpsubusw Vx,Hx,Wx (66),(v1)
+da: pminub Pq,Qq | vpminub Vx,Hx,Wx (66),(v1)
+db: pand Pq,Qq | vpand Vx,Hx,Wx (66),(v1)
+dc: paddusb Pq,Qq | vpaddusb Vx,Hx,Wx (66),(v1)
+dd: paddusw Pq,Qq | vpaddusw Vx,Hx,Wx (66),(v1)
+de: pmaxub Pq,Qq | vpmaxub Vx,Hx,Wx (66),(v1)
+df: pandn Pq,Qq | vpandn Vx,Hx,Wx (66),(v1)
+# 0x0f 0xe0-0xef
+e0: pavgb Pq,Qq | vpavgb Vx,Hx,Wx (66),(v1)
+e1: psraw Pq,Qq | vpsraw Vx,Hx,Wx (66),(v1)
+e2: psrad Pq,Qq | vpsrad Vx,Hx,Wx (66),(v1)
+e3: pavgw Pq,Qq | vpavgw Vx,Hx,Wx (66),(v1)
+e4: pmulhuw Pq,Qq | vpmulhuw Vx,Hx,Wx (66),(v1)
+e5: pmulhw Pq,Qq | vpmulhw Vx,Hx,Wx (66),(v1)
+e6: vcvttpd2dq Vx,Wpd (66) | vcvtdq2pd Vx,Wdq (F3) | vcvtpd2dq Vx,Wpd (F2)
+e7: movntq Mq,Pq | vmovntdq Mx,Vx (66)
+e8: psubsb Pq,Qq | vpsubsb Vx,Hx,Wx (66),(v1)
+e9: psubsw Pq,Qq | vpsubsw Vx,Hx,Wx (66),(v1)
+ea: pminsw Pq,Qq | vpminsw Vx,Hx,Wx (66),(v1)
+eb: por Pq,Qq | vpor Vx,Hx,Wx (66),(v1)
+ec: paddsb Pq,Qq | vpaddsb Vx,Hx,Wx (66),(v1)
+ed: paddsw Pq,Qq | vpaddsw Vx,Hx,Wx (66),(v1)
+ee: pmaxsw Pq,Qq | vpmaxsw Vx,Hx,Wx (66),(v1)
+ef: pxor Pq,Qq | vpxor Vx,Hx,Wx (66),(v1)
+# 0x0f 0xf0-0xff
+f0: vlddqu Vx,Mx (F2)
+f1: psllw Pq,Qq | vpsllw Vx,Hx,Wx (66),(v1)
+f2: pslld Pq,Qq | vpslld Vx,Hx,Wx (66),(v1)
+f3: psllq Pq,Qq | vpsllq Vx,Hx,Wx (66),(v1)
+f4: pmuludq Pq,Qq | vpmuludq Vx,Hx,Wx (66),(v1)
+f5: pmaddwd Pq,Qq | vpmaddwd Vx,Hx,Wx (66),(v1)
+f6: psadbw Pq,Qq | vpsadbw Vx,Hx,Wx (66),(v1)
+f7: maskmovq Pq,Nq | vmaskmovdqu Vx,Ux (66),(v1)
+f8: psubb Pq,Qq | vpsubb Vx,Hx,Wx (66),(v1)
+f9: psubw Pq,Qq | vpsubw Vx,Hx,Wx (66),(v1)
+fa: psubd Pq,Qq | vpsubd Vx,Hx,Wx (66),(v1)
+fb: psubq Pq,Qq | vpsubq Vx,Hx,Wx (66),(v1)
+fc: paddb Pq,Qq | vpaddb Vx,Hx,Wx (66),(v1)
+fd: paddw Pq,Qq | vpaddw Vx,Hx,Wx (66),(v1)
+fe: paddd Pq,Qq | vpaddd Vx,Hx,Wx (66),(v1)
+ff:
+EndTable
+
+Table: 3-byte opcode 1 (0x0f 0x38)
+Referrer: 3-byte escape 1
+AVXcode: 2
+# 0x0f 0x38 0x00-0x0f
+00: pshufb Pq,Qq | vpshufb Vx,Hx,Wx (66),(v1)
+01: phaddw Pq,Qq | vphaddw Vx,Hx,Wx (66),(v1)
+02: phaddd Pq,Qq | vphaddd Vx,Hx,Wx (66),(v1)
+03: phaddsw Pq,Qq | vphaddsw Vx,Hx,Wx (66),(v1)
+04: pmaddubsw Pq,Qq | vpmaddubsw Vx,Hx,Wx (66),(v1)
+05: phsubw Pq,Qq | vphsubw Vx,Hx,Wx (66),(v1)
+06: phsubd Pq,Qq | vphsubd Vx,Hx,Wx (66),(v1)
+07: phsubsw Pq,Qq | vphsubsw Vx,Hx,Wx (66),(v1)
+08: psignb Pq,Qq | vpsignb Vx,Hx,Wx (66),(v1)
+09: psignw Pq,Qq | vpsignw Vx,Hx,Wx (66),(v1)
+0a: psignd Pq,Qq | vpsignd Vx,Hx,Wx (66),(v1)
+0b: pmulhrsw Pq,Qq | vpmulhrsw Vx,Hx,Wx (66),(v1)
+0c: vpermilps Vx,Hx,Wx (66),(v)
+0d: vpermilpd Vx,Hx,Wx (66),(v)
+0e: vtestps Vx,Wx (66),(v)
+0f: vtestpd Vx,Wx (66),(v)
+# 0x0f 0x38 0x10-0x1f
+10: pblendvb Vdq,Wdq (66)
+11:
+12:
+13: vcvtph2ps Vx,Wx,Ib (66),(v)
+14: blendvps Vdq,Wdq (66)
+15: blendvpd Vdq,Wdq (66)
+16: vpermps Vqq,Hqq,Wqq (66),(v)
+17: vptest Vx,Wx (66)
+18: vbroadcastss Vx,Wd (66),(v)
+19: vbroadcastsd Vqq,Wq (66),(v)
+1a: vbroadcastf128 Vqq,Mdq (66),(v)
+1b:
+1c: pabsb Pq,Qq | vpabsb Vx,Wx (66),(v1)
+1d: pabsw Pq,Qq | vpabsw Vx,Wx (66),(v1)
+1e: pabsd Pq,Qq | vpabsd Vx,Wx (66),(v1)
+1f:
+# 0x0f 0x38 0x20-0x2f
+20: vpmovsxbw Vx,Ux/Mq (66),(v1)
+21: vpmovsxbd Vx,Ux/Md (66),(v1)
+22: vpmovsxbq Vx,Ux/Mw (66),(v1)
+23: vpmovsxwd Vx,Ux/Mq (66),(v1)
+24: vpmovsxwq Vx,Ux/Md (66),(v1)
+25: vpmovsxdq Vx,Ux/Mq (66),(v1)
+26:
+27:
+28: vpmuldq Vx,Hx,Wx (66),(v1)
+29: vpcmpeqq Vx,Hx,Wx (66),(v1)
+2a: vmovntdqa Vx,Mx (66),(v1)
+2b: vpackusdw Vx,Hx,Wx (66),(v1)
+2c: vmaskmovps Vx,Hx,Mx (66),(v)
+2d: vmaskmovpd Vx,Hx,Mx (66),(v)
+2e: vmaskmovps Mx,Hx,Vx (66),(v)
+2f: vmaskmovpd Mx,Hx,Vx (66),(v)
+# 0x0f 0x38 0x30-0x3f
+30: vpmovzxbw Vx,Ux/Mq (66),(v1)
+31: vpmovzxbd Vx,Ux/Md (66),(v1)
+32: vpmovzxbq Vx,Ux/Mw (66),(v1)
+33: vpmovzxwd Vx,Ux/Mq (66),(v1)
+34: vpmovzxwq Vx,Ux/Md (66),(v1)
+35: vpmovzxdq Vx,Ux/Mq (66),(v1)
+36: vpermd Vqq,Hqq,Wqq (66),(v)
+37: vpcmpgtq Vx,Hx,Wx (66),(v1)
+38: vpminsb Vx,Hx,Wx (66),(v1)
+39: vpminsd Vx,Hx,Wx (66),(v1)
+3a: vpminuw Vx,Hx,Wx (66),(v1)
+3b: vpminud Vx,Hx,Wx (66),(v1)
+3c: vpmaxsb Vx,Hx,Wx (66),(v1)
+3d: vpmaxsd Vx,Hx,Wx (66),(v1)
+3e: vpmaxuw Vx,Hx,Wx (66),(v1)
+3f: vpmaxud Vx,Hx,Wx (66),(v1)
+# 0x0f 0x38 0x40-0x8f
+40: vpmulld Vx,Hx,Wx (66),(v1)
+41: vphminposuw Vdq,Wdq (66),(v1)
+42:
+43:
+44:
+45: vpsrlvd/q Vx,Hx,Wx (66),(v)
+46: vpsravd Vx,Hx,Wx (66),(v)
+47: vpsllvd/q Vx,Hx,Wx (66),(v)
+# Skip 0x48-0x57
+58: vpbroadcastd Vx,Wx (66),(v)
+59: vpbroadcastq Vx,Wx (66),(v)
+5a: vbroadcasti128 Vqq,Mdq (66),(v)
+# Skip 0x5b-0x77
+78: vpbroadcastb Vx,Wx (66),(v)
+79: vpbroadcastw Vx,Wx (66),(v)
+# Skip 0x7a-0x7f
+80: INVEPT Gy,Mdq (66)
+81: INVPID Gy,Mdq (66)
+82: INVPCID Gy,Mdq (66)
+8c: vpmaskmovd/q Vx,Hx,Mx (66),(v)
+8e: vpmaskmovd/q Mx,Vx,Hx (66),(v)
+# 0x0f 0x38 0x90-0xbf (FMA)
+90: vgatherdd/q Vx,Hx,Wx (66),(v)
+91: vgatherqd/q Vx,Hx,Wx (66),(v)
+92: vgatherdps/d Vx,Hx,Wx (66),(v)
+93: vgatherqps/d Vx,Hx,Wx (66),(v)
+94:
+95:
+96: vfmaddsub132ps/d Vx,Hx,Wx (66),(v)
+97: vfmsubadd132ps/d Vx,Hx,Wx (66),(v)
+98: vfmadd132ps/d Vx,Hx,Wx (66),(v)
+99: vfmadd132ss/d Vx,Hx,Wx (66),(v),(v1)
+9a: vfmsub132ps/d Vx,Hx,Wx (66),(v)
+9b: vfmsub132ss/d Vx,Hx,Wx (66),(v),(v1)
+9c: vfnmadd132ps/d Vx,Hx,Wx (66),(v)
+9d: vfnmadd132ss/d Vx,Hx,Wx (66),(v),(v1)
+9e: vfnmsub132ps/d Vx,Hx,Wx (66),(v)
+9f: vfnmsub132ss/d Vx,Hx,Wx (66),(v),(v1)
+a6: vfmaddsub213ps/d Vx,Hx,Wx (66),(v)
+a7: vfmsubadd213ps/d Vx,Hx,Wx (66),(v)
+a8: vfmadd213ps/d Vx,Hx,Wx (66),(v)
+a9: vfmadd213ss/d Vx,Hx,Wx (66),(v),(v1)
+aa: vfmsub213ps/d Vx,Hx,Wx (66),(v)
+ab: vfmsub213ss/d Vx,Hx,Wx (66),(v),(v1)
+ac: vfnmadd213ps/d Vx,Hx,Wx (66),(v)
+ad: vfnmadd213ss/d Vx,Hx,Wx (66),(v),(v1)
+ae: vfnmsub213ps/d Vx,Hx,Wx (66),(v)
+af: vfnmsub213ss/d Vx,Hx,Wx (66),(v),(v1)
+b6: vfmaddsub231ps/d Vx,Hx,Wx (66),(v)
+b7: vfmsubadd231ps/d Vx,Hx,Wx (66),(v)
+b8: vfmadd231ps/d Vx,Hx,Wx (66),(v)
+b9: vfmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
+ba: vfmsub231ps/d Vx,Hx,Wx (66),(v)
+bb: vfmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
+bc: vfnmadd231ps/d Vx,Hx,Wx (66),(v)
+bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
+be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
+bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
+# 0x0f 0x38 0xc0-0xff
+c8: sha1nexte Vdq,Wdq
+c9: sha1msg1 Vdq,Wdq
+ca: sha1msg2 Vdq,Wdq
+cb: sha256rnds2 Vdq,Wdq
+cc: sha256msg1 Vdq,Wdq
+cd: sha256msg2 Vdq,Wdq
+db: VAESIMC Vdq,Wdq (66),(v1)
+dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
+dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
+de: VAESDEC Vdq,Hdq,Wdq (66),(v1)
+df: VAESDECLAST Vdq,Hdq,Wdq (66),(v1)
+f0: MOVBE Gy,My | MOVBE Gw,Mw (66) | CRC32 Gd,Eb (F2) | CRC32 Gd,Eb (66&F2)
+f1: MOVBE My,Gy | MOVBE Mw,Gw (66) | CRC32 Gd,Ey (F2) | CRC32 Gd,Ew (66&F2)
+f2: ANDN Gy,By,Ey (v)
+f3: Grp17 (1A)
+f5: BZHI Gy,Ey,By (v) | PEXT Gy,By,Ey (F3),(v) | PDEP Gy,By,Ey (F2),(v)
+f6: ADCX Gy,Ey (66) | ADOX Gy,Ey (F3) | MULX By,Gy,rDX,Ey (F2),(v)
+f7: BEXTR Gy,Ey,By (v) | SHLX Gy,Ey,By (66),(v) | SARX Gy,Ey,By (F3),(v) | SHRX Gy,Ey,By (F2),(v)
+EndTable
+
+Table: 3-byte opcode 2 (0x0f 0x3a)
+Referrer: 3-byte escape 2
+AVXcode: 3
+# 0x0f 0x3a 0x00-0xff
+00: vpermq Vqq,Wqq,Ib (66),(v)
+01: vpermpd Vqq,Wqq,Ib (66),(v)
+02: vpblendd Vx,Hx,Wx,Ib (66),(v)
+03:
+04: vpermilps Vx,Wx,Ib (66),(v)
+05: vpermilpd Vx,Wx,Ib (66),(v)
+06: vperm2f128 Vqq,Hqq,Wqq,Ib (66),(v)
+07:
+08: vroundps Vx,Wx,Ib (66)
+09: vroundpd Vx,Wx,Ib (66)
+0a: vroundss Vss,Wss,Ib (66),(v1)
+0b: vroundsd Vsd,Wsd,Ib (66),(v1)
+0c: vblendps Vx,Hx,Wx,Ib (66)
+0d: vblendpd Vx,Hx,Wx,Ib (66)
+0e: vpblendw Vx,Hx,Wx,Ib (66),(v1)
+0f: palignr Pq,Qq,Ib | vpalignr Vx,Hx,Wx,Ib (66),(v1)
+14: vpextrb Rd/Mb,Vdq,Ib (66),(v1)
+15: vpextrw Rd/Mw,Vdq,Ib (66),(v1)
+16: vpextrd/q Ey,Vdq,Ib (66),(v1)
+17: vextractps Ed,Vdq,Ib (66),(v1)
+18: vinsertf128 Vqq,Hqq,Wqq,Ib (66),(v)
+19: vextractf128 Wdq,Vqq,Ib (66),(v)
+1d: vcvtps2ph Wx,Vx,Ib (66),(v)
+20: vpinsrb Vdq,Hdq,Ry/Mb,Ib (66),(v1)
+21: vinsertps Vdq,Hdq,Udq/Md,Ib (66),(v1)
+22: vpinsrd/q Vdq,Hdq,Ey,Ib (66),(v1)
+38: vinserti128 Vqq,Hqq,Wqq,Ib (66),(v)
+39: vextracti128 Wdq,Vqq,Ib (66),(v)
+40: vdpps Vx,Hx,Wx,Ib (66)
+41: vdppd Vdq,Hdq,Wdq,Ib (66),(v1)
+42: vmpsadbw Vx,Hx,Wx,Ib (66),(v1)
+44: vpclmulqdq Vdq,Hdq,Wdq,Ib (66),(v1)
+46: vperm2i128 Vqq,Hqq,Wqq,Ib (66),(v)
+4a: vblendvps Vx,Hx,Wx,Lx (66),(v)
+4b: vblendvpd Vx,Hx,Wx,Lx (66),(v)
+4c: vpblendvb Vx,Hx,Wx,Lx (66),(v1)
+60: vpcmpestrm Vdq,Wdq,Ib (66),(v1)
+61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
+62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
+63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
+cc: sha1rnds4 Vdq,Wdq,Ib
+df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
+f0: RORX Gy,Ey,Ib (F2),(v)
+EndTable
+
+GrpTable: Grp1
+0: ADD
+1: OR
+2: ADC
+3: SBB
+4: AND
+5: SUB
+6: XOR
+7: CMP
+EndTable
+
+GrpTable: Grp1A
+0: POP
+EndTable
+
+GrpTable: Grp2
+0: ROL
+1: ROR
+2: RCL
+3: RCR
+4: SHL/SAL
+5: SHR
+6:
+7: SAR
+EndTable
+
+GrpTable: Grp3_1
+0: TEST Eb,Ib
+1:
+2: NOT Eb
+3: NEG Eb
+4: MUL AL,Eb
+5: IMUL AL,Eb
+6: DIV AL,Eb
+7: IDIV AL,Eb
+EndTable
+
+GrpTable: Grp3_2
+0: TEST Ev,Iz
+1:
+2: NOT Ev
+3: NEG Ev
+4: MUL rAX,Ev
+5: IMUL rAX,Ev
+6: DIV rAX,Ev
+7: IDIV rAX,Ev
+EndTable
+
+GrpTable: Grp4
+0: INC Eb
+1: DEC Eb
+EndTable
+
+GrpTable: Grp5
+0: INC Ev
+1: DEC Ev
+# Note: "forced64" is Intel CPU behavior (see comment about CALL insn).
+2: CALLN Ev (f64)
+3: CALLF Ep
+4: JMPN Ev (f64)
+5: JMPF Mp
+6: PUSH Ev (d64)
+7:
+EndTable
+
+GrpTable: Grp6
+0: SLDT Rv/Mw
+1: STR Rv/Mw
+2: LLDT Ew
+3: LTR Ew
+4: VERR Ew
+5: VERW Ew
+EndTable
+
+GrpTable: Grp7
+0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B)
+1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B)
+2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101),(11B) | XTEST (110),(11B)
+3: LIDT Ms
+4: SMSW Mw/Rv
+5: rdpkru (110),(11B) | wrpkru (111),(11B)
+6: LMSW Ew
+7: INVLPG Mb | SWAPGS (o64),(000),(11B) | RDTSCP (001),(11B)
+EndTable
+
+GrpTable: Grp8
+4: BT
+5: BTS
+6: BTR
+7: BTC
+EndTable
+
+GrpTable: Grp9
+1: CMPXCHG8B/16B Mq/Mdq
+3: xrstors
+4: xsavec
+5: xsaves
+6: VMPTRLD Mq | VMCLEAR Mq (66) | VMXON Mq (F3) | RDRAND Rv (11B)
+7: VMPTRST Mq | VMPTRST Mq (F3) | RDSEED Rv (11B)
+EndTable
+
+GrpTable: Grp10
+EndTable
+
+# Grp11A and Grp11B are expressed as Grp11 in Intel SDM
+GrpTable: Grp11A
+0: MOV Eb,Ib
+7: XABORT Ib (000),(11B)
+EndTable
+
+GrpTable: Grp11B
+0: MOV Ev,Iz
+7: XBEGIN Jz (000),(11B)
+EndTable
+
+GrpTable: Grp12
+2: psrlw Nq,Ib (11B) | vpsrlw Hx,Ux,Ib (66),(11B),(v1)
+4: psraw Nq,Ib (11B) | vpsraw Hx,Ux,Ib (66),(11B),(v1)
+6: psllw Nq,Ib (11B) | vpsllw Hx,Ux,Ib (66),(11B),(v1)
+EndTable
+
+GrpTable: Grp13
+2: psrld Nq,Ib (11B) | vpsrld Hx,Ux,Ib (66),(11B),(v1)
+4: psrad Nq,Ib (11B) | vpsrad Hx,Ux,Ib (66),(11B),(v1)
+6: pslld Nq,Ib (11B) | vpslld Hx,Ux,Ib (66),(11B),(v1)
+EndTable
+
+GrpTable: Grp14
+2: psrlq Nq,Ib (11B) | vpsrlq Hx,Ux,Ib (66),(11B),(v1)
+3: vpsrldq Hx,Ux,Ib (66),(11B),(v1)
+6: psllq Nq,Ib (11B) | vpsllq Hx,Ux,Ib (66),(11B),(v1)
+7: vpslldq Hx,Ux,Ib (66),(11B),(v1)
+EndTable
+
+GrpTable: Grp15
+0: fxsave | RDFSBASE Ry (F3),(11B)
+1: fxrstor | RDGSBASE Ry (F3),(11B)
+2: vldmxcsr Md (v1) | WRFSBASE Ry (F3),(11B)
+3: vstmxcsr Md (v1) | WRGSBASE Ry (F3),(11B)
+4: XSAVE
+5: XRSTOR | lfence (11B)
+6: XSAVEOPT | clwb (66) | mfence (11B)
+7: clflush | clflushopt (66) | sfence (11B) | pcommit (66),(11B)
+EndTable
+
+GrpTable: Grp16
+0: prefetch NTA
+1: prefetch T0
+2: prefetch T1
+3: prefetch T2
+EndTable
+
+GrpTable: Grp17
+1: BLSR By,Ey (v)
+2: BLSMSK By,Ey (v)
+3: BLSI By,Ey (v)
+EndTable
+
+# AMD's Prefetch Group
+GrpTable: GrpP
+0: PREFETCH
+1: PREFETCHW
+EndTable
+
+GrpTable: GrpPDLK
+0: MONTMUL
+1: XSHA1
+2: XSHA2
+EndTable
+
+GrpTable: GrpRNG
+0: xstore-rng
+1: xcrypt-ecb
+2: xcrypt-cbc
+4: xcrypt-cfb
+5: xcrypt-ofb
+EndTable
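
Reviewer note on the map syntax documented at the top of x86-opcode-map.txt ("opcode: mnemonic|GrpXXX [operands] [(extra)...] [| 2nd-mnemonic ...]"): the sketch below shows how a single raw table line (without the patch's leading "+") splits into an opcode byte, the first mnemonic, and an (f64) flag. This is not how the build consumes the file. struct map_entry and parse_map_line() are hypothetical names made up for illustration, and only the first "|" variant of a line is examined.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record describing the first variant of one map line. */
struct map_entry {
	unsigned int opcode;	/* opcode byte, 0x00-0xff */
	char mnemonic[32];	/* first mnemonic, "" for an empty slot */
	int f64;		/* 1 if the first variant carries (f64) */
};

/*
 * Parse one "opcode: mnemonic ..." line.
 * Returns 0 on success, -1 for comments, blank lines and directives
 * such as "Table:", "Referrer:", "AVXcode:" or "EndTable".
 */
static int parse_map_line(const char *line, struct map_entry *e)
{
	char first[128];
	const char *p, *bar;
	char *end;
	unsigned long op;
	size_t len;

	if (line[0] == '#' || line[0] == '\0' || line[0] == '\n')
		return -1;

	/* The opcode is a hex number immediately followed by ':'. */
	op = strtoul(line, &end, 16);
	if (end == line || *end != ':' || op > 0xff)
		return -1;

	/* Skip the ':' and any blanks after it. */
	p = end + 1;
	while (*p == ' ' || *p == '\t')
		p++;

	/* Keep only the text before the first '|' (the first variant). */
	bar = strchr(p, '|');
	len = bar ? (size_t)(bar - p) : strlen(p);
	if (len >= sizeof(first))
		len = sizeof(first) - 1;
	memcpy(first, p, len);
	first[len] = '\0';

	e->opcode = (unsigned int)op;
	e->mnemonic[0] = '\0';
	sscanf(first, "%31s", e->mnemonic);	/* stays "" on an empty slot */
	e->f64 = strstr(first, "(f64)") != NULL;

	return 0;
}
```

Directive lines are rejected because the hex prefix must be directly followed by ':'. "EndTable" and "AVXcode:" both start with a valid hex digit but fail that check on the next character.
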
diff --git a/tools/stacktool/builtin-check.c b/tools/stacktool/builtin-check.c
new file mode 100644
index 0000000..5b0e91f
--- /dev/null
+++ b/tools/stacktool/builtin-check.c
@@ -0,0 +1,991 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * stacktool check:
+ *
+ * This command analyzes every .o file and ensures the validity of its stack
+ * trace metadata.  It enforces a set of rules on asm code and C inline
+ * assembly code so that stack traces can be reliable.
+ *
+ * For more information, see tools/stacktool/Documentation/stack-validation.txt.
+ */
+
+#include <string.h>
+#include <subcmd/parse-options.h>
+
+#include "builtin.h"
+#include "elf.h"
+#include "special.h"
+#include "arch.h"
+#include "warn.h"
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+#define STATE_FP_SAVED		0x1
+#define STATE_FP_SETUP		0x2
+#define STATE_FENTRY		0x4
+
+struct instruction {
+	struct list_head list;
+	struct section *sec;
+	unsigned long offset;
+	unsigned int len, state;
+	unsigned char type;
+	unsigned long immediate;
+	bool alt_group, visited;
+	struct symbol *call_dest;
+	struct instruction *jump_dest;
+	struct list_head alts;
+};
+
+struct alternative {
+	struct list_head list;
+	struct instruction *insn;
+};
+
+struct stacktool_file {
+	struct elf *elf;
+	struct list_head insns;
+};
+
+const char *objname;
+static bool nofp;
+
+static struct instruction *find_instruction(struct stacktool_file *file,
+					    struct section *sec,
+					    unsigned long offset)
+{
+	struct instruction *insn;
+
+	list_for_each_entry(insn, &file->insns, list)
+		if (insn->sec == sec && insn->offset == offset)
+			return insn;
+
+	return NULL;
+}
+
+/*
+ * Check if the function has been manually whitelisted with the
+ * STACKTOOL_IGNORE_FUNC macro, or if it should be automatically whitelisted
+ * due to its use of a context switching instruction.
+ */
+static bool ignore_func(struct stacktool_file *file, struct symbol *func)
+{
+	struct section *macro_sec;
+	struct rela *rela;
+	struct instruction *insn;
+
+	/* check for STACKTOOL_IGNORE_FUNC */
+	macro_sec = find_section_by_name(file->elf, "__stacktool_ignore_func");
+	if (macro_sec && macro_sec->rela)
+		list_for_each_entry(rela, &macro_sec->rela->relas, list)
+			if (rela->sym->sec == func->sec &&
+			    rela->addend == func->offset)
+				return true;
+
+	/* check if it has a context switching instruction */
+	insn = find_instruction(file, func->sec, func->offset);
+	if (!insn)
+		return false;
+	list_for_each_entry_from(insn, &file->insns, list) {
+		if (insn->sec != func->sec ||
+		    insn->offset >= func->offset + func->len)
+			break;
+		if (insn->type == INSN_CONTEXT_SWITCH)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * This checks to see if the given function is a "noreturn" function.
+ *
+ * For global functions which are outside the scope of this object file, we
+ * have to keep a manual list of them.
+ *
+ * For local functions, we have to detect them manually by simply looking for
+ * the lack of a return instruction.
+ */
+static bool dead_end_function(struct stacktool_file *file, struct symbol *func)
+{
+	int i;
+	struct instruction *insn;
+	bool empty = true;
+
+	/*
+	 * Unfortunately these have to be hard coded because the noreturn
+	 * attribute isn't provided in ELF data.
+	 */
+	static const char * const global_noreturns[] = {
+		"__stack_chk_fail",
+		"panic",
+		"do_exit",
+		"__module_put_and_exit",
+		"complete_and_exit",
+		"kvm_spurious_fault",
+	};
+
+	if (func->bind == STB_WEAK)
+		return false;
+
+	if (func->bind == STB_GLOBAL)
+		for (i = 0; i < ARRAY_SIZE(global_noreturns); i++)
+			if (!strcmp(func->name, global_noreturns[i]))
+				return true;
+
+	if (!func->sec)
+		return false;
+
+	insn = find_instruction(file, func->sec, func->offset);
+	if (!insn)
+		return false;
+
+	list_for_each_entry_from(insn, &file->insns, list) {
+		if (insn->sec != func->sec ||
+		    insn->offset >= func->offset + func->len)
+			break;
+
+		empty = false;
+
+		if (insn->type == INSN_RETURN)
+			return false;
+
+		if (insn->type == INSN_JUMP_UNCONDITIONAL) {
+			struct instruction *dest = insn->jump_dest;
+			struct symbol *dest_func;
+
+			if (!dest)
+				/* sibling call to another file */
+				return false;
+
+			if (dest->sec != func->sec ||
+			    dest->offset < func->offset ||
+			    dest->offset >= func->offset + func->len) {
+				/* local sibling call */
+				dest_func = find_symbol_by_offset(dest->sec,
+								  dest->offset);
+				if (!dest_func)
+					continue;
+
+				return dead_end_function(file, dest_func);
+			}
+		}
+
+		if (insn->type == INSN_JUMP_DYNAMIC)
+			/* sibling call */
+			return false;
+	}
+
+	return !empty;
+}
+
+/*
+ * Call the arch-specific instruction decoder for all the instructions and add
+ * them to the file's insns list.
+ */
+static int decode_instructions(struct stacktool_file *file)
+{
+	struct section *sec;
+	unsigned long offset;
+	struct instruction *insn;
+	int ret;
+
+	INIT_LIST_HEAD(&file->insns);
+
+	list_for_each_entry(sec, &file->elf->sections, list) {
+
+		if (!(sec->sh.sh_flags & SHF_EXECINSTR))
+			continue;
+
+		for (offset = 0; offset < sec->len; offset += insn->len) {
+			insn = malloc(sizeof(*insn));
+			memset(insn, 0, sizeof(*insn));
+
+			INIT_LIST_HEAD(&insn->alts);
+			insn->sec = sec;
+			insn->offset = offset;
+
+			ret = arch_decode_instruction(file->elf, sec, offset,
+						      sec->len - offset,
+						      &insn->len, &insn->type,
+						      &insn->immediate);
+			if (ret)
+				return ret;
+
+			if (!insn->type || insn->type > INSN_LAST) {
+				WARN_FUNC("invalid instruction type %d",
+					  insn->sec, insn->offset, insn->type);
+				return -1;
+			}
+
+			list_add_tail(&insn->list, &file->insns);
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Warnings shouldn't be reported for ignored functions.
+ */
+static void get_ignores(struct stacktool_file *file)
+{
+	struct instruction *insn;
+	struct section *sec;
+	struct symbol *func;
+
+	list_for_each_entry(sec, &file->elf->sections, list) {
+		list_for_each_entry(func, &sec->symbols, list) {
+			if (func->type != STT_FUNC)
+				continue;
+
+			if (!ignore_func(file, func))
+				continue;
+
+			insn = find_instruction(file, sec, func->offset);
+			if (!insn)
+				continue;
+
+			list_for_each_entry_from(insn, &file->insns, list) {
+				if (insn->sec != func->sec ||
+				    insn->offset >= func->offset + func->len)
+					break;
+
+				insn->visited = true;
+			}
+		}
+	}
+}
+
+/*
+ * Find the destination instructions for all jumps.
+ */
+static int get_jump_destinations(struct stacktool_file *file)
+{
+	struct instruction *insn;
+	struct rela *rela;
+	struct section *dest_sec;
+	unsigned long dest_off;
+
+	list_for_each_entry(insn, &file->insns, list) {
+		if (insn->type != INSN_JUMP_CONDITIONAL &&
+		    insn->type != INSN_JUMP_UNCONDITIONAL)
+			continue;
+
+		/* skip ignores */
+		if (insn->visited)
+			continue;
+
+		rela = find_rela_by_dest_range(insn->sec, insn->offset,
+					       insn->len);
+		if (!rela) {
+			dest_sec = insn->sec;
+			dest_off = insn->offset + insn->len + insn->immediate;
+		} else if (rela->sym->type == STT_SECTION) {
+			dest_sec = rela->sym->sec;
+			dest_off = rela->addend + 4;
+		} else if (rela->sym->sec->idx) {
+			dest_sec = rela->sym->sec;
+			dest_off = rela->sym->sym.st_value + rela->addend + 4;
+		} else {
+			/*
+			 * This error (jump to another file) will be handled
+			 * later in validate_functions().
+			 */
+			continue;
+		}
+
+		insn->jump_dest = find_instruction(file, dest_sec, dest_off);
+		if (!insn->jump_dest) {
+
+			/*
+			 * This is a special case where an alt instruction
+			 * jumps past the end of the section.  These are
+			 * handled later.
+			 */
+			if (!strcmp(insn->sec->name, ".altinstr_replacement"))
+				continue;
+
+			WARN_FUNC("can't find jump dest instruction at %s+0x%lx",
+				  insn->sec, insn->offset, dest_sec->name,
+				  dest_off);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Find the destination instructions for all calls.
+ */
+static int get_call_destinations(struct stacktool_file *file)
+{
+	struct instruction *insn;
+	unsigned long dest_off;
+	struct rela *rela;
+
+	list_for_each_entry(insn, &file->insns, list) {
+		if (insn->type != INSN_CALL)
+			continue;
+
+		rela = find_rela_by_dest_range(insn->sec, insn->offset,
+					       insn->len);
+		if (!rela) {
+			dest_off = insn->offset + insn->len + insn->immediate;
+			insn->call_dest = find_symbol_by_offset(insn->sec,
+								dest_off);
+			if (!insn->call_dest) {
+				WARN_FUNC("can't find call dest symbol at offset 0x%lx",
+					  insn->sec, insn->offset, dest_off);
+				return -1;
+			}
+		} else if (rela->sym->type == STT_SECTION) {
+			insn->call_dest = find_symbol_by_offset(rela->sym->sec,
+								rela->addend+4);
+			if (!insn->call_dest ||
+			    insn->call_dest->type != STT_FUNC) {
+				WARN_FUNC("can't find call dest symbol at %s+0x%x",
+					  insn->sec, insn->offset,
+					  rela->sym->sec->name,
+					  rela->addend + 4);
+				return -1;
+			}
+		} else
+			insn->call_dest = rela->sym;
+	}
+
+	return 0;
+}
+
+/*
+ * The .altinstructions section requires some extra special care, over and
+ * above what other special sections require:
+ *
+ * 1. Because alternatives are patched in-place, we need to insert a fake jump
+ *    instruction at the end so that validate_branch() skips all the original
+ *    replaced instructions when validating the new instruction path.
+ *
+ * 2. An added wrinkle is that the new instruction length might be zero.  In
+ *    that case the old instructions are replaced with nops.  We simulate that
+ *    by creating a fake jump as the only new instruction.
+ *
+ * 3. In some cases, the alternative section includes an instruction which
+ *    conditionally jumps to the _end_ of the entry.  We have to modify these
+ *    jumps' destinations to point back to .text rather than the end of the
+ *    entry in .altinstr_replacement.
+ *
+ * 4. It has been requested that we don't validate the !POPCNT feature path
+ *    which is a "very very small percentage of machines".
+ */
+static int handle_group_alt(struct stacktool_file *file,
+			    struct special_alt *special_alt,
+			    struct instruction *orig_insn,
+			    struct instruction **new_insn)
+{
+	struct instruction *last_orig_insn, *last_new_insn, *insn, *fake_jump;
+	unsigned long dest_off;
+
+	last_orig_insn = NULL;
+	insn = orig_insn;
+	list_for_each_entry_from(insn, &file->insns, list) {
+		if (insn->sec != special_alt->orig_sec ||
+		    insn->offset >= special_alt->orig_off + special_alt->orig_len)
+			break;
+
+		if (special_alt->skip_orig)
+			insn->type = INSN_NOP;
+
+		insn->alt_group = true;
+		last_orig_insn = insn;
+	}
+
+	if (list_is_last(&last_orig_insn->list, &file->insns) ||
+	    list_next_entry(last_orig_insn, list)->sec != special_alt->orig_sec) {
+		WARN("%s: don't know how to handle alternatives at end of section",
+		     special_alt->orig_sec->name);
+		return -1;
+	}
+
+	fake_jump = malloc(sizeof(*fake_jump));
+	if (!fake_jump) {
+		WARN("malloc failed");
+		return -1;
+	}
+	memset(fake_jump, 0, sizeof(*fake_jump));
+	INIT_LIST_HEAD(&fake_jump->alts);
+	fake_jump->sec = special_alt->new_sec;
+	fake_jump->offset = -1;
+	fake_jump->type = INSN_JUMP_UNCONDITIONAL;
+	fake_jump->jump_dest = list_next_entry(last_orig_insn, list);
+
+	if (!special_alt->new_len) {
+		*new_insn = fake_jump;
+		return 0;
+	}
+
+	last_new_insn = NULL;
+	insn = *new_insn;
+	list_for_each_entry_from(insn, &file->insns, list) {
+		if (insn->sec != special_alt->new_sec ||
+		    insn->offset >= special_alt->new_off + special_alt->new_len)
+			break;
+
+		last_new_insn = insn;
+
+		if (insn->type != INSN_JUMP_CONDITIONAL &&
+		    insn->type != INSN_JUMP_UNCONDITIONAL)
+			continue;
+
+		if (!insn->immediate)
+			continue;
+
+		dest_off = insn->offset + insn->len + insn->immediate;
+		if (dest_off == special_alt->new_off + special_alt->new_len)
+			insn->jump_dest = fake_jump;
+
+		if (!insn->jump_dest) {
+			WARN_FUNC("can't find alternative jump destination",
+				  insn->sec, insn->offset);
+			return -1;
+		}
+	}
+
+	if (!last_new_insn) {
+		WARN_FUNC("can't find last new alternative instruction",
+			  special_alt->new_sec, special_alt->new_off);
+		return -1;
+	}
+
+	list_add(&fake_jump->list, &last_new_insn->list);
+
+	return 0;
+}
+
+/*
+ * A jump table entry can either convert a nop to a jump or a jump to a nop.
+ * If the original instruction is a jump, make the alt entry an effective nop
+ * by just skipping the original instruction.
+ */
+static int handle_jump_alt(struct stacktool_file *file,
+			   struct special_alt *special_alt,
+			   struct instruction *orig_insn,
+			   struct instruction **new_insn)
+{
+	if (orig_insn->type == INSN_NOP)
+		return 0;
+
+	if (orig_insn->type != INSN_JUMP_UNCONDITIONAL) {
+		WARN_FUNC("unsupported instruction at jump label",
+			  orig_insn->sec, orig_insn->offset);
+		return -1;
+	}
+
+	*new_insn = list_next_entry(orig_insn, list);
+	return 0;
+}
+
+/*
+ * Read all the special sections containing alternate instructions which can
+ * be patched in or redirected to at runtime.  Each instruction with alternate
+ * instruction(s) has them added to its insn->alts list, which is traversed in
+ * validate_branch().
+ */
+static int get_special_section_alts(struct stacktool_file *file)
+{
+	struct list_head special_alts;
+	struct instruction *orig_insn, *new_insn;
+	struct special_alt *special_alt, *tmp;
+	struct alternative *alt;
+	int ret;
+
+	ret = special_get_alts(file->elf, &special_alts);
+	if (ret)
+		return ret;
+
+	list_for_each_entry_safe(special_alt, tmp, &special_alts, list) {
+		alt = malloc(sizeof(*alt));
+		if (!alt) {
+			WARN("malloc failed");
+			ret = -1;
+			goto out;
+		}
+
+		orig_insn = find_instruction(file, special_alt->orig_sec,
+					     special_alt->orig_off);
+		if (!orig_insn) {
+			WARN_FUNC("special: can't find orig instruction",
+				  special_alt->orig_sec, special_alt->orig_off);
+			ret = -1;
+			goto out;
+		}
+
+		new_insn = NULL;
+		if (!special_alt->group || special_alt->new_len) {
+			new_insn = find_instruction(file, special_alt->new_sec,
+						    special_alt->new_off);
+			if (!new_insn) {
+				WARN_FUNC("special: can't find new instruction",
+					  special_alt->new_sec,
+					  special_alt->new_off);
+				ret = -1;
+				goto out;
+			}
+		}
+
+		if (special_alt->group) {
+			ret = handle_group_alt(file, special_alt, orig_insn,
+					       &new_insn);
+			if (ret)
+				goto out;
+		} else if (special_alt->jump_or_nop) {
+			ret = handle_jump_alt(file, special_alt, orig_insn,
+					      &new_insn);
+			if (ret)
+				goto out;
+		}
+
+		alt->insn = new_insn;
+		list_add_tail(&alt->list, &orig_insn->alts);
+
+		list_del(&special_alt->list);
+		free(special_alt);
+	}
+
+out:
+	return ret;
+}
+
+/*
+ * For some switch statements, gcc generates a jump table in the .rodata
+ * section which contains a list of addresses within the function to jump to.
+ * This finds these jump tables and adds them to the insn->alts lists.
+ */
+static int get_switch_alts(struct stacktool_file *file)
+{
+	struct instruction *insn, *alt_insn;
+	struct rela *rodata_rela, *rela;
+	struct section *rodata;
+	struct symbol *func;
+	struct alternative *alt;
+
+	list_for_each_entry(insn, &file->insns, list) {
+		if (insn->type != INSN_JUMP_DYNAMIC)
+			continue;
+
+		rodata_rela = find_rela_by_dest_range(insn->sec, insn->offset,
+						      insn->len);
+		if (!rodata_rela || strcmp(rodata_rela->sym->name, ".rodata"))
+			continue;
+
+		rodata = find_section_by_name(file->elf, ".rodata");
+		if (!rodata || !rodata->rela)
+			continue;
+
+		/* common case: jmpq *[addr](,%rax,8) */
+		rela = find_rela_by_dest(rodata, rodata_rela->addend);
+
+		/* rare case:   jmpq *[addr](%rip) */
+		if (!rela)
+			rela = find_rela_by_dest(rodata,
+						 rodata_rela->addend + 4);
+		if (!rela)
+			continue;
+
+		func = find_containing_func(insn->sec, insn->offset);
+		if (!func) {
+			WARN_FUNC("can't find containing func",
+				  insn->sec, insn->offset);
+			return -1;
+		}
+
+		list_for_each_entry_from(rela, &rodata->rela->relas, list) {
+			if (rela->sym->sec != insn->sec ||
+			    rela->addend <= func->offset ||
+			    rela->addend >= func->offset + func->len)
+				break;
+
+			alt_insn = find_instruction(file, insn->sec,
+						    rela->addend);
+			if (!alt_insn) {
+				WARN("%s: can't find instruction at %s+0x%x",
+				     rodata->rela->name, insn->sec->name,
+				     rela->addend);
+				return -1;
+			}
+
+			alt = malloc(sizeof(*alt));
+			if (!alt) {
+				WARN("malloc failed");
+				return -1;
+			}
+
+			alt->insn = alt_insn;
+			list_add_tail(&alt->list, &insn->alts);
+		}
+	}
+
+	return 0;
+}
+
+static int decode_sections(struct stacktool_file *file)
+{
+	int ret;
+
+	ret = decode_instructions(file);
+	if (ret)
+		return ret;
+
+	get_ignores(file);
+
+	ret = get_jump_destinations(file);
+	if (ret)
+		return ret;
+
+	ret = get_call_destinations(file);
+	if (ret)
+		return ret;
+
+	ret = get_special_section_alts(file);
+	if (ret)
+		return ret;
+
+	ret = get_switch_alts(file);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static bool is_fentry_call(struct instruction *insn)
+{
+	if (insn->type == INSN_CALL &&
+	    insn->call_dest->type == STT_NOTYPE &&
+	    !strcmp(insn->call_dest->name, "__fentry__"))
+		return true;
+
+	return false;
+}
+
+static bool has_modified_stack_frame(struct instruction *insn)
+{
+	return (insn->state & STATE_FP_SAVED) ||
+	       (insn->state & STATE_FP_SETUP);
+}
+
+static bool has_valid_stack_frame(struct instruction *insn)
+{
+	return (insn->state & STATE_FP_SAVED) &&
+	       (insn->state & STATE_FP_SETUP);
+}
+
+/*
+ * Follow the branch starting at the given instruction, and recursively follow
+ * any other branches (jumps).  Meanwhile, track the frame pointer state at
+ * each instruction and validate all the rules described in
+ * tools/stacktool/Documentation/stack-validation.txt.
+ */
+static int validate_branch(struct stacktool_file *file,
+			   struct instruction *first, unsigned char first_state)
+{
+	struct alternative *alt;
+	struct instruction *insn;
+	struct section *sec;
+	unsigned char state;
+	int ret, warnings = 0;
+
+	insn = first;
+	sec = insn->sec;
+	state = first_state;
+
+	if (insn->alt_group && list_empty(&insn->alts)) {
+		WARN_FUNC("don't know how to handle branch to middle of alternative instruction group",
+			  sec, insn->offset);
+		warnings++;
+	}
+
+	while (1) {
+		if (insn->visited) {
+			if (insn->state != state) {
+				WARN_FUNC("frame pointer state mismatch",
+					  sec, insn->offset);
+				warnings++;
+			}
+
+			return warnings;
+		}
+
+		/*
+		 * Catch a rare case where a noreturn function falls through to
+		 * the next function.
+		 */
+		if (is_fentry_call(insn) && (state & STATE_FENTRY))
+			return warnings;
+
+		insn->visited = true;
+		insn->state = state;
+
+		list_for_each_entry(alt, &insn->alts, list) {
+			ret = validate_branch(file, alt->insn, state);
+			warnings += ret;
+		}
+
+		switch (insn->type) {
+
+		case INSN_FP_SAVE:
+			if (!nofp) {
+				if (state & STATE_FP_SAVED) {
+					WARN_FUNC("duplicate frame pointer save",
+						  sec, insn->offset);
+					warnings++;
+				}
+				state |= STATE_FP_SAVED;
+			}
+			break;
+
+		case INSN_FP_SETUP:
+			if (!nofp) {
+				if (state & STATE_FP_SETUP) {
+					WARN_FUNC("duplicate frame pointer setup",
+						  sec, insn->offset);
+					warnings++;
+				}
+				state |= STATE_FP_SETUP;
+			}
+			break;
+
+		case INSN_FP_RESTORE:
+			if (!nofp) {
+				if (has_valid_stack_frame(insn))
+					state &= ~STATE_FP_SETUP;
+
+				state &= ~STATE_FP_SAVED;
+			}
+			break;
+
+		case INSN_RETURN:
+			if (!nofp && has_modified_stack_frame(insn)) {
+				WARN_FUNC("return without frame pointer restore",
+					  sec, insn->offset);
+				warnings++;
+			}
+			return warnings;
+
+		case INSN_CALL:
+			if (is_fentry_call(insn)) {
+				state |= STATE_FENTRY;
+				break;
+			}
+
+			if (dead_end_function(file, insn->call_dest))
+				return warnings;
+
+			/* fallthrough */
+		case INSN_CALL_DYNAMIC:
+			if (!nofp && !has_valid_stack_frame(insn)) {
+				WARN_FUNC("call without frame pointer save/setup",
+					  sec, insn->offset);
+				warnings++;
+			}
+			break;
+
+		case INSN_JUMP_CONDITIONAL:
+		case INSN_JUMP_UNCONDITIONAL:
+			if (insn->jump_dest) {
+				ret = validate_branch(file, insn->jump_dest,
+						      state);
+				warnings += ret;
+			} else if (has_modified_stack_frame(insn)) {
+				WARN_FUNC("sibling call from callable instruction with changed frame pointer",
+					  sec, insn->offset);
+				warnings++;
+			} /* else it's a sibling call */
+
+			if (insn->type == INSN_JUMP_UNCONDITIONAL)
+				return warnings;
+
+			break;
+
+		case INSN_JUMP_DYNAMIC:
+			if (list_empty(&insn->alts) &&
+			    has_modified_stack_frame(insn)) {
+				WARN_FUNC("sibling call from callable instruction with changed frame pointer",
+					  sec, insn->offset);
+				warnings++;
+			}
+
+			return warnings;
+
+		case INSN_BUG:
+			return warnings;
+
+		default:
+			break;
+		}
+
+		insn = list_next_entry(insn, list);
+
+		if (&insn->list == &file->insns || insn->sec != sec) {
+			WARN("%s: unexpected end of section", sec->name);
+			warnings++;
+			return warnings;
+		}
+	}
+
+	return warnings;
+}
+
+static int validate_functions(struct stacktool_file *file)
+{
+	struct section *sec;
+	struct symbol *func;
+	struct instruction *insn;
+	int ret, warnings = 0;
+
+	list_for_each_entry(sec, &file->elf->sections, list) {
+		list_for_each_entry(func, &sec->symbols, list) {
+			if (func->type != STT_FUNC)
+				continue;
+
+			insn = find_instruction(file, sec, func->offset);
+			if (!insn) {
+				WARN("%s(): can't find starting instruction",
+				     func->name);
+				warnings++;
+				continue;
+			}
+
+			ret = validate_branch(file, insn, 0);
+			warnings += ret;
+		}
+	}
+
+	list_for_each_entry(sec, &file->elf->sections, list) {
+		list_for_each_entry(func, &sec->symbols, list) {
+			if (func->type != STT_FUNC)
+				continue;
+
+			insn = find_instruction(file, sec, func->offset);
+			if (!insn)
+				continue;
+
+			list_for_each_entry_from(insn, &file->insns, list) {
+				if (insn->sec != func->sec ||
+				    insn->offset >= func->offset + func->len)
+					break;
+
+				if (!insn->visited && insn->type != INSN_NOP) {
+					WARN_FUNC("function has unreachable instruction",
+						  insn->sec, insn->offset);
+					warnings++;
+				}
+
+				insn->visited = true;
+			}
+		}
+	}
+
+	return warnings;
+}
+
+static int validate_uncallable_instructions(struct stacktool_file *file)
+{
+	struct instruction *insn;
+	int warnings = 0;
+
+	list_for_each_entry(insn, &file->insns, list) {
+		if (!insn->visited && insn->type == INSN_RETURN) {
+			WARN_FUNC("return instruction outside of a callable function",
+				  insn->sec, insn->offset);
+			warnings++;
+		}
+	}
+
+	return warnings;
+}
+
+static void cleanup(struct stacktool_file *file)
+{
+	struct instruction *insn, *tmpinsn;
+	struct alternative *alt, *tmpalt;
+
+	list_for_each_entry_safe(insn, tmpinsn, &file->insns, list) {
+		list_for_each_entry_safe(alt, tmpalt, &insn->alts, list) {
+			list_del(&alt->list);
+			free(alt);
+		}
+		list_del(&insn->list);
+		free(insn);
+	}
+	elf_close(file->elf);
+}
+
+const char * const check_usage[] = {
+	"stacktool check [<options>] file.o",
+	NULL,
+};
+
+int cmd_check(int argc, const char **argv)
+{
+	struct stacktool_file file;
+	int ret, warnings = 0;
+
+	const struct option options[] = {
+		OPT_BOOLEAN('f', "no-fp", &nofp, "Skip frame pointer validation"),
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, options, check_usage, 0);
+
+	if (argc != 1)
+		usage_with_options(check_usage, options);
+
+	objname = argv[0];
+
+	file.elf = elf_open(objname);
+	if (!file.elf) {
+		fprintf(stderr, "error reading elf file %s\n", objname);
+		return 1;
+	}
+
+	INIT_LIST_HEAD(&file.insns);
+
+	ret = decode_sections(&file);
+	if (ret < 0)
+		goto out;
+	warnings += ret;
+
+	ret = validate_functions(&file);
+	if (ret < 0)
+		goto out;
+	warnings += ret;
+
+	ret = validate_uncallable_instructions(&file);
+	if (ret < 0)
+		goto out;
+	warnings += ret;
+
+out:
+	cleanup(&file);
+
+	/* ignore warnings for now until we get all the code cleaned up */
+	if (ret || warnings)
+		return 0;
+	return 0;
+}
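Not part of the patch: a minimal sketch of the frame pointer bookkeeping
that validate_branch() performs, reduced to a straight-line instruction
sequence with no jumps, alternatives, or fentry handling.  The enum names
and count_fp_warnings() are hypothetical stand-ins for the INSN_* types and
STATE_FP_* bits; only the rules visible in the switch statement above are
modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: they only mirror the STATE_FP_* and INSN_* ideas. */
enum { FP_SAVED = 1 << 0, FP_SETUP = 1 << 1 };

enum op {
	OP_OTHER,
	OP_FP_SAVE,	/* e.g. push %rbp */
	OP_FP_SETUP,	/* e.g. mov %rsp, %rbp */
	OP_FP_RESTORE,	/* e.g. pop %rbp or leave */
	OP_CALL,
	OP_RETURN,
};

/*
 * Walk a straight-line sequence and count rule violations:
 *  - duplicate frame pointer save or setup
 *  - call without a complete frame (saved and set up)
 *  - return while the frame is still modified
 */
static int count_fp_warnings(const enum op *ops, size_t n)
{
	unsigned char state = 0;
	int warnings = 0;
	size_t i;

	assert(ops || !n);

	for (i = 0; i < n; i++) {
		/* Both predicates use the state on entry to the instruction. */
		bool valid = (state & FP_SAVED) && (state & FP_SETUP);
		bool modified = (state & (FP_SAVED | FP_SETUP)) != 0;

		switch (ops[i]) {
		case OP_FP_SAVE:
			if (state & FP_SAVED)
				warnings++;
			state |= FP_SAVED;
			break;
		case OP_FP_SETUP:
			if (state & FP_SETUP)
				warnings++;
			state |= FP_SETUP;
			break;
		case OP_FP_RESTORE:
			if (valid)
				state &= ~FP_SETUP;
			state &= ~FP_SAVED;
			break;
		case OP_CALL:
			if (!valid)
				warnings++;
			break;
		case OP_RETURN:
			if (modified)
				warnings++;
			return warnings;
		default:
			break;
		}
	}

	return warnings;
}
```

A leaf function that never touches the frame pointer passes, as does a
function that saves, sets up, calls, restores, and returns.  A call with no
frame, or a return with the frame still in place, each count as one warning.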
diff --git a/tools/stacktool/builtin.h b/tools/stacktool/builtin.h
new file mode 100644
index 0000000..34d2ba7
--- /dev/null
+++ b/tools/stacktool/builtin.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef _BUILTIN_H
+#define _BUILTIN_H
+
+extern int cmd_check(int argc, const char **argv);
+
+#endif /* _BUILTIN_H */
diff --git a/tools/stacktool/elf.c b/tools/stacktool/elf.c
new file mode 100644
index 0000000..d547e3f
--- /dev/null
+++ b/tools/stacktool/elf.c
@@ -0,0 +1,403 @@
+/*
+ * elf.c - ELF access library
+ *
+ * Adapted from kpatch (https://github.com/dynup/kpatch):
+ * Copyright (C) 2013-2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ * Copyright (C) 2014 Seth Jennings <sjenning@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "elf.h"
+#include "warn.h"
+
+struct section *find_section_by_name(struct elf *elf, const char *name)
+{
+	struct section *sec;
+
+	list_for_each_entry(sec, &elf->sections, list)
+		if (!strcmp(sec->name, name))
+			return sec;
+
+	return NULL;
+}
+
+static struct section *find_section_by_index(struct elf *elf,
+					     unsigned int idx)
+{
+	struct section *sec;
+
+	list_for_each_entry(sec, &elf->sections, list)
+		if (sec->idx == idx)
+			return sec;
+
+	return NULL;
+}
+
+static struct symbol *find_symbol_by_index(struct elf *elf, unsigned int idx)
+{
+	struct section *sec;
+	struct symbol *sym;
+
+	list_for_each_entry(sec, &elf->sections, list)
+		list_for_each_entry(sym, &sec->symbols, list)
+			if (sym->idx == idx)
+				return sym;
+
+	return NULL;
+}
+
+struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset)
+{
+	struct symbol *sym;
+
+	list_for_each_entry(sym, &sec->symbols, list)
+		if (sym->type != STT_SECTION &&
+		    sym->offset == offset)
+			return sym;
+
+	return NULL;
+}
+
+struct rela *find_rela_by_dest_range(struct section *sec, unsigned long offset,
+				     unsigned int len)
+{
+	struct rela *rela;
+
+	if (!sec->rela)
+		return NULL;
+
+	list_for_each_entry(rela, &sec->rela->relas, list)
+		if (rela->offset >= offset && rela->offset < offset + len)
+			return rela;
+
+	return NULL;
+}
+
+struct rela *find_rela_by_dest(struct section *sec, unsigned long offset)
+{
+	return find_rela_by_dest_range(sec, offset, 1);
+}
+
+struct symbol *find_containing_func(struct section *sec, unsigned long offset)
+{
+	struct symbol *func;
+
+	list_for_each_entry(func, &sec->symbols, list)
+		if (func->type == STT_FUNC && offset >= func->offset &&
+		    offset < func->offset + func->len)
+			return func;
+
+	return NULL;
+}
+
+static int read_sections(struct elf *elf)
+{
+	Elf_Scn *s = NULL;
+	struct section *sec;
+	size_t shstrndx, sections_nr;
+	size_t i;
+
+	if (elf_getshdrnum(elf->elf, &sections_nr)) {
+		perror("elf_getshdrnum");
+		return -1;
+	}
+
+	if (elf_getshdrstrndx(elf->elf, &shstrndx)) {
+		perror("elf_getshdrstrndx");
+		return -1;
+	}
+
+	for (i = 0; i < sections_nr; i++) {
+		sec = malloc(sizeof(*sec));
+		if (!sec) {
+			perror("malloc");
+			return -1;
+		}
+		memset(sec, 0, sizeof(*sec));
+
+		INIT_LIST_HEAD(&sec->symbols);
+		INIT_LIST_HEAD(&sec->relas);
+
+		list_add_tail(&sec->list, &elf->sections);
+
+		s = elf_getscn(elf->elf, i);
+		if (!s) {
+			perror("elf_getscn");
+			return -1;
+		}
+
+		sec->idx = elf_ndxscn(s);
+
+		if (!gelf_getshdr(s, &sec->sh)) {
+			perror("gelf_getshdr");
+			return -1;
+		}
+
+		sec->name = elf_strptr(elf->elf, shstrndx, sec->sh.sh_name);
+		if (!sec->name) {
+			perror("elf_strptr");
+			return -1;
+		}
+
+		sec->elf_data = elf_getdata(s, NULL);
+		if (!sec->elf_data) {
+			perror("elf_getdata");
+			return -1;
+		}
+
+		if (sec->elf_data->d_off != 0 ||
+		    sec->elf_data->d_size != sec->sh.sh_size) {
+			WARN("unexpected data attributes for %s", sec->name);
+			return -1;
+		}
+
+		sec->data = (unsigned long)sec->elf_data->d_buf;
+		sec->len = sec->elf_data->d_size;
+	}
+
+	/* sanity check, one more call to elf_nextscn() should return NULL */
+	if (elf_nextscn(elf->elf, s)) {
+		WARN("section entry mismatch");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int read_symbols(struct elf *elf)
+{
+	struct section *symtab;
+	struct symbol *sym;
+	struct list_head *entry, *tmp;
+	int symbols_nr, i;
+
+	symtab = find_section_by_name(elf, ".symtab");
+	if (!symtab) {
+		WARN("missing symbol table");
+		return -1;
+	}
+
+	symbols_nr = symtab->sh.sh_size / symtab->sh.sh_entsize;
+
+	for (i = 0; i < symbols_nr; i++) {
+		sym = malloc(sizeof(*sym));
+		if (!sym) {
+			perror("malloc");
+			return -1;
+		}
+		memset(sym, 0, sizeof(*sym));
+
+		sym->idx = i;
+
+		if (!gelf_getsym(symtab->elf_data, i, &sym->sym)) {
+			perror("gelf_getsym");
+			goto err;
+		}
+
+		sym->name = elf_strptr(elf->elf, symtab->sh.sh_link,
+				       sym->sym.st_name);
+		if (!sym->name) {
+			perror("elf_strptr");
+			goto err;
+		}
+
+		sym->type = GELF_ST_TYPE(sym->sym.st_info);
+		sym->bind = GELF_ST_BIND(sym->sym.st_info);
+
+		if (sym->sym.st_shndx > SHN_UNDEF &&
+		    sym->sym.st_shndx < SHN_LORESERVE) {
+			sym->sec = find_section_by_index(elf,
+							 sym->sym.st_shndx);
+			if (!sym->sec) {
+				WARN("couldn't find section for symbol %s",
+				     sym->name);
+				goto err;
+			}
+			if (sym->type == STT_SECTION) {
+				sym->name = sym->sec->name;
+				sym->sec->sym = sym;
+			}
+		} else
+			sym->sec = find_section_by_index(elf, 0);
+
+		sym->offset = sym->sym.st_value;
+		sym->len = sym->sym.st_size;
+
+		/* sorted insert into a per-section list */
+		entry = &sym->sec->symbols;
+		list_for_each_prev(tmp, &sym->sec->symbols) {
+			struct symbol *s;
+
+			s = list_entry(tmp, struct symbol, list);
+
+			if (sym->offset > s->offset) {
+				entry = tmp;
+				break;
+			}
+
+			if (sym->offset == s->offset && sym->len >= s->len) {
+				entry = tmp;
+				break;
+			}
+		}
+		list_add(&sym->list, entry);
+	}
+
+	return 0;
+
+err:
+	free(sym);
+	return -1;
+}
+
+static int read_relas(struct elf *elf)
+{
+	struct section *sec;
+	struct rela *rela;
+	int i;
+	unsigned int symndx;
+
+	list_for_each_entry(sec, &elf->sections, list) {
+		if (sec->sh.sh_type != SHT_RELA)
+			continue;
+
+		sec->base = find_section_by_name(elf, sec->name + 5);
+		if (!sec->base) {
+			WARN("can't find base section for rela section %s",
+			     sec->name);
+			return -1;
+		}
+
+		sec->base->rela = sec;
+
+		for (i = 0; i < sec->sh.sh_size / sec->sh.sh_entsize; i++) {
+			rela = malloc(sizeof(*rela));
+			if (!rela) {
+				perror("malloc");
+				return -1;
+			}
+			memset(rela, 0, sizeof(*rela));
+
+			list_add_tail(&rela->list, &sec->relas);
+
+			if (!gelf_getrela(sec->elf_data, i, &rela->rela)) {
+				perror("gelf_getrela");
+				return -1;
+			}
+
+			rela->type = GELF_R_TYPE(rela->rela.r_info);
+			rela->addend = rela->rela.r_addend;
+			rela->offset = rela->rela.r_offset;
+			symndx = GELF_R_SYM(rela->rela.r_info);
+			rela->sym = find_symbol_by_index(elf, symndx);
+			if (!rela->sym) {
+				WARN("can't find rela entry symbol %d for %s",
+				     symndx, sec->name);
+				return -1;
+			}
+		}
+	}
+
+	return 0;
+}
+
+struct elf *elf_open(const char *name)
+{
+	struct elf *elf;
+
+	elf_version(EV_CURRENT);
+
+	elf = malloc(sizeof(*elf));
+	if (!elf) {
+		perror("malloc");
+		return NULL;
+	}
+	memset(elf, 0, sizeof(*elf));
+
+	INIT_LIST_HEAD(&elf->sections);
+
+	elf->name = strdup(name);
+	if (!elf->name) {
+		perror("strdup");
+		goto err;
+	}
+
+	elf->fd = open(name, O_RDONLY);
+	if (elf->fd == -1) {
+		perror("open");
+		goto err;
+	}
+
+	elf->elf = elf_begin(elf->fd, ELF_C_READ_MMAP, NULL);
+	if (!elf->elf) {
+		perror("elf_begin");
+		goto err;
+	}
+
+	if (!gelf_getehdr(elf->elf, &elf->ehdr)) {
+		perror("gelf_getehdr");
+		goto err;
+	}
+
+	if (read_sections(elf))
+		goto err;
+
+	if (read_symbols(elf))
+		goto err;
+
+	if (read_relas(elf))
+		goto err;
+
+	return elf;
+
+err:
+	elf_close(elf);
+	return NULL;
+}
+
+void elf_close(struct elf *elf)
+{
+	struct section *sec, *tmpsec;
+	struct symbol *sym, *tmpsym;
+	struct rela *rela, *tmprela;
+
+	list_for_each_entry_safe(sec, tmpsec, &elf->sections, list) {
+		list_for_each_entry_safe(sym, tmpsym, &sec->symbols, list) {
+			list_del(&sym->list);
+			free(sym);
+		}
+		list_for_each_entry_safe(rela, tmprela, &sec->relas, list) {
+			list_del(&rela->list);
+			free(rela);
+		}
+		list_del(&sec->list);
+		free(sec);
+	}
+	if (elf->name)
+		free(elf->name);
+	if (elf->fd > 0)
+		close(elf->fd);
+	if (elf->elf)
+		elf_end(elf->elf);
+	free(elf);
+}
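Not part of the patch: read_relas() above locates the base section of a
relocation section by skipping the first five characters of its name, which
assumes every SHT_RELA section is named ".rela" followed by the base section
name.  A small sketch of that assumption made explicit; rela_base_name() is
a hypothetical helper, not something the patch defines.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the base section name for a ".rela<name>"
 * section, or NULL if the name does not carry the expected prefix.
 */
static const char *rela_base_name(const char *rela_name)
{
	static const char prefix[] = ".rela";
	size_t plen = sizeof(prefix) - 1;	/* 5, the constant in read_relas() */

	assert(rela_name);

	if (strncmp(rela_name, prefix, plen))
		return NULL;

	return rela_name + plen;
}
```

With this, ".rela.text" maps to ".text" and ".rela__jump_table" maps to
"__jump_table", while a name without the prefix is rejected instead of being
silently truncated.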
diff --git a/tools/stacktool/elf.h b/tools/stacktool/elf.h
new file mode 100644
index 0000000..76b86b3
--- /dev/null
+++ b/tools/stacktool/elf.h
@@ -0,0 +1,79 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _STACKTOOL_ELF_H
+#define _STACKTOOL_ELF_H
+
+#include <stdio.h>
+#include <gelf.h>
+#include <linux/list.h>
+
+struct section {
+	struct list_head list;
+	GElf_Shdr sh;
+	struct list_head symbols;
+	struct list_head relas;
+	struct section *base, *rela;
+	struct symbol *sym;
+	Elf_Data *elf_data;
+	char *name;
+	int idx;
+	unsigned long data;
+	unsigned int len;
+};
+
+struct symbol {
+	struct list_head list;
+	GElf_Sym sym;
+	struct section *sec;
+	char *name;
+	int idx;
+	unsigned char bind, type;
+	unsigned long offset;
+	unsigned int len;
+};
+
+struct rela {
+	struct list_head list;
+	GElf_Rela rela;
+	struct symbol *sym;
+	unsigned int type;
+	int offset;
+	int addend;
+};
+
+struct elf {
+	Elf *elf;
+	GElf_Ehdr ehdr;
+	int fd;
+	char *name;
+	struct list_head sections;
+};
+
+
+struct elf *elf_open(const char *name);
+struct section *find_section_by_name(struct elf *elf, const char *name);
+struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset);
+struct rela *find_rela_by_dest(struct section *sec, unsigned long offset);
+struct rela *find_rela_by_dest_range(struct section *sec, unsigned long offset,
+				     unsigned int len);
+struct symbol *find_containing_func(struct section *sec, unsigned long offset);
+void elf_close(struct elf *elf);
+
+
+
+#endif /* _STACKTOOL_ELF_H */
diff --git a/tools/stacktool/special.c b/tools/stacktool/special.c
new file mode 100644
index 0000000..c73854a
--- /dev/null
+++ b/tools/stacktool/special.c
@@ -0,0 +1,193 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * This file reads all the special sections containing alternate instructions
+ * which can be patched in or redirected to at runtime.
+ */
+
+#include <stdlib.h>
+#include <string.h>
+
+#include "special.h"
+#include "warn.h"
+
+#define EX_ENTRY_SIZE		8
+#define EX_ORIG_OFFSET		0
+#define EX_NEW_OFFSET		4
+
+#define JUMP_ENTRY_SIZE		24
+#define JUMP_ORIG_OFFSET	0
+#define JUMP_NEW_OFFSET		8
+
+#define ALT_ENTRY_SIZE		13
+#define ALT_ORIG_OFFSET		0
+#define ALT_NEW_OFFSET		4
+#define ALT_FEATURE_OFFSET	8
+#define ALT_ORIG_LEN_OFFSET	10
+#define ALT_NEW_LEN_OFFSET	11
+
+#define X86_FEATURE_POPCNT (4*32+23)
+
+struct special_entry {
+	const char *sec;
+	bool group, jump_or_nop;
+	unsigned char size, orig, new;
+	unsigned char orig_len, new_len; /* group only */
+	unsigned char feature; /* ALTERNATIVE macro CPU feature */
+};
+
+struct special_entry entries[] = {
+	{
+		.sec = ".altinstructions",
+		.group = true,
+		.size = ALT_ENTRY_SIZE,
+		.orig = ALT_ORIG_OFFSET,
+		.orig_len = ALT_ORIG_LEN_OFFSET,
+		.new = ALT_NEW_OFFSET,
+		.new_len = ALT_NEW_LEN_OFFSET,
+		.feature = ALT_FEATURE_OFFSET,
+	},
+	{
+		.sec = "__jump_table",
+		.jump_or_nop = true,
+		.size = JUMP_ENTRY_SIZE,
+		.orig = JUMP_ORIG_OFFSET,
+		.new = JUMP_NEW_OFFSET,
+	},
+	{
+		.sec = "__ex_table",
+		.size = EX_ENTRY_SIZE,
+		.orig = EX_ORIG_OFFSET,
+		.new = EX_NEW_OFFSET,
+	},
+	{},
+};
+
+static int get_alt_entry(struct elf *elf, struct special_entry *entry,
+			 struct section *sec, int idx,
+			 struct special_alt *alt)
+{
+	struct rela *orig_rela, *new_rela;
+	unsigned long offset;
+
+	offset = idx * entry->size;
+
+	alt->group = entry->group;
+	alt->jump_or_nop = entry->jump_or_nop;
+
+	if (alt->group) {
+		alt->orig_len = *(unsigned char *)(sec->data + offset +
+						   entry->orig_len);
+		alt->new_len = *(unsigned char *)(sec->data + offset +
+						  entry->new_len);
+	}
+
+	if (entry->feature) {
+		unsigned short feature;
+
+		feature = *(unsigned short *)(sec->data + offset +
+					      entry->feature);
+
+		/*
+		 * It has been requested that we don't validate the !POPCNT
+		 * feature path which is a "very very small percentage of
+		 * machines".
+		 */
+		if (feature == X86_FEATURE_POPCNT)
+			alt->skip_orig = true;
+	}
+
+	orig_rela = find_rela_by_dest(sec, offset + entry->orig);
+	if (!orig_rela) {
+		WARN_FUNC("can't find orig rela", sec, offset + entry->orig);
+		return -1;
+	}
+	if (orig_rela->sym->type != STT_SECTION) {
+		WARN_FUNC("don't know how to handle non-section rela symbol %s",
+			   sec, offset + entry->orig, orig_rela->sym->name);
+		return -1;
+	}
+
+	alt->orig_sec = orig_rela->sym->sec;
+	alt->orig_off = orig_rela->addend;
+
+	if (!entry->group || alt->new_len) {
+		new_rela = find_rela_by_dest(sec, offset + entry->new);
+		if (!new_rela) {
+			WARN_FUNC("can't find new rela",
+				  sec, offset + entry->new);
+			return -1;
+		}
+
+		alt->new_sec = new_rela->sym->sec;
+		alt->new_off = (unsigned int)new_rela->addend;
+
+		/* _ASM_EXTABLE_EX hack */
+		if (alt->new_off >= 0x7ffffff0)
+			alt->new_off -= 0x7ffffff0;
+	}
+
+	return 0;
+}
+
+/*
+ * Read all the special sections and create a list of special_alt structs which
+ * describe all the alternate instructions which can be patched in or
+ * redirected to at runtime.
+ */
+int special_get_alts(struct elf *elf, struct list_head *alts)
+{
+	struct special_entry *entry;
+	struct section *sec;
+	unsigned int nr_entries;
+	struct special_alt *alt;
+	int idx, ret;
+
+	INIT_LIST_HEAD(alts);
+
+	for (entry = entries; entry->sec; entry++) {
+		sec = find_section_by_name(elf, entry->sec);
+		if (!sec)
+			continue;
+
+		if (sec->len % entry->size != 0) {
+			WARN("%s size not a multiple of %d",
+			     sec->name, entry->size);
+			return -1;
+		}
+
+		nr_entries = sec->len / entry->size;
+
+		for (idx = 0; idx < nr_entries; idx++) {
+			alt = malloc(sizeof(*alt));
+			if (!alt) {
+				WARN("malloc failed");
+				return -1;
+			}
+			memset(alt, 0, sizeof(*alt));
+
+			ret = get_alt_entry(elf, entry, sec, idx, alt);
+			if (ret)
+				return ret;
+
+			list_add_tail(&alt->list, alts);
+		}
+	}
+
+	return 0;
+}
diff --git a/tools/stacktool/special.h b/tools/stacktool/special.h
new file mode 100644
index 0000000..fad1d092
--- /dev/null
+++ b/tools/stacktool/special.h
@@ -0,0 +1,42 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _SPECIAL_H
+#define _SPECIAL_H
+
+#include <stdbool.h>
+#include "elf.h"
+
+struct special_alt {
+	struct list_head list;
+
+	bool group;
+	bool skip_orig;
+	bool jump_or_nop;
+
+	struct section *orig_sec;
+	unsigned long orig_off;
+
+	struct section *new_sec;
+	unsigned long new_off;
+
+	unsigned int orig_len, new_len; /* group only */
+};
+
+int special_get_alts(struct elf *elf, struct list_head *alts);
+
+#endif /* _SPECIAL_H */
diff --git a/tools/stacktool/stacktool.c b/tools/stacktool/stacktool.c
new file mode 100644
index 0000000..6220132
--- /dev/null
+++ b/tools/stacktool/stacktool.c
@@ -0,0 +1,134 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * stacktool:
+ *
+ * This tool analyzes every .o file and ensures the validity of its stack trace
+ * metadata.  It enforces a set of rules on asm code and C inline assembly code
+ * so that stack traces can be reliable.
+ *
+ * For more information, see tools/stacktool/Documentation/stack-validation.txt.
+ */
+
+#include <stdio.h>
+#include <stdbool.h>
+#include <string.h>
+#include <stdlib.h>
+#include <subcmd/exec-cmd.h>
+#include <subcmd/pager.h>
+
+#include "builtin.h"
+
+#define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0]))
+
+struct cmd_struct {
+	const char *name;
+	int (*fn)(int, const char **);
+	const char *help;
+};
+
+static const char stacktool_usage_string[] =
+	"stacktool [OPTIONS] COMMAND [ARGS]";
+
+static struct cmd_struct stacktool_cmds[] = {
+	{"check",	cmd_check,	"Perform stack metadata validation on an object file" },
+};
+
+bool help;
+
+static void cmd_usage(void)
+{
+	unsigned int i, longest = 0;
+
+	printf("\n usage: %s\n\n", stacktool_usage_string);
+
+	for (i = 0; i < ARRAY_SIZE(stacktool_cmds); i++) {
+		if (longest < strlen(stacktool_cmds[i].name))
+			longest = strlen(stacktool_cmds[i].name);
+	}
+
+	puts(" Commands:");
+	for (i = 0; i < ARRAY_SIZE(stacktool_cmds); i++) {
+		printf("   %-*s   ", longest, stacktool_cmds[i].name);
+		puts(stacktool_cmds[i].help);
+	}
+
+	printf("\n");
+
+	exit(1);
+}
+
+static void handle_options(int *argc, const char ***argv)
+{
+	while (*argc > 0) {
+		const char *cmd = (*argv)[0];
+
+		if (cmd[0] != '-')
+			break;
+
+		if (!strcmp(cmd, "--help") || !strcmp(cmd, "-h")) {
+			help = true;
+			break;
+		} else {
+			fprintf(stderr, "Unknown option: %s\n", cmd);
+			fprintf(stderr, "\n Usage: %s\n",
+				stacktool_usage_string);
+			exit(1);
+		}
+
+		(*argv)++;
+		(*argc)--;
+	}
+}
+
+static void handle_internal_command(int argc, const char **argv)
+{
+	const char *cmd = argv[0];
+	unsigned int i, ret;
+
+	for (i = 0; i < ARRAY_SIZE(stacktool_cmds); i++) {
+		struct cmd_struct *p = stacktool_cmds+i;
+
+		if (strcmp(p->name, cmd))
+			continue;
+
+		ret = p->fn(argc, argv);
+
+		exit(ret);
+	}
+
+	cmd_usage();
+}
+
+int main(int argc, const char **argv)
+{
+	static const char *UNUSED = "STACKTOOL_NOT_IMPLEMENTED";
+
+	/* libsubcmd init */
+	exec_cmd_init("stacktool", UNUSED, UNUSED, UNUSED);
+	pager_init(UNUSED);
+
+	argv++;
+	argc--;
+	handle_options(&argc, &argv);
+
+	if (!argc || help)
+		cmd_usage();
+
+	handle_internal_command(argc, argv);
+}
diff --git a/tools/stacktool/warn.h b/tools/stacktool/warn.h
new file mode 100644
index 0000000..2e93f17
--- /dev/null
+++ b/tools/stacktool/warn.h
@@ -0,0 +1,60 @@
+/*
+ * Copyright (C) 2015 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _WARN_H
+#define _WARN_H
+
+extern const char *objname;
+
+static inline char *offstr(struct section *sec, unsigned long offset)
+{
+	struct symbol *func;
+	char *name, *str;
+	unsigned long name_off;
+
+	func = find_containing_func(sec, offset);
+	if (func) {
+		name = func->name;
+		name_off = offset - func->offset;
+	} else {
+		name = sec->name;
+		name_off = offset;
+	}
+
+	str = malloc(strlen(name) + 20);
+
+	if (func)
+		sprintf(str, "%s()+0x%lx", name, name_off);
+	else
+		sprintf(str, "%s+0x%lx", name, name_off);
+
+	return str;
+}
+
+#define WARN(format, ...)				\
+	fprintf(stderr,					\
+		"stacktool: %s: " format "\n",		\
+		objname, ##__VA_ARGS__)
+
+#define WARN_FUNC(format, sec, offset, ...)		\
+({							\
+	char *_str = offstr(sec, offset);		\
+	WARN("%s: " format, _str, ##__VA_ARGS__);	\
+	free(_str);					\
+})
+
+#endif /* _WARN_H */
-- 
2.4.3

* [PATCH 02/33] kbuild/stacktool: Add CONFIG_STACK_VALIDATION option
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 01/33] x86/stacktool: " Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 03/33] x86/stacktool: Enable stacktool on x86_64 Josh Poimboeuf
                   ` (33 subsequent siblings)
  35 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo

Add a CONFIG_STACK_VALIDATION option which will run "stacktool check"
for each .o file to ensure the validity of its stack metadata.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
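For illustration, the per-file and per-directory opt-out in the
scripts/Makefile.build hunk below can be modeled in C.  This is only a
sketch: should_validate() and its parameters are made-up names, not part
of the patch, and it assumes each make variable holds a single word.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Model of: $(if $(patsubst n%,, $(STACKTOOL_foo.o)$(STACKTOOL)y), run)
 * file_setting stands for STACKTOOL_foo.o, dir_setting for STACKTOOL;
 * an unset make variable is modeled as "".
 * Returns 1 if stacktool would run on the object, 0 if it is skipped.
 */
static int should_validate(const char *file_setting, const char *dir_setting)
{
	char word[64];

	assert(file_setting && dir_setting);

	/* Concatenate file setting, directory setting, and the default "y". */
	snprintf(word, sizeof(word), "%s%s%s", file_setting, dir_setting, "y");

	/* patsubst n%,, empties a word starting with 'n'; $(if) then skips. */
	return word[0] != 'n';
}
```

Because the per-file value comes first in the concatenation, in this
model should_validate("y", "n") returns 1: a file can opt back in even
when its directory sets STACKTOOL := n.
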
 Makefile               |  5 ++++-
 arch/Kconfig           |  6 ++++++
 lib/Kconfig.debug      | 12 ++++++++++++
 scripts/Makefile.build | 38 ++++++++++++++++++++++++++++++++++----
 scripts/mod/Makefile   |  2 ++
 5 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index 70dea02..8e518fe 100644
--- a/Makefile
+++ b/Makefile
@@ -986,7 +986,10 @@ prepare0: archprepare FORCE
 	$(Q)$(MAKE) $(build)=.
 
 # All the preparing..
-prepare: prepare0
+prepare: prepare0 prepare-stacktool
+
+PHONY += prepare-stacktool
+prepare-stacktool: $(if $(CONFIG_STACK_VALIDATION), tools/stacktool FORCE)
 
 # Generate some files
 # ---------------------------------------------------------------------------
diff --git a/arch/Kconfig b/arch/Kconfig
index 671810c..b20f472 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -527,6 +527,12 @@ config HAVE_COPY_THREAD_TLS
 	  normal C parameter passing, rather than extracting the syscall
 	  argument from pt_regs.
 
+config HAVE_STACK_VALIDATION
+	bool
+	help
+	  Architecture supports the stacktool host tool, which adds
+	  compile-time stack metadata validation.
+
 #
 # ABI hall of shame
 #
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ee1ac1c..a984656 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -342,6 +342,18 @@ config FRAME_POINTER
 	  larger and slower, but it gives very useful debugging information
 	  in case of kernel bugs. (precise oopses/stacktraces/warnings)
 
+config STACK_VALIDATION
+	bool "Enable compile-time stack metadata validation"
+	depends on HAVE_STACK_VALIDATION
+	default n
+	help
+	  Add compile-time checks to validate stack metadata, including frame
+	  pointers (if CONFIG_FRAME_POINTER is enabled).  This helps ensure
+	  that runtime stack traces are more reliable.
+
+	  For more information, see
+	  tools/stacktool/Documentation/stack-validation.txt.
+
 config DEBUG_FORCE_WEAK_PER_CPU
 	bool "Force weak per-cpu definitions"
 	depends on DEBUG_KERNEL
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 01df30a..5ec40fc 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -241,10 +241,31 @@ cmd_record_mcount =						\
 	fi;
 endif
 
+ifdef CONFIG_STACK_VALIDATION
+
+__stacktool_obj := $(objtree)/tools/stacktool/stacktool
+
+stacktool_args = check
+ifndef CONFIG_FRAME_POINTER
+stacktool_args += --no-fp
+endif
+
+# Set STACKTOOL_foo.o=n to skip stack metadata validation for a file.
+# Set STACKTOOL=n to skip stack metadata validation for a directory.
+stacktool_obj = $(if $(patsubst n%,, \
+	$(STACKTOOL_$(basetarget).o)$(STACKTOOL)y), \
+	$(__stacktool_obj))
+cmd_stacktool = $(if $(patsubst n%,, \
+	$(STACKTOOL_$(basetarget).o)$(STACKTOOL)y), \
+	$(__stacktool_obj) $(stacktool_args) "$(@)";)
+
+endif # CONFIG_STACK_VALIDATION
+
 define rule_cc_o_c
 	$(call echo-cmd,checksrc) $(cmd_checksrc)			  \
 	$(call echo-cmd,cc_o_c) $(cmd_cc_o_c);				  \
 	$(cmd_modversions)						  \
+	$(cmd_stacktool)						  \
 	$(call echo-cmd,record_mcount)					  \
 	$(cmd_record_mcount)						  \
 	scripts/basic/fixdep $(depfile) $@ '$(call make-cmd,cc_o_c)' >    \
@@ -253,14 +274,23 @@ define rule_cc_o_c
 	mv -f $(dot-target).tmp $(dot-target).cmd
 endef
 
+define rule_as_o_S
+	$(call echo-cmd,as_o_S) $(cmd_as_o_S);				  \
+	$(cmd_stacktool)						  \
+	scripts/basic/fixdep $(depfile) $@ '$(call make-cmd,as_o_S)' >    \
+	                                              $(dot-target).tmp;  \
+	rm -f $(depfile);						  \
+	mv -f $(dot-target).tmp $(dot-target).cmd
+endef
+
 # Built-in and composite module parts
-$(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
+$(obj)/%.o: $(src)/%.c $(recordmcount_source) $(stacktool_obj) FORCE
 	$(call cmd,force_checksrc)
 	$(call if_changed_rule,cc_o_c)
 
 # Single-part modules are special since we need to mark them in $(MODVERDIR)
 
-$(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
+$(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(stacktool_obj) FORCE
 	$(call cmd,force_checksrc)
 	$(call if_changed_rule,cc_o_c)
 	@{ echo $(@:.o=.ko); echo $@; } > $(MODVERDIR)/$(@F:.o=.mod)
@@ -290,8 +320,8 @@ $(obj)/%.s: $(src)/%.S FORCE
 quiet_cmd_as_o_S = AS $(quiet_modtag)  $@
 cmd_as_o_S       = $(CC) $(a_flags) -c -o $@ $<
 
-$(obj)/%.o: $(src)/%.S FORCE
-	$(call if_changed_dep,as_o_S)
+$(obj)/%.o: $(src)/%.S $(stacktool_obj) FORCE
+	$(call if_changed_rule,as_o_S)
 
 targets += $(real-objs-y) $(real-objs-m) $(lib-y)
 targets += $(extra-y) $(MAKECMDGOALS) $(always)
diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index c11212f..496184d 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -1,3 +1,5 @@
+STACKTOOL	:= n
+
 hostprogs-y	:= modpost mk_elfconfig
 always		:= $(hostprogs-y) empty.o
 
-- 
2.4.3

* [PATCH 03/33] x86/stacktool: Enable stacktool on x86_64
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 01/33] x86/stacktool: " Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 02/33] kbuild/stacktool: Add CONFIG_STACK_VALIDATION option Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 04/33] x86/stacktool: Add STACKTOOL_IGNORE_FUNC macro Josh Poimboeuf
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo

Set HAVE_STACK_VALIDATION to enable stack metadata validation for
x86_64.

Also, disable stacktool checking for the kexec purgatory code, because:

1. It's built by the archprepare target, which can run before stacktool
   has been built.

2. It runs outside the scope of the kernel's normal mode of operation
   and doesn't need stack checking anyway.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/Kconfig            | 1 +
 arch/x86/purgatory/Makefile | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 41b74e2..a7e4959 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -154,6 +154,7 @@ config X86
 	select VIRT_TO_BUS
 	select X86_DEV_DMA_OPS			if X86_64
 	select X86_FEATURE_NAMES		if PROC_FS
+	select HAVE_STACK_VALIDATION		if X86_64
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 2c835e3..30d7d58 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -1,3 +1,5 @@
+STACKTOOL := n
+
 purgatory-y := purgatory.o stack.o setup-x86_$(BITS).o sha256.o entry64.o string.o
 
 targets += $(purgatory-y)
-- 
2.4.3

* [PATCH 04/33] x86/stacktool: Add STACKTOOL_IGNORE_FUNC macro
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (2 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 03/33] x86/stacktool: Enable stacktool on x86_64 Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 05/33] x86/xen: Add stack frame dependency to hypercall inline asm calls Josh Poimboeuf
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo

Add a new stacktool ignore macro, STACKTOOL_IGNORE_FUNC, which can be
used to tell stacktool to skip validation of a function.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
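A usage sketch may help here.  The block below is a userspace stand-in
for the macro, assuming GCC or Clang on an ELF target; the attribute
spelling, the explicit cast, and the function odd_stack_helper() are
assumptions made for the example and are not taken from the patch.

```c
#include <assert.h>

/* Userspace approximation of the kernel macro (attribute spelling assumed). */
#define STACKTOOL_IGNORE_FUNC(_func) \
	static void __attribute__((used, section("__stacktool_ignore_func"))) \
		*__stacktool_ignore_func_##_func = (void *)_func

/* Hypothetical function whose stack handling stacktool cannot follow. */
static int odd_stack_helper(int x)
{
	return x + 1;
}
STACKTOOL_IGNORE_FUNC(odd_stack_helper);

/* Returns 1 if the marker pointer records the function's address. */
static int marker_points_at_helper(void)
{
	assert(__stacktool_ignore_func_odd_stack_helper != 0);
	return __stacktool_ignore_func_odd_stack_helper ==
	       (void *)odd_stack_helper;
}
```

The marker is only a pointer placed in a named section of the object
file.  In the kernel build, the vmlinux.lds.S hunk below discards
__stacktool_ignore_* sections at link time.
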
 MAINTAINERS                   |  1 +
 arch/x86/kernel/vmlinux.lds.S |  5 ++++-
 include/linux/stacktool.h     | 23 +++++++++++++++++++++++
 3 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/stacktool.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 7ecbea9..80b26ec 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10190,6 +10190,7 @@ STACK METADATA VALIDATION
 M:	Josh Poimboeuf <jpoimboe@redhat.com>
 S:	Supported
 F:	tools/stacktool/
+F:	include/linux/stacktool.h
 
 STAGING SUBSYSTEM
 M:	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 4f19942..c08c283c 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -333,7 +333,10 @@ SECTIONS
 
 	/* Sections to be discarded */
 	DISCARDS
-	/DISCARD/ : { *(.eh_frame) }
+	/DISCARD/ : {
+		*(.eh_frame)
+		*(__stacktool_ignore_*)
+	}
 }
 
 
diff --git a/include/linux/stacktool.h b/include/linux/stacktool.h
new file mode 100644
index 0000000..0d90db7
--- /dev/null
+++ b/include/linux/stacktool.h
@@ -0,0 +1,23 @@
+#ifndef _LINUX_STACKTOOL_H
+#define _LINUX_STACKTOOL_H
+
+#ifdef CONFIG_STACK_VALIDATION
+/*
+ * This C macro tells stacktool to ignore the function when doing stack
+ * metadata validation.  It should only be used in special cases where you're
+ * 100% sure it won't affect the reliability of frame pointers and kernel stack
+ * traces.
+ *
+ * For more information, see tools/stacktool/Documentation/stack-validation.txt.
+ */
+#define STACKTOOL_IGNORE_FUNC(_func) \
+	static void __used __section(__stacktool_ignore_func) \
+		*__stacktool_ignore_func_##_func = _func
+
+#else /* !CONFIG_STACK_VALIDATION */
+
+#define STACKTOOL_IGNORE_FUNC(_func)
+
+#endif /* CONFIG_STACK_VALIDATION */
+
+#endif /* _LINUX_STACKTOOL_H */
-- 
2.4.3

* [PATCH 05/33] x86/xen: Add stack frame dependency to hypercall inline asm calls
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (3 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 04/33] x86/stacktool: Add STACKTOOL_IGNORE_FUNC macro Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:55   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 06/33] x86/asm/xen: Set ELF function type for xen_adjust_exception_frame() Josh Poimboeuf
                   ` (30 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	David Vrabel, Borislav Petkov, Konrad Rzeszutek Wilk,
	Boris Ostrovsky

If a hypercall is inlined at the beginning of a function, gcc can insert
the call instruction before setting up a stack frame, which breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the hypercall inline
asm statements.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
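The constraint described above can be tried outside the kernel.  The
sketch below assumes GCC or Clang; on anything other than x86_64 the
asm statement is compiled out.  Unlike the patch, which declares __sp
as a local register variable, it uses a file-scope one; that is my
choice for the example, and all names in it are hypothetical.  An empty
asm template stands in for the hypercall.

```c
#include <assert.h>

#if defined(__x86_64__) && defined(__GNUC__)
/* File-scope alias for %rsp (the patch uses a local variable instead). */
register unsigned long sketch_stack_pointer __asm__("rsp");
#define SKETCH_HAVE_SP 1
#else
#define SKETCH_HAVE_SP 0
#endif

/* Returns x + 1; the asm statement names the stack pointer as an operand. */
static long add_one(long x)
{
	long res = x;

#if SKETCH_HAVE_SP
	/*
	 * Listing the stack pointer as an in/out operand tells the compiler
	 * that the asm depends on the stack pointer, so it is expected to
	 * finish frame setup before this point.
	 */
	__asm__ __volatile__("" : "+r" (res), "+r" (sketch_stack_pointer));
#endif
	/* The empty template must leave the value untouched. */
	assert(res == x);
	return res + 1;
}
```

Whether a frame really is set up first can be checked by building with
-fno-omit-frame-pointer and reading the disassembly.
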
 arch/x86/include/asm/xen/hypercall.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 3bcdcc8..a12a047 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -110,9 +110,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 	register unsigned long __arg2 asm(__HYPERCALL_ARG2REG) = __arg2; \
 	register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \
 	register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \
-	register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5;
+	register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5; \
+	register void *__sp asm(_ASM_SP);
 
-#define __HYPERCALL_0PARAM	"=r" (__res)
+#define __HYPERCALL_0PARAM	"=r" (__res), "+r" (__sp)
 #define __HYPERCALL_1PARAM	__HYPERCALL_0PARAM, "+r" (__arg1)
 #define __HYPERCALL_2PARAM	__HYPERCALL_1PARAM, "+r" (__arg2)
 #define __HYPERCALL_3PARAM	__HYPERCALL_2PARAM, "+r" (__arg3)
-- 
2.4.3

* [PATCH 06/33] x86/asm/xen: Set ELF function type for xen_adjust_exception_frame()
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (4 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 05/33] x86/xen: Add stack frame dependency to hypercall inline asm calls Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:56   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 07/33] x86/asm/xen: Create stack frames in xen-asm.S Josh Poimboeuf
                   ` (29 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, David Vrabel

xen_adjust_exception_frame() is a callable function, but is missing the
ELF function type, which confuses tools like stacktool.

Properly annotate it to be a callable function.  The generated code is
unchanged.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
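To make "ELF function type" concrete: as I understand the linkage
macros, ENTRY() alone leaves the symbol typed STT_NOTYPE, and ENDPROC()
adds the .type/.size directives that mark it STT_FUNC.  A tool reading
the symbol table could tell the two apart roughly as follows.  This is
a sketch assuming a Linux <elf.h>; sym_is_func() and make_sym() are
illustrative names, not stacktool functions.

```c
#include <assert.h>
#include <elf.h>

/* Returns 1 if the symbol is typed as a function (STT_FUNC), else 0. */
static int sym_is_func(const Elf64_Sym *sym)
{
	assert(sym);
	return ELF64_ST_TYPE(sym->st_info) == STT_FUNC;
}

/* Build a symbol-table entry with the given binding and type. */
static Elf64_Sym make_sym(unsigned char bind, unsigned char type)
{
	Elf64_Sym sym = { 0 };

	sym.st_info = ELF64_ST_INFO(bind, type);
	return sym;
}
```

A symbol built with make_sym(STB_GLOBAL, STT_NOTYPE) models the state
before this patch, and make_sym(STB_GLOBAL, STT_FUNC) the state after.
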
 arch/x86/xen/xen-asm_64.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index cc8acc4..c3df431 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -26,6 +26,7 @@ ENTRY(xen_adjust_exception_frame)
 	mov 8+0(%rsp), %rcx
 	mov 8+8(%rsp), %r11
 	ret $16
+ENDPROC(xen_adjust_exception_frame)
 
 hypercall_iret = hypercall_page + __HYPERVISOR_iret * 32
 /*
-- 
2.4.3

* [PATCH 07/33] x86/asm/xen: Create stack frames in xen-asm.S
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (5 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 06/33] x86/asm/xen: Set ELF function type for xen_adjust_exception_frame() Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:56   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 08/33] x86/paravirt: Add stack frame dependency to PVOP inline asm calls Josh Poimboeuf
                   ` (28 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, David Vrabel

xen_irq_enable_direct(), xen_restore_fl_direct(), and check_events() are
callable non-leaf functions which don't honor CONFIG_FRAME_POINTER,
which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
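Why a missing frame yields a bad trace can be shown with a toy model of
frame-pointer unwinding.  The sketch below simulates the chain of saved
frame pointers in plain C; it does not touch the real stack, and every
name in it is made up for the example.  It assumes FRAME_BEGIN pushes
the old frame pointer and points the frame pointer at the new frame.

```c
#include <assert.h>
#include <stddef.h>

/* One simulated stack frame: saved frame pointer plus return address. */
struct sim_frame {
	const struct sim_frame *prev;	/* caller's frame (FRAME_BEGIN) */
	const char *returns_into;	/* function the return lands in */
};

/* Follow the saved-frame-pointer chain; returns the entries written. */
static size_t sim_unwind(const struct sim_frame *fp, const char **out,
			 size_t max)
{
	size_t n = 0;

	assert(out);
	while (fp && n < max) {
		out[n++] = fp->returns_into;
		fp = fp->prev;
	}
	return n;
}

/*
 * Call chain: top -> caller -> xen_restore_fl_direct -> check_events.
 * If has_frame is 0, xen_restore_fl_direct pushes no frame, so the
 * frame pushed by check_events links straight to caller's frame.
 */
static size_t sim_trace(int has_frame, const char **out, size_t max)
{
	struct sim_frame caller = { NULL, "top" };
	struct sim_frame restore_fl = { &caller, "caller" };
	struct sim_frame check_events = {
		has_frame ? &restore_fl : &caller, "xen_restore_fl_direct"
	};

	return sim_unwind(&check_events, out, max);
}
```

With has_frame set, the walk from check_events reports
xen_restore_fl_direct, caller, top.  Without it, "caller" drops out of
the trace, because its return address is never reached through the
frame chain.
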
 arch/x86/xen/xen-asm.S | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 3e45aa0..eff224d 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -14,6 +14,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/percpu.h>
 #include <asm/processor-flags.h>
+#include <asm/frame.h>
 
 #include "xen-asm.h"
 
@@ -23,6 +24,7 @@
  * then enter the hypervisor to get them handled.
  */
 ENTRY(xen_irq_enable_direct)
+	FRAME_BEGIN
 	/* Unmask events */
 	movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
 
@@ -39,6 +41,7 @@ ENTRY(xen_irq_enable_direct)
 2:	call check_events
 1:
 ENDPATCH(xen_irq_enable_direct)
+	FRAME_END
 	ret
 	ENDPROC(xen_irq_enable_direct)
 	RELOC(xen_irq_enable_direct, 2b+1)
@@ -82,6 +85,7 @@ ENDPATCH(xen_save_fl_direct)
  * enters the hypervisor to get them delivered if so.
  */
 ENTRY(xen_restore_fl_direct)
+	FRAME_BEGIN
 #ifdef CONFIG_X86_64
 	testw $X86_EFLAGS_IF, %di
 #else
@@ -100,6 +104,7 @@ ENTRY(xen_restore_fl_direct)
 2:	call check_events
 1:
 ENDPATCH(xen_restore_fl_direct)
+	FRAME_END
 	ret
 	ENDPROC(xen_restore_fl_direct)
 	RELOC(xen_restore_fl_direct, 2b+1)
@@ -109,7 +114,8 @@ ENDPATCH(xen_restore_fl_direct)
  * Force an event check by making a hypercall, but preserve regs
  * before making the call.
  */
-check_events:
+ENTRY(check_events)
+	FRAME_BEGIN
 #ifdef CONFIG_X86_32
 	push %eax
 	push %ecx
@@ -139,4 +145,6 @@ check_events:
 	pop %rcx
 	pop %rax
 #endif
+	FRAME_END
 	ret
+ENDPROC(check_events)
-- 
2.4.3

* [PATCH 08/33] x86/paravirt: Add stack frame dependency to PVOP inline asm calls
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (6 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 07/33] x86/asm/xen: Create stack frames in xen-asm.S Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:56   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 09/33] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK Josh Poimboeuf
                   ` (27 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Borislav Petkov, Jeremy Fitzhardinge, Chris Wright, Alok Kataria,
	Rusty Russell

If a PVOP call macro is inlined at the beginning of a function, gcc can
insert the call instruction before setting up a stack frame, which
breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled and
can result in a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the PVOP inline asm
statements.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
---
 arch/x86/include/asm/paravirt_types.h | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 77db561..e8c2326 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -466,8 +466,9 @@ int paravirt_disable_iospace(void);
  * makes sure the incoming and outgoing types are always correct.
  */
 #ifdef CONFIG_X86_32
-#define PVOP_VCALL_ARGS				\
-	unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx
+#define PVOP_VCALL_ARGS							\
+	unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;	\
+	register void *__sp asm("esp")
 #define PVOP_CALL_ARGS			PVOP_VCALL_ARGS
 
 #define PVOP_CALL_ARG1(x)		"a" ((unsigned long)(x))
@@ -485,9 +486,10 @@ int paravirt_disable_iospace(void);
 #define VEXTRA_CLOBBERS
 #else  /* CONFIG_X86_64 */
 /* [re]ax isn't an arg, but the return val */
-#define PVOP_VCALL_ARGS					\
-	unsigned long __edi = __edi, __esi = __esi,	\
-		__edx = __edx, __ecx = __ecx, __eax = __eax
+#define PVOP_VCALL_ARGS						\
+	unsigned long __edi = __edi, __esi = __esi,		\
+		__edx = __edx, __ecx = __ecx, __eax = __eax;	\
+	register void *__sp asm("rsp")
 #define PVOP_CALL_ARGS		PVOP_VCALL_ARGS
 
 #define PVOP_CALL_ARG1(x)		"D" ((unsigned long)(x))
@@ -526,7 +528,7 @@ int paravirt_disable_iospace(void);
 			asm volatile(pre				\
 				     paravirt_alt(PARAVIRT_CALL)	\
 				     post				\
-				     : call_clbr			\
+				     : call_clbr, "+r" (__sp)		\
 				     : paravirt_type(op),		\
 				       paravirt_clobber(clbr),		\
 				       ##__VA_ARGS__			\
@@ -536,7 +538,7 @@ int paravirt_disable_iospace(void);
 			asm volatile(pre				\
 				     paravirt_alt(PARAVIRT_CALL)	\
 				     post				\
-				     : call_clbr			\
+				     : call_clbr, "+r" (__sp)		\
 				     : paravirt_type(op),		\
 				       paravirt_clobber(clbr),		\
 				       ##__VA_ARGS__			\
@@ -563,7 +565,7 @@ int paravirt_disable_iospace(void);
 		asm volatile(pre					\
 			     paravirt_alt(PARAVIRT_CALL)		\
 			     post					\
-			     : call_clbr				\
+			     : call_clbr, "+r" (__sp)			\
 			     : paravirt_type(op),			\
 			       paravirt_clobber(clbr),			\
 			       ##__VA_ARGS__				\
-- 
2.4.3

* [PATCH 09/33] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (7 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 08/33] x86/paravirt: Add stack frame dependency to PVOP inline asm calls Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:57   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 10/33] x86/amd: Set ELF function type for vide() Josh Poimboeuf
                   ` (26 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Borislav Petkov, Jeremy Fitzhardinge, Chris Wright, Alok Kataria,
	Rusty Russell

A function created with the PV_CALLEE_SAVE_REGS_THUNK macro doesn't set
up a new stack frame before the call instruction, which breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.  Also, the thunk functions aren't annotated as ELF
callable functions.

Create a stack frame when CONFIG_FRAME_POINTER is enabled and add the
ELF function type.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
---
 arch/x86/include/asm/paravirt.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index f619250..601f1b8 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -13,6 +13,7 @@
 #include <linux/bug.h>
 #include <linux/types.h>
 #include <linux/cpumask.h>
+#include <asm/frame.h>
 
 static inline int paravirt_enabled(void)
 {
@@ -756,15 +757,19 @@ static __always_inline void __ticket_unlock_kick(struct arch_spinlock *lock,
  * call. The return value in rax/eax will not be saved, even for void
  * functions.
  */
+#define PV_THUNK_NAME(func) "__raw_callee_save_" #func
 #define PV_CALLEE_SAVE_REGS_THUNK(func)					\
 	extern typeof(func) __raw_callee_save_##func;			\
 									\
 	asm(".pushsection .text;"					\
-	    ".globl __raw_callee_save_" #func " ; "			\
-	    "__raw_callee_save_" #func ": "				\
+	    ".globl " PV_THUNK_NAME(func) ";"				\
+	    ".type " PV_THUNK_NAME(func) ", @function;"			\
+	    PV_THUNK_NAME(func) ":"					\
+	    FRAME_BEGIN							\
 	    PV_SAVE_ALL_CALLER_REGS					\
 	    "call " #func ";"						\
 	    PV_RESTORE_ALL_CALLER_REGS					\
+	    FRAME_END							\
 	    "ret;"							\
 	    ".popsection")
 
-- 
2.4.3

* [PATCH 10/33] x86/amd: Set ELF function type for vide()
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (8 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 09/33] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:57   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 11/33] x86/asm/crypto: Move .Lbswap_mask data to .rodata section Josh Poimboeuf
                   ` (25 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Borislav Petkov

vide() is a callable function, but is missing the ELF function type,
which confuses tools like stacktool.

Properly annotate it to be a callable function.  The generated code is
unchanged.
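
For illustration only (not part of the patch), a minimal sketch of the
same annotation on a stand-alone symbol.  The names vide_demo and
call_vide_demo are hypothetical.  The asm variant is limited to x86 ELF
targets with a GNU-style assembler; elsewhere an empty C function is
used so the sketch still builds.

```c
/* Sketch only: vide_demo is a hypothetical stand-in for vide(). */
#if defined(__GNUC__) && defined(__ELF__) && \
	(defined(__x86_64__) || defined(__i386__))
extern void vide_demo(void);
__asm__(".pushsection .text\n"		/* keep the code in .text */
	".globl vide_demo\n"
	".type vide_demo, @function\n"	/* mark the symbol as a function */
	".align 4\n"
	"vide_demo: ret\n"
	".popsection\n");
#else
/* Fallback for other targets: an ordinary empty function. */
void vide_demo(void)
{
}
#endif

/* Calls the function and returns 1 once it has returned. */
int call_vide_demo(void)
{
	vide_demo();
	return 1;
}
```

With a GNU toolchain, "readelf -s" on the resulting object should list
the symbol with type FUNC; without the .type directive it shows up as
NOTYPE.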

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/amd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index a07956a..fe2f089 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -75,7 +75,10 @@ static inline int wrmsrl_amd_safe(unsigned msr, unsigned long long val)
  */
 
 extern __visible void vide(void);
-__asm__(".globl vide\n\t.align 4\nvide: ret");
+__asm__(".globl vide\n"
+	".type vide, @function\n"
+	".align 4\n"
+	"vide: ret\n");
 
 static void init_amd_k5(struct cpuinfo_x86 *c)
 {
-- 
2.4.3

* [PATCH 11/33] x86/asm/crypto: Move .Lbswap_mask data to .rodata section
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (9 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 10/33] x86/amd: Set ELF function type for vide() Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:58   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 12/33] x86/asm/crypto: Move jump_table " Josh Poimboeuf
                   ` (24 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Borislav Petkov, Herbert Xu, David S. Miller

stacktool reports the following warning:

  stacktool: arch/x86/crypto/aesni-intel_asm.o: _aesni_inc_init(): can't find starting instruction

stacktool gets confused when it tries to disassemble the following data
in the .text section:

  .Lbswap_mask:
          .byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Move it to .rodata which is a more appropriate section for read-only
data.
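
For illustration only (not part of the patch), a minimal sketch of the
.pushsection/.popsection pattern used from C.  The names
demo_bswap_mask and demo_mask_at are hypothetical.  The asm variant
assumes an ELF target with a GNU-style assembler; otherwise the table
is an ordinary const array.

```c
#include <stddef.h>

/* Sketch only: demo_bswap_mask is a hypothetical copy of the table. */
#if defined(__GNUC__) && defined(__ELF__)
extern const unsigned char demo_bswap_mask[16];
__asm__(".pushsection .rodata\n"	/* switch to read-only data */
	".balign 16\n"
	".globl demo_bswap_mask\n"
	"demo_bswap_mask:\n"
	".byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0\n"
	".popsection\n");		/* return to the previous section */
#else
/* Fallback: let the compiler place the table in read-only data. */
const unsigned char demo_bswap_mask[16] = {
	15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
};
#endif

/* Returns entry i of the mask (the index is wrapped to 0..15). */
unsigned char demo_mask_at(size_t i)
{
	return demo_bswap_mask[i & 15];
}
```

Because the bytes no longer sit between functions in .text, a tool that
decodes .text linearly never sees them as instructions.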

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
---
 arch/x86/crypto/aesni-intel_asm.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 6bd2c6c..c44cfed 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -2538,9 +2538,11 @@ ENTRY(aesni_cbc_dec)
 ENDPROC(aesni_cbc_dec)
 
 #ifdef __x86_64__
+.pushsection .rodata
 .align 16
 .Lbswap_mask:
 	.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
+.popsection
 
 /*
  * _aesni_inc_init:	internal ABI
-- 
2.4.3

* [PATCH 12/33] x86/asm/crypto: Move jump_table to .rodata section
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (10 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 11/33] x86/asm/crypto: Move .Lbswap_mask data to .rodata section Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:58   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 13/33] x86/asm/crypto: Simplify stack usage in sha-mb functions Josh Poimboeuf
                   ` (23 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Borislav Petkov, Herbert Xu, David S. Miller

stacktool reports the following warning:

  stacktool: arch/x86/crypto/crc32c-pcl-intel-asm_64.o: crc_pcl()+0x11dd: can't decode instruction

It gets confused when trying to decode jump_table data.  Move jump_table
to the .rodata section which is a more appropriate home for read-only
data.
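
For illustration only (not part of the patch), the branch-target
arithmetic before and after the change can be modelled with plain
integers.  The function names and the sample addresses are
hypothetical.  The old form needs the distance crc_array - jump_table
as an assembly-time constant, which presumably stops being available
once the two labels live in different sections; the new form only needs
the address of crc_array.

```c
#include <stdint.h>

/*
 * Old sequence: start from the jump_table address and add the constant
 * distance to crc_array plus the 16-bit table entry.
 */
uint64_t demo_target_old(uint64_t jump_table, uint64_t crc_array,
			 uint16_t entry)
{
	uint64_t offset = crc_array - jump_table;

	return jump_table + offset + entry;
}

/* New sequence: add the table entry directly to the crc_array address. */
uint64_t demo_target_new(uint64_t crc_array, uint16_t entry)
{
	return crc_array + entry;
}
```

Both functions return the same address for any inputs, so the jump
target is unchanged; only the way it is computed differs.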

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
---
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 4fe27e0..dc05f01 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -170,8 +170,8 @@ continue_block:
 	## branch into array
 	lea	jump_table(%rip), bufp
 	movzxw  (bufp, %rax, 2), len
-	offset=crc_array-jump_table
-	lea     offset(bufp, len, 1), bufp
+	lea	crc_array(%rip), bufp
+	lea     (bufp, len, 1), bufp
 	jmp     *bufp
 
 	################################################################
@@ -310,7 +310,9 @@ do_return:
 	popq    %rdi
 	popq    %rbx
         ret
+ENDPROC(crc_pcl)
 
+.section	.rodata, "a", %progbits
         ################################################################
         ## jump table        Table is 129 entries x 2 bytes each
         ################################################################
@@ -324,13 +326,11 @@ JMPTBL_ENTRY %i
 	i=i+1
 .endr
 
-ENDPROC(crc_pcl)
 
 	################################################################
 	## PCLMULQDQ tables
 	## Table is 128 entries x 2 words (8 bytes) each
 	################################################################
-.section	.rodata, "a", %progbits
 .align 8
 K_table:
 	.long 0x493c7d27, 0x00000001
-- 
2.4.3

* [PATCH 13/33] x86/asm/crypto: Simplify stack usage in sha-mb functions
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (11 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 12/33] x86/asm/crypto: Move jump_table " Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:59   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 14/33] x86/asm/crypto: Don't use rbp as a scratch register Josh Poimboeuf
                   ` (22 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo

sha1_mb_mgr_flush_avx2() and sha1_mb_mgr_submit_avx2() both allocate a
lot of stack space which is never used.  Also, many of the registers
being saved aren't being clobbered so there's no need to save them.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S  | 32 ++----------------------
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S | 29 +++------------------
 2 files changed, 6 insertions(+), 55 deletions(-)

diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
index 85c4e1c..672eaeb 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
@@ -86,16 +86,6 @@
 #define extra_blocks    %arg2
 #define p               %arg2
 
-
-# STACK_SPACE needs to be an odd multiple of 8
-_XMM_SAVE_SIZE  = 10*16
-_GPR_SAVE_SIZE  = 8*8
-_ALIGN_SIZE     = 8
-
-_XMM_SAVE       = 0
-_GPR_SAVE       = _XMM_SAVE + _XMM_SAVE_SIZE
-STACK_SPACE     = _GPR_SAVE + _GPR_SAVE_SIZE + _ALIGN_SIZE
-
 .macro LABEL prefix n
 \prefix\n\():
 .endm
@@ -113,16 +103,7 @@ offset = \_offset
 # JOB* sha1_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
 ENTRY(sha1_mb_mgr_flush_avx2)
-	mov	%rsp, %r10
-	sub     $STACK_SPACE, %rsp
-	and     $~31, %rsp
-	mov     %rbx, _GPR_SAVE(%rsp)
-	mov     %r10, _GPR_SAVE+8*1(%rsp) #save rsp
-	mov	%rbp, _GPR_SAVE+8*3(%rsp)
-	mov	%r12, _GPR_SAVE+8*4(%rsp)
-	mov	%r13, _GPR_SAVE+8*5(%rsp)
-	mov	%r14, _GPR_SAVE+8*6(%rsp)
-	mov	%r15, _GPR_SAVE+8*7(%rsp)
+	push	%rbx
 
 	# If bit (32+3) is set, then all lanes are empty
 	mov     _unused_lanes(state), unused_lanes
@@ -230,16 +211,7 @@ len_is_0:
 	mov     tmp2_w, offset(job_rax)
 
 return:
-
-	mov     _GPR_SAVE(%rsp), %rbx
-	mov     _GPR_SAVE+8*1(%rsp), %r10 #saved rsp
-	mov	_GPR_SAVE+8*3(%rsp), %rbp
-	mov	_GPR_SAVE+8*4(%rsp), %r12
-	mov	_GPR_SAVE+8*5(%rsp), %r13
-	mov	_GPR_SAVE+8*6(%rsp), %r14
-	mov	_GPR_SAVE+8*7(%rsp), %r15
-	mov     %r10, %rsp
-
+	pop	%rbx
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index 2ab9560..a5a14c62 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -94,25 +94,12 @@ DWORD_tmp	= %r9d
 
 lane_data       = %r10
 
-# STACK_SPACE needs to be an odd multiple of 8
-STACK_SPACE     = 8*8 + 16*10 + 8
-
 # JOB* submit_mb_mgr_submit_avx2(MB_MGR *state, job_sha1 *job)
 # arg 1 : rcx : state
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
-
-	mov	%rsp, %r10
-	sub     $STACK_SPACE, %rsp
-	and	$~31, %rsp
-
-	mov     %rbx, (%rsp)
-	mov	%r10, 8*2(%rsp)	#save old rsp
-	mov     %rbp, 8*3(%rsp)
-	mov	%r12, 8*4(%rsp)
-	mov	%r13, 8*5(%rsp)
-	mov	%r14, 8*6(%rsp)
-	mov	%r15, 8*7(%rsp)
+	push	%rbx
+	push	%rbp
 
 	mov     _unused_lanes(state), unused_lanes
 	mov	unused_lanes, lane
@@ -203,16 +190,8 @@ len_is_0:
 	movl    DWORD_tmp, _result_digest+1*16(job_rax)
 
 return:
-
-	mov     (%rsp), %rbx
-	mov	8*2(%rsp), %r10	#save old rsp
-	mov     8*3(%rsp), %rbp
-	mov	8*4(%rsp), %r12
-	mov	8*5(%rsp), %r13
-	mov	8*6(%rsp), %r14
-	mov	8*7(%rsp), %r15
-	mov     %r10, %rsp
-
+	pop	%rbp
+	pop	%rbx
 	ret
 
 return_null:
-- 
2.4.3

* [PATCH 14/33] x86/asm/crypto: Don't use rbp as a scratch register
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (12 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 13/33] x86/asm/crypto: Simplify stack usage in sha-mb functions Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:59   ` [tip:x86/debug] x86/asm/crypto: Don't use RBP " tip-bot for Josh Poimboeuf
  2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 15/33] x86/asm/crypto: Create stack frames in crypto functions Josh Poimboeuf
                   ` (21 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo

The frame pointer (rbp) is getting clobbered in
sha1_mb_mgr_submit_avx2() before a function call, which can mess up
stack traces.  Use r12 instead.
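
For illustration only (not part of the patch), a simplified model of
why a clobbered frame pointer hurts.  A frame-pointer unwinder follows
a chain of saved frame pointers starting at rbp.  All names below are
hypothetical, and a clobbered rbp is modelled as NULL; in reality it
would hold an arbitrary scratch value.

```c
#include <stddef.h>

/* Simplified frame record: saved caller frame pointer + return site. */
struct demo_frame {
	const struct demo_frame *caller_fp;
	const char *return_site;
};

/* Follow the chain from fp and record up to max return sites. */
size_t demo_unwind(const struct demo_frame *fp, const char **trace,
		   size_t max)
{
	size_t depth = 0;

	while (fp != NULL && depth < max) {
		trace[depth++] = fp->return_site;
		fp = fp->caller_fp;
	}
	return depth;
}

/* Build a three-deep call chain and unwind it. */
size_t demo_trace_depth(int rbp_clobbered)
{
	const struct demo_frame outer = { NULL, "outer" };
	const struct demo_frame middle = { &outer, "middle" };
	const struct demo_frame inner = { &middle, "inner" };
	const char *trace[8];
	/* A scratch value in rbp no longer points at a frame record. */
	const struct demo_frame *fp = rbp_clobbered ? NULL : &inner;

	return demo_unwind(fp, trace, 8);
}
```

With an intact chain the model yields three entries; with the starting
pointer lost it yields none, which corresponds to a truncated or bogus
stack trace.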

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index a5a14c62..c3b9447 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -86,8 +86,8 @@ job_rax         = %rax
 len             = %rax
 DWORD_len	= %eax
 
-lane            = %rbp
-tmp3            = %rbp
+lane            = %r12
+tmp3            = %r12
 
 tmp             = %r9
 DWORD_tmp	= %r9d
@@ -99,7 +99,7 @@ lane_data       = %r10
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
 	push	%rbx
-	push	%rbp
+	push	%r12
 
 	mov     _unused_lanes(state), unused_lanes
 	mov	unused_lanes, lane
@@ -190,7 +190,7 @@ len_is_0:
 	movl    DWORD_tmp, _result_digest+1*16(job_rax)
 
 return:
-	pop	%rbp
+	pop	%r12
 	pop	%rbx
 	ret
 
-- 
2.4.3

* [PATCH 15/33] x86/asm/crypto: Create stack frames in crypto functions
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (13 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 14/33] x86/asm/crypto: Don't use rbp as a scratch register Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  8:59   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 16/33] x86/asm/entry: Create stack frames in thunk functions Josh Poimboeuf
                   ` (20 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo, Herbert Xu,
	David S. Miller

The crypto code has several callable non-leaf functions which don't
honor CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.
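
For illustration only (not part of the patch), the 32-bit argument
offsets in the aesni hunks can be checked with a little arithmetic.
The helper demo_arg_offset() is hypothetical, and it assumes that
FRAME_OFFSET is one 4-byte word when CONFIG_FRAME_POINTER is enabled
and 0 otherwise (asm/frame.h is not shown in this message).

```c
/* Size of one stack slot on 32-bit x86. */
enum { DEMO_WORD = 4 };

/*
 * Offset from %esp of stack argument arg_index (0-based) after
 * nr_pushed registers have been pushed in the prologue.
 *
 * Stack layout, from %esp upwards:
 *   pushed registers, [saved %ebp if frame pointers are enabled],
 *   return address, arguments.
 */
int demo_arg_offset(int nr_pushed, int arg_index, int frame_pointer)
{
	/* Assumed value of FRAME_OFFSET: one word, or 0 without frames. */
	int frame_offset = frame_pointer ? DEMO_WORD : 0;

	/* The "1 +" accounts for the return address. */
	return frame_offset + DEMO_WORD * (1 + nr_pushed + arg_index);
}
```

For example, aesni_set_key pushes one register, so ctx is at 8(%esp)
without a frame and at 12(%esp) with one; aesni_cbc_enc pushes four
registers, so iv is at 36(%esp) without a frame.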

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
---
 arch/x86/crypto/aesni-intel_asm.S                | 73 +++++++++++++++---------
 arch/x86/crypto/camellia-aesni-avx-asm_64.S      | 15 +++++
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S     | 15 +++++
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S        |  9 +++
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S        | 13 +++++
 arch/x86/crypto/ghash-clmulni-intel_asm.S        |  5 ++
 arch/x86/crypto/serpent-avx-x86_64-asm_64.S      | 13 +++++
 arch/x86/crypto/serpent-avx2-asm_64.S            | 13 +++++
 arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S  |  3 +
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S |  3 +
 arch/x86/crypto/twofish-avx-x86_64-asm_64.S      | 13 +++++
 11 files changed, 148 insertions(+), 27 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index c44cfed..383a6f8 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -31,6 +31,7 @@
 
 #include <linux/linkage.h>
 #include <asm/inst.h>
+#include <asm/frame.h>
 
 /*
  * The following macros are used to move an (un)aligned 16 byte value to/from
@@ -1800,11 +1801,12 @@ ENDPROC(_key_expansion_256b)
  *                   unsigned int key_len)
  */
 ENTRY(aesni_set_key)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
-	movl 8(%esp), KEYP		# ctx
-	movl 12(%esp), UKEYP		# in_key
-	movl 16(%esp), %edx		# key_len
+	movl (FRAME_OFFSET+8)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+12)(%esp), UKEYP	# in_key
+	movl (FRAME_OFFSET+16)(%esp), %edx	# key_len
 #endif
 	movups (UKEYP), %xmm0		# user key (first 16 bytes)
 	movaps %xmm0, (KEYP)
@@ -1905,6 +1907,7 @@ ENTRY(aesni_set_key)
 #ifndef __x86_64__
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_set_key)
 
@@ -1912,12 +1915,13 @@ ENDPROC(aesni_set_key)
  * void aesni_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  */
 ENTRY(aesni_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
 	pushl KLEN
-	movl 12(%esp), KEYP
-	movl 16(%esp), OUTP
-	movl 20(%esp), INP
+	movl (FRAME_OFFSET+12)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+16)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+20)(%esp), INP	# src
 #endif
 	movl 480(KEYP), KLEN		# key length
 	movups (INP), STATE		# input
@@ -1927,6 +1931,7 @@ ENTRY(aesni_enc)
 	popl KLEN
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_enc)
 
@@ -2101,12 +2106,13 @@ ENDPROC(_aesni_enc4)
  * void aesni_dec (struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  */
 ENTRY(aesni_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
 	pushl KLEN
-	movl 12(%esp), KEYP
-	movl 16(%esp), OUTP
-	movl 20(%esp), INP
+	movl (FRAME_OFFSET+12)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+16)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+20)(%esp), INP	# src
 #endif
 	mov 480(KEYP), KLEN		# key length
 	add $240, KEYP
@@ -2117,6 +2123,7 @@ ENTRY(aesni_dec)
 	popl KLEN
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_dec)
 
@@ -2292,14 +2299,15 @@ ENDPROC(_aesni_dec4)
  *		      size_t len)
  */
 ENTRY(aesni_ecb_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 16(%esp), KEYP
-	movl 20(%esp), OUTP
-	movl 24(%esp), INP
-	movl 28(%esp), LEN
+	movl (FRAME_OFFSET+16)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+20)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+24)(%esp), INP	# src
+	movl (FRAME_OFFSET+28)(%esp), LEN	# len
 #endif
 	test LEN, LEN		# check length
 	jz .Lecb_enc_ret
@@ -2342,6 +2350,7 @@ ENTRY(aesni_ecb_enc)
 	popl KEYP
 	popl LEN
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_ecb_enc)
 
@@ -2350,14 +2359,15 @@ ENDPROC(aesni_ecb_enc)
  *		      size_t len);
  */
 ENTRY(aesni_ecb_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 16(%esp), KEYP
-	movl 20(%esp), OUTP
-	movl 24(%esp), INP
-	movl 28(%esp), LEN
+	movl (FRAME_OFFSET+16)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+20)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+24)(%esp), INP	# src
+	movl (FRAME_OFFSET+28)(%esp), LEN	# len
 #endif
 	test LEN, LEN
 	jz .Lecb_dec_ret
@@ -2401,6 +2411,7 @@ ENTRY(aesni_ecb_dec)
 	popl KEYP
 	popl LEN
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_ecb_dec)
 
@@ -2409,16 +2420,17 @@ ENDPROC(aesni_ecb_dec)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_cbc_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl IVP
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 20(%esp), KEYP
-	movl 24(%esp), OUTP
-	movl 28(%esp), INP
-	movl 32(%esp), LEN
-	movl 36(%esp), IVP
+	movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+28)(%esp), INP	# src
+	movl (FRAME_OFFSET+32)(%esp), LEN	# len
+	movl (FRAME_OFFSET+36)(%esp), IVP	# iv
 #endif
 	cmp $16, LEN
 	jb .Lcbc_enc_ret
@@ -2443,6 +2455,7 @@ ENTRY(aesni_cbc_enc)
 	popl LEN
 	popl IVP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_cbc_enc)
 
@@ -2451,16 +2464,17 @@ ENDPROC(aesni_cbc_enc)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_cbc_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl IVP
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 20(%esp), KEYP
-	movl 24(%esp), OUTP
-	movl 28(%esp), INP
-	movl 32(%esp), LEN
-	movl 36(%esp), IVP
+	movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+28)(%esp), INP	# src
+	movl (FRAME_OFFSET+32)(%esp), LEN	# len
+	movl (FRAME_OFFSET+36)(%esp), IVP	# iv
 #endif
 	cmp $16, LEN
 	jb .Lcbc_dec_just_ret
@@ -2534,6 +2548,7 @@ ENTRY(aesni_cbc_dec)
 	popl LEN
 	popl IVP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_cbc_dec)
 
@@ -2600,6 +2615,7 @@ ENDPROC(_aesni_inc)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_ctr_enc)
+	FRAME_BEGIN
 	cmp $16, LEN
 	jb .Lctr_enc_just_ret
 	mov 480(KEYP), KLEN
@@ -2653,6 +2669,7 @@ ENTRY(aesni_ctr_enc)
 .Lctr_enc_ret:
 	movups IV, (IVP)
 .Lctr_enc_just_ret:
+	FRAME_END
 	ret
 ENDPROC(aesni_ctr_enc)
 
@@ -2679,6 +2696,7 @@ ENDPROC(aesni_ctr_enc)
  *			 bool enc, u8 *iv)
  */
 ENTRY(aesni_xts_crypt8)
+	FRAME_BEGIN
 	cmpb $0, %cl
 	movl $0, %ecx
 	movl $240, %r10d
@@ -2779,6 +2797,7 @@ ENTRY(aesni_xts_crypt8)
 	pxor INC, STATE4
 	movdqu STATE4, 0x70(OUTP)
 
+	FRAME_END
 	ret
 ENDPROC(aesni_xts_crypt8)
 
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index ce71f92..aa9e8bd 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -16,6 +16,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 #define CAMELLIA_TABLE_BYTE_LEN 272
 
@@ -726,6 +727,7 @@ __camellia_enc_blk16:
 	 *	%xmm0..%xmm15: 16 encrypted blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 16(%rax), %rcx;
 
@@ -780,6 +782,7 @@ __camellia_enc_blk16:
 		    %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
 		    %xmm15, (key_table)(CTX, %r8, 8), (%rax), 1 * 16(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -812,6 +815,7 @@ __camellia_dec_blk16:
 	 *	%xmm0..%xmm15: 16 plaintext blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 16(%rax), %rcx;
 
@@ -865,6 +869,7 @@ __camellia_dec_blk16:
 		    %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
 		    %xmm15, (key_table)(CTX), (%rax), 1 * 16(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -890,6 +895,7 @@ ENTRY(camellia_ecb_enc_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	 FRAME_BEGIN
 
 	inpack16_pre(%xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
 		     %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
@@ -904,6 +910,7 @@ ENTRY(camellia_ecb_enc_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_enc_16way)
 
@@ -913,6 +920,7 @@ ENTRY(camellia_ecb_dec_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	 FRAME_BEGIN
 
 	cmpl $16, key_length(CTX);
 	movl $32, %r8d;
@@ -932,6 +940,7 @@ ENTRY(camellia_ecb_dec_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_dec_16way)
 
@@ -941,6 +950,7 @@ ENTRY(camellia_cbc_dec_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	FRAME_BEGIN
 
 	cmpl $16, key_length(CTX);
 	movl $32, %r8d;
@@ -981,6 +991,7 @@ ENTRY(camellia_cbc_dec_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_cbc_dec_16way)
 
@@ -997,6 +1008,7 @@ ENTRY(camellia_ctr_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
@@ -1092,6 +1104,7 @@ ENTRY(camellia_ctr_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ctr_16way)
 
@@ -1112,6 +1125,7 @@ camellia_xts_crypt_16way:
 	 *	%r8: index for input whitening key
 	 *	%r9: pointer to  __camellia_enc_blk16 or __camellia_dec_blk16
 	 */
+	FRAME_BEGIN
 
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
@@ -1234,6 +1248,7 @@ camellia_xts_crypt_16way:
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_xts_crypt_16way)
 
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 0e0b886..16186c1 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -11,6 +11,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 #define CAMELLIA_TABLE_BYTE_LEN 272
 
@@ -766,6 +767,7 @@ __camellia_enc_blk32:
 	 *	%ymm0..%ymm15: 32 encrypted blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 32(%rax), %rcx;
 
@@ -820,6 +822,7 @@ __camellia_enc_blk32:
 		    %ymm8, %ymm9, %ymm10, %ymm11, %ymm12, %ymm13, %ymm14,
 		    %ymm15, (key_table)(CTX, %r8, 8), (%rax), 1 * 32(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -852,6 +855,7 @@ __camellia_dec_blk32:
 	 *	%ymm0..%ymm15: 16 plaintext blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 32(%rax), %rcx;
 
@@ -905,6 +909,7 @@ __camellia_dec_blk32:
 		    %ymm8, %ymm9, %ymm10, %ymm11, %ymm12, %ymm13, %ymm14,
 		    %ymm15, (key_table)(CTX), (%rax), 1 * 32(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -930,6 +935,7 @@ ENTRY(camellia_ecb_enc_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -948,6 +954,7 @@ ENTRY(camellia_ecb_enc_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_enc_32way)
 
@@ -957,6 +964,7 @@ ENTRY(camellia_ecb_dec_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -980,6 +988,7 @@ ENTRY(camellia_ecb_dec_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_dec_32way)
 
@@ -989,6 +998,7 @@ ENTRY(camellia_cbc_dec_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1046,6 +1056,7 @@ ENTRY(camellia_cbc_dec_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_cbc_dec_32way)
 
@@ -1070,6 +1081,7 @@ ENTRY(camellia_ctr_32way)
 	 *	%rdx: src (32 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1184,6 +1196,7 @@ ENTRY(camellia_ctr_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ctr_32way)
 
@@ -1216,6 +1229,7 @@ camellia_xts_crypt_32way:
 	 *	%r8: index for input whitening key
 	 *	%r9: pointer to  __camellia_enc_blk32 or __camellia_dec_blk32
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1349,6 +1363,7 @@ camellia_xts_crypt_32way:
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_xts_crypt_32way)
 
diff --git a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
index c35fd5d..14fa196 100644
--- a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 .file "cast5-avx-x86_64-asm_64.S"
 
@@ -365,6 +366,7 @@ ENTRY(cast5_ecb_enc_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -388,6 +390,7 @@ ENTRY(cast5_ecb_enc_16way)
 	vmovdqu RR4, (6*4*4)(%r11);
 	vmovdqu RL4, (7*4*4)(%r11);
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ecb_enc_16way)
 
@@ -398,6 +401,7 @@ ENTRY(cast5_ecb_dec_16way)
 	 *	%rdx: src
 	 */
 
+	FRAME_BEGIN
 	movq %rsi, %r11;
 
 	vmovdqu (0*4*4)(%rdx), RL1;
@@ -420,6 +424,7 @@ ENTRY(cast5_ecb_dec_16way)
 	vmovdqu RR4, (6*4*4)(%r11);
 	vmovdqu RL4, (7*4*4)(%r11);
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ecb_dec_16way)
 
@@ -429,6 +434,7 @@ ENTRY(cast5_cbc_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -469,6 +475,7 @@ ENTRY(cast5_cbc_dec_16way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_cbc_dec_16way)
 
@@ -479,6 +486,7 @@ ENTRY(cast5_ctr_16way)
 	 *	%rdx: src
 	 *	%rcx: iv (big endian, 64bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -542,5 +550,6 @@ ENTRY(cast5_ctr_16way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ctr_16way)
diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index e3531f8..c419389 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "cast6-avx-x86_64-asm_64.S"
@@ -349,6 +350,7 @@ ENTRY(cast6_ecb_enc_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -358,6 +360,7 @@ ENTRY(cast6_ecb_enc_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ecb_enc_8way)
 
@@ -367,6 +370,7 @@ ENTRY(cast6_ecb_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -376,6 +380,7 @@ ENTRY(cast6_ecb_dec_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ecb_dec_8way)
 
@@ -385,6 +390,7 @@ ENTRY(cast6_cbc_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -399,6 +405,7 @@ ENTRY(cast6_cbc_dec_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_cbc_dec_8way)
 
@@ -409,6 +416,7 @@ ENTRY(cast6_ctr_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -424,6 +432,7 @@ ENTRY(cast6_ctr_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ctr_8way)
 
@@ -434,6 +443,7 @@ ENTRY(cast6_xts_enc_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -446,6 +456,7 @@ ENTRY(cast6_xts_enc_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_xts_enc_8way)
 
@@ -456,6 +467,7 @@ ENTRY(cast6_xts_dec_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -468,5 +480,6 @@ ENTRY(cast6_xts_dec_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_xts_dec_8way)
diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index 5d1e007..eed55c8 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -18,6 +18,7 @@
 
 #include <linux/linkage.h>
 #include <asm/inst.h>
+#include <asm/frame.h>
 
 .data
 
@@ -94,6 +95,7 @@ ENDPROC(__clmul_gf128mul_ble)
 
 /* void clmul_ghash_mul(char *dst, const u128 *shash) */
 ENTRY(clmul_ghash_mul)
+	FRAME_BEGIN
 	movups (%rdi), DATA
 	movups (%rsi), SHASH
 	movaps .Lbswap_mask, BSWAP
@@ -101,6 +103,7 @@ ENTRY(clmul_ghash_mul)
 	call __clmul_gf128mul_ble
 	PSHUFB_XMM BSWAP DATA
 	movups DATA, (%rdi)
+	FRAME_END
 	ret
 ENDPROC(clmul_ghash_mul)
 
@@ -109,6 +112,7 @@ ENDPROC(clmul_ghash_mul)
  *			   const u128 *shash);
  */
 ENTRY(clmul_ghash_update)
+	FRAME_BEGIN
 	cmp $16, %rdx
 	jb .Lupdate_just_ret	# check length
 	movaps .Lbswap_mask, BSWAP
@@ -128,5 +132,6 @@ ENTRY(clmul_ghash_update)
 	PSHUFB_XMM BSWAP DATA
 	movups DATA, (%rdi)
 .Lupdate_just_ret:
+	FRAME_END
 	ret
 ENDPROC(clmul_ghash_update)
diff --git a/arch/x86/crypto/serpent-avx-x86_64-asm_64.S b/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
index 2f202f4..8be5718 100644
--- a/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "serpent-avx-x86_64-asm_64.S"
@@ -681,6 +682,7 @@ ENTRY(serpent_ecb_enc_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -688,6 +690,7 @@ ENTRY(serpent_ecb_enc_8way_avx)
 
 	store_8way(%rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_enc_8way_avx)
 
@@ -697,6 +700,7 @@ ENTRY(serpent_ecb_dec_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -704,6 +708,7 @@ ENTRY(serpent_ecb_dec_8way_avx)
 
 	store_8way(%rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_dec_8way_avx)
 
@@ -713,6 +718,7 @@ ENTRY(serpent_cbc_dec_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -720,6 +726,7 @@ ENTRY(serpent_cbc_dec_8way_avx)
 
 	store_cbc_8way(%rdx, %rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_cbc_dec_8way_avx)
 
@@ -730,6 +737,7 @@ ENTRY(serpent_ctr_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	load_ctr_8way(%rcx, .Lbswap128_mask, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
 		      RD2, RK0, RK1, RK2);
@@ -738,6 +746,7 @@ ENTRY(serpent_ctr_8way_avx)
 
 	store_ctr_8way(%rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ctr_8way_avx)
 
@@ -748,6 +757,7 @@ ENTRY(serpent_xts_enc_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	/* regs <= src, dst <= IVs, regs <= regs xor IVs */
 	load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
@@ -758,6 +768,7 @@ ENTRY(serpent_xts_enc_8way_avx)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_enc_8way_avx)
 
@@ -768,6 +779,7 @@ ENTRY(serpent_xts_dec_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	/* regs <= src, dst <= IVs, regs <= regs xor IVs */
 	load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
@@ -778,5 +790,6 @@ ENTRY(serpent_xts_dec_8way_avx)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_dec_8way_avx)
diff --git a/arch/x86/crypto/serpent-avx2-asm_64.S b/arch/x86/crypto/serpent-avx2-asm_64.S
index b222085..97c48ad 100644
--- a/arch/x86/crypto/serpent-avx2-asm_64.S
+++ b/arch/x86/crypto/serpent-avx2-asm_64.S
@@ -15,6 +15,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx2.S"
 
 .file "serpent-avx2-asm_64.S"
@@ -673,6 +674,7 @@ ENTRY(serpent_ecb_enc_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -684,6 +686,7 @@ ENTRY(serpent_ecb_enc_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_enc_16way)
 
@@ -693,6 +696,7 @@ ENTRY(serpent_ecb_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -704,6 +708,7 @@ ENTRY(serpent_ecb_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_dec_16way)
 
@@ -713,6 +718,7 @@ ENTRY(serpent_cbc_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -725,6 +731,7 @@ ENTRY(serpent_cbc_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_cbc_dec_16way)
 
@@ -735,6 +742,7 @@ ENTRY(serpent_ctr_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -748,6 +756,7 @@ ENTRY(serpent_ctr_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ctr_16way)
 
@@ -758,6 +767,7 @@ ENTRY(serpent_xts_enc_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -772,6 +782,7 @@ ENTRY(serpent_xts_enc_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_enc_16way)
 
@@ -782,6 +793,7 @@ ENTRY(serpent_xts_dec_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -796,5 +808,6 @@ ENTRY(serpent_xts_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_dec_16way)
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
index 672eaeb..96df6a3 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
@@ -52,6 +52,7 @@
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "sha1_mb_mgr_datastruct.S"
 
 
@@ -103,6 +104,7 @@ offset = \_offset
 # JOB* sha1_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
 ENTRY(sha1_mb_mgr_flush_avx2)
+	FRAME_BEGIN
 	push	%rbx
 
 	# If bit (32+3) is set, then all lanes are empty
@@ -212,6 +214,7 @@ len_is_0:
 
 return:
 	pop	%rbx
+	FRAME_END
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index c3b9447..1435acf 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -53,6 +53,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "sha1_mb_mgr_datastruct.S"
 
 
@@ -98,6 +99,7 @@ lane_data       = %r10
 # arg 1 : rcx : state
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
+	FRAME_BEGIN
 	push	%rbx
 	push	%r12
 
@@ -192,6 +194,7 @@ len_is_0:
 return:
 	pop	%r12
 	pop	%rbx
+	FRAME_END
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/twofish-avx-x86_64-asm_64.S b/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
index 0505813..dc66273 100644
--- a/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "twofish-avx-x86_64-asm_64.S"
@@ -333,6 +334,7 @@ ENTRY(twofish_ecb_enc_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -342,6 +344,7 @@ ENTRY(twofish_ecb_enc_8way)
 
 	store_8way(%r11, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ecb_enc_8way)
 
@@ -351,6 +354,7 @@ ENTRY(twofish_ecb_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -360,6 +364,7 @@ ENTRY(twofish_ecb_dec_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ecb_dec_8way)
 
@@ -369,6 +374,7 @@ ENTRY(twofish_cbc_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -383,6 +389,7 @@ ENTRY(twofish_cbc_dec_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_cbc_dec_8way)
 
@@ -393,6 +400,7 @@ ENTRY(twofish_ctr_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -408,6 +416,7 @@ ENTRY(twofish_ctr_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ctr_8way)
 
@@ -418,6 +427,7 @@ ENTRY(twofish_xts_enc_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -430,6 +440,7 @@ ENTRY(twofish_xts_enc_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_xts_enc_8way)
 
@@ -440,6 +451,7 @@ ENTRY(twofish_xts_dec_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -452,5 +464,6 @@ ENTRY(twofish_xts_dec_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_xts_dec_8way)
-- 
2.4.3

* [PATCH 16/33] x86/asm/entry: Create stack frames in thunk functions
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (14 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 15/33] x86/asm/crypto: Create stack frames in crypto functions Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:00   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 17/33] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel() Josh Poimboeuf
                   ` (19 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Borislav Petkov

Thunk functions are callable non-leaf functions that don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.  Also, they
aren't annotated as ELF callable functions, which can confuse tooling.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled and
add the ELF function type.
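
To illustrate what a "bad stack trace" can look like here, the following
is a toy C model of a frame-pointer unwinder, not kernel code.  The stack
layout, the fp_* names, and the function IDs are all assumptions made for
illustration.  It models FRAME_BEGIN as "push %rbp; mov %rsp, %rbp", which
is what the macro is understood to emit when CONFIG_FRAME_POINTER is
enabled.  When the middle function skips that step, the unwinder follows
the saved %rbp straight past the middle function's caller, so that caller
drops out of the trace.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stack: one slot per 8-byte word, growing toward lower indices. */
#define FP_STACK_WORDS 64

/* Hypothetical IDs standing in for "a return address inside function X". */
enum fp_func_id { FP_OUTER = 1, FP_CALLER = 2, FP_THUNK = 3 };

struct fp_cpu {
	uint64_t stack[FP_STACK_WORDS];
	size_t rsp;	/* index of the top-of-stack slot */
	size_t rbp;	/* index of the current frame's saved-rbp slot */
};

static void fp_push(struct fp_cpu *c, uint64_t v)
{
	c->stack[--c->rsp] = v;
}

/* "call": push a return address that points back into the caller. */
static void fp_call(struct fp_cpu *c, enum fp_func_id caller)
{
	fp_push(c, (uint64_t)caller);
}

/* FRAME_BEGIN model: push %rbp; mov %rsp, %rbp */
static void fp_frame_begin(struct fp_cpu *c)
{
	fp_push(c, (uint64_t)c->rbp);
	c->rbp = c->rsp;
}

/* Walk the chain: stack[bp] is the saved rbp, stack[bp + 1] the return address. */
static size_t fp_unwind(const struct fp_cpu *c, uint64_t *out, size_t max)
{
	size_t n = 0;
	size_t bp = c->rbp;

	while (bp + 1 < FP_STACK_WORDS && n < max) {
		out[n++] = c->stack[bp + 1];
		bp = (size_t)c->stack[bp];
	}
	return n;
}

/*
 * Build the call chain outer -> caller -> thunk -> callee and unwind from
 * the callee.  thunk_has_frame selects whether the thunk runs FRAME_BEGIN.
 */
size_t fp_trace(int thunk_has_frame, uint64_t *out, size_t max)
{
	/* rbp == FP_STACK_WORDS is the "no more frames" sentinel. */
	struct fp_cpu c = { .rsp = FP_STACK_WORDS, .rbp = FP_STACK_WORDS };

	fp_call(&c, FP_OUTER);		/* outer calls caller */
	fp_frame_begin(&c);		/* caller sets up its frame */
	fp_call(&c, FP_CALLER);		/* caller calls thunk */
	if (thunk_has_frame)
		fp_frame_begin(&c);	/* thunk honors frame pointers */
	fp_call(&c, FP_THUNK);		/* thunk calls callee */
	fp_frame_begin(&c);		/* callee sets up its frame */
	return fp_unwind(&c, out, max);
}
```

In this model, fp_trace(1, ...) records FP_THUNK, FP_CALLER, FP_OUTER,
while fp_trace(0, ...) records only FP_THUNK, FP_OUTER: the thunk itself
still shows up, but the function that called it is lost.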

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/entry/thunk_64.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index efb2b93..98df1fa 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -8,11 +8,14 @@
 #include <linux/linkage.h>
 #include "calling.h"
 #include <asm/asm.h>
+#include <asm/frame.h>
 
 	/* rdi:	arg1 ... normal C conventions. rax is saved/restored. */
 	.macro THUNK name, func, put_ret_addr_in_rdi=0
 	.globl \name
+	.type \name, @function
 \name:
+	FRAME_BEGIN
 
 	/* this one pushes 9 elems, the next one would be %rIP */
 	pushq %rdi
@@ -62,6 +65,7 @@ restore:
 	popq %rdx
 	popq %rsi
 	popq %rdi
+	FRAME_END
 	ret
 	_ASM_NOKPROBE(restore)
 #endif
-- 
2.4.3

* [PATCH 17/33] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (15 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 16/33] x86/asm/entry: Create stack frames in thunk functions Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:00   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 18/33] x86/asm: Create stack frames in rwsem functions Josh Poimboeuf
                   ` (18 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Pavel Machek, Rafael J. Wysocki, Borislav Petkov, Len Brown

do_suspend_lowlevel() is a callable non-leaf function which doesn't
honor CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame for it when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Len Brown <len.brown@intel.com>
---
 arch/x86/kernel/acpi/wakeup_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 8c35df4..169963f 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -5,6 +5,7 @@
 #include <asm/page_types.h>
 #include <asm/msr.h>
 #include <asm/asm-offsets.h>
+#include <asm/frame.h>
 
 # Copyright 2003 Pavel Machek <pavel@suse.cz>, distribute under GPLv2
 
@@ -39,6 +40,7 @@ bogus_64_magic:
 	jmp	bogus_64_magic
 
 ENTRY(do_suspend_lowlevel)
+	FRAME_BEGIN
 	subq	$8, %rsp
 	xorl	%eax, %eax
 	call	save_processor_state
@@ -109,6 +111,7 @@ ENTRY(do_suspend_lowlevel)
 
 	xorl	%eax, %eax
 	addq	$8, %rsp
+	FRAME_END
 	jmp	restore_processor_state
 ENDPROC(do_suspend_lowlevel)
 
-- 
2.4.3

* [PATCH 18/33] x86/asm: Create stack frames in rwsem functions
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (16 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 17/33] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel() Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 19/33] x86/asm/efi: Create a stack frame in efi_call() Josh Poimboeuf
                   ` (17 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Borislav Petkov

rwsem.S has several callable non-leaf functions which don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.
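
One detail in the diff below is that the "1:" label in call_rwsem_wake
moves from "ret" to FRAME_END, so the early-exit branch also undoes the
frame.  The following toy C model, not kernel code, sketches why that
matters.  The exit_* names and the two constants are hypothetical, and
FRAME_BEGIN/FRAME_END are modeled as "push %rbp; mov %rsp, %rbp" and
"pop %rbp" on the assumption that CONFIG_FRAME_POINTER is enabled.

```c
#include <stddef.h>
#include <stdint.h>

#define EXIT_STACK_WORDS 16

/* Hypothetical marker values so the two stack slots can be told apart. */
#define EXIT_RETURN_ADDR 0x1111u
#define EXIT_CALLER_RBP  0x2222u

struct exit_cpu {
	uint64_t stack[EXIT_STACK_WORDS];
	size_t rsp;	/* index of the top-of-stack slot */
	uint64_t rbp;
};

static void exit_push(struct exit_cpu *c, uint64_t v)
{
	c->stack[--c->rsp] = v;
}

static uint64_t exit_pop(struct exit_cpu *c)
{
	return c->stack[c->rsp++];
}

/*
 * Model a function whose early-exit branch ("jnz 1f") is taken.
 *
 * label_before_frame_end != 0: the branch lands on "1: FRAME_END; ret".
 * label_before_frame_end == 0: the branch lands directly on "1: ret".
 *
 * Returns the value that "ret" pops and would jump to.
 */
uint64_t exit_run_early_path(int label_before_frame_end)
{
	struct exit_cpu c = { .rsp = EXIT_STACK_WORDS, .rbp = EXIT_CALLER_RBP };

	exit_push(&c, EXIT_RETURN_ADDR);	/* the caller's "call" */
	exit_push(&c, c.rbp);			/* FRAME_BEGIN: push %rbp */
	c.rbp = c.rsp;				/*              mov %rsp, %rbp */

	if (label_before_frame_end)
		c.rbp = exit_pop(&c);		/* FRAME_END: pop %rbp */

	return exit_pop(&c);			/* ret */
}
```

In this model, exit_run_early_path(1) returns EXIT_RETURN_ADDR, while
exit_run_early_path(0) returns EXIT_CALLER_RBP: with the label left on
"ret", the saved frame pointer would be consumed as the return address.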

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/lib/rwsem.S | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/rwsem.S b/arch/x86/lib/rwsem.S
index 40027db..be110ef 100644
--- a/arch/x86/lib/rwsem.S
+++ b/arch/x86/lib/rwsem.S
@@ -15,6 +15,7 @@
 
 #include <linux/linkage.h>
 #include <asm/alternative-asm.h>
+#include <asm/frame.h>
 
 #define __ASM_HALF_REG(reg)	__ASM_SEL(reg, e##reg)
 #define __ASM_HALF_SIZE(inst)	__ASM_SEL(inst##w, inst##l)
@@ -84,24 +85,29 @@
 
 /* Fix up special calling conventions */
 ENTRY(call_rwsem_down_read_failed)
+	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
 	movq %rax,%rdi
 	call rwsem_down_read_failed
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_down_read_failed)
 
 ENTRY(call_rwsem_down_write_failed)
+	FRAME_BEGIN
 	save_common_regs
 	movq %rax,%rdi
 	call rwsem_down_write_failed
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_down_write_failed)
 
 ENTRY(call_rwsem_wake)
+	FRAME_BEGIN
 	/* do nothing if still outstanding active readers */
 	__ASM_HALF_SIZE(dec) %__ASM_HALF_REG(dx)
 	jnz 1f
@@ -109,15 +115,18 @@ ENTRY(call_rwsem_wake)
 	movq %rax,%rdi
 	call rwsem_wake
 	restore_common_regs
-1:	ret
+1:	FRAME_END
+	ret
 ENDPROC(call_rwsem_wake)
 
 ENTRY(call_rwsem_downgrade_wake)
+	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
 	movq %rax,%rdi
 	call rwsem_downgrade_wake
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_downgrade_wake)
-- 
2.4.3

* [PATCH 19/33] x86/asm/efi: Create a stack frame in efi_call()
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (17 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 18/33] x86/asm: Create stack frames in rwsem functions Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 20/33] x86/asm/power: Create stack frames in hibernate_asm_64.S Josh Poimboeuf
                   ` (16 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Matt Fleming, Borislav Petkov

efi_call() is a callable non-leaf function which doesn't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame for it when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/platform/efi/efi_stub_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S
index 32020cb..92723ae 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -11,6 +11,7 @@
 #include <asm/msr.h>
 #include <asm/processor-flags.h>
 #include <asm/page_types.h>
+#include <asm/frame.h>
 
 #define SAVE_XMM			\
 	mov %rsp, %rax;			\
@@ -39,6 +40,7 @@
 	mov (%rsp), %rsp
 
 ENTRY(efi_call)
+	FRAME_BEGIN
 	SAVE_XMM
 	mov (%rsp), %rax
 	mov 8(%rax), %rax
@@ -51,5 +53,6 @@ ENTRY(efi_call)
 	call *%rdi
 	addq $48, %rsp
 	RESTORE_XMM
+	FRAME_END
 	ret
 ENDPROC(efi_call)
-- 
2.4.3

* [PATCH 20/33] x86/asm/power: Create stack frames in hibernate_asm_64.S
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (18 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 19/33] x86/asm/efi: Create a stack frame in efi_call() Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:50   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 21/33] x86/uaccess: Add stack frame output operand in get_user inline asm Josh Poimboeuf
                   ` (15 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Pavel Machek, Rafael J. Wysocki, Borislav Petkov

swsusp_arch_suspend() and restore_registers() are callable non-leaf
functions which don't honor CONFIG_FRAME_POINTER, which can result in
bad stack traces.  Also, they aren't annotated as ELF callable
functions, which can confuse tooling.

Create a stack frame for them when CONFIG_FRAME_POINTER is enabled and
give them proper ELF function annotations.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/power/hibernate_asm_64.S | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index e2386cb..4400a43 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -21,8 +21,10 @@
 #include <asm/page_types.h>
 #include <asm/asm-offsets.h>
 #include <asm/processor-flags.h>
+#include <asm/frame.h>
 
 ENTRY(swsusp_arch_suspend)
+	FRAME_BEGIN
 	movq	$saved_context, %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
@@ -50,7 +52,9 @@ ENTRY(swsusp_arch_suspend)
 	movq	%rax, restore_cr3(%rip)
 
 	call swsusp_save
+	FRAME_END
 	ret
+ENDPROC(swsusp_arch_suspend)
 
 ENTRY(restore_image)
 	/* switch to temporary page tables */
@@ -107,6 +111,7 @@ ENTRY(core_restore_code)
 	 */
 
 ENTRY(restore_registers)
+	FRAME_BEGIN
 	/* go back to the original page tables */
 	movq    %rbx, %cr3
 
@@ -147,4 +152,6 @@ ENTRY(restore_registers)
 	/* tell the hibernation core that we've just restored the memory */
 	movq	%rax, in_suspend(%rip)
 
+	FRAME_END
 	ret
+ENDPROC(restore_registers)
-- 
2.4.3

* [PATCH 21/33] x86/uaccess: Add stack frame output operand in get_user inline asm
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (19 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 20/33] x86/asm/power: Create stack frames in hibernate_asm_64.S Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:02   ` [tip:x86/debug] x86/uaccess: Add stack frame output operand in get_user() " tip-bot for Chris J Arges <tipbot@zytor.com>
  2016-02-25  5:50   ` tip-bot for Chris J Arges
  2016-01-21 22:49 ` [PATCH 22/33] x86/asm/bpf: Annotate callable functions Josh Poimboeuf
                   ` (14 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: Chris J Arges, linux-kernel, live-patching, Michal Marek,
	Peter Zijlstra, Andy Lutomirski, Borislav Petkov, Linus Torvalds,
	Andi Kleen, Pedro Alves, Namhyung Kim, Bernd Petrovitsch,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Borislav Petkov

From: Chris J Arges <chris.j.arges@canonical.com>

stacktool reports numerous 'call without frame pointer save/setup'
warnings for functions that use the get_user() macro.  Bad stack traces
could occur because the stack frame setup code is missing or misplaced.

When CONFIG_FRAME_POINTER is enabled, this patch forces a stack frame to
be created before the inline asm code by listing the stack pointer as an
output operand of the get_user() inline assembly statement.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/uaccess.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 660458a..2584134 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -176,10 +176,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 ({									\
 	int __ret_gu;							\
 	register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);		\
+	register void *__sp asm(_ASM_SP);				\
 	__chk_user_ptr(ptr);						\
 	might_fault();							\
-	asm volatile("call __get_user_%P3"				\
-		     : "=a" (__ret_gu), "=r" (__val_gu)			\
+	asm volatile("call __get_user_%P4"				\
+		     : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp)	\
 		     : "0" (ptr), "i" (sizeof(*(ptr))));		\
 	(x) = (__force __typeof__(*(ptr))) __val_gu;			\
 	__builtin_expect(__ret_gu, 0);					\
-- 
2.4.3

* [PATCH 22/33] x86/asm/bpf: Annotate callable functions
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (20 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 21/33] x86/uaccess: Add stack frame output operand in get_user inline asm Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:02   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:50   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S Josh Poimboeuf
                   ` (13 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, netdev

bpf_jit.S has several functions which can be called from C code.  Give
them proper ELF annotations.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: netdev@vger.kernel.org
---
 arch/x86/net/bpf_jit.S | 39 ++++++++++++++++-----------------------
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
index 4093216..eb4a3bd 100644
--- a/arch/x86/net/bpf_jit.S
+++ b/arch/x86/net/bpf_jit.S
@@ -22,15 +22,16 @@
 	32 /* space for rbx,r13,r14,r15 */ + \
 	8 /* space for skb_copy_bits */)
 
-sk_load_word:
-	.globl	sk_load_word
+#define FUNC(name) \
+	.globl name; \
+	.type name, @function; \
+	name:
 
+FUNC(sk_load_word)
 	test	%esi,%esi
 	js	bpf_slow_path_word_neg
 
-sk_load_word_positive_offset:
-	.globl	sk_load_word_positive_offset
-
+FUNC(sk_load_word_positive_offset)
 	mov	%r9d,%eax		# hlen
 	sub	%esi,%eax		# hlen - offset
 	cmp	$3,%eax
@@ -39,15 +40,11 @@ sk_load_word_positive_offset:
 	bswap   %eax  			/* ntohl() */
 	ret
 
-sk_load_half:
-	.globl	sk_load_half
-
+FUNC(sk_load_half)
 	test	%esi,%esi
 	js	bpf_slow_path_half_neg
 
-sk_load_half_positive_offset:
-	.globl	sk_load_half_positive_offset
-
+FUNC(sk_load_half_positive_offset)
 	mov	%r9d,%eax
 	sub	%esi,%eax		#	hlen - offset
 	cmp	$1,%eax
@@ -56,15 +53,11 @@ sk_load_half_positive_offset:
 	rol	$8,%ax			# ntohs()
 	ret
 
-sk_load_byte:
-	.globl	sk_load_byte
-
+FUNC(sk_load_byte)
 	test	%esi,%esi
 	js	bpf_slow_path_byte_neg
 
-sk_load_byte_positive_offset:
-	.globl	sk_load_byte_positive_offset
-
+FUNC(sk_load_byte_positive_offset)
 	cmp	%esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
 	jle	bpf_slow_path_byte
 	movzbl	(SKBDATA,%rsi),%eax
@@ -120,8 +113,8 @@ bpf_slow_path_byte:
 bpf_slow_path_word_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi	/* test range */
 	jl	bpf_error	/* offset lower -> error  */
-sk_load_word_negative_offset:
-	.globl	sk_load_word_negative_offset
+
+FUNC(sk_load_word_negative_offset)
 	sk_negative_common(4)
 	mov	(%rax), %eax
 	bswap	%eax
@@ -130,8 +123,8 @@ sk_load_word_negative_offset:
 bpf_slow_path_half_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
-sk_load_half_negative_offset:
-	.globl	sk_load_half_negative_offset
+
+FUNC(sk_load_half_negative_offset)
 	sk_negative_common(2)
 	mov	(%rax),%ax
 	rol	$8,%ax
@@ -141,8 +134,8 @@ sk_load_half_negative_offset:
 bpf_slow_path_byte_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
-sk_load_byte_negative_offset:
-	.globl	sk_load_byte_negative_offset
+
+FUNC(sk_load_byte_negative_offset)
 	sk_negative_common(1)
 	movzbl	(%rax), %eax
 	ret
-- 
2.4.3

* [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (21 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 22/33] x86/asm/bpf: Annotate callable functions Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-22  2:44   ` Alexei Starovoitov
                     ` (2 more replies)
  2016-01-21 22:49 ` [PATCH 24/33] x86/kprobes: Get rid of kretprobe_trampoline_holder() Josh Poimboeuf
                   ` (12 subsequent siblings)
  35 siblings, 3 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, netdev

bpf_jit.S has several callable non-leaf functions which don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame before the call instructions when
CONFIG_FRAME_POINTER is enabled.
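
To illustrate why a missing frame produces a bad trace, here is a small
user-space simulation.  It is not kernel code: the struct layout and all
demo_* names are hypothetical, and it only models the chain of saved
frame pointers that a frame pointer based unwinder follows.

```c
#include <stddef.h>

/* Hypothetical model of one stack frame: saved caller %rbp plus return address. */
struct demo_frame {
	const struct demo_frame *next;	/* saved frame pointer of the caller */
	const char *ret;		/* name of the function we return into */
};

/* Follow the frame pointer chain, recording who each frame returns into. */
static size_t demo_walk(const struct demo_frame *fp, const char **out, size_t max)
{
	size_t n = 0;

	while (fp && n < max) {
		out[n++] = fp->ret;
		fp = fp->next;
	}
	return n;
}

/*
 * Call chain main -> A -> B -> C, unwinding from inside C.
 * If B creates a frame, the walk reports B, A, main.
 * If B skips it, C's saved frame pointer still points at A's frame,
 * so A never shows up in the trace.
 */
static size_t demo_trace(int b_has_frame, const char **out, size_t max)
{
	struct demo_frame a = { NULL, "main" };	/* A's frame: returns into main */
	struct demo_frame b = { &a, "A" };	/* B's frame: returns into A */
	struct demo_frame c = { b_has_frame ? &b : &a, "B" }; /* C returns into B */

	return demo_walk(&c, out, max);
}
```

In this model the function that skips the frame (B) still appears, via
the return address, but its caller (A) is silently dropped.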

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: netdev@vger.kernel.org
---
 arch/x86/net/bpf_jit.S | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
index eb4a3bd..f2a7faf 100644
--- a/arch/x86/net/bpf_jit.S
+++ b/arch/x86/net/bpf_jit.S
@@ -8,6 +8,7 @@
  * of the License.
  */
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 /*
  * Calling convention :
@@ -65,16 +66,18 @@ FUNC(sk_load_byte_positive_offset)
 
 /* rsi contains offset and can be scratched */
 #define bpf_slow_path_common(LEN)		\
+	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
+	FRAME_BEGIN;				\
 	mov	%rbx, %rdi; /* arg1 == skb */	\
 	push	%r9;				\
 	push	SKBDATA;			\
 /* rsi already has offset */			\
 	mov	$LEN,%ecx;	/* len */	\
-	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
 	call	skb_copy_bits;			\
 	test    %eax,%eax;			\
 	pop	SKBDATA;			\
-	pop	%r9;
+	pop	%r9;				\
+	FRAME_END
 
 
 bpf_slow_path_word:
@@ -99,6 +102,7 @@ bpf_slow_path_byte:
 	ret
 
 #define sk_negative_common(SIZE)				\
+	FRAME_BEGIN;						\
 	mov	%rbx, %rdi; /* arg1 == skb */			\
 	push	%r9;						\
 	push	SKBDATA;					\
@@ -108,6 +112,7 @@ bpf_slow_path_byte:
 	test	%rax,%rax;					\
 	pop	SKBDATA;					\
 	pop	%r9;						\
+	FRAME_END;						\
 	jz	bpf_error
 
 bpf_slow_path_word_neg:
-- 
2.4.3


* [PATCH 24/33] x86/kprobes: Get rid of kretprobe_trampoline_holder()
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (22 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 23:42   ` 平松雅巳 / HIRAMATU,MASAMI
                     ` (2 more replies)
  2016-01-21 22:49 ` [PATCH 25/33] x86/kvm: Set ELF function type for fastop functions Josh Poimboeuf
                   ` (11 subsequent siblings)
  35 siblings, 3 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu

The kretprobe_trampoline_holder() wrapper around kretprobe_trampoline()
isn't used anywhere and adds some unnecessary frame pointer instructions
which never execute.  Instead, just make kretprobe_trampoline() a proper
ELF function.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---
 arch/x86/kernel/kprobes/core.c | 57 +++++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 1deffe6..5b187df 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -671,38 +671,37 @@ NOKPROBE_SYMBOL(kprobe_int3_handler);
  * When a retprobed function returns, this code saves registers and
  * calls trampoline_handler() runs, which calls the kretprobe's handler.
  */
-static void __used kretprobe_trampoline_holder(void)
-{
-	asm volatile (
-			".global kretprobe_trampoline\n"
-			"kretprobe_trampoline: \n"
+asm(
+	".global kretprobe_trampoline\n"
+	".type kretprobe_trampoline, @function\n"
+	"kretprobe_trampoline:\n"
 #ifdef CONFIG_X86_64
-			/* We don't bother saving the ss register */
-			"	pushq %rsp\n"
-			"	pushfq\n"
-			SAVE_REGS_STRING
-			"	movq %rsp, %rdi\n"
-			"	call trampoline_handler\n"
-			/* Replace saved sp with true return address. */
-			"	movq %rax, 152(%rsp)\n"
-			RESTORE_REGS_STRING
-			"	popfq\n"
+	/* We don't bother saving the ss register */
+	"	pushq %rsp\n"
+	"	pushfq\n"
+	SAVE_REGS_STRING
+	"	movq %rsp, %rdi\n"
+	"	call trampoline_handler\n"
+	/* Replace saved sp with true return address. */
+	"	movq %rax, 152(%rsp)\n"
+	RESTORE_REGS_STRING
+	"	popfq\n"
 #else
-			"	pushf\n"
-			SAVE_REGS_STRING
-			"	movl %esp, %eax\n"
-			"	call trampoline_handler\n"
-			/* Move flags to cs */
-			"	movl 56(%esp), %edx\n"
-			"	movl %edx, 52(%esp)\n"
-			/* Replace saved flags with true return address. */
-			"	movl %eax, 56(%esp)\n"
-			RESTORE_REGS_STRING
-			"	popf\n"
+	"	pushf\n"
+	SAVE_REGS_STRING
+	"	movl %esp, %eax\n"
+	"	call trampoline_handler\n"
+	/* Move flags to cs */
+	"	movl 56(%esp), %edx\n"
+	"	movl %edx, 52(%esp)\n"
+	/* Replace saved flags with true return address. */
+	"	movl %eax, 56(%esp)\n"
+	RESTORE_REGS_STRING
+	"	popf\n"
 #endif
-			"	ret\n");
-}
-NOKPROBE_SYMBOL(kretprobe_trampoline_holder);
+	"	ret\n"
+	".size kretprobe_trampoline, .-kretprobe_trampoline\n"
+);
 NOKPROBE_SYMBOL(kretprobe_trampoline);
 
 /*
-- 
2.4.3


* [PATCH 25/33] x86/kvm: Set ELF function type for fastop functions
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (23 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 24/33] x86/kprobes: Get rid of kretprobe_trampoline_holder() Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-22 10:05   ` Paolo Bonzini
                     ` (2 more replies)
  2016-01-21 22:49 ` [PATCH 26/33] x86/kvm: Add stack frame dependency to test_cc() inline asm Josh Poimboeuf
                   ` (10 subsequent siblings)
  35 siblings, 3 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Gleb Natapov, Paolo Bonzini, kvm

The callable functions created with the FOP* and FASTOP* macros are
missing ELF function annotations, which confuses tools like stacktool.
Properly annotate them.

This adds some additional labels to the assembly, but the generated
binary code is unchanged (with the exception of instructions which have
embedded references to __LINE__).
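
For illustration, the string pasting done by the new FOP_FUNC() macro
can be checked in user space.  The sketch below copies the shape of
FOP_FUNC(), FOP_RET and FOP2E() from the diff; DEMO_STR() is a local
stand-in for __stringify(), and the FASTOP_SIZE value of 8 is an
assumption made for the example.

```c
/* Local stand-ins; the value 8 is an assumption, not taken from the patch. */
#define DEMO_STR_1(x)	#x
#define DEMO_STR(x)	DEMO_STR_1(x)
#define FASTOP_SIZE	8

/* Same shape as the patch's FOP_FUNC(), FOP_RET and FOP2E(). */
#define FOP_FUNC(name) \
	".align " DEMO_STR(FASTOP_SIZE) " \n\t" \
	".type " name ", @function \n\t" \
	name ":\n\t"

#define FOP_RET   "ret \n\t"

#define FOP2E(op, dst, src) \
	FOP_FUNC(#op "_" #dst "_" #src) \
	#op " %" #src ", %" #dst " \n\t" FOP_RET

/* The assembler text that one two-operand fastop would contribute. */
static const char demo_fop2e_text[] = FOP2E(addb, al, dl);
```

Printing demo_fop2e_text shows an .align directive, a .type annotation
and a label named addb_al_dl in front of the unchanged "addb %dl, %al"
and "ret" instructions.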

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
---
 arch/x86/kvm/emulate.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 1505587..aa4d726 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -309,23 +309,29 @@ static void invalidate_registers(struct x86_emulate_ctxt *ctxt)
 
 static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 
-#define FOP_ALIGN ".align " __stringify(FASTOP_SIZE) " \n\t"
+#define FOP_FUNC(name) \
+	".align " __stringify(FASTOP_SIZE) " \n\t" \
+	".type " name ", @function \n\t" \
+	name ":\n\t"
+
 #define FOP_RET   "ret \n\t"
 
 #define FOP_START(op) \
 	extern void em_##op(struct fastop *fake); \
 	asm(".pushsection .text, \"ax\" \n\t" \
 	    ".global em_" #op " \n\t" \
-            FOP_ALIGN \
-	    "em_" #op ": \n\t"
+	    FOP_FUNC("em_" #op)
 
 #define FOP_END \
 	    ".popsection")
 
-#define FOPNOP() FOP_ALIGN FOP_RET
+#define FOPNOP() \
+	FOP_FUNC(__stringify(__UNIQUE_ID(nop))) \
+	FOP_RET
 
 #define FOP1E(op,  dst) \
-	FOP_ALIGN "10: " #op " %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst) \
+	"10: " #op " %" #dst " \n\t" FOP_RET
 
 #define FOP1EEX(op,  dst) \
 	FOP1E(op, dst) _ASM_EXTABLE(10b, kvm_fastop_exception)
@@ -357,7 +363,8 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 #define FOP2E(op,  dst, src)	   \
-	FOP_ALIGN #op " %" #src ", %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst "_" #src) \
+	#op " %" #src ", %" #dst " \n\t" FOP_RET
 
 #define FASTOP2(op) \
 	FOP_START(op) \
@@ -395,7 +402,8 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 #define FOP3E(op,  dst, src, src2) \
-	FOP_ALIGN #op " %" #src2 ", %" #src ", %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst "_" #src "_" #src2) \
+	#op " %" #src2 ", %" #src ", %" #dst " \n\t" FOP_RET
 
 /* 3-operand, word-only, src2=cl */
 #define FASTOP3WCL(op) \
@@ -407,7 +415,12 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 /* Special case for SETcc - 1 instruction per cc */
-#define FOP_SETCC(op) ".align 4; " #op " %al; ret \n\t"
+#define FOP_SETCC(op) \
+	".align 4 \n\t" \
+	".type " #op ", @function \n\t" \
+	#op ": \n\t" \
+	#op " %al \n\t" \
+	FOP_RET
 
 asm(".global kvm_fastop_exception \n"
     "kvm_fastop_exception: xor %esi, %esi; ret");
-- 
2.4.3


* [PATCH 26/33] x86/kvm: Add stack frame dependency to test_cc() inline asm
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (24 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 25/33] x86/kvm: Set ELF function type for fastop functions Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-22 10:05   ` Paolo Bonzini
  2016-01-21 22:49 ` [PATCH 27/33] watchdog/hpwdt: Create stack frame in asminline_call() Josh Poimboeuf
                   ` (9 subsequent siblings)
  35 siblings, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Gleb Natapov, Paolo Bonzini, kvm

With some configs, gcc doesn't inline test_cc().  When that happens, gcc
doesn't create a stack frame for test_cc() before the call instruction
in its inline asm.  This breaks frame pointer convention if
CONFIG_FRAME_POINTER is enabled and can result in a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the inline asm
statement.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
---
 arch/x86/kvm/emulate.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index aa4d726..7dba65a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -972,11 +972,13 @@ static int em_bsr_c(struct x86_emulate_ctxt *ctxt)
 static u8 test_cc(unsigned int condition, unsigned long flags)
 {
 	u8 rc;
+	register void *__sp asm(_ASM_SP);
 	void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);
 
 	flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF;
 	asm("push %[flags]; popf; call *%[fastop]"
-	    : "=a"(rc) : [fastop]"r"(fop), [flags]"r"(flags));
+	    : "=a"(rc), "+r"(__sp)
+	    : [fastop]"r"(fop), [flags]"r"(flags));
 	return rc;
 }
 
-- 
2.4.3


* [PATCH 27/33] watchdog/hpwdt: Create stack frame in asminline_call()
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (25 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 26/33] x86/kvm: Add stack frame dependency to test_cc() inline asm Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:04   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:52   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 28/33] x86/locking: Create stack frame in PV unlock Josh Poimboeuf
                   ` (8 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog

asminline_call() is a callable non-leaf function which doesn't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame when CONFIG_FRAME_POINTER is enabled.
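
A minimal sketch of how string-form frame macros can be made
conditional, so that the same asm() body builds with or without frame
pointers.  This is not the contents of asm/frame.h: the DEMO_* names,
the 64-bit-only instructions and the demo_func/demo_callee labels are
all assumptions for illustration.

```c
/* Hypothetical switch standing in for CONFIG_FRAME_POINTER. */
#define DEMO_CONFIG_FRAME_POINTER 1

#if DEMO_CONFIG_FRAME_POINTER
/* Save the caller's frame pointer and start a new frame. */
#define DEMO_FRAME_BEGIN	"pushq %rbp \n\t" "movq %rsp, %rbp \n\t"
/* Restore the caller's frame pointer. */
#define DEMO_FRAME_END		"popq %rbp \n\t"
#else
/* Without frame pointers the macros contribute no instructions. */
#define DEMO_FRAME_BEGIN	""
#define DEMO_FRAME_END		""
#endif

/* Assembler text for a non-leaf function that wraps its call in a frame. */
static const char demo_func_text[] =
	"demo_func: \n\t"
	DEMO_FRAME_BEGIN
	"call demo_callee \n\t"
	DEMO_FRAME_END
	"ret \n\t";
```

Setting DEMO_CONFIG_FRAME_POINTER to 0 leaves only the label, the call
and the ret in demo_func_text.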

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-watchdog@vger.kernel.org
---
 drivers/watchdog/hpwdt.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index 286369d..f368383 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -353,10 +353,10 @@ static int detect_cru_service(void)
 
 asm(".text                      \n\t"
     ".align 4                   \n\t"
-    ".globl asminline_call	\n"
+    ".globl asminline_call	\n\t"
+    ".type asminline_call, @function \n\t"
     "asminline_call:            \n\t"
-    "pushq      %rbp            \n\t"
-    "movq       %rsp, %rbp      \n\t"
+    FRAME_BEGIN
     "pushq      %rax            \n\t"
     "pushq      %rbx            \n\t"
     "pushq      %rdx            \n\t"
@@ -386,7 +386,7 @@ asm(".text                      \n\t"
     "popq       %rdx            \n\t"
     "popq       %rbx            \n\t"
     "popq       %rax            \n\t"
-    "leave                      \n\t"
+    FRAME_END
     "ret                        \n\t"
     ".previous");
 
-- 
2.4.3


* [PATCH 28/33] x86/locking: Create stack frame in PV unlock
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (26 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 27/33] watchdog/hpwdt: Create stack frame in asminline_call() Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-02-23  9:05   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:52   ` tip-bot for Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 29/33] x86/stacktool: Add directory and file whitelists Josh Poimboeuf
                   ` (7 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo, Waiman Long

The assembly PV_UNLOCK function is a callable non-leaf function which
doesn't honor CONFIG_FRAME_POINTER, which can result in bad stack
traces.

Create a stack frame when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Waiman Long <Waiman.Long@hpe.com>
---
 arch/x86/include/asm/qspinlock_paravirt.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/qspinlock_paravirt.h b/arch/x86/include/asm/qspinlock_paravirt.h
index 9f92c18..9d55f9b 100644
--- a/arch/x86/include/asm/qspinlock_paravirt.h
+++ b/arch/x86/include/asm/qspinlock_paravirt.h
@@ -36,8 +36,10 @@ PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock_slowpath);
  */
 asm    (".pushsection .text;"
 	".globl " PV_UNLOCK ";"
+	".type " PV_UNLOCK ", @function;"
 	".align 4,0x90;"
 	PV_UNLOCK ": "
+	FRAME_BEGIN
 	"push  %rdx;"
 	"mov   $0x1,%eax;"
 	"xor   %edx,%edx;"
@@ -45,6 +47,7 @@ asm    (".pushsection .text;"
 	"cmp   $0x1,%al;"
 	"jne   .slowpath;"
 	"pop   %rdx;"
+	FRAME_END
 	"ret;"
 	".slowpath: "
 	"push   %rsi;"
@@ -52,6 +55,7 @@ asm    (".pushsection .text;"
 	"call " PV_UNLOCK_SLOWPATH ";"
 	"pop    %rsi;"
 	"pop    %rdx;"
+	FRAME_END
 	"ret;"
 	".size " PV_UNLOCK ", .-" PV_UNLOCK ";"
 	".popsection");
-- 
2.4.3


* [PATCH 29/33] x86/stacktool: Add directory and file whitelists
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (27 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 28/33] x86/locking: Create stack frame in PV unlock Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 30/33] x86/xen: Add xen_cpuid() to stacktool whitelist Josh Poimboeuf
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo

Tell stacktool to skip validation of the following code which runs
outside the kernel's normal mode of operation:

- boot image
- vdso image
- relocation
- realmode
- efi
- head

Also, skip the following code which does the right thing with respect to
frame pointers, but is too "special" to be validated by a tool:

- entry
- mcount

Also skip the test_nx module because it modifies its exception handling
table at runtime, which stacktool can't understand.  Fortunately it's
just a test module so it doesn't matter much.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/boot/Makefile                | 1 +
 arch/x86/boot/compressed/Makefile     | 3 ++-
 arch/x86/entry/Makefile               | 4 ++++
 arch/x86/entry/vdso/Makefile          | 5 ++++-
 arch/x86/kernel/Makefile              | 5 +++++
 arch/x86/platform/efi/Makefile        | 2 ++
 arch/x86/realmode/Makefile            | 4 +++-
 arch/x86/realmode/rm/Makefile         | 3 ++-
 drivers/firmware/efi/libstub/Makefile | 1 +
 9 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index 2ee62db..df43778 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -10,6 +10,7 @@
 #
 
 KASAN_SANITIZE := n
+STACKTOOL	:= n
 
 # If you want to preset the SVGA mode, uncomment the next line and
 # set SVGA_MODE to whatever number you want.
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 0a291cd..8cea814 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -16,7 +16,8 @@
 #	(see scripts/Makefile.lib size_append)
 #	compressed vmlinux.bin.all + u32 size of vmlinux.bin.all
 
-KASAN_SANITIZE := n
+KASAN_SANITIZE	:= n
+STACKTOOL	:= n
 
 targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
 	vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4
diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index bd55ded..14a5b41 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -1,6 +1,10 @@
 #
 # Makefile for the x86 low level entry code
 #
+
+STACKTOOL_entry_$(BITS).o   := n
+STACKTOOL_entry_64_compat.o := n
+
 obj-y				:= entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
 obj-y				+= common.o
 
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 265c0ed..510985f 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -3,7 +3,9 @@
 #
 
 KBUILD_CFLAGS += $(DISABLE_LTO)
-KASAN_SANITIZE := n
+
+KASAN_SANITIZE	:= n
+STACKTOOL	:= n
 
 VDSO64-$(CONFIG_X86_64)		:= y
 VDSOX32-$(CONFIG_X86_X32_ABI)	:= y
@@ -15,6 +17,7 @@ vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o
 
 # files to link into kernel
 obj-y				+= vma.o
+STACKTOOL_vma.o			:= y
 
 # vDSO images to build
 vdso_img-$(VDSO64-y)		+= 64
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b1b78ff..fe410c4 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -20,6 +20,11 @@ KASAN_SANITIZE_head$(BITS).o := n
 KASAN_SANITIZE_dumpstack.o := n
 KASAN_SANITIZE_dumpstack_$(BITS).o := n
 
+STACKTOOL_head_$(BITS).o		:= n
+STACKTOOL_relocate_kernel_$(BITS).o	:= n
+STACKTOOL_mcount_$(BITS).o		:= n
+STACKTOOL_test_nx.o			:= n
+
 CFLAGS_irq.o := -I$(src)/../include/asm/trace
 
 obj-y			:= process_$(BITS).o signal.o
diff --git a/arch/x86/platform/efi/Makefile b/arch/x86/platform/efi/Makefile
index 2846aaa..8a347b2 100644
--- a/arch/x86/platform/efi/Makefile
+++ b/arch/x86/platform/efi/Makefile
@@ -1,3 +1,5 @@
+STACKTOOL_efi_thunk_$(BITS).o := n
+
 obj-$(CONFIG_EFI) 		+= quirks.o efi.o efi_$(BITS).o efi_stub_$(BITS).o
 obj-$(CONFIG_ACPI_BGRT) += efi-bgrt.o
 obj-$(CONFIG_EARLY_PRINTK_EFI)	+= early_printk.o
diff --git a/arch/x86/realmode/Makefile b/arch/x86/realmode/Makefile
index e02c2c6..0c24689 100644
--- a/arch/x86/realmode/Makefile
+++ b/arch/x86/realmode/Makefile
@@ -6,7 +6,9 @@
 # for more details.
 #
 #
-KASAN_SANITIZE := n
+KASAN_SANITIZE	:= n
+STACKTOOL	:= n
+
 subdir- := rm
 
 obj-y += init.o
diff --git a/arch/x86/realmode/rm/Makefile b/arch/x86/realmode/rm/Makefile
index 2730d77..1da2e5b 100644
--- a/arch/x86/realmode/rm/Makefile
+++ b/arch/x86/realmode/rm/Makefile
@@ -6,7 +6,8 @@
 # for more details.
 #
 #
-KASAN_SANITIZE := n
+KASAN_SANITIZE	:= n
+STACKTOOL	:= n
 
 always := realmode.bin realmode.relocs
 
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 9c12e18..a73d2d74 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -22,6 +22,7 @@ KBUILD_CFLAGS			:= $(cflags-y) -DDISABLE_BRANCH_PROFILING \
 
 GCOV_PROFILE			:= n
 KASAN_SANITIZE			:= n
+STACKTOOL			:= n
 
 lib-y				:= efi-stub-helper.o
 
-- 
2.4.3


* [PATCH 30/33] x86/xen: Add xen_cpuid() to stacktool whitelist
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (28 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 29/33] x86/stacktool: Add directory and file whitelists Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 31/33] bpf: Add __bpf_prog_run() " Josh Poimboeuf
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	David Vrabel, Konrad Rzeszutek Wilk, Boris Ostrovsky

stacktool reports the following false positive warning:

  stacktool: arch/x86/xen/enlighten.o: xen_cpuid()+0x41: can't find jump dest instruction at .text+0x108

The warning is due to xen_cpuid()'s use of XEN_EMULATE_PREFIX to insert
some fake instructions which stacktool doesn't know how to decode.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 arch/x86/xen/enlighten.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index d09e4c9..7ba9520 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -32,6 +32,7 @@
 #include <linux/gfp.h>
 #include <linux/memblock.h>
 #include <linux/edd.h>
+#include <linux/stacktool.h>
 
 #ifdef CONFIG_KEXEC_CORE
 #include <linux/kexec.h>
@@ -351,8 +352,8 @@ static void xen_cpuid(unsigned int *ax, unsigned int *bx,
 	*cx &= maskecx;
 	*cx |= setecx;
 	*dx &= maskedx;
-
 }
+STACKTOOL_IGNORE_FUNC(xen_cpuid);
 
 static bool __init xen_check_mwait(void)
 {
-- 
2.4.3


* [PATCH 31/33] bpf: Add __bpf_prog_run() to stacktool whitelist
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (29 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 30/33] x86/xen: Add xen_cpuid() to stacktool whitelist Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 22:57   ` Daniel Borkmann
  2016-01-22  2:55   ` Alexei Starovoitov
  2016-01-21 22:49 ` [PATCH 32/33] sched: Add __schedule() " Josh Poimboeuf
                   ` (4 subsequent siblings)
  35 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, netdev

stacktool reports the following false positive warnings:

  stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x5c: sibling call from callable instruction with changed frame pointer
  stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x60: function has unreachable instruction
  stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x64: function has unreachable instruction
  [...]

It's confused by the following dynamic jump instruction in
__bpf_prog_run():

  jmp     *(%r12,%rax,8)

which corresponds to the following line in the C code:

  goto *jumptable[insn->code];

There's no way for stacktool to deterministically find all possible
branch targets for a dynamic jump, so it can't verify this code.

In this case the jumps all stay within the function, and there's nothing
unusual going on related to the stack, so we can whitelist the function.
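
For readers unfamiliar with the construct, here is a tiny stand-alone
interpreter using the same GCC labels-as-values extension.  It is a
sketch, not the BPF interpreter: the opcodes and the demo_run() name are
made up, and opcodes are assumed to be valid.  Every dispatch compiles
to an indirect jump whose target is only known at run time, yet all
targets are labels inside the one function.

```c
/* Made-up opcodes for the sketch. */
enum { DEMO_OP_INC, DEMO_OP_DBL, DEMO_OP_EXIT };

/* Tiny interpreter using GCC's labels-as-values extension. */
static int demo_run(const unsigned char *code)
{
	/* One label address per opcode; opcodes are assumed to be valid. */
	static const void *jumptable[] = {
		[DEMO_OP_INC]  = &&do_inc,
		[DEMO_OP_DBL]  = &&do_dbl,
		[DEMO_OP_EXIT] = &&do_exit,
	};
	int acc = 0;

	/* Indirect jump: the target is only known at run time. */
	goto *jumptable[*code];

do_inc:
	acc += 1;
	goto *jumptable[*++code];
do_dbl:
	acc *= 2;
	goto *jumptable[*++code];
do_exit:
	return acc;
}
```

A static tool looking only at the object code sees a jump through a
register or memory operand and has no general way to enumerate the
labels stored in the table.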

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: netdev@vger.kernel.org
---
 kernel/bpf/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 972d9a8..7108a96 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -27,6 +27,7 @@
 #include <linux/random.h>
 #include <linux/moduleloader.h>
 #include <linux/bpf.h>
+#include <linux/stacktool.h>
 
 #include <asm/unaligned.h>
 
@@ -649,6 +650,7 @@ load_byte:
 		WARN_RATELIMIT(1, "unknown opcode %02x\n", insn->code);
 		return 0;
 }
+STACKTOOL_IGNORE_FUNC(__bpf_prog_run);
 
 bool bpf_prog_array_compatible(struct bpf_array *array,
 			       const struct bpf_prog *fp)
-- 
2.4.3


* [PATCH 32/33] sched: Add __schedule() to stacktool whitelist
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (30 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 31/33] bpf: Add __bpf_prog_run() " Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-21 22:49 ` [PATCH 33/33] x86/kprobes: Add kretprobe_trampoline() " Josh Poimboeuf
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo

stacktool reports the following warnings for __schedule():

  stacktool: kernel/sched/core.o: __schedule()+0x3c0: duplicate frame pointer save
  stacktool: kernel/sched/core.o: __schedule()+0x3fd: sibling call from callable instruction with changed frame pointer
  stacktool: kernel/sched/core.o: __schedule()+0x40a: call without frame pointer save/setup
  stacktool: kernel/sched/core.o: __schedule()+0x7fd: frame pointer state mismatch
  stacktool: kernel/sched/core.o: __schedule()+0x421: frame pointer state mismatch

Basically it's confused by two unusual attributes of the switch_to()
macro:

1. It saves prev's frame pointer to the old stack and restores next's
   frame pointer from the new stack.

2. For new tasks it jumps directly to ret_from_fork.

Eventually it would probably be a good idea to clean up the
ret_from_fork hack so that new tasks are created with a valid initial
stack, as suggested by Andy:

  https://lkml.kernel.org/r/CALCETrWsqCw4L1qKO9j9L5F+4ED4viuLQTFc=n1pKBZfFPQUFg@mail.gmail.com

Then __schedule() could return normally into the new code and stacktool
hopefully wouldn't have a problem anymore.

In the meantime, add it to the stacktool whitelist so we can have a
baseline with no stacktool warnings.  The marker also serves as a
reminder that this code could be improved a bit.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 474658b..cc7e8e70 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -74,6 +74,7 @@
 #include <linux/binfmts.h>
 #include <linux/context_tracking.h>
 #include <linux/compiler.h>
+#include <linux/stacktool.h>
 
 #include <asm/switch_to.h>
 #include <asm/tlb.h>
@@ -3288,6 +3289,7 @@ static void __sched notrace __schedule(bool preempt)
 
 	balance_callback(rq);
 }
+STACKTOOL_IGNORE_FUNC(__schedule);
 
 static inline void sched_submit_work(struct task_struct *tsk)
 {
-- 
2.4.3


* [PATCH 33/33] x86/kprobes: Add kretprobe_trampoline() to stacktool whitelist
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (31 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 32/33] sched: Add __schedule() " Josh Poimboeuf
@ 2016-01-21 22:49 ` Josh Poimboeuf
  2016-01-22 17:43 ` [PATCH 00/33] Compile-time stack metadata validation Chris J Arges
                   ` (2 subsequent siblings)
  35 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-21 22:49 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu

stacktool reports the following warning for kretprobe_trampoline():

  stacktool: arch/x86/kernel/kprobes/core.o: kretprobe_trampoline()+0x20: call without frame pointer save/setup

kretprobes are a special case where the stack is intentionally wrong.
The return address isn't known at the beginning of the trampoline, so
the stack frame can't be set up properly before it calls
trampoline_handler().

Because kretprobe handlers don't sleep, the frame pointer doesn't *have*
to be accurate in the trampoline.  So it's ok to add the trampoline to
the stacktool whitelist.  This results in no actual changes to the
generated code.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---
 arch/x86/kernel/kprobes/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 5b187df..2b29272 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -49,6 +49,7 @@
 #include <linux/kdebug.h>
 #include <linux/kallsyms.h>
 #include <linux/ftrace.h>
+#include <linux/stacktool.h>
 
 #include <asm/cacheflush.h>
 #include <asm/desc.h>
@@ -703,6 +704,7 @@ asm(
 	".size kretprobe_trampoline, .-kretprobe_trampoline\n"
 );
 NOKPROBE_SYMBOL(kretprobe_trampoline);
+STACKTOOL_IGNORE_FUNC(kretprobe_trampoline);
 
 /*
  * Called from kretprobe_trampoline
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [PATCH 31/33] bpf: Add __bpf_prog_run() to stacktool whitelist
  2016-01-21 22:49 ` [PATCH 31/33] bpf: Add __bpf_prog_run() " Josh Poimboeuf
@ 2016-01-21 22:57   ` Daniel Borkmann
  2016-01-22  2:55   ` Alexei Starovoitov
  1 sibling, 0 replies; 133+ messages in thread
From: Daniel Borkmann @ 2016-01-21 22:57 UTC (permalink / raw)
  To: Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, netdev

On 01/21/2016 11:49 PM, Josh Poimboeuf wrote:
> stacktool reports the following false positive warnings:
>
>    stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x5c: sibling call from callable instruction with changed frame pointer
>    stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x60: function has unreachable instruction
>    stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x64: function has unreachable instruction
>    [...]
>
> It's confused by the following dynamic jump instruction in
> __bpf_prog_run()::
>
>    jmp     *(%r12,%rax,8)
>
> which corresponds to the following line in the C code:
>
>    goto *jumptable[insn->code];
>
> There's no way for stacktool to deterministically find all possible
> branch targets for a dynamic jump, so it can't verify this code.
>
> In this case the jumps all stay within the function, and there's nothing
> unusual going on related to the stack, so we can whitelist the function.
>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: netdev@vger.kernel.org

Fine by me:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: [PATCH 24/33] x86/kprobes: Get rid of kretprobe_trampoline_holder()
  2016-01-21 22:49 ` [PATCH 24/33] x86/kprobes: Get rid of kretprobe_trampoline_holder() Josh Poimboeuf
@ 2016-01-21 23:42   ` 平松雅巳 / HIRAMATU,MASAMI
  2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2016-01-21 23:42 UTC (permalink / raw)
  To: 'Josh Poimboeuf',
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	David S. Miller

>From: Josh Poimboeuf [mailto:jpoimboe@redhat.com]
>
>The kretprobe_trampoline_holder() wrapper around kretprobe_trampoline()
>isn't used anywhere and adds some unnecessary frame pointer instructions
>which never execute.  Instead, just make kretprobe_trampoline() a proper
>ELF function.
>

Looks good to me :)

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thanks!

>Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
>Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
>Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>Cc: "David S. Miller" <davem@davemloft.net>
>Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>---
> arch/x86/kernel/kprobes/core.c | 57 +++++++++++++++++++++---------------------
> 1 file changed, 28 insertions(+), 29 deletions(-)
>
>diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
>index 1deffe6..5b187df 100644
>--- a/arch/x86/kernel/kprobes/core.c
>+++ b/arch/x86/kernel/kprobes/core.c
>@@ -671,38 +671,37 @@ NOKPROBE_SYMBOL(kprobe_int3_handler);
>  * When a retprobed function returns, this code saves registers and
>  * calls trampoline_handler() runs, which calls the kretprobe's handler.
>  */
>-static void __used kretprobe_trampoline_holder(void)
>-{
>-	asm volatile (
>-			".global kretprobe_trampoline\n"
>-			"kretprobe_trampoline: \n"
>+asm(
>+	".global kretprobe_trampoline\n"
>+	".type kretprobe_trampoline, @function\n"
>+	"kretprobe_trampoline:\n"
> #ifdef CONFIG_X86_64
>-			/* We don't bother saving the ss register */
>-			"	pushq %rsp\n"
>-			"	pushfq\n"
>-			SAVE_REGS_STRING
>-			"	movq %rsp, %rdi\n"
>-			"	call trampoline_handler\n"
>-			/* Replace saved sp with true return address. */
>-			"	movq %rax, 152(%rsp)\n"
>-			RESTORE_REGS_STRING
>-			"	popfq\n"
>+	/* We don't bother saving the ss register */
>+	"	pushq %rsp\n"
>+	"	pushfq\n"
>+	SAVE_REGS_STRING
>+	"	movq %rsp, %rdi\n"
>+	"	call trampoline_handler\n"
>+	/* Replace saved sp with true return address. */
>+	"	movq %rax, 152(%rsp)\n"
>+	RESTORE_REGS_STRING
>+	"	popfq\n"
> #else
>-			"	pushf\n"
>-			SAVE_REGS_STRING
>-			"	movl %esp, %eax\n"
>-			"	call trampoline_handler\n"
>-			/* Move flags to cs */
>-			"	movl 56(%esp), %edx\n"
>-			"	movl %edx, 52(%esp)\n"
>-			/* Replace saved flags with true return address. */
>-			"	movl %eax, 56(%esp)\n"
>-			RESTORE_REGS_STRING
>-			"	popf\n"
>+	"	pushf\n"
>+	SAVE_REGS_STRING
>+	"	movl %esp, %eax\n"
>+	"	call trampoline_handler\n"
>+	/* Move flags to cs */
>+	"	movl 56(%esp), %edx\n"
>+	"	movl %edx, 52(%esp)\n"
>+	/* Replace saved flags with true return address. */
>+	"	movl %eax, 56(%esp)\n"
>+	RESTORE_REGS_STRING
>+	"	popf\n"
> #endif
>-			"	ret\n");
>-}
>-NOKPROBE_SYMBOL(kretprobe_trampoline_holder);
>+	"	ret\n"
>+	".size kretprobe_trampoline, .-kretprobe_trampoline\n"
>+);
> NOKPROBE_SYMBOL(kretprobe_trampoline);
>
> /*
>--
>2.4.3

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-21 22:49 ` [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S Josh Poimboeuf
@ 2016-01-22  2:44   ` Alexei Starovoitov
  2016-01-22  3:55     ` Josh Poimboeuf
  2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 1 reply; 133+ messages in thread
From: Alexei Starovoitov @ 2016-01-22  2:44 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Thu, Jan 21, 2016 at 04:49:27PM -0600, Josh Poimboeuf wrote:
> bpf_jit.S has several callable non-leaf functions which don't honor
> CONFIG_FRAME_POINTER, which can result in bad stack traces.
> 
> Create a stack frame before the call instructions when
> CONFIG_FRAME_POINTER is enabled.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: netdev@vger.kernel.org
> ---
>  arch/x86/net/bpf_jit.S | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
> index eb4a3bd..f2a7faf 100644
> --- a/arch/x86/net/bpf_jit.S
> +++ b/arch/x86/net/bpf_jit.S
> @@ -8,6 +8,7 @@
>   * of the License.
>   */
>  #include <linux/linkage.h>
> +#include <asm/frame.h>
>  
>  /*
>   * Calling convention :
> @@ -65,16 +66,18 @@ FUNC(sk_load_byte_positive_offset)
>  
>  /* rsi contains offset and can be scratched */
>  #define bpf_slow_path_common(LEN)		\
> +	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
> +	FRAME_BEGIN;				\
>  	mov	%rbx, %rdi; /* arg1 == skb */	\
>  	push	%r9;				\
>  	push	SKBDATA;			\
>  /* rsi already has offset */			\
>  	mov	$LEN,%ecx;	/* len */	\
> -	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
>  	call	skb_copy_bits;			\
>  	test    %eax,%eax;			\
>  	pop	SKBDATA;			\
> -	pop	%r9;
> +	pop	%r9;				\
> +	FRAME_END

I'm not sure what the above is doing.
There is already 'push rbp; mov rbp,rsp' at the beginning of the generated
code, and with the above the stack trace will show two functions at the
same ip, since there were no calls between them?
I think the stack walker will get even more confused?
Also the JIT of the bpf_call insn will emit a variable number of push/pops
around the call, and I definitely don't want to add an extra push rbp
there, since it's the critical path and the callee will do its own
push rbp.
Also there are push/pops emitted around div/mod,
and there is an indirect goto emitted as well for bpf_tail_call
that jumps into a different function body without touching
the current stack.
Also none of the JITed functions are DWARF annotated.
I could be missing something. I think either this patch
is not needed or you need to teach the tool to ignore
all JITed stuff. I don't think it's practical to annotate
everything. Different JITs do their own magic.
The s390 JIT is even more fancy.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 31/33] bpf: Add __bpf_prog_run() to stacktool whitelist
  2016-01-21 22:49 ` [PATCH 31/33] bpf: Add __bpf_prog_run() " Josh Poimboeuf
  2016-01-21 22:57   ` Daniel Borkmann
@ 2016-01-22  2:55   ` Alexei Starovoitov
  2016-01-22  4:13     ` Josh Poimboeuf
  1 sibling, 1 reply; 133+ messages in thread
From: Alexei Starovoitov @ 2016-01-22  2:55 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Thu, Jan 21, 2016 at 04:49:35PM -0600, Josh Poimboeuf wrote:
> stacktool reports the following false positive warnings:
> 
>   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x5c: sibling call from callable instruction with changed frame pointer
>   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x60: function has unreachable instruction
>   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x64: function has unreachable instruction
>   [...]
> 
> It's confused by the following dynamic jump instruction in
> __bpf_prog_run()::
> 
>   jmp     *(%r12,%rax,8)
> 
> which corresponds to the following line in the C code:
> 
>   goto *jumptable[insn->code];
> 
> There's no way for stacktool to deterministically find all possible
> branch targets for a dynamic jump, so it can't verify this code.
> 
> In this case the jumps all stay within the function, and there's nothing
> unusual going on related to the stack, so we can whitelist the function.

Well, a few things are very unusual in this function.
Did you see what JMP_CALL does? It's a call into a different function,
but not like a typical indirect call. Will it be ok as well?

In general it's not possible for any tool to identify all possible
branch targets. bpf programs can be loaded on the fly and
the jumping sequence will change.
So if this marking says 'don't bother analyzing this function because
it does sane stuff', that's probably not the case.
If this marking says 'don't bother analyzing, the stack may be crazy
from here on', then it's ok.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-22  2:44   ` Alexei Starovoitov
@ 2016-01-22  3:55     ` Josh Poimboeuf
  2016-01-22  4:18       ` Alexei Starovoitov
  0 siblings, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-22  3:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Thu, Jan 21, 2016 at 06:44:28PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 21, 2016 at 04:49:27PM -0600, Josh Poimboeuf wrote:
> > bpf_jit.S has several callable non-leaf functions which don't honor
> > CONFIG_FRAME_POINTER, which can result in bad stack traces.
> > 
> > Create a stack frame before the call instructions when
> > CONFIG_FRAME_POINTER is enabled.
> > 
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: netdev@vger.kernel.org
> > ---
> >  arch/x86/net/bpf_jit.S | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
> > index eb4a3bd..f2a7faf 100644
> > --- a/arch/x86/net/bpf_jit.S
> > +++ b/arch/x86/net/bpf_jit.S
> > @@ -8,6 +8,7 @@
> >   * of the License.
> >   */
> >  #include <linux/linkage.h>
> > +#include <asm/frame.h>
> >  
> >  /*
> >   * Calling convention :
> > @@ -65,16 +66,18 @@ FUNC(sk_load_byte_positive_offset)
> >  
> >  /* rsi contains offset and can be scratched */
> >  #define bpf_slow_path_common(LEN)		\
> > +	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
> > +	FRAME_BEGIN;				\
> >  	mov	%rbx, %rdi; /* arg1 == skb */	\
> >  	push	%r9;				\
> >  	push	SKBDATA;			\
> >  /* rsi already has offset */			\
> >  	mov	$LEN,%ecx;	/* len */	\
> > -	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
> >  	call	skb_copy_bits;			\
> >  	test    %eax,%eax;			\
> >  	pop	SKBDATA;			\
> > -	pop	%r9;
> > +	pop	%r9;				\
> > +	FRAME_END
> 
> I'm not sure what above is doing.
> There is already 'push rbp; mov rbp,rsp' at the beginning of generated
> code and with above the stack trace will show two function at the same ip?
> since there were no calls between them?
> I think the stack walker will get even more confused?
> Also the JIT of bpf_call insn will emit variable number of push/pop
> around the call and I definitely don't want to add extra push rbp
> there, since it's the critical path and callee will do its own
> push rbp.
> Also there are push/pops emitted around div/mod
> and there is indirect goto emitted as well for bpf_tail_call
> that jumps into different function body without touching
> current stack.

Hm, I'm not sure I follow.  Let me try to explain my understanding.

As you mentioned, the generated code sets up the frame pointer.  From
emit_prologue():

	EMIT1(0x55); /* push rbp */
	EMIT3(0x48, 0x89, 0xE5); /* mov rbp,rsp */

And then later, do_jit() can generate a call into the functions in
bpf_jit.S.  For example:

	func = CHOOSE_LOAD_FUNC(imm32, sk_load_word);
	...
	EMIT1_off32(0xE8, jmp_offset); /* call */

So the functions in bpf_jit.S are being called by the generated code.
They're not part of the generated code itself.  So they're callees and
need to create their own stack frame before they call out to somewhere
else.

Or did I miss something?
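
To make the callee argument concrete, here is a small user-space model
of a frame pointer walk.  It is only a sketch, not kernel code: struct
frame, walk_frames(), trace_from_leaf() and the RET_* values are all
made up for illustration, and the "return addresses" are just small
integers.  The helper stands in for a bpf_jit.S function and the leaf
for whatever it calls.

```c
#include <stddef.h>

/* Hypothetical model of a frame record: saved rbp, then return address. */
struct frame {
	struct frame *prev;	/* caller's frame record (saved rbp) */
	unsigned long ret;	/* return address into the caller */
};

/* Made-up "return addresses", one per call site in the model. */
enum { RET_INTO_CALLER = 1, RET_INTO_JIT = 2, RET_INTO_HELPER = 3 };

/*
 * Follow the saved-rbp chain starting at fp and store up to max return
 * addresses in out[].  Returns the number of entries stored.
 */
size_t walk_frames(const struct frame *fp, unsigned long *out, size_t max)
{
	size_t n = 0;

	while (fp && n < max) {
		out[n++] = fp->ret;
		fp = fp->prev;
	}
	return n;
}

/*
 * Model the chain: caller -> JITed code -> helper -> leaf, then walk it
 * from the leaf.  The JITed code always sets up a frame.  The helper
 * only does so if helper_sets_up_frame is nonzero.
 */
size_t trace_from_leaf(int helper_sets_up_frame, unsigned long *out, size_t max)
{
	struct frame jit    = { NULL, RET_INTO_CALLER };
	struct frame helper = { &jit, RET_INTO_JIT };
	struct frame leaf;

	/*
	 * The leaf saves whatever rbp held on entry.  If the helper never
	 * updated rbp, that is still the JITed code's frame record.
	 */
	leaf.prev = helper_sets_up_frame ? &helper : &jit;
	leaf.ret  = RET_INTO_HELPER;

	return walk_frames(&leaf, out, max);
}
```

With the helper's frame in place the walk yields three entries (helper,
JITed code, caller).  Without it the walk yields only two, because the
return address into the JITed code is on the stack but not linked into
the chain, so the JITed code silently drops out of the trace.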

> Also none of the JITed function are dwarf annotated.

But what does that have to do with frame pointers?

> I could be missing something. I think either this patch
> is not need or you need to teach the tool to ignore
> all JITed stuff. I don't think it's practical to annotate
> everything. Different JITs do their own magic.
> s390 JIT is even more fancy.

Well, but the point of these patches isn't to make the tool happy.  It's
really to make sure that runtime stack traces can be made reliable.
Maybe I'm missing something but I don't see why JIT code can't honor
CONFIG_FRAME_POINTER just like any other code.

-- 
Josh

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 31/33] bpf: Add __bpf_prog_run() to stacktool whitelist
  2016-01-22  2:55   ` Alexei Starovoitov
@ 2016-01-22  4:13     ` Josh Poimboeuf
  2016-01-22 17:19       ` Alexei Starovoitov
  0 siblings, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-22  4:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Thu, Jan 21, 2016 at 06:55:41PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 21, 2016 at 04:49:35PM -0600, Josh Poimboeuf wrote:
> > stacktool reports the following false positive warnings:
> > 
> >   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x5c: sibling call from callable instruction with changed frame pointer
> >   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x60: function has unreachable instruction
> >   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x64: function has unreachable instruction
> >   [...]
> > 
> > It's confused by the following dynamic jump instruction in
> > __bpf_prog_run()::
> > 
> >   jmp     *(%r12,%rax,8)
> > 
> > which corresponds to the following line in the C code:
> > 
> >   goto *jumptable[insn->code];
> > 
> > There's no way for stacktool to deterministically find all possible
> > branch targets for a dynamic jump, so it can't verify this code.
> > 
> > In this case the jumps all stay within the function, and there's nothing
> > unusual going on related to the stack, so we can whitelist the function.
> 
> well, few things are very unusual in this function.
> did you see what JMP_CALL does? it's a call into a different function,
> but not like typical indirect call. Will it be ok as well?
> 
> In general it's not possible for any tool to identify all possible
> branch targets. bpf programs can be loaded on the fly and
> jumping sequence will change.
> So if this marking says 'don't bother analyzing this function because
> it does sane stuff' that's probably not the case.
> If this marking says 'don't bother analyzing, the stack may be crazy
> from here on' then it's ok.

So the tool doesn't need to follow all possible call targets.  Instead
it just verifies that all functions follow the frame pointer convention.
That way it doesn't matter *which* function is being called because they
all do the right thing.

But it *does* need to follow all jump targets, so that it can analyze
all possible code paths within the function itself.  With a dynamic
jump, it can't do that.

So the JMP_CALL is fine, but the goto *jumptable[insn->code] isn't.
(And BTW that's the only occurrence of such a dynamic jump table in the
entire kernel.)
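
For anyone unfamiliar with the construct, here is a stand-alone sketch
of such a dynamic jump table.  It assumes GCC or Clang, since labels as
values are a GNU extension.  The opcodes and run() are invented for the
example and have nothing to do with the real BPF instruction set.

```c
enum { OP_ADD1, OP_DOUBLE, OP_EXIT };	/* invented opcodes */

/* Interpret code[] starting from acc.  Stops at OP_EXIT. */
long run(const unsigned char *code, long acc)
{
	/* Table of label addresses, indexed by opcode. */
	static const void *jumptable[] = {
		[OP_ADD1]   = &&do_add1,
		[OP_DOUBLE] = &&do_double,
		[OP_EXIT]   = &&do_exit,
	};

	/*
	 * The destination is computed from data at run time.  This is
	 * typically compiled to an indirect jmp, so a tool reading the
	 * object file sees no fixed branch target here.
	 */
	goto *jumptable[*code];

do_add1:
	acc += 1;
	goto *jumptable[*++code];
do_double:
	acc *= 2;
	goto *jumptable[*++code];
do_exit:
	return acc;
}
```

Given the program { OP_ADD1, OP_DOUBLE, OP_ADD1, OP_EXIT } and a start
value of 3, run() returns 9.  Every jump stays inside run(), but which
label each goto reaches depends entirely on the bytes in code[].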

-- 
Josh

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-22  3:55     ` Josh Poimboeuf
@ 2016-01-22  4:18       ` Alexei Starovoitov
  2016-01-22  7:36         ` Ingo Molnar
  2016-01-22 15:58         ` Josh Poimboeuf
  0 siblings, 2 replies; 133+ messages in thread
From: Alexei Starovoitov @ 2016-01-22  4:18 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Thu, Jan 21, 2016 at 09:55:31PM -0600, Josh Poimboeuf wrote:
> On Thu, Jan 21, 2016 at 06:44:28PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 21, 2016 at 04:49:27PM -0600, Josh Poimboeuf wrote:
> > > bpf_jit.S has several callable non-leaf functions which don't honor
> > > CONFIG_FRAME_POINTER, which can result in bad stack traces.
> > > 
> > > Create a stack frame before the call instructions when
> > > CONFIG_FRAME_POINTER is enabled.
> > > 
> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > Cc: netdev@vger.kernel.org
> > > ---
> > >  arch/x86/net/bpf_jit.S | 9 +++++++--
> > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
> > > index eb4a3bd..f2a7faf 100644
> > > --- a/arch/x86/net/bpf_jit.S
> > > +++ b/arch/x86/net/bpf_jit.S
> > > @@ -8,6 +8,7 @@
> > >   * of the License.
> > >   */
> > >  #include <linux/linkage.h>
> > > +#include <asm/frame.h>
> > >  
> > >  /*
> > >   * Calling convention :
> > > @@ -65,16 +66,18 @@ FUNC(sk_load_byte_positive_offset)
> > >  
> > >  /* rsi contains offset and can be scratched */
> > >  #define bpf_slow_path_common(LEN)		\
> > > +	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
> > > +	FRAME_BEGIN;				\
> > >  	mov	%rbx, %rdi; /* arg1 == skb */	\
> > >  	push	%r9;				\
> > >  	push	SKBDATA;			\
> > >  /* rsi already has offset */			\
> > >  	mov	$LEN,%ecx;	/* len */	\
> > > -	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
> > >  	call	skb_copy_bits;			\
> > >  	test    %eax,%eax;			\
> > >  	pop	SKBDATA;			\
> > > -	pop	%r9;
> > > +	pop	%r9;				\
> > > +	FRAME_END
> > 
> > I'm not sure what above is doing.
> > There is already 'push rbp; mov rbp,rsp' at the beginning of generated
> > code and with above the stack trace will show two function at the same ip?
> > since there were no calls between them?
> > I think the stack walker will get even more confused?
> > Also the JIT of bpf_call insn will emit variable number of push/pop
> > around the call and I definitely don't want to add extra push rbp
> > there, since it's the critical path and callee will do its own
> > push rbp.
> > Also there are push/pops emitted around div/mod
> > and there is indirect goto emitted as well for bpf_tail_call
> > that jumps into different function body without touching
> > current stack.
> 
> Hm, I'm not sure I follow.  Let me try to explain my understanding.
> 
> As you mentioned, the generated code sets up the frame pointer.  From
> emit_prologue():
> 
>         EMIT1(0x55); /* push rbp */
> 	EMIT3(0x48, 0x89, 0xE5); /* mov rbp,rsp */
> 
> And then later, do_jit() can generate a call into the functions in
> bpf_jit.S.  For example:
> 
> 	func = CHOOSE_LOAD_FUNC(imm32, sk_load_word);
> 	...
> 	EMIT1_off32(0xE8, jmp_offset); /* call */
> 
> So the functions in bpf_jit.S are being called by the generated code.
> They're not part of the generated code itself.  So they're callees and
> need to create their own stack frame before they call out to somewhere
> else.
> 
> Or did I miss something?

Yes, all correct.
This particular patch is ok, since it adds to
bpf_slow_path_common and, as the name says, it's slow and rare,
but I wanted to make sure the rest of it is understood.

> > Also none of the JITed function are dwarf annotated.
> 
> But what does that have to do with frame pointers?

nothing, but then why do you need
.type name, @function
annotations in another patch?

> > I could be missing something. I think either this patch
> > is not need or you need to teach the tool to ignore
> > all JITed stuff. I don't think it's practical to annotate
> > everything. Different JITs do their own magic.
> > s390 JIT is even more fancy.
> 
> Well, but the point of these patches isn't to make the tool happy.  It's
> really to make sure that runtime stack traces can be made reliable.
> Maybe I'm missing something but I don't see why JIT code can't honor
> CONFIG_FRAME_POINTER just like any other code.

It can if there is no performance cost added.
I can speak for the x64 JIT, but the rest need to be analyzed as well.
My point was that maybe it's easier to ignore all JITed code and
just say that such call stacks may be unreliable?
Live-patching is not applicable to JITed code anyway,
or do you want to livepatch its callees?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-22  4:18       ` Alexei Starovoitov
@ 2016-01-22  7:36         ` Ingo Molnar
  2016-01-22 15:58         ` Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: Ingo Molnar @ 2016-01-22  7:36 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, netdev, daniel


* Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> > > I could be missing something. I think either this patch is not need or you 
> > > need to teach the tool to ignore all JITed stuff. I don't think it's 
> > > practical to annotate everything. Different JITs do their own magic. s390 
> > > JIT is even more fancy.
> > 
> > Well, but the point of these patches isn't to make the tool happy.  It's 
> > really to make sure that runtime stack traces can be made reliable. Maybe I'm 
> > missing something but I don't see why JIT code can't honor 
> > CONFIG_FRAME_POINTER just like any other code.
> 
> It can if there is no performance cost added. I can speak for x64 JIT, but the 
> rest needs to be analyzed as well. My point was that may be it's easier to 
> ignore all JITed code and just say that such call stacks may be unreliable? 
> live-patching is not applicable to JITed code anyway or you want to livepatch 
> the callees of it?

So the rule is that if frame pointers are enabled all kernel code should have 
correct stack frames - in case an IRQ (or NMI) hits it or it crashes.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 26/33] x86/kvm: Add stack frame dependency to test_cc() inline asm
  2016-01-21 22:49 ` [PATCH 26/33] x86/kvm: Add stack frame dependency to test_cc() inline asm Josh Poimboeuf
@ 2016-01-22 10:05   ` Paolo Bonzini
  2016-01-22 16:02     ` Josh Poimboeuf
  0 siblings, 1 reply; 133+ messages in thread
From: Paolo Bonzini @ 2016-01-22 10:05 UTC (permalink / raw)
  To: Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Gleb Natapov, kvm



On 21/01/2016 23:49, Josh Poimboeuf wrote:
> With some configs, gcc doesn't inline test_cc().  When that happens, it
> doesn't create a stack frame before inserting the call instruction.
> This breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled
> and can result in a bad stack trace.
> 
> Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
> listing the stack pointer as an output operand for the inline asm
> statement.

If an __always_inline annotation works, that would be better.

Paolo

> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: kvm@vger.kernel.org
> ---
>  arch/x86/kvm/emulate.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index aa4d726..7dba65a 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -972,11 +972,13 @@ static int em_bsr_c(struct x86_emulate_ctxt *ctxt)
>  static u8 test_cc(unsigned int condition, unsigned long flags)
>  {
>  	u8 rc;
> +	register void *__sp asm(_ASM_SP);
>  	void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);
>  
>  	flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF;
>  	asm("push %[flags]; popf; call *%[fastop]"
> -	    : "=a"(rc) : [fastop]"r"(fop), [flags]"r"(flags));
> +	    : "=a"(rc), "+r"(__sp)
> +	    : [fastop]"r"(fop), [flags]"r"(flags));
>  	return rc;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 25/33] x86/kvm: Set ELF function type for fastop functions
  2016-01-21 22:49 ` [PATCH 25/33] x86/kvm: Set ELF function type for fastop functions Josh Poimboeuf
@ 2016-01-22 10:05   ` Paolo Bonzini
  2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: Paolo Bonzini @ 2016-01-22 10:05 UTC (permalink / raw)
  To: Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, Arnaldo Carvalho de Melo,
	Gleb Natapov, kvm



On 21/01/2016 23:49, Josh Poimboeuf wrote:
> The callable functions created with the FOP* and FASTOP* macros are
> missing ELF function annotations, which confuses tools like stacktool.
> Properly annotate them.
> 
> This adds some additional labels to the assembly, but the generated
> binary code is unchanged (with the exception of instructions which have
> embedded references to __LINE__).
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: kvm@vger.kernel.org
> ---
>  arch/x86/kvm/emulate.c | 29 +++++++++++++++++++++--------
>  1 file changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 1505587..aa4d726 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -309,23 +309,29 @@ static void invalidate_registers(struct x86_emulate_ctxt *ctxt)
>  
>  static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
>  
> -#define FOP_ALIGN ".align " __stringify(FASTOP_SIZE) " \n\t"
> +#define FOP_FUNC(name) \
> +	".align " __stringify(FASTOP_SIZE) " \n\t" \
> +	".type " name ", @function \n\t" \
> +	name ":\n\t"
> +
>  #define FOP_RET   "ret \n\t"
>  
>  #define FOP_START(op) \
>  	extern void em_##op(struct fastop *fake); \
>  	asm(".pushsection .text, \"ax\" \n\t" \
>  	    ".global em_" #op " \n\t" \
> -            FOP_ALIGN \
> -	    "em_" #op ": \n\t"
> +	    FOP_FUNC("em_" #op)
>  
>  #define FOP_END \
>  	    ".popsection")
>  
> -#define FOPNOP() FOP_ALIGN FOP_RET
> +#define FOPNOP() \
> +	FOP_FUNC(__stringify(__UNIQUE_ID(nop))) \
> +	FOP_RET
>  
>  #define FOP1E(op,  dst) \
> -	FOP_ALIGN "10: " #op " %" #dst " \n\t" FOP_RET
> +	FOP_FUNC(#op "_" #dst) \
> +	"10: " #op " %" #dst " \n\t" FOP_RET
>  
>  #define FOP1EEX(op,  dst) \
>  	FOP1E(op, dst) _ASM_EXTABLE(10b, kvm_fastop_exception)
> @@ -357,7 +363,8 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
>  	FOP_END
>  
>  #define FOP2E(op,  dst, src)	   \
> -	FOP_ALIGN #op " %" #src ", %" #dst " \n\t" FOP_RET
> +	FOP_FUNC(#op "_" #dst "_" #src) \
> +	#op " %" #src ", %" #dst " \n\t" FOP_RET
>  
>  #define FASTOP2(op) \
>  	FOP_START(op) \
> @@ -395,7 +402,8 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
>  	FOP_END
>  
>  #define FOP3E(op,  dst, src, src2) \
> -	FOP_ALIGN #op " %" #src2 ", %" #src ", %" #dst " \n\t" FOP_RET
> +	FOP_FUNC(#op "_" #dst "_" #src "_" #src2) \
> +	#op " %" #src2 ", %" #src ", %" #dst " \n\t" FOP_RET
>  
>  /* 3-operand, word-only, src2=cl */
>  #define FASTOP3WCL(op) \
> @@ -407,7 +415,12 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
>  	FOP_END
>  
>  /* Special case for SETcc - 1 instruction per cc */
> -#define FOP_SETCC(op) ".align 4; " #op " %al; ret \n\t"
> +#define FOP_SETCC(op) \
> +	".align 4 \n\t" \
> +	".type " #op ", @function \n\t" \
> +	#op ": \n\t" \
> +	#op " %al \n\t" \
> +	FOP_RET
>  
>  asm(".global kvm_fastop_exception \n"
>      "kvm_fastop_exception: xor %esi, %esi; ret");
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>


* Re: [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-22  4:18       ` Alexei Starovoitov
  2016-01-22  7:36         ` Ingo Molnar
@ 2016-01-22 15:58         ` Josh Poimboeuf
  2016-01-22 17:18           ` Alexei Starovoitov
  1 sibling, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-22 15:58 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Thu, Jan 21, 2016 at 08:18:46PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 21, 2016 at 09:55:31PM -0600, Josh Poimboeuf wrote:
> > On Thu, Jan 21, 2016 at 06:44:28PM -0800, Alexei Starovoitov wrote:
> > > On Thu, Jan 21, 2016 at 04:49:27PM -0600, Josh Poimboeuf wrote:
> > > > bpf_jit.S has several callable non-leaf functions which don't honor
> > > > CONFIG_FRAME_POINTER, which can result in bad stack traces.
> > > > 
> > > > Create a stack frame before the call instructions when
> > > > CONFIG_FRAME_POINTER is enabled.
> > > > 
> > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > Cc: netdev@vger.kernel.org
> > > > ---
> > > >  arch/x86/net/bpf_jit.S | 9 +++++++--
> > > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
> > > > index eb4a3bd..f2a7faf 100644
> > > > --- a/arch/x86/net/bpf_jit.S
> > > > +++ b/arch/x86/net/bpf_jit.S
> > > > @@ -8,6 +8,7 @@
> > > >   * of the License.
> > > >   */
> > > >  #include <linux/linkage.h>
> > > > +#include <asm/frame.h>
> > > >  
> > > >  /*
> > > >   * Calling convention :
> > > > @@ -65,16 +66,18 @@ FUNC(sk_load_byte_positive_offset)
> > > >  
> > > >  /* rsi contains offset and can be scratched */
> > > >  #define bpf_slow_path_common(LEN)		\
> > > > +	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
> > > > +	FRAME_BEGIN;				\
> > > >  	mov	%rbx, %rdi; /* arg1 == skb */	\
> > > >  	push	%r9;				\
> > > >  	push	SKBDATA;			\
> > > >  /* rsi already has offset */			\
> > > >  	mov	$LEN,%ecx;	/* len */	\
> > > > -	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
> > > >  	call	skb_copy_bits;			\
> > > >  	test    %eax,%eax;			\
> > > >  	pop	SKBDATA;			\
> > > > -	pop	%r9;
> > > > +	pop	%r9;				\
> > > > +	FRAME_END
> > > 
> > > I'm not sure what above is doing.
> > > There is already 'push rbp; mov rbp,rsp' at the beginning of generated
> > > > code and with above the stack trace will show two functions at the same ip?
> > > since there were no calls between them?
> > > I think the stack walker will get even more confused?
> > > Also the JIT of bpf_call insn will emit variable number of push/pop
> > > around the call and I definitely don't want to add extra push rbp
> > > there, since it's the critical path and callee will do its own
> > > push rbp.
> > > Also there are push/pops emitted around div/mod
> > > and there is indirect goto emitted as well for bpf_tail_call
> > > that jumps into different function body without touching
> > > current stack.
> > 
> > Hm, I'm not sure I follow.  Let me try to explain my understanding.
> > 
> > As you mentioned, the generated code sets up the frame pointer.  From
> > emit_prologue():
> > 
> >         EMIT1(0x55); /* push rbp */
> > 	EMIT3(0x48, 0x89, 0xE5); /* mov rbp,rsp */
> > 
> > And then later, do_jit() can generate a call into the functions in
> > bpf_jit.S.  For example:
> > 
> > 	func = CHOOSE_LOAD_FUNC(imm32, sk_load_word);
> > 	...
> > 	EMIT1_off32(0xE8, jmp_offset); /* call */
> > 
> > So the functions in bpf_jit.S are being called by the generated code.
> > They're not part of the generated code itself.  So they're callees and
> > need to create their own stack frame before they call out to somewhere
> > else.
> > 
> > Or did I miss something?
> 
> yes. all correct.
> This particular patch is ok, since it adds to
> bpf_slow_path_common and as the name says it's slow and rare,
> but wanted to make sure the rest of it is understood.
> 
> > > Also none of the JITed function are dwarf annotated.
> > 
> > But what does that have to do with frame pointers?
> 
> nothing, but then why do you need
> .type name, @function
> annotations in another patch?

Those are ELF function annotations which are needed so that stacktool
can find and analyze all callable functions (and they're a good idea
anyway for the sake of other tooling).
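
As a rough illustration of what such an annotation looks like outside the
kernel, here is a minimal user-space sketch.  It assumes GCC or Clang on
x86-64 with an ELF target; the name demo_add_one is hypothetical, and the
.size directive is an addition of this sketch rather than something taken
from the patches.  On other targets it falls back to plain C.

```c
/* Prototype so that C callers can use the assembly routine. */
long demo_add_one(long x);

#if defined(__x86_64__) && defined(__ELF__)
/*
 * Top-level asm defining a callable function.  The ".type ..., @function"
 * line is the ELF function annotation: without it the symbol table entry
 * for demo_add_one would have type NOTYPE rather than FUNC.
 * (demo_add_one and the .size line are assumptions of this sketch.)
 */
__asm__(".pushsection .text, \"ax\"\n\t"
	".globl demo_add_one\n\t"
	".type demo_add_one, @function\n\t"
	"demo_add_one:\n\t"
	"lea 1(%rdi), %rax\n\t"	/* return x + 1 (first arg is in %rdi) */
	"ret\n\t"
	".size demo_add_one, .-demo_add_one\n\t"
	".popsection");
#else
/* Portable fallback so the sketch still builds on other targets. */
long demo_add_one(long x)
{
	return x + 1;
}
#endif
```

On an x86-64 ELF build, inspecting the resulting object with readelf -s
should show demo_add_one listed as FUNC, which is the kind of symbol a
static analysis tool can enumerate.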

Obviously we can't annotate the JIT functions which have no name, but
that's ok.  As long as they're doing the right thing with respect to
frame pointers, the stack dump code will still be able to see their
frames (and that they're associated with generated code).
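
To make the frame pointer argument concrete, here is a small simulation
of how a frame-pointer-based stack walker follows the chain.  It is only
a sketch: struct frame_record and walk_frames() are hypothetical names,
and the records are ordinary C objects standing in for what
"push %rbp; mov %rsp,%rbp" leaves on a real x86-64 stack.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of one frame record.  On x86-64 the saved caller
 * %rbp sits at (%rbp) and the return address at 8(%rbp).
 */
struct frame_record {
	const struct frame_record *prev;	/* saved caller frame pointer */
	uintptr_t ret_addr;			/* return address pushed by call */
};

/*
 * Follow the chain starting at fp, storing up to max return addresses
 * in out[].  Returns the number of entries stored.
 */
size_t walk_frames(const struct frame_record *fp, uintptr_t *out, size_t max)
{
	size_t n = 0;

	while (fp && n < max) {
		out[n++] = fp->ret_addr;
		fp = fp->prev;
	}
	return n;
}
```

If three records are chained as helper -> jit -> outer, walk_frames()
reports three return addresses.  If a non-leaf function in the middle
never creates its own record, its callee's record links straight to the
frame above it, and the walker silently skips one caller.  That is the
"bad stack trace" case the frame patches are meant to prevent.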

> > > I could be missing something. I think either this patch
> > > is not needed or you need to teach the tool to ignore
> > > all JITed stuff. I don't think it's practical to annotate
> > > everything. Different JITs do their own magic.
> > > s390 JIT is even more fancy.
> > 
> > Well, but the point of these patches isn't to make the tool happy.  It's
> > really to make sure that runtime stack traces can be made reliable.
> > Maybe I'm missing something but I don't see why JIT code can't honor
> > CONFIG_FRAME_POINTER just like any other code.
> 
> It can if there is no performance cost added.

CONFIG_FRAME_POINTER always adds a small performance cost but as you
mentioned it only affects the slow path here.  And hopefully we'll soon
have an in-kernel DWARF unwinder on x86 so we can get rid of the need
for frame pointers.

> I can speak for x64 JIT, but the rest needs to be analyzed as well.
> My point was that maybe it's easier to ignore all JITed code and
> just say that such call stacks may be unreliable?
> live-patching is not applicable to JITed code anyway
> or you want to livepatch the callees of it?

Right.  We can't patch JIT code, but we might need to patch other
functions on the call stack (including callees and callers).

-- 
Josh


* Re: [PATCH 26/33] x86/kvm: Add stack frame dependency to test_cc() inline asm
  2016-01-22 10:05   ` Paolo Bonzini
@ 2016-01-22 16:02     ` Josh Poimboeuf
  2016-01-22 16:16       ` [PATCH v16.1 26/33] x86/kvm: Make test_cc() always inline Josh Poimboeuf
  0 siblings, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-22 16:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Gleb Natapov, kvm

On Fri, Jan 22, 2016 at 11:05:06AM +0100, Paolo Bonzini wrote:
> 
> 
> On 21/01/2016 23:49, Josh Poimboeuf wrote:
> > With some configs, gcc doesn't inline test_cc().  When that happens, it
> > doesn't create a stack frame before inserting the call instruction.
> > This breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled
> > and can result in a bad stack trace.
> > 
> > Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
> > listing the stack pointer as an output operand for the inline asm
> > statement.
> 
> If an __always_inline annotation works, that would be better.

Yeah, that seems to work.  I'll update the patch.

-- 
Josh


* [PATCH v16.1 26/33] x86/kvm: Make test_cc() always inline
  2016-01-22 16:02     ` Josh Poimboeuf
@ 2016-01-22 16:16       ` Josh Poimboeuf
  2016-02-23  9:04         ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
  2016-02-25  5:52         ` tip-bot for Josh Poimboeuf
  0 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-22 16:16 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Gleb Natapov, kvm

With some configs (including allyesconfig), gcc doesn't inline
test_cc().  When that happens, test_cc() doesn't create a stack frame
before inserting the inline asm call instruction.  This breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.

Force it to always be inlined so that its containing function's stack
frame can be used.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
---
 arch/x86/kvm/emulate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index aa4d726..80363eb 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -969,7 +969,7 @@ static int em_bsr_c(struct x86_emulate_ctxt *ctxt)
 	return fastop(ctxt, em_bsr);
 }
 
-static u8 test_cc(unsigned int condition, unsigned long flags)
+static __always_inline u8 test_cc(unsigned int condition, unsigned long flags)
 {
 	u8 rc;
 	void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);
-- 
2.4.3
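
For readers who want to see the attribute in isolation, the following is
a minimal user-space sketch of forcing a helper to be inlined into its
caller.  It assumes GCC or Clang.  demo_test_cc(), demo_caller() and the
DEMO_FLAG_* constants are hypothetical, and the switch statement merely
stands in for the inline asm call that the real test_cc() makes; only a
few condition codes are modelled.

```c
/* The kernel provides __always_inline; define it here if it is missing. */
#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

typedef unsigned char u8;

/* x86 EFLAGS bits used by this sketch. */
#define DEMO_FLAG_CF 0x001UL	/* carry flag, bit 0 */
#define DEMO_FLAG_ZF 0x040UL	/* zero flag, bit 6 */

/*
 * Because of __always_inline, no out-of-line copy of this helper is
 * called; its body becomes part of the caller, so the caller's stack
 * frame is the one in effect.
 */
static __always_inline u8 demo_test_cc(unsigned int condition,
				       unsigned long flags)
{
	switch (condition & 0xf) {
	case 0x2:	/* B / C: carry set */
		return (flags & DEMO_FLAG_CF) != 0;
	case 0x3:	/* AE / NC: carry clear */
		return (flags & DEMO_FLAG_CF) == 0;
	case 0x4:	/* E / Z: zero set */
		return (flags & DEMO_FLAG_ZF) != 0;
	case 0x5:	/* NE / NZ: zero clear */
		return (flags & DEMO_FLAG_ZF) == 0;
	default:	/* remaining conditions omitted in this sketch */
		return 0;
	}
}

/* A caller that owns the stack frame the inlined helper runs in. */
int demo_caller(unsigned long flags)
{
	return demo_test_cc(0x4, flags);
}
```

Calling demo_caller() with the zero flag bit set returns 1, and with no
flags set it returns 0.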


* Re: [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-22 15:58         ` Josh Poimboeuf
@ 2016-01-22 17:18           ` Alexei Starovoitov
  2016-01-22 17:36             ` Josh Poimboeuf
  0 siblings, 1 reply; 133+ messages in thread
From: Alexei Starovoitov @ 2016-01-22 17:18 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Fri, Jan 22, 2016 at 09:58:04AM -0600, Josh Poimboeuf wrote:
> On Thu, Jan 21, 2016 at 08:18:46PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 21, 2016 at 09:55:31PM -0600, Josh Poimboeuf wrote:
> > > On Thu, Jan 21, 2016 at 06:44:28PM -0800, Alexei Starovoitov wrote:
> > > > On Thu, Jan 21, 2016 at 04:49:27PM -0600, Josh Poimboeuf wrote:
> > > > > bpf_jit.S has several callable non-leaf functions which don't honor
> > > > > CONFIG_FRAME_POINTER, which can result in bad stack traces.
> > > > > 
> > > > > Create a stack frame before the call instructions when
> > > > > CONFIG_FRAME_POINTER is enabled.
> > > > > 
> > > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > > Cc: netdev@vger.kernel.org
> > > > > ---
> > > > >  arch/x86/net/bpf_jit.S | 9 +++++++--
...
> > > > >  /* rsi contains offset and can be scratched */
> > > > >  #define bpf_slow_path_common(LEN)		\
> > > > > +	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
> > > > > +	FRAME_BEGIN;				\
> > > > >  	mov	%rbx, %rdi; /* arg1 == skb */	\
> > > > >  	push	%r9;				\
> > > > >  	push	SKBDATA;			\
> > > > >  /* rsi already has offset */			\
> > > > >  	mov	$LEN,%ecx;	/* len */	\
> > > > > -	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
> > > > >  	call	skb_copy_bits;			\
> > > > >  	test    %eax,%eax;			\
> > > > >  	pop	SKBDATA;			\
> > > > > -	pop	%r9;
> > > > > +	pop	%r9;				\
> > > > > +	FRAME_END
...
> > > Well, but the point of these patches isn't to make the tool happy.  It's
> > > really to make sure that runtime stack traces can be made reliable.
> > > Maybe I'm missing something but I don't see why JIT code can't honor
> > > CONFIG_FRAME_POINTER just like any other code.
> > 
> > It can if there is no performance cost added.
> 
> CONFIG_FRAME_POINTER always adds a small performance cost but as you
> mentioned it only affects the slow path here.  And hopefully we'll soon
> have an in-kernel DWARF unwinder on x86 so we can get rid of the need
> for frame pointers.

ok. fair enough.
Acked-by: Alexei Starovoitov <ast@kernel.org>


* Re: [PATCH 31/33] bpf: Add __bpf_prog_run() to stacktool whitelist
  2016-01-22  4:13     ` Josh Poimboeuf
@ 2016-01-22 17:19       ` Alexei Starovoitov
  0 siblings, 0 replies; 133+ messages in thread
From: Alexei Starovoitov @ 2016-01-22 17:19 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Thu, Jan 21, 2016 at 10:13:02PM -0600, Josh Poimboeuf wrote:
> On Thu, Jan 21, 2016 at 06:55:41PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 21, 2016 at 04:49:35PM -0600, Josh Poimboeuf wrote:
> > > stacktool reports the following false positive warnings:
> > > 
> > >   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x5c: sibling call from callable instruction with changed frame pointer
> > >   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x60: function has unreachable instruction
> > >   stacktool: kernel/bpf/core.o: __bpf_prog_run()+0x64: function has unreachable instruction
> > >   [...]
> > > 
> > > It's confused by the following dynamic jump instruction in
> > > __bpf_prog_run()::
> > > 
> > >   jmp     *(%r12,%rax,8)
> > > 
> > > which corresponds to the following line in the C code:
> > > 
> > >   goto *jumptable[insn->code];
> > > 
> > > There's no way for stacktool to deterministically find all possible
> > > branch targets for a dynamic jump, so it can't verify this code.
> > > 
> > > In this case the jumps all stay within the function, and there's nothing
> > > unusual going on related to the stack, so we can whitelist the function.
> > 
> > well, few things are very unusual in this function.
> > did you see what JMP_CALL does? it's a call into a different function,
> > but not like typical indirect call. Will it be ok as well?
> > 
> > In general it's not possible for any tool to identify all possible
> > branch targets. bpf programs can be loaded on the fly and
> > jumping sequence will change.
> > So if this marking says 'don't bother analyzing this function because
> > it does sane stuff' that's probably not the case.
> > If this marking says 'don't bother analyzing, the stack may be crazy
> > from here on' then it's ok.
> 
> So the tool doesn't need to follow all possible call targets.  Instead
> it just verifies that all functions follow the frame pointer convention.
> That way it doesn't matter *which* function is being called because they
> all do the right thing.
> 
> But it *does* need to follow all jump targets, so that it can analyze
> all possible code paths within the function itself.  With a dynamic
> jump, it can't do that.
> 
> So the JMP_CALL is fine, but the goto *jumptable[insn->code] isn't.
> (And BTW that's the only occurrence of such a dynamic jump table in the
> entire kernel.)

Acked-by: Alexei Starovoitov <ast@kernel.org>
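
For reference, the "goto *jumptable[...]" construct discussed above can
be reproduced with a tiny interpreter.  This is only a sketch, assuming
the GCC/Clang "labels as values" extension; the opcodes and run() are
hypothetical and unrelated to the real BPF instruction set.

```c
/* Hypothetical three-instruction bytecode. */
enum { OP_INC, OP_DBL, OP_EXIT };

/*
 * Interpret code[] starting with accumulator acc.  Each opcode must be
 * one of the three values above; anything else indexes past the table.
 */
long run(const unsigned char *code, long acc)
{
	/* Table of label addresses, indexed by opcode. */
	static const void *jumptable[] = {
		[OP_INC]  = &&do_inc,
		[OP_DBL]  = &&do_dbl,
		[OP_EXIT] = &&do_exit,
	};
	const unsigned char *insn = code;

	/* Dynamic jump: the target depends on run-time data. */
	goto *jumptable[*insn];

do_inc:
	acc += 1;
	insn++;
	goto *jumptable[*insn];

do_dbl:
	acc *= 2;
	insn++;
	goto *jumptable[*insn];

do_exit:
	return acc;
}
```

Running the program { OP_INC, OP_INC, OP_DBL, OP_EXIT } with an initial
accumulator of 0 returns 4.  Each "goto *" typically compiles to an
indirect jump, so a static tool looking only at the instruction has no
fixed list of targets to follow, even though every target stays inside
run().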


* Re: [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-22 17:18           ` Alexei Starovoitov
@ 2016-01-22 17:36             ` Josh Poimboeuf
  2016-01-22 17:40               ` Alexei Starovoitov
  0 siblings, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-22 17:36 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Fri, Jan 22, 2016 at 09:18:23AM -0800, Alexei Starovoitov wrote:
> On Fri, Jan 22, 2016 at 09:58:04AM -0600, Josh Poimboeuf wrote:
> > On Thu, Jan 21, 2016 at 08:18:46PM -0800, Alexei Starovoitov wrote:
> > > On Thu, Jan 21, 2016 at 09:55:31PM -0600, Josh Poimboeuf wrote:
> > > > On Thu, Jan 21, 2016 at 06:44:28PM -0800, Alexei Starovoitov wrote:
> > > > > On Thu, Jan 21, 2016 at 04:49:27PM -0600, Josh Poimboeuf wrote:
> > > > > > bpf_jit.S has several callable non-leaf functions which don't honor
> > > > > > CONFIG_FRAME_POINTER, which can result in bad stack traces.
> > > > > > 
> > > > > > Create a stack frame before the call instructions when
> > > > > > CONFIG_FRAME_POINTER is enabled.
> > > > > > 
> > > > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > > > Cc: netdev@vger.kernel.org
> > > > > > ---
> > > > > >  arch/x86/net/bpf_jit.S | 9 +++++++--
> ...
> > > > > >  /* rsi contains offset and can be scratched */
> > > > > >  #define bpf_slow_path_common(LEN)		\
> > > > > > +	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
> > > > > > +	FRAME_BEGIN;				\
> > > > > >  	mov	%rbx, %rdi; /* arg1 == skb */	\
> > > > > >  	push	%r9;				\
> > > > > >  	push	SKBDATA;			\
> > > > > >  /* rsi already has offset */			\
> > > > > >  	mov	$LEN,%ecx;	/* len */	\
> > > > > > -	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
> > > > > >  	call	skb_copy_bits;			\
> > > > > >  	test    %eax,%eax;			\
> > > > > >  	pop	SKBDATA;			\
> > > > > > -	pop	%r9;
> > > > > > +	pop	%r9;				\
> > > > > > +	FRAME_END
> ...
> > > > Well, but the point of these patches isn't to make the tool happy.  It's
> > > > really to make sure that runtime stack traces can be made reliable.
> > > > Maybe I'm missing something but I don't see why JIT code can't honor
> > > > CONFIG_FRAME_POINTER just like any other code.
> > > 
> > > It can if there is no performance cost added.
> > 
> > CONFIG_FRAME_POINTER always adds a small performance cost but as you
> > mentioned it only affects the slow path here.  And hopefully we'll soon
> > have an in-kernel DWARF unwinder on x86 so we can get rid of the need
> > for frame pointers.
> 
> ok. fair enough.
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Thanks!

Can I assume your ack also applies to the previous patch which adds the
ELF annotations ("x86/asm/bpf: Annotate callable functions")?

-- 
Josh


* Re: [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-22 17:36             ` Josh Poimboeuf
@ 2016-01-22 17:40               ` Alexei Starovoitov
  0 siblings, 0 replies; 133+ messages in thread
From: Alexei Starovoitov @ 2016-01-22 17:40 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, Alexei Starovoitov, netdev,
	daniel

On Fri, Jan 22, 2016 at 11:36:14AM -0600, Josh Poimboeuf wrote:
> On Fri, Jan 22, 2016 at 09:18:23AM -0800, Alexei Starovoitov wrote:
> > On Fri, Jan 22, 2016 at 09:58:04AM -0600, Josh Poimboeuf wrote:
> > > On Thu, Jan 21, 2016 at 08:18:46PM -0800, Alexei Starovoitov wrote:
> > > > On Thu, Jan 21, 2016 at 09:55:31PM -0600, Josh Poimboeuf wrote:
> > > > > On Thu, Jan 21, 2016 at 06:44:28PM -0800, Alexei Starovoitov wrote:
> > > > > > On Thu, Jan 21, 2016 at 04:49:27PM -0600, Josh Poimboeuf wrote:
> > > > > > > bpf_jit.S has several callable non-leaf functions which don't honor
> > > > > > > CONFIG_FRAME_POINTER, which can result in bad stack traces.
> > > > > > > 
> > > > > > > Create a stack frame before the call instructions when
> > > > > > > CONFIG_FRAME_POINTER is enabled.
> > > > > > > 
> > > > > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > > > > Cc: netdev@vger.kernel.org
> > > > > > > ---
> > > > > > >  arch/x86/net/bpf_jit.S | 9 +++++++--
> > ...
> > > > > > >  /* rsi contains offset and can be scratched */
> > > > > > >  #define bpf_slow_path_common(LEN)		\
> > > > > > > +	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
> > > > > > > +	FRAME_BEGIN;				\
> > > > > > >  	mov	%rbx, %rdi; /* arg1 == skb */	\
> > > > > > >  	push	%r9;				\
> > > > > > >  	push	SKBDATA;			\
> > > > > > >  /* rsi already has offset */			\
> > > > > > >  	mov	$LEN,%ecx;	/* len */	\
> > > > > > > -	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
> > > > > > >  	call	skb_copy_bits;			\
> > > > > > >  	test    %eax,%eax;			\
> > > > > > >  	pop	SKBDATA;			\
> > > > > > > -	pop	%r9;
> > > > > > > +	pop	%r9;				\
> > > > > > > +	FRAME_END
> > ...
> > > > > Well, but the point of these patches isn't to make the tool happy.  It's
> > > > > really to make sure that runtime stack traces can be made reliable.
> > > > > Maybe I'm missing something but I don't see why JIT code can't honor
> > > > > CONFIG_FRAME_POINTER just like any other code.
> > > > 
> > > > It can if there is no performance cost added.
> > > 
> > > CONFIG_FRAME_POINTER always adds a small performance cost but as you
> > > mentioned it only affects the slow path here.  And hopefully we'll soon
> > > have an in-kernel DWARF unwinder on x86 so we can get rid of the need
> > > for frame pointers.
> > 
> > ok. fair enough.
> > Acked-by: Alexei Starovoitov <ast@kernel.org>
> 
> Thanks!
> 
> Can I assume your ack also applies to the previous patch which adds the
> ELF annotations ("x86/asm/bpf: Annotate callable functions")?

Yes. Thanks.


* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (32 preceding siblings ...)
  2016-01-21 22:49 ` [PATCH 33/33] x86/kprobes: Add kretprobe_trampoline() " Josh Poimboeuf
@ 2016-01-22 17:43 ` Chris J Arges
  2016-01-22 19:14   ` Josh Poimboeuf
  2016-02-12 10:36 ` [PATCH 00/33] Compile-time stack metadata validation Jiri Slaby
  2016-02-23  8:14 ` Ingo Molnar
  35 siblings, 1 reply; 133+ messages in thread
From: Chris J Arges @ 2016-01-22 17:43 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Andrew Morton, Jiri Slaby,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Thu, Jan 21, 2016 at 04:49:04PM -0600, Josh Poimboeuf wrote:
> This is v16 of the compile-time stack metadata validation patch set,
> along with proposed fixes for most of the warnings it found.  It's based
> on the tip/master branch.
>
Josh,

Looks good.  With my config [1] I do still get a few warnings building
linux/linux-next.

Here are the warnings:
$ grep ^stacktool build.log | grep -v staging
stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup
stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: return without frame pointer restore
stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x89: duplicate frame pointer save
stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x8a: duplicate frame pointer setup
stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: frame pointer state mismatch
stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x0: frame pointer state mismatch
stacktool: fs/reiserfs/ibalance.o: .text: unexpected end of section
stacktool: fs/reiserfs/tail_conversion.o: .text: unexpected end of section

For vmx_handle_external_intr, I'm wondering if ignoring this function is the
best option.

--

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e2951b6..d19dfb2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -33,6 +33,7 @@
 #include <linux/slab.h>
 #include <linux/tboot.h>
 #include <linux/hrtimer.h>
+#include <linux/stacktool.h>
 #include "kvm_cache_regs.h"
 #include "x86.h"
 
@@ -8398,6 +8399,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
        } else
                local_irq_enable();
 }
+STACKTOOL_IGNORE_FUNC(vmx_handle_external_intr);
 
 static bool vmx_has_high_real_mode_segbase(void)
 {

--chris

[1] http://paste.ubuntu.com/14599083/

 
> v15 can be found here:
> 
>   https://lkml.kernel.org/r/cover.1450442274.git.jpoimboe@redhat.com
> 
> For more information about the motivation behind this patch set, and
> more details about what it does, see the first patch changelog and
> tools/stacktool/Documentation/stack-validation.txt.
> 
> Patches 1-4 add stacktool and integrate it into the kernel build.
> 
> Patches 5-28 are some proposed fixes for several of the warnings
> reported by stacktool.  They've been compile-tested and boot-tested in a
> VM, but I haven't attempted any meaningful testing for many of them.
> 
> Patches 29-33 add some directories, files, and functions to the
> stacktool whitelist in order to silence false positive warnings.
> 
> v16:
> - fix all allyesconfig warnings, except for staging
> - get rid of STACKTOOL_IGNORE_INSN which is no longer needed
> - remove several whitelists in favor of automatically whitelisting any
>   function with a special instruction like ljmp, lret, or vmrun
> - split up stacktool patch into 3 parts as suggested by Ingo
> - update the global noreturn function list
> - detect noreturn function fallthroughs
> - skip weak functions in noreturn call detection logic
> - add empty function check to noreturn logic
> - allow non-section rela symbols for __ex_table sections
> - support rare switch table case with jmpq *[addr](%rip)
> - don't warn on frame pointer restore without save
> - rearrange patch order a bit
> 
> v15:
> - restructure code for a new cmdline interface "stacktool check" using
>   the new subcommand framework in tools/lib/subcmd
> - fix 32 bit build fail (put __sp at end) in paravirt_types.h patch 10
>   which was reported by 0day
> 
> v14:
> - make tools/include/linux/list.h self-sufficient
> - create FRAME_OFFSET to allow 32-bit code to be able to access function
>   arguments on the stack
> - add FRAME_OFFSET usage in crypto patch 14/24: "Create stack frames in
>   aesni-intel_asm.S"
> - rename "index" -> "idx" to fix build with some compilers
> 
> v13:
> - LDFLAGS order fix from Chris J Arges
> - new warning fix patches from Chris J Arges
> - "--frame-pointer" -> "--check-frame-pointer"
> 
> v12:
> - rename "stackvalidate" -> "stacktool"
> - move from scripts/ to tools/:
>   - makefile rework
>   - make a copy of the x86 insn code (and warn if the code diverges)
>   - use tools/include/linux/list.h
> - move warning macros to a new warn.h file
> - change wording: "stack validation" -> "stack metadata validation"
> 
> v11:
> - attempt to answer the "why" question better in the documentation and
>   commit message
> - s/FP_SAVE/FRAME_BEGIN/ in documentation
> 
> v10:
> - add scripts/mod to directory ignores
> - remove circular dependencies for ignored objects which are built
>   before stackvalidate
> - fix CONFIG_MODVERSIONS incompatibility
> 
> v9:
> - rename FRAME/ENDFRAME -> FRAME_BEGIN/FRAME_END
> - fix jump table issue for when the original instruction is a jump
> - drop paravirt thunk alignment patch
> - add maintainers to CC for proposed warning fixes
> 
> v8:
> - add proposed fixes for warnings
> - fix all memory leaks
> - process ignores earlier and add more ignore checks
> - always assume POPCNT alternative is enabled
> - drop hweight inline asm fix
> - drop __schedule() ignore patch
> - change .Ltemp_\@ to .Lstackvalidate_ignore_\@ in asm macro
> - fix CONFIG_* checks in asm macros
> - add C versions of ignore macros and frame macros
> - change ";" to "\n" in C macros
> - add ifdef CONFIG_STACK_VALIDATION checks in C ignore macros
> - use numbered label in C ignore macro
> - add missing break in switch case statement in arch-x86.c
> 
> v7:
> - sibling call support
> - document proposed solution for inline asm() frame pointer issues
> - say "kernel entry/exit" instead of "context switch"
> - clarify the checking of switch statement jump tables
> - discard __stackvalidate_ignore_* sections in linker script
> - use .Ltemp_\@ to get a unique label instead of static 3-digit number
> - change STACKVALIDATE_IGNORE_FUNC variable to a static
> - move STACKVALIDATE_IGNORE_INSN to arch-specific .h file
> 
> v6:
> - rename asmvalidate -> stackvalidate (again)
> - gcc-generated object file support
> - recursive branch state analysis
> - external jump support
> - fixup/exception table support
> - jump label support
> - switch statement jump table support
> - added documentation
> - detection of "noreturn" dead end functions
> - added a Kbuild mechanism for skipping files and dirs
> - moved frame pointer macros to arch/x86/include/asm/frame.h
> - moved ignore macros to include/linux/stackvalidate.h
> 
> v5:
> - stackvalidate -> asmvalidate
> - frame pointers only required for non-leaf functions
> - check for the use of the FP_SAVE/RESTORE macros instead of manually
>   analyzing code to detect frame pointer usage
> - additional checks to ensure each function doesn't leave its boundaries
> - make the macros simpler and more flexible
> - support for analyzing ALTERNATIVE macros
> - simplified the arch interfaces in scripts/asmvalidate/arch.h
> - fixed some asmvalidate warnings
> - rebased onto latest tip asm cleanups
> - many more small changes
> 
> v4:
> - Changed the default to CONFIG_STACK_VALIDATION=n, until all the asm
>   code can get cleaned up.
> - Fixed a stackvalidate error path exit code issue found by Michal
>   Marek.
> 
> v3:
> - Added a patch to make the push/pop CFI macros arch-independent, as
>   suggested by H. Peter Anvin
> 
> v2:
> - Fixed memory leaks reported by Petr Mladek
> 
> Cc: linux-kernel@vger.kernel.org
> Cc: live-patching@vger.kernel.org
> Cc: Michal Marek <mmarek@suse.cz>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andi Kleen <andi@firstfloor.org>
> Cc: Pedro Alves <palves@redhat.com>
> Cc: Namhyung Kim <namhyung@gmail.com>
> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
> Cc: Chris J Arges <chris.j.arges@canonical.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jiri Slaby <jslaby@suse.cz>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> 
> Chris J Arges (1):
>   x86/uaccess: Add stack frame output operand in get_user inline asm
> 
> Josh Poimboeuf (32):
>   x86/stacktool: Compile-time stack metadata validation
>   kbuild/stacktool: Add CONFIG_STACK_VALIDATION option
>   x86/stacktool: Enable stacktool on x86_64
>   x86/stacktool: Add STACKTOOL_IGNORE_FUNC macro
>   x86/xen: Add stack frame dependency to hypercall inline asm calls
>   x86/asm/xen: Set ELF function type for xen_adjust_exception_frame()
>   x86/asm/xen: Create stack frames in xen-asm.S
>   x86/paravirt: Add stack frame dependency to PVOP inline asm calls
>   x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK
>   x86/amd: Set ELF function type for vide()
>   x86/asm/crypto: Move .Lbswap_mask data to .rodata section
>   x86/asm/crypto: Move jump_table to .rodata section
>   x86/asm/crypto: Simplify stack usage in sha-mb functions
>   x86/asm/crypto: Don't use rbp as a scratch register
>   x86/asm/crypto: Create stack frames in crypto functions
>   x86/asm/entry: Create stack frames in thunk functions
>   x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()
>   x86/asm: Create stack frames in rwsem functions
>   x86/asm/efi: Create a stack frame in efi_call()
>   x86/asm/power: Create stack frames in hibernate_asm_64.S
>   x86/asm/bpf: Annotate callable functions
>   x86/asm/bpf: Create stack frames in bpf_jit.S
>   x86/kprobes: Get rid of kretprobe_trampoline_holder()
>   x86/kvm: Set ELF function type for fastop functions
>   x86/kvm: Add stack frame dependency to test_cc() inline asm
>   watchdog/hpwdt: Create stack frame in asminline_call()
>   x86/locking: Create stack frame in PV unlock
>   x86/stacktool: Add directory and file whitelists
>   x86/xen: Add xen_cpuid() to stacktool whitelist
>   bpf: Add __bpf_prog_run() to stacktool whitelist
>   sched: Add __schedule() to stacktool whitelist
>   x86/kprobes: Add kretprobe_trampoline() to stacktool whitelist
> 
>  MAINTAINERS                                        |   6 +
>  Makefile                                           |   5 +-
>  arch/Kconfig                                       |   6 +
>  arch/x86/Kconfig                                   |   1 +
>  arch/x86/boot/Makefile                             |   1 +
>  arch/x86/boot/compressed/Makefile                  |   3 +-
>  arch/x86/crypto/aesni-intel_asm.S                  |  75 +-
>  arch/x86/crypto/camellia-aesni-avx-asm_64.S        |  15 +
>  arch/x86/crypto/camellia-aesni-avx2-asm_64.S       |  15 +
>  arch/x86/crypto/cast5-avx-x86_64-asm_64.S          |   9 +
>  arch/x86/crypto/cast6-avx-x86_64-asm_64.S          |  13 +
>  arch/x86/crypto/crc32c-pcl-intel-asm_64.S          |   8 +-
>  arch/x86/crypto/ghash-clmulni-intel_asm.S          |   5 +
>  arch/x86/crypto/serpent-avx-x86_64-asm_64.S        |  13 +
>  arch/x86/crypto/serpent-avx2-asm_64.S              |  13 +
>  arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S    |  35 +-
>  arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S   |  36 +-
>  arch/x86/crypto/twofish-avx-x86_64-asm_64.S        |  13 +
>  arch/x86/entry/Makefile                            |   4 +
>  arch/x86/entry/thunk_64.S                          |   4 +
>  arch/x86/entry/vdso/Makefile                       |   5 +-
>  arch/x86/include/asm/paravirt.h                    |   9 +-
>  arch/x86/include/asm/paravirt_types.h              |  18 +-
>  arch/x86/include/asm/qspinlock_paravirt.h          |   4 +
>  arch/x86/include/asm/uaccess.h                     |   5 +-
>  arch/x86/include/asm/xen/hypercall.h               |   5 +-
>  arch/x86/kernel/Makefile                           |   5 +
>  arch/x86/kernel/acpi/wakeup_64.S                   |   3 +
>  arch/x86/kernel/cpu/amd.c                          |   5 +-
>  arch/x86/kernel/kprobes/core.c                     |  59 +-
>  arch/x86/kernel/vmlinux.lds.S                      |   5 +-
>  arch/x86/kvm/emulate.c                             |  33 +-
>  arch/x86/lib/rwsem.S                               |  11 +-
>  arch/x86/net/bpf_jit.S                             |  48 +-
>  arch/x86/platform/efi/Makefile                     |   2 +
>  arch/x86/platform/efi/efi_stub_64.S                |   3 +
>  arch/x86/power/hibernate_asm_64.S                  |   7 +
>  arch/x86/purgatory/Makefile                        |   2 +
>  arch/x86/realmode/Makefile                         |   4 +-
>  arch/x86/realmode/rm/Makefile                      |   3 +-
>  arch/x86/xen/enlighten.c                           |   3 +-
>  arch/x86/xen/xen-asm.S                             |  10 +-
>  arch/x86/xen/xen-asm_64.S                          |   1 +
>  drivers/firmware/efi/libstub/Makefile              |   1 +
>  drivers/watchdog/hpwdt.c                           |   8 +-
>  include/linux/stacktool.h                          |  23 +
>  kernel/bpf/core.c                                  |   2 +
>  kernel/sched/core.c                                |   2 +
>  lib/Kconfig.debug                                  |  12 +
>  scripts/Makefile.build                             |  38 +-
>  scripts/mod/Makefile                               |   2 +
>  tools/Makefile                                     |  14 +-
>  tools/stacktool/.gitignore                         |   2 +
>  tools/stacktool/Build                              |  13 +
>  tools/stacktool/Documentation/stack-validation.txt | 333 +++++++
>  tools/stacktool/Makefile                           |  60 ++
>  tools/stacktool/arch.h                             |  44 +
>  tools/stacktool/arch/x86/Build                     |  12 +
>  tools/stacktool/arch/x86/decode.c                  | 172 ++++
>  .../stacktool/arch/x86/insn/gen-insn-attr-x86.awk  | 387 ++++++++
>  tools/stacktool/arch/x86/insn/inat.c               |  97 ++
>  tools/stacktool/arch/x86/insn/inat.h               | 221 +++++
>  tools/stacktool/arch/x86/insn/inat_types.h         |  29 +
>  tools/stacktool/arch/x86/insn/insn.c               | 594 ++++++++++++
>  tools/stacktool/arch/x86/insn/insn.h               | 201 +++++
>  tools/stacktool/arch/x86/insn/x86-opcode-map.txt   | 984 ++++++++++++++++++++
>  tools/stacktool/builtin-check.c                    | 991 +++++++++++++++++++++
>  tools/stacktool/builtin.h                          |  22 +
>  tools/stacktool/elf.c                              | 403 +++++++++
>  tools/stacktool/elf.h                              |  79 ++
>  tools/stacktool/special.c                          | 193 ++++
>  tools/stacktool/special.h                          |  42 +
>  tools/stacktool/stacktool.c                        | 134 +++
>  tools/stacktool/warn.h                             |  60 ++
>  74 files changed, 5516 insertions(+), 189 deletions(-)
>  create mode 100644 include/linux/stacktool.h
>  create mode 100644 tools/stacktool/.gitignore
>  create mode 100644 tools/stacktool/Build
>  create mode 100644 tools/stacktool/Documentation/stack-validation.txt
>  create mode 100644 tools/stacktool/Makefile
>  create mode 100644 tools/stacktool/arch.h
>  create mode 100644 tools/stacktool/arch/x86/Build
>  create mode 100644 tools/stacktool/arch/x86/decode.c
>  create mode 100644 tools/stacktool/arch/x86/insn/gen-insn-attr-x86.awk
>  create mode 100644 tools/stacktool/arch/x86/insn/inat.c
>  create mode 100644 tools/stacktool/arch/x86/insn/inat.h
>  create mode 100644 tools/stacktool/arch/x86/insn/inat_types.h
>  create mode 100644 tools/stacktool/arch/x86/insn/insn.c
>  create mode 100644 tools/stacktool/arch/x86/insn/insn.h
>  create mode 100644 tools/stacktool/arch/x86/insn/x86-opcode-map.txt
>  create mode 100644 tools/stacktool/builtin-check.c
>  create mode 100644 tools/stacktool/builtin.h
>  create mode 100644 tools/stacktool/elf.c
>  create mode 100644 tools/stacktool/elf.h
>  create mode 100644 tools/stacktool/special.c
>  create mode 100644 tools/stacktool/special.h
>  create mode 100644 tools/stacktool/stacktool.c
>  create mode 100644 tools/stacktool/warn.h
> 
> -- 
> 2.4.3
> 

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-01-22 17:43 ` [PATCH 00/33] Compile-time stack metadata validation Chris J Arges
@ 2016-01-22 19:14   ` Josh Poimboeuf
  2016-01-22 20:40     ` Chris J Arges
  0 siblings, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-22 19:14 UTC (permalink / raw)
  To: Chris J Arges
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Andrew Morton, Jiri Slaby,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Jan 22, 2016 at 11:43:48AM -0600, Chris J Arges wrote:
> On Thu, Jan 21, 2016 at 04:49:04PM -0600, Josh Poimboeuf wrote:
> > This is v16 of the compile-time stack metadata validation patch set,
> > along with proposed fixes for most of the warnings it found.  It's based
> > on the tip/master branch.
> >
> Josh,
> 
> Looks good, with my config [1] I do still get a few warnings building
> linux/linux-next.
> 
> Here are the warnings:
> $ grep ^stacktool build.log | grep -v staging

Thanks for reporting these!

> stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup

This can be fixed by setting the stack pointer as an output operand for
the inline asm call in vmx_handle_external_intr().

Feel free to submit a patch, or I'll get around to it eventually.

> stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: return without frame pointer restore
> stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x89: duplicate frame pointer save
> stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x8a: duplicate frame pointer setup
> stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: frame pointer state mismatch
> stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x0: frame pointer state mismatch

These are false positives.  Stacktool is confused by the use of a
"noreturn" function which it doesn't know about (__reiserfs_panic).

Unfortunately the only solution I currently have for dealing with global
noreturn functions is to just hard-code a list of them.  So the short
term fix would be to add "__reiserfs_panic" to the global_noreturns list
in tools/stacktool/builtin-check.c.

I'm still trying to figure out a better way to deal with this type of
issue, as it's a pain to have to keep a hard-coded list of noreturn
functions.  Unfortunately that info isn't available in the ELF.
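
For illustration only, a minimal sketch of the hard-coded list approach
described above.  It assumes a plain name lookup; is_global_noreturn() is a
hypothetical helper, not the actual code in builtin-check.c, and the list
holds only the entries visible in the patch context in this thread:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Shortened, illustrative list of global functions known never to return.
 * A real list would need an entry for every such function in the kernel.
 */
static const char * const global_noreturns[] = {
	"__module_put_and_exit",
	"complete_and_exit",
	"kvm_spurious_fault",
	"__reiserfs_panic",
};

/* Hypothetical helper: true if 'name' appears in the hard-coded list. */
static bool is_global_noreturn(const char *name)
{
	size_t i;
	size_t n = sizeof(global_noreturns) / sizeof(global_noreturns[0]);

	for (i = 0; i < n; i++) {
		if (strcmp(name, global_noreturns[i]) == 0)
			return true;
	}

	return false;
}
```

With a lookup like this, a call to "__reiserfs_panic" would be treated as a
dead end, while a call to any function missing from the list is assumed to
return, which is why every new noreturn function needs its own entry.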

> stacktool: fs/reiserfs/ibalance.o: .text: unexpected end of section
> stacktool: fs/reiserfs/tail_conversion.o: .text: unexpected end of section

For some reason I'm not able to recreate these warnings...  Can you
share one of the .o files?

-- 
Josh

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-01-22 19:14   ` Josh Poimboeuf
@ 2016-01-22 20:40     ` Chris J Arges
  2016-01-22 20:47       ` Josh Poimboeuf
  0 siblings, 1 reply; 133+ messages in thread
From: Chris J Arges @ 2016-01-22 20:40 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Andrew Morton, Jiri Slaby,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Jan 22, 2016 at 01:14:47PM -0600, Josh Poimboeuf wrote:
> On Fri, Jan 22, 2016 at 11:43:48AM -0600, Chris J Arges wrote:
> > On Thu, Jan 21, 2016 at 04:49:04PM -0600, Josh Poimboeuf wrote:
> > > This is v16 of the compile-time stack metadata validation patch set,
> > > along with proposed fixes for most of the warnings it found.  It's based
> > > on the tip/master branch.
> > >
> > Josh,
> > 
> > Looks good, with my config [1] I do still get a few warnings building
> > linux/linux-next.
> > 
> > Here are the warnings:
> > $ grep ^stacktool build.log | grep -v staging
> 
> Thanks for reporting these!
> 
> > stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup
> 
> This can be fixed by setting the stack pointer as an output operand for
> the inline asm call in vmx_handle_external_intr().
> 
> Feel free to submit a patch, or I'll get around to it eventually.
> 
> > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: return without frame pointer restore
> > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x89: duplicate frame pointer save
> > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x8a: duplicate frame pointer setup
> > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: frame pointer state mismatch
> > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x0: frame pointer state mismatch
> 
> These are false positives.  Stacktool is confused by the use of a
> "noreturn" function which it doesn't know about (__reiserfs_panic).
> 
> Unfortunately the only solution I currently have for dealing with global
> noreturn functions is to just hard-code a list of them.  So the short
> term fix would be to add "__reiserfs_panic" to the global_noreturns list
> in tools/stacktool/builtin-check.c.
> 
> I'm still trying to figure out a better way to deal with this type of
> issue, as it's a pain to have to keep a hard-coded list of noreturn
> functions.  Unfortunately that info isn't available in the ELF.
> 

Josh,
Ok I'll hack on the patches above.

> > stacktool: fs/reiserfs/ibalance.o: .text: unexpected end of section
> > stacktool: fs/reiserfs/tail_conversion.o: .text: unexpected end of section
> 
> For some reason I'm not able to recreate these warnings...  Can you
> share one of the .o files?
> 
> -- 
> Josh
> 

Binaries are here:
http://people.canonical.com/~arges/stacktool/

--chris

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-01-22 20:40     ` Chris J Arges
@ 2016-01-22 20:47       ` Josh Poimboeuf
  2016-01-22 21:44         ` [PATCH 0/2] A few stacktool warning fixes Chris J Arges
  0 siblings, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-22 20:47 UTC (permalink / raw)
  To: Chris J Arges
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Andrew Morton, Jiri Slaby,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Jan 22, 2016 at 02:40:35PM -0600, Chris J Arges wrote:
> On Fri, Jan 22, 2016 at 01:14:47PM -0600, Josh Poimboeuf wrote:
> > On Fri, Jan 22, 2016 at 11:43:48AM -0600, Chris J Arges wrote:
> > > On Thu, Jan 21, 2016 at 04:49:04PM -0600, Josh Poimboeuf wrote:
> > > > This is v16 of the compile-time stack metadata validation patch set,
> > > > along with proposed fixes for most of the warnings it found.  It's based
> > > > on the tip/master branch.
> > > >
> > > Josh,
> > > 
> > > Looks good, with my config [1] I do still get a few warnings building
> > > linux/linux-next.
> > > 
> > > Here are the warnings:
> > > $ grep ^stacktool build.log | grep -v staging
> > 
> > Thanks for reporting these!
> > 
> > > stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup
> > 
> > This can be fixed by setting the stack pointer as an output operand for
> > the inline asm call in vmx_handle_external_intr().
> > 
> > Feel free to submit a patch, or I'll get around to it eventually.
> > 
> > > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: return without frame pointer restore
> > > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x89: duplicate frame pointer save
> > > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x8a: duplicate frame pointer setup
> > > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: frame pointer state mismatch
> > > stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x0: frame pointer state mismatch
> > 
> > These are false positives.  Stacktool is confused by the use of a
> > "noreturn" function which it doesn't know about (__reiserfs_panic).
> > 
> > Unfortunately the only solution I currently have for dealing with global
> > noreturn functions is to just hard-code a list of them.  So the short
> > term fix would be to add "__reiserfs_panic" to the global_noreturns list
> > in tools/stacktool/builtin-check.c.
> > 
> > I'm still trying to figure out a better way to deal with this type of
> > issue, as it's a pain to have to keep a hard-coded list of noreturn
> > functions.  Unfortunately that info isn't available in the ELF.
> > 
> 
> Josh,
> Ok I'll hack on the patches above.
> 
> > > stacktool: fs/reiserfs/ibalance.o: .text: unexpected end of section
> > > stacktool: fs/reiserfs/tail_conversion.o: .text: unexpected end of section
> > 
> > For some reason I'm not able to recreate these warnings...  Can you
> > share one of the .o files?
> 
> Binaries are here:
> http://people.canonical.com/~arges/stacktool/

Thanks, looks like the same __reiserfs_panic() noreturn fix for those.

-- 
Josh

* [PATCH 0/2] A few stacktool warning fixes.
  2016-01-22 20:47       ` Josh Poimboeuf
@ 2016-01-22 21:44         ` Chris J Arges
  2016-01-22 21:44           ` [PATCH 1/2] tools/stacktool: Add __reiserfs_panic to global_noreturns list Chris J Arges
  2016-01-22 21:44           ` [PATCH 2/2] x86/kvm: Add output operand in vmx_handle_external_intr inline asm Chris J Arges
  0 siblings, 2 replies; 133+ messages in thread
From: Chris J Arges @ 2016-01-22 21:44 UTC (permalink / raw)
  To: jpoimboe, live-patching, x86, pbonzini
  Cc: gleb, tglx, mingo, hpa, kvm, linux-kernel, Chris J Arges

These patches fix a few warnings I saw while testing stacktool v16.
I've lightly tested them by booting a patched kernel on a machine and
running kvm-unit-tests on it.

Chris J Arges (2):
  tools/stacktool: Add __reiserfs_panic to global_noreturns list
  x86/kvm: Add output operand in vmx_handle_external_intr inline asm

 arch/x86/kvm/vmx.c              | 4 +++-
 tools/stacktool/builtin-check.c | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

-- 
2.5.0

* [PATCH 1/2] tools/stacktool: Add __reiserfs_panic to global_noreturns list
  2016-01-22 21:44         ` [PATCH 0/2] A few stacktool warning fixes Chris J Arges
@ 2016-01-22 21:44           ` Chris J Arges
  2016-01-25 15:04             ` Josh Poimboeuf
  2016-01-22 21:44           ` [PATCH 2/2] x86/kvm: Add output operand in vmx_handle_external_intr inline asm Chris J Arges
  1 sibling, 1 reply; 133+ messages in thread
From: Chris J Arges @ 2016-01-22 21:44 UTC (permalink / raw)
  To: jpoimboe, live-patching, x86, pbonzini
  Cc: gleb, tglx, mingo, hpa, kvm, linux-kernel, Chris J Arges

The following false positives were noticed with stacktool:
  stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: return without frame pointer restore
  stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x89: duplicate frame pointer save
  stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x8a: duplicate frame pointer setup
  stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: frame pointer state mismatch
  stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x0: frame pointer state mismatch

These all call into '__reiserfs_panic' which has a noreturn attribute. Add this
to the global list because this particular attribute cannot be determined from
reading the ELF object.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
---
 tools/stacktool/builtin-check.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/stacktool/builtin-check.c b/tools/stacktool/builtin-check.c
index 5b0e91f..23fa93d2 100644
--- a/tools/stacktool/builtin-check.c
+++ b/tools/stacktool/builtin-check.c
@@ -139,6 +139,7 @@ static bool dead_end_function(struct stacktool_file *file, struct symbol *func)
 		"__module_put_and_exit",
 		"complete_and_exit",
 		"kvm_spurious_fault",
+		"__reiserfs_panic",
 	};
 
 	if (func->bind == STB_WEAK)
-- 
2.5.0

* [PATCH 2/2] x86/kvm: Add output operand in vmx_handle_external_intr inline asm
  2016-01-22 21:44         ` [PATCH 0/2] A few stacktool warning fixes Chris J Arges
  2016-01-22 21:44           ` [PATCH 1/2] tools/stacktool: Add __reiserfs_panic to global_noreturns list Chris J Arges
@ 2016-01-22 21:44           ` Chris J Arges
  2016-01-25 15:05             ` Josh Poimboeuf
                               ` (2 more replies)
  1 sibling, 3 replies; 133+ messages in thread
From: Chris J Arges @ 2016-01-22 21:44 UTC (permalink / raw)
  To: jpoimboe, live-patching, x86, pbonzini
  Cc: gleb, tglx, mingo, hpa, kvm, linux-kernel, Chris J Arges

Stacktool generates the following warning:
  stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup

By adding the stack pointer as an output operand, this patch ensures that a
stack frame is created for the inline assembly statement when
CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
---
 arch/x86/kvm/vmx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e2951b6..e153522 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8356,6 +8356,7 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
 static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 {
 	u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
+	register void *__sp asm(_ASM_SP);
 
 	/*
 	 * If external interrupt exists, IF bit is set in rflags/eflags on the
@@ -8388,8 +8389,9 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 			"call *%[entry]\n\t"
 			:
 #ifdef CONFIG_X86_64
-			[sp]"=&r"(tmp)
+			[sp]"=&r"(tmp),
 #endif
+			"+r"(__sp)
 			:
 			[entry]"r"(entry),
 			[ss]"i"(__KERNEL_DS),
-- 
2.5.0

* Re: [PATCH 1/2] tools/stacktool: Add __reiserfs_panic to global_noreturns list
  2016-01-22 21:44           ` [PATCH 1/2] tools/stacktool: Add __reiserfs_panic to global_noreturns list Chris J Arges
@ 2016-01-25 15:04             ` Josh Poimboeuf
  0 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-25 15:04 UTC (permalink / raw)
  To: Chris J Arges
  Cc: live-patching, x86, pbonzini, gleb, tglx, mingo, hpa, kvm, linux-kernel

On Fri, Jan 22, 2016 at 03:44:37PM -0600, Chris J Arges wrote:
> The following false positives were noticed with stacktool:
>   stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: return without frame pointer restore
>   stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x89: duplicate frame pointer save
>   stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x8a: duplicate frame pointer setup
>   stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x9e: frame pointer state mismatch
>   stacktool: fs/reiserfs/namei.o: set_de_name_and_namelen()+0x0: frame pointer state mismatch
> 
> These all call into '__reiserfs_panic' which has a noreturn attribute. Add this
> to the global list because this particular attribute cannot be determined from
> reading the ELF object.
> 
> Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>

Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>

> ---
>  tools/stacktool/builtin-check.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/stacktool/builtin-check.c b/tools/stacktool/builtin-check.c
> index 5b0e91f..23fa93d2 100644
> --- a/tools/stacktool/builtin-check.c
> +++ b/tools/stacktool/builtin-check.c
> @@ -139,6 +139,7 @@ static bool dead_end_function(struct stacktool_file *file, struct symbol *func)
>  		"__module_put_and_exit",
>  		"complete_and_exit",
>  		"kvm_spurious_fault",
> +		"__reiserfs_panic",
>  	};
>  
>  	if (func->bind == STB_WEAK)
> -- 
> 2.5.0
> 

-- 
Josh

* Re: [PATCH 2/2] x86/kvm: Add output operand in vmx_handle_external_intr inline asm
  2016-01-22 21:44           ` [PATCH 2/2] x86/kvm: Add output operand in vmx_handle_external_intr inline asm Chris J Arges
@ 2016-01-25 15:05             ` Josh Poimboeuf
  2016-02-23  9:05             ` [tip:x86/debug] " tip-bot for Chris J Arges
  2016-02-25  5:53             ` tip-bot for Chris J Arges
  2 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-01-25 15:05 UTC (permalink / raw)
  To: Chris J Arges
  Cc: live-patching, x86, pbonzini, gleb, tglx, mingo, hpa, kvm, linux-kernel

On Fri, Jan 22, 2016 at 03:44:38PM -0600, Chris J Arges wrote:
> Stacktool generates the following warning:
>   stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup
> 
> By adding the stack pointer as an output operand, this patch ensures that a
> stack frame is created for the inline assembly statement when
> CONFIG_FRAME_POINTER is enabled.
> 
> Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>

Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

> ---
>  arch/x86/kvm/vmx.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index e2951b6..e153522 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -8356,6 +8356,7 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
>  static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
>  {
>  	u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> +	register void *__sp asm(_ASM_SP);
>  
>  	/*
>  	 * If external interrupt exists, IF bit is set in rflags/eflags on the
> @@ -8388,8 +8389,9 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
>  			"call *%[entry]\n\t"
>  			:
>  #ifdef CONFIG_X86_64
> -			[sp]"=&r"(tmp)
> +			[sp]"=&r"(tmp),
>  #endif
> +			"+r"(__sp)
>  			:
>  			[entry]"r"(entry),
>  			[ss]"i"(__KERNEL_DS),
> -- 
> 2.5.0
> 

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (33 preceding siblings ...)
  2016-01-22 17:43 ` [PATCH 00/33] Compile-time stack metadata validation Chris J Arges
@ 2016-02-12 10:36 ` Jiri Slaby
  2016-02-12 10:41   ` Jiri Slaby
  2016-02-12 14:45   ` Josh Poimboeuf
  2016-02-23  8:14 ` Ingo Molnar
  35 siblings, 2 replies; 133+ messages in thread
From: Jiri Slaby @ 2016-02-12 10:36 UTC (permalink / raw)
  To: Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Arnaldo Carvalho de Melo, David Vrabel,
	Borislav Petkov, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	Jeremy Fitzhardinge, Chris Wright, Alok Kataria, Rusty Russell,
	Herbert Xu, David S. Miller, Pavel Machek, Rafael J. Wysocki,
	Len Brown, Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On 01/21/2016, 11:49 PM, Josh Poimboeuf wrote:
> This is v16 of the compile-time stack metadata validation patch set,
> along with proposed fixes for most of the warnings it found.  It's based
> on the tip/master branch.

Hi,

with this config:
https://github.com/openSUSE/kernel-source/blob/master/config/x86_64/vanilla

I am seeing a lot of functions in C which do not have frame pointer setup/cleanup:
stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup
stacktool: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.o: cfs_cdebug_show.part.5.constprop.35()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.o: cfs_cdebug_show.part.5.constprop.35()+0x8: duplicate frame pointer save
stacktool: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.o: cfs_cdebug_show.part.5.constprop.35()+0x9: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lnet/klnds/socklnd/socklnd.o: ksocknal_connsock_decref()+0x0: duplicate frame pointer save
stacktool: drivers/staging/lustre/lnet/klnds/socklnd/socklnd.o: ksocknal_connsock_decref()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lnet/klnds/socklnd/socklnd.o: ksocknal_connsock_decref()+0x1: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lnet/klnds/socklnd/socklnd.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lnet/lnet/lib-move.o: cfs_cdebug_show.part.1.constprop.16()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lnet/lnet/lib-move.o: cfs_cdebug_show.part.1.constprop.16()+0x8: duplicate frame pointer save
stacktool: drivers/staging/lustre/lnet/lnet/lib-move.o: cfs_cdebug_show.part.1.constprop.16()+0x9: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lnet/lnet/lib-move.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lnet/lnet/lo.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lnet/lnet/nidstrings.o: cfs_print_nidlist()+0x220: frame pointer state mismatch
stacktool: drivers/staging/lustre/lnet/lnet/peer.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lnet/lnet/router.o: cfs_cdebug_show.part.0.constprop.16()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lnet/lnet/router.o: cfs_cdebug_show.part.0.constprop.16()+0x8: duplicate frame pointer save
stacktool: drivers/staging/lustre/lnet/lnet/router.o: cfs_cdebug_show.part.0.constprop.16()+0x9: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lnet/lnet/router.o: lnet_find_net_locked()+0x8a: frame pointer state mismatch
stacktool: drivers/staging/lustre/lnet/lnet/router.o: lnet_find_net_locked()+0x8a: return without frame pointer restore
stacktool: drivers/staging/lustre/lustre/fid/fid_request.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/fld/lproc_fld.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/libcfs/libcfs_lock.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/libcfs/libcfs_mem.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/llite/dir.o: obd_unpackmd()+0x0: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/llite/dir.o: obd_unpackmd()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/llite/dir.o: obd_unpackmd()+0x4: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/llite/file.o: md_intent_lock.part.28()+0x0: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/llite/file.o: md_intent_lock.part.28()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/llite/file.o: md_intent_lock.part.28()+0x24: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/llite/../lclient/glimpse.o: cl_io_get()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/llite/../lclient/glimpse.o: cl_io_get()+0x1a: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/llite/../lclient/glimpse.o: cl_io_get()+0x1b: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/llite/../lclient/glimpse.o: cl_io_get()+0x19: return without frame pointer restore
stacktool: drivers/staging/lustre/lustre/llite/../lclient/lcommon_misc.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/llite/llite_mmap.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/llite/lproc_llite.o: checksum_pages_store()+0x19e: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/llite/namei.o: ll_test_inode()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/llite/namei.o: ll_test_inode()+0x5: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/llite/namei.o: ll_test_inode()+0x9: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/llite/rw.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/llite/statahead.o: md_revalidate_lock.part.26()+0x0: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/llite/statahead.o: md_revalidate_lock.part.26()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/llite/statahead.o: md_revalidate_lock.part.26()+0x24: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/llite/statahead.o: sa_args_fini()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/llite/statahead.o: sa_args_fini()+0x5: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/llite/statahead.o: sa_args_fini()+0x9: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/llite/statahead.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/llite/vvp_page.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/llite/xattr_cache.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/llite/xattr.o: get_xattr_type()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/llite/xattr.o: get_xattr_type()+0x1f: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/llite/xattr.o: get_xattr_type()+0x5: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/lmv/lmv_intent.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/lmv/lmv_obd.o: __lmv_fid_alloc()+0x185: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/lov/lov_io.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/lov/lovsub_dev.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/mdc/mdc_lib.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/mdc/mdc_locks.o: .text.unlikely: unexpected end of section
stacktool: drivers/staging/lustre/lustre/obdclass/debug.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/obdclass/genops.o: class_name2dev()+0xc7: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/obdclass/lustre_handles.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/obdclass/obd_config.o: lustre_cfg_string()+0x0: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/obdclass/obd_config.o: lustre_cfg_string()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/obdclass/obd_config.o: lustre_cfg_string()+0x4: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: __client_obd_list_lock()+0x0: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: __client_obd_list_lock()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: __client_obd_list_lock()+0x1: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: osc_extent_search()+0x78: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: osc_extent_search()+0x78: return without frame pointer restore
stacktool: drivers/staging/lustre/lustre/osc/osc_dev.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/osc/osc_page.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/ptlrpc/connection.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/ptlrpc/import.o: deuuidify.constprop.8()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/ptlrpc/import.o: deuuidify.constprop.8()+0x5: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/ptlrpc/import.o: deuuidify.constprop.8()+0x6: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/ptlrpc/llog_net.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/ptlrpc/../../lustre/ldlm/ldlm_extent.o: ldlm_extent_shift_kms()+0x93: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/ptlrpc/../../lustre/ldlm/ldlm_lock.o: ldlm_work_bl_ast_lock()+0x156: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/ptlrpc/../../lustre/ldlm/ldlm_lock.o: ldlm_work_cp_ast_lock()+0xda: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/ptlrpc/nrs.o: nrs_policy_register()+0x0: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/ptlrpc/nrs.o: nrs_policy_register()+0x5: duplicate frame pointer save
stacktool: drivers/staging/lustre/lustre/ptlrpc/nrs.o: nrs_policy_register()+0x6: duplicate frame pointer setup
stacktool: drivers/staging/lustre/lustre/ptlrpc/nrs.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/ptlrpc/pack_generic.o: lustre_swab_mgs_nidtbl_entry()+0x89: frame pointer state mismatch
stacktool: drivers/staging/lustre/lustre/ptlrpc/pack_generic.o: lustre_swab_mgs_nidtbl_entry()+0x89: return without frame pointer restore
stacktool: drivers/staging/lustre/lustre/ptlrpc/sec_bulk.o: .text: unexpected end of section
stacktool: drivers/staging/lustre/lustre/ptlrpc/sec_config.o: .text: unexpected end of section
stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x70: call without frame pointer save/setup
stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x92: call without frame pointer save/setup
stacktool: fs/mbcache.o: mb_cache_entry_free()+0xff: call without frame pointer save/setup
stacktool: fs/mbcache.o: mb_cache_entry_free()+0xf5: call without frame pointer save/setup
stacktool: fs/mbcache.o: mb_cache_entry_free()+0x11a: call without frame pointer save/setup
stacktool: fs/mbcache.o: mb_cache_entry_get()+0x225: call without frame pointer save/setup
stacktool: kernel/locking/percpu-rwsem.o: percpu_up_read()+0x27: call without frame pointer save/setup
stacktool: kernel/profile.o: do_profile_hits.isra.5()+0x139: call without frame pointer save/setup
stacktool: lib/nmi_backtrace.o: nmi_trigger_all_cpu_backtrace()+0x2b6: call without frame pointer save/setup
stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_recv()+0x58: call without frame pointer save/setup
stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_send()+0x58: call without frame pointer save/setup
stacktool: net/rds/ib_recv.o: rds_ib_attempt_ack()+0xc1: call without frame pointer save/setup
stacktool: net/rds/iw_recv.o: rds_iw_attempt_ack()+0xc1: call without frame pointer save/setup
stacktool: net/rds/iw_recv.o: rds_iw_recv_cq_comp_handler()+0x55: call without frame pointer save/setup



For example, hpsa_scsi_do_simple_cmd.constprop.106 (the call at +0x79, address 33d9):
0000000000003360 <hpsa_scsi_do_simple_cmd.constprop.106>:
    3360:       e8 00 00 00 00          callq  3365 <hpsa_scsi_do_simple_cmd.constprop.106+0x5>
                        3361: R_X86_64_PC32     __fentry__-0x4
    3365:       65 ff 05 00 00 00 00    incl   %gs:0x0(%rip)        # 336c <hpsa_scsi_do_simple_cmd.constprop.106+0xc>
                        3368: R_X86_64_PC32     __preempt_count-0x4
    336c:       65 8b 0d 00 00 00 00    mov    %gs:0x0(%rip),%ecx        # 3373 <hpsa_scsi_do_simple_cmd.constprop.106+0x13>
                        336f: R_X86_64_PC32     cpu_number-0x4
    3373:       48 63 c9                movslq %ecx,%rcx
    3376:       48 8b 87 b8 4b 00 00    mov    0x4bb8(%rdi),%rax
    337d:       48 8b 0c cd 00 00 00    mov    0x0(,%rcx,8),%rcx
    3384:       00 
                        3381: R_X86_64_32S      __per_cpu_offset
    3385:       8b 04 01                mov    (%rcx,%rax,1),%eax
    3388:       65 ff 0d 00 00 00 00    decl   %gs:0x0(%rip)        # 338f <hpsa_scsi_do_simple_cmd.constprop.106+0x2f>
                        338b: R_X86_64_PC32     __preempt_count-0x4
    338f:       74 48                   je     33d9 <hpsa_scsi_do_simple_cmd.constprop.106+0x79>
    3391:       85 c0                   test   %eax,%eax
    3393:       75 4d                   jne    33e2 <hpsa_scsi_do_simple_cmd.constprop.106+0x82>
    3395:       55                      push   %rbp
    3396:       48 89 e5                mov    %rsp,%rbp
    3399:       53                      push   %rbx
    339a:       48 8d 5d d8             lea    -0x28(%rbp),%rbx
    339e:       48 83 ec 20             sub    $0x20,%rsp
    33a2:       c7 45 d8 00 00 00 00    movl   $0x0,-0x28(%rbp)
    33a9:       c7 45 e0 00 00 00 00    movl   $0x0,-0x20(%rbp)
    33b0:       48 8d 43 10             lea    0x10(%rbx),%rax
    33b4:       48 89 9e 54 02 00 00    mov    %rbx,0x254(%rsi)
    33bb:       48 89 45 e8             mov    %rax,-0x18(%rbp)
    33bf:       48 89 45 f0             mov    %rax,-0x10(%rbp)
    33c3:       e8 f8 ce ff ff          callq  2c0 <__enqueue_cmd_and_start_io>
    33c8:       48 89 df                mov    %rbx,%rdi
    33cb:       e8 00 00 00 00          callq  33d0 <hpsa_scsi_do_simple_cmd.constprop.106+0x70>
                        33cc: R_X86_64_PC32     wait_for_completion_io-0x4
    33d0:       48 83 c4 20             add    $0x20,%rsp
    33d4:       31 c0                   xor    %eax,%eax
    33d6:       5b                      pop    %rbx
    33d7:       5d                      pop    %rbp
    33d8:       c3                      retq   
    33d9:       e8 00 00 00 00          callq  33de <hpsa_scsi_do_simple_cmd.constprop.106+0x7e>
                        33da: R_X86_64_PC32     ___preempt_schedule-0x4
    33de:       85 c0                   test   %eax,%eax
    33e0:       74 b3                   je     3395 <hpsa_scsi_do_simple_cmd.constprop.106+0x35>
    33e2:       48 8b 86 38 02 00 00    mov    0x238(%rsi),%rax
    33e9:       ba ff ff ff ff          mov    $0xffffffff,%edx
    33ee:       66 89 50 02             mov    %dx,0x2(%rax)
    33f2:       31 c0                   xor    %eax,%eax
    33f4:       c3                      retq   
    33f5:       90                      nop
    33f6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    33fd:       00 00 00 

Is there some compilation flag missing? The -f flags used when compiling that file are:
-falign-jumps=1
-falign-loops=1
-fconserve-stack
-fno-asynchronous-unwind-tables
-fno-common
-fno-delete-null-pointer-checks
-fno-inline-functions-called-once
-fno-omit-frame-pointer
-fno-optimize-sibling-calls
-fno-strict-aliasing
-fno-strict-overflow
-fno-var-tracking-assignments
-fstack-protector
-funit-at-a-time

thanks,
-- 
js
suse labs


* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-12 10:36 ` [PATCH 00/33] Compile-time stack metadata validation Jiri Slaby
@ 2016-02-12 10:41   ` Jiri Slaby
  2016-02-12 14:45   ` Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: Jiri Slaby @ 2016-02-12 10:41 UTC (permalink / raw)
  To: Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Arnaldo Carvalho de Melo, David Vrabel,
	Borislav Petkov, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	Jeremy Fitzhardinge, Chris Wright, Alok Kataria, Rusty Russell,
	Herbert Xu, David S. Miller, Pavel Machek, Rafael J. Wysocki,
	Len Brown, Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On 02/12/2016, 11:36 AM, Jiri Slaby wrote:
> Is there some compilation flag missing? The -f flags used when compiling that file are:
> -falign-jumps=1
> -falign-loops=1
> -fconserve-stack
> -fno-asynchronous-unwind-tables
> -fno-common
> -fno-delete-null-pointer-checks
> -fno-inline-functions-called-once
> -fno-omit-frame-pointer
> -fno-optimize-sibling-calls
> -fno-strict-aliasing
> -fno-strict-overflow
> -fno-var-tracking-assignments
> -fstack-protector
> -funit-at-a-time

Happens with:
gcc (SUSE Linux) 5.3.1 20151207 [gcc-5-branch revision 231355]
gcc-6 (SUSE Linux) 6.0.0 20160202 (experimental) [trunk revision 233076]

> thanks,
-- 
js
suse labs


* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-12 10:36 ` [PATCH 00/33] Compile-time stack metadata validation Jiri Slaby
  2016-02-12 10:41   ` Jiri Slaby
@ 2016-02-12 14:45   ` Josh Poimboeuf
  2016-02-12 17:10     ` Peter Zijlstra
  1 sibling, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-12 14:45 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Feb 12, 2016 at 11:36:24AM +0100, Jiri Slaby wrote:
> On 01/21/2016, 11:49 PM, Josh Poimboeuf wrote:
> > This is v16 of the compile-time stack metadata validation patch set,
> > along with proposed fixes for most of the warnings it found.  It's based
> > on the tip/master branch.
> 
> Hi,
> 
> with this config:
> https://github.com/openSUSE/kernel-source/blob/master/config/x86_64/vanilla
> 
> I am seeing a lot of functions in C which do not have frame pointer setup/cleanup:

Hi Jiri,

Thanks for testing.

> stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup

This seems like a real frame pointer bug caused by the following line in
arch/x86/include/asm/preempt.h:

  # define __preempt_schedule() asm ("call ___preempt_schedule")

The asm statement doesn't have the stack pointer as an output operand,
so gcc doesn't know the asm contains a call and may skip the frame
pointer setup before it.

However, I suspect the "bug" is intentional for optimization purposes.

> stacktool: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.o: cfs_cdebug_show.part.5.constprop.35()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.o: cfs_cdebug_show.part.5.constprop.35()+0x8: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.o: cfs_cdebug_show.part.5.constprop.35()+0x9: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lnet/klnds/socklnd/socklnd.o: ksocknal_connsock_decref()+0x0: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lnet/klnds/socklnd/socklnd.o: ksocknal_connsock_decref()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lnet/klnds/socklnd/socklnd.o: ksocknal_connsock_decref()+0x1: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lnet/klnds/socklnd/socklnd.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lnet/lnet/lib-move.o: cfs_cdebug_show.part.1.constprop.16()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lnet/lnet/lib-move.o: cfs_cdebug_show.part.1.constprop.16()+0x8: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lnet/lnet/lib-move.o: cfs_cdebug_show.part.1.constprop.16()+0x9: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lnet/lnet/lib-move.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lnet/lnet/lo.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lnet/lnet/nidstrings.o: cfs_print_nidlist()+0x220: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lnet/lnet/peer.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lnet/lnet/router.o: cfs_cdebug_show.part.0.constprop.16()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lnet/lnet/router.o: cfs_cdebug_show.part.0.constprop.16()+0x8: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lnet/lnet/router.o: cfs_cdebug_show.part.0.constprop.16()+0x9: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lnet/lnet/router.o: lnet_find_net_locked()+0x8a: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lnet/lnet/router.o: lnet_find_net_locked()+0x8a: return without frame pointer restore
> stacktool: drivers/staging/lustre/lustre/fid/fid_request.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/fld/lproc_fld.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/libcfs/libcfs_lock.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/libcfs/libcfs_mem.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/llite/dir.o: obd_unpackmd()+0x0: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/llite/dir.o: obd_unpackmd()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/llite/dir.o: obd_unpackmd()+0x4: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/llite/file.o: md_intent_lock.part.28()+0x0: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/llite/file.o: md_intent_lock.part.28()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/llite/file.o: md_intent_lock.part.28()+0x24: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/llite/../lclient/glimpse.o: cl_io_get()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/llite/../lclient/glimpse.o: cl_io_get()+0x1a: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/llite/../lclient/glimpse.o: cl_io_get()+0x1b: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/llite/../lclient/glimpse.o: cl_io_get()+0x19: return without frame pointer restore
> stacktool: drivers/staging/lustre/lustre/llite/../lclient/lcommon_misc.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/llite/llite_mmap.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/llite/lproc_llite.o: checksum_pages_store()+0x19e: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/llite/namei.o: ll_test_inode()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/llite/namei.o: ll_test_inode()+0x5: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/llite/namei.o: ll_test_inode()+0x9: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/llite/rw.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/llite/statahead.o: md_revalidate_lock.part.26()+0x0: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/llite/statahead.o: md_revalidate_lock.part.26()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/llite/statahead.o: md_revalidate_lock.part.26()+0x24: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/llite/statahead.o: sa_args_fini()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/llite/statahead.o: sa_args_fini()+0x5: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/llite/statahead.o: sa_args_fini()+0x9: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/llite/statahead.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/llite/vvp_page.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/llite/xattr_cache.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/llite/xattr.o: get_xattr_type()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/llite/xattr.o: get_xattr_type()+0x1f: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/llite/xattr.o: get_xattr_type()+0x5: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/lmv/lmv_intent.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/lmv/lmv_obd.o: __lmv_fid_alloc()+0x185: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/lov/lov_io.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/lov/lovsub_dev.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/mdc/mdc_lib.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/mdc/mdc_locks.o: .text.unlikely: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/obdclass/debug.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/obdclass/genops.o: class_name2dev()+0xc7: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/obdclass/lustre_handles.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/obdclass/obd_config.o: lustre_cfg_string()+0x0: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/obdclass/obd_config.o: lustre_cfg_string()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/obdclass/obd_config.o: lustre_cfg_string()+0x4: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: __client_obd_list_lock()+0x0: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: __client_obd_list_lock()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: __client_obd_list_lock()+0x1: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: osc_extent_search()+0x78: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/osc/osc_cache.o: osc_extent_search()+0x78: return without frame pointer restore
> stacktool: drivers/staging/lustre/lustre/osc/osc_dev.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/osc/osc_page.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/ptlrpc/connection.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/ptlrpc/import.o: deuuidify.constprop.8()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/ptlrpc/import.o: deuuidify.constprop.8()+0x5: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/ptlrpc/import.o: deuuidify.constprop.8()+0x6: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/ptlrpc/llog_net.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/ptlrpc/../../lustre/ldlm/ldlm_extent.o: ldlm_extent_shift_kms()+0x93: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/ptlrpc/../../lustre/ldlm/ldlm_lock.o: ldlm_work_bl_ast_lock()+0x156: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/ptlrpc/../../lustre/ldlm/ldlm_lock.o: ldlm_work_cp_ast_lock()+0xda: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/ptlrpc/nrs.o: nrs_policy_register()+0x0: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/ptlrpc/nrs.o: nrs_policy_register()+0x5: duplicate frame pointer save
> stacktool: drivers/staging/lustre/lustre/ptlrpc/nrs.o: nrs_policy_register()+0x6: duplicate frame pointer setup
> stacktool: drivers/staging/lustre/lustre/ptlrpc/nrs.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/ptlrpc/pack_generic.o: lustre_swab_mgs_nidtbl_entry()+0x89: frame pointer state mismatch
> stacktool: drivers/staging/lustre/lustre/ptlrpc/pack_generic.o: lustre_swab_mgs_nidtbl_entry()+0x89: return without frame pointer restore
> stacktool: drivers/staging/lustre/lustre/ptlrpc/sec_bulk.o: .text: unexpected end of section
> stacktool: drivers/staging/lustre/lustre/ptlrpc/sec_config.o: .text: unexpected end of section

These staging driver issues are caused by stacktool getting confused by
gcc optimizations related to noreturn functions.  I have it on the TODO
list to make the noreturn function detection more intelligent.
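
For readers unfamiliar with the noreturn case, here is a minimal sketch
of the kind of code that can produce it.  The names fatal() and
checked_div() are made up for illustration and are not from the lustre
sources; the exact code gcc emits depends on version and flags, so treat
this as an assumption about the general pattern rather than a
reproduction of the warnings above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the kernel tree): it never returns. */
static __attribute__((noreturn)) void fatal(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(1);
}

/*
 * Because fatal() is declared noreturn, the compiler is free to emit
 * nothing after the call on the error path: no frame pointer restore
 * and no ret.  If that call ends up as the last instruction of a
 * function or section, a tool that assumes every call returns would
 * see control flow run off the end.
 */
int checked_div(int a, int b)
{
	if (b == 0)
		fatal("division by zero");
	return a / b;
}
```

A validator would need to know that fatal() does not return in order to
stop following the code path at that call site.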

> stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x70: call without frame pointer save/setup
> stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x92: call without frame pointer save/setup
> stacktool: fs/mbcache.o: mb_cache_entry_free()+0xff: call without frame pointer save/setup
> stacktool: fs/mbcache.o: mb_cache_entry_free()+0xf5: call without frame pointer save/setup
> stacktool: fs/mbcache.o: mb_cache_entry_free()+0x11a: call without frame pointer save/setup
> stacktool: fs/mbcache.o: mb_cache_entry_get()+0x225: call without frame pointer save/setup
> stacktool: kernel/locking/percpu-rwsem.o: percpu_up_read()+0x27: call without frame pointer save/setup
> stacktool: kernel/profile.o: do_profile_hits.isra.5()+0x139: call without frame pointer save/setup
> stacktool: lib/nmi_backtrace.o: nmi_trigger_all_cpu_backtrace()+0x2b6: call without frame pointer save/setup
> stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_recv()+0x58: call without frame pointer save/setup
> stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_send()+0x58: call without frame pointer save/setup
> stacktool: net/rds/ib_recv.o: rds_ib_attempt_ack()+0xc1: call without frame pointer save/setup
> stacktool: net/rds/iw_recv.o: rds_iw_attempt_ack()+0xc1: call without frame pointer save/setup
> stacktool: net/rds/iw_recv.o: rds_iw_recv_cq_comp_handler()+0x55: call without frame pointer save/setup

These are all the same "call ___preempt_schedule" issue from above.
I'll need to look into it to figure out if it's a real bug or if it's a
"feature" we should ignore.

-- 
Josh


* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-12 14:45   ` Josh Poimboeuf
@ 2016-02-12 17:10     ` Peter Zijlstra
  2016-02-12 18:32       ` Josh Poimboeuf
  0 siblings, 1 reply; 133+ messages in thread
From: Peter Zijlstra @ 2016-02-12 17:10 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Slaby, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	linux-kernel, live-patching, Michal Marek, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Feb 12, 2016 at 08:45:43AM -0600, Josh Poimboeuf wrote:
> On Fri, Feb 12, 2016 at 11:36:24AM +0100, Jiri Slaby wrote:
> 
> This seems like a real frame pointer bug caused by the following line in
> arch/x86/include/asm/preempt.h:
> 
>   # define __preempt_schedule() asm ("call ___preempt_schedule")

The purpose there is that:

	preempt_enable();

turns into:

	decl	__percpu_prefix:__preempt_count
	jnz	1f
	call	___preempt_schedule
1:

See arch/x86/include/asm/preempt.h:__preempt_count_dec_and_test()
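
The decrement-and-test pattern above can be modelled in plain C roughly
as follows.  This is only a userspace sketch: every name ending in
_model is hypothetical, the counter is an ordinary int rather than a
per-CPU variable, and it ignores the need-resched bit that the real x86
preempt count folds into the same word.

```c
/* Simplified userspace model; none of this is the real kernel code. */
#define unlikely(x) __builtin_expect(!!(x), 0)

static int preempt_count;	/* stands in for the per-CPU __preempt_count */
static int resched_calls;	/* counts how often the slow path was taken */

/* Stand-in for the ___preempt_schedule thunk. */
static void preempt_schedule_model(void)
{
	resched_calls++;
}

static inline void preempt_disable_model(void)
{
	preempt_count++;
}

/*
 * Mirrors "decl; jnz 1f; call; 1:": decrement the count and enter the
 * slow path only when it reaches zero.  unlikely() hints that the call
 * is the cold branch.
 */
static inline void preempt_enable_model(void)
{
	if (unlikely(--preempt_count == 0))
		preempt_schedule_model();
}
```

With two nested disables, only the second enable brings the count to
zero and takes the slow path.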


* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-12 17:10     ` Peter Zijlstra
@ 2016-02-12 18:32       ` Josh Poimboeuf
  2016-02-12 18:34         ` Josh Poimboeuf
  2016-02-12 20:10         ` Peter Zijlstra
  0 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-12 18:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Slaby, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	linux-kernel, live-patching, Michal Marek, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Feb 12, 2016 at 06:10:37PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 12, 2016 at 08:45:43AM -0600, Josh Poimboeuf wrote:
> > On Fri, Feb 12, 2016 at 11:36:24AM +0100, Jiri Slaby wrote:
> > 
> > This seems like a real frame pointer bug caused by the following line in
> > arch/x86/include/asm/preempt.h:
> > 
> >   # define __preempt_schedule() asm ("call ___preempt_schedule")
> 
> The purpose there is that:
> 
> 	preempt_enable();
> 
> turns into:
> 
> 	decl	__percpu_prefix:__preempt_count
> 	jnz	1f
> 	call	___preempt_schedule
> 1:
> 
> See arch/x86/include/asm/preempt.h:__preempt_count_dec_and_test()

Sorry, I'm kind of confused.  Do you mean that's what preempt_enable()
would turn into *without* the above define?

What I actually see in the listing is:

	decl	__percpu_prefix:__preempt_count
	je	1f
	....
1:
	call	___preempt_schedule

So it puts the "call ___preempt_schedule" in the slow path.

I also don't see how that would be related to the use of the asm
statement in the __preempt_schedule() macro.  Doesn't the use of
unlikely() in preempt_enable() put the call in the slow path?

  #define preempt_enable() \
  do { \
	  barrier(); \
	  if (unlikely(preempt_count_dec_and_test())) \
		  preempt_schedule(); \
  } while (0)
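
For reference, the count-and-branch pattern in that macro can be modeled in
user space. This is only a sketch, not the kernel code: every model_* name is
hypothetical, the counter is a plain global rather than a per-CPU variable,
and the inverted "need resched" bit is folded into the count so that a single
decrement-and-test covers both "no longer nested" and "reschedule pending".

```c
#include <stdint.h>

#define unlikely(x)  __builtin_expect(!!(x), 0)
#define barrier()    __asm__ __volatile__("" ::: "memory")

/* Inverted "need resched" bit: set means no reschedule is pending. */
#define NEED_RESCHED_INV 0x80000000u

static uint32_t model_preempt_count = NEED_RESCHED_INV;
static unsigned int model_slow_path_calls;

/* Hypothetical stand-in for preempt_schedule(): count the call, clear the request. */
static void model_preempt_schedule(void)
{
	model_slow_path_calls++;
	model_preempt_count |= NEED_RESCHED_INV;
}

/* Request a reschedule by clearing the inverted bit. */
static void model_set_need_resched(void)
{
	model_preempt_count &= ~NEED_RESCHED_INV;
}

/* True only when the nesting count is zero AND a reschedule is pending. */
static int model_dec_and_test(void)
{
	return --model_preempt_count == 0;
}

#define model_preempt_disable() \
do { \
	model_preempt_count++; \
	barrier(); \
} while (0)

#define model_preempt_enable() \
do { \
	barrier(); \
	if (unlikely(model_dec_and_test())) \
		model_preempt_schedule(); \
} while (0)

/* Short scenario; returns how many times the slow path ran. */
static unsigned int model_run(void)
{
	model_preempt_disable();
	model_preempt_enable();		/* nothing pending: fast path only */

	model_preempt_disable();
	model_preempt_disable();	/* nested section */
	model_set_need_resched();
	model_preempt_enable();		/* still nested, count is 1 */
	model_preempt_enable();		/* count reaches 0: slow path */

	return model_slow_path_calls;
}
```

In this model only the last model_preempt_enable() takes the slow path, so
model_run() reports a single call. The unlikely() hint only influences block
placement; whether the compiler honors it is up to the compiler.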

Also, why is the thunk needed?  Any reason why preempt_enable() can't be
called directly from C?

-- 
Josh

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-12 18:32       ` Josh Poimboeuf
@ 2016-02-12 18:34         ` Josh Poimboeuf
  2016-02-12 20:10         ` Peter Zijlstra
  1 sibling, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-12 18:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Slaby, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	linux-kernel, live-patching, Michal Marek, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Feb 12, 2016 at 12:32:06PM -0600, Josh Poimboeuf wrote:
> On Fri, Feb 12, 2016 at 06:10:37PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 12, 2016 at 08:45:43AM -0600, Josh Poimboeuf wrote:
> > > On Fri, Feb 12, 2016 at 11:36:24AM +0100, Jiri Slaby wrote:
> > > 
> > > This seems like a real frame pointer bug caused by the following line in
> > > arch/x86/include/asm/preempt.h:
> > > 
> > >   # define __preempt_schedule() asm ("call ___preempt_schedule")
> > 
> > The purpose there is that:
> > 
> > 	preempt_enable();
> > 
> > turns into:
> > 
> > 	decl	__percpu_prefix:__preempt_count
> > 	jnz	1f:
> > 	call	___preempt_schedule
> > 1:
> > 
> > See arch/x86/include/asm/preempt.h:__preempt_count_dec_and_test()
> 
> Sorry, I'm kind of confused.  Do you mean that's what preempt_enable()
> would turn into *without* the above define?
> 
> What I actually see in the listing is:
> 
>  	decl	__percpu_prefix:__preempt_count
>  	je	1f:
> 	....
>  1:
>  	call	___preempt_schedule
> 
> So it puts the "call ___preempt_schedule" in the slow path.
> 
> I also don't see how that would be related to the use of the asm
> statement in the __preempt_schedule() macro.  Doesn't the use of
> unlikely() in preempt_enable() put the call in the slow path?
> 
>   #define preempt_enable() \
>   do { \
> 	  barrier(); \
> 	  if (unlikely(preempt_count_dec_and_test())) \
> 		  preempt_schedule(); \
>   } while (0)
> 
> Also, why is the thunk needed?  Any reason why preempt_enable() can't be
> called directly from C?

Sorry, s/preempt_enable/preempt_schedule/ on that last sentence.

-- 
Josh

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-12 18:32       ` Josh Poimboeuf
  2016-02-12 18:34         ` Josh Poimboeuf
@ 2016-02-12 20:10         ` Peter Zijlstra
  2016-02-15 16:31           ` Josh Poimboeuf
  1 sibling, 1 reply; 133+ messages in thread
From: Peter Zijlstra @ 2016-02-12 20:10 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Slaby, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	linux-kernel, live-patching, Michal Marek, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Feb 12, 2016 at 12:32:06PM -0600, Josh Poimboeuf wrote:
> What I actually see in the listing is:
> 
>  	decl	__percpu_prefix:__preempt_count
>  	je	1f:
> 	....
>  1:
>  	call	___preempt_schedule
> 
> So it puts the "call ___preempt_schedule" in the slow path.

Ah yes indeed. Same difference though.

> I also don't see how that would be related to the use of the asm
> statement in the __preempt_schedule() macro.  Doesn't the use of
> unlikely() in preempt_enable() put the call in the slow path?

Sadly no, unlikely() and asm_goto don't work well together. But the slow
path or not isn't the reason we do the asm call thing.

>   #define preempt_enable() \
>   do { \
> 	  barrier(); \
> 	  if (unlikely(preempt_count_dec_and_test())) \
> 		  preempt_schedule(); \
>   } while (0)
> 
> Also, why is the thunk needed?  Any reason why preempt_enable() can't be
> called directly from C?

That would make the call-site save registers and increase the size of
every preempt_enable(). By using the thunk we can do callee saved
registers and avoid blowing up the call site.

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-12 20:10         ` Peter Zijlstra
@ 2016-02-15 16:31           ` Josh Poimboeuf
  2016-02-15 16:49             ` Peter Zijlstra
       [not found]             ` <CA+55aFzoPCd_LcSx1FUuEhSBYk2KrfzXGj-Vcn39W5bz=KuZhA@mail.gmail.com>
  0 siblings, 2 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-15 16:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Slaby, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	linux-kernel, live-patching, Michal Marek, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Fri, Feb 12, 2016 at 09:10:11PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 12, 2016 at 12:32:06PM -0600, Josh Poimboeuf wrote:
> > What I actually see in the listing is:
> > 
> >  	decl	__percpu_prefix:__preempt_count
> >  	je	1f:
> > 	....
> >  1:
> >  	call	___preempt_schedule
> > 
> > So it puts the "call ___preempt_schedule" in the slow path.
> 
> Ah yes indeed. Same difference though.
> 
> > I also don't see how that would be related to the use of the asm
> > statement in the __preempt_schedule() macro.  Doesn't the use of
> > unlikely() in preempt_enable() put the call in the slow path?
> 
> Sadly no, unlikely() and asm_goto don't work well together. But the slow
> path or not isn't the reason we do the asm call thing.
> 
> >   #define preempt_enable() \
> >   do { \
> > 	  barrier(); \
> > 	  if (unlikely(preempt_count_dec_and_test())) \
> > 		  preempt_schedule(); \
> >   } while (0)
> > 
> > Also, why is the thunk needed?  Any reason why preempt_enable() can't be
> > called directly from C?
> 
> That would make the call-site save registers and increase the size of
> every preempt_enable(). By using the thunk we can do callee saved
> registers and avoid blowing up the call site.

So is the goal to optimize for size?  If I replace the calls to
__preempt_schedule[_notrace]() with real C calls and remove the thunks,
it only adds about 2k to vmlinux.

There are two ways to fix the warnings:

1. get rid of the thunks and call the C functions directly; or

2. add the stack pointer to the asm() statement output operand list to
ensure a stack frame gets created in the caller function before the
call.  (Note this still allows the thunks to do callee saved registers.)

I like #1 better, but maybe I'm still missing the point of the thunks.

-- 
Josh

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-15 16:31           ` Josh Poimboeuf
@ 2016-02-15 16:49             ` Peter Zijlstra
       [not found]             ` <CA+55aFzoPCd_LcSx1FUuEhSBYk2KrfzXGj-Vcn39W5bz=KuZhA@mail.gmail.com>
  1 sibling, 0 replies; 133+ messages in thread
From: Peter Zijlstra @ 2016-02-15 16:49 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Slaby, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	linux-kernel, live-patching, Michal Marek, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Arnaldo Carvalho de Melo, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Mon, Feb 15, 2016 at 10:31:34AM -0600, Josh Poimboeuf wrote:
> On Fri, Feb 12, 2016 at 09:10:11PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 12, 2016 at 12:32:06PM -0600, Josh Poimboeuf wrote:
> > > What I actually see in the listing is:
> > > 
> > >  	decl	__percpu_prefix:__preempt_count
> > >  	je	1f:
> > > 	....
> > >  1:
> > >  	call	___preempt_schedule
> > > 
> > > So it puts the "call ___preempt_schedule" in the slow path.
> > 
> > Ah yes indeed. Same difference though.
> > 
> > > I also don't see how that would be related to the use of the asm
> > > statement in the __preempt_schedule() macro.  Doesn't the use of
> > > unlikely() in preempt_enable() put the call in the slow path?
> > 
> > Sadly no, unlikely() and asm_goto don't work well together. But the slow
> > path or not isn't the reason we do the asm call thing.
> > 
> > >   #define preempt_enable() \
> > >   do { \
> > > 	  barrier(); \
> > > 	  if (unlikely(preempt_count_dec_and_test())) \
> > > 		  preempt_schedule(); \
> > >   } while (0)
> > > 
> > > Also, why is the thunk needed?  Any reason why preempt_enable() can't be
> > > called directly from C?
> > 
> > That would make the call-site save registers and increase the size of
> > every preempt_enable(). By using the thunk we can do callee saved
> > registers and avoid blowing up the call site.
> 
> So is the goal to optimize for size?  

General performance impact of preempt_enable().

> If I replace the calls to
> __preempt_schedule[_notrace]() with real C calls and remove the thunks,
> it only adds about 2k to vmlinux.

That's less than I had expected, but probably still worth it.

And is that added text purely in the slow path? We really want to avoid
putting any more register pressure on the preempt_enable() call sites.
The single memop and Jcc is about as fast as we can get, and we spent quite
a bit of effort getting there.

> There are two ways to fix the warnings:
> 
> 1. get rid of the thunks and call the C functions directly; or
> 
> 2. add the stack pointer to the asm() statement output operand list to
> ensure a stack frame gets created in the caller function before the
> call.  (Note this still allows the thunks to do callee saved registers.)
> 
> I like #1 better, but maybe I'm still missing the point of the thunks.

Ingo, Linus?

* Re: [PATCH 00/33] Compile-time stack metadata validation
       [not found]             ` <CA+55aFzoPCd_LcSx1FUuEhSBYk2KrfzXGj-Vcn39W5bz=KuZhA@mail.gmail.com>
@ 2016-02-15 20:01               ` Josh Poimboeuf
  2016-02-18 17:41                 ` [PATCH] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace] Josh Poimboeuf
  2016-02-15 20:02               ` [PATCH 00/33] Compile-time stack metadata validation Andi Kleen
  1 sibling, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-15 20:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Alok Kataria, Ingo Molnar, Guenter Roeck,
	Anil S Keshavamurthy, Herbert Xu, Andrew Morton, Rusty Russell,
	Bernd Petrovitsch, linux-watchdog, Pedro Alves, Pavel Machek,
	Konrad Rzeszutek Wilk, Michal Marek, Namhyung Kim,
	Jeremy Fitzhardinge, Waiman Long, Rafael J. Wysocki, Jiri Slaby,
	kvm, x86, Arnaldo Carvalho de Melo, Paolo Bonzini,
	Masami Hiramatsu, Borislav Petkov, Chris Wright, Andy Lutomirski,
	Alexei Starovoitov, David S. Miller, Wim Van Sebroeck,
	David Vrabel, live-patching, netdev, Boris Ostrovsky,
	Gleb Natapov, Matt Fleming, Chris J Arges, linux-kernel,
	Borislav Petkov, Andi Kleen, Len Brown,
	Ananth N Mavinakayanahalli, H. Peter Anvin, Peter Zijlstra

On Mon, Feb 15, 2016 at 08:56:21AM -0800, Linus Torvalds wrote:
> On Feb 15, 2016 8:31 AM, "Josh Poimboeuf" <jpoimboe@redhat.com> wrote:
> >
> > So is the goal to optimize for size?  If I replace the calls to
> > __preempt_schedule[_notrace]() with real C calls and remove the thunks,
> > it only adds about 2k to vmlinux.
> 
> It adds nasty register pressure in some of the most critical parts of the
> kernel, and makes the asm code harder to read.
> 
> And yes, I still read the asm. For performance reasons, and when decoding
> oopses.
> 
> I realize that few other people care about generated code. That's sad.
> 
> > There are two ways to fix the warnings:
> >
> > 1. get rid of the thunks and call the C functions directly; or
> 
> > No. Not until gcc learns about per-function calling conventions (so that it
> can be marked as not clobbering registers).
> 
> > 2. add the stack pointer to the asm() statement output operand list to
> > ensure a stack frame gets created in the caller function before the
> > call.
> 
> > That probably doesn't make things much worse. It probably makes leaf
> functions have a stack frame if they do preempt disable, but it might still
> be acceptable. Hard to say before I end up hurting this case again.

Oddly, this change (see patch below) actually seems to make things
faster in a lot of cases.  For many smaller functions it causes the
stack frame creation to get moved out of the common path and into the
unlikely path.

For example, here's the original cyc2ns_read_end():

  ffffffff8101f8c0 <cyc2ns_read_end>:
  ffffffff8101f8c0:	55                   	push   %rbp
  ffffffff8101f8c1:	48 89 e5             	mov    %rsp,%rbp
  ffffffff8101f8c4:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
  ffffffff8101f8c8:	75 08                	jne    ffffffff8101f8d2 <cyc2ns_read_end+0x12>
  ffffffff8101f8ca:	65 48 89 3d e6 5a ff 	mov    %rdi,%gs:0x7eff5ae6(%rip)        # 153b8 <cyc2ns+0x38>
  ffffffff8101f8d1:	7e 
  ffffffff8101f8d2:	65 ff 0d 77 c4 fe 7e 	decl   %gs:0x7efec477(%rip)        # bd50 <__preempt_count>
  ffffffff8101f8d9:	74 02                	je     ffffffff8101f8dd <cyc2ns_read_end+0x1d>
  ffffffff8101f8db:	5d                   	pop    %rbp
  ffffffff8101f8dc:	c3                   	retq   
  ffffffff8101f8dd:	e8 1e 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
  ffffffff8101f8e2:	5d                   	pop    %rbp
  ffffffff8101f8e3:	c3                   	retq   
  ffffffff8101f8e4:	66 66 66 2e 0f 1f 84 	data16 data16 nopw %cs:0x0(%rax,%rax,1)
  ffffffff8101f8eb:	00 00 00 00 00 

And here's the same function with the patch:

  ffffffff8101f8c0 <cyc2ns_read_end>:
  ffffffff8101f8c0:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
  ffffffff8101f8c4:	75 08                	jne    ffffffff8101f8ce <cyc2ns_read_end+0xe>
  ffffffff8101f8c6:	65 48 89 3d ea 5a ff 	mov    %rdi,%gs:0x7eff5aea(%rip)        # 153b8 <cyc2ns+0x38>
  ffffffff8101f8cd:	7e 
  ffffffff8101f8ce:	65 ff 0d 7b c4 fe 7e 	decl   %gs:0x7efec47b(%rip)        # bd50 <__preempt_count>
  ffffffff8101f8d5:	74 01                	je     ffffffff8101f8d8 <cyc2ns_read_end+0x18>
  ffffffff8101f8d7:	c3                   	retq   
  ffffffff8101f8d8:	55                   	push   %rbp
  ffffffff8101f8d9:	48 89 e5             	mov    %rsp,%rbp
  ffffffff8101f8dc:	e8 1f 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
  ffffffff8101f8e1:	5d                   	pop    %rbp
  ffffffff8101f8e2:	c3                   	retq   
  ffffffff8101f8e3:	66 66 66 66 2e 0f 1f 	data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
  ffffffff8101f8ea:	84 00 00 00 00 00 

Notice that it moved the frame pointer setup code to the unlikely
___preempt_schedule() call path.  Going through a sampling of the
differences in the asm, that's the most common change I see.

Otherwise it has no real effect on callers which already have stack
frames (though it does change the ordering of some 'mov's).

And it has the intended effect of fixing the following warnings by
ensuring these call sites have stack frames:

  stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x70: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x92: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0xff: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0xf5: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0x11a: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_get()+0x225: call without frame pointer save/setup
  stacktool: kernel/locking/percpu-rwsem.o: percpu_up_read()+0x27: call without frame pointer save/setup
  stacktool: kernel/profile.o: do_profile_hits.isra.5()+0x139: call without frame pointer save/setup
  stacktool: lib/nmi_backtrace.o: nmi_trigger_all_cpu_backtrace()+0x2b6: call without frame pointer save/setup
  stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_recv()+0x58: call without frame pointer save/setup
  stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_send()+0x58: call without frame pointer save/setup
  stacktool: net/rds/ib_recv.o: rds_ib_attempt_ack()+0xc1: call without frame pointer save/setup
  stacktool: net/rds/iw_recv.o: rds_iw_attempt_ack()+0xc1: call without frame pointer save/setup
  stacktool: net/rds/iw_recv.o: rds_iw_recv_cq_comp_handler()+0x55: call without frame pointer save/setup

So that only adds a stack frame to 15 call sites out of ~5000 calls to
___preempt_schedule[_notrace].  All the others already had stack frames.

Any idea for a good benchmark to run with the patch?

> The alternative is to just teach the tools about the magic issue of a few
> things like this.

I think that would be problematic for our goal of making stack traces of
sleeping tasks reliable.  If preempt_enable()'s caller doesn't first
create a stack frame, the caller of the caller will get skipped in the
stack trace.
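
To make the skipped frame concrete, here is a toy model of a frame pointer
walk. It is a sketch under stated assumptions, not kernel code: the stack is
an array, a "return address into X" is represented by an id for X, and all
toy_* names are made up for illustration. main calls A, A calls B, and B
calls the thunk; B optionally skips its prologue.

```c
#include <stddef.h>
#include <stdint.h>

/* A "return address into X" is modeled as the id of X. */
enum { ID_MAIN = 1, ID_A, ID_B };

#define TOY_WORDS 32

struct toy_stack {
	uint64_t mem[TOY_WORDS];
	size_t sp;	/* index of the last pushed word; the stack grows down */
	size_t bp;	/* frame pointer; 0 marks the end of the chain */
};

static void toy_push(struct toy_stack *s, uint64_t v)
{
	s->mem[--s->sp] = v;
}

/* Models "call": push the return address (the id of the calling function). */
static void toy_call(struct toy_stack *s, uint64_t caller_id)
{
	toy_push(s, caller_id);
}

/* Models "push %rbp; mov %rsp,%rbp". */
static void toy_prologue(struct toy_stack *s)
{
	toy_push(s, s->bp);
	s->bp = s->sp;
}

/*
 * Build the call chain, then walk the frame pointer chain starting from the
 * thunk's frame.  Returns 1 if 'id' shows up among the return addresses.
 */
static int toy_trace_contains(int b_has_frame, uint64_t id)
{
	struct toy_stack s = { .sp = TOY_WORDS, .bp = 0 };
	size_t fp;

	toy_call(&s, ID_MAIN);		/* main calls A */
	toy_prologue(&s);
	toy_call(&s, ID_A);		/* A calls B */
	if (b_has_frame)
		toy_prologue(&s);
	toy_call(&s, ID_B);		/* B calls the thunk */
	toy_prologue(&s);

	/* The return address sits one word above each saved frame pointer. */
	for (fp = s.bp; fp != 0; fp = s.mem[fp]) {
		if (s.mem[fp + 1] == id)
			return 1;
	}
	return 0;
}
```

When B sets up a frame, the walk finds B, A and main. When it does not, the
thunk's saved frame pointer still refers to A's frame, so the walk goes from
the return address into B straight to the return address into main, and A
never appears.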

---8<---

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index 01bcde8..2989aa6 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -94,10 +94,19 @@ static __always_inline bool should_resched(int preempt_offset)
 
 #ifdef CONFIG_PREEMPT
   extern asmlinkage void ___preempt_schedule(void);
-# define __preempt_schedule() asm ("call ___preempt_schedule")
+# define __preempt_schedule() \
+({ \
+	register void *__sp asm(_ASM_SP); \
+	asm volatile ("call ___preempt_schedule" : "+r"(__sp)); \
+})
+
   extern asmlinkage void preempt_schedule(void);
   extern asmlinkage void ___preempt_schedule_notrace(void);
-# define __preempt_schedule_notrace() asm ("call ___preempt_schedule_notrace")
+# define __preempt_schedule_notrace() \
+({ \
+	register void *__sp asm(_ASM_SP); \
+	asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp)); \
+})
   extern asmlinkage void preempt_schedule_notrace(void);
 #endif
 

-- 
Josh

* Re: [PATCH 00/33] Compile-time stack metadata validation
       [not found]             ` <CA+55aFzoPCd_LcSx1FUuEhSBYk2KrfzXGj-Vcn39W5bz=KuZhA@mail.gmail.com>
  2016-02-15 20:01               ` Josh Poimboeuf
@ 2016-02-15 20:02               ` Andi Kleen
  1 sibling, 0 replies; 133+ messages in thread
From: Andi Kleen @ 2016-02-15 20:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Josh Poimboeuf, Thomas Gleixner, Alok Kataria, Ingo Molnar,
	Guenter Roeck, Anil S Keshavamurthy, Herbert Xu, Andrew Morton,
	Rusty Russell, Bernd Petrovitsch, linux-watchdog, Pedro Alves,
	Pavel Machek, Konrad Rzeszutek Wilk, Michal Marek, Namhyung Kim,
	Jeremy Fitzhardinge, Waiman Long, Rafael J. Wysocki, Jiri Slaby,
	kvm, x86, Arnaldo Carvalho de Melo, Paolo Bonzini,
	Masami Hiramatsu, Borislav Petkov, Chris Wright, Andy Lutomirski,
	Alexei Starovoitov, David S. Miller, Wim Van Sebroeck,
	David Vrabel, live-patching, netdev, Boris Ostrovsky,
	Gleb Natapov, Matt Fleming, Chris J Arges, linux-kernel,
	Borislav Petkov, Andi Kleen, Len Brown,
	Ananth N Mavinakayanahalli, H. Peter Anvin, Peter Zijlstra

> > There are two ways to fix the warnings:
> >
> > 1. get rid of the thunks and call the C functions directly; or
> 
> > No. Not until gcc learns about per-function calling conventions (so that it can
> be marked as not clobbering registers).

It does already for static functions in 5.x (with -fipa-ra).
And with LTO it can be used even somewhat globally.

Even older versions supported it, but only for x86->SSE on 32bit, which
is useless for the kernel. But the new IPA-RA propagates which registers
are clobbered.

That said it will probably be a long time until we can drop support for
older compilers.  So for now the manual method is still needed.

-Andi

* [PATCH] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]
  2016-02-15 20:01               ` Josh Poimboeuf
@ 2016-02-18 17:41                 ` Josh Poimboeuf
  2016-02-19 12:05                   ` Jiri Slaby
                                     ` (2 more replies)
  0 siblings, 3 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-18 17:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: Jiri Slaby, Peter Zijlstra, Linus Torvalds, linux-kernel, live-patching

If __preempt_schedule() or __preempt_schedule_notrace() is referenced at
the beginning of a function, gcc can insert the asm inline "call
___preempt_schedule[_notrace]" instruction before setting up a stack
frame, which breaks frame pointer convention if CONFIG_FRAME_POINTER is
enabled and can result in bad stack traces.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the inline asm
statements.

Specifically this fixes the following stacktool warnings:

  stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x70: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x92: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0xff: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0xf5: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0x11a: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_get()+0x225: call without frame pointer save/setup
  stacktool: kernel/locking/percpu-rwsem.o: percpu_up_read()+0x27: call without frame pointer save/setup
  stacktool: kernel/profile.o: do_profile_hits.isra.5()+0x139: call without frame pointer save/setup
  stacktool: lib/nmi_backtrace.o: nmi_trigger_all_cpu_backtrace()+0x2b6: call without frame pointer save/setup
  stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_recv()+0x58: call without frame pointer save/setup
  stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_send()+0x58: call without frame pointer save/setup
  stacktool: net/rds/ib_recv.o: rds_ib_attempt_ack()+0xc1: call without frame pointer save/setup
  stacktool: net/rds/iw_recv.o: rds_iw_attempt_ack()+0xc1: call without frame pointer save/setup
  stacktool: net/rds/iw_recv.o: rds_iw_recv_cq_comp_handler()+0x55: call without frame pointer save/setup

So it only adds a stack frame to 15 call sites out of ~5000 calls to
___preempt_schedule[_notrace].  All the others already had stack frames.

Oddly, this change actually seems to make things faster in a lot of
cases.  For many smaller functions it causes the stack frame creation to
get moved out of the common path and into the unlikely path.

For example, here's the original cyc2ns_read_end():

  ffffffff8101f8c0 <cyc2ns_read_end>:
  ffffffff8101f8c0:	55                   	push   %rbp
  ffffffff8101f8c1:	48 89 e5             	mov    %rsp,%rbp
  ffffffff8101f8c4:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
  ffffffff8101f8c8:	75 08                	jne    ffffffff8101f8d2 <cyc2ns_read_end+0x12>
  ffffffff8101f8ca:	65 48 89 3d e6 5a ff 	mov    %rdi,%gs:0x7eff5ae6(%rip)        # 153b8 <cyc2ns+0x38>
  ffffffff8101f8d1:	7e
  ffffffff8101f8d2:	65 ff 0d 77 c4 fe 7e 	decl   %gs:0x7efec477(%rip)        # bd50 <__preempt_count>
  ffffffff8101f8d9:	74 02                	je     ffffffff8101f8dd <cyc2ns_read_end+0x1d>
  ffffffff8101f8db:	5d                   	pop    %rbp
  ffffffff8101f8dc:	c3                   	retq
  ffffffff8101f8dd:	e8 1e 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
  ffffffff8101f8e2:	5d                   	pop    %rbp
  ffffffff8101f8e3:	c3                   	retq
  ffffffff8101f8e4:	66 66 66 2e 0f 1f 84 	data16 data16 nopw %cs:0x0(%rax,%rax,1)
  ffffffff8101f8eb:	00 00 00 00 00

And here's the same function with the patch:

  ffffffff8101f8c0 <cyc2ns_read_end>:
  ffffffff8101f8c0:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
  ffffffff8101f8c4:	75 08                	jne    ffffffff8101f8ce <cyc2ns_read_end+0xe>
  ffffffff8101f8c6:	65 48 89 3d ea 5a ff 	mov    %rdi,%gs:0x7eff5aea(%rip)        # 153b8 <cyc2ns+0x38>
  ffffffff8101f8cd:	7e
  ffffffff8101f8ce:	65 ff 0d 7b c4 fe 7e 	decl   %gs:0x7efec47b(%rip)        # bd50 <__preempt_count>
  ffffffff8101f8d5:	74 01                	je     ffffffff8101f8d8 <cyc2ns_read_end+0x18>
  ffffffff8101f8d7:	c3                   	retq
  ffffffff8101f8d8:	55                   	push   %rbp
  ffffffff8101f8d9:	48 89 e5             	mov    %rsp,%rbp
  ffffffff8101f8dc:	e8 1f 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
  ffffffff8101f8e1:	5d                   	pop    %rbp
  ffffffff8101f8e2:	c3                   	retq
  ffffffff8101f8e3:	66 66 66 66 2e 0f 1f 	data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
  ffffffff8101f8ea:	84 00 00 00 00 00

Notice that it moved the frame pointer setup code to the unlikely
___preempt_schedule() call path.  Going through a sampling of the
differences in the asm, that's the most common change I see.

Otherwise it has no real effect on callers which already have stack
frames (though it does result in the reordering of some 'mov's).

Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/include/asm/preempt.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index 01bcde8..d397deb 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -94,10 +94,19 @@ static __always_inline bool should_resched(int preempt_offset)
 
 #ifdef CONFIG_PREEMPT
   extern asmlinkage void ___preempt_schedule(void);
-# define __preempt_schedule() asm ("call ___preempt_schedule")
+# define __preempt_schedule()					\
+({								\
+	register void *__sp asm(_ASM_SP);			\
+	asm volatile ("call ___preempt_schedule" : "+r"(__sp));	\
+})
+
   extern asmlinkage void preempt_schedule(void);
   extern asmlinkage void ___preempt_schedule_notrace(void);
-# define __preempt_schedule_notrace() asm ("call ___preempt_schedule_notrace")
+# define __preempt_schedule_notrace()					\
+({									\
+	register void *__sp asm(_ASM_SP);				\
+	asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp));	\
+})
   extern asmlinkage void preempt_schedule_notrace(void);
 #endif
 
-- 
2.4.3

* Re: [PATCH] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]
  2016-02-18 17:41                 ` [PATCH] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace] Josh Poimboeuf
@ 2016-02-19 12:05                   ` Jiri Slaby
  2016-02-23  9:05                   ` [tip:x86/debug] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]() tip-bot for Josh Poimboeuf
  2016-02-25  5:53                   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: Jiri Slaby @ 2016-02-19 12:05 UTC (permalink / raw)
  To: Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: Peter Zijlstra, Linus Torvalds, linux-kernel, live-patching

On 02/18/2016, 06:41 PM, Josh Poimboeuf wrote:
> If __preempt_schedule() or __preempt_schedule_notrace() is referenced at
> the beginning of a function, gcc can insert the asm inline "call
> ___preempt_schedule[_notrace]" instruction before setting up a stack
> frame, which breaks frame pointer convention if CONFIG_FRAME_POINTER is
> enabled and can result in bad stack traces.
> 
> Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
> listing the stack pointer as an output operand for the inline asm
> statements.
> 
> Specifically this fixes the following stacktool warnings:
> 
>   stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup
...
> Reported-by: Jiri Slaby <jslaby@suse.cz>

This patch and adding lbug_with_loc to global_noreturns makes all
stacktool warnings go away here.

> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  arch/x86/include/asm/preempt.h | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
> index 01bcde8..d397deb 100644
> --- a/arch/x86/include/asm/preempt.h
> +++ b/arch/x86/include/asm/preempt.h
> @@ -94,10 +94,19 @@ static __always_inline bool should_resched(int preempt_offset)
>  
>  #ifdef CONFIG_PREEMPT
>    extern asmlinkage void ___preempt_schedule(void);
> -# define __preempt_schedule() asm ("call ___preempt_schedule")
> +# define __preempt_schedule()					\
> +({								\
> +	register void *__sp asm(_ASM_SP);			\
> +	asm volatile ("call ___preempt_schedule" : "+r"(__sp));	\
> +})
> +
>    extern asmlinkage void preempt_schedule(void);
>    extern asmlinkage void ___preempt_schedule_notrace(void);
> -# define __preempt_schedule_notrace() asm ("call ___preempt_schedule_notrace")
> +# define __preempt_schedule_notrace()					\
> +({									\
> +	register void *__sp asm(_ASM_SP);				\
> +	asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp));	\
> +})
>    extern asmlinkage void preempt_schedule_notrace(void);
>  #endif
>  
> 

thanks,
-- 
js
suse labs

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
                   ` (34 preceding siblings ...)
  2016-02-12 10:36 ` [PATCH 00/33] Compile-time stack metadata validation Jiri Slaby
@ 2016-02-23  8:14 ` Ingo Molnar
  2016-02-23 14:27   ` Arnaldo Carvalho de Melo
  2016-02-23 15:01   ` Josh Poimboeuf
  35 siblings, 2 replies; 133+ messages in thread
From: Ingo Molnar @ 2016-02-23  8:14 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, David Vrabel,
	Borislav Petkov, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	Jeremy Fitzhardinge, Chris Wright, Alok Kataria, Rusty Russell,
	Herbert Xu, David S. Miller, Pavel Machek, Rafael J. Wysocki,
	Len Brown, Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long


So I tried out this latest stacktool series and it looks mostly good for an 
upstream merge.

To help this effort move forward I've applied the preparatory/fix patches that are 
part of this series to tip:x86/debug - that's 26 out of 31 patches. (I've 
propagated all the acks that the latest submission got into the changelogs.)

I have 5 (easy to address) observations that need to be addressed before we can 
move on with the rest of the merge:

1)

Due to recent changes to x86 exception handling, I get a lot of bogus warnings 
about exception table sizes:

  stacktool: arch/x86/kernel/cpu/mcheck/mce.o: __ex_table size not a multiple of 24
  stacktool: arch/x86/kernel/cpu/mtrr/generic.o: __ex_table size not a multiple of 24
  stacktool: arch/x86/kernel/cpu/mtrr/cleanup.o: __ex_table size not a multiple of 24

this is due to:

  548acf19234d x86/mm: Expand the exception table logic to allow new handling options
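
A minimal sketch of why a stale entry size would produce such warnings. That commit adds a third 32-bit field (handler) to the x86 exception table entry, so an entry grows from 8 to 12 bytes. The struct and function names below are hypothetical, and where stacktool's constant 24 comes from is not shown in this thread, so the entry size is left as a parameter:

```c
#include <stddef.h>

/* Hypothetical mirrors of the x86 entry layout before and after the change. */
struct sketch_ex_entry_old { int insn, fixup; };
struct sketch_ex_entry_new { int insn, fixup, handler; };

/*
 * Returns 1 if a section of section_len bytes holds a whole number of
 * entries of entry_size bytes, 0 otherwise. This is the kind of sanity
 * check that starts failing when entry_size is not updated.
 */
static int sketch_size_is_multiple(size_t section_len, size_t entry_size)
{
	return entry_size != 0 && section_len % entry_size == 0;
}
```

A table of three new-style entries is 36 bytes: it passes when checked against the 12-byte entry and fails when checked against the old 8-byte one.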

2)

The fact that 'stacktool' already checks assembly details like __ex_table[] 
shows that my review feedback on early iterations of this series, that the 
'stacktool' name is too specific, was correct.

We really need to rename it before it gets upstream and the situation gets worse. 
__ex_table[] has obviously nothing to do with the 'stack layout' of binaries.

Another suitable name would be 'asmtool' or 'objtool'. For example the following 
would naturally express what it does:

  objtool check kernel/built-in.o

the name expresses that the tool checks object files, independently of the 
existing toolchain. Its primary purpose right now is the checking of stack layout 
details, but the tool is not limited to that at all.

3)

There's quite a bit of overhead when running the tool on larger object files, most 
prominently in cmd_check():

    62.06%  stacktool        stacktool                              [.] cmd_check
     6.72%  stacktool        stacktool                              [.] find_rela_by_dest_range

I added -g to the CFLAGS, which made it visible in the annotated output of perf top:

    0.00 :        40430d:       lea    0x4(%rdx,%rax,1),%r13
         :      find_instruction():
         :                                                  struct section *sec,
         :                                                  unsigned long offset)
         :      {
         :              struct instruction *insn;
         :
         :              list_for_each_entry(insn, &file->insns, list)
    0.03 :        404312:       mov    0x38(%rsp),%rax
    0.00 :        404317:       cmp    %rbp,%rax
    0.00 :        40431a:       jne    404334 <cmd_check+0x5b4>
    0.00 :        40431c:       jmpq   4045ba <cmd_check+0x83a>
    0.00 :        404321:       nopl   0x0(%rax)
    6.14 :        404328:       mov    (%rax),%rax
    0.00 :        40432b:       cmp    %rbp,%rax
    2.02 :        40432e:       je     4045ba <cmd_check+0x83a>
         :                      if (insn->sec == sec && insn->offset == offset)
    0.55 :        404334:       cmp    %r12,0x10(%rax)
   87.91 :        404338:       jne    404328 <cmd_check+0x5a8>
    0.00 :        40433a:       cmp    %r13,0x18(%rax)
    3.36 :        40433e:       jne    404328 <cmd_check+0x5a8>
         :      get_jump_destinations():
         :                               * later in validate_functions().
         :                               */
         :                              continue;
         :                      }
         :
         :                      insn->jump_dest = find_instruction(file, dest_sec, dest_off);
    0.00 :        404340:       mov    %rax,0x48(%rbx)
    0.00 :        404344:       jmpq   4042b0 <cmd_check+0x530>
    0.00 :        404349:       nopl   0x0(%rax)
         :      fprintf():
         :
         :      # ifdef __va_arg_pack
         :      __fortify_function int
         :      fprintf (FILE *__restrict __stream, const char *__restrict __fmt, ...)
         :      {
         :        return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,

that looks like a linear list search? That would suck with thousands of entries.

(If this is non-trivial to improve then we can delay this optimization to a later 
patch.)
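
One common way to avoid the linear walk, sketched under assumptions: index the instructions by (section, offset) in a hash table, so each lookup only touches one short bucket chain. All names below (sketch_insn, sketch_table, sketch_insert, sketch_find) are hypothetical and are not stacktool's actual data structures:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HASH_BITS 10
#define SKETCH_HASH_SIZE (1u << SKETCH_HASH_BITS)

struct sketch_insn {
	const void *sec;		/* owning section, opaque in this sketch */
	unsigned long offset;		/* offset of the instruction in sec */
	struct sketch_insn *next;	/* next entry in the same bucket */
};

struct sketch_table {
	struct sketch_insn *buckets[SKETCH_HASH_SIZE];
};

/* Multiplicative hash of the (sec, offset) pair; keeps the top bits. */
static unsigned int sketch_hash(const void *sec, unsigned long offset)
{
	uint64_t h = ((uint64_t)(uintptr_t)sec + offset) * 0x9E3779B97F4A7C15ull;

	return (unsigned int)(h >> (64 - SKETCH_HASH_BITS));
}

/* Link insn at the head of its bucket. */
static void sketch_insert(struct sketch_table *t, struct sketch_insn *insn)
{
	unsigned int b = sketch_hash(insn->sec, insn->offset);

	insn->next = t->buckets[b];
	t->buckets[b] = insn;
}

/* Same contract as a list-based lookup: matching entry or NULL. */
static struct sketch_insn *sketch_find(const struct sketch_table *t,
				       const void *sec, unsigned long offset)
{
	struct sketch_insn *insn;

	for (insn = t->buckets[sketch_hash(sec, offset)]; insn; insn = insn->next)
		if (insn->sec == sec && insn->offset == offset)
			return insn;
	return NULL;
}
```

With a layout like this, resolving each jump destination would cost roughly one bucket walk instead of one pass over the whole instruction list.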

4)

I think the various 'STACKTOOL' flags in the kernel source are a bit of a misnomer 
- it's not the tool we want to name but the actual property of the code.

So instead of:

  STACKTOOL_IGNORE_FUNC(__bpf_prog_run);

we should do something like:

  STACK_FRAME_NON_STANDARD(__bpf_prog_run);

see how much more expressive it is? It becomes a function attribute independent of 
the tooling that makes use of it.

Similarly, for the highest level 'don't check these directories' makefile flags, 
I'd suggest, instead of using this rather opaque, tool dependent naming:

  STACKTOOL := n

something more specific, like:

  OBJECT_FILES_NON_STANDARD := y

which makes it clearer what's going on: these are special object files that are 
not the typical kernel object files.

Stacktool (or objtool) would be one consumer of this annotation.

I think Boris made similar observations in past reviews.
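
To illustrate what "a function attribute independent of the tooling" could look like, here is a userspace sketch, assuming an ELF target and a GNU-compatible linker: the annotation only records the function's address in a dedicated section, and any consumer can read that section. The macro name, the section name sketch_nonstd, and the helper are all hypothetical and are not the kernel's implementation:

```c
#include <stddef.h>

/*
 * Record the address of func in the "sketch_nonstd" section. The macro
 * says nothing about which tool will consume the annotation.
 */
#define SKETCH_STACK_FRAME_NON_STANDARD(func)				\
	static void *sketch_nonstd_##func				\
		__attribute__((used, section("sketch_nonstd"))) =	\
		(void *)(func)

/* Bounds the linker provides for a section named like a C identifier. */
extern void *__start_sketch_nonstd[];
extern void *__stop_sketch_nonstd[];

int sketch_odd_function(void) { return 1; }
int sketch_plain_function(void) { return 2; }

SKETCH_STACK_FRAME_NON_STANDARD(sketch_odd_function);

/* One possible consumer: is fn listed in the annotation section? */
int sketch_is_non_standard(int (*fn)(void))
{
	void **p;

	for (p = __start_sketch_nonstd; p < __stop_sketch_nonstd; p++)
		if (*p == (void *)fn)
			return 1;
	return 0;
}
```

A build-time checker would read the same section from the object file rather than at run time; the point of the sketch is only that the annotation describes the code, not the tool.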

5)

Likewise, I think the CONFIG_STACK_VALIDATION=y Kconfig flag does not express that 
we do exception table checks as well - and it does not express all the other 
things we may check in object files in the future.

Something like CONFIG_CHECK_OBJECT_FILES=y would express it, and the help text 
would list all the things the tool is able to check for at the moment.

-------------------

Please send followup iterations of the series against the tip:x86/debug tree (or 
tip:master), to keep the size of the series down.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/xen: Add stack frame dependency to hypercall inline asm calls
  2016-01-21 22:49 ` [PATCH 05/33] x86/xen: Add stack frame dependency to hypercall inline asm calls Josh Poimboeuf
@ 2016-02-23  8:55   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  8:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: boris.ostrovsky, torvalds, linux-kernel, peterz, mmarek, palves,
	bp, dvlasenk, mingo, tglx, bernd, bp, hpa, konrad.wilk,
	chris.j.arges, acme, jpoimboe, luto, david.vrabel, luto, jslaby,
	brgerst, namhyung, akpm

Commit-ID:  6d2d32c1fdcbf0e054f555fc855b81047734ad3f
Gitweb:     http://git.kernel.org/tip/6d2d32c1fdcbf0e054f555fc855b81047734ad3f
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:09 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:54 +0100

x86/xen: Add stack frame dependency to hypercall inline asm calls

If a hypercall is inlined at the beginning of a function, gcc can insert
the call instruction before setting up a stack frame, which breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the hypercall inline
asm statements.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/c6face5a46713108bded9c4c103637222abc4528.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/xen/hypercall.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 3bcdcc8..a12a047 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -110,9 +110,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 	register unsigned long __arg2 asm(__HYPERCALL_ARG2REG) = __arg2; \
 	register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \
 	register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \
-	register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5;
+	register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5; \
+	register void *__sp asm(_ASM_SP);
 
-#define __HYPERCALL_0PARAM	"=r" (__res)
+#define __HYPERCALL_0PARAM	"=r" (__res), "+r" (__sp)
 #define __HYPERCALL_1PARAM	__HYPERCALL_0PARAM, "+r" (__arg1)
 #define __HYPERCALL_2PARAM	__HYPERCALL_1PARAM, "+r" (__arg2)
 #define __HYPERCALL_3PARAM	__HYPERCALL_2PARAM, "+r" (__arg3)

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/xen: Set ELF function type for xen_adjust_exception_frame()
  2016-01-21 22:49 ` [PATCH 06/33] x86/asm/xen: Set ELF function type for xen_adjust_exception_frame() Josh Poimboeuf
@ 2016-02-23  8:56   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  8:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bernd, dvlasenk, david.vrabel, konrad.wilk, hpa, tglx, namhyung,
	jslaby, mingo, bp, acme, luto, brgerst, linux-kernel, jpoimboe,
	mmarek, boris.ostrovsky, akpm, luto, peterz, chris.j.arges,
	palves, torvalds

Commit-ID:  755f95c9331ba316ae6533cdd16456e4613dd0dc
Gitweb:     http://git.kernel.org/tip/755f95c9331ba316ae6533cdd16456e4613dd0dc
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:10 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:54 +0100

x86/asm/xen: Set ELF function type for xen_adjust_exception_frame()

xen_adjust_exception_frame() is a callable function, but is missing the
ELF function type, which confuses tools like stacktool.

Properly annotate it to be a callable function.  The generated code is
unchanged.
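
A small userspace sketch of the same kind of annotation, assuming x86-64, an ELF target and a GNU-style assembler (a plain C fallback is used elsewhere). The symbol sketch_answer is hypothetical. The .type and .size directives only change symbol table metadata; the instructions emitted are the same with or without them:

```c
int sketch_answer(void);

#if defined(__x86_64__) && defined(__ELF__)
/* Hand-written function, marked as STT_FUNC with a known size. */
__asm__(".pushsection .text\n"
	".globl sketch_answer\n"
	".type sketch_answer, @function\n"
	".balign 4\n"
	"sketch_answer:\n"
	"	movl $42, %eax\n"
	"	ret\n"
	".size sketch_answer, .-sketch_answer\n"
	".popsection\n");
#else
/* Fallback so the sketch still builds on other targets. */
int sketch_answer(void) { return 42; }
#endif
```

Running readelf -s on the resulting object should list sketch_answer with type FUNC; without the .type line it would show up as NOTYPE.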

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/b1851bd17a0986472692a7e3a05290d891382cdd.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/xen/xen-asm_64.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index cc8acc4..c3df431 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -26,6 +26,7 @@ ENTRY(xen_adjust_exception_frame)
 	mov 8+0(%rsp), %rcx
 	mov 8+8(%rsp), %r11
 	ret $16
+ENDPROC(xen_adjust_exception_frame)
 
 hypercall_iret = hypercall_page + __HYPERVISOR_iret * 32
 /*

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/xen: Create stack frames in xen-asm.S
  2016-01-21 22:49 ` [PATCH 07/33] x86/asm/xen: Create stack frames in xen-asm.S Josh Poimboeuf
@ 2016-02-23  8:56   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  8:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, bernd, palves, brgerst, torvalds, chris.j.arges,
	konrad.wilk, namhyung, mmarek, boris.ostrovsky, akpm, hpa,
	linux-kernel, bp, jslaby, jpoimboe, david.vrabel, mingo, luto,
	dvlasenk, tglx, peterz, acme

Commit-ID:  801367b7a77adeecf0f77f25f6e50e91130bd1fb
Gitweb:     http://git.kernel.org/tip/801367b7a77adeecf0f77f25f6e50e91130bd1fb
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:11 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:55 +0100

x86/asm/xen: Create stack frames in xen-asm.S

xen_irq_enable_direct(), xen_restore_fl_direct(), and check_events() are
callable non-leaf functions which don't honor CONFIG_FRAME_POINTER,
which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/a8340ad3fc72ba9ed34da9b3af9cdd6f1a896e17.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/xen/xen-asm.S | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 3e45aa0..eff224d 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -14,6 +14,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/percpu.h>
 #include <asm/processor-flags.h>
+#include <asm/frame.h>
 
 #include "xen-asm.h"
 
@@ -23,6 +24,7 @@
  * then enter the hypervisor to get them handled.
  */
 ENTRY(xen_irq_enable_direct)
+	FRAME_BEGIN
 	/* Unmask events */
 	movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
 
@@ -39,6 +41,7 @@ ENTRY(xen_irq_enable_direct)
 2:	call check_events
 1:
 ENDPATCH(xen_irq_enable_direct)
+	FRAME_END
 	ret
 	ENDPROC(xen_irq_enable_direct)
 	RELOC(xen_irq_enable_direct, 2b+1)
@@ -82,6 +85,7 @@ ENDPATCH(xen_save_fl_direct)
  * enters the hypervisor to get them delivered if so.
  */
 ENTRY(xen_restore_fl_direct)
+	FRAME_BEGIN
 #ifdef CONFIG_X86_64
 	testw $X86_EFLAGS_IF, %di
 #else
@@ -100,6 +104,7 @@ ENTRY(xen_restore_fl_direct)
 2:	call check_events
 1:
 ENDPATCH(xen_restore_fl_direct)
+	FRAME_END
 	ret
 	ENDPROC(xen_restore_fl_direct)
 	RELOC(xen_restore_fl_direct, 2b+1)
@@ -109,7 +114,8 @@ ENDPATCH(xen_restore_fl_direct)
  * Force an event check by making a hypercall, but preserve regs
  * before making the call.
  */
-check_events:
+ENTRY(check_events)
+	FRAME_BEGIN
 #ifdef CONFIG_X86_32
 	push %eax
 	push %ecx
@@ -139,4 +145,6 @@ check_events:
 	pop %rcx
 	pop %rax
 #endif
+	FRAME_END
 	ret
+ENDPROC(check_events)

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/paravirt: Add stack frame dependency to PVOP inline asm calls
  2016-01-21 22:49 ` [PATCH 08/33] x86/paravirt: Add stack frame dependency to PVOP inline asm calls Josh Poimboeuf
@ 2016-02-23  8:56   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  8:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, mmarek, chrisw, rusty, dvlasenk, bp, brgerst, luto,
	jeremy, linux-kernel, namhyung, luto, hpa, jslaby, bernd,
	torvalds, tglx, palves, peterz, akataria, mingo, chris.j.arges,
	acme, akpm, bp

Commit-ID:  48b86d5c38a817ddf718d3ea5369cd2e885f28f3
Gitweb:     http://git.kernel.org/tip/48b86d5c38a817ddf718d3ea5369cd2e885f28f3
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:12 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:55 +0100

x86/paravirt: Add stack frame dependency to PVOP inline asm calls

If a PVOP call macro is inlined at the beginning of a function, gcc can
insert the call instruction before setting up a stack frame, which
breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled and
can result in a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the PVOP inline asm
statements.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6a13e48c5a8cf2de1aa112ae2d4c0ac194096282.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/paravirt_types.h | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 77db561..e8c2326 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -466,8 +466,9 @@ int paravirt_disable_iospace(void);
  * makes sure the incoming and outgoing types are always correct.
  */
 #ifdef CONFIG_X86_32
-#define PVOP_VCALL_ARGS				\
-	unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx
+#define PVOP_VCALL_ARGS							\
+	unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;	\
+	register void *__sp asm("esp")
 #define PVOP_CALL_ARGS			PVOP_VCALL_ARGS
 
 #define PVOP_CALL_ARG1(x)		"a" ((unsigned long)(x))
@@ -485,9 +486,10 @@ int paravirt_disable_iospace(void);
 #define VEXTRA_CLOBBERS
 #else  /* CONFIG_X86_64 */
 /* [re]ax isn't an arg, but the return val */
-#define PVOP_VCALL_ARGS					\
-	unsigned long __edi = __edi, __esi = __esi,	\
-		__edx = __edx, __ecx = __ecx, __eax = __eax
+#define PVOP_VCALL_ARGS						\
+	unsigned long __edi = __edi, __esi = __esi,		\
+		__edx = __edx, __ecx = __ecx, __eax = __eax;	\
+	register void *__sp asm("rsp")
 #define PVOP_CALL_ARGS		PVOP_VCALL_ARGS
 
 #define PVOP_CALL_ARG1(x)		"D" ((unsigned long)(x))
@@ -526,7 +528,7 @@ int paravirt_disable_iospace(void);
 			asm volatile(pre				\
 				     paravirt_alt(PARAVIRT_CALL)	\
 				     post				\
-				     : call_clbr			\
+				     : call_clbr, "+r" (__sp)		\
 				     : paravirt_type(op),		\
 				       paravirt_clobber(clbr),		\
 				       ##__VA_ARGS__			\
@@ -536,7 +538,7 @@ int paravirt_disable_iospace(void);
 			asm volatile(pre				\
 				     paravirt_alt(PARAVIRT_CALL)	\
 				     post				\
-				     : call_clbr			\
+				     : call_clbr, "+r" (__sp)		\
 				     : paravirt_type(op),		\
 				       paravirt_clobber(clbr),		\
 				       ##__VA_ARGS__			\
@@ -563,7 +565,7 @@ int paravirt_disable_iospace(void);
 		asm volatile(pre					\
 			     paravirt_alt(PARAVIRT_CALL)		\
 			     post					\
-			     : call_clbr				\
+			     : call_clbr, "+r" (__sp)			\
 			     : paravirt_type(op),			\
 			       paravirt_clobber(clbr),			\
 			       ##__VA_ARGS__				\

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK
  2016-01-21 22:49 ` [PATCH 09/33] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK Josh Poimboeuf
@ 2016-02-23  8:57   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  8:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, chris.j.arges, mingo, torvalds, namhyung, akpm, chrisw,
	mmarek, jslaby, brgerst, bp, akataria, jeremy, linux-kernel, hpa,
	luto, peterz, palves, jpoimboe, rusty, acme, tglx, dvlasenk, bp,
	bernd

Commit-ID:  c9cc1d72bb0b657de06b8d4be36d94bea0454ee8
Gitweb:     http://git.kernel.org/tip/c9cc1d72bb0b657de06b8d4be36d94bea0454ee8
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:13 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:55 +0100

x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK

A function created with the PV_CALLEE_SAVE_REGS_THUNK macro doesn't set
up a new stack frame before the call instruction, which breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.  Also, the thunk functions aren't annotated as ELF
callable functions.

Create a stack frame when CONFIG_FRAME_POINTER is enabled and add the
ELF function type.
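
A userspace sketch of the thunk shape described above, assuming x86-64 and an ELF target (a plain C fallback is used elsewhere). SKETCH_FRAME_BEGIN and SKETCH_FRAME_END are hypothetical stand-ins that always expand to the frame pointer setup; the kernel's FRAME_BEGIN/FRAME_END only do so when CONFIG_FRAME_POINTER is enabled. sketch_target and sketch_thunk are made-up names:

```c
int sketch_target(void);
int sketch_thunk(void);

/* Ordinary C function the thunk calls; "used" keeps the symbol around. */
__attribute__((used)) int sketch_target(void) { return 7; }

#if defined(__x86_64__) && defined(__ELF__)
#define SKETCH_FRAME_BEGIN "push %rbp; mov %rsp, %rbp;"
#define SKETCH_FRAME_END   "pop %rbp;"

/* Typed as a function, with a frame set up before the call. */
__asm__(".pushsection .text;"
	".globl sketch_thunk;"
	".type sketch_thunk, @function;"
	"sketch_thunk:"
	SKETCH_FRAME_BEGIN
	"call sketch_target;"
	SKETCH_FRAME_END
	"ret;"
	".size sketch_thunk, .-sketch_thunk;"
	".popsection");
#else
/* Fallback so the sketch still builds on other targets. */
int sketch_thunk(void) { return sketch_target(); }
#endif
```

Because the frame is pushed before the call, a frame pointer based unwinder walking up from sketch_target finds a frame whose return address points into the thunk's caller, and the thunk itself is identified by the return address the call pushed.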

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/a2cad74e87c4aba7fd0f54a1af312e66a824a575.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/paravirt.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index f619250..601f1b8 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -13,6 +13,7 @@
 #include <linux/bug.h>
 #include <linux/types.h>
 #include <linux/cpumask.h>
+#include <asm/frame.h>
 
 static inline int paravirt_enabled(void)
 {
@@ -756,15 +757,19 @@ static __always_inline void __ticket_unlock_kick(struct arch_spinlock *lock,
  * call. The return value in rax/eax will not be saved, even for void
  * functions.
  */
+#define PV_THUNK_NAME(func) "__raw_callee_save_" #func
 #define PV_CALLEE_SAVE_REGS_THUNK(func)					\
 	extern typeof(func) __raw_callee_save_##func;			\
 									\
 	asm(".pushsection .text;"					\
-	    ".globl __raw_callee_save_" #func " ; "			\
-	    "__raw_callee_save_" #func ": "				\
+	    ".globl " PV_THUNK_NAME(func) ";"				\
+	    ".type " PV_THUNK_NAME(func) ", @function;"			\
+	    PV_THUNK_NAME(func) ":"					\
+	    FRAME_BEGIN							\
 	    PV_SAVE_ALL_CALLER_REGS					\
 	    "call " #func ";"						\
 	    PV_RESTORE_ALL_CALLER_REGS					\
+	    FRAME_END							\
 	    "ret;"							\
 	    ".popsection")
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/amd: Set ELF function type for vide()
  2016-01-21 22:49 ` [PATCH 10/33] x86/amd: Set ELF function type for vide() Josh Poimboeuf
@ 2016-02-23  8:57   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  8:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, palves, chris.j.arges, torvalds, namhyung, bp, dvlasenk,
	mmarek, acme, brgerst, akpm, hpa, peterz, mingo, tglx, jslaby,
	bernd, linux-kernel, bp, luto, jpoimboe

Commit-ID:  48740ab9f0e953137ab62891b87f986e36e1bc69
Gitweb:     http://git.kernel.org/tip/48740ab9f0e953137ab62891b87f986e36e1bc69
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:14 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:55 +0100

x86/amd: Set ELF function type for vide()

vide() is a callable function, but is missing the ELF function type,
which confuses tools like stacktool.

Properly annotate it to be a callable function.  The generated code is
unchanged.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/a324095f5c9390ff39b15b4562ea1bbeda1a8282.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/amd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index a07956a..fe2f089 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -75,7 +75,10 @@ static inline int wrmsrl_amd_safe(unsigned msr, unsigned long long val)
  */
 
 extern __visible void vide(void);
-__asm__(".globl vide\n\t.align 4\nvide: ret");
+__asm__(".globl vide\n"
+	".type vide, @function\n"
+	".align 4\n"
+	"vide: ret\n");
 
 static void init_amd_k5(struct cpuinfo_x86 *c)
 {

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/crypto: Move .Lbswap_mask data to .rodata section
  2016-01-21 22:49 ` [PATCH 11/33] x86/asm/crypto: Move .Lbswap_mask data to .rodata section Josh Poimboeuf
@ 2016-02-23  8:58   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  8:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: herbert, palves, brgerst, linux-kernel, acme, davem, bp, luto,
	mingo, torvalds, bernd, chris.j.arges, tglx, bp, namhyung,
	peterz, jslaby, luto, dvlasenk, mmarek, akpm, hpa, jpoimboe

Commit-ID:  999ba9ec9cc3e9435f42dbe926c68921f1f66ef3
Gitweb:     http://git.kernel.org/tip/999ba9ec9cc3e9435f42dbe926c68921f1f66ef3
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:15 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:56 +0100

x86/asm/crypto: Move .Lbswap_mask data to .rodata section

stacktool reports the following warning:

  stacktool: arch/x86/crypto/aesni-intel_asm.o: _aesni_inc_init(): can't find starting instruction

stacktool gets confused when it tries to disassemble the following data
in the .text section:

  .Lbswap_mask:
          .byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Move it to .rodata which is a more appropriate section for read-only
data.
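
The same pattern in a userspace sketch, assuming an ELF target and a GNU-style assembler (a plain C definition is used elsewhere): the table is emitted inside a .pushsection/.popsection pair so it never sits between instructions in .text. The names sketch_bswap_mask and sketch_bswap128 are hypothetical; the helper applies the mask the way a byte shuffle would:

```c
extern const unsigned char sketch_bswap_mask[16];

#if defined(__ELF__)
/* Table lives in .rodata; the previous section is restored afterwards. */
__asm__(".pushsection .rodata\n"
	".balign 16\n"
	".globl sketch_bswap_mask\n"
	"sketch_bswap_mask:\n"
	"	.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0\n"
	".popsection\n");
#else
/* Fallback so the sketch still builds on other targets. */
const unsigned char sketch_bswap_mask[16] = {
	15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
};
#endif

/* Reverse 16 bytes by indexing through the mask: out[i] = in[mask[i]]. */
void sketch_bswap128(const unsigned char in[16], unsigned char out[16])
{
	int i;

	for (i = 0; i < 16; i++)
		out[i] = in[sketch_bswap_mask[i] & 0x0f];
}
```

A disassembler walking .text then only ever sees instructions, while the code still reaches the table through its symbol.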

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/b6a2f3f8bda705143e127c025edb2b53c86e6eb4.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/aesni-intel_asm.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 6bd2c6c..c44cfed 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -2538,9 +2538,11 @@ ENTRY(aesni_cbc_dec)
 ENDPROC(aesni_cbc_dec)
 
 #ifdef __x86_64__
+.pushsection .rodata
 .align 16
 .Lbswap_mask:
 	.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
+.popsection
 
 /*
  * _aesni_inc_init:	internal ABI

^ permalink raw reply related	[flat|nested] 133+ messages in thread
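
As a rough illustration of why data bytes inside .text trip up a tool that
decodes the section as instructions, here is a minimal C sketch. It is not
stacktool's decoder: toy_insn_len() and first_undecodable() are hypothetical
names, and the "decoder" only knows four one-byte x86 opcodes. The point is
only that a linear walk stops making sense once it runs into the mask bytes.

```c
#include <stddef.h>

/* Hypothetical toy decoder: knows only four one-byte x86 opcodes. */
int toy_insn_len(unsigned char opcode)
{
	switch (opcode) {
	case 0x55:	/* push %rbp */
	case 0x5d:	/* pop %rbp */
	case 0x90:	/* nop */
	case 0xc3:	/* ret */
		return 1;
	default:
		return 0;	/* not an instruction this decoder knows */
	}
}

/*
 * Walk a section linearly and return the offset of the first byte that
 * cannot be decoded, or -1 if every byte decodes.
 */
long first_undecodable(const unsigned char *sec, size_t len)
{
	size_t off = 0;

	while (off < len) {
		int n = toy_insn_len(sec[off]);

		if (n == 0)
			return (long)off;
		off += (size_t)n;
	}
	return -1;
}
```

With the mask kept in .rodata, the walker never sees those bytes, which is
the effect the .pushsection/.popsection pair in the patch is after.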

* [tip:x86/debug] x86/asm/crypto: Move jump_table to .rodata section
  2016-01-21 22:49 ` [PATCH 12/33] x86/asm/crypto: Move jump_table " Josh Poimboeuf
@ 2016-02-23  8:58   ` tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  8:58 UTC (permalink / raw)
  To: linux-tip-commits@vger.kernel.org
  Cc: jpoimboe, bp, torvalds, palves, herbert, mmarek, acme, namhyung,
	tglx, brgerst, peterz, jslaby, akpm, bp, bernd, luto, dvlasenk,
	davem, linux-kernel, mingo, chris.j.arges, luto, hpa

Commit-ID:  c26ac7081af1f1cd05ccfea858d455100255cfd0
Gitweb:     http://git.kernel.org/tip/c26ac7081af1f1cd05ccfea858d455100255cfd0
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:16 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:56 +0100

x86/asm/crypto: Move jump_table to .rodata section

stacktool reports the following warning:

  stacktool: arch/x86/crypto/crc32c-pcl-intel-asm_64.o: crc_pcl()+0x11dd: can't decode instruction

It gets confused when trying to decode jump_table data.  Move jump_table
to the .rodata section, which is a more appropriate home for read-only
data.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/1dbf80c097bb9d89c0cbddc01a815ada690e3b32.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 4fe27e0..dc05f01 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -170,8 +170,8 @@ continue_block:
 	## branch into array
 	lea	jump_table(%rip), bufp
 	movzxw  (bufp, %rax, 2), len
-	offset=crc_array-jump_table
-	lea     offset(bufp, len, 1), bufp
+	lea	crc_array(%rip), bufp
+	lea     (bufp, len, 1), bufp
 	jmp     *bufp
 
 	################################################################
@@ -310,7 +310,9 @@ do_return:
 	popq    %rdi
 	popq    %rbx
         ret
+ENDPROC(crc_pcl)
 
+.section	.rodata, "a", %progbits
         ################################################################
         ## jump table        Table is 129 entries x 2 bytes each
         ################################################################
@@ -324,13 +326,11 @@ JMPTBL_ENTRY %i
 	i=i+1
 .endr
 
-ENDPROC(crc_pcl)
 
 	################################################################
 	## PCLMULQDQ tables
 	## Table is 128 entries x 2 words (8 bytes) each
 	################################################################
-.section	.rodata, "a", %progbits
 .align 8
 K_table:
 	.long 0x493c7d27, 0x00000001

^ permalink raw reply related	[flat|nested] 133+ messages in thread
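
The first hunk of the patch above also changes how the branch target is
computed. A small C sketch of the two forms, using plain integers for
addresses (branch_target_old() and branch_target_new() are hypothetical
names, and the reading that each 16-bit table entry is an offset from
crc_array is an assumption taken from the diff). The old form started from
the table address and added a fixed "crc_array - jump_table" distance,
which presumably stops being an assemble-time constant once the two symbols
live in different sections; the new form starts from crc_array directly.

```c
#include <stdint.h>

/* Before: start at the table and add a fixed table-to-code distance. */
uintptr_t branch_target_old(uintptr_t table_addr, uintptr_t code_base,
			    uint16_t entry)
{
	/* models "offset = crc_array - jump_table" */
	uintptr_t offset = code_base - table_addr;

	return table_addr + entry + offset;
}

/* After: start at the code base; the table can live in any section. */
uintptr_t branch_target_new(uintptr_t code_base, uint16_t entry)
{
	return code_base + entry;
}
```

Both forms give the same target; the second simply does not depend on where
the table ends up.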

* [tip:x86/debug] x86/asm/crypto: Simplify stack usage in sha-mb functions
  2016-01-21 22:49 ` [PATCH 13/33] x86/asm/crypto: Simplify stack usage in sha-mb functions Josh Poimboeuf
@ 2016-02-23  8:59   ` tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  8:59 UTC (permalink / raw)
  To: linux-tip-commits@vger.kernel.org
  Cc: luto, akpm, acme, brgerst, namhyung, jpoimboe, luto, palves,
	mmarek, bp, dvlasenk, peterz, mingo, torvalds, linux-kernel,
	tglx, bernd, jslaby, hpa, chris.j.arges

Commit-ID:  c520f61ae153d8b28553d5dd0ea698e0968939d7
Gitweb:     http://git.kernel.org/tip/c520f61ae153d8b28553d5dd0ea698e0968939d7
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:17 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:56 +0100

x86/asm/crypto: Simplify stack usage in sha-mb functions

sha1_mb_mgr_flush_avx2() and sha1_mb_mgr_submit_avx2() both allocate a
lot of stack space which is never used.  Also, many of the registers
being saved aren't being clobbered so there's no need to save them.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/9402e4d87580d6b2376ed95f67b84bdcce3c830e.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S  | 32 ++----------------------
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S | 29 +++------------------
 2 files changed, 6 insertions(+), 55 deletions(-)

diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
index 85c4e1c..672eaeb 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
@@ -86,16 +86,6 @@
 #define extra_blocks    %arg2
 #define p               %arg2
 
-
-# STACK_SPACE needs to be an odd multiple of 8
-_XMM_SAVE_SIZE  = 10*16
-_GPR_SAVE_SIZE  = 8*8
-_ALIGN_SIZE     = 8
-
-_XMM_SAVE       = 0
-_GPR_SAVE       = _XMM_SAVE + _XMM_SAVE_SIZE
-STACK_SPACE     = _GPR_SAVE + _GPR_SAVE_SIZE + _ALIGN_SIZE
-
 .macro LABEL prefix n
 \prefix\n\():
 .endm
@@ -113,16 +103,7 @@ offset = \_offset
 # JOB* sha1_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
 ENTRY(sha1_mb_mgr_flush_avx2)
-	mov	%rsp, %r10
-	sub     $STACK_SPACE, %rsp
-	and     $~31, %rsp
-	mov     %rbx, _GPR_SAVE(%rsp)
-	mov     %r10, _GPR_SAVE+8*1(%rsp) #save rsp
-	mov	%rbp, _GPR_SAVE+8*3(%rsp)
-	mov	%r12, _GPR_SAVE+8*4(%rsp)
-	mov	%r13, _GPR_SAVE+8*5(%rsp)
-	mov	%r14, _GPR_SAVE+8*6(%rsp)
-	mov	%r15, _GPR_SAVE+8*7(%rsp)
+	push	%rbx
 
 	# If bit (32+3) is set, then all lanes are empty
 	mov     _unused_lanes(state), unused_lanes
@@ -230,16 +211,7 @@ len_is_0:
 	mov     tmp2_w, offset(job_rax)
 
 return:
-
-	mov     _GPR_SAVE(%rsp), %rbx
-	mov     _GPR_SAVE+8*1(%rsp), %r10 #saved rsp
-	mov	_GPR_SAVE+8*3(%rsp), %rbp
-	mov	_GPR_SAVE+8*4(%rsp), %r12
-	mov	_GPR_SAVE+8*5(%rsp), %r13
-	mov	_GPR_SAVE+8*6(%rsp), %r14
-	mov	_GPR_SAVE+8*7(%rsp), %r15
-	mov     %r10, %rsp
-
+	pop	%rbx
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index 2ab9560..a5a14c6 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -94,25 +94,12 @@ DWORD_tmp	= %r9d
 
 lane_data       = %r10
 
-# STACK_SPACE needs to be an odd multiple of 8
-STACK_SPACE     = 8*8 + 16*10 + 8
-
 # JOB* submit_mb_mgr_submit_avx2(MB_MGR *state, job_sha1 *job)
 # arg 1 : rcx : state
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
-
-	mov	%rsp, %r10
-	sub     $STACK_SPACE, %rsp
-	and	$~31, %rsp
-
-	mov     %rbx, (%rsp)
-	mov	%r10, 8*2(%rsp)	#save old rsp
-	mov     %rbp, 8*3(%rsp)
-	mov	%r12, 8*4(%rsp)
-	mov	%r13, 8*5(%rsp)
-	mov	%r14, 8*6(%rsp)
-	mov	%r15, 8*7(%rsp)
+	push	%rbx
+	push	%rbp
 
 	mov     _unused_lanes(state), unused_lanes
 	mov	unused_lanes, lane
@@ -203,16 +190,8 @@ len_is_0:
 	movl    DWORD_tmp, _result_digest+1*16(job_rax)
 
 return:
-
-	mov     (%rsp), %rbx
-	mov	8*2(%rsp), %r10	#save old rsp
-	mov     8*3(%rsp), %rbp
-	mov	8*4(%rsp), %r12
-	mov	8*5(%rsp), %r13
-	mov	8*6(%rsp), %r14
-	mov	8*7(%rsp), %r15
-	mov     %r10, %rsp
-
+	pop	%rbp
+	pop	%rbx
 	ret
 
 return_null:

^ permalink raw reply related	[flat|nested] 133+ messages in thread
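
To put numbers on the stack savings in the patch above, here is a small C
sketch of the arithmetic. The constants mirror the removed _XMM_SAVE_SIZE,
_GPR_SAVE_SIZE and _ALIGN_SIZE definitions; the function names and the
sample stack pointer value are made up for illustration.

```c
enum {
	XMM_SAVE_SIZE	= 10 * 16,
	GPR_SAVE_SIZE	= 8 * 8,
	ALIGN_SIZE	= 8,
	OLD_STACK_SPACE	= XMM_SAVE_SIZE + GPR_SAVE_SIZE + ALIGN_SIZE,
};

/* Old prologue: "sub $STACK_SPACE, %rsp" followed by "and $~31, %rsp". */
unsigned long old_rsp_after_prologue(unsigned long rsp)
{
	return (rsp - OLD_STACK_SPACE) & ~31UL;
}

/* New prologue: one 8-byte push per saved register. */
unsigned long new_rsp_after_prologue(unsigned long rsp, unsigned int pushes)
{
	return rsp - 8UL * pushes;
}
```

For an entry stack pointer of 0x10008, the old prologue moves it down by at
least 232 bytes, while the new flush and submit prologues use 8 and 16
bytes respectively.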

* [tip:x86/debug] x86/asm/crypto: Don't use RBP as a scratch register
  2016-01-21 22:49 ` [PATCH 14/33] x86/asm/crypto: Don't use rbp as a scratch register Josh Poimboeuf
@ 2016-02-23  8:59   ` tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  8:59 UTC (permalink / raw)
  To: linux-tip-commits@vger.kernel.org
  Cc: chris.j.arges, namhyung, luto, mmarek, brgerst, torvalds, palves,
	tglx, jpoimboe, acme, luto, hpa, mingo, dvlasenk, bernd, jslaby,
	linux-kernel, akpm, peterz, bp

Commit-ID:  1bd03cf57a9b2e2294143f76c21f386a94e5e6c8
Gitweb:     http://git.kernel.org/tip/1bd03cf57a9b2e2294143f76c21f386a94e5e6c8
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:18 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:57 +0100

x86/asm/crypto: Don't use RBP as a scratch register

The frame pointer (RBP) is getting clobbered in
sha1_mb_mgr_submit_avx2() before a function call, which can mess up
stack traces.  Use R12 instead.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/15a3eb7ebe68e37755927915f45e4f0bde4d18c5.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index a5a14c6..c3b9447 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -86,8 +86,8 @@ job_rax         = %rax
 len             = %rax
 DWORD_len	= %eax
 
-lane            = %rbp
-tmp3            = %rbp
+lane            = %r12
+tmp3            = %r12
 
 tmp             = %r9
 DWORD_tmp	= %r9d
@@ -99,7 +99,7 @@ lane_data       = %r10
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
 	push	%rbx
-	push	%rbp
+	push	%r12
 
 	mov     _unused_lanes(state), unused_lanes
 	mov	unused_lanes, lane
@@ -190,7 +190,7 @@ len_is_0:
 	movl    DWORD_tmp, _result_digest+1*16(job_rax)
 
 return:
-	pop	%rbp
+	pop	%r12
 	pop	%rbx
 	ret
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread
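
A minimal sketch of why a clobbered RBP hurts stack traces, assuming a
frame-pointer based unwinder. The stack is simulated as an array of words
and array indices stand in for addresses; unwind() is a hypothetical helper,
not the kernel's unwinder. Each frame is assumed to hold the caller's frame
pointer followed by the return address, as "push %rbp; mov %rsp, %rbp"
would leave it.

```c
#include <stddef.h>

/*
 * Follow the frame pointer chain starting at index fp.
 * stack[fp] is the caller's frame pointer (0 ends the chain) and
 * stack[fp + 1] is the return address.  Returns the number of return
 * addresses written to trace[].
 */
size_t unwind(const unsigned long *stack, size_t nwords, size_t fp,
	      unsigned long *trace, size_t max)
{
	size_t depth = 0;

	while (fp != 0 && fp + 1 < nwords && depth < max) {
		size_t caller_fp = (size_t)stack[fp];

		trace[depth++] = stack[fp + 1];
		if (caller_fp != 0 && caller_fp <= fp)
			break;	/* chain must move toward older frames */
		fp = caller_fp;
	}
	return depth;
}
```

Started from a real frame pointer, the walk yields every return address.
Started from a scratch value that happens to sit in the register (a lane
index, say), it reports whatever words lie at that location instead.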

* [tip:x86/debug] x86/asm/crypto: Create stack frames in crypto functions
  2016-01-21 22:49 ` [PATCH 15/33] x86/asm/crypto: Create stack frames in crypto functions Josh Poimboeuf
@ 2016-02-23  8:59   ` tip-bot for Josh Poimboeuf <tipbot@zytor.com>
  2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  8:59 UTC (permalink / raw)
  To: linux-tip-commits@vger.kernel.org
  Cc: bp, acme, tglx, linux-kernel, peterz, herbert, chris.j.arges,
	brgerst, dvlasenk, bernd, hpa, luto, mingo, torvalds, akpm,
	mmarek, namhyung, luto, jpoimboe, palves, davem, jslaby

Commit-ID:  c24cf96a589107cb29c1d2cfe9c42d43e3f68654
Gitweb:     http://git.kernel.org/tip/c24cf96a589107cb29c1d2cfe9c42d43e3f68654
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:19 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:57 +0100

x86/asm/crypto: Create stack frames in crypto functions

The crypto code has several callable non-leaf functions which don't
honor CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.
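
On 32-bit, creating a frame shifts the stack arguments by one word, which
is why the diff rewrites offsets such as 8(%esp) as (FRAME_OFFSET+8)(%esp).
The C sketch below reproduces that arithmetic under two assumptions:
FRAME_BEGIN pushes exactly one 4-byte word when frame pointers are enabled
and nothing otherwise, and arguments are passed on the stack. frame_offset()
and arg_offset() are hypothetical helper names.

```c
enum { WORD = 4 };	/* stack slot size on 32-bit x86 */

/* Extra bytes on the stack once the frame has been set up. */
unsigned int frame_offset(int frame_pointer)
{
	return frame_pointer ? WORD : 0;
}

/*
 * Offset from %esp of stack argument 'arg' (0-based) after the optional
 * frame setup and 'pushes' register saves.  The return address always
 * occupies one word.
 */
unsigned int arg_offset(int frame_pointer, unsigned int pushes,
			unsigned int arg)
{
	return frame_offset(frame_pointer) + WORD + WORD * pushes + WORD * arg;
}
```

For aesni_set_key (one push) the first argument sits at offset 8 without a
frame and 12 with one; for aesni_cbc_enc (four pushes) the fifth argument
moves from 36 to 40.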

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6c20192bcf1102ae18ae5a242cabf30ce9b29895.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/aesni-intel_asm.S                | 73 +++++++++++++++---------
 arch/x86/crypto/camellia-aesni-avx-asm_64.S      | 15 +++++
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S     | 15 +++++
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S        |  9 +++
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S        | 13 +++++
 arch/x86/crypto/ghash-clmulni-intel_asm.S        |  5 ++
 arch/x86/crypto/serpent-avx-x86_64-asm_64.S      | 13 +++++
 arch/x86/crypto/serpent-avx2-asm_64.S            | 13 +++++
 arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S  |  3 +
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S |  3 +
 arch/x86/crypto/twofish-avx-x86_64-asm_64.S      | 13 +++++
 11 files changed, 148 insertions(+), 27 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index c44cfed..383a6f8 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -31,6 +31,7 @@
 
 #include <linux/linkage.h>
 #include <asm/inst.h>
+#include <asm/frame.h>
 
 /*
  * The following macros are used to move an (un)aligned 16 byte value to/from
@@ -1800,11 +1801,12 @@ ENDPROC(_key_expansion_256b)
  *                   unsigned int key_len)
  */
 ENTRY(aesni_set_key)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
-	movl 8(%esp), KEYP		# ctx
-	movl 12(%esp), UKEYP		# in_key
-	movl 16(%esp), %edx		# key_len
+	movl (FRAME_OFFSET+8)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+12)(%esp), UKEYP	# in_key
+	movl (FRAME_OFFSET+16)(%esp), %edx	# key_len
 #endif
 	movups (UKEYP), %xmm0		# user key (first 16 bytes)
 	movaps %xmm0, (KEYP)
@@ -1905,6 +1907,7 @@ ENTRY(aesni_set_key)
 #ifndef __x86_64__
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_set_key)
 
@@ -1912,12 +1915,13 @@ ENDPROC(aesni_set_key)
  * void aesni_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  */
 ENTRY(aesni_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
 	pushl KLEN
-	movl 12(%esp), KEYP
-	movl 16(%esp), OUTP
-	movl 20(%esp), INP
+	movl (FRAME_OFFSET+12)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+16)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+20)(%esp), INP	# src
 #endif
 	movl 480(KEYP), KLEN		# key length
 	movups (INP), STATE		# input
@@ -1927,6 +1931,7 @@ ENTRY(aesni_enc)
 	popl KLEN
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_enc)
 
@@ -2101,12 +2106,13 @@ ENDPROC(_aesni_enc4)
  * void aesni_dec (struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  */
 ENTRY(aesni_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
 	pushl KLEN
-	movl 12(%esp), KEYP
-	movl 16(%esp), OUTP
-	movl 20(%esp), INP
+	movl (FRAME_OFFSET+12)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+16)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+20)(%esp), INP	# src
 #endif
 	mov 480(KEYP), KLEN		# key length
 	add $240, KEYP
@@ -2117,6 +2123,7 @@ ENTRY(aesni_dec)
 	popl KLEN
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_dec)
 
@@ -2292,14 +2299,15 @@ ENDPROC(_aesni_dec4)
  *		      size_t len)
  */
 ENTRY(aesni_ecb_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 16(%esp), KEYP
-	movl 20(%esp), OUTP
-	movl 24(%esp), INP
-	movl 28(%esp), LEN
+	movl (FRAME_OFFSET+16)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+20)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+24)(%esp), INP	# src
+	movl (FRAME_OFFSET+28)(%esp), LEN	# len
 #endif
 	test LEN, LEN		# check length
 	jz .Lecb_enc_ret
@@ -2342,6 +2350,7 @@ ENTRY(aesni_ecb_enc)
 	popl KEYP
 	popl LEN
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_ecb_enc)
 
@@ -2350,14 +2359,15 @@ ENDPROC(aesni_ecb_enc)
  *		      size_t len);
  */
 ENTRY(aesni_ecb_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 16(%esp), KEYP
-	movl 20(%esp), OUTP
-	movl 24(%esp), INP
-	movl 28(%esp), LEN
+	movl (FRAME_OFFSET+16)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+20)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+24)(%esp), INP	# src
+	movl (FRAME_OFFSET+28)(%esp), LEN	# len
 #endif
 	test LEN, LEN
 	jz .Lecb_dec_ret
@@ -2401,6 +2411,7 @@ ENTRY(aesni_ecb_dec)
 	popl KEYP
 	popl LEN
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_ecb_dec)
 
@@ -2409,16 +2420,17 @@ ENDPROC(aesni_ecb_dec)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_cbc_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl IVP
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 20(%esp), KEYP
-	movl 24(%esp), OUTP
-	movl 28(%esp), INP
-	movl 32(%esp), LEN
-	movl 36(%esp), IVP
+	movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+28)(%esp), INP	# src
+	movl (FRAME_OFFSET+32)(%esp), LEN	# len
+	movl (FRAME_OFFSET+36)(%esp), IVP	# iv
 #endif
 	cmp $16, LEN
 	jb .Lcbc_enc_ret
@@ -2443,6 +2455,7 @@ ENTRY(aesni_cbc_enc)
 	popl LEN
 	popl IVP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_cbc_enc)
 
@@ -2451,16 +2464,17 @@ ENDPROC(aesni_cbc_enc)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_cbc_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl IVP
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 20(%esp), KEYP
-	movl 24(%esp), OUTP
-	movl 28(%esp), INP
-	movl 32(%esp), LEN
-	movl 36(%esp), IVP
+	movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+28)(%esp), INP	# src
+	movl (FRAME_OFFSET+32)(%esp), LEN	# len
+	movl (FRAME_OFFSET+36)(%esp), IVP	# iv
 #endif
 	cmp $16, LEN
 	jb .Lcbc_dec_just_ret
@@ -2534,6 +2548,7 @@ ENTRY(aesni_cbc_dec)
 	popl LEN
 	popl IVP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_cbc_dec)
 
@@ -2600,6 +2615,7 @@ ENDPROC(_aesni_inc)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_ctr_enc)
+	FRAME_BEGIN
 	cmp $16, LEN
 	jb .Lctr_enc_just_ret
 	mov 480(KEYP), KLEN
@@ -2653,6 +2669,7 @@ ENTRY(aesni_ctr_enc)
 .Lctr_enc_ret:
 	movups IV, (IVP)
 .Lctr_enc_just_ret:
+	FRAME_END
 	ret
 ENDPROC(aesni_ctr_enc)
 
@@ -2679,6 +2696,7 @@ ENDPROC(aesni_ctr_enc)
  *			 bool enc, u8 *iv)
  */
 ENTRY(aesni_xts_crypt8)
+	FRAME_BEGIN
 	cmpb $0, %cl
 	movl $0, %ecx
 	movl $240, %r10d
@@ -2779,6 +2797,7 @@ ENTRY(aesni_xts_crypt8)
 	pxor INC, STATE4
 	movdqu STATE4, 0x70(OUTP)
 
+	FRAME_END
 	ret
 ENDPROC(aesni_xts_crypt8)
 
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index ce71f92..aa9e8bd 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -16,6 +16,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 #define CAMELLIA_TABLE_BYTE_LEN 272
 
@@ -726,6 +727,7 @@ __camellia_enc_blk16:
 	 *	%xmm0..%xmm15: 16 encrypted blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 16(%rax), %rcx;
 
@@ -780,6 +782,7 @@ __camellia_enc_blk16:
 		    %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
 		    %xmm15, (key_table)(CTX, %r8, 8), (%rax), 1 * 16(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -812,6 +815,7 @@ __camellia_dec_blk16:
 	 *	%xmm0..%xmm15: 16 plaintext blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 16(%rax), %rcx;
 
@@ -865,6 +869,7 @@ __camellia_dec_blk16:
 		    %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
 		    %xmm15, (key_table)(CTX), (%rax), 1 * 16(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -890,6 +895,7 @@ ENTRY(camellia_ecb_enc_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	 FRAME_BEGIN
 
 	inpack16_pre(%xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
 		     %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
@@ -904,6 +910,7 @@ ENTRY(camellia_ecb_enc_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_enc_16way)
 
@@ -913,6 +920,7 @@ ENTRY(camellia_ecb_dec_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	 FRAME_BEGIN
 
 	cmpl $16, key_length(CTX);
 	movl $32, %r8d;
@@ -932,6 +940,7 @@ ENTRY(camellia_ecb_dec_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_dec_16way)
 
@@ -941,6 +950,7 @@ ENTRY(camellia_cbc_dec_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	FRAME_BEGIN
 
 	cmpl $16, key_length(CTX);
 	movl $32, %r8d;
@@ -981,6 +991,7 @@ ENTRY(camellia_cbc_dec_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_cbc_dec_16way)
 
@@ -997,6 +1008,7 @@ ENTRY(camellia_ctr_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
@@ -1092,6 +1104,7 @@ ENTRY(camellia_ctr_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ctr_16way)
 
@@ -1112,6 +1125,7 @@ camellia_xts_crypt_16way:
 	 *	%r8: index for input whitening key
 	 *	%r9: pointer to  __camellia_enc_blk16 or __camellia_dec_blk16
 	 */
+	FRAME_BEGIN
 
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
@@ -1234,6 +1248,7 @@ camellia_xts_crypt_16way:
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_xts_crypt_16way)
 
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 0e0b886..16186c1 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -11,6 +11,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 #define CAMELLIA_TABLE_BYTE_LEN 272
 
@@ -766,6 +767,7 @@ __camellia_enc_blk32:
 	 *	%ymm0..%ymm15: 32 encrypted blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 32(%rax), %rcx;
 
@@ -820,6 +822,7 @@ __camellia_enc_blk32:
 		    %ymm8, %ymm9, %ymm10, %ymm11, %ymm12, %ymm13, %ymm14,
 		    %ymm15, (key_table)(CTX, %r8, 8), (%rax), 1 * 32(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -852,6 +855,7 @@ __camellia_dec_blk32:
 	 *	%ymm0..%ymm15: 16 plaintext blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 32(%rax), %rcx;
 
@@ -905,6 +909,7 @@ __camellia_dec_blk32:
 		    %ymm8, %ymm9, %ymm10, %ymm11, %ymm12, %ymm13, %ymm14,
 		    %ymm15, (key_table)(CTX), (%rax), 1 * 32(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -930,6 +935,7 @@ ENTRY(camellia_ecb_enc_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -948,6 +954,7 @@ ENTRY(camellia_ecb_enc_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_enc_32way)
 
@@ -957,6 +964,7 @@ ENTRY(camellia_ecb_dec_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -980,6 +988,7 @@ ENTRY(camellia_ecb_dec_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_dec_32way)
 
@@ -989,6 +998,7 @@ ENTRY(camellia_cbc_dec_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1046,6 +1056,7 @@ ENTRY(camellia_cbc_dec_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_cbc_dec_32way)
 
@@ -1070,6 +1081,7 @@ ENTRY(camellia_ctr_32way)
 	 *	%rdx: src (32 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1184,6 +1196,7 @@ ENTRY(camellia_ctr_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ctr_32way)
 
@@ -1216,6 +1229,7 @@ camellia_xts_crypt_32way:
 	 *	%r8: index for input whitening key
 	 *	%r9: pointer to  __camellia_enc_blk32 or __camellia_dec_blk32
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1349,6 +1363,7 @@ camellia_xts_crypt_32way:
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_xts_crypt_32way)
 
diff --git a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
index c35fd5d..14fa196 100644
--- a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 .file "cast5-avx-x86_64-asm_64.S"
 
@@ -365,6 +366,7 @@ ENTRY(cast5_ecb_enc_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -388,6 +390,7 @@ ENTRY(cast5_ecb_enc_16way)
 	vmovdqu RR4, (6*4*4)(%r11);
 	vmovdqu RL4, (7*4*4)(%r11);
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ecb_enc_16way)
 
@@ -398,6 +401,7 @@ ENTRY(cast5_ecb_dec_16way)
 	 *	%rdx: src
 	 */
 
+	FRAME_BEGIN
 	movq %rsi, %r11;
 
 	vmovdqu (0*4*4)(%rdx), RL1;
@@ -420,6 +424,7 @@ ENTRY(cast5_ecb_dec_16way)
 	vmovdqu RR4, (6*4*4)(%r11);
 	vmovdqu RL4, (7*4*4)(%r11);
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ecb_dec_16way)
 
@@ -429,6 +434,7 @@ ENTRY(cast5_cbc_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -469,6 +475,7 @@ ENTRY(cast5_cbc_dec_16way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_cbc_dec_16way)
 
@@ -479,6 +486,7 @@ ENTRY(cast5_ctr_16way)
 	 *	%rdx: src
 	 *	%rcx: iv (big endian, 64bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -542,5 +550,6 @@ ENTRY(cast5_ctr_16way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ctr_16way)
diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index e3531f8..c419389 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "cast6-avx-x86_64-asm_64.S"
@@ -349,6 +350,7 @@ ENTRY(cast6_ecb_enc_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -358,6 +360,7 @@ ENTRY(cast6_ecb_enc_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ecb_enc_8way)
 
@@ -367,6 +370,7 @@ ENTRY(cast6_ecb_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -376,6 +380,7 @@ ENTRY(cast6_ecb_dec_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ecb_dec_8way)
 
@@ -385,6 +390,7 @@ ENTRY(cast6_cbc_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -399,6 +405,7 @@ ENTRY(cast6_cbc_dec_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_cbc_dec_8way)
 
@@ -409,6 +416,7 @@ ENTRY(cast6_ctr_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -424,6 +432,7 @@ ENTRY(cast6_ctr_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ctr_8way)
 
@@ -434,6 +443,7 @@ ENTRY(cast6_xts_enc_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -446,6 +456,7 @@ ENTRY(cast6_xts_enc_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_xts_enc_8way)
 
@@ -456,6 +467,7 @@ ENTRY(cast6_xts_dec_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -468,5 +480,6 @@ ENTRY(cast6_xts_dec_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_xts_dec_8way)
diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index 5d1e007..eed55c8 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -18,6 +18,7 @@
 
 #include <linux/linkage.h>
 #include <asm/inst.h>
+#include <asm/frame.h>
 
 .data
 
@@ -94,6 +95,7 @@ ENDPROC(__clmul_gf128mul_ble)
 
 /* void clmul_ghash_mul(char *dst, const u128 *shash) */
 ENTRY(clmul_ghash_mul)
+	FRAME_BEGIN
 	movups (%rdi), DATA
 	movups (%rsi), SHASH
 	movaps .Lbswap_mask, BSWAP
@@ -101,6 +103,7 @@ ENTRY(clmul_ghash_mul)
 	call __clmul_gf128mul_ble
 	PSHUFB_XMM BSWAP DATA
 	movups DATA, (%rdi)
+	FRAME_END
 	ret
 ENDPROC(clmul_ghash_mul)
 
@@ -109,6 +112,7 @@ ENDPROC(clmul_ghash_mul)
  *			   const u128 *shash);
  */
 ENTRY(clmul_ghash_update)
+	FRAME_BEGIN
 	cmp $16, %rdx
 	jb .Lupdate_just_ret	# check length
 	movaps .Lbswap_mask, BSWAP
@@ -128,5 +132,6 @@ ENTRY(clmul_ghash_update)
 	PSHUFB_XMM BSWAP DATA
 	movups DATA, (%rdi)
 .Lupdate_just_ret:
+	FRAME_END
 	ret
 ENDPROC(clmul_ghash_update)
diff --git a/arch/x86/crypto/serpent-avx-x86_64-asm_64.S b/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
index 2f202f4..8be5718 100644
--- a/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "serpent-avx-x86_64-asm_64.S"
@@ -681,6 +682,7 @@ ENTRY(serpent_ecb_enc_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -688,6 +690,7 @@ ENTRY(serpent_ecb_enc_8way_avx)
 
 	store_8way(%rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_enc_8way_avx)
 
@@ -697,6 +700,7 @@ ENTRY(serpent_ecb_dec_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -704,6 +708,7 @@ ENTRY(serpent_ecb_dec_8way_avx)
 
 	store_8way(%rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_dec_8way_avx)
 
@@ -713,6 +718,7 @@ ENTRY(serpent_cbc_dec_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -720,6 +726,7 @@ ENTRY(serpent_cbc_dec_8way_avx)
 
 	store_cbc_8way(%rdx, %rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_cbc_dec_8way_avx)
 
@@ -730,6 +737,7 @@ ENTRY(serpent_ctr_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	load_ctr_8way(%rcx, .Lbswap128_mask, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
 		      RD2, RK0, RK1, RK2);
@@ -738,6 +746,7 @@ ENTRY(serpent_ctr_8way_avx)
 
 	store_ctr_8way(%rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ctr_8way_avx)
 
@@ -748,6 +757,7 @@ ENTRY(serpent_xts_enc_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	/* regs <= src, dst <= IVs, regs <= regs xor IVs */
 	load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
@@ -758,6 +768,7 @@ ENTRY(serpent_xts_enc_8way_avx)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_enc_8way_avx)
 
@@ -768,6 +779,7 @@ ENTRY(serpent_xts_dec_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	/* regs <= src, dst <= IVs, regs <= regs xor IVs */
 	load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
@@ -778,5 +790,6 @@ ENTRY(serpent_xts_dec_8way_avx)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_dec_8way_avx)
diff --git a/arch/x86/crypto/serpent-avx2-asm_64.S b/arch/x86/crypto/serpent-avx2-asm_64.S
index b222085..97c48ad 100644
--- a/arch/x86/crypto/serpent-avx2-asm_64.S
+++ b/arch/x86/crypto/serpent-avx2-asm_64.S
@@ -15,6 +15,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx2.S"
 
 .file "serpent-avx2-asm_64.S"
@@ -673,6 +674,7 @@ ENTRY(serpent_ecb_enc_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -684,6 +686,7 @@ ENTRY(serpent_ecb_enc_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_enc_16way)
 
@@ -693,6 +696,7 @@ ENTRY(serpent_ecb_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -704,6 +708,7 @@ ENTRY(serpent_ecb_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_dec_16way)
 
@@ -713,6 +718,7 @@ ENTRY(serpent_cbc_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -725,6 +731,7 @@ ENTRY(serpent_cbc_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_cbc_dec_16way)
 
@@ -735,6 +742,7 @@ ENTRY(serpent_ctr_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -748,6 +756,7 @@ ENTRY(serpent_ctr_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ctr_16way)
 
@@ -758,6 +767,7 @@ ENTRY(serpent_xts_enc_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -772,6 +782,7 @@ ENTRY(serpent_xts_enc_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_enc_16way)
 
@@ -782,6 +793,7 @@ ENTRY(serpent_xts_dec_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -796,5 +808,6 @@ ENTRY(serpent_xts_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_dec_16way)
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
index 672eaeb..96df6a3 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
@@ -52,6 +52,7 @@
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "sha1_mb_mgr_datastruct.S"
 
 
@@ -103,6 +104,7 @@ offset = \_offset
 # JOB* sha1_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
 ENTRY(sha1_mb_mgr_flush_avx2)
+	FRAME_BEGIN
 	push	%rbx
 
 	# If bit (32+3) is set, then all lanes are empty
@@ -212,6 +214,7 @@ len_is_0:
 
 return:
 	pop	%rbx
+	FRAME_END
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index c3b9447..1435acf 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -53,6 +53,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "sha1_mb_mgr_datastruct.S"
 
 
@@ -98,6 +99,7 @@ lane_data       = %r10
 # arg 1 : rcx : state
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
+	FRAME_BEGIN
 	push	%rbx
 	push	%r12
 
@@ -192,6 +194,7 @@ len_is_0:
 return:
 	pop	%r12
 	pop	%rbx
+	FRAME_END
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/twofish-avx-x86_64-asm_64.S b/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
index 0505813..dc66273 100644
--- a/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "twofish-avx-x86_64-asm_64.S"
@@ -333,6 +334,7 @@ ENTRY(twofish_ecb_enc_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -342,6 +344,7 @@ ENTRY(twofish_ecb_enc_8way)
 
 	store_8way(%r11, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ecb_enc_8way)
 
@@ -351,6 +354,7 @@ ENTRY(twofish_ecb_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -360,6 +364,7 @@ ENTRY(twofish_ecb_dec_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ecb_dec_8way)
 
@@ -369,6 +374,7 @@ ENTRY(twofish_cbc_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -383,6 +389,7 @@ ENTRY(twofish_cbc_dec_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_cbc_dec_8way)
 
@@ -393,6 +400,7 @@ ENTRY(twofish_ctr_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -408,6 +416,7 @@ ENTRY(twofish_ctr_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ctr_8way)
 
@@ -418,6 +427,7 @@ ENTRY(twofish_xts_enc_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -430,6 +440,7 @@ ENTRY(twofish_xts_enc_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_xts_enc_8way)
 
@@ -440,6 +451,7 @@ ENTRY(twofish_xts_dec_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -452,5 +464,6 @@ ENTRY(twofish_xts_dec_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_xts_dec_8way)

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/entry: Create stack frames in thunk functions
  2016-01-21 22:49 ` [PATCH 16/33] x86/asm/entry: Create stack frames in thunk functions Josh Poimboeuf
@ 2016-02-23  9:00   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  9:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jslaby, peterz, acme, tglx, bp, mmarek, palves, mingo, torvalds,
	bernd, luto, brgerst, linux-kernel, chris.j.arges, luto, hpa,
	jpoimboe, bp, namhyung, akpm, dvlasenk

Commit-ID:  b404b2c4a1f98ad34c33f45a254c31d4fca9d91d
Gitweb:     http://git.kernel.org/tip/b404b2c4a1f98ad34c33f45a254c31d4fca9d91d
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:20 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:57 +0100

x86/asm/entry: Create stack frames in thunk functions

Thunk functions are callable non-leaf functions that don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.  Also, they
aren't annotated as ELF callable functions, which can confuse tooling.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled and
add the ELF function type.
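
To illustrate why a missing frame in a non-leaf function gives a bad
trace, here is a minimal user-space sketch in C. It is a simulation, not
kernel code: the frame records, the function names (entry, main_fn,
caller, thunk, callee) and the unwind() helper are all hypothetical. It
assumes the usual frame-pointer convention in which each function that
sets up a frame saves its caller's frame pointer next to its own return
address.

```c
#include <stddef.h>

/* One simulated frame record: a saved frame pointer plus a return address. */
struct frame {
	const struct frame *caller_bp;	/* frame pointer saved on entry */
	const char *return_to;		/* function the return address points into */
};

/*
 * Walk the saved-frame-pointer chain the way a frame-pointer unwinder
 * does: report the running function, then one return address per record.
 */
static size_t unwind(const char *current, const struct frame *bp,
		     const char **trace, size_t max)
{
	size_t n = 0;

	if (n < max)
		trace[n++] = current;
	for (; bp != NULL && n < max; bp = bp->caller_bp)
		trace[n++] = bp->return_to;
	return n;
}

/*
 * Call chain: entry -> main_fn -> caller -> thunk -> callee.
 * thunk_has_frame != 0 models a thunk that creates its own frame.
 */
size_t trace_from_callee(int thunk_has_frame, const char **trace, size_t max)
{
	const struct frame f_main   = { NULL,      "entry"   };
	const struct frame f_caller = { &f_main,   "main_fn" };
	const struct frame f_thunk  = { &f_caller, "caller"  };
	/* Without its own record, thunk leaves the frame pointer at f_caller. */
	const struct frame *bp_in_thunk = thunk_has_frame ? &f_thunk : &f_caller;
	const struct frame f_callee = { bp_in_thunk, "thunk" };

	return unwind("callee", &f_callee, trace, max);
}
```

With thunk_has_frame set, the simulated trace lists callee, thunk,
caller, main_fn, entry.  With it cleared, "caller" drops out of the
trace: the thunk itself is still reported, but the function that called
it is skipped.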

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/4373e5bff459b9fd66ce5d45bfcc881a5c202643.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/thunk_64.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index efb2b93..98df1fa 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -8,11 +8,14 @@
 #include <linux/linkage.h>
 #include "calling.h"
 #include <asm/asm.h>
+#include <asm/frame.h>
 
 	/* rdi:	arg1 ... normal C conventions. rax is saved/restored. */
 	.macro THUNK name, func, put_ret_addr_in_rdi=0
 	.globl \name
+	.type \name, @function
 \name:
+	FRAME_BEGIN
 
 	/* this one pushes 9 elems, the next one would be %rIP */
 	pushq %rdi
@@ -62,6 +65,7 @@ restore:
 	popq %rdx
 	popq %rsi
 	popq %rdi
+	FRAME_END
 	ret
 	_ASM_NOKPROBE(restore)
 #endif

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()
  2016-01-21 22:49 ` [PATCH 17/33] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel() Josh Poimboeuf
@ 2016-02-23  9:00   ` tip-bot for Josh Poimboeuf
  2016-02-23 11:39     ` Pavel Machek
  2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  1 sibling, 1 reply; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  9:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, jpoimboe, bernd, pavel, brgerst, acme, mmarek, dvlasenk,
	len.brown, linux-kernel, rafael.j.wysocki, mingo, luto, bp,
	jslaby, bp, akpm, palves, namhyung, chris.j.arges, tglx,
	torvalds, luto, hpa

Commit-ID:  4b0cf693ed225b60aea8835361a1c8f85f9f9041
Gitweb:     http://git.kernel.org/tip/4b0cf693ed225b60aea8835361a1c8f85f9f9041
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:21 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:57 +0100

x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()

do_suspend_lowlevel() is a callable non-leaf function which doesn't
honor CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame for it when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/7383d87dd40a460e0d757a0793498b9d06a7ee0d.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/acpi/wakeup_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 8c35df4..169963f 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -5,6 +5,7 @@
 #include <asm/page_types.h>
 #include <asm/msr.h>
 #include <asm/asm-offsets.h>
+#include <asm/frame.h>
 
 # Copyright 2003 Pavel Machek <pavel@suse.cz>, distribute under GPLv2
 
@@ -39,6 +40,7 @@ bogus_64_magic:
 	jmp	bogus_64_magic
 
 ENTRY(do_suspend_lowlevel)
+	FRAME_BEGIN
 	subq	$8, %rsp
 	xorl	%eax, %eax
 	call	save_processor_state
@@ -109,6 +111,7 @@ ENTRY(do_suspend_lowlevel)
 
 	xorl	%eax, %eax
 	addq	$8, %rsp
+	FRAME_END
 	jmp	restore_processor_state
 ENDPROC(do_suspend_lowlevel)
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm: Create stack frames in rwsem functions
  2016-01-21 22:49 ` [PATCH 18/33] x86/asm: Create stack frames in rwsem functions Josh Poimboeuf
@ 2016-02-23  9:01   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  9:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, brgerst, hpa, bp, mingo, linux-kernel, dvlasenk, bp,
	jpoimboe, palves, namhyung, chris.j.arges, luto, tglx, mmarek,
	torvalds, luto, akpm, jslaby, peterz, bernd

Commit-ID:  8576cf762b6ab9d53b444262e2d3ba2a7d381c18
Gitweb:     http://git.kernel.org/tip/8576cf762b6ab9d53b444262e2d3ba2a7d381c18
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:22 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:58 +0100

x86/asm: Create stack frames in rwsem functions

rwsem.S has several callable non-leaf functions which don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.
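
In the diff below, the "1:" label in call_rwsem_wake moves onto the
FRAME_END line, so the early exit taken by "jnz 1f" also pops the saved
frame pointer before returning.  The following C sketch models why that
matters.  It is a rough illustration only: the op enum, the path arrays
and path_is_balanced() are hypothetical names, register saves are left
out, and this is not how stacktool itself is implemented.

```c
#include <stddef.h>

/* The only events this model cares about on a path through a function. */
enum op { OP_FRAME_BEGIN, OP_FRAME_END, OP_CALL, OP_RET };

/*
 * Return 1 if the path reaches OP_RET with every frame pointer pushed by
 * OP_FRAME_BEGIN popped again by OP_FRAME_END, and 0 otherwise.
 */
int path_is_balanced(const enum op *path, size_t len)
{
	int depth = 0;

	for (size_t i = 0; i < len; i++) {
		switch (path[i]) {
		case OP_FRAME_BEGIN:
			depth++;
			break;
		case OP_FRAME_END:
			if (depth == 0)
				return 0;	/* pop without a matching push */
			depth--;
			break;
		case OP_CALL:
			break;	/* the callee removes its own return address */
		case OP_RET:
			return depth == 0;
		}
	}
	return 0;	/* the path never returned */
}

/* Early exit with the label on the FRAME_END line, as in the patch. */
const enum op wake_early_exit_patched[] = {
	OP_FRAME_BEGIN, OP_FRAME_END, OP_RET
};
/* Hypothetical mistake: label left on "ret", so FRAME_END is skipped. */
const enum op wake_early_exit_misplaced[] = {
	OP_FRAME_BEGIN, OP_RET
};
/* Path that actually calls rwsem_wake. */
const enum op wake_slow_path[] = {
	OP_FRAME_BEGIN, OP_CALL, OP_FRAME_END, OP_RET
};
```

Only the misplaced variant is reported as unbalanced: it would execute
"ret" with the saved frame pointer still on top of the stack.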

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/ad0932bbead975b15f9578e4f2cf2ee5961eb840.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/lib/rwsem.S | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/rwsem.S b/arch/x86/lib/rwsem.S
index 40027db..be110ef 100644
--- a/arch/x86/lib/rwsem.S
+++ b/arch/x86/lib/rwsem.S
@@ -15,6 +15,7 @@
 
 #include <linux/linkage.h>
 #include <asm/alternative-asm.h>
+#include <asm/frame.h>
 
 #define __ASM_HALF_REG(reg)	__ASM_SEL(reg, e##reg)
 #define __ASM_HALF_SIZE(inst)	__ASM_SEL(inst##w, inst##l)
@@ -84,24 +85,29 @@
 
 /* Fix up special calling conventions */
 ENTRY(call_rwsem_down_read_failed)
+	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
 	movq %rax,%rdi
 	call rwsem_down_read_failed
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_down_read_failed)
 
 ENTRY(call_rwsem_down_write_failed)
+	FRAME_BEGIN
 	save_common_regs
 	movq %rax,%rdi
 	call rwsem_down_write_failed
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_down_write_failed)
 
 ENTRY(call_rwsem_wake)
+	FRAME_BEGIN
 	/* do nothing if still outstanding active readers */
 	__ASM_HALF_SIZE(dec) %__ASM_HALF_REG(dx)
 	jnz 1f
@@ -109,15 +115,18 @@ ENTRY(call_rwsem_wake)
 	movq %rax,%rdi
 	call rwsem_wake
 	restore_common_regs
-1:	ret
+1:	FRAME_END
+	ret
 ENDPROC(call_rwsem_wake)
 
 ENTRY(call_rwsem_downgrade_wake)
+	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
 	movq %rax,%rdi
 	call rwsem_downgrade_wake
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_downgrade_wake)

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/efi: Create a stack frame in efi_call()
  2016-01-21 22:49 ` [PATCH 19/33] x86/asm/efi: Create a stack frame in efi_call() Josh Poimboeuf
@ 2016-02-23  9:01   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  9:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, chris.j.arges, linux-kernel, namhyung, tglx, bp,
	mmarek, dvlasenk, jslaby, luto, palves, hpa, matt, akpm, peterz,
	acme, bp, mingo, luto, bernd, brgerst, jpoimboe

Commit-ID:  54bb55fa2922da76d3856b465c1b22f242d42daa
Gitweb:     http://git.kernel.org/tip/54bb55fa2922da76d3856b465c1b22f242d42daa
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:23 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:58 +0100

x86/asm/efi: Create a stack frame in efi_call()

efi_call() is a callable non-leaf function which doesn't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame for it when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/2294b6fad60eea4cc862eddc8e98a1324e6eeeca.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/platform/efi/efi_stub_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S
index 86d0f9e..0df2dcc 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -11,6 +11,7 @@
 #include <asm/msr.h>
 #include <asm/processor-flags.h>
 #include <asm/page_types.h>
+#include <asm/frame.h>
 
 #define SAVE_XMM			\
 	mov %rsp, %rax;			\
@@ -74,6 +75,7 @@
 	.endm
 
 ENTRY(efi_call)
+	FRAME_BEGIN
 	SAVE_XMM
 	mov (%rsp), %rax
 	mov 8(%rax), %rax
@@ -88,6 +90,7 @@ ENTRY(efi_call)
 	RESTORE_PGT
 	addq $48, %rsp
 	RESTORE_XMM
+	FRAME_END
 	ret
 ENDPROC(efi_call)
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/power: Create stack frames in hibernate_asm_64.S
  2016-01-21 22:49 ` [PATCH 20/33] x86/asm/power: Create stack frames in hibernate_asm_64.S Josh Poimboeuf
@ 2016-02-23  9:01   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:50   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  9:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, tglx, torvalds, palves, jpoimboe, bp, luto, rafael.j.wysocki,
	hpa, mingo, brgerst, luto, pavel, namhyung, jslaby,
	chris.j.arges, peterz, bernd, dvlasenk, linux-kernel, mmarek,
	acme, akpm

Commit-ID:  6f9e5b9ab3161a77ed403af8d7a9ab0a7bab1c88
Gitweb:     http://git.kernel.org/tip/6f9e5b9ab3161a77ed403af8d7a9ab0a7bab1c88
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:24 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:58 +0100

x86/asm/power: Create stack frames in hibernate_asm_64.S

swsusp_arch_suspend() and restore_registers() are callable non-leaf
functions which don't honor CONFIG_FRAME_POINTER, which can result in
bad stack traces.  Also, they aren't annotated as ELF callable
functions, which can confuse tooling.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled and
give them proper ELF function annotations.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/bdad00205897dc707aebe9e9e39757085e2bf999.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/power/hibernate_asm_64.S | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index e2386cb..4400a43 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -21,8 +21,10 @@
 #include <asm/page_types.h>
 #include <asm/asm-offsets.h>
 #include <asm/processor-flags.h>
+#include <asm/frame.h>
 
 ENTRY(swsusp_arch_suspend)
+	FRAME_BEGIN
 	movq	$saved_context, %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
@@ -50,7 +52,9 @@ ENTRY(swsusp_arch_suspend)
 	movq	%rax, restore_cr3(%rip)
 
 	call swsusp_save
+	FRAME_END
 	ret
+ENDPROC(swsusp_arch_suspend)
 
 ENTRY(restore_image)
 	/* switch to temporary page tables */
@@ -107,6 +111,7 @@ ENTRY(core_restore_code)
 	 */
 
 ENTRY(restore_registers)
+	FRAME_BEGIN
 	/* go back to the original page tables */
 	movq    %rbx, %cr3
 
@@ -147,4 +152,6 @@ ENTRY(restore_registers)
 	/* tell the hibernation core that we've just restored the memory */
 	movq	%rax, in_suspend(%rip)
 
+	FRAME_END
 	ret
+ENDPROC(restore_registers)

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/uaccess: Add stack frame output operand in get_user() inline asm
  2016-01-21 22:49 ` [PATCH 21/33] x86/uaccess: Add stack frame output operand in get_user inline asm Josh Poimboeuf
@ 2016-02-23  9:02   ` tip-bot for Chris J Arges
  2016-02-25  5:50   ` tip-bot for Chris J Arges
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Chris J Arges @ 2016-02-23  9:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, palves, brgerst, acme, akpm, peterz, mmarek, bernd, bp,
	jslaby, luto, jpoimboe, luto, chris.j.arges, namhyung, torvalds,
	mingo, linux-kernel, hpa, bp, dvlasenk

Commit-ID:  5e947b38f8ead33bf6aaecc2af20a0a3a988fd02
Gitweb:     http://git.kernel.org/tip/5e947b38f8ead33bf6aaecc2af20a0a3a988fd02
Author:     Chris J Arges <chris.j.arges@canonical.com>
AuthorDate: Thu, 21 Jan 2016 16:49:25 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:58 +0100

x86/uaccess: Add stack frame output operand in get_user() inline asm

stacktool reports numerous 'call without frame pointer save/setup'
warnings for functions that use the get_user() macro.  Bad stack traces
could occur because the stack frame setup code is missing or misplaced.

When CONFIG_FRAME_POINTER is enabled, force a stack frame to be created
before the inline asm code by listing the stack pointer as an output
operand of the get_user() inline assembly statement.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/bc85501f221ee512670797c7f110022e64b12c81.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/uaccess.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index a4a30e4..9bbb3b2 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -179,10 +179,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 ({									\
 	int __ret_gu;							\
 	register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);		\
+	register void *__sp asm(_ASM_SP);				\
 	__chk_user_ptr(ptr);						\
 	might_fault();							\
-	asm volatile("call __get_user_%P3"				\
-		     : "=a" (__ret_gu), "=r" (__val_gu)			\
+	asm volatile("call __get_user_%P4"				\
+		     : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp)	\
 		     : "0" (ptr), "i" (sizeof(*(ptr))));		\
 	(x) = (__force __typeof__(*(ptr))) __val_gu;			\
 	__builtin_expect(__ret_gu, 0);					\

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/bpf: Annotate callable functions
  2016-01-21 22:49 ` [PATCH 22/33] x86/asm/bpf: Annotate callable functions Josh Poimboeuf
@ 2016-02-23  9:02   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:50   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-23  9:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, tglx, peterz, bernd, jpoimboe, akpm, jslaby, acme, hpa,
	namhyung, mmarek, mingo, linux-kernel, bp, palves, luto, brgerst,
	chris.j.arges, luto, ast, torvalds

Commit-ID:  ebcc476acbda75cb1a2020677c38b436085173ba
Gitweb:     http://git.kernel.org/tip/ebcc476acbda75cb1a2020677c38b436085173ba
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:26 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:59 +0100

x86/asm/bpf: Annotate callable functions

bpf_jit.S has several functions which can be called from C code.  Give
them proper ELF annotations.
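
As a sketch of what the annotation changes at the ELF level: a label
that is only made global ends up as a symbol of type "no type", while
adding ".type name, @function" marks it as a function symbol.  The C
snippet below encodes both cases using the st_info layout from the ELF
specification.  The helper names and the tool_would_analyse() filter are
hypothetical and only stand in for any tool that looks at function
symbols alone.

```c
#include <stdint.h>

/* Symbol binding and type values as defined by the ELF specification. */
enum { SYM_BIND_GLOBAL = 1 };
enum { SYM_TYPE_NOTYPE = 0, SYM_TYPE_FUNC = 2 };

/* st_info packs the binding into the high nibble, the type into the low. */
static uint8_t sym_info(unsigned bind, unsigned type)
{
	return (uint8_t)((bind << 4) | (type & 0xf));
}

static unsigned sym_type(uint8_t st_info)
{
	return st_info & 0xf;
}

/* ".globl name" followed by "name:" and nothing else. */
uint8_t info_globl_only(void)
{
	return sym_info(SYM_BIND_GLOBAL, SYM_TYPE_NOTYPE);
}

/* The same label with ".type name, @function" added. */
uint8_t info_with_function_type(void)
{
	return sym_info(SYM_BIND_GLOBAL, SYM_TYPE_FUNC);
}

/* Hypothetical filter: a tool that only inspects function symbols. */
int tool_would_analyse(uint8_t st_info)
{
	return sym_type(st_info) == SYM_TYPE_FUNC;
}
```

Under this model the untyped labels are invisible to such a tool, which
is one way missing annotations can confuse tooling.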

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/bbe1de0c299fecd4fc9a1766bae8be2647bedb01.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/net/bpf_jit.S | 39 ++++++++++++++++-----------------------
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
index 4093216..eb4a3bd 100644
--- a/arch/x86/net/bpf_jit.S
+++ b/arch/x86/net/bpf_jit.S
@@ -22,15 +22,16 @@
 	32 /* space for rbx,r13,r14,r15 */ + \
 	8 /* space for skb_copy_bits */)
 
-sk_load_word:
-	.globl	sk_load_word
+#define FUNC(name) \
+	.globl name; \
+	.type name, @function; \
+	name:
 
+FUNC(sk_load_word)
 	test	%esi,%esi
 	js	bpf_slow_path_word_neg
 
-sk_load_word_positive_offset:
-	.globl	sk_load_word_positive_offset
-
+FUNC(sk_load_word_positive_offset)
 	mov	%r9d,%eax		# hlen
 	sub	%esi,%eax		# hlen - offset
 	cmp	$3,%eax
@@ -39,15 +40,11 @@ sk_load_word_positive_offset:
 	bswap   %eax  			/* ntohl() */
 	ret
 
-sk_load_half:
-	.globl	sk_load_half
-
+FUNC(sk_load_half)
 	test	%esi,%esi
 	js	bpf_slow_path_half_neg
 
-sk_load_half_positive_offset:
-	.globl	sk_load_half_positive_offset
-
+FUNC(sk_load_half_positive_offset)
 	mov	%r9d,%eax
 	sub	%esi,%eax		#	hlen - offset
 	cmp	$1,%eax
@@ -56,15 +53,11 @@ sk_load_half_positive_offset:
 	rol	$8,%ax			# ntohs()
 	ret
 
-sk_load_byte:
-	.globl	sk_load_byte
-
+FUNC(sk_load_byte)
 	test	%esi,%esi
 	js	bpf_slow_path_byte_neg
 
-sk_load_byte_positive_offset:
-	.globl	sk_load_byte_positive_offset
-
+FUNC(sk_load_byte_positive_offset)
 	cmp	%esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
 	jle	bpf_slow_path_byte
 	movzbl	(SKBDATA,%rsi),%eax
@@ -120,8 +113,8 @@ bpf_slow_path_byte:
 bpf_slow_path_word_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi	/* test range */
 	jl	bpf_error	/* offset lower -> error  */
-sk_load_word_negative_offset:
-	.globl	sk_load_word_negative_offset
+
+FUNC(sk_load_word_negative_offset)
 	sk_negative_common(4)
 	mov	(%rax), %eax
 	bswap	%eax
@@ -130,8 +123,8 @@ sk_load_word_negative_offset:
 bpf_slow_path_half_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
-sk_load_half_negative_offset:
-	.globl	sk_load_half_negative_offset
+
+FUNC(sk_load_half_negative_offset)
 	sk_negative_common(2)
 	mov	(%rax),%ax
 	rol	$8,%ax
@@ -141,8 +134,8 @@ sk_load_half_negative_offset:
 bpf_slow_path_byte_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
-sk_load_byte_negative_offset:
-	.globl	sk_load_byte_negative_offset
+
+FUNC(sk_load_byte_negative_offset)
 	sk_negative_common(1)
 	movzbl	(%rax), %eax
 	ret


* [tip:x86/debug] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-21 22:49 ` [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S Josh Poimboeuf
  2016-01-22  2:44   ` Alexei Starovoitov
@ 2016-02-23  9:03   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  9:03 UTC
  To: linux-tip-commits@vger.kernel.org
  Cc: luto, dvlasenk, bp, hpa, palves, luto, jslaby, ast, tglx, acme,
	torvalds, mmarek, brgerst, peterz, namhyung, bernd, jpoimboe,
	mingo, akpm, linux-kernel, chris.j.arges

Commit-ID:  45670be075ce96566bc6b6ca0b579f17ed6f94f3
Gitweb:     http://git.kernel.org/tip/45670be075ce96566bc6b6ca0b579f17ed6f94f3
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:27 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:59 +0100

x86/asm/bpf: Create stack frames in bpf_jit.S

bpf_jit.S has several callable non-leaf functions which don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame before the call instructions when
CONFIG_FRAME_POINTER is enabled.
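
To illustrate why a frameless non-leaf function gives a bad trace, here
is a minimal C model of a frame pointer walk.  It is only a sketch: the
call chain start -> main -> A -> F -> G and every demo_* name are
hypothetical and do not come from bpf_jit.S.  F stands in for an asm
helper that makes a call without first saving the frame pointer.

```c
#include <stddef.h>

/* Hypothetical call chain: start -> main -> A -> F -> G. */
enum demo_fn { DEMO_START, DEMO_MAIN, DEMO_A, DEMO_F, DEMO_G };

/* One frame record: saved frame pointer plus the return address above it. */
struct demo_frame {
	int saved_fp;		/* index of the caller's record, -1 at the bottom */
	enum demo_fn ret_into;	/* function the return address points into */
};

/* Follow the saved frame pointer chain, collecting return addresses. */
static size_t demo_walk(const struct demo_frame *stack, int fp,
			enum demo_fn *out, size_t max)
{
	size_t n = 0;

	while (fp >= 0 && n < max) {
		out[n++] = stack[fp].ret_into;
		fp = stack[fp].saved_fp;
	}
	return n;
}

/* Build the stack as seen from inside G, then unwind it. */
static size_t demo_trace(int f_has_frame, enum demo_fn *out, size_t max)
{
	struct demo_frame stack[4];
	int fp = -1, top = 0;

	stack[top] = (struct demo_frame){ fp, DEMO_START }; fp = top++; /* main */
	stack[top] = (struct demo_frame){ fp, DEMO_MAIN };  fp = top++; /* A */
	if (f_has_frame) {
		stack[top] = (struct demo_frame){ fp, DEMO_A }; fp = top++; /* F */
	}
	/* G saves whatever frame pointer it inherited, whoever set it. */
	stack[top] = (struct demo_frame){ fp, DEMO_F };     fp = top++; /* G */

	return demo_walk(stack, fp, out, max);
}
```

In this model, demo_trace(1, ...) yields F, A, main, start, while
demo_trace(0, ...) yields F, main, start: A, the caller of the frameless
function, drops out of the trace.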

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/fa4c41976b438b51954cb8021f06bceb1d1d66cc.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/net/bpf_jit.S | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
index eb4a3bd..f2a7faf 100644
--- a/arch/x86/net/bpf_jit.S
+++ b/arch/x86/net/bpf_jit.S
@@ -8,6 +8,7 @@
  * of the License.
  */
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 /*
  * Calling convention :
@@ -65,16 +66,18 @@ FUNC(sk_load_byte_positive_offset)
 
 /* rsi contains offset and can be scratched */
 #define bpf_slow_path_common(LEN)		\
+	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
+	FRAME_BEGIN;				\
 	mov	%rbx, %rdi; /* arg1 == skb */	\
 	push	%r9;				\
 	push	SKBDATA;			\
 /* rsi already has offset */			\
 	mov	$LEN,%ecx;	/* len */	\
-	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
 	call	skb_copy_bits;			\
 	test    %eax,%eax;			\
 	pop	SKBDATA;			\
-	pop	%r9;
+	pop	%r9;				\
+	FRAME_END
 
 
 bpf_slow_path_word:
@@ -99,6 +102,7 @@ bpf_slow_path_byte:
 	ret
 
 #define sk_negative_common(SIZE)				\
+	FRAME_BEGIN;						\
 	mov	%rbx, %rdi; /* arg1 == skb */			\
 	push	%r9;						\
 	push	SKBDATA;					\
@@ -108,6 +112,7 @@ bpf_slow_path_byte:
 	test	%rax,%rax;					\
 	pop	SKBDATA;					\
 	pop	%r9;						\
+	FRAME_END;						\
 	jz	bpf_error
 
 bpf_slow_path_word_neg:


* [tip:x86/debug] x86/kprobes: Get rid of kretprobe_trampoline_holder()
  2016-01-21 22:49 ` [PATCH 24/33] x86/kprobes: Get rid of kretprobe_trampoline_holder() Josh Poimboeuf
  2016-01-21 23:42   ` 平松雅巳 / HIRAMATU,MASAMI
@ 2016-02-23  9:03   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  9:03 UTC
  To: linux-tip-commits@vger.kernel.org
  Cc: bp, linux-kernel, mingo, peterz, tglx, chris.j.arges, mmarek,
	hpa, dvlasenk, brgerst, bernd, ananth, acme, masami.hiramatsu.pt,
	palves, akpm, luto, torvalds, jpoimboe, luto,
	anil.s.keshavamurthy, namhyung, davem, jslaby

Commit-ID:  e3f26d20d60297e4c37999525c020177cc62faac
Gitweb:     http://git.kernel.org/tip/e3f26d20d60297e4c37999525c020177cc62faac
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:28 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:03:59 +0100

x86/kprobes: Get rid of kretprobe_trampoline_holder()

The kretprobe_trampoline_holder() wrapper around kretprobe_trampoline()
isn't used anywhere and adds some unnecessary frame pointer instructions
which never execute.  Instead, just make kretprobe_trampoline() a proper
ELF function.
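
For readers unfamiliar with what "a proper ELF function" means at the
symbol table level, the sketch below decodes the st_info byte of an ELF
symbol.  The bit layout and the STT_NOTYPE/STT_FUNC/STB_GLOBAL values
follow the ELF specification; the demo_* names are hypothetical, and
whether a given tool checks exactly this field is an assumption.

```c
#include <stdint.h>

/* Values from the ELF specification, renamed to avoid clashing with <elf.h>. */
enum { DEMO_STB_GLOBAL = 1 };
enum { DEMO_STT_NOTYPE = 0, DEMO_STT_FUNC = 2 };

/* st_info packs the binding in the high nibble and the type in the low one. */
static inline uint8_t demo_st_info(unsigned bind, unsigned type)
{
	return (uint8_t)((bind << 4) | (type & 0xf));
}

static inline unsigned demo_st_type(uint8_t st_info)
{
	return st_info & 0xf;
}

static inline unsigned demo_st_bind(uint8_t st_info)
{
	return st_info >> 4;
}

/* A symbol counts as a function only if its type is STT_FUNC. */
static inline int demo_is_function(uint8_t st_info)
{
	return demo_st_type(st_info) == DEMO_STT_FUNC;
}
```

A bare global label is emitted as STT_NOTYPE (st_info 0x10 here), while
the same label with ".type name, @function" becomes STT_FUNC (0x12).  The
".size" directive additionally records the symbol's length.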

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/92d921b102fb865a7c254cfde9e4a0a72b9a781e.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/kprobes/core.c | 57 +++++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 1deffe6..5b187df 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -671,38 +671,37 @@ NOKPROBE_SYMBOL(kprobe_int3_handler);
  * When a retprobed function returns, this code saves registers and
  * calls trampoline_handler() runs, which calls the kretprobe's handler.
  */
-static void __used kretprobe_trampoline_holder(void)
-{
-	asm volatile (
-			".global kretprobe_trampoline\n"
-			"kretprobe_trampoline: \n"
+asm(
+	".global kretprobe_trampoline\n"
+	".type kretprobe_trampoline, @function\n"
+	"kretprobe_trampoline:\n"
 #ifdef CONFIG_X86_64
-			/* We don't bother saving the ss register */
-			"	pushq %rsp\n"
-			"	pushfq\n"
-			SAVE_REGS_STRING
-			"	movq %rsp, %rdi\n"
-			"	call trampoline_handler\n"
-			/* Replace saved sp with true return address. */
-			"	movq %rax, 152(%rsp)\n"
-			RESTORE_REGS_STRING
-			"	popfq\n"
+	/* We don't bother saving the ss register */
+	"	pushq %rsp\n"
+	"	pushfq\n"
+	SAVE_REGS_STRING
+	"	movq %rsp, %rdi\n"
+	"	call trampoline_handler\n"
+	/* Replace saved sp with true return address. */
+	"	movq %rax, 152(%rsp)\n"
+	RESTORE_REGS_STRING
+	"	popfq\n"
 #else
-			"	pushf\n"
-			SAVE_REGS_STRING
-			"	movl %esp, %eax\n"
-			"	call trampoline_handler\n"
-			/* Move flags to cs */
-			"	movl 56(%esp), %edx\n"
-			"	movl %edx, 52(%esp)\n"
-			/* Replace saved flags with true return address. */
-			"	movl %eax, 56(%esp)\n"
-			RESTORE_REGS_STRING
-			"	popf\n"
+	"	pushf\n"
+	SAVE_REGS_STRING
+	"	movl %esp, %eax\n"
+	"	call trampoline_handler\n"
+	/* Move flags to cs */
+	"	movl 56(%esp), %edx\n"
+	"	movl %edx, 52(%esp)\n"
+	/* Replace saved flags with true return address. */
+	"	movl %eax, 56(%esp)\n"
+	RESTORE_REGS_STRING
+	"	popf\n"
 #endif
-			"	ret\n");
-}
-NOKPROBE_SYMBOL(kretprobe_trampoline_holder);
+	"	ret\n"
+	".size kretprobe_trampoline, .-kretprobe_trampoline\n"
+);
 NOKPROBE_SYMBOL(kretprobe_trampoline);
 
 /*


* [tip:x86/debug] x86/kvm: Set ELF function type for fastop functions
  2016-01-21 22:49 ` [PATCH 25/33] x86/kvm: Set ELF function type for fastop functions Josh Poimboeuf
  2016-01-22 10:05   ` Paolo Bonzini
@ 2016-02-23  9:03   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  9:03 UTC
  To: linux-tip-commits@vger.kernel.org
  Cc: dvlasenk, palves, torvalds, jslaby, jpoimboe, luto, brgerst,
	gleb, peterz, pbonzini, tglx, mingo, acme, mmarek, luto,
	namhyung, bernd, chris.j.arges, linux-kernel, bp, akpm, hpa

Commit-ID:  1095da24b338ace27160c3a74b66b41a3ba6c58e
Gitweb:     http://git.kernel.org/tip/1095da24b338ace27160c3a74b66b41a3ba6c58e
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:29 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:04:00 +0100

x86/kvm: Set ELF function type for fastop functions

The callable functions created with the FOP* and FASTOP* macros are
missing ELF function annotations, which confuses tools like stacktool.
Properly annotate them.

This adds some additional labels to the assembly, but the generated
binary code is unchanged (with the exception of instructions which have
embedded references to __LINE__).
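
The string pasting these macros rely on can be tried outside the kernel.
The following is a sketch modeled on the FOP_FUNC/FOP2E pattern in the
diff below; the DEMO_* names, the alignment value 8, and the add/ax/dx
operands are assumptions chosen for illustration, not values taken from
emulate.c.

```c
#include <string.h>

/* Two-level stringify so that macro arguments are expanded first. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)

#define DEMO_FASTOP_SIZE 8	/* hypothetical alignment */

/* Emit alignment, the ELF function annotation, and the label itself. */
#define DEMO_FOP_FUNC(name) \
	".align " DEMO_STR(DEMO_FASTOP_SIZE) " \n\t" \
	".type " name ", @function \n\t" \
	name ":\n\t"

/* Two-operand stub: the label is built from the op and register names. */
#define DEMO_FOP2E(op, dst, src) \
	DEMO_FOP_FUNC(#op "_" #dst "_" #src) \
	#op " %" #src ", %" #dst " \n\t" "ret \n\t"

/* The assembler text that one expansion would hand to asm(). */
static const char demo_add_ax_dx[] = DEMO_FOP2E(add, ax, dx);
```

Adjacent string literals are concatenated by the compiler, so the label
name appears both in the ".type" directive and as the label, which is how
a single macro argument can annotate each stub.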

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/e399651c89ace54906c203c0557f66ed6ea3ce8d.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kvm/emulate.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 1505587..aa4d726 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -309,23 +309,29 @@ static void invalidate_registers(struct x86_emulate_ctxt *ctxt)
 
 static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 
-#define FOP_ALIGN ".align " __stringify(FASTOP_SIZE) " \n\t"
+#define FOP_FUNC(name) \
+	".align " __stringify(FASTOP_SIZE) " \n\t" \
+	".type " name ", @function \n\t" \
+	name ":\n\t"
+
 #define FOP_RET   "ret \n\t"
 
 #define FOP_START(op) \
 	extern void em_##op(struct fastop *fake); \
 	asm(".pushsection .text, \"ax\" \n\t" \
 	    ".global em_" #op " \n\t" \
-            FOP_ALIGN \
-	    "em_" #op ": \n\t"
+	    FOP_FUNC("em_" #op)
 
 #define FOP_END \
 	    ".popsection")
 
-#define FOPNOP() FOP_ALIGN FOP_RET
+#define FOPNOP() \
+	FOP_FUNC(__stringify(__UNIQUE_ID(nop))) \
+	FOP_RET
 
 #define FOP1E(op,  dst) \
-	FOP_ALIGN "10: " #op " %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst) \
+	"10: " #op " %" #dst " \n\t" FOP_RET
 
 #define FOP1EEX(op,  dst) \
 	FOP1E(op, dst) _ASM_EXTABLE(10b, kvm_fastop_exception)
@@ -357,7 +363,8 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 #define FOP2E(op,  dst, src)	   \
-	FOP_ALIGN #op " %" #src ", %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst "_" #src) \
+	#op " %" #src ", %" #dst " \n\t" FOP_RET
 
 #define FASTOP2(op) \
 	FOP_START(op) \
@@ -395,7 +402,8 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 #define FOP3E(op,  dst, src, src2) \
-	FOP_ALIGN #op " %" #src2 ", %" #src ", %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst "_" #src "_" #src2) \
+	#op " %" #src2 ", %" #src ", %" #dst " \n\t" FOP_RET
 
 /* 3-operand, word-only, src2=cl */
 #define FASTOP3WCL(op) \
@@ -407,7 +415,12 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 /* Special case for SETcc - 1 instruction per cc */
-#define FOP_SETCC(op) ".align 4; " #op " %al; ret \n\t"
+#define FOP_SETCC(op) \
+	".align 4 \n\t" \
+	".type " #op ", @function \n\t" \
+	#op ": \n\t" \
+	#op " %al \n\t" \
+	FOP_RET
 
 asm(".global kvm_fastop_exception \n"
     "kvm_fastop_exception: xor %esi, %esi; ret");


* [tip:x86/debug] x86/kvm: Make test_cc() always inline
  2016-01-22 16:16       ` [PATCH v16.1 26/33] x86/kvm: Make test_cc() always inline Josh Poimboeuf
@ 2016-02-23  9:04         ` tip-bot for Josh Poimboeuf
  2016-02-25  5:52         ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  9:04 UTC
  To: linux-tip-commits@vger.kernel.org
  Cc: tglx, namhyung, pbonzini, linux-kernel, gleb, jpoimboe, brgerst,
	hpa, mmarek, peterz, bernd, jslaby, mingo, luto, torvalds, akpm,
	acme, palves, chris.j.arges, luto, bp, dvlasenk

Commit-ID:  dcf6fdbaa99fc3a3e38797e6d596c5241de18b9b
Gitweb:     http://git.kernel.org/tip/dcf6fdbaa99fc3a3e38797e6d596c5241de18b9b
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Fri, 22 Jan 2016 10:16:12 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:04:00 +0100

x86/kvm: Make test_cc() always inline

With some configs (including allyesconfig), gcc doesn't inline
test_cc().  When that happens, test_cc() doesn't create a stack frame
before inserting the inline asm call instruction.  This breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.

Force it to always be inlined so that its containing function's stack
frame can be used.
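
As a rough illustration of the attribute involved, the sketch below
defines a forced-inline helper that computes the same kind of stub offset
test_cc() uses, 4 * (condition & 0xf).  It assumes GCC or Clang; the
demo_* names are hypothetical, and the stride of 4 is taken from the
".align 4" SETcc stubs and the expression in the diff below.

```c
/* Userspace stand-in for the kernel's __always_inline (GCC/Clang only). */
#define demo_always_inline inline __attribute__((__always_inline__))

/* Each SETcc stub occupies a 4-byte slot, one slot per condition code. */
enum { DEMO_SETCC_STRIDE = 4 };

/*
 * Byte offset of the stub for a condition code.  Only the low four bits
 * select a stub; forcing the helper inline means any code it expands to
 * runs inside the caller's stack frame.
 */
static demo_always_inline unsigned int demo_setcc_offset(unsigned int condition)
{
	return DEMO_SETCC_STRIDE * (condition & 0xf);
}
```

Plain "inline" is only a hint that the compiler may ignore under some
configurations, which is the situation the changelog describes; the
attribute removes that choice.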

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20160122161612.GE20502@treble.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kvm/emulate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index aa4d726..80363eb 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -969,7 +969,7 @@ static int em_bsr_c(struct x86_emulate_ctxt *ctxt)
 	return fastop(ctxt, em_bsr);
 }
 
-static u8 test_cc(unsigned int condition, unsigned long flags)
+static __always_inline u8 test_cc(unsigned int condition, unsigned long flags)
 {
 	u8 rc;
 	void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);


* [tip:x86/debug] watchdog/hpwdt: Create stack frame in asminline_call()
  2016-01-21 22:49 ` [PATCH 27/33] watchdog/hpwdt: Create stack frame in asminline_call() Josh Poimboeuf
@ 2016-02-23  9:04   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:52   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  9:04 UTC
  To: linux-tip-commits@vger.kernel.org
  Cc: hpa, acme, torvalds, jslaby, brgerst, mmarek, luto, jpoimboe,
	linux-kernel, luto, bp, bernd, namhyung, tglx, akpm, mingo, wim,
	chris.j.arges, palves, peterz, dvlasenk, linux

Commit-ID:  a216c875a01520896c7ed9ea43779a3da4103ee9
Gitweb:     http://git.kernel.org/tip/a216c875a01520896c7ed9ea43779a3da4103ee9
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:31 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:04:00 +0100

watchdog/hpwdt: Create stack frame in asminline_call()

asminline_call() is a callable non-leaf function which doesn't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: linux-watchdog@vger.kernel.org
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/60de3cfb6f16d413bfb923036cc87fec132df735.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/watchdog/hpwdt.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index 92443c3..90016db 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -353,10 +353,10 @@ static int detect_cru_service(void)
 
 asm(".text                      \n\t"
     ".align 4                   \n\t"
-    ".globl asminline_call	\n"
+    ".globl asminline_call	\n\t"
+    ".type asminline_call, @function \n\t"
     "asminline_call:            \n\t"
-    "pushq      %rbp            \n\t"
-    "movq       %rsp, %rbp      \n\t"
+    FRAME_BEGIN
     "pushq      %rax            \n\t"
     "pushq      %rbx            \n\t"
     "pushq      %rdx            \n\t"
@@ -386,7 +386,7 @@ asm(".text                      \n\t"
     "popq       %rdx            \n\t"
     "popq       %rbx            \n\t"
     "popq       %rax            \n\t"
-    "leave                      \n\t"
+    FRAME_END
     "ret                        \n\t"
     ".previous");
 


* [tip:x86/debug] x86/locking: Create stack frame in PV unlock
  2016-01-21 22:49 ` [PATCH 28/33] x86/locking: Create stack frame in PV unlock Josh Poimboeuf
@ 2016-02-23  9:05   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:52   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  9:05 UTC
  To: linux-tip-commits@vger.kernel.org
  Cc: peterz, dvlasenk, jslaby, jpoimboe, Waiman.Long, torvalds,
	namhyung, palves, luto, mingo, hpa, linux-kernel, acme, bernd,
	bp, luto, tglx, brgerst, akpm, chris.j.arges, mmarek

Commit-ID:  74d03c501b10e38c54554567933509cecdddd1eb
Gitweb:     http://git.kernel.org/tip/74d03c501b10e38c54554567933509cecdddd1eb
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:32 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:04:00 +0100

x86/locking: Create stack frame in PV unlock

The assembly PV_UNLOCK function is a callable non-leaf function which
doesn't honor CONFIG_FRAME_POINTER, which can result in bad stack
traces.

Create a stack frame when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <Waiman.Long@hpe.com>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6685a72ddbbd0ad3694337cca0af4b4ea09f5f40.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/qspinlock_paravirt.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/qspinlock_paravirt.h b/arch/x86/include/asm/qspinlock_paravirt.h
index 9f92c18..9d55f9b 100644
--- a/arch/x86/include/asm/qspinlock_paravirt.h
+++ b/arch/x86/include/asm/qspinlock_paravirt.h
@@ -36,8 +36,10 @@ PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock_slowpath);
  */
 asm    (".pushsection .text;"
 	".globl " PV_UNLOCK ";"
+	".type " PV_UNLOCK ", @function;"
 	".align 4,0x90;"
 	PV_UNLOCK ": "
+	FRAME_BEGIN
 	"push  %rdx;"
 	"mov   $0x1,%eax;"
 	"xor   %edx,%edx;"
@@ -45,6 +47,7 @@ asm    (".pushsection .text;"
 	"cmp   $0x1,%al;"
 	"jne   .slowpath;"
 	"pop   %rdx;"
+	FRAME_END
 	"ret;"
 	".slowpath: "
 	"push   %rsi;"
@@ -52,6 +55,7 @@ asm    (".pushsection .text;"
 	"call " PV_UNLOCK_SLOWPATH ";"
 	"pop    %rsi;"
 	"pop    %rdx;"
+	FRAME_END
 	"ret;"
 	".size " PV_UNLOCK ", .-" PV_UNLOCK ";"
 	".popsection");


* [tip:x86/debug] x86/kvm: Add output operand in vmx_handle_external_intr inline asm
  2016-01-22 21:44           ` [PATCH 2/2] x86/kvm: Add output operand in vmx_handle_external_intr inline asm Chris J Arges
  2016-01-25 15:05             ` Josh Poimboeuf
@ 2016-02-23  9:05             ` tip-bot for Chris J Arges
  2016-02-25  5:53             ` tip-bot for Chris J Arges
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Chris J Arges <tipbot@zytor.com> @ 2016-02-23  9:05 UTC
  To: linux-tip-commits@vger.kernel.org
  Cc: mingo, torvalds, peterz, hpa, linux-kernel, jpoimboe, tglx,
	chris.j.arges

Commit-ID:  3aff5415db7e5a5e50f9b05ce1b0b92a4a55e169
Gitweb:     http://git.kernel.org/tip/3aff5415db7e5a5e50f9b05ce1b0b92a4a55e169
Author:     Chris J Arges <chris.j.arges@canonical.com>
AuthorDate: Fri, 22 Jan 2016 15:44:38 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:04:01 +0100

x86/kvm: Add output operand in vmx_handle_external_intr inline asm

Stacktool generates the following warning:
  stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup

By listing the stack pointer as an output operand of the inline assembly
statement, this patch ensures that a stack frame is created before the
call when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: gleb@kernel.org
Cc: kvm@vger.kernel.org
Cc: live-patching@vger.kernel.org
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1453499078-9330-3-git-send-email-chris.j.arges@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kvm/vmx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e2951b6..e153522 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8356,6 +8356,7 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
 static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 {
 	u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
+	register void *__sp asm(_ASM_SP);
 
 	/*
 	 * If external interrupt exists, IF bit is set in rflags/eflags on the
@@ -8388,8 +8389,9 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 			"call *%[entry]\n\t"
 			:
 #ifdef CONFIG_X86_64
-			[sp]"=&r"(tmp)
+			[sp]"=&r"(tmp),
 #endif
+			"+r"(__sp)
 			:
 			[entry]"r"(entry),
 			[ss]"i"(__KERNEL_DS),


* [tip:x86/debug] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]()
  2016-02-18 17:41                 ` [PATCH] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace] Josh Poimboeuf
  2016-02-19 12:05                   ` Jiri Slaby
@ 2016-02-23  9:05                   ` tip-bot for Josh Poimboeuf
  2016-02-25  5:53                   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf <tipbot@zytor.com> @ 2016-02-23  9:05 UTC
  To: linux-tip-commits@vger.kernel.org
  Cc: mingo, torvalds, hpa, jpoimboe, peterz, linux-kernel, tglx, jslaby

Commit-ID:  b5429dac54a31359e508add8572ebe8d29b8cbdb
Gitweb:     http://git.kernel.org/tip/b5429dac54a31359e508add8572ebe8d29b8cbdb
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 18 Feb 2016 11:41:58 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Feb 2016 09:04:01 +0100

sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]()

If __preempt_schedule() or __preempt_schedule_notrace() is referenced at
the beginning of a function, gcc can insert the asm inline "call
___preempt_schedule[_notrace]" instruction before setting up a stack
frame, which breaks frame pointer convention if CONFIG_FRAME_POINTER is
enabled and can result in bad stack traces.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the inline asm
statements.

Specifically this fixes the following stacktool warnings:

  stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x70: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x92: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0xff: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0xf5: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0x11a: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_get()+0x225: call without frame pointer save/setup
  stacktool: kernel/locking/percpu-rwsem.o: percpu_up_read()+0x27: call without frame pointer save/setup
  stacktool: kernel/profile.o: do_profile_hits.isra.5()+0x139: call without frame pointer save/setup
  stacktool: lib/nmi_backtrace.o: nmi_trigger_all_cpu_backtrace()+0x2b6: call without frame pointer save/setup
  stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_recv()+0x58: call without frame pointer save/setup
  stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_send()+0x58: call without frame pointer save/setup
  stacktool: net/rds/ib_recv.o: rds_ib_attempt_ack()+0xc1: call without frame pointer save/setup
  stacktool: net/rds/iw_recv.o: rds_iw_attempt_ack()+0xc1: call without frame pointer save/setup
  stacktool: net/rds/iw_recv.o: rds_iw_recv_cq_comp_handler()+0x55: call without frame pointer save/setup

So it only adds a stack frame to 15 call sites out of ~5000 calls to
___preempt_schedule[_notrace]().  All the others already had stack frames.

Oddly, this change actually seems to make things faster in a lot of
cases.  For many smaller functions it causes the stack frame creation to
get moved out of the common path and into the unlikely path.

For example, here's the original cyc2ns_read_end():

  ffffffff8101f8c0 <cyc2ns_read_end>:
  ffffffff8101f8c0:	55                   	push   %rbp
  ffffffff8101f8c1:	48 89 e5             	mov    %rsp,%rbp
  ffffffff8101f8c4:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
  ffffffff8101f8c8:	75 08                	jne    ffffffff8101f8d2 <cyc2ns_read_end+0x12>
  ffffffff8101f8ca:	65 48 89 3d e6 5a ff 	mov    %rdi,%gs:0x7eff5ae6(%rip)        # 153b8 <cyc2ns+0x38>
  ffffffff8101f8d1:	7e
  ffffffff8101f8d2:	65 ff 0d 77 c4 fe 7e 	decl   %gs:0x7efec477(%rip)        # bd50 <__preempt_count>
  ffffffff8101f8d9:	74 02                	je     ffffffff8101f8dd <cyc2ns_read_end+0x1d>
  ffffffff8101f8db:	5d                   	pop    %rbp
  ffffffff8101f8dc:	c3                   	retq
  ffffffff8101f8dd:	e8 1e 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
  ffffffff8101f8e2:	5d                   	pop    %rbp
  ffffffff8101f8e3:	c3                   	retq
  ffffffff8101f8e4:	66 66 66 2e 0f 1f 84 	data16 data16 nopw %cs:0x0(%rax,%rax,1)
  ffffffff8101f8eb:	00 00 00 00 00

And here's the same function with the patch:

  ffffffff8101f8c0 <cyc2ns_read_end>:
  ffffffff8101f8c0:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
  ffffffff8101f8c4:	75 08                	jne    ffffffff8101f8ce <cyc2ns_read_end+0xe>
  ffffffff8101f8c6:	65 48 89 3d ea 5a ff 	mov    %rdi,%gs:0x7eff5aea(%rip)        # 153b8 <cyc2ns+0x38>
  ffffffff8101f8cd:	7e
  ffffffff8101f8ce:	65 ff 0d 7b c4 fe 7e 	decl   %gs:0x7efec47b(%rip)        # bd50 <__preempt_count>
  ffffffff8101f8d5:	74 01                	je     ffffffff8101f8d8 <cyc2ns_read_end+0x18>
  ffffffff8101f8d7:	c3                   	retq
  ffffffff8101f8d8:	55                   	push   %rbp
  ffffffff8101f8d9:	48 89 e5             	mov    %rsp,%rbp
  ffffffff8101f8dc:	e8 1f 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
  ffffffff8101f8e1:	5d                   	pop    %rbp
  ffffffff8101f8e2:	c3                   	retq
  ffffffff8101f8e3:	66 66 66 66 2e 0f 1f 	data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
  ffffffff8101f8ea:	84 00 00 00 00 00

Notice that it moved the frame pointer setup code to the unlikely
___preempt_schedule() call path.  Going through a sampling of the
differences in the asm, that's the most common change I see.

Otherwise it has no real effect on callers which already have stack
frames (though it does result in the reordering of some 'mov's).

Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20160218174158.GA28230@treble.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/preempt.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index 01bcde8..d397deb 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -94,10 +94,19 @@ static __always_inline bool should_resched(int preempt_offset)
 
 #ifdef CONFIG_PREEMPT
   extern asmlinkage void ___preempt_schedule(void);
-# define __preempt_schedule() asm ("call ___preempt_schedule")
+# define __preempt_schedule()					\
+({								\
+	register void *__sp asm(_ASM_SP);			\
+	asm volatile ("call ___preempt_schedule" : "+r"(__sp));	\
+})
+
   extern asmlinkage void preempt_schedule(void);
   extern asmlinkage void ___preempt_schedule_notrace(void);
-# define __preempt_schedule_notrace() asm ("call ___preempt_schedule_notrace")
+# define __preempt_schedule_notrace()					\
+({									\
+	register void *__sp asm(_ASM_SP);				\
+	asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp));	\
+})
   extern asmlinkage void preempt_schedule_notrace(void);
 #endif
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [tip:x86/debug] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()
  2016-02-23  9:00   ` [tip:x86/debug] " =?UTF-8?B?dGlwLWJvdCBmb3IgSm9zaCBQb2ltYm9ldWYgPHRpcGJvdEB6eXRvci5jb20+?=
@ 2016-02-23 11:39     ` Pavel Machek
  0 siblings, 0 replies; 133+ messages in thread
From: Pavel Machek @ 2016-02-23 11:39 UTC (permalink / raw)
  To: linux-kernel, rafael.j.wysocki, mingo, jpoimboe, peterz, bernd,
	brgerst, dvlasenk, len.brown, acme, mmarek, tglx, chris.j.arges,
	luto, torvalds, hpa, luto, akpm, namhyung, palves, bp, bp,
	jslaby
  Cc: linux-tip-commits@vger.kernel.org


On Tue 2016-02-23 01:00:37,
=?UTF-8?B?dGlwLWJvdCBmb3IgSm9zaCBQb2ltYm9ldWYgPHRpcGJvdEB6eXRvci5jb20+?=@zytor.com
wrote:

Hi!

I don't know who sent those mails (hpa?) but From: is severely misformatted,
as is content-type.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-23  8:14 ` Ingo Molnar
@ 2016-02-23 14:27   ` Arnaldo Carvalho de Melo
  2016-02-23 15:07     ` Josh Poimboeuf
  2016-02-23 15:01   ` Josh Poimboeuf
  1 sibling, 1 reply; 133+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-23 14:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

Em Tue, Feb 23, 2016 at 09:14:06AM +0100, Ingo Molnar escreveu:
> The fact that 'stacktool' already checks about assembly details like __ex_table[] 
> shows that my review feedback on early iterations of this series, that the 
> 'stacktool' name is too specific, was correct.
> 
> We really need to rename it before it gets upstream and the situation gets worse. 
> __ex_table[] has obviously nothing to do with the 'stack layout' of binaries.
> 
> Another suitable name would be 'asmtool' or 'objtool'. For example the following 
> would naturally express what it does:
> 
>   objtool check kernel/built-in.o
> 
> the name expresses that the tool checks object files, independently of the 
> existing toolchain. Its primary purpose right now is the checking of stack layout 
> details, but the tool is not limited to that at all.

Removing 'tool' from the tool name would be nice too :-) Making it
easily googlable would be good too, lotsa people complain about 'perf'
being too vague, see for a quick laugh:

http://www.brendangregg.com/perf.html

``Searching for just "perf" finds sites on the police, petroleum, weed
control, and a T-shirt. This is not an official perf page, for either
perf_events or the T-shirt.''

The T-shirt: http://www.brendangregg.com/perf_events/omg-so-perf.jpg

Maybe we should ask Linus to come up with some other nice, short,
searchable, funny name like 'git'?

'chob' as in 'check object'?

- Arnaldo

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-23  8:14 ` Ingo Molnar
  2016-02-23 14:27   ` Arnaldo Carvalho de Melo
@ 2016-02-23 15:01   ` Josh Poimboeuf
  2016-02-24  7:40     ` Ingo Molnar
  1 sibling, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-23 15:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, David Vrabel,
	Borislav Petkov, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	Jeremy Fitzhardinge, Chris Wright, Alok Kataria, Rusty Russell,
	Herbert Xu, David S. Miller, Pavel Machek, Rafael J. Wysocki,
	Len Brown, Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

Hi Ingo,

On Tue, Feb 23, 2016 at 09:14:06AM +0100, Ingo Molnar wrote:
> So I tried out this latest stacktool series and it looks mostly good for an 
> upstream merge.
> 
> To help this effort move forward I've applied the preparatory/fix patches that are 
> part of this series to tip:x86/debug - that's 26 out of 31 patches. (I've 
> propagated all the acks that the latest submission got into the changelogs.)

Thanks very much for your review and for applying the fixes!

A few issues relating to the merge:

- The tip:x86/debug branch fails to build because it depends on
  ec5186557abb ("x86/asm: Add C versions of frame pointer macros") which
  is in tip:x86/asm.

- As Pavel mentioned, the tip-bot seems to be spitting out garbage
  emails from:
  =?UTF-8?B?dGlwLWJvdCBmb3IgSm9zaCBQb2ltYm9ldWYgPHRpcGJvdEB6eXRvci5jb20+?=@zytor.com.

> I have 5 (easy to address) observations that need to be addressed before we can 
> move on with the rest of the merge:
> 
> 1)
> 
> Due to recent changes to x86 exception handling, I get a lot of bogus warnings 
> about exception table sizes:
> 
>   stacktool: arch/x86/kernel/cpu/mcheck/mce.o: __ex_table size not a multiple of 24
>   stacktool: arch/x86/kernel/cpu/mtrr/generic.o: __ex_table size not a multiple of 24
>   stacktool: arch/x86/kernel/cpu/mtrr/cleanup.o: __ex_table size not a multiple of 24
> 
> this is due to:
> 
>   548acf19234d x86/mm: Expand the exception table logic to allow new handling options

Ok, I'll fix it up.

> 2)
> 
> The fact that 'stacktool' already checks about assembly details like __ex_table[] 
> shows that my review feedback on early iterations of this series, that the 
> 'stacktool' name is too specific, was correct.
> 
> We really need to rename it before it gets upstream and the situation gets worse. 
> __ex_table[] has obviously nothing to do with the 'stack layout' of binaries.
> 
> Another suitable name would be 'asmtool' or 'objtool'. For example the following 
> would naturally express what it does:
> 
>   objtool check kernel/built-in.o
> 
> the name expresses that the tool checks object files, independently of the 
> existing toolchain. Its primary purpose right now is the checking of stack layout 
> details, but the tool is not limited to that at all.

Fair enough.  I'll rename it to objtool if there are no objections.

> 3)
> 
> There's quite a bit of overhead when running the tool on larger object files, most 
> prominently in cmd_check():
> 
>     62.06%  stacktool        stacktool                              [.] cmd_check
>      6.72%  stacktool        stacktool                              [.] find_rela_by_dest_range
> 
> I added -g to the CFLAGS, which made it visible in the annotated output of perf top:
> 
>     0.00 :        40430d:       lea    0x4(%rdx,%rax,1),%r13
>          :      find_instruction():
>          :                                                  struct section *sec,
>          :                                                  unsigned long offset)
>          :      {
>          :              struct instruction *insn;
>          :
>          :              list_for_each_entry(insn, &file->insns, list)
>     0.03 :        404312:       mov    0x38(%rsp),%rax
>     0.00 :        404317:       cmp    %rbp,%rax
>     0.00 :        40431a:       jne    404334 <cmd_check+0x5b4>
>     0.00 :        40431c:       jmpq   4045ba <cmd_check+0x83a>
>     0.00 :        404321:       nopl   0x0(%rax)
>     6.14 :        404328:       mov    (%rax),%rax
>     0.00 :        40432b:       cmp    %rbp,%rax
>     2.02 :        40432e:       je     4045ba <cmd_check+0x83a>
>          :                      if (insn->sec == sec && insn->offset == offset)
>     0.55 :        404334:       cmp    %r12,0x10(%rax)
>    87.91 :        404338:       jne    404328 <cmd_check+0x5a8>
>     0.00 :        40433a:       cmp    %r13,0x18(%rax)
>     3.36 :        40433e:       jne    404328 <cmd_check+0x5a8>
>          :      get_jump_destinations():
>          :                               * later in validate_functions().
>          :                               */
>          :                              continue;
>          :                      }
>          :
>          :                      insn->jump_dest = find_instruction(file, dest_sec, dest_off);
>     0.00 :        404340:       mov    %rax,0x48(%rbx)
>     0.00 :        404344:       jmpq   4042b0 <cmd_check+0x530>
>     0.00 :        404349:       nopl   0x0(%rax)
>          :      fprintf():
>          :
>          :      # ifdef __va_arg_pack
>          :      __fortify_function int
>          :      fprintf (FILE *__restrict __stream, const char *__restrict __fmt, ...)
>          :      {
>          :        return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
> 
> that looks like a linear list search? That would suck with thousands of entries.
> 
> (If this is non-trivial to improve then we can delay this optimization to a later 
> patch.)

Yeah, I agree that the linear list search isn't good.  I still need to
think about the data structures a bit, so if it's ok with you, I'll fix
it with a later patch.
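
For what it's worth, here's a rough sketch of the kind of hashed lookup
that could replace the list walk.  It's not what the tool does today:
the table layout, INSN_HASH_BITS, insn_hash() and insn_add() are all
made up for illustration, and only the (sec, offset) key comes from the
find_instruction() code quoted above.

```c
#include <stddef.h>

/* Hypothetical table size: 2^10 buckets, chosen only for the example. */
#define INSN_HASH_BITS	10
#define INSN_HASH_SIZE	(1u << INSN_HASH_BITS)

struct section {
	const char *name;
};

struct instruction {
	struct instruction *hash_next;	/* next entry in the same bucket */
	struct section *sec;
	unsigned long offset;
};

struct insn_table {
	struct instruction *buckets[INSN_HASH_SIZE];
};

/* Mix the section pointer and the offset, then keep the low bits. */
static unsigned int insn_hash(const struct section *sec, unsigned long offset)
{
	unsigned long key = (unsigned long)sec ^ (offset * 2654435761UL);

	return (unsigned int)(key & (INSN_HASH_SIZE - 1));
}

/* Insert at the head of the bucket chain. */
static void insn_add(struct insn_table *t, struct instruction *insn)
{
	unsigned int h = insn_hash(insn->sec, insn->offset);

	insn->hash_next = t->buckets[h];
	t->buckets[h] = insn;
}

/* Only the instructions sharing a bucket are compared, not the whole file. */
static struct instruction *find_instruction(struct insn_table *t,
					    struct section *sec,
					    unsigned long offset)
{
	struct instruction *insn;

	for (insn = t->buckets[insn_hash(sec, offset)]; insn;
	     insn = insn->hash_next)
		if (insn->sec == sec && insn->offset == offset)
			return insn;

	return NULL;
}
```

With something like this, each jump destination lookup would only walk
one short bucket chain instead of every instruction in the file, at the
cost of one extra pointer per instruction.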

> 4)
> 
> I think the various 'STACKTOOL' flags in the kernel source are a bit of a misnomer 
> - it's not the tool we want to name but the actual property of the code.
> 
> So instead of:
> 
>   STACKTOOL_IGNORE_FUNC(__bpf_prog_run);
> 
> we should do something like:
> 
>   STACK_FRAME_NON_STANDARD(__bpf_prog_run);
> 
> see how much more expressive it is? It becomes a function attribute independent of 
> the tooling that makes use of it.

Ok, STACK_FRAME_NON_STANDARD sounds fine to me.
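
One way such an annotation could be implemented (purely a sketch; the
section name, the func_ptr_t typedef and the is_non_standard() helper
are assumptions, not the final code) is to have the macro drop the
function's address into a dedicated ELF section, which any tool can then
read from the object file.  The userspace example below stands in for
that by walking the section at run time, using the __start_/__stop_
symbols that GNU ld generates for sections whose names are valid C
identifiers:

```c
#include <stddef.h>

typedef void (*func_ptr_t)(void);

/*
 * Hypothetical section name.  It has to be a valid C identifier so
 * that the linker provides the __start_/__stop_ symbols used below.
 */
#define STACK_FRAME_NON_STANDARD(func)					\
	static func_ptr_t __nonstd_stack_frame_##func			\
		__attribute__((used, section("nonstd_stack_frame"))) =	\
		(func_ptr_t)(func)

extern func_ptr_t __start_nonstd_stack_frame[];
extern func_ptr_t __stop_nonstd_stack_frame[];

/* Example functions: one annotated, one not. */
static void weird_entry(void)
{
}
STACK_FRAME_NON_STANDARD(weird_entry);

static void normal_entry(void)
{
}

/* Returns 1 if the function was annotated, 0 otherwise. */
static int is_non_standard(func_ptr_t f)
{
	func_ptr_t *p;

	for (p = __start_nonstd_stack_frame;
	     p < __stop_nonstd_stack_frame; p++)
		if (*p == f)
			return 1;

	return 0;
}
```

The point is that the annotation describes the function, not the tool:
nothing in the macro name or the section contents mentions who consumes
it.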

> Similarly, for the highest level 'don't check these directories' makefile flags, 
> I'd suggest, instead of using this rather opaque, tool dependent naming:
> 
>   STACKTOOL := n
> 
> something more specific, like:
> 
>   OBJECT_FILES_NON_STANDARD := y
> 
> which makes it clearer what's going on: these are special object files that are 
> not the typical kernel object files.
> 
> Stacktool (or objtool) would be one consumer of this annotation.
> 
> I think Boris made similar observations in past reviews.

Sounds reasonable.  With this approach we could probably eventually get
rid of KASAN_SANITIZE.

I'll change it to "OBJECT_FILES_NON_STANDARD := y" if there are no
objections.

Note there are also per-object ignores like:

  STACKTOOL_head_$(BITS).o := n

I can similarly change that to something like:

  OBJECT_FILES_NON_STANDARD_head_$(BITS).o := n

> 5)
> 
> Likewise, I think the CONFIG_STACK_VALIDATION=y Kconfig flag does not express that 
> we do exception table checks as well - and it does not express all the other 
> things we may check in object files in the future.
> 
> Something like CONFIG_CHECK_OBJECT_FILES=y would express it, and the help text 
> would list all the things the tool is able to check for at the moment.

Hm, I'm not really sure about this.  Yes, the tool could potentially do
other types of checks, but is it necessary to lump them all together
into a single config option?  It does have subcommands after all ;-)

The exception table check reported above is very basic and doesn't serve
any useful purpose other than supporting the goal of validating the
stack.

However, I don't feel strongly enough about it to hold up the merge any
longer, so I'll plan to make the change unless I hear back from you.

> Please send followup iterations of the series against the tip:x86/debug tree (or 
> tip:master), to keep the size of the series down.

Will do, thanks!

-- 
Josh

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-23 14:27   ` Arnaldo Carvalho de Melo
@ 2016-02-23 15:07     ` Josh Poimboeuf
  2016-02-23 15:28       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-23 15:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Tue, Feb 23, 2016 at 11:27:17AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Feb 23, 2016 at 09:14:06AM +0100, Ingo Molnar escreveu:
> > The fact that 'stacktool' already checks about assembly details like __ex_table[] 
> > shows that my review feedback on early iterations of this series, that the 
> > 'stacktool' name is too specific, was correct.
> > 
> > We really need to rename it before it gets upstream and the situation gets worse. 
> > __ex_table[] has obviously nothing to do with the 'stack layout' of binaries.
> > 
> > Another suitable name would be 'asmtool' or 'objtool'. For example the following 
> > would naturally express what it does:
> > 
> >   objtool check kernel/built-in.o
> > 
> > the name expresses that the tool checks object files, independently of the 
> > existing toolchain. Its primary purpose right now is the checking of stack layout 
> > details, but the tool is not limited to that at all.
> 
> Removing 'tool' from the tool name would be nice too :-) Making it
> easily googlable would be good too, lotsa people complain about 'perf'
> being too vague, see for a quick laugh:
> 
> http://www.brendangregg.com/perf.html
> 
> ``Searching for just "perf" finds sites on the police, petroleum, weed
> control, and a T-shirt. This is not an official perf page, for either
> perf_events or the T-shirt.''
> 
> The T-shirt: http://www.brendangregg.com/perf_events/omg-so-perf.jpg

Yeah, 'tool' in the name is kind of silly, but the above type of
situation is why I prefer 'objtool' over 'obj'.

Though I have to admit I like the idea of making a t-shirt for it ;-)

> Maybe we should ask Linus to come up with some other nice, short,
> searchable, funny name like 'git'?
> 
> 'chob' as in 'check object'?

I think 'objtool' is searchable enough.  And I also like the fact that
its name at least gives you an idea of what it does (and eventually it
will do more than just "checking").

-- 
Josh

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-23 15:07     ` Josh Poimboeuf
@ 2016-02-23 15:28       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 133+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-23 15:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Ingo Molnar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	linux-kernel, live-patching, Michal Marek, Peter Zijlstra,
	Andy Lutomirski, Borislav Petkov, Linus Torvalds, Andi Kleen,
	Pedro Alves, Namhyung Kim, Bernd Petrovitsch, Chris J Arges,
	Andrew Morton, Jiri Slaby, David Vrabel, Borislav Petkov,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Jeremy Fitzhardinge,
	Chris Wright, Alok Kataria, Rusty Russell, Herbert Xu,
	David S. Miller, Pavel Machek, Rafael J. Wysocki, Len Brown,
	Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

Em Tue, Feb 23, 2016 at 09:07:55AM -0600, Josh Poimboeuf escreveu:
> I think 'objtool' is searchable enough.  And I also like the fact that

Yeah, agreed, there is even documentation available for it already:

http://docs.bvstools.com/home/objtool

> its name at least gives you an idea of what it does (and eventually it
> will do more than just "checking").

:-)

- Arnaldo

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-23 15:01   ` Josh Poimboeuf
@ 2016-02-24  7:40     ` Ingo Molnar
  2016-02-24 16:32       ` Josh Poimboeuf
  0 siblings, 1 reply; 133+ messages in thread
From: Ingo Molnar @ 2016-02-24  7:40 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, David Vrabel,
	Borislav Petkov, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	Jeremy Fitzhardinge, Chris Wright, Alok Kataria, Rusty Russell,
	Herbert Xu, David S. Miller, Pavel Machek, Rafael J. Wysocki,
	Len Brown, Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long


* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> Hi Ingo,
> 
> On Tue, Feb 23, 2016 at 09:14:06AM +0100, Ingo Molnar wrote:
> > So I tried out this latest stacktool series and it looks mostly good for an 
> > upstream merge.
> > 
> > To help this effort move forward I've applied the preparatory/fix patches that are 
> > part of this series to tip:x86/debug - that's 26 out of 31 patches. (I've 
> > propagated all the acks that the latest submission got into the changelogs.)
> 
> Thanks very much for your review and for applying the fixes!
> 
> A few issues relating to the merge:
> 
> - The tip:x86/debug branch fails to build because it depends on
>   ec5186557abb ("x86/asm: Add C versions of frame pointer macros") which
>   is in tip:x86/asm.

Indeed...

> - As Pavel mentioned, the tip-bot seems to be spitting out garbage
>   emails from:
>   =?UTF-8?B?dGlwLWJvdCBmb3IgSm9zaCBQb2ltYm9ldWYgPHRpcGJvdEB6eXRvci5jb20+?=@zytor.com.

Yeah, hpa fixed that meanwhile.

Due to the above bad base I rebased the tree (to an x86/asm base), so there will be 
a new round of (hopefully readable) tip-bot notifications. I'll push it out after 
a bit of testing.

> > 5)
> > 
> > Likewise, I think the CONFIG_STACK_VALIDATION=y Kconfig flag does not express that 
> > we do exception table checks as well - and it does not express all the other 
> > things we may check in object files in the future.
> > 
> > Something like CONFIG_CHECK_OBJECT_FILES=y would express it, and the help text 
> > > would list all the things the tool is able to check for at the moment.
> 
> Hm, I'm not really sure about this.  Yes, the tool could potentially do
> other types of checks, but is it necessary to lump them all together
> into a single config option?  It does have subcommands after all ;-)

lol ;-)

Ok, I'm fine with CONFIG_STACK_VALIDATION=y as well.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 00/33] Compile-time stack metadata validation
  2016-02-24  7:40     ` Ingo Molnar
@ 2016-02-24 16:32       ` Josh Poimboeuf
  0 siblings, 0 replies; 133+ messages in thread
From: Josh Poimboeuf @ 2016-02-24 16:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel,
	live-patching, Michal Marek, Peter Zijlstra, Andy Lutomirski,
	Borislav Petkov, Linus Torvalds, Andi Kleen, Pedro Alves,
	Namhyung Kim, Bernd Petrovitsch, Chris J Arges, Andrew Morton,
	Jiri Slaby, Arnaldo Carvalho de Melo, David Vrabel,
	Borislav Petkov, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	Jeremy Fitzhardinge, Chris Wright, Alok Kataria, Rusty Russell,
	Herbert Xu, David S. Miller, Pavel Machek, Rafael J. Wysocki,
	Len Brown, Matt Fleming, Alexei Starovoitov, netdev,
	Ananth N Mavinakayanahalli, Anil S Keshavamurthy,
	Masami Hiramatsu, Gleb Natapov, Paolo Bonzini, kvm,
	Wim Van Sebroeck, Guenter Roeck, linux-watchdog, Waiman Long

On Wed, Feb 24, 2016 at 08:40:54AM +0100, Ingo Molnar wrote:
> 
> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > Hi Ingo,
> > 
> > On Tue, Feb 23, 2016 at 09:14:06AM +0100, Ingo Molnar wrote:
> > > So I tried out this latest stacktool series and it looks mostly good for an 
> > > upstream merge.
> > > 
> > > To help this effort move forward I've applied the preparatory/fix patches that are 
> > > part of this series to tip:x86/debug - that's 26 out of 31 patches. (I've 
> > > propagated all the acks that the latest submission got into the changelogs.)
> > 
> > Thanks very much for your review and for applying the fixes!
> > 
> > A few issues relating to the merge:
> > 
> > - The tip:x86/debug branch fails to build because it depends on
> >   ec5186557abb ("x86/asm: Add C versions of frame pointer macros") which
> >   is in tip:x86/asm.
> 
> Indeed...

FYI, v17 will be based on tip:x86/debug.  But note that, when run
against that branch, it'll spit out a lot of these warnings:

  objtool: arch/x86/ia32/sys_ia32.o: __ex_table size not a multiple of 12
  objtool: arch/x86/ia32/ia32_signal.o: __ex_table size not a multiple of 12
  objtool: arch/x86/entry/common.o: __ex_table size not a multiple of 12

Those warnings mean it's expecting the new exception table format which
was added in:

  548acf19234d ("x86/mm: Expand the exception table logic to allow new handling options")

So that commit is needed to avoid the warnings.
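
To spell out the arithmetic: as far as I can tell, that commit adds a
third 32-bit field (the handler) to each exception table entry, so an
entry grows from 8 to 12 bytes and a well-formed __ex_table section has
to be a multiple of 12 bytes long.  A minimal sketch of that check, with
ex_table_size_ok() being a made-up helper name rather than anything in
the tool:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed entry layout after 548acf19234d: three 32-bit relative
 * offsets.  Before that commit there were only insn and fixup.
 */
struct exception_table_entry {
	int32_t insn;
	int32_t fixup;
	int32_t handler;
};

#define EX_ENTRY_SIZE	sizeof(struct exception_table_entry)

/* Returns 1 if the section holds a whole number of entries. */
static int ex_table_size_ok(size_t section_size)
{
	return section_size % EX_ENTRY_SIZE == 0;
}
```

So an object built without that commit, with say two old 8-byte entries
(16 bytes total), fails the check and triggers the warning above.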

Thanks!

-- 
Josh

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/xen: Add stack frame dependency to hypercall inline asm calls
  2016-01-21 22:49 ` [PATCH 05/33] x86/xen: Add stack frame dependency to hypercall inline asm calls Josh Poimboeuf
  2016-02-23  8:55   ` [tip:x86/debug] " =?UTF-8?B?dGlwLWJvdCBmb3IgSm9zaCBQb2ltYm9ldWYgPHRpcGJvdEB6eXRvci5jb20+?=
@ 2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jslaby, akpm, dvlasenk, bp, mingo, linux-kernel, bernd, torvalds,
	boris.ostrovsky, luto, palves, konrad.wilk, peterz,
	chris.j.arges, namhyung, brgerst, hpa, tglx, acme, bp, mmarek,
	jpoimboe, luto, david.vrabel

Commit-ID:  0e8e2238b52e5301d1d1d4a298ec5b72ac54c702
Gitweb:     http://git.kernel.org/tip/0e8e2238b52e5301d1d1d4a298ec5b72ac54c702
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:09 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:41 +0100

x86/xen: Add stack frame dependency to hypercall inline asm calls

If a hypercall is inlined at the beginning of a function, gcc can insert
the call instruction before setting up a stack frame, which breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the hypercall inline
asm statements.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/c6face5a46713108bded9c4c103637222abc4528.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/xen/hypercall.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 3bcdcc8..a12a047 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -110,9 +110,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 	register unsigned long __arg2 asm(__HYPERCALL_ARG2REG) = __arg2; \
 	register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \
 	register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \
-	register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5;
+	register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5; \
+	register void *__sp asm(_ASM_SP);
 
-#define __HYPERCALL_0PARAM	"=r" (__res)
+#define __HYPERCALL_0PARAM	"=r" (__res), "+r" (__sp)
 #define __HYPERCALL_1PARAM	__HYPERCALL_0PARAM, "+r" (__arg1)
 #define __HYPERCALL_2PARAM	__HYPERCALL_1PARAM, "+r" (__arg2)
 #define __HYPERCALL_3PARAM	__HYPERCALL_2PARAM, "+r" (__arg3)

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/xen: Set ELF function type for xen_adjust_exception_frame()
  2016-01-21 22:49 ` [PATCH 06/33] x86/asm/xen: Set ELF function type for xen_adjust_exception_frame() Josh Poimboeuf
  2016-02-23  8:56   ` [tip:x86/debug] " =?UTF-8?B?dGlwLWJvdCBmb3IgSm9zaCBQb2ltYm9ldWYgPHRpcGJvdEB6eXRvci5jb20+?=
@ 2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: chris.j.arges, linux-kernel, palves, acme, torvalds, tglx,
	dvlasenk, jpoimboe, konrad.wilk, akpm, hpa, brgerst, namhyung,
	peterz, luto, jslaby, mingo, boris.ostrovsky, bernd, mmarek,
	luto, bp, david.vrabel

Commit-ID:  9fd216067d75b45eb18d84dc476059de47da07c2
Gitweb:     http://git.kernel.org/tip/9fd216067d75b45eb18d84dc476059de47da07c2
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:10 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:41 +0100

x86/asm/xen: Set ELF function type for xen_adjust_exception_frame()

xen_adjust_exception_frame() is a callable function, but is missing the
ELF function type, which confuses tools like stacktool.

Properly annotate it to be a callable function.  The generated code is
unchanged.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/b1851bd17a0986472692a7e3a05290d891382cdd.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/xen/xen-asm_64.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index cc8acc4..c3df431 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -26,6 +26,7 @@ ENTRY(xen_adjust_exception_frame)
 	mov 8+0(%rsp), %rcx
 	mov 8+8(%rsp), %r11
 	ret $16
+ENDPROC(xen_adjust_exception_frame)
 
 hypercall_iret = hypercall_page + __HYPERVISOR_iret * 32
 /*

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/xen: Create stack frames in xen-asm.S
  2016-01-21 22:49 ` [PATCH 07/33] x86/asm/xen: Create stack frames in xen-asm.S Josh Poimboeuf
  2016-02-23  8:56   ` [tip:x86/debug] " =?UTF-8?B?dGlwLWJvdCBmb3IgSm9zaCBQb2ltYm9ldWYgPHRpcGJvdEB6eXRvci5jb20+?=
@ 2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, jpoimboe, torvalds, david.vrabel, mmarek, boris.ostrovsky,
	linux-kernel, luto, jslaby, peterz, brgerst, bernd, dvlasenk, bp,
	acme, namhyung, palves, chris.j.arges, akpm, konrad.wilk, mingo,
	tglx, luto

Commit-ID:  8be0eb7e0d53bc2dbe6e9ad6e96a2d5b89d148a8
Gitweb:     http://git.kernel.org/tip/8be0eb7e0d53bc2dbe6e9ad6e96a2d5b89d148a8
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:11 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:42 +0100

x86/asm/xen: Create stack frames in xen-asm.S

xen_irq_enable_direct(), xen_restore_fl_direct(), and check_events() are
callable non-leaf functions which don't honor CONFIG_FRAME_POINTER,
which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/a8340ad3fc72ba9ed34da9b3af9cdd6f1a896e17.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/xen/xen-asm.S | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 3e45aa0..eff224d 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -14,6 +14,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/percpu.h>
 #include <asm/processor-flags.h>
+#include <asm/frame.h>
 
 #include "xen-asm.h"
 
@@ -23,6 +24,7 @@
  * then enter the hypervisor to get them handled.
  */
 ENTRY(xen_irq_enable_direct)
+	FRAME_BEGIN
 	/* Unmask events */
 	movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
 
@@ -39,6 +41,7 @@ ENTRY(xen_irq_enable_direct)
 2:	call check_events
 1:
 ENDPATCH(xen_irq_enable_direct)
+	FRAME_END
 	ret
 	ENDPROC(xen_irq_enable_direct)
 	RELOC(xen_irq_enable_direct, 2b+1)
@@ -82,6 +85,7 @@ ENDPATCH(xen_save_fl_direct)
  * enters the hypervisor to get them delivered if so.
  */
 ENTRY(xen_restore_fl_direct)
+	FRAME_BEGIN
 #ifdef CONFIG_X86_64
 	testw $X86_EFLAGS_IF, %di
 #else
@@ -100,6 +104,7 @@ ENTRY(xen_restore_fl_direct)
 2:	call check_events
 1:
 ENDPATCH(xen_restore_fl_direct)
+	FRAME_END
 	ret
 	ENDPROC(xen_restore_fl_direct)
 	RELOC(xen_restore_fl_direct, 2b+1)
@@ -109,7 +114,8 @@ ENDPATCH(xen_restore_fl_direct)
  * Force an event check by making a hypercall, but preserve regs
  * before making the call.
  */
-check_events:
+ENTRY(check_events)
+	FRAME_BEGIN
 #ifdef CONFIG_X86_32
 	push %eax
 	push %ecx
@@ -139,4 +145,6 @@ check_events:
 	pop %rcx
 	pop %rax
 #endif
+	FRAME_END
 	ret
+ENDPROC(check_events)

^ permalink raw reply related	[flat|nested] 133+ messages in thread
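
The xen-asm.S changelog above says that callable non-leaf functions
without a stack frame "can result in bad stack traces". A minimal
user-space sketch of why, assuming a simple unwinder that only follows
saved frame pointers: the frames below are simulated data, and the names
start, parent, asm_func and callee are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One simulated frame as laid out by "push %rbp; mov %rsp, %rbp". */
struct frame {
	const struct frame *saved_bp;	/* caller's frame pointer */
	const char *ret_to;		/* function the return address points into */
};

/* Follow the saved-frame-pointer chain, recording each return target. */
static size_t walk_frames(const struct frame *bp, const char **out, size_t max)
{
	size_t n = 0;

	while (bp && n < max) {
		out[n++] = bp->ret_to;
		bp = bp->saved_bp;
	}
	return n;
}

/*
 * Model the call chain start -> parent -> asm_func -> callee and unwind
 * from callee. If asm_func never set up a frame, %rbp still points at
 * parent's frame when callee saves it.
 */
size_t trace_from_callee(int asm_func_has_frame, const char **out, size_t max)
{
	struct frame parent_frame = { NULL, "start" };
	struct frame asm_frame = { &parent_frame, "parent" };
	struct frame callee_frame;

	assert(out != NULL);
	callee_frame.ret_to = "asm_func";
	callee_frame.saved_bp = asm_func_has_frame ? &asm_frame : &parent_frame;
	return walk_frames(&callee_frame, out, max);
}
```

Calling trace_from_callee(1, ...) yields asm_func, parent, start, while
trace_from_callee(0, ...) yields only asm_func, start: in this model the
caller of the frameless function drops out of the trace.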

* [tip:x86/debug] x86/paravirt: Add stack frame dependency to PVOP inline asm calls
  2016-01-21 22:49 ` [PATCH 08/33] x86/paravirt: Add stack frame dependency to PVOP inline asm calls Josh Poimboeuf
  2016-02-23  8:56   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: akpm, linux-kernel, chris.j.arges, luto, mmarek, mingo, jpoimboe,
	palves, chrisw, luto, bp, acme, rusty, torvalds, bernd, hpa, bp,
	akataria, brgerst, peterz, jslaby, jeremy, dvlasenk, tglx,
	namhyung

Commit-ID:  bb93eb4cd606888f879897e8289d9c1143571feb
Gitweb:     http://git.kernel.org/tip/bb93eb4cd606888f879897e8289d9c1143571feb
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:12 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:42 +0100

x86/paravirt: Add stack frame dependency to PVOP inline asm calls

If a PVOP call macro is inlined at the beginning of a function, gcc can
insert the call instruction before setting up a stack frame, which
breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled and
can result in a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the PVOP inline asm
statements.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6a13e48c5a8cf2de1aa112ae2d4c0ac194096282.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/paravirt_types.h | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 77db561..e8c2326 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -466,8 +466,9 @@ int paravirt_disable_iospace(void);
  * makes sure the incoming and outgoing types are always correct.
  */
 #ifdef CONFIG_X86_32
-#define PVOP_VCALL_ARGS				\
-	unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx
+#define PVOP_VCALL_ARGS							\
+	unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;	\
+	register void *__sp asm("esp")
 #define PVOP_CALL_ARGS			PVOP_VCALL_ARGS
 
 #define PVOP_CALL_ARG1(x)		"a" ((unsigned long)(x))
@@ -485,9 +486,10 @@ int paravirt_disable_iospace(void);
 #define VEXTRA_CLOBBERS
 #else  /* CONFIG_X86_64 */
 /* [re]ax isn't an arg, but the return val */
-#define PVOP_VCALL_ARGS					\
-	unsigned long __edi = __edi, __esi = __esi,	\
-		__edx = __edx, __ecx = __ecx, __eax = __eax
+#define PVOP_VCALL_ARGS						\
+	unsigned long __edi = __edi, __esi = __esi,		\
+		__edx = __edx, __ecx = __ecx, __eax = __eax;	\
+	register void *__sp asm("rsp")
 #define PVOP_CALL_ARGS		PVOP_VCALL_ARGS
 
 #define PVOP_CALL_ARG1(x)		"D" ((unsigned long)(x))
@@ -526,7 +528,7 @@ int paravirt_disable_iospace(void);
 			asm volatile(pre				\
 				     paravirt_alt(PARAVIRT_CALL)	\
 				     post				\
-				     : call_clbr			\
+				     : call_clbr, "+r" (__sp)		\
 				     : paravirt_type(op),		\
 				       paravirt_clobber(clbr),		\
 				       ##__VA_ARGS__			\
@@ -536,7 +538,7 @@ int paravirt_disable_iospace(void);
 			asm volatile(pre				\
 				     paravirt_alt(PARAVIRT_CALL)	\
 				     post				\
-				     : call_clbr			\
+				     : call_clbr, "+r" (__sp)		\
 				     : paravirt_type(op),		\
 				       paravirt_clobber(clbr),		\
 				       ##__VA_ARGS__			\
@@ -563,7 +565,7 @@ int paravirt_disable_iospace(void);
 		asm volatile(pre					\
 			     paravirt_alt(PARAVIRT_CALL)		\
 			     post					\
-			     : call_clbr				\
+			     : call_clbr, "+r" (__sp)			\
 			     : paravirt_type(op),			\
 			       paravirt_clobber(clbr),			\
 			       ##__VA_ARGS__				\

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK
  2016-01-21 22:49 ` [PATCH 09/33] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK Josh Poimboeuf
  2016-02-23  8:57   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: rusty, chrisw, bp, akataria, palves, tglx, hpa, jpoimboe, bernd,
	acme, mmarek, dvlasenk, akpm, linux-kernel, namhyung, jslaby,
	torvalds, luto, brgerst, bp, jeremy, chris.j.arges, luto, mingo,
	peterz

Commit-ID:  87b240cbe3e51bf070fe2839fecb6450323aaef4
Gitweb:     http://git.kernel.org/tip/87b240cbe3e51bf070fe2839fecb6450323aaef4
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:13 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:42 +0100

x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK

A function created with the PV_CALLEE_SAVE_REGS_THUNK macro doesn't set
up a new stack frame before the call instruction, which breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.  Also, the thunk functions aren't annotated as ELF
callable functions.

Create a stack frame when CONFIG_FRAME_POINTER is enabled and add the
ELF function type.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/a2cad74e87c4aba7fd0f54a1af312e66a824a575.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/paravirt.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index f619250..601f1b8 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -13,6 +13,7 @@
 #include <linux/bug.h>
 #include <linux/types.h>
 #include <linux/cpumask.h>
+#include <asm/frame.h>
 
 static inline int paravirt_enabled(void)
 {
@@ -756,15 +757,19 @@ static __always_inline void __ticket_unlock_kick(struct arch_spinlock *lock,
  * call. The return value in rax/eax will not be saved, even for void
  * functions.
  */
+#define PV_THUNK_NAME(func) "__raw_callee_save_" #func
 #define PV_CALLEE_SAVE_REGS_THUNK(func)					\
 	extern typeof(func) __raw_callee_save_##func;			\
 									\
 	asm(".pushsection .text;"					\
-	    ".globl __raw_callee_save_" #func " ; "			\
-	    "__raw_callee_save_" #func ": "				\
+	    ".globl " PV_THUNK_NAME(func) ";"				\
+	    ".type " PV_THUNK_NAME(func) ", @function;"			\
+	    PV_THUNK_NAME(func) ":"					\
+	    FRAME_BEGIN							\
 	    PV_SAVE_ALL_CALLER_REGS					\
 	    "call " #func ";"						\
 	    PV_RESTORE_ALL_CALLER_REGS					\
+	    FRAME_END							\
 	    "ret;"							\
 	    ".popsection")
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread
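
The PV_CALLEE_SAVE_REGS_THUNK patch above builds its assembly text from
adjacent string literals plus the stringified function name. A small
sketch of that mechanism in plain C, producing the text as a string
instead of handing it to asm(). The FRAME_BEGIN/FRAME_END bodies here
are stand-ins assuming a 64-bit build with CONFIG_FRAME_POINTER, and
SAVE_REGS/RESTORE_REGS and my_op are hypothetical placeholders, not the
kernel's definitions.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the C-string form of the frame macros (assumed). */
#define FRAME_BEGIN "push %rbp; mov %rsp, %rbp;"
#define FRAME_END   "pop %rbp;"

/* Hypothetical placeholders for the caller-saved register macros. */
#define SAVE_REGS    "push %rcx;"
#define RESTORE_REGS "pop %rcx;"

/* Same definition as in the patch: literal prefix + stringified name. */
#define PV_THUNK_NAME(func) "__raw_callee_save_" #func

/* Same shape as the patched macro, but yielding the text itself. */
#define PV_THUNK_TEXT(func)				\
	".pushsection .text;"				\
	".globl " PV_THUNK_NAME(func) ";"		\
	".type " PV_THUNK_NAME(func) ", @function;"	\
	PV_THUNK_NAME(func) ":"				\
	FRAME_BEGIN					\
	SAVE_REGS					\
	"call " #func ";"				\
	RESTORE_REGS					\
	FRAME_END					\
	"ret;"						\
	".popsection"

const char *demo_thunk_text(void)
{
	static const char text[] = PV_THUNK_TEXT(my_op);

	/* The label must match the name announced by .globl and .type. */
	assert(strstr(text, PV_THUNK_NAME(my_op) ":") != NULL);
	return text;
}
```

Printing demo_thunk_text() shows the frame setup sitting between the
label and the register saves, and the .type directive naming the same
symbol as the label, which is the layout the changelog describes.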

* [tip:x86/debug] x86/amd: Set ELF function type for vide()
  2016-01-21 22:49 ` [PATCH 10/33] x86/amd: Set ELF function type for vide() Josh Poimboeuf
  2016-02-23  8:57   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bernd, namhyung, tglx, chris.j.arges, akpm, dvlasenk,
	linux-kernel, luto, acme, hpa, mmarek, torvalds, jpoimboe,
	jslaby, palves, mingo, bp, brgerst, peterz, bp, luto

Commit-ID:  de642faf48670c3c8eae5899177f786c624f4894
Gitweb:     http://git.kernel.org/tip/de642faf48670c3c8eae5899177f786c624f4894
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:14 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:42 +0100

x86/amd: Set ELF function type for vide()

vide() is a callable function, but is missing the ELF function type,
which confuses tools like stacktool.

Properly annotate it to be a callable function.  The generated code is
unchanged.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/a324095f5c9390ff39b15b4562ea1bbeda1a8282.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/amd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index a07956a..fe2f089 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -75,7 +75,10 @@ static inline int wrmsrl_amd_safe(unsigned msr, unsigned long long val)
  */
 
 extern __visible void vide(void);
-__asm__(".globl vide\n\t.align 4\nvide: ret");
+__asm__(".globl vide\n"
+	".type vide, @function\n"
+	".align 4\n"
+	"vide: ret\n");
 
 static void init_amd_k5(struct cpuinfo_x86 *c)
 {

^ permalink raw reply related	[flat|nested] 133+ messages in thread
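
The vide() patch above only adds a ".type vide, @function" directive, so
the visible effect is in the symbol table rather than in the code. A
sketch of how a tool could tell the two cases apart, using the st_info
packing from the ELF specification. symbol_is_function() and
make_st_info() are hypothetical helpers, and whether stacktool performs
exactly this check is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Symbol binding/type values and st_info packing from the ELF gABI. */
#define STB_GLOBAL 1
#define STT_NOTYPE 0
#define STT_FUNC   2
#define ELF_ST_INFO(bind, type) (((bind) << 4) + ((type) & 0xf))
#define ELF_ST_TYPE(info)       ((info) & 0xf)

/* Hypothetical helper: pack binding and type into one st_info byte. */
uint8_t make_st_info(unsigned int bind, unsigned int type)
{
	assert(bind <= 0xf && type <= 0xf);
	return (uint8_t)ELF_ST_INFO(bind, type);
}

/* Hypothetical helper: is this symbol marked as a callable function? */
int symbol_is_function(uint8_t st_info)
{
	return ELF_ST_TYPE(st_info) == STT_FUNC;
}
```

A label that is only made .globl ends up as STT_NOTYPE, so
symbol_is_function(make_st_info(STB_GLOBAL, STT_NOTYPE)) is 0; with the
.type directive the symbol becomes STT_FUNC and the check returns 1.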

* [tip:x86/debug] x86/asm/crypto: Move .Lbswap_mask data to .rodata section
  2016-01-21 22:49 ` [PATCH 11/33] x86/asm/crypto: Move .Lbswap_mask data to .rodata section Josh Poimboeuf
  2016-02-23  8:58   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: davem, linux-kernel, torvalds, herbert, bp, bernd, hpa, peterz,
	luto, mmarek, palves, jpoimboe, tglx, acme, namhyung, mingo,
	jslaby, akpm, dvlasenk, chris.j.arges, bp, luto, brgerst

Commit-ID:  1253cab8a35278c0493a48aa5ab74992f7a849ea
Gitweb:     http://git.kernel.org/tip/1253cab8a35278c0493a48aa5ab74992f7a849ea
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:15 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:42 +0100

x86/asm/crypto: Move .Lbswap_mask data to .rodata section

stacktool reports the following warning:

  stacktool: arch/x86/crypto/aesni-intel_asm.o: _aesni_inc_init(): can't find starting instruction

stacktool gets confused when it tries to disassemble the following data
in the .text section:

  .Lbswap_mask:
          .byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Move it to .rodata which is a more appropriate section for read-only
data.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/b6a2f3f8bda705143e127c025edb2b53c86e6eb4.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/aesni-intel_asm.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 6bd2c6c..c44cfed 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -2538,9 +2538,11 @@ ENTRY(aesni_cbc_dec)
 ENDPROC(aesni_cbc_dec)
 
 #ifdef __x86_64__
+.pushsection .rodata
 .align 16
 .Lbswap_mask:
 	.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
+.popsection
 
 /*
  * _aesni_inc_init:	internal ABI

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/crypto: Move jump_table to .rodata section
  2016-01-21 22:49 ` [PATCH 12/33] x86/asm/crypto: Move jump_table " Josh Poimboeuf
  2016-02-23  8:58   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: palves, bernd, chris.j.arges, dvlasenk, peterz, jslaby, brgerst,
	herbert, mmarek, davem, luto, hpa, torvalds, luto, mingo, akpm,
	bp, acme, linux-kernel, namhyung, jpoimboe, bp, tglx

Commit-ID:  f66f61919eb38c5f20e8a978cae7ecdede4c23b9
Gitweb:     http://git.kernel.org/tip/f66f61919eb38c5f20e8a978cae7ecdede4c23b9
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:16 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:42 +0100

x86/asm/crypto: Move jump_table to .rodata section

stacktool reports the following warning:

  stacktool: arch/x86/crypto/crc32c-pcl-intel-asm_64.o: crc_pcl()+0x11dd: can't decode instruction

It gets confused when trying to decode jump_table data.  Move jump_table
to the .rodata section which is a more appropriate home for read-only
data.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/1dbf80c097bb9d89c0cbddc01a815ada690e3b32.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 4fe27e0..dc05f01 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -170,8 +170,8 @@ continue_block:
 	## branch into array
 	lea	jump_table(%rip), bufp
 	movzxw  (bufp, %rax, 2), len
-	offset=crc_array-jump_table
-	lea     offset(bufp, len, 1), bufp
+	lea	crc_array(%rip), bufp
+	lea     (bufp, len, 1), bufp
 	jmp     *bufp
 
 	################################################################
@@ -310,7 +310,9 @@ do_return:
 	popq    %rdi
 	popq    %rbx
         ret
+ENDPROC(crc_pcl)
 
+.section	.rodata, "a", %progbits
         ################################################################
         ## jump table        Table is 129 entries x 2 bytes each
         ################################################################
@@ -324,13 +326,11 @@ JMPTBL_ENTRY %i
 	i=i+1
 .endr
 
-ENDPROC(crc_pcl)
 
 	################################################################
 	## PCLMULQDQ tables
 	## Table is 128 entries x 2 words (8 bytes) each
 	################################################################
-.section	.rodata, "a", %progbits
 .align 8
 K_table:
 	.long 0x493c7d27, 0x00000001

^ permalink raw reply related	[flat|nested] 133+ messages in thread
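
The first hunk of the jump_table patch above replaces the
"offset=crc_array-jump_table" arithmetic with a second RIP-relative lea.
Presumably this is needed because the difference between the two labels
is no longer an assemble-time constant once jump_table lives in .rodata;
the changelog does not say so explicitly. A sketch showing that both
sequences compute the same branch target, with addresses modelled as
plain integers (the values passed in are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Old sequence: start from jump_table, then add a constant offset. */
uint64_t target_old(uint64_t jump_table, uint64_t crc_array, uint16_t len)
{
	uint64_t bufp = jump_table;			/* lea jump_table(%rip), bufp */
	uint64_t offset = crc_array - jump_table;	/* offset=crc_array-jump_table */

	return bufp + len + offset;			/* lea offset(bufp, len, 1), bufp */
}

/* New sequence: load the address of crc_array directly. */
uint64_t target_new(uint64_t crc_array, uint16_t len)
{
	uint64_t bufp = crc_array;	/* lea crc_array(%rip), bufp */

	return bufp + len;		/* lea (bufp, len, 1), bufp */
}

/* Both sequences must agree wherever the two labels end up. */
void check_equivalent(uint64_t jump_table, uint64_t crc_array, uint16_t len)
{
	assert(target_old(jump_table, crc_array, len) ==
	       target_new(crc_array, len));
}
```

In both versions len is the 16-bit entry loaded from jump_table by the
preceding movzxw; only the base address computation changes.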

* [tip:x86/debug] x86/asm/crypto: Simplify stack usage in sha-mb functions
  2016-01-21 22:49 ` [PATCH 13/33] x86/asm/crypto: Simplify stack usage in sha-mb functions Josh Poimboeuf
  2016-02-23  8:59   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, acme, mingo, hpa, torvalds, akpm, brgerst, peterz,
	bernd, luto, jslaby, palves, tglx, mmarek, bp, jpoimboe,
	linux-kernel, chris.j.arges, luto, namhyung

Commit-ID:  aec4d0e301f17bb143341c82cc44685b8af0b945
Gitweb:     http://git.kernel.org/tip/aec4d0e301f17bb143341c82cc44685b8af0b945
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:17 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:42 +0100

x86/asm/crypto: Simplify stack usage in sha-mb functions

sha1_mb_mgr_flush_avx2() and sha1_mb_mgr_submit_avx2() both allocate a
lot of stack space which is never used.  Also, many of the registers
being saved aren't being clobbered so there's no need to save them.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/9402e4d87580d6b2376ed95f67b84bdcce3c830e.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S  | 32 ++----------------------
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S | 29 +++------------------
 2 files changed, 6 insertions(+), 55 deletions(-)

diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
index 85c4e1c..672eaeb 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
@@ -86,16 +86,6 @@
 #define extra_blocks    %arg2
 #define p               %arg2
 
-
-# STACK_SPACE needs to be an odd multiple of 8
-_XMM_SAVE_SIZE  = 10*16
-_GPR_SAVE_SIZE  = 8*8
-_ALIGN_SIZE     = 8
-
-_XMM_SAVE       = 0
-_GPR_SAVE       = _XMM_SAVE + _XMM_SAVE_SIZE
-STACK_SPACE     = _GPR_SAVE + _GPR_SAVE_SIZE + _ALIGN_SIZE
-
 .macro LABEL prefix n
 \prefix\n\():
 .endm
@@ -113,16 +103,7 @@ offset = \_offset
 # JOB* sha1_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
 ENTRY(sha1_mb_mgr_flush_avx2)
-	mov	%rsp, %r10
-	sub     $STACK_SPACE, %rsp
-	and     $~31, %rsp
-	mov     %rbx, _GPR_SAVE(%rsp)
-	mov     %r10, _GPR_SAVE+8*1(%rsp) #save rsp
-	mov	%rbp, _GPR_SAVE+8*3(%rsp)
-	mov	%r12, _GPR_SAVE+8*4(%rsp)
-	mov	%r13, _GPR_SAVE+8*5(%rsp)
-	mov	%r14, _GPR_SAVE+8*6(%rsp)
-	mov	%r15, _GPR_SAVE+8*7(%rsp)
+	push	%rbx
 
 	# If bit (32+3) is set, then all lanes are empty
 	mov     _unused_lanes(state), unused_lanes
@@ -230,16 +211,7 @@ len_is_0:
 	mov     tmp2_w, offset(job_rax)
 
 return:
-
-	mov     _GPR_SAVE(%rsp), %rbx
-	mov     _GPR_SAVE+8*1(%rsp), %r10 #saved rsp
-	mov	_GPR_SAVE+8*3(%rsp), %rbp
-	mov	_GPR_SAVE+8*4(%rsp), %r12
-	mov	_GPR_SAVE+8*5(%rsp), %r13
-	mov	_GPR_SAVE+8*6(%rsp), %r14
-	mov	_GPR_SAVE+8*7(%rsp), %r15
-	mov     %r10, %rsp
-
+	pop	%rbx
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index 2ab9560..a5a14c6 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -94,25 +94,12 @@ DWORD_tmp	= %r9d
 
 lane_data       = %r10
 
-# STACK_SPACE needs to be an odd multiple of 8
-STACK_SPACE     = 8*8 + 16*10 + 8
-
 # JOB* submit_mb_mgr_submit_avx2(MB_MGR *state, job_sha1 *job)
 # arg 1 : rcx : state
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
-
-	mov	%rsp, %r10
-	sub     $STACK_SPACE, %rsp
-	and	$~31, %rsp
-
-	mov     %rbx, (%rsp)
-	mov	%r10, 8*2(%rsp)	#save old rsp
-	mov     %rbp, 8*3(%rsp)
-	mov	%r12, 8*4(%rsp)
-	mov	%r13, 8*5(%rsp)
-	mov	%r14, 8*6(%rsp)
-	mov	%r15, 8*7(%rsp)
+	push	%rbx
+	push	%rbp
 
 	mov     _unused_lanes(state), unused_lanes
 	mov	unused_lanes, lane
@@ -203,16 +190,8 @@ len_is_0:
 	movl    DWORD_tmp, _result_digest+1*16(job_rax)
 
 return:
-
-	mov     (%rsp), %rbx
-	mov	8*2(%rsp), %r10	#save old rsp
-	mov     8*3(%rsp), %rbp
-	mov	8*4(%rsp), %r12
-	mov	8*5(%rsp), %r13
-	mov	8*6(%rsp), %r14
-	mov	8*7(%rsp), %r15
-	mov     %r10, %rsp
-
+	pop	%rbp
+	pop	%rbx
 	ret
 
 return_null:

^ permalink raw reply related	[flat|nested] 133+ messages in thread
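
For a sense of scale in the sha-mb patch above, the removed constants
can be worked through in C. This is only arithmetic on the values shown
in the diff; the helper names and the starting %rsp values are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
	XMM_SAVE_SIZE = 10 * 16,	/* _XMM_SAVE_SIZE */
	GPR_SAVE_SIZE = 8 * 8,		/* _GPR_SAVE_SIZE */
	ALIGN_SIZE    = 8,		/* _ALIGN_SIZE */
	STACK_SPACE   = XMM_SAVE_SIZE + GPR_SAVE_SIZE + ALIGN_SIZE
};

/* Old prologue: "sub $STACK_SPACE, %rsp; and $~31, %rsp". */
uint64_t old_prologue_rsp(uint64_t rsp)
{
	assert(rsp >= STACK_SPACE);
	return (rsp - STACK_SPACE) & ~(uint64_t)31;
}

/* New prologue: one 8-byte push per saved register. */
uint64_t new_prologue_rsp(uint64_t rsp, unsigned int pushes)
{
	assert(rsp >= 8u * pushes);
	return rsp - 8u * pushes;
}
```

STACK_SPACE works out to 232 bytes, i.e. 29 * 8, the "odd multiple of 8"
the removed comment asks for. The old prologue therefore reserved at
least 232 bytes, whereas the new one uses 8 bytes in the flush function
(one push) and 16 bytes in the submit function (two pushes).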

* [tip:x86/debug] x86/asm/crypto: Don't use RBP as a scratch register
  2016-01-21 22:49 ` [PATCH 14/33] x86/asm/crypto: Don't use rbp as a scratch register Josh Poimboeuf
  2016-02-23  8:59   ` [tip:x86/debug] x86/asm/crypto: Don't use RBP " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, hpa, jpoimboe, bp, torvalds, brgerst, palves, luto, tglx,
	namhyung, acme, mingo, peterz, dvlasenk, linux-kernel, bernd,
	chris.j.arges, mmarek, akpm, jslaby

Commit-ID:  68874ac3304ade7ed5ebb12af00d6b9bbbca0a16
Gitweb:     http://git.kernel.org/tip/68874ac3304ade7ed5ebb12af00d6b9bbbca0a16
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:18 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:42 +0100

x86/asm/crypto: Don't use RBP as a scratch register

The frame pointer (RBP) is getting clobbered in
sha1_mb_mgr_submit_avx2() before a function call, which can mess up
stack traces.  Use R12 instead.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/15a3eb7ebe68e37755927915f45e4f0bde4d18c5.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index a5a14c6..c3b9447 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -86,8 +86,8 @@ job_rax         = %rax
 len             = %rax
 DWORD_len	= %eax
 
-lane            = %rbp
-tmp3            = %rbp
+lane            = %r12
+tmp3            = %r12
 
 tmp             = %r9
 DWORD_tmp	= %r9d
@@ -99,7 +99,7 @@ lane_data       = %r10
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
 	push	%rbx
-	push	%rbp
+	push	%r12
 
 	mov     _unused_lanes(state), unused_lanes
 	mov	unused_lanes, lane
@@ -190,7 +190,7 @@ len_is_0:
 	movl    DWORD_tmp, _result_digest+1*16(job_rax)
 
 return:
-	pop	%rbp
+	pop	%r12
 	pop	%rbx
 	ret
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/crypto: Create stack frames in crypto functions
  2016-01-21 22:49 ` [PATCH 15/33] x86/asm/crypto: Create stack frames in crypto functions Josh Poimboeuf
  2016-02-23  8:59   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, namhyung, chris.j.arges, jpoimboe, luto, mingo, acme,
	mmarek, torvalds, bernd, herbert, tglx, davem, brgerst,
	linux-kernel, dvlasenk, luto, bp, palves, hpa, jslaby, akpm

Commit-ID:  8691ccd764f9ecc69a6812dfe76214c86ac9ba06
Gitweb:     http://git.kernel.org/tip/8691ccd764f9ecc69a6812dfe76214c86ac9ba06
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:19 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:43 +0100

x86/asm/crypto: Create stack frames in crypto functions

The crypto code has several callable non-leaf functions which don't
honor CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6c20192bcf1102ae18ae5a242cabf30ce9b29895.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/aesni-intel_asm.S                | 73 +++++++++++++++---------
 arch/x86/crypto/camellia-aesni-avx-asm_64.S      | 15 +++++
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S     | 15 +++++
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S        |  9 +++
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S        | 13 +++++
 arch/x86/crypto/ghash-clmulni-intel_asm.S        |  5 ++
 arch/x86/crypto/serpent-avx-x86_64-asm_64.S      | 13 +++++
 arch/x86/crypto/serpent-avx2-asm_64.S            | 13 +++++
 arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S  |  3 +
 arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S |  3 +
 arch/x86/crypto/twofish-avx-x86_64-asm_64.S      | 13 +++++
 11 files changed, 148 insertions(+), 27 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index c44cfed..383a6f8 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -31,6 +31,7 @@
 
 #include <linux/linkage.h>
 #include <asm/inst.h>
+#include <asm/frame.h>
 
 /*
  * The following macros are used to move an (un)aligned 16 byte value to/from
@@ -1800,11 +1801,12 @@ ENDPROC(_key_expansion_256b)
  *                   unsigned int key_len)
  */
 ENTRY(aesni_set_key)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
-	movl 8(%esp), KEYP		# ctx
-	movl 12(%esp), UKEYP		# in_key
-	movl 16(%esp), %edx		# key_len
+	movl (FRAME_OFFSET+8)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+12)(%esp), UKEYP	# in_key
+	movl (FRAME_OFFSET+16)(%esp), %edx	# key_len
 #endif
 	movups (UKEYP), %xmm0		# user key (first 16 bytes)
 	movaps %xmm0, (KEYP)
@@ -1905,6 +1907,7 @@ ENTRY(aesni_set_key)
 #ifndef __x86_64__
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_set_key)
 
@@ -1912,12 +1915,13 @@ ENDPROC(aesni_set_key)
  * void aesni_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  */
 ENTRY(aesni_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
 	pushl KLEN
-	movl 12(%esp), KEYP
-	movl 16(%esp), OUTP
-	movl 20(%esp), INP
+	movl (FRAME_OFFSET+12)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+16)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+20)(%esp), INP	# src
 #endif
 	movl 480(KEYP), KLEN		# key length
 	movups (INP), STATE		# input
@@ -1927,6 +1931,7 @@ ENTRY(aesni_enc)
 	popl KLEN
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_enc)
 
@@ -2101,12 +2106,13 @@ ENDPROC(_aesni_enc4)
  * void aesni_dec (struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  */
 ENTRY(aesni_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
 	pushl KLEN
-	movl 12(%esp), KEYP
-	movl 16(%esp), OUTP
-	movl 20(%esp), INP
+	movl (FRAME_OFFSET+12)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+16)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+20)(%esp), INP	# src
 #endif
 	mov 480(KEYP), KLEN		# key length
 	add $240, KEYP
@@ -2117,6 +2123,7 @@ ENTRY(aesni_dec)
 	popl KLEN
 	popl KEYP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_dec)
 
@@ -2292,14 +2299,15 @@ ENDPROC(_aesni_dec4)
  *		      size_t len)
  */
 ENTRY(aesni_ecb_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 16(%esp), KEYP
-	movl 20(%esp), OUTP
-	movl 24(%esp), INP
-	movl 28(%esp), LEN
+	movl (FRAME_OFFSET+16)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+20)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+24)(%esp), INP	# src
+	movl (FRAME_OFFSET+28)(%esp), LEN	# len
 #endif
 	test LEN, LEN		# check length
 	jz .Lecb_enc_ret
@@ -2342,6 +2350,7 @@ ENTRY(aesni_ecb_enc)
 	popl KEYP
 	popl LEN
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_ecb_enc)
 
@@ -2350,14 +2359,15 @@ ENDPROC(aesni_ecb_enc)
  *		      size_t len);
  */
 ENTRY(aesni_ecb_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 16(%esp), KEYP
-	movl 20(%esp), OUTP
-	movl 24(%esp), INP
-	movl 28(%esp), LEN
+	movl (FRAME_OFFSET+16)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+20)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+24)(%esp), INP	# src
+	movl (FRAME_OFFSET+28)(%esp), LEN	# len
 #endif
 	test LEN, LEN
 	jz .Lecb_dec_ret
@@ -2401,6 +2411,7 @@ ENTRY(aesni_ecb_dec)
 	popl KEYP
 	popl LEN
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_ecb_dec)
 
@@ -2409,16 +2420,17 @@ ENDPROC(aesni_ecb_dec)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_cbc_enc)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl IVP
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 20(%esp), KEYP
-	movl 24(%esp), OUTP
-	movl 28(%esp), INP
-	movl 32(%esp), LEN
-	movl 36(%esp), IVP
+	movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+28)(%esp), INP	# src
+	movl (FRAME_OFFSET+32)(%esp), LEN	# len
+	movl (FRAME_OFFSET+36)(%esp), IVP	# iv
 #endif
 	cmp $16, LEN
 	jb .Lcbc_enc_ret
@@ -2443,6 +2455,7 @@ ENTRY(aesni_cbc_enc)
 	popl LEN
 	popl IVP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_cbc_enc)
 
@@ -2451,16 +2464,17 @@ ENDPROC(aesni_cbc_enc)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_cbc_dec)
+	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl IVP
 	pushl LEN
 	pushl KEYP
 	pushl KLEN
-	movl 20(%esp), KEYP
-	movl 24(%esp), OUTP
-	movl 28(%esp), INP
-	movl 32(%esp), LEN
-	movl 36(%esp), IVP
+	movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
+	movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
+	movl (FRAME_OFFSET+28)(%esp), INP	# src
+	movl (FRAME_OFFSET+32)(%esp), LEN	# len
+	movl (FRAME_OFFSET+36)(%esp), IVP	# iv
 #endif
 	cmp $16, LEN
 	jb .Lcbc_dec_just_ret
@@ -2534,6 +2548,7 @@ ENTRY(aesni_cbc_dec)
 	popl LEN
 	popl IVP
 #endif
+	FRAME_END
 	ret
 ENDPROC(aesni_cbc_dec)
 
@@ -2600,6 +2615,7 @@ ENDPROC(_aesni_inc)
  *		      size_t len, u8 *iv)
  */
 ENTRY(aesni_ctr_enc)
+	FRAME_BEGIN
 	cmp $16, LEN
 	jb .Lctr_enc_just_ret
 	mov 480(KEYP), KLEN
@@ -2653,6 +2669,7 @@ ENTRY(aesni_ctr_enc)
 .Lctr_enc_ret:
 	movups IV, (IVP)
 .Lctr_enc_just_ret:
+	FRAME_END
 	ret
 ENDPROC(aesni_ctr_enc)
 
@@ -2679,6 +2696,7 @@ ENDPROC(aesni_ctr_enc)
  *			 bool enc, u8 *iv)
  */
 ENTRY(aesni_xts_crypt8)
+	FRAME_BEGIN
 	cmpb $0, %cl
 	movl $0, %ecx
 	movl $240, %r10d
@@ -2779,6 +2797,7 @@ ENTRY(aesni_xts_crypt8)
 	pxor INC, STATE4
 	movdqu STATE4, 0x70(OUTP)
 
+	FRAME_END
 	ret
 ENDPROC(aesni_xts_crypt8)
 
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index ce71f92..aa9e8bd 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -16,6 +16,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 #define CAMELLIA_TABLE_BYTE_LEN 272
 
@@ -726,6 +727,7 @@ __camellia_enc_blk16:
 	 *	%xmm0..%xmm15: 16 encrypted blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 16(%rax), %rcx;
 
@@ -780,6 +782,7 @@ __camellia_enc_blk16:
 		    %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
 		    %xmm15, (key_table)(CTX, %r8, 8), (%rax), 1 * 16(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -812,6 +815,7 @@ __camellia_dec_blk16:
 	 *	%xmm0..%xmm15: 16 plaintext blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 16(%rax), %rcx;
 
@@ -865,6 +869,7 @@ __camellia_dec_blk16:
 		    %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
 		    %xmm15, (key_table)(CTX), (%rax), 1 * 16(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -890,6 +895,7 @@ ENTRY(camellia_ecb_enc_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	 FRAME_BEGIN
 
 	inpack16_pre(%xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
 		     %xmm8, %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14,
@@ -904,6 +910,7 @@ ENTRY(camellia_ecb_enc_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_enc_16way)
 
@@ -913,6 +920,7 @@ ENTRY(camellia_ecb_dec_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	 FRAME_BEGIN
 
 	cmpl $16, key_length(CTX);
 	movl $32, %r8d;
@@ -932,6 +940,7 @@ ENTRY(camellia_ecb_dec_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_dec_16way)
 
@@ -941,6 +950,7 @@ ENTRY(camellia_cbc_dec_16way)
 	 *	%rsi: dst (16 blocks)
 	 *	%rdx: src (16 blocks)
 	 */
+	FRAME_BEGIN
 
 	cmpl $16, key_length(CTX);
 	movl $32, %r8d;
@@ -981,6 +991,7 @@ ENTRY(camellia_cbc_dec_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_cbc_dec_16way)
 
@@ -997,6 +1008,7 @@ ENTRY(camellia_ctr_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
@@ -1092,6 +1104,7 @@ ENTRY(camellia_ctr_16way)
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ctr_16way)
 
@@ -1112,6 +1125,7 @@ camellia_xts_crypt_16way:
 	 *	%r8: index for input whitening key
 	 *	%r9: pointer to  __camellia_enc_blk16 or __camellia_dec_blk16
 	 */
+	FRAME_BEGIN
 
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
@@ -1234,6 +1248,7 @@ camellia_xts_crypt_16way:
 		     %xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
 		     %xmm8, %rsi);
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_xts_crypt_16way)
 
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 0e0b886..16186c1 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -11,6 +11,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 #define CAMELLIA_TABLE_BYTE_LEN 272
 
@@ -766,6 +767,7 @@ __camellia_enc_blk32:
 	 *	%ymm0..%ymm15: 32 encrypted blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 32(%rax), %rcx;
 
@@ -820,6 +822,7 @@ __camellia_enc_blk32:
 		    %ymm8, %ymm9, %ymm10, %ymm11, %ymm12, %ymm13, %ymm14,
 		    %ymm15, (key_table)(CTX, %r8, 8), (%rax), 1 * 32(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -852,6 +855,7 @@ __camellia_dec_blk32:
 	 *	%ymm0..%ymm15: 16 plaintext blocks, order swapped:
 	 *       7, 8, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
 	 */
+	FRAME_BEGIN
 
 	leaq 8 * 32(%rax), %rcx;
 
@@ -905,6 +909,7 @@ __camellia_dec_blk32:
 		    %ymm8, %ymm9, %ymm10, %ymm11, %ymm12, %ymm13, %ymm14,
 		    %ymm15, (key_table)(CTX), (%rax), 1 * 32(%rax));
 
+	FRAME_END
 	ret;
 
 .align 8
@@ -930,6 +935,7 @@ ENTRY(camellia_ecb_enc_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -948,6 +954,7 @@ ENTRY(camellia_ecb_enc_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_enc_32way)
 
@@ -957,6 +964,7 @@ ENTRY(camellia_ecb_dec_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -980,6 +988,7 @@ ENTRY(camellia_ecb_dec_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ecb_dec_32way)
 
@@ -989,6 +998,7 @@ ENTRY(camellia_cbc_dec_32way)
 	 *	%rsi: dst (32 blocks)
 	 *	%rdx: src (32 blocks)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1046,6 +1056,7 @@ ENTRY(camellia_cbc_dec_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_cbc_dec_32way)
 
@@ -1070,6 +1081,7 @@ ENTRY(camellia_ctr_32way)
 	 *	%rdx: src (32 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1184,6 +1196,7 @@ ENTRY(camellia_ctr_32way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_ctr_32way)
 
@@ -1216,6 +1229,7 @@ camellia_xts_crypt_32way:
 	 *	%r8: index for input whitening key
 	 *	%r9: pointer to  __camellia_enc_blk32 or __camellia_dec_blk32
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -1349,6 +1363,7 @@ camellia_xts_crypt_32way:
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(camellia_xts_crypt_32way)
 
diff --git a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
index c35fd5d..14fa196 100644
--- a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 .file "cast5-avx-x86_64-asm_64.S"
 
@@ -365,6 +366,7 @@ ENTRY(cast5_ecb_enc_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -388,6 +390,7 @@ ENTRY(cast5_ecb_enc_16way)
 	vmovdqu RR4, (6*4*4)(%r11);
 	vmovdqu RL4, (7*4*4)(%r11);
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ecb_enc_16way)
 
@@ -398,6 +401,7 @@ ENTRY(cast5_ecb_dec_16way)
 	 *	%rdx: src
 	 */
 
+	FRAME_BEGIN
 	movq %rsi, %r11;
 
 	vmovdqu (0*4*4)(%rdx), RL1;
@@ -420,6 +424,7 @@ ENTRY(cast5_ecb_dec_16way)
 	vmovdqu RR4, (6*4*4)(%r11);
 	vmovdqu RL4, (7*4*4)(%r11);
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ecb_dec_16way)
 
@@ -429,6 +434,7 @@ ENTRY(cast5_cbc_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -469,6 +475,7 @@ ENTRY(cast5_cbc_dec_16way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_cbc_dec_16way)
 
@@ -479,6 +486,7 @@ ENTRY(cast5_ctr_16way)
 	 *	%rdx: src
 	 *	%rcx: iv (big endian, 64bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -542,5 +550,6 @@ ENTRY(cast5_ctr_16way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast5_ctr_16way)
diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index e3531f8..c419389 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "cast6-avx-x86_64-asm_64.S"
@@ -349,6 +350,7 @@ ENTRY(cast6_ecb_enc_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -358,6 +360,7 @@ ENTRY(cast6_ecb_enc_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ecb_enc_8way)
 
@@ -367,6 +370,7 @@ ENTRY(cast6_ecb_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -376,6 +380,7 @@ ENTRY(cast6_ecb_dec_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ecb_dec_8way)
 
@@ -385,6 +390,7 @@ ENTRY(cast6_cbc_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -399,6 +405,7 @@ ENTRY(cast6_cbc_dec_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_cbc_dec_8way)
 
@@ -409,6 +416,7 @@ ENTRY(cast6_ctr_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -424,6 +432,7 @@ ENTRY(cast6_ctr_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_ctr_8way)
 
@@ -434,6 +443,7 @@ ENTRY(cast6_xts_enc_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -446,6 +456,7 @@ ENTRY(cast6_xts_enc_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_xts_enc_8way)
 
@@ -456,6 +467,7 @@ ENTRY(cast6_xts_dec_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -468,5 +480,6 @@ ENTRY(cast6_xts_dec_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(cast6_xts_dec_8way)
diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index 5d1e007..eed55c8 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -18,6 +18,7 @@
 
 #include <linux/linkage.h>
 #include <asm/inst.h>
+#include <asm/frame.h>
 
 .data
 
@@ -94,6 +95,7 @@ ENDPROC(__clmul_gf128mul_ble)
 
 /* void clmul_ghash_mul(char *dst, const u128 *shash) */
 ENTRY(clmul_ghash_mul)
+	FRAME_BEGIN
 	movups (%rdi), DATA
 	movups (%rsi), SHASH
 	movaps .Lbswap_mask, BSWAP
@@ -101,6 +103,7 @@ ENTRY(clmul_ghash_mul)
 	call __clmul_gf128mul_ble
 	PSHUFB_XMM BSWAP DATA
 	movups DATA, (%rdi)
+	FRAME_END
 	ret
 ENDPROC(clmul_ghash_mul)
 
@@ -109,6 +112,7 @@ ENDPROC(clmul_ghash_mul)
  *			   const u128 *shash);
  */
 ENTRY(clmul_ghash_update)
+	FRAME_BEGIN
 	cmp $16, %rdx
 	jb .Lupdate_just_ret	# check length
 	movaps .Lbswap_mask, BSWAP
@@ -128,5 +132,6 @@ ENTRY(clmul_ghash_update)
 	PSHUFB_XMM BSWAP DATA
 	movups DATA, (%rdi)
 .Lupdate_just_ret:
+	FRAME_END
 	ret
 ENDPROC(clmul_ghash_update)
diff --git a/arch/x86/crypto/serpent-avx-x86_64-asm_64.S b/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
index 2f202f4..8be5718 100644
--- a/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "serpent-avx-x86_64-asm_64.S"
@@ -681,6 +682,7 @@ ENTRY(serpent_ecb_enc_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -688,6 +690,7 @@ ENTRY(serpent_ecb_enc_8way_avx)
 
 	store_8way(%rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_enc_8way_avx)
 
@@ -697,6 +700,7 @@ ENTRY(serpent_ecb_dec_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -704,6 +708,7 @@ ENTRY(serpent_ecb_dec_8way_avx)
 
 	store_8way(%rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_dec_8way_avx)
 
@@ -713,6 +718,7 @@ ENTRY(serpent_cbc_dec_8way_avx)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	load_8way(%rdx, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
@@ -720,6 +726,7 @@ ENTRY(serpent_cbc_dec_8way_avx)
 
 	store_cbc_8way(%rdx, %rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_cbc_dec_8way_avx)
 
@@ -730,6 +737,7 @@ ENTRY(serpent_ctr_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	load_ctr_8way(%rcx, .Lbswap128_mask, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
 		      RD2, RK0, RK1, RK2);
@@ -738,6 +746,7 @@ ENTRY(serpent_ctr_8way_avx)
 
 	store_ctr_8way(%rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ctr_8way_avx)
 
@@ -748,6 +757,7 @@ ENTRY(serpent_xts_enc_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	/* regs <= src, dst <= IVs, regs <= regs xor IVs */
 	load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
@@ -758,6 +768,7 @@ ENTRY(serpent_xts_enc_8way_avx)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_enc_8way_avx)
 
@@ -768,6 +779,7 @@ ENTRY(serpent_xts_dec_8way_avx)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	/* regs <= src, dst <= IVs, regs <= regs xor IVs */
 	load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
@@ -778,5 +790,6 @@ ENTRY(serpent_xts_dec_8way_avx)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_dec_8way_avx)
diff --git a/arch/x86/crypto/serpent-avx2-asm_64.S b/arch/x86/crypto/serpent-avx2-asm_64.S
index b222085..97c48ad 100644
--- a/arch/x86/crypto/serpent-avx2-asm_64.S
+++ b/arch/x86/crypto/serpent-avx2-asm_64.S
@@ -15,6 +15,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx2.S"
 
 .file "serpent-avx2-asm_64.S"
@@ -673,6 +674,7 @@ ENTRY(serpent_ecb_enc_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -684,6 +686,7 @@ ENTRY(serpent_ecb_enc_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_enc_16way)
 
@@ -693,6 +696,7 @@ ENTRY(serpent_ecb_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -704,6 +708,7 @@ ENTRY(serpent_ecb_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ecb_dec_16way)
 
@@ -713,6 +718,7 @@ ENTRY(serpent_cbc_dec_16way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -725,6 +731,7 @@ ENTRY(serpent_cbc_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_cbc_dec_16way)
 
@@ -735,6 +742,7 @@ ENTRY(serpent_ctr_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -748,6 +756,7 @@ ENTRY(serpent_ctr_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_ctr_16way)
 
@@ -758,6 +767,7 @@ ENTRY(serpent_xts_enc_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -772,6 +782,7 @@ ENTRY(serpent_xts_enc_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_enc_16way)
 
@@ -782,6 +793,7 @@ ENTRY(serpent_xts_dec_16way)
 	 *	%rdx: src (16 blocks)
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	vzeroupper;
 
@@ -796,5 +808,6 @@ ENTRY(serpent_xts_dec_16way)
 
 	vzeroupper;
 
+	FRAME_END
 	ret;
 ENDPROC(serpent_xts_dec_16way)
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
index 672eaeb..96df6a3 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_flush_avx2.S
@@ -52,6 +52,7 @@
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "sha1_mb_mgr_datastruct.S"
 
 
@@ -103,6 +104,7 @@ offset = \_offset
 # JOB* sha1_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
 ENTRY(sha1_mb_mgr_flush_avx2)
+	FRAME_BEGIN
 	push	%rbx
 
 	# If bit (32+3) is set, then all lanes are empty
@@ -212,6 +214,7 @@ len_is_0:
 
 return:
 	pop	%rbx
+	FRAME_END
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
index c3b9447..1435acf 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha-mb/sha1_mb_mgr_submit_avx2.S
@@ -53,6 +53,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "sha1_mb_mgr_datastruct.S"
 
 
@@ -98,6 +99,7 @@ lane_data       = %r10
 # arg 1 : rcx : state
 # arg 2 : rdx : job
 ENTRY(sha1_mb_mgr_submit_avx2)
+	FRAME_BEGIN
 	push	%rbx
 	push	%r12
 
@@ -192,6 +194,7 @@ len_is_0:
 return:
 	pop	%r12
 	pop	%rbx
+	FRAME_END
 	ret
 
 return_null:
diff --git a/arch/x86/crypto/twofish-avx-x86_64-asm_64.S b/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
index 0505813..dc66273 100644
--- a/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
@@ -24,6 +24,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/frame.h>
 #include "glue_helper-asm-avx.S"
 
 .file "twofish-avx-x86_64-asm_64.S"
@@ -333,6 +334,7 @@ ENTRY(twofish_ecb_enc_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -342,6 +344,7 @@ ENTRY(twofish_ecb_enc_8way)
 
 	store_8way(%r11, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ecb_enc_8way)
 
@@ -351,6 +354,7 @@ ENTRY(twofish_ecb_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -360,6 +364,7 @@ ENTRY(twofish_ecb_dec_8way)
 
 	store_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ecb_dec_8way)
 
@@ -369,6 +374,7 @@ ENTRY(twofish_cbc_dec_8way)
 	 *	%rsi: dst
 	 *	%rdx: src
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -383,6 +389,7 @@ ENTRY(twofish_cbc_dec_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_cbc_dec_8way)
 
@@ -393,6 +400,7 @@ ENTRY(twofish_ctr_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (little endian, 128bit)
 	 */
+	FRAME_BEGIN
 
 	pushq %r12;
 
@@ -408,6 +416,7 @@ ENTRY(twofish_ctr_8way)
 
 	popq %r12;
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_ctr_8way)
 
@@ -418,6 +427,7 @@ ENTRY(twofish_xts_enc_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -430,6 +440,7 @@ ENTRY(twofish_xts_enc_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_xts_enc_8way)
 
@@ -440,6 +451,7 @@ ENTRY(twofish_xts_dec_8way)
 	 *	%rdx: src
 	 *	%rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
 	 */
+	FRAME_BEGIN
 
 	movq %rsi, %r11;
 
@@ -452,5 +464,6 @@ ENTRY(twofish_xts_dec_8way)
 	/* dst <= regs xor IVs(in dst) */
 	store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
 
+	FRAME_END
 	ret;
 ENDPROC(twofish_xts_dec_8way)

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/entry: Create stack frames in thunk functions
  2016-01-21 22:49 ` [PATCH 16/33] x86/asm/entry: Create stack frames in thunk functions Josh Poimboeuf
  2016-02-23  9:00   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, brgerst, peterz, torvalds, mingo, akpm, bernd,
	linux-kernel, bp, luto, dvlasenk, chris.j.arges, jslaby,
	jpoimboe, bp, palves, tglx, mmarek, hpa, namhyung, luto

Commit-ID:  058fb73274f9e7eb72acc9836cbb2c4a9f9659a0
Gitweb:     http://git.kernel.org/tip/058fb73274f9e7eb72acc9836cbb2c4a9f9659a0
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:20 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:43 +0100

x86/asm/entry: Create stack frames in thunk functions

Thunk functions are callable non-leaf functions that don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.  Also, they
aren't annotated as ELF callable functions, which can confuse tooling.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled and
add the ELF function type.
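
As a rough illustration of why the ELF function type matters to tooling,
the sketch below shows how a tool might decide whether a symbol is a
function. It assumes a Linux host with <elf.h>; sym_is_function() is a
hypothetical helper written for this example and is not taken from any
existing tool. A label declared with ".type name, @function" is recorded
as STT_FUNC, while a plain ".globl" label is left as STT_NOTYPE.

```c
#include <elf.h>
#include <stdbool.h>

/*
 * Hypothetical helper: report whether a 64-bit ELF symbol carries the
 * function type. The low four bits of st_info hold the symbol type.
 */
bool sym_is_function(const Elf64_Sym *sym)
{
	return ELF64_ST_TYPE(sym->st_info) == STT_FUNC;
}
```

A tool that filters on this check would skip an unannotated thunk
entirely, even though the symbol is callable.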

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/4373e5bff459b9fd66ce5d45bfcc881a5c202643.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/thunk_64.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index efb2b93..98df1fa 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -8,11 +8,14 @@
 #include <linux/linkage.h>
 #include "calling.h"
 #include <asm/asm.h>
+#include <asm/frame.h>
 
 	/* rdi:	arg1 ... normal C conventions. rax is saved/restored. */
 	.macro THUNK name, func, put_ret_addr_in_rdi=0
 	.globl \name
+	.type \name, @function
 \name:
+	FRAME_BEGIN
 
 	/* this one pushes 9 elems, the next one would be %rIP */
 	pushq %rdi
@@ -62,6 +65,7 @@ restore:
 	popq %rdx
 	popq %rsi
 	popq %rdi
+	FRAME_END
 	ret
 	_ASM_NOKPROBE(restore)
 #endif

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()
  2016-01-21 22:49 ` [PATCH 17/33] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel() Josh Poimboeuf
  2016-02-23  9:00   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, len.brown, pavel, bernd, akpm, jslaby, bp, bp, tglx,
	dvlasenk, linux-kernel, namhyung, brgerst, hpa, palves, peterz,
	mmarek, chris.j.arges, jpoimboe, acme, rafael.j.wysocki, luto,
	mingo, luto

Commit-ID:  13523309495cdbd57a0d344c0d5d574987af007f
Gitweb:     http://git.kernel.org/tip/13523309495cdbd57a0d344c0d5d574987af007f
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:21 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:43 +0100

x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()

do_suspend_lowlevel() is a callable non-leaf function which doesn't
honor CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame for it when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/7383d87dd40a460e0d757a0793498b9d06a7ee0d.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/acpi/wakeup_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 8c35df4..169963f 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -5,6 +5,7 @@
 #include <asm/page_types.h>
 #include <asm/msr.h>
 #include <asm/asm-offsets.h>
+#include <asm/frame.h>
 
 # Copyright 2003 Pavel Machek <pavel@suse.cz>, distribute under GPLv2
 
@@ -39,6 +40,7 @@ bogus_64_magic:
 	jmp	bogus_64_magic
 
 ENTRY(do_suspend_lowlevel)
+	FRAME_BEGIN
 	subq	$8, %rsp
 	xorl	%eax, %eax
 	call	save_processor_state
@@ -109,6 +111,7 @@ ENTRY(do_suspend_lowlevel)
 
 	xorl	%eax, %eax
 	addq	$8, %rsp
+	FRAME_END
 	jmp	restore_processor_state
 ENDPROC(do_suspend_lowlevel)
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm: Create stack frames in rwsem functions
  2016-01-21 22:49 ` [PATCH 18/33] x86/asm: Create stack frames in rwsem functions Josh Poimboeuf
  2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mmarek, bp, peterz, luto, luto, acme, tglx, jpoimboe, bernd,
	brgerst, palves, jslaby, akpm, namhyung, dvlasenk, mingo,
	torvalds, linux-kernel, hpa, bp, chris.j.arges

Commit-ID:  3387a535ce629906d849864ef6a3c3437a645cb5
Gitweb:     http://git.kernel.org/tip/3387a535ce629906d849864ef6a3c3437a645cb5
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:22 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:43 +0100

x86/asm: Create stack frames in rwsem functions

rwsem.S has several callable non-leaf functions which don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/ad0932bbead975b15f9578e4f2cf2ee5961eb840.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/lib/rwsem.S | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/rwsem.S b/arch/x86/lib/rwsem.S
index 40027db..be110ef 100644
--- a/arch/x86/lib/rwsem.S
+++ b/arch/x86/lib/rwsem.S
@@ -15,6 +15,7 @@
 
 #include <linux/linkage.h>
 #include <asm/alternative-asm.h>
+#include <asm/frame.h>
 
 #define __ASM_HALF_REG(reg)	__ASM_SEL(reg, e##reg)
 #define __ASM_HALF_SIZE(inst)	__ASM_SEL(inst##w, inst##l)
@@ -84,24 +85,29 @@
 
 /* Fix up special calling conventions */
 ENTRY(call_rwsem_down_read_failed)
+	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
 	movq %rax,%rdi
 	call rwsem_down_read_failed
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_down_read_failed)
 
 ENTRY(call_rwsem_down_write_failed)
+	FRAME_BEGIN
 	save_common_regs
 	movq %rax,%rdi
 	call rwsem_down_write_failed
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_down_write_failed)
 
 ENTRY(call_rwsem_wake)
+	FRAME_BEGIN
 	/* do nothing if still outstanding active readers */
 	__ASM_HALF_SIZE(dec) %__ASM_HALF_REG(dx)
 	jnz 1f
@@ -109,15 +115,18 @@ ENTRY(call_rwsem_wake)
 	movq %rax,%rdi
 	call rwsem_wake
 	restore_common_regs
-1:	ret
+1:	FRAME_END
+	ret
 ENDPROC(call_rwsem_wake)
 
 ENTRY(call_rwsem_downgrade_wake)
+	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
 	movq %rax,%rdi
 	call rwsem_downgrade_wake
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	restore_common_regs
+	FRAME_END
 	ret
 ENDPROC(call_rwsem_downgrade_wake)
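
To illustrate why a non-leaf function without a frame can produce a bad
stack trace, here is a minimal user-space simulation in C.  It assumes
FRAME_BEGIN has the effect of the usual "push %rbp; mov %rsp, %rbp"
prologue.  The simulated stack, the numeric function ids, and all helper
names (sim_call, sim_frame_begin, unwind, trace_from_c) are hypothetical
and do not appear in the kernel; this is a sketch of the idea, not of the
kernel's unwinder.

```c
#include <assert.h>

enum { STACK_WORDS = 64, CHAIN_END = -1 };

/* Simulated machine state: a downward-growing stack plus sp and bp. */
static long stack_mem[STACK_WORDS];
static int sp, bp;

static void push(long v)
{
	assert(sp > 0);
	stack_mem[--sp] = v;
}

/* "call": push a return address; here it is just the caller's id. */
static void sim_call(long caller_id)
{
	push(caller_id);
}

/* Assumed effect of FRAME_BEGIN: save the old bp, then point bp at it. */
static void sim_frame_begin(void)
{
	push(bp);
	bp = sp;
}

/* Frame-pointer unwinder: follow saved-bp links, collecting return ids. */
static int unwind(long *out, int max)
{
	int n = 0;

	for (int f = bp; f != CHAIN_END && n < max; f = (int)stack_mem[f])
		out[n++] = stack_mem[f + 1];
	return n;
}

/*
 * Call chain: start(0) -> main(1) -> a(2) -> b(3) -> c(4).
 * Only b's prologue is optional; the trace is taken inside c.
 */
int trace_from_c(int b_has_frame, long *out, int max)
{
	sp = STACK_WORDS;
	bp = CHAIN_END;

	sim_call(0); sim_frame_begin();		/* main */
	sim_call(1); sim_frame_begin();		/* a */
	sim_call(2);				/* b is entered */
	if (b_has_frame)
		sim_frame_begin();
	sim_call(3); sim_frame_begin();		/* c */

	return unwind(out, max);
}
```

With every function framed, the trace taken in c is 3, 2, 1, 0.  If b
skips its prologue, the trace is 3, 1, 0: the return address into a
(id 2) is never visited, so b's caller silently drops out of the trace.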

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/efi: Create a stack frame in efi_call()
  2016-01-21 22:49 ` [PATCH 19/33] x86/asm/efi: Create a stack frame in efi_call() Josh Poimboeuf
  2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, torvalds, brgerst, mingo, palves, matt, namhyung, akpm,
	luto, luto, peterz, chris.j.arges, mmarek, hpa, dvlasenk, bp,
	bernd, jpoimboe, linux-kernel, jslaby, tglx, bp

Commit-ID:  779c433b8ea5c9fdfb892265b2ca6213d1f12ff8
Gitweb:     http://git.kernel.org/tip/779c433b8ea5c9fdfb892265b2ca6213d1f12ff8
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:23 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:43 +0100

x86/asm/efi: Create a stack frame in efi_call()

efi_call() is a callable non-leaf function which doesn't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame for it when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/2294b6fad60eea4cc862eddc8e98a1324e6eeeca.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/platform/efi/efi_stub_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S
index 86d0f9e..0df2dcc 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -11,6 +11,7 @@
 #include <asm/msr.h>
 #include <asm/processor-flags.h>
 #include <asm/page_types.h>
+#include <asm/frame.h>
 
 #define SAVE_XMM			\
 	mov %rsp, %rax;			\
@@ -74,6 +75,7 @@
 	.endm
 
 ENTRY(efi_call)
+	FRAME_BEGIN
 	SAVE_XMM
 	mov (%rsp), %rax
 	mov 8(%rax), %rax
@@ -88,6 +90,7 @@ ENTRY(efi_call)
 	RESTORE_PGT
 	addq $48, %rsp
 	RESTORE_XMM
+	FRAME_END
 	ret
 ENDPROC(efi_call)
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/power: Create stack frames in hibernate_asm_64.S
  2016-01-21 22:49 ` [PATCH 20/33] x86/asm/power: Create stack frames in hibernate_asm_64.S Josh Poimboeuf
  2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:50   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, brgerst, dvlasenk, bp, jpoimboe, akpm, palves, luto,
	namhyung, torvalds, jslaby, pavel, acme, hpa, tglx, bp, bernd,
	linux-kernel, rafael.j.wysocki, peterz, mmarek, chris.j.arges,
	mingo

Commit-ID:  ef0f3ed5a4acfb24740480bf2e50b178724f094d
Gitweb:     http://git.kernel.org/tip/ef0f3ed5a4acfb24740480bf2e50b178724f094d
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:24 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:43 +0100

x86/asm/power: Create stack frames in hibernate_asm_64.S

swsusp_arch_suspend() and restore_registers() are callable non-leaf
functions which don't honor CONFIG_FRAME_POINTER, which can result in
bad stack traces.  Also they aren't annotated as ELF callable functions
which can confuse tooling.

Create a stack frame for them when CONFIG_FRAME_POINTER is enabled and
give them proper ELF function annotations.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/bdad00205897dc707aebe9e9e39757085e2bf999.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/power/hibernate_asm_64.S | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index e2386cb..4400a43 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -21,8 +21,10 @@
 #include <asm/page_types.h>
 #include <asm/asm-offsets.h>
 #include <asm/processor-flags.h>
+#include <asm/frame.h>
 
 ENTRY(swsusp_arch_suspend)
+	FRAME_BEGIN
 	movq	$saved_context, %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
@@ -50,7 +52,9 @@ ENTRY(swsusp_arch_suspend)
 	movq	%rax, restore_cr3(%rip)
 
 	call swsusp_save
+	FRAME_END
 	ret
+ENDPROC(swsusp_arch_suspend)
 
 ENTRY(restore_image)
 	/* switch to temporary page tables */
@@ -107,6 +111,7 @@ ENTRY(core_restore_code)
 	 */
 
 ENTRY(restore_registers)
+	FRAME_BEGIN
 	/* go back to the original page tables */
 	movq    %rbx, %cr3
 
@@ -147,4 +152,6 @@ ENTRY(restore_registers)
 	/* tell the hibernation core that we've just restored the memory */
 	movq	%rax, in_suspend(%rip)
 
+	FRAME_END
 	ret
+ENDPROC(restore_registers)

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/uaccess: Add stack frame output operand in get_user() inline asm
  2016-01-21 22:49 ` [PATCH 21/33] x86/uaccess: Add stack frame output operand in get_user inline asm Josh Poimboeuf
  2016-02-23  9:02   ` [tip:x86/debug] x86/uaccess: Add stack frame output operand in get_user() " tip-bot for Chris J Arges
@ 2016-02-25  5:50   ` tip-bot for Chris J Arges
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Chris J Arges @ 2016-02-25  5:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, brgerst, acme, dvlasenk, peterz, bernd, jslaby, luto,
	palves, linux-kernel, hpa, mmarek, akpm, namhyung, bp, bp,
	jpoimboe, torvalds, tglx, chris.j.arges, mingo

Commit-ID:  f05058c4d652b619adfda6c78d8f5b341169c264
Gitweb:     http://git.kernel.org/tip/f05058c4d652b619adfda6c78d8f5b341169c264
Author:     Chris J Arges <chris.j.arges@canonical.com>
AuthorDate: Thu, 21 Jan 2016 16:49:25 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:43 +0100

x86/uaccess: Add stack frame output operand in get_user() inline asm

Numerous 'call without frame pointer save/setup' warnings are introduced
by stacktool because of functions using the get_user() macro. Bad stack
traces could occur due to lack of or misplacement of stack frame setup
code.

This patch forces a stack frame to be created before the inline asm code
if CONFIG_FRAME_POINTER is enabled by listing the stack pointer as an
output operand for the get_user() inline assembly statement.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/bc85501f221ee512670797c7f110022e64b12c81.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/uaccess.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index a4a30e4..9bbb3b2 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -179,10 +179,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 ({									\
 	int __ret_gu;							\
 	register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);		\
+	register void *__sp asm(_ASM_SP);				\
 	__chk_user_ptr(ptr);						\
 	might_fault();							\
-	asm volatile("call __get_user_%P3"				\
-		     : "=a" (__ret_gu), "=r" (__val_gu)			\
+	asm volatile("call __get_user_%P4"				\
+		     : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp)	\
 		     : "0" (ptr), "i" (sizeof(*(ptr))));		\
 	(x) = (__force __typeof__(*(ptr))) __val_gu;			\
 	__builtin_expect(__ret_gu, 0);					\

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/bpf: Annotate callable functions
  2016-01-21 22:49 ` [PATCH 22/33] x86/asm/bpf: Annotate callable functions Josh Poimboeuf
  2016-02-23  9:02   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:50   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mmarek, chris.j.arges, namhyung, peterz, torvalds, bp, ast, luto,
	linux-kernel, akpm, brgerst, bernd, palves, jpoimboe, tglx,
	dvlasenk, luto, hpa, mingo, jslaby, acme

Commit-ID:  2d8fe90a1b96d52c2a3f719c385b846b02f0bcd8
Gitweb:     http://git.kernel.org/tip/2d8fe90a1b96d52c2a3f719c385b846b02f0bcd8
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:26 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:43 +0100

x86/asm/bpf: Annotate callable functions

bpf_jit.S has several functions which can be called from C code.  Give
them proper ELF annotations.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/bbe1de0c299fecd4fc9a1766bae8be2647bedb01.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/net/bpf_jit.S | 39 ++++++++++++++++-----------------------
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
index 4093216..eb4a3bd 100644
--- a/arch/x86/net/bpf_jit.S
+++ b/arch/x86/net/bpf_jit.S
@@ -22,15 +22,16 @@
 	32 /* space for rbx,r13,r14,r15 */ + \
 	8 /* space for skb_copy_bits */)
 
-sk_load_word:
-	.globl	sk_load_word
+#define FUNC(name) \
+	.globl name; \
+	.type name, @function; \
+	name:
 
+FUNC(sk_load_word)
 	test	%esi,%esi
 	js	bpf_slow_path_word_neg
 
-sk_load_word_positive_offset:
-	.globl	sk_load_word_positive_offset
-
+FUNC(sk_load_word_positive_offset)
 	mov	%r9d,%eax		# hlen
 	sub	%esi,%eax		# hlen - offset
 	cmp	$3,%eax
@@ -39,15 +40,11 @@ sk_load_word_positive_offset:
 	bswap   %eax  			/* ntohl() */
 	ret
 
-sk_load_half:
-	.globl	sk_load_half
-
+FUNC(sk_load_half)
 	test	%esi,%esi
 	js	bpf_slow_path_half_neg
 
-sk_load_half_positive_offset:
-	.globl	sk_load_half_positive_offset
-
+FUNC(sk_load_half_positive_offset)
 	mov	%r9d,%eax
 	sub	%esi,%eax		#	hlen - offset
 	cmp	$1,%eax
@@ -56,15 +53,11 @@ sk_load_half_positive_offset:
 	rol	$8,%ax			# ntohs()
 	ret
 
-sk_load_byte:
-	.globl	sk_load_byte
-
+FUNC(sk_load_byte)
 	test	%esi,%esi
 	js	bpf_slow_path_byte_neg
 
-sk_load_byte_positive_offset:
-	.globl	sk_load_byte_positive_offset
-
+FUNC(sk_load_byte_positive_offset)
 	cmp	%esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
 	jle	bpf_slow_path_byte
 	movzbl	(SKBDATA,%rsi),%eax
@@ -120,8 +113,8 @@ bpf_slow_path_byte:
 bpf_slow_path_word_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi	/* test range */
 	jl	bpf_error	/* offset lower -> error  */
-sk_load_word_negative_offset:
-	.globl	sk_load_word_negative_offset
+
+FUNC(sk_load_word_negative_offset)
 	sk_negative_common(4)
 	mov	(%rax), %eax
 	bswap	%eax
@@ -130,8 +123,8 @@ sk_load_word_negative_offset:
 bpf_slow_path_half_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
-sk_load_half_negative_offset:
-	.globl	sk_load_half_negative_offset
+
+FUNC(sk_load_half_negative_offset)
 	sk_negative_common(2)
 	mov	(%rax),%ax
 	rol	$8,%ax
@@ -141,8 +134,8 @@ sk_load_half_negative_offset:
 bpf_slow_path_byte_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
-sk_load_byte_negative_offset:
-	.globl	sk_load_byte_negative_offset
+
+FUNC(sk_load_byte_negative_offset)
 	sk_negative_common(1)
 	movzbl	(%rax), %eax
 	ret

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/asm/bpf: Create stack frames in bpf_jit.S
  2016-01-21 22:49 ` [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S Josh Poimboeuf
  2016-01-22  2:44   ` Alexei Starovoitov
  2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, bp, akpm, acme, peterz, bernd, mmarek, chris.j.arges,
	torvalds, mingo, tglx, linux-kernel, namhyung, hpa, luto, luto,
	jslaby, brgerst, jpoimboe, palves, ast

Commit-ID:  d21001cc15ba9f63b0334d60942278587471a451
Gitweb:     http://git.kernel.org/tip/d21001cc15ba9f63b0334d60942278587471a451
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:27 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:44 +0100

x86/asm/bpf: Create stack frames in bpf_jit.S

bpf_jit.S has several callable non-leaf functions which don't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame before the call instructions when
CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/fa4c41976b438b51954cb8021f06bceb1d1d66cc.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/net/bpf_jit.S | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
index eb4a3bd..f2a7faf 100644
--- a/arch/x86/net/bpf_jit.S
+++ b/arch/x86/net/bpf_jit.S
@@ -8,6 +8,7 @@
  * of the License.
  */
 #include <linux/linkage.h>
+#include <asm/frame.h>
 
 /*
  * Calling convention :
@@ -65,16 +66,18 @@ FUNC(sk_load_byte_positive_offset)
 
 /* rsi contains offset and can be scratched */
 #define bpf_slow_path_common(LEN)		\
+	lea	-MAX_BPF_STACK + 32(%rbp), %rdx;\
+	FRAME_BEGIN;				\
 	mov	%rbx, %rdi; /* arg1 == skb */	\
 	push	%r9;				\
 	push	SKBDATA;			\
 /* rsi already has offset */			\
 	mov	$LEN,%ecx;	/* len */	\
-	lea	- MAX_BPF_STACK + 32(%rbp),%rdx;			\
 	call	skb_copy_bits;			\
 	test    %eax,%eax;			\
 	pop	SKBDATA;			\
-	pop	%r9;
+	pop	%r9;				\
+	FRAME_END
 
 
 bpf_slow_path_word:
@@ -99,6 +102,7 @@ bpf_slow_path_byte:
 	ret
 
 #define sk_negative_common(SIZE)				\
+	FRAME_BEGIN;						\
 	mov	%rbx, %rdi; /* arg1 == skb */			\
 	push	%r9;						\
 	push	SKBDATA;					\
@@ -108,6 +112,7 @@ bpf_slow_path_byte:
 	test	%rax,%rax;					\
 	pop	SKBDATA;					\
 	pop	%r9;						\
+	FRAME_END;						\
 	jz	bpf_error
 
 bpf_slow_path_word_neg:

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/kprobes: Get rid of kretprobe_trampoline_holder()
  2016-01-21 22:49 ` [PATCH 24/33] x86/kprobes: Get rid of kretprobe_trampoline_holder() Josh Poimboeuf
  2016-01-21 23:42   ` 平松雅巳 / HIRAMATU,MASAMI
  2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, torvalds, luto, masami.hiramatsu.pt, bp,
	anil.s.keshavamurthy, mingo, palves, linux-kernel, mmarek,
	dvlasenk, bernd, jslaby, chris.j.arges, davem, akpm, acme,
	brgerst, luto, hpa, ananth, jpoimboe, namhyung, tglx

Commit-ID:  c1c355ce14c037666fbcb9453d9067c86bbdda5c
Gitweb:     http://git.kernel.org/tip/c1c355ce14c037666fbcb9453d9067c86bbdda5c
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:28 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:44 +0100

x86/kprobes: Get rid of kretprobe_trampoline_holder()

The kretprobe_trampoline_holder() wrapper around kretprobe_trampoline()
isn't used anywhere and adds some unnecessary frame pointer instructions
which never execute.  Instead, just make kretprobe_trampoline() a proper
ELF function.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/92d921b102fb865a7c254cfde9e4a0a72b9a781e.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/kprobes/core.c | 57 +++++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 1deffe6..5b187df 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -671,38 +671,37 @@ NOKPROBE_SYMBOL(kprobe_int3_handler);
  * When a retprobed function returns, this code saves registers and
  * calls trampoline_handler() runs, which calls the kretprobe's handler.
  */
-static void __used kretprobe_trampoline_holder(void)
-{
-	asm volatile (
-			".global kretprobe_trampoline\n"
-			"kretprobe_trampoline: \n"
+asm(
+	".global kretprobe_trampoline\n"
+	".type kretprobe_trampoline, @function\n"
+	"kretprobe_trampoline:\n"
 #ifdef CONFIG_X86_64
-			/* We don't bother saving the ss register */
-			"	pushq %rsp\n"
-			"	pushfq\n"
-			SAVE_REGS_STRING
-			"	movq %rsp, %rdi\n"
-			"	call trampoline_handler\n"
-			/* Replace saved sp with true return address. */
-			"	movq %rax, 152(%rsp)\n"
-			RESTORE_REGS_STRING
-			"	popfq\n"
+	/* We don't bother saving the ss register */
+	"	pushq %rsp\n"
+	"	pushfq\n"
+	SAVE_REGS_STRING
+	"	movq %rsp, %rdi\n"
+	"	call trampoline_handler\n"
+	/* Replace saved sp with true return address. */
+	"	movq %rax, 152(%rsp)\n"
+	RESTORE_REGS_STRING
+	"	popfq\n"
 #else
-			"	pushf\n"
-			SAVE_REGS_STRING
-			"	movl %esp, %eax\n"
-			"	call trampoline_handler\n"
-			/* Move flags to cs */
-			"	movl 56(%esp), %edx\n"
-			"	movl %edx, 52(%esp)\n"
-			/* Replace saved flags with true return address. */
-			"	movl %eax, 56(%esp)\n"
-			RESTORE_REGS_STRING
-			"	popf\n"
+	"	pushf\n"
+	SAVE_REGS_STRING
+	"	movl %esp, %eax\n"
+	"	call trampoline_handler\n"
+	/* Move flags to cs */
+	"	movl 56(%esp), %edx\n"
+	"	movl %edx, 52(%esp)\n"
+	/* Replace saved flags with true return address. */
+	"	movl %eax, 56(%esp)\n"
+	RESTORE_REGS_STRING
+	"	popf\n"
 #endif
-			"	ret\n");
-}
-NOKPROBE_SYMBOL(kretprobe_trampoline_holder);
+	"	ret\n"
+	".size kretprobe_trampoline, .-kretprobe_trampoline\n"
+);
 NOKPROBE_SYMBOL(kretprobe_trampoline);
 
 /*

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/kvm: Set ELF function type for fastop functions
  2016-01-21 22:49 ` [PATCH 25/33] x86/kvm: Set ELF function type for fastop functions Josh Poimboeuf
  2016-01-22 10:05   ` Paolo Bonzini
  2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: chris.j.arges, linux-kernel, hpa, luto, brgerst, bernd, peterz,
	dvlasenk, jpoimboe, mmarek, gleb, mingo, akpm, palves, acme, bp,
	torvalds, namhyung, tglx, luto, pbonzini, jslaby

Commit-ID:  1482a0825bdf82dab4074bd3c824f4c87cbdf848
Gitweb:     http://git.kernel.org/tip/1482a0825bdf82dab4074bd3c824f4c87cbdf848
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:29 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:44 +0100

x86/kvm: Set ELF function type for fastop functions

The callable functions created with the FOP* and FASTOP* macros are
missing ELF function annotations, which confuses tools like stacktool.
Properly annotate them.

This adds some additional labels to the assembly, but the generated
binary code is unchanged (with the exception of instructions which have
embedded references to __LINE__).

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/e399651c89ace54906c203c0557f66ed6ea3ce8d.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kvm/emulate.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 1505587..aa4d726 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -309,23 +309,29 @@ static void invalidate_registers(struct x86_emulate_ctxt *ctxt)
 
 static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 
-#define FOP_ALIGN ".align " __stringify(FASTOP_SIZE) " \n\t"
+#define FOP_FUNC(name) \
+	".align " __stringify(FASTOP_SIZE) " \n\t" \
+	".type " name ", @function \n\t" \
+	name ":\n\t"
+
 #define FOP_RET   "ret \n\t"
 
 #define FOP_START(op) \
 	extern void em_##op(struct fastop *fake); \
 	asm(".pushsection .text, \"ax\" \n\t" \
 	    ".global em_" #op " \n\t" \
-            FOP_ALIGN \
-	    "em_" #op ": \n\t"
+	    FOP_FUNC("em_" #op)
 
 #define FOP_END \
 	    ".popsection")
 
-#define FOPNOP() FOP_ALIGN FOP_RET
+#define FOPNOP() \
+	FOP_FUNC(__stringify(__UNIQUE_ID(nop))) \
+	FOP_RET
 
 #define FOP1E(op,  dst) \
-	FOP_ALIGN "10: " #op " %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst) \
+	"10: " #op " %" #dst " \n\t" FOP_RET
 
 #define FOP1EEX(op,  dst) \
 	FOP1E(op, dst) _ASM_EXTABLE(10b, kvm_fastop_exception)
@@ -357,7 +363,8 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 #define FOP2E(op,  dst, src)	   \
-	FOP_ALIGN #op " %" #src ", %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst "_" #src) \
+	#op " %" #src ", %" #dst " \n\t" FOP_RET
 
 #define FASTOP2(op) \
 	FOP_START(op) \
@@ -395,7 +402,8 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 #define FOP3E(op,  dst, src, src2) \
-	FOP_ALIGN #op " %" #src2 ", %" #src ", %" #dst " \n\t" FOP_RET
+	FOP_FUNC(#op "_" #dst "_" #src "_" #src2) \
+	#op " %" #src2 ", %" #src ", %" #dst " \n\t" FOP_RET
 
 /* 3-operand, word-only, src2=cl */
 #define FASTOP3WCL(op) \
@@ -407,7 +415,12 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *));
 	FOP_END
 
 /* Special case for SETcc - 1 instruction per cc */
-#define FOP_SETCC(op) ".align 4; " #op " %al; ret \n\t"
+#define FOP_SETCC(op) \
+	".align 4 \n\t" \
+	".type " #op ", @function \n\t" \
+	#op ": \n\t" \
+	#op " %al \n\t" \
+	FOP_RET
 
 asm(".global kvm_fastop_exception \n"
     "kvm_fastop_exception: xor %esi, %esi; ret");
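
To make the effect of the new FOP_FUNC() macro easier to see, the
following stand-alone C sketch reproduces the string pasting outside the
kernel.  FOP_FUNC(), FOP_RET and FOP2E() have the same shape as in the
patch; STRINGIFY() is a local stand-in for the kernel's __stringify(),
FASTOP_SIZE is set to 8 as an assumption for illustration, and
fop2e_text() and has_function_type() are hypothetical helpers.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification, standing in for the kernel's __stringify(). */
#define STRINGIFY_1(x)	#x
#define STRINGIFY(x)	STRINGIFY_1(x)

/* Assumed value for illustration only. */
#define FASTOP_SIZE 8

/* Same shape as the FOP_FUNC() added by the patch. */
#define FOP_FUNC(name) \
	".align " STRINGIFY(FASTOP_SIZE) " \n\t" \
	".type " name ", @function \n\t" \
	name ":\n\t"

#define FOP_RET   "ret \n\t"

/* Mirrors FOP2E(): one aligned, typed stub per operand combination. */
#define FOP2E(op, dst, src) \
	FOP_FUNC(#op "_" #dst "_" #src) \
	#op " %" #src ", %" #dst " \n\t" FOP_RET

/* Returns the assembler text that one FOP2E() invocation pastes together. */
const char *fop2e_text(void)
{
	static const char text[] = FOP2E(addb, al, dl);

	return text;
}

/* True if the text carries an ELF function-type annotation. */
int has_function_type(const char *asm_text)
{
	assert(asm_text != NULL);
	return strstr(asm_text, ".type ") != NULL &&
	       strstr(asm_text, ", @function") != NULL;
}
```

Under these assumptions, FOP2E(addb, al, dl) yields an ".align 8" line, a
".type addb_al_dl, @function" line, the label "addb_al_dl:", and then the
original "addb %dl, %al" and "ret".  Only directives and a label are
added, which fits the changelog's statement that the generated binary
code is unchanged.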

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [tip:x86/debug] x86/kvm: Make test_cc() always inline
  2016-01-22 16:16       ` [PATCH v16.1 26/33] x86/kvm: Make test_cc() always inline Josh Poimboeuf
  2016-02-23  9:04         ` [tip:x86/debug] " tip-bot for Josh Poimboeuf
@ 2016-02-25  5:52         ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: akpm, peterz, namhyung, pbonzini, mmarek, linux-kernel, mingo,
	jslaby, dvlasenk, chris.j.arges, gleb, luto, luto, bernd, acme,
	jpoimboe, bp, hpa, palves, torvalds, tglx, brgerst

Commit-ID:  cb7390fed4c04e609a420ac0b1c07a7a781b43bf
Gitweb:     http://git.kernel.org/tip/cb7390fed4c04e609a420ac0b1c07a7a781b43bf
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Fri, 22 Jan 2016 10:16:12 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:44 +0100

x86/kvm: Make test_cc() always inline

With some configs (including allyesconfig), gcc doesn't inline
test_cc().  When that happens, test_cc() doesn't create a stack frame
before inserting the inline asm call instruction.  This breaks frame
pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
a bad stack trace.

Force it to always be inlined so that its containing function's stack
frame can be used.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20160122161612.GE20502@treble.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kvm/emulate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index aa4d726..80363eb 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -969,7 +969,7 @@ static int em_bsr_c(struct x86_emulate_ctxt *ctxt)
 	return fastop(ctxt, em_bsr);
 }
 
-static u8 test_cc(unsigned int condition, unsigned long flags)
+static __always_inline u8 test_cc(unsigned int condition, unsigned long flags)
 {
 	u8 rc;
 	void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);


* [tip:x86/debug] watchdog/hpwdt: Create stack frame in asminline_call()
  2016-01-21 22:49 ` [PATCH 27/33] watchdog/hpwdt: Create stack frame in asminline_call() Josh Poimboeuf
  2016-02-23  9:04   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
@ 2016-02-25  5:52   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, palves, wim, acme, hpa, tglx, bp, mmarek, luto, jslaby,
	namhyung, torvalds, brgerst, luto, akpm, peterz, chris.j.arges,
	linux, linux-kernel, bernd, dvlasenk, mingo

Commit-ID:  5c1d5f283a855a5fe6b4f122054d85072b97ae4a
Gitweb:     http://git.kernel.org/tip/5c1d5f283a855a5fe6b4f122054d85072b97ae4a
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:31 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:44 +0100

watchdog/hpwdt: Create stack frame in asminline_call()

asminline_call() is a callable non-leaf function which doesn't honor
CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create a stack frame when CONFIG_FRAME_POINTER is enabled.
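
A minimal userspace sketch of the same pattern, assuming x86_64 ELF and a
GCC-compatible toolchain.  SKETCH_FRAME_POINTER, SKETCH_FRAME_BEGIN,
SKETCH_FRAME_END and sketch_add_one() are made-up names; the macros are
simplified stand-ins for the kernel's FRAME_BEGIN/FRAME_END, and
sketch_add_one() is a leaf function, so the frame is there only to show
where the macros go.  On other targets the sketch falls back to plain C.

```c
/* Plays the role of CONFIG_FRAME_POINTER in this sketch (assumption). */
#define SKETCH_FRAME_POINTER 1

#if SKETCH_FRAME_POINTER
/* Save the caller's frame pointer and start a new frame. */
#define SKETCH_FRAME_BEGIN "push %rbp\n\t" "mov %rsp, %rbp\n\t"
/* Restore the caller's frame pointer; the stack must be balanced here. */
#define SKETCH_FRAME_END   "pop %rbp\n\t"
#else
#define SKETCH_FRAME_BEGIN ""
#define SKETCH_FRAME_END   ""
#endif

long sketch_add_one(long x);

#if defined(__x86_64__) && defined(__ELF__)
/*
 * Top-level asm function: the string macros are spliced into the body,
 * and .type/.size mark the symbol as an ELF function.
 */
__asm__(".pushsection .text\n\t"
	".globl sketch_add_one\n\t"
	".type sketch_add_one, @function\n\t"
	"sketch_add_one:\n\t"
	SKETCH_FRAME_BEGIN
	"lea 1(%rdi), %rax\n\t"		/* return x + 1 */
	SKETCH_FRAME_END
	"ret\n\t"
	".size sketch_add_one, .-sketch_add_one\n\t"
	".popsection");
#else
/* Portable fallback so the sketch still builds elsewhere. */
long sketch_add_one(long x)
{
	return x + 1;
}
#endif
```

Setting SKETCH_FRAME_POINTER to 0 makes both macros expand to empty
strings, which leaves the function body without any frame setup.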

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: linux-watchdog@vger.kernel.org
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/60de3cfb6f16d413bfb923036cc87fec132df735.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/watchdog/hpwdt.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index 92443c3..90016db 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -353,10 +353,10 @@ static int detect_cru_service(void)
 
 asm(".text                      \n\t"
     ".align 4                   \n\t"
-    ".globl asminline_call	\n"
+    ".globl asminline_call	\n\t"
+    ".type asminline_call, @function \n\t"
     "asminline_call:            \n\t"
-    "pushq      %rbp            \n\t"
-    "movq       %rsp, %rbp      \n\t"
+    FRAME_BEGIN
     "pushq      %rax            \n\t"
     "pushq      %rbx            \n\t"
     "pushq      %rdx            \n\t"
@@ -386,7 +386,7 @@ asm(".text                      \n\t"
     "popq       %rdx            \n\t"
     "popq       %rbx            \n\t"
     "popq       %rax            \n\t"
-    "leave                      \n\t"
+    FRAME_END
     "ret                        \n\t"
     ".previous");
 


* [tip:x86/debug] x86/locking: Create stack frame in PV unlock
  2016-01-21 22:49 ` [PATCH 28/33] x86/locking: Create stack frame in PV unlock Josh Poimboeuf
  2016-02-23  9:05   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
@ 2016-02-25  5:52   ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, jslaby, acme, chris.j.arges, torvalds, dvlasenk,
	Waiman.Long, brgerst, namhyung, palves, peterz, bernd, tglx,
	mingo, akpm, mmarek, linux-kernel, luto, luto, bp, hpa

Commit-ID:  16df4ff8604881db0130f93f4b6ade759fa48e87
Gitweb:     http://git.kernel.org/tip/16df4ff8604881db0130f93f4b6ade759fa48e87
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 21 Jan 2016 16:49:32 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:44 +0100

x86/locking: Create stack frame in PV unlock

The assembly PV_UNLOCK function is a callable non-leaf function which
doesn't honor CONFIG_FRAME_POINTER, which can result in bad stack
traces.

Create a stack frame when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <Waiman.Long@hpe.com>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6685a72ddbbd0ad3694337cca0af4b4ea09f5f40.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/qspinlock_paravirt.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/qspinlock_paravirt.h b/arch/x86/include/asm/qspinlock_paravirt.h
index 9f92c18..9d55f9b 100644
--- a/arch/x86/include/asm/qspinlock_paravirt.h
+++ b/arch/x86/include/asm/qspinlock_paravirt.h
@@ -36,8 +36,10 @@ PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock_slowpath);
  */
 asm    (".pushsection .text;"
 	".globl " PV_UNLOCK ";"
+	".type " PV_UNLOCK ", @function;"
 	".align 4,0x90;"
 	PV_UNLOCK ": "
+	FRAME_BEGIN
 	"push  %rdx;"
 	"mov   $0x1,%eax;"
 	"xor   %edx,%edx;"
@@ -45,6 +47,7 @@ asm    (".pushsection .text;"
 	"cmp   $0x1,%al;"
 	"jne   .slowpath;"
 	"pop   %rdx;"
+	FRAME_END
 	"ret;"
 	".slowpath: "
 	"push   %rsi;"
@@ -52,6 +55,7 @@ asm    (".pushsection .text;"
 	"call " PV_UNLOCK_SLOWPATH ";"
 	"pop    %rsi;"
 	"pop    %rdx;"
+	FRAME_END
 	"ret;"
 	".size " PV_UNLOCK ", .-" PV_UNLOCK ";"
 	".popsection");


* [tip:x86/debug] x86/kvm: Add output operand in vmx_handle_external_intr inline asm
  2016-01-22 21:44           ` [PATCH 2/2] x86/kvm: Add output operand in vmx_handle_external_intr inline asm Chris J Arges
  2016-01-25 15:05             ` Josh Poimboeuf
  2016-02-23  9:05             ` [tip:x86/debug] " tip-bot for Chris J Arges <tipbot@zytor.com>
@ 2016-02-25  5:53             ` tip-bot for Chris J Arges
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Chris J Arges @ 2016-02-25  5:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, linux-kernel, peterz, chris.j.arges, jpoimboe, mingo,
	torvalds, hpa

Commit-ID:  3f62de5f6f369b67b7ac709e3c942c9130d2c51a
Gitweb:     http://git.kernel.org/tip/3f62de5f6f369b67b7ac709e3c942c9130d2c51a
Author:     Chris J Arges <chris.j.arges@canonical.com>
AuthorDate: Fri, 22 Jan 2016 15:44:38 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:44 +0100

x86/kvm: Add output operand in vmx_handle_external_intr inline asm

Stacktool generates the following warning:
  stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup

By adding the stack pointer as an output operand, this patch ensures that a
stack frame is created when CONFIG_FRAME_POINTER is enabled for the inline
assembly statement.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: gleb@kernel.org
Cc: kvm@vger.kernel.org
Cc: live-patching@vger.kernel.org
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1453499078-9330-3-git-send-email-chris.j.arges@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kvm/vmx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e2951b6..e153522 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8356,6 +8356,7 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
 static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 {
 	u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
+	register void *__sp asm(_ASM_SP);
 
 	/*
 	 * If external interrupt exists, IF bit is set in rflags/eflags on the
@@ -8388,8 +8389,9 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 			"call *%[entry]\n\t"
 			:
 #ifdef CONFIG_X86_64
-			[sp]"=&r"(tmp)
+			[sp]"=&r"(tmp),
 #endif
+			"+r"(__sp)
 			:
 			[entry]"r"(entry),
 			[ss]"i"(__KERNEL_DS),


* [tip:x86/debug] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]()
  2016-02-18 17:41                 ` [PATCH] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace] Josh Poimboeuf
  2016-02-19 12:05                   ` Jiri Slaby
  2016-02-23  9:05                   ` [tip:x86/debug] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]() tip-bot for Josh Poimboeuf <tipbot@zytor.com>
@ 2016-02-25  5:53                   ` tip-bot for Josh Poimboeuf
  2 siblings, 0 replies; 133+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2016-02-25  5:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, jpoimboe, torvalds, mingo, peterz, jslaby, linux-kernel, hpa

Commit-ID:  821eae7d14f0bbf69df1cc4656c54900b2672928
Gitweb:     http://git.kernel.org/tip/821eae7d14f0bbf69df1cc4656c54900b2672928
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 18 Feb 2016 11:41:58 -0600
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Feb 2016 08:35:45 +0100

sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]()

If __preempt_schedule() or __preempt_schedule_notrace() is referenced at
the beginning of a function, gcc can insert the inline asm "call
___preempt_schedule[_notrace]" instruction before setting up a stack
frame, which breaks frame pointer convention if CONFIG_FRAME_POINTER is
enabled and can result in bad stack traces.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the inline asm
statements.
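
The operand trick can be tried in userspace with a sketch like the one
below, assuming GCC on x86_64 ELF (anything else takes a plain C
fallback).  sketch_calls, sketch_thunk and sketch_call_thunk() are
hypothetical names, and the thunk is a register-preserving stand-in for
___preempt_schedule.  Unlike the kernel, which is built with
-mno-red-zone, userspace code has a red zone below the stack pointer, so
the sketch steps over it before issuing the call.

```c
/* Counter bumped by the thunk; a plain global so the asm can name it. */
int sketch_calls = 0;

#if defined(__x86_64__) && defined(__ELF__) && \
    defined(__GNUC__) && !defined(__clang__)

/* Asm thunk that clobbers no registers, only memory and flags. */
__asm__(".pushsection .text\n\t"
	".type sketch_thunk, @function\n\t"
	"sketch_thunk:\n\t"
	"incl sketch_calls(%rip)\n\t"
	"ret\n\t"
	".size sketch_thunk, .-sketch_thunk\n\t"
	".popsection");

static inline void sketch_call_thunk(void)
{
	/* Local register variable bound to the stack pointer. */
	register void *sp __asm__("rsp");

	/*
	 * Listing the stack pointer as an in/out operand ties the asm to
	 * the stack state, which is what the patch relies on to get the
	 * frame set up before the call.
	 */
	__asm__ volatile ("sub $128, %%rsp\n\t"	/* skip the red zone */
			  "call sketch_thunk\n\t"
			  "add $128, %%rsp"
			  : "+r"(sp)
			  :
			  : "memory", "cc");
}

#else

/* Fallback without inline asm for other compilers and targets. */
static inline void sketch_call_thunk(void)
{
	sketch_calls++;
}

#endif
```

Each sketch_call_thunk() invocation increments sketch_calls by one on
either path, which gives a quick way to confirm the call really happened.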

Specifically this fixes the following stacktool warnings:

  stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x70: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x92: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0xff: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0xf5: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_free()+0x11a: call without frame pointer save/setup
  stacktool: fs/mbcache.o: mb_cache_entry_get()+0x225: call without frame pointer save/setup
  stacktool: kernel/locking/percpu-rwsem.o: percpu_up_read()+0x27: call without frame pointer save/setup
  stacktool: kernel/profile.o: do_profile_hits.isra.5()+0x139: call without frame pointer save/setup
  stacktool: lib/nmi_backtrace.o: nmi_trigger_all_cpu_backtrace()+0x2b6: call without frame pointer save/setup
  stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_recv()+0x58: call without frame pointer save/setup
  stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_send()+0x58: call without frame pointer save/setup
  stacktool: net/rds/ib_recv.o: rds_ib_attempt_ack()+0xc1: call without frame pointer save/setup
  stacktool: net/rds/iw_recv.o: rds_iw_attempt_ack()+0xc1: call without frame pointer save/setup
  stacktool: net/rds/iw_recv.o: rds_iw_recv_cq_comp_handler()+0x55: call without frame pointer save/setup

So it only adds a stack frame to 15 call sites out of ~5000 calls to
___preempt_schedule[_notrace]().  All the others already had stack frames.

Oddly, this change actually seems to make things faster in a lot of
cases.  For many smaller functions it causes the stack frame creation to
get moved out of the common path and into the unlikely path.

For example, here's the original cyc2ns_read_end():

  ffffffff8101f8c0 <cyc2ns_read_end>:
  ffffffff8101f8c0:	55                   	push   %rbp
  ffffffff8101f8c1:	48 89 e5             	mov    %rsp,%rbp
  ffffffff8101f8c4:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
  ffffffff8101f8c8:	75 08                	jne    ffffffff8101f8d2 <cyc2ns_read_end+0x12>
  ffffffff8101f8ca:	65 48 89 3d e6 5a ff 	mov    %rdi,%gs:0x7eff5ae6(%rip)        # 153b8 <cyc2ns+0x38>
  ffffffff8101f8d1:	7e
  ffffffff8101f8d2:	65 ff 0d 77 c4 fe 7e 	decl   %gs:0x7efec477(%rip)        # bd50 <__preempt_count>
  ffffffff8101f8d9:	74 02                	je     ffffffff8101f8dd <cyc2ns_read_end+0x1d>
  ffffffff8101f8db:	5d                   	pop    %rbp
  ffffffff8101f8dc:	c3                   	retq
  ffffffff8101f8dd:	e8 1e 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
  ffffffff8101f8e2:	5d                   	pop    %rbp
  ffffffff8101f8e3:	c3                   	retq
  ffffffff8101f8e4:	66 66 66 2e 0f 1f 84 	data16 data16 nopw %cs:0x0(%rax,%rax,1)
  ffffffff8101f8eb:	00 00 00 00 00

And here's the same function with the patch:

  ffffffff8101f8c0 <cyc2ns_read_end>:
  ffffffff8101f8c0:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
  ffffffff8101f8c4:	75 08                	jne    ffffffff8101f8ce <cyc2ns_read_end+0xe>
  ffffffff8101f8c6:	65 48 89 3d ea 5a ff 	mov    %rdi,%gs:0x7eff5aea(%rip)        # 153b8 <cyc2ns+0x38>
  ffffffff8101f8cd:	7e
  ffffffff8101f8ce:	65 ff 0d 7b c4 fe 7e 	decl   %gs:0x7efec47b(%rip)        # bd50 <__preempt_count>
  ffffffff8101f8d5:	74 01                	je     ffffffff8101f8d8 <cyc2ns_read_end+0x18>
  ffffffff8101f8d7:	c3                   	retq
  ffffffff8101f8d8:	55                   	push   %rbp
  ffffffff8101f8d9:	48 89 e5             	mov    %rsp,%rbp
  ffffffff8101f8dc:	e8 1f 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
  ffffffff8101f8e1:	5d                   	pop    %rbp
  ffffffff8101f8e2:	c3                   	retq
  ffffffff8101f8e3:	66 66 66 66 2e 0f 1f 	data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
  ffffffff8101f8ea:	84 00 00 00 00 00

Notice that it moved the frame pointer setup code to the unlikely
___preempt_schedule() call path.  Going through a sampling of the
differences in the asm, that's the most common change I see.

Otherwise it has no real effect on callers which already have stack
frames (though it does result in the reordering of some 'mov's).

Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20160218174158.GA28230@treble.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/preempt.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index 01bcde8..d397deb 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -94,10 +94,19 @@ static __always_inline bool should_resched(int preempt_offset)
 
 #ifdef CONFIG_PREEMPT
   extern asmlinkage void ___preempt_schedule(void);
-# define __preempt_schedule() asm ("call ___preempt_schedule")
+# define __preempt_schedule()					\
+({								\
+	register void *__sp asm(_ASM_SP);			\
+	asm volatile ("call ___preempt_schedule" : "+r"(__sp));	\
+})
+
   extern asmlinkage void preempt_schedule(void);
   extern asmlinkage void ___preempt_schedule_notrace(void);
-# define __preempt_schedule_notrace() asm ("call ___preempt_schedule_notrace")
+# define __preempt_schedule_notrace()					\
+({									\
+	register void *__sp asm(_ASM_SP);				\
+	asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp));	\
+})
   extern asmlinkage void preempt_schedule_notrace(void);
 #endif
 



Thread overview: 133+ messages
2016-01-21 22:49 [PATCH 00/33] Compile-time stack metadata validation Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 01/33] x86/stacktool: " Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 02/33] kbuild/stacktool: Add CONFIG_STACK_VALIDATION option Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 03/33] x86/stacktool: Enable stacktool on x86_64 Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 04/33] x86/stacktool: Add STACKTOOL_IGNORE_FUNC macro Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 05/33] x86/xen: Add stack frame dependency to hypercall inline asm calls Josh Poimboeuf
2016-02-23  8:55   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 06/33] x86/asm/xen: Set ELF function type for xen_adjust_exception_frame() Josh Poimboeuf
2016-02-23  8:56   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 07/33] x86/asm/xen: Create stack frames in xen-asm.S Josh Poimboeuf
2016-02-23  8:56   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:45   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 08/33] x86/paravirt: Add stack frame dependency to PVOP inline asm calls Josh Poimboeuf
2016-02-23  8:56   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 09/33] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK Josh Poimboeuf
2016-02-23  8:57   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 10/33] x86/amd: Set ELF function type for vide() Josh Poimboeuf
2016-02-23  8:57   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:46   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 11/33] x86/asm/crypto: Move .Lbswap_mask data to .rodata section Josh Poimboeuf
2016-02-23  8:58   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 12/33] x86/asm/crypto: Move jump_table " Josh Poimboeuf
2016-02-23  8:58   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 13/33] x86/asm/crypto: Simplify stack usage in sha-mb functions Josh Poimboeuf
2016-02-23  8:59   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:47   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 14/33] x86/asm/crypto: Don't use rbp as a scratch register Josh Poimboeuf
2016-02-23  8:59   ` [tip:x86/debug] x86/asm/crypto: Don't use RBP " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 15/33] x86/asm/crypto: Create stack frames in crypto functions Josh Poimboeuf
2016-02-23  8:59   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 16/33] x86/asm/entry: Create stack frames in thunk functions Josh Poimboeuf
2016-02-23  9:00   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:48   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 17/33] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel() Josh Poimboeuf
2016-02-23  9:00   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-23 11:39     ` Pavel Machek
2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 18/33] x86/asm: Create stack frames in rwsem functions Josh Poimboeuf
2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 19/33] x86/asm/efi: Create a stack frame in efi_call() Josh Poimboeuf
2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:49   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 20/33] x86/asm/power: Create stack frames in hibernate_asm_64.S Josh Poimboeuf
2016-02-23  9:01   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:50   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 21/33] x86/uaccess: Add stack frame output operand in get_user inline asm Josh Poimboeuf
2016-02-23  9:02   ` [tip:x86/debug] x86/uaccess: Add stack frame output operand in get_user() " tip-bot for Chris J Arges <tipbot@zytor.com>
2016-02-25  5:50   ` tip-bot for Chris J Arges
2016-01-21 22:49 ` [PATCH 22/33] x86/asm/bpf: Annotate callable functions Josh Poimboeuf
2016-02-23  9:02   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:50   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 23/33] x86/asm/bpf: Create stack frames in bpf_jit.S Josh Poimboeuf
2016-01-22  2:44   ` Alexei Starovoitov
2016-01-22  3:55     ` Josh Poimboeuf
2016-01-22  4:18       ` Alexei Starovoitov
2016-01-22  7:36         ` Ingo Molnar
2016-01-22 15:58         ` Josh Poimboeuf
2016-01-22 17:18           ` Alexei Starovoitov
2016-01-22 17:36             ` Josh Poimboeuf
2016-01-22 17:40               ` Alexei Starovoitov
2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 24/33] x86/kprobes: Get rid of kretprobe_trampoline_holder() Josh Poimboeuf
2016-01-21 23:42   ` 平松雅巳 / HIRAMATU,MASAMI
2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 25/33] x86/kvm: Set ELF function type for fastop functions Josh Poimboeuf
2016-01-22 10:05   ` Paolo Bonzini
2016-02-23  9:03   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:51   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 26/33] x86/kvm: Add stack frame dependency to test_cc() inline asm Josh Poimboeuf
2016-01-22 10:05   ` Paolo Bonzini
2016-01-22 16:02     ` Josh Poimboeuf
2016-01-22 16:16       ` [PATCH v16.1 26/33] x86/kvm: Make test_cc() always inline Josh Poimboeuf
2016-02-23  9:04         ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:52         ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 27/33] watchdog/hpwdt: Create stack frame in asminline_call() Josh Poimboeuf
2016-02-23  9:04   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:52   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 28/33] x86/locking: Create stack frame in PV unlock Josh Poimboeuf
2016-02-23  9:05   ` [tip:x86/debug] " tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:52   ` tip-bot for Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 29/33] x86/stacktool: Add directory and file whitelists Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 30/33] x86/xen: Add xen_cpuid() to stacktool whitelist Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 31/33] bpf: Add __bpf_prog_run() " Josh Poimboeuf
2016-01-21 22:57   ` Daniel Borkmann
2016-01-22  2:55   ` Alexei Starovoitov
2016-01-22  4:13     ` Josh Poimboeuf
2016-01-22 17:19       ` Alexei Starovoitov
2016-01-21 22:49 ` [PATCH 32/33] sched: Add __schedule() " Josh Poimboeuf
2016-01-21 22:49 ` [PATCH 33/33] x86/kprobes: Add kretprobe_trampoline() " Josh Poimboeuf
2016-01-22 17:43 ` [PATCH 00/33] Compile-time stack metadata validation Chris J Arges
2016-01-22 19:14   ` Josh Poimboeuf
2016-01-22 20:40     ` Chris J Arges
2016-01-22 20:47       ` Josh Poimboeuf
2016-01-22 21:44         ` [PATCH 0/2] A few stacktool warning fixes Chris J Arges
2016-01-22 21:44           ` [PATCH 1/2] tools/stacktool: Add __reiserfs_panic to global_noreturns list Chris J Arges
2016-01-25 15:04             ` Josh Poimboeuf
2016-01-22 21:44           ` [PATCH 2/2] x86/kvm: Add output operand in vmx_handle_external_intr inline asm Chris J Arges
2016-01-25 15:05             ` Josh Poimboeuf
2016-02-23  9:05             ` [tip:x86/debug] " tip-bot for Chris J Arges <tipbot@zytor.com>
2016-02-25  5:53             ` tip-bot for Chris J Arges
2016-02-12 10:36 ` [PATCH 00/33] Compile-time stack metadata validation Jiri Slaby
2016-02-12 10:41   ` Jiri Slaby
2016-02-12 14:45   ` Josh Poimboeuf
2016-02-12 17:10     ` Peter Zijlstra
2016-02-12 18:32       ` Josh Poimboeuf
2016-02-12 18:34         ` Josh Poimboeuf
2016-02-12 20:10         ` Peter Zijlstra
2016-02-15 16:31           ` Josh Poimboeuf
2016-02-15 16:49             ` Peter Zijlstra
     [not found]             ` <CA+55aFzoPCd_LcSx1FUuEhSBYk2KrfzXGj-Vcn39W5bz=KuZhA@mail.gmail.com>
2016-02-15 20:01               ` Josh Poimboeuf
2016-02-18 17:41                 ` [PATCH] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace] Josh Poimboeuf
2016-02-19 12:05                   ` Jiri Slaby
2016-02-23  9:05                   ` [tip:x86/debug] sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]() tip-bot for Josh Poimboeuf <tipbot@zytor.com>
2016-02-25  5:53                   ` tip-bot for Josh Poimboeuf
2016-02-15 20:02               ` [PATCH 00/33] Compile-time stack metadata validation Andi Kleen
2016-02-23  8:14 ` Ingo Molnar
2016-02-23 14:27   ` Arnaldo Carvalho de Melo
2016-02-23 15:07     ` Josh Poimboeuf
2016-02-23 15:28       ` Arnaldo Carvalho de Melo
2016-02-23 15:01   ` Josh Poimboeuf
2016-02-24  7:40     ` Ingo Molnar
2016-02-24 16:32       ` Josh Poimboeuf
