* [PATCH v3 00/39] x86: Kernel IBT
@ 2022-03-03 11:23 Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 01/39] static_call: Avoid building empty .static_call_sites Peter Zijlstra
                   ` (39 more replies)
  0 siblings, 40 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Hi, another week, another series.

Since last time:

 - fixed and tested kexec (redgecomb)
 - s/4*HAS_KERNEL_IBT/ENDBR_INSN_SIZE/ (jpoimboe)
 - re-arranged Xen patches to avoid churn (andyhpp)
 - folded IBT_SEAL Kconfig and objtool options (jpoimboe)
 - dropped direct call/jmp rewrite from objtool (jpoimboe)
 - dropped UD1 poison (jpoimboe)
 - fixed kprobe selftests (masami,naveen)
 - fixed ftrace selftests (rostedt)
 - simplified CET/INT3 selftests (jpoimboe)
 - boot time msg on IBT (kees)
 - objtool WARN_FUNC sym+off fallback (jpoimboe)
 - picked up tags for unchanged patches
 - probably more

Supposedly clang-14-rc2 will work on this series; I'll validate the
moment the Debian package gets updated.

Patches go on top of tip/master + arm64/for-next/linkage. Also available here:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/wip.ibt

Enjoy!

---
 arch/powerpc/include/asm/livepatch.h        |  10 -
 arch/powerpc/kernel/kprobes.c               |  34 +--
 arch/um/kernel/um_arch.c                    |   4 +
 arch/x86/Kconfig                            |  27 +++
 arch/x86/Makefile                           |   7 +-
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S   |   3 +
 arch/x86/entry/entry_64.S                   |  27 ++-
 arch/x86/entry/entry_64_compat.S            |   5 +
 arch/x86/include/asm/alternative.h          |   1 +
 arch/x86/include/asm/cpu.h                  |   4 +
 arch/x86/include/asm/cpufeatures.h          |   1 +
 arch/x86/include/asm/efi.h                  |   9 +-
 arch/x86/include/asm/ibt.h                  |  92 ++++++++
 arch/x86/include/asm/idtentry.h             |  25 +-
 arch/x86/include/asm/irqflags.h             |   5 -
 arch/x86/include/asm/linkage.h              |  39 ++++
 arch/x86/include/asm/msr-index.h            |  20 +-
 arch/x86/include/asm/paravirt.h             |   1 +
 arch/x86/include/asm/paravirt_types.h       |   1 -
 arch/x86/include/asm/qspinlock_paravirt.h   |   3 +
 arch/x86/include/asm/segment.h              |   5 +-
 arch/x86/include/asm/setup.h                |   3 +-
 arch/x86/include/asm/text-patching.h        |  30 ++-
 arch/x86/include/asm/traps.h                |   2 +
 arch/x86/include/uapi/asm/processor-flags.h |   2 +
 arch/x86/kernel/alternative.c               |  57 ++++-
 arch/x86/kernel/apm_32.c                    |   7 +
 arch/x86/kernel/cpu/bugs.c                  |  13 ++
 arch/x86/kernel/cpu/common.c                |  59 ++++-
 arch/x86/kernel/ftrace.c                    |   9 +-
 arch/x86/kernel/ftrace_64.S                 |  23 +-
 arch/x86/kernel/head_64.S                   |  14 +-
 arch/x86/kernel/idt.c                       |   9 +-
 arch/x86/kernel/kprobes/core.c              |  29 ++-
 arch/x86/kernel/kvm.c                       |   3 +-
 arch/x86/kernel/machine_kexec_64.c          |   2 +
 arch/x86/kernel/module.c                    |  21 +-
 arch/x86/kernel/paravirt.c                  |  29 +--
 arch/x86/kernel/relocate_kernel_64.S        |  10 +
 arch/x86/kernel/traps.c                     |  61 +++++
 arch/x86/kernel/vmlinux.lds.S               |   9 +
 arch/x86/kvm/emulate.c                      |   6 +-
 arch/x86/lib/error-inject.c                 |   2 +
 arch/x86/lib/retpoline.S                    |   1 +
 arch/x86/net/bpf_jit_comp.c                 |  16 +-
 arch/x86/xen/enlighten_pv.c                 |  10 +-
 arch/x86/xen/xen-asm.S                      |  10 +
 arch/x86/xen/xen-head.S                     |   8 +-
 include/asm-generic/vmlinux.lds.h           |   4 +
 include/linux/cfi.h                         |  11 +-
 include/linux/kprobes.h                     |   3 +-
 include/linux/objtool.h                     |  16 ++
 kernel/bpf/trampoline.c                     |  20 +-
 kernel/kprobes.c                            |  66 ++++--
 kernel/livepatch/patch.c                    |  19 +-
 kernel/trace/ftrace.c                       |  34 ++-
 samples/ftrace/ftrace-direct-modify.c       |   5 +
 samples/ftrace/ftrace-direct-multi-modify.c |  10 +-
 samples/ftrace/ftrace-direct-multi.c        |   5 +-
 samples/ftrace/ftrace-direct-too.c          |   3 +
 samples/ftrace/ftrace-direct.c              |   3 +
 scripts/Makefile.build                      |  44 +---
 scripts/Makefile.lib                        |  56 +++++
 scripts/Makefile.modfinal                   |   1 +
 scripts/link-vmlinux.sh                     |  12 +-
 tools/objtool/arch/x86/decode.c             |  34 ++-
 tools/objtool/builtin-check.c               |   8 +-
 tools/objtool/check.c                       | 346 +++++++++++++++++++++++++++-
 tools/objtool/elf.c                         |   3 +
 tools/objtool/include/objtool/arch.h        |   1 +
 tools/objtool/include/objtool/builtin.h     |   3 +-
 tools/objtool/include/objtool/check.h       |  14 +-
 tools/objtool/include/objtool/objtool.h     |   4 +
 tools/objtool/include/objtool/warn.h        |   2 +
 tools/objtool/objtool.c                     |   1 +
 75 files changed, 1254 insertions(+), 242 deletions(-)



* [PATCH v3 01/39] static_call: Avoid building empty .static_call_sites
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 02/39] x86/module: Fix the paravirt vs alternative order Peter Zijlstra
                   ` (38 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Without CONFIG_HAVE_STATIC_CALL_INLINE there's no point in creating
the .static_call_sites section and its related symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/vmlinux.lds.h |    4 ++++
 1 file changed, 4 insertions(+)

--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -398,6 +398,7 @@
 	KEEP(*(__jump_table))						\
 	__stop___jump_table = .;
 
+#ifdef CONFIG_HAVE_STATIC_CALL_INLINE
 #define STATIC_CALL_DATA						\
 	. = ALIGN(8);							\
 	__start_static_call_sites = .;					\
@@ -406,6 +407,9 @@
 	__start_static_call_tramp_key = .;				\
 	KEEP(*(.static_call_tramp_key))					\
 	__stop_static_call_tramp_key = .;
+#else
+#define STATIC_CALL_DATA
+#endif
 
 /*
  * Allow architectures to handle ro_after_init data on their

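For reference, a minimal sketch -- not part of the patch -- of how the
bounds emitted above get consumed when CONFIG_HAVE_STATIC_CALL_INLINE
is set (entry layout as in include/linux/static_call_types.h); without
that option nothing ever walks the section, so emitting it is pure
waste:

	struct static_call_site {
		s32 addr;	/* offset to the call site */
		s32 key;	/* offset to the static_call_key */
	};

	extern struct static_call_site __start_static_call_sites[];
	extern struct static_call_site __stop_static_call_sites[];

	static void walk_static_call_sites(void)
	{
		struct static_call_site *site;

		for (site = __start_static_call_sites;
		     site < __stop_static_call_sites; site++) {
			/* patch/validate each recorded call site */
		}
	}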



* [PATCH v3 02/39] x86/module: Fix the paravirt vs alternative order
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 01/39] static_call: Avoid building empty .static_call_sites Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-08 13:58   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 03/39] objtool: Add --dry-run Peter Zijlstra
                   ` (37 subsequent siblings)
  39 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Ever since commit 4e6292114c741 ("x86/paravirt: Add new features for
paravirt patching") there has been an ordering dependency between
patching paravirt ops and patching alternatives; the module loader,
however, still violates this ordering.

Fixes: 4e6292114c741 ("x86/paravirt: Add new features for paravirt patching")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
---
 arch/x86/kernel/module.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -273,6 +273,14 @@ int module_finalize(const Elf_Ehdr *hdr,
 			retpolines = s;
 	}
 
+	/*
+	 * See alternative_instructions() for the ordering rules between the
+	 * various patching types.
+	 */
+	if (para) {
+		void *pseg = (void *)para->sh_addr;
+		apply_paravirt(pseg, pseg + para->sh_size);
+	}
 	if (retpolines) {
 		void *rseg = (void *)retpolines->sh_addr;
 		apply_retpolines(rseg, rseg + retpolines->sh_size);
@@ -290,11 +298,6 @@ int module_finalize(const Elf_Ehdr *hdr,
 					    tseg, tseg + text->sh_size);
 	}
 
-	if (para) {
-		void *pseg = (void *)para->sh_addr;
-		apply_paravirt(pseg, pseg + para->sh_size);
-	}
-
 	/* make jump label nops */
 	jump_label_apply_nops(me);
 

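The required sequence, per the alternative_instructions() comment the
new code refers to (paraphrased here as a sketch, not quoted from the
tree):

	/*
	 * Patching order in module_finalize(), matching
	 * alternative_instructions():
	 *
	 *   apply_paravirt()       - rewrite pv call sites first
	 *   apply_retpolines()     - then the retpoline thunk calls
	 *   apply_alternatives()   - last, may rewrite the above further
	 */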



* [PATCH v3 03/39] objtool: Add --dry-run
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 01/39] static_call: Avoid building empty .static_call_sites Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 02/39] x86/module: Fix the paravirt vs alternative order Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 04/39] x86/ibt: Base IBT bits Peter Zijlstra
                   ` (36 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Add a --dry-run argument to skip writing the modifications. This is
convenient for debugging.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
---
 tools/objtool/builtin-check.c           |    3 ++-
 tools/objtool/elf.c                     |    3 +++
 tools/objtool/include/objtool/builtin.h |    2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -20,7 +20,7 @@
 #include <objtool/objtool.h>
 
 bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
-     validate_dup, vmlinux, mcount, noinstr, backup, sls;
+     validate_dup, vmlinux, mcount, noinstr, backup, sls, dryrun;
 
 static const char * const check_usage[] = {
 	"objtool check [<options>] file.o",
@@ -46,6 +46,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"),
 	OPT_BOOLEAN('B', "backup", &backup, "create .orig files before modification"),
 	OPT_BOOLEAN('S', "sls", &sls, "validate straight-line-speculation"),
+	OPT_BOOLEAN(0, "dry-run", &dryrun, "don't write the modifications"),
 	OPT_END(),
 };
 
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -1019,6 +1019,9 @@ int elf_write(struct elf *elf)
 	struct section *sec;
 	Elf_Scn *s;
 
+	if (dryrun)
+		return 0;
+
 	/* Update changed relocation sections and section headers: */
 	list_for_each_entry(sec, &elf->sections, list) {
 		if (sec->changed) {
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -9,7 +9,7 @@
 
 extern const struct option check_options[];
 extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
-            validate_dup, vmlinux, mcount, noinstr, backup, sls;
+            validate_dup, vmlinux, mcount, noinstr, backup, sls, dryrun;
 
 extern int cmd_parse_options(int argc, const char **argv, const char * const usage[]);
 

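Per the usage string above, an invocation would look like:

	$ objtool check --dry-run file.o

With the flag set, elf_write() returns before flushing any section or
header changes, so file.o is left untouched.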



* [PATCH v3 04/39] x86/ibt: Base IBT bits
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (2 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 03/39] objtool: Add --dry-run Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-07 11:17   ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 05/39] x86/ibt: Add ANNOTATE_NOENDBR Peter Zijlstra
                   ` (35 subsequent siblings)
  39 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Add Kconfig, Makefile and basic instruction support for x86 IBT.

(Ab)use __DISABLE_EXPORTS to disable IBT since it's already employed
to mark compressed and purgatory. Additionally mark realmode with it
as well to avoid inserting ENDBR instructions there. While ENDBR is
technically a NOP, inserting them was causing some grief due to code
growth. There's also a problem with using __noendbr in code compiled
without -fcf-protection=branch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/Kconfig           |   20 ++++++++++++
 arch/x86/Makefile          |    7 +++-
 arch/x86/include/asm/ibt.h |   74 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 99 insertions(+), 2 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1861,6 +1861,26 @@ config X86_UMIP
 	  specific cases in protected and virtual-8086 modes. Emulated
 	  results are dummy.
 
+config CC_HAS_IBT
+	# GCC >= 9 and binutils >= 2.29
+	# Retpoline check to work around https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93654
+	# Clang/LLVM >= 14
+	# fentry check to work around https://reviews.llvm.org/D111108
+	def_bool ((CC_IS_GCC && $(cc-option, -fcf-protection=branch -mindirect-branch-register)) || \
+		  (CC_IS_CLANG && $(cc-option, -fcf-protection=branch -mfentry))) && \
+		  $(as-instr,endbr64)
+
+config X86_KERNEL_IBT
+	prompt "Indirect Branch Tracking"
+	bool
+	depends on X86_64 && CC_HAS_IBT
+	help
+	  Build the kernel with support for Indirect Branch Tracking, a
+	  hardware-supported coarse-grain forward-edge Control Flow
+	  Integrity protection. It enforces that all indirect calls must
+	  land on an ENDBR instruction; the compiler instruments the code
+	  accordingly to make this happen.
+
 config X86_INTEL_MEMORY_PROTECTION_KEYS
 	prompt "Memory Protection Keys"
 	def_bool y
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -36,7 +36,7 @@ endif
 
 # How to compile the 16-bit code.  Note we always compile for -march=i386;
 # that way we can complain to the user if the CPU is insufficient.
-REALMODE_CFLAGS	:= -m16 -g -Os -DDISABLE_BRANCH_PROFILING \
+REALMODE_CFLAGS	:= -m16 -g -Os -DDISABLE_BRANCH_PROFILING -D__DISABLE_EXPORTS \
 		   -Wall -Wstrict-prototypes -march=i386 -mregparm=3 \
 		   -fno-strict-aliasing -fomit-frame-pointer -fno-pic \
 		   -mno-mmx -mno-sse $(call cc-option,-fcf-protection=none)
@@ -62,8 +62,11 @@ export BITS
 #
 KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
 
-# Intel CET isn't enabled in the kernel
+ifeq ($(CONFIG_X86_KERNEL_IBT),y)
+KBUILD_CFLAGS += $(call cc-option,-fcf-protection=branch)
+else
 KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
+endif
 
 ifeq ($(CONFIG_X86_32),y)
         BITS := 32
--- /dev/null
+++ b/arch/x86/include/asm/ibt.h
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_IBT_H
+#define _ASM_X86_IBT_H
+
+#include <linux/types.h>
+
+#if defined(CONFIG_X86_KERNEL_IBT) && !defined(__DISABLE_EXPORTS)
+
+#define HAS_KERNEL_IBT	1
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_X86_64
+#define ASM_ENDBR	"endbr64\n\t"
+#else
+#define ASM_ENDBR	"endbr32\n\t"
+#endif
+
+#define __noendbr	__attribute__((nocf_check))
+
+static inline __attribute_const__ u32 gen_endbr(void)
+{
+	u32 endbr;
+
+	/*
+	 * Generate ENDBR64 in a way that is sure to not result in
+	 * an ENDBR64 instruction as immediate.
+	 */
+	asm ( "mov $~0xfa1e0ff3, %[endbr]\n\t"
+	      "not %[endbr]\n\t"
+	       : [endbr] "=&r" (endbr) );
+
+	return endbr;
+}
+
+static inline bool is_endbr(u32 val)
+{
+	val &= ~0x01000000U; /* ENDBR32 -> ENDBR64 */
+	return val == gen_endbr();
+}
+
+#else /* __ASSEMBLY__ */
+
+#ifdef CONFIG_X86_64
+#define ENDBR	endbr64
+#else
+#define ENDBR	endbr32
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#else /* !IBT */
+
+#define HAS_KERNEL_IBT	0
+
+#ifndef __ASSEMBLY__
+
+#define ASM_ENDBR
+
+#define __noendbr
+
+static inline bool is_endbr(u32 val) { return false; }
+
+#else /* __ASSEMBLY__ */
+
+#define ENDBR
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* CONFIG_X86_KERNEL_IBT */
+
+#define ENDBR_INSN_SIZE		(4*HAS_KERNEL_IBT)
+
+#endif /* _ASM_X86_IBT_H */

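A note on the gen_endbr() dance above: ENDBR64 is the byte sequence
f3 0f 1e fa, i.e. the little-endian u32 0xfa1e0ff3. Loading that
constant directly would embed those very bytes in the instruction
stream and thus create a spurious landing pad, hence the asm mov of
the inverted value followed by a not, which the compiler cannot fold
back into an immediate. A hypothetical user-space mirror of
is_endbr(), for illustration only:

	#include <stdint.h>

	static int check_endbr(uint32_t val)
	{
		/*
		 * ENDBR32 (f3 0f 1e fb) differs from ENDBR64
		 * (f3 0f 1e fa) only in bit 24 of the LE u32.
		 */
		val &= ~0x01000000u;
		return val == 0xfa1e0ff3u;
	}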



* [PATCH v3 05/39] x86/ibt: Add ANNOTATE_NOENDBR
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (3 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 04/39] x86/ibt: Base IBT bits Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-04 18:59   ` Josh Poimboeuf
  2022-03-03 11:23 ` [PATCH v3 06/39] x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR Peter Zijlstra
                   ` (34 subsequent siblings)
  39 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

In order to have objtool warn about code references to !ENDBR
instructions, we need an annotation to allow this for non-control-flow
instances -- consider text range checks, text patching, or return
trampolines, etc.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 include/linux/objtool.h |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -77,6 +77,12 @@ struct unwind_hint {
 #define STACK_FRAME_NON_STANDARD_FP(func)
 #endif
 
+#define ANNOTATE_NOENDBR					\
+	"986: \n\t"						\
+	".pushsection .discard.noendbr\n\t"			\
+	_ASM_PTR " 986b\n\t"					\
+	".popsection\n\t"
+
 #else /* __ASSEMBLY__ */
 
 /*
@@ -129,6 +135,13 @@ struct unwind_hint {
 	.popsection
 .endm
 
+.macro ANNOTATE_NOENDBR
+.Lhere_\@:
+	.pushsection .discard.noendbr
+	.quad	.Lhere_\@
+	.popsection
+.endm
+
 #endif /* __ASSEMBLY__ */
 
 #else /* !CONFIG_STACK_VALIDATION */
@@ -139,12 +152,15 @@ struct unwind_hint {
 	"\n\t"
 #define STACK_FRAME_NON_STANDARD(func)
 #define STACK_FRAME_NON_STANDARD_FP(func)
+#define ANNOTATE_NOENDBR
 #else
 #define ANNOTATE_INTRA_FUNCTION_CALL
 .macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
 .endm
 .macro STACK_FRAME_NON_STANDARD func:req
 .endm
+.macro ANNOTATE_NOENDBR
+.endm
 #endif
 
 #endif /* CONFIG_STACK_VALIDATION */

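Usage, as seen later in the series, is to tag a code label that is
referenced -- for text patching, range checks and the like -- but is
never an indirect-branch target; a hypothetical example:

	SYM_CODE_START(some_stub)
		UNWIND_HINT_EMPTY
		ANNOTATE_NOENDBR	/* referenced, never called indirectly */
		...
	SYM_CODE_END(some_stub)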



* [PATCH v3 06/39] x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (4 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 05/39] x86/ibt: Add ANNOTATE_NOENDBR Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 07/39] x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch() Peter Zijlstra
                   ` (33 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov


Hide the addresses handed to text_gen_insn() from the compiler; when
it folds them into constants, the resulting code references can
defeat annotations such as ANNOTATE_NOENDBR.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/text-patching.h |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -101,13 +101,21 @@ void *text_gen_insn(u8 opcode, const voi
 	static union text_poke_insn insn; /* per instance */
 	int size = text_opcode_size(opcode);
 
+	/*
+	 * Hide the addresses to avoid the compiler folding in constants when
+	 * referencing code, these can mess up annotations like
+	 * ANNOTATE_NOENDBR.
+	 */
+	OPTIMIZER_HIDE_VAR(addr);
+	OPTIMIZER_HIDE_VAR(dest);
+
 	insn.opcode = opcode;
 
 	if (size > 1) {
 		insn.disp = (long)dest - (long)(addr + size);
 		if (size == 2) {
 			/*
-			 * Ensure that for JMP9 the displacement
+			 * Ensure that for JMP8 the displacement
 			 * actually fits the signed byte.
 			 */
 			BUG_ON((insn.disp >> 31) != (insn.disp >> 7));

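OPTIMIZER_HIDE_VAR() is the usual empty-asm laundering trick; roughly
(simplified from include/linux/compiler.h):

	#define OPTIMIZER_HIDE_VAR(var)				\
		__asm__ ("" : "=r" (var) : "0" (var))

Routing the value through a register the optimizer cannot see into
keeps the code address symbolic instead of letting it become a folded
compile-time constant.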



* [PATCH v3 07/39] x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch()
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (5 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 06/39] x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 08/39] x86/entry: Cleanup PARAVIRT Peter Zijlstra
                   ` (32 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Less duplication is more better.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/text-patching.h |   20 ++++++++++++++------
 arch/x86/kernel/paravirt.c           |   23 +++--------------------
 2 files changed, 17 insertions(+), 26 deletions(-)

--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -96,32 +96,40 @@ union text_poke_insn {
 };
 
 static __always_inline
-void *text_gen_insn(u8 opcode, const void *addr, const void *dest)
+void __text_gen_insn(void *buf, u8 opcode, const void *addr, const void *dest, int size)
 {
-	static union text_poke_insn insn; /* per instance */
-	int size = text_opcode_size(opcode);
+	union text_poke_insn *insn = buf;
+
+	BUG_ON(size < text_opcode_size(opcode));
 
 	/*
 	 * Hide the addresses to avoid the compiler folding in constants when
 	 * referencing code, these can mess up annotations like
 	 * ANNOTATE_NOENDBR.
 	 */
+	OPTIMIZER_HIDE_VAR(insn);
 	OPTIMIZER_HIDE_VAR(addr);
 	OPTIMIZER_HIDE_VAR(dest);
 
-	insn.opcode = opcode;
+	insn->opcode = opcode;
 
 	if (size > 1) {
-		insn.disp = (long)dest - (long)(addr + size);
+		insn->disp = (long)dest - (long)(addr + size);
 		if (size == 2) {
 			/*
 			 * Ensure that for JMP8 the displacement
 			 * actually fits the signed byte.
 			 */
-			BUG_ON((insn.disp >> 31) != (insn.disp >> 7));
+			BUG_ON((insn->disp >> 31) != (insn->disp >> 7));
 		}
 	}
+}
 
+static __always_inline
+void *text_gen_insn(u8 opcode, const void *addr, const void *dest)
+{
+	static union text_poke_insn insn; /* per instance */
+	__text_gen_insn(&insn, opcode, addr, dest, text_opcode_size(opcode));
 	return &insn.text;
 }
 
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -69,29 +69,12 @@ noinstr void paravirt_BUG(void)
 	BUG();
 }
 
-struct branch {
-	unsigned char opcode;
-	u32 delta;
-} __attribute__((packed));
-
 static unsigned paravirt_patch_call(void *insn_buff, const void *target,
 				    unsigned long addr, unsigned len)
 {
-	const int call_len = 5;
-	struct branch *b = insn_buff;
-	unsigned long delta = (unsigned long)target - (addr+call_len);
-
-	if (len < call_len) {
-		pr_warn("paravirt: Failed to patch indirect CALL at %ps\n", (void *)addr);
-		/* Kernel might not be viable if patching fails, bail out: */
-		BUG_ON(1);
-	}
-
-	b->opcode = 0xe8; /* call */
-	b->delta = delta;
-	BUILD_BUG_ON(sizeof(*b) != call_len);
-
-	return call_len;
+	__text_gen_insn(insn_buff, CALL_INSN_OPCODE,
+			(void *)addr, target, CALL_INSN_SIZE);
+	return CALL_INSN_SIZE;
 }
 
 #ifdef CONFIG_PARAVIRT_XXL

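A sketch of the buffer-based variant in use, mirroring the
paravirt_patch_call() rewrite above:

	u8 buf[CALL_INSN_SIZE];

	__text_gen_insn(buf, CALL_INSN_OPCODE, (void *)addr, target,
			CALL_INSN_SIZE);
	/* buf now holds: e8 <rel32>, rel32 = target - (addr + 5) */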



* [PATCH v3 08/39] x86/entry: Cleanup PARAVIRT
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (6 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 07/39] x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch() Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 09/39] x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel() Peter Zijlstra
                   ` (31 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov, Andrew Cooper

Since commit 5c8f6a2e316e ("x86/xen: Add
xenpv_restore_regs_and_return_to_usermode()") Xen will no longer reach
this code and we can do away with the paravirt
SWAPGS/INTERRUPT_RETURN.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_64.S |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -619,8 +619,8 @@ SYM_INNER_LABEL(swapgs_restore_regs_and_
 
 	/* Restore RDI. */
 	popq	%rdi
-	SWAPGS
-	INTERRUPT_RETURN
+	swapgs
+	jmp	native_iret
 
 
 SYM_INNER_LABEL(restore_regs_and_return_to_kernel, SYM_L_GLOBAL)




* [PATCH v3 09/39] x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (7 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 08/39] x86/entry: Cleanup PARAVIRT Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 10/39] x86/ibt,xen: Sprinkle the ENDBR Peter Zijlstra
                   ` (30 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov, Andrew Cooper

By doing an early rewrite of 'jmp native_iret' in
restore_regs_and_return_to_kernel() we can get rid of the last
INTERRUPT_RETURN user and paravirt_iret.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_64.S             |   11 ++++++++---
 arch/x86/include/asm/irqflags.h       |    5 -----
 arch/x86/include/asm/paravirt_types.h |    1 -
 arch/x86/kernel/head_64.S             |    3 ++-
 arch/x86/kernel/paravirt.c            |    4 ----
 arch/x86/xen/enlighten_pv.c           |    7 ++++++-
 arch/x86/xen/xen-asm.S                |    1 +
 7 files changed, 17 insertions(+), 15 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -609,7 +609,7 @@ SYM_INNER_LABEL(swapgs_restore_regs_and_
 	/* Restore RDI. */
 	popq	%rdi
 	swapgs
-	jmp	native_iret
+	jmp	.Lnative_iret
 
 
 SYM_INNER_LABEL(restore_regs_and_return_to_kernel, SYM_L_GLOBAL)
@@ -626,9 +626,14 @@ SYM_INNER_LABEL(restore_regs_and_return_
 	 * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization
 	 * when returning from IPI handler.
 	 */
-	INTERRUPT_RETURN
+#ifdef CONFIG_XEN_PV
+SYM_INNER_LABEL(early_xen_iret_patch, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
+	.byte 0xe9
+	.long .Lnative_iret - (. + 4)
+#endif
 
-SYM_INNER_LABEL_ALIGN(native_iret, SYM_L_GLOBAL)
+.Lnative_iret:
 	UNWIND_HINT_IRET_REGS
 	/*
 	 * Are we returning to a stack segment from the LDT?  Note: in
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -141,13 +141,8 @@ static __always_inline void arch_local_i
 #ifdef CONFIG_X86_64
 #ifdef CONFIG_XEN_PV
 #define SWAPGS	ALTERNATIVE "swapgs", "", X86_FEATURE_XENPV
-#define INTERRUPT_RETURN						\
-	ANNOTATE_RETPOLINE_SAFE;					\
-	ALTERNATIVE_TERNARY("jmp *paravirt_iret(%rip);",		\
-		X86_FEATURE_XENPV, "jmp xen_iret;", "jmp native_iret;")
 #else
 #define SWAPGS	swapgs
-#define INTERRUPT_RETURN	jmp native_iret
 #endif
 #endif
 #endif /* !__ASSEMBLY__ */
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -272,7 +272,6 @@ struct paravirt_patch_template {
 
 extern struct pv_info pv_info;
 extern struct paravirt_patch_template pv_ops;
-extern void (*paravirt_iret)(void);
 
 #define PARAVIRT_PATCH(x)					\
 	(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -345,7 +345,6 @@ SYM_CODE_START_NOALIGN(vc_boot_ghcb)
 	/* Remove Error Code */
 	addq    $8, %rsp
 
-	/* Pure iret required here - don't use INTERRUPT_RETURN */
 	iretq
 SYM_CODE_END(vc_boot_ghcb)
 #endif
@@ -426,6 +425,8 @@ SYM_CODE_END(early_idt_handler_common)
  * early_idt_handler_array can't be used because it returns via the
  * paravirtualized INTERRUPT_RETURN and pv-ops don't work that early.
  *
+ * XXX it does, fix this.
+ *
  * This handler will end up in the .init.text section and not be
  * available to boot secondary CPUs.
  */
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -132,8 +132,6 @@ void paravirt_set_sched_clock(u64 (*func
 }
 
 /* These are in entry.S */
-extern void native_iret(void);
-
 static struct resource reserve_ioports = {
 	.start = 0,
 	.end = IO_SPACE_LIMIT,
@@ -397,8 +395,6 @@ struct paravirt_patch_template pv_ops =
 
 #ifdef CONFIG_PARAVIRT_XXL
 NOKPROBE_SYMBOL(native_load_idt);
-
-void (*paravirt_iret)(void) = native_iret;
 #endif
 
 EXPORT_SYMBOL(pv_ops);
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1177,6 +1177,8 @@ static void __init xen_domu_set_legacy_f
 	x86_platform.legacy.rtc = 0;
 }
 
+extern void early_xen_iret_patch(void);
+
 /* First C function to be called on Xen boot */
 asmlinkage __visible void __init xen_start_kernel(void)
 {
@@ -1187,6 +1189,10 @@ asmlinkage __visible void __init xen_sta
 	if (!xen_start_info)
 		return;
 
+	__text_gen_insn(&early_xen_iret_patch,
+			JMP32_INSN_OPCODE, &early_xen_iret_patch, &xen_iret,
+			JMP32_INSN_SIZE);
+
 	xen_domain_type = XEN_PV_DOMAIN;
 	xen_start_flags = xen_start_info->flags;
 
@@ -1195,7 +1201,6 @@ asmlinkage __visible void __init xen_sta
 	/* Install Xen paravirt ops */
 	pv_info = xen_info;
 	pv_ops.cpu = xen_cpu_ops.cpu;
-	paravirt_iret = xen_iret;
 	xen_init_irq_ops();
 
 	/*
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -189,6 +189,7 @@ hypercall_iret = hypercall_page + __HYPE
  */
 SYM_CODE_START(xen_iret)
 	UNWIND_HINT_EMPTY
+	ANNOTATE_NOENDBR
 	pushq $0
 	jmp hypercall_iret
 SYM_CODE_END(xen_iret)

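For clarity, the hand-assembled '.byte 0xe9; .long ...' above encodes
a plain JMP32. With the jmp at address A, '.' inside the .long
evaluates to A + 1 (past the opcode byte), so:

	rel32  = .Lnative_iret - ((A + 1) + 4) = .Lnative_iret - (A + 5)
	target = A + 5 + rel32                 = .Lnative_iret

which is exactly the displacement xen_start_kernel() later overwrites
via __text_gen_insn() to point at xen_iret instead.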



* [PATCH v3 10/39] x86/ibt,xen: Sprinkle the ENDBR
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (8 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 09/39] x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel() Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 11/39] x86/ibt,entry: Sprinkle ENDBR dust Peter Zijlstra
                   ` (29 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Even though Xen currently doesn't advertise IBT, prepare for when it
will eventually do so and sprinkle the ENDBR dust accordingly.

While most of the entry points are IRET-like, the CPL0 hypervisor can
set WAIT-FOR-ENDBR and demand ENDBR at these sites.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_64.S      |    1 +
 arch/x86/include/asm/segment.h |    2 +-
 arch/x86/kernel/head_64.S      |    1 +
 arch/x86/xen/enlighten_pv.c    |    3 +++
 arch/x86/xen/xen-asm.S         |    9 +++++++++
 arch/x86/xen/xen-head.S        |    8 ++++++--
 6 files changed, 21 insertions(+), 3 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -809,6 +809,7 @@ SYM_CODE_END(exc_xen_hypervisor_callback
  */
 SYM_CODE_START(xen_failsafe_callback)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	movl	%ds, %ecx
 	cmpw	%cx, 0x10(%rsp)
 	jne	1f
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -283,7 +283,7 @@ static inline void vdso_read_cpunode(uns
  * pop %rcx; pop %r11; jmp early_idt_handler_array[i]; summing up to
  * max 8 bytes.
  */
-#define XEN_EARLY_IDT_HANDLER_SIZE 8
+#define XEN_EARLY_IDT_HANDLER_SIZE (8 + ENDBR_INSN_SIZE)
 
 #ifndef __ASSEMBLY__
 
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -383,6 +383,7 @@ SYM_CODE_START(early_idt_handler_array)
 	.endr
 	UNWIND_HINT_IRET_REGS offset=16
 SYM_CODE_END(early_idt_handler_array)
+	ANNOTATE_NOENDBR // early_idt_handler_array[NUM_EXCEPTION_VECTORS]
 
 SYM_CODE_START_LOCAL(early_idt_handler_common)
 	/*
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -624,6 +624,9 @@ static struct trap_array_entry trap_arra
 	TRAP_ENTRY(exc_coprocessor_error,		false ),
 	TRAP_ENTRY(exc_alignment_check,			false ),
 	TRAP_ENTRY(exc_simd_coprocessor_error,		false ),
+#ifdef CONFIG_X86_KERNEL_IBT
+	TRAP_ENTRY(exc_control_protection,		false ),
+#endif
 };
 
 static bool __ref get_trap_addr(void **addr, unsigned int ist)
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -122,6 +122,7 @@ SYM_FUNC_END(xen_read_cr2_direct);
 .macro xen_pv_trap name
 SYM_CODE_START(xen_\name)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	pop %rcx
 	pop %r11
 	jmp  \name
@@ -147,6 +148,9 @@ xen_pv_trap asm_exc_page_fault
 xen_pv_trap asm_exc_spurious_interrupt_bug
 xen_pv_trap asm_exc_coprocessor_error
 xen_pv_trap asm_exc_alignment_check
+#ifdef CONFIG_X86_KERNEL_IBT
+xen_pv_trap asm_exc_control_protection
+#endif
 #ifdef CONFIG_X86_MCE
 xen_pv_trap asm_xenpv_exc_machine_check
 #endif /* CONFIG_X86_MCE */
@@ -162,6 +166,7 @@ SYM_CODE_START(xen_early_idt_handler_arr
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
 	UNWIND_HINT_EMPTY
+	ENDBR
 	pop %rcx
 	pop %r11
 	jmp early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE
@@ -231,6 +236,7 @@ SYM_CODE_END(xenpv_restore_regs_and_retu
 /* Normal 64-bit system call target */
 SYM_CODE_START(xen_syscall_target)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	popq %rcx
 	popq %r11
 
@@ -250,6 +256,7 @@ SYM_CODE_END(xen_syscall_target)
 /* 32-bit compat syscall target */
 SYM_CODE_START(xen_syscall32_target)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	popq %rcx
 	popq %r11
 
@@ -267,6 +274,7 @@ SYM_CODE_END(xen_syscall32_target)
 /* 32-bit compat sysenter target */
 SYM_CODE_START(xen_sysenter_target)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	/*
 	 * NB: Xen is polite and clears TF from EFLAGS for us.  This means
 	 * that we don't need to guard against single step exceptions here.
@@ -290,6 +298,7 @@ SYM_CODE_END(xen_sysenter_target)
 SYM_CODE_START(xen_syscall32_target)
 SYM_CODE_START(xen_sysenter_target)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	lea 16(%rsp), %rsp	/* strip %rcx, %r11 */
 	mov $-ENOSYS, %rax
 	pushq $0
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -25,8 +25,11 @@
 SYM_CODE_START(hypercall_page)
 	.rept (PAGE_SIZE / 32)
 		UNWIND_HINT_FUNC
-		.skip 31, 0x90
-		RET
+		ANNOTATE_NOENDBR
+		/*
+		 * Xen will write the hypercall page, and sort out ENDBR.
+		 */
+		.skip 32, 0xcc
 	.endr
 
 #define HYPERCALL(n) \
@@ -74,6 +77,7 @@ SYM_CODE_END(startup_xen)
 .pushsection .text
 SYM_CODE_START(asm_cpu_bringup_and_idle)
 	UNWIND_HINT_EMPTY
+	ENDBR
 
 	call cpu_bringup_and_idle
 SYM_CODE_END(asm_cpu_bringup_and_idle)




* [PATCH v3 11/39] x86/ibt,entry: Sprinkle ENDBR dust
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (9 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 10/39] x86/ibt,xen: Sprinkle the ENDBR Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 12/39] x86/linkage: Add ENDBR to SYM_FUNC_START*() Peter Zijlstra
                   ` (28 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Kernel entry points should have ENDBR for IBT configs.

The SYSCALL entry points are found by taking their respective
addresses in order to program them into the MSRs, while the exception
entry points are found through UNWIND_HINT_IRET_REGS.

The rule is that any UNWIND_HINT_IRET_REGS at sym+0 should have an
ENDBR; see the later objtool IBT validation patch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_64.S        |    6 ++++++
 arch/x86/entry/entry_64_compat.S |    3 +++
 arch/x86/include/asm/idtentry.h  |   20 +++++++++++---------
 arch/x86/include/asm/segment.h   |    3 ++-
 arch/x86/kernel/head_64.S        |    4 +++-
 arch/x86/kernel/idt.c            |    5 +++--
 6 files changed, 28 insertions(+), 13 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -86,6 +86,7 @@
 
 SYM_CODE_START(entry_SYSCALL_64)
 	UNWIND_HINT_EMPTY
+	ENDBR
 
 	swapgs
 	/* tss.sp2 is scratch space. */
@@ -350,6 +351,7 @@ SYM_CODE_END(ret_from_fork)
 .macro idtentry vector asmsym cfunc has_error_code:req
 SYM_CODE_START(\asmsym)
 	UNWIND_HINT_IRET_REGS offset=\has_error_code*8
+	ENDBR
 	ASM_CLAC
 
 	.if \has_error_code == 0
@@ -417,6 +419,7 @@ SYM_CODE_END(\asmsym)
 .macro idtentry_mce_db vector asmsym cfunc
 SYM_CODE_START(\asmsym)
 	UNWIND_HINT_IRET_REGS
+	ENDBR
 	ASM_CLAC
 
 	pushq	$-1			/* ORIG_RAX: no syscall to restart */
@@ -472,6 +475,7 @@ SYM_CODE_END(\asmsym)
 .macro idtentry_vc vector asmsym cfunc
 SYM_CODE_START(\asmsym)
 	UNWIND_HINT_IRET_REGS
+	ENDBR
 	ASM_CLAC
 
 	/*
@@ -533,6 +537,7 @@ SYM_CODE_END(\asmsym)
 .macro idtentry_df vector asmsym cfunc
 SYM_CODE_START(\asmsym)
 	UNWIND_HINT_IRET_REGS offset=8
+	ENDBR
 	ASM_CLAC
 
 	/* paranoid_entry returns GS information for paranoid_exit in EBX. */
@@ -1069,6 +1074,7 @@ SYM_CODE_END(error_return)
  */
 SYM_CODE_START(asm_exc_nmi)
 	UNWIND_HINT_IRET_REGS
+	ENDBR
 
 	/*
 	 * We allow breakpoints in NMIs. If a breakpoint occurs, then
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -48,6 +48,7 @@
  */
 SYM_CODE_START(entry_SYSENTER_compat)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	/* Interrupts are off on entry. */
 	SWAPGS
 
@@ -198,6 +199,7 @@ SYM_CODE_END(entry_SYSENTER_compat)
  */
 SYM_CODE_START(entry_SYSCALL_compat)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	/* Interrupts are off on entry. */
 	swapgs
 
@@ -340,6 +342,7 @@ SYM_CODE_END(entry_SYSCALL_compat)
  */
 SYM_CODE_START(entry_INT80_compat)
 	UNWIND_HINT_EMPTY
+	ENDBR
 	/*
 	 * Interrupts are off on entry.
 	 */
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -5,6 +5,8 @@
 /* Interrupts/Exceptions */
 #include <asm/trapnr.h>
 
+#define IDT_ALIGN	(8 * (1 + HAS_KERNEL_IBT))
+
 #ifndef __ASSEMBLY__
 #include <linux/entry-common.h>
 #include <linux/hardirq.h>
@@ -480,7 +482,7 @@ __visible noinstr void func(struct pt_re
 
 /*
  * ASM code to emit the common vector entry stubs where each stub is
- * packed into 8 bytes.
+ * packed into IDT_ALIGN bytes.
  *
  * Note, that the 'pushq imm8' is emitted via '.byte 0x6a, vector' because
  * GCC treats the local vector variable as unsigned int and would expand
@@ -492,33 +494,33 @@ __visible noinstr void func(struct pt_re
  * point is to mask off the bits above bit 7 because the push is sign
  * extending.
  */
-	.align 8
+	.align IDT_ALIGN
 SYM_CODE_START(irq_entries_start)
     vector=FIRST_EXTERNAL_VECTOR
     .rept NR_EXTERNAL_VECTORS
 	UNWIND_HINT_IRET_REGS
 0 :
+	ENDBR
 	.byte	0x6a, vector
 	jmp	asm_common_interrupt
-	nop
-	/* Ensure that the above is 8 bytes max */
-	. = 0b + 8
+	/* Ensure that the above is IDT_ALIGN bytes max */
+	.fill 0b + IDT_ALIGN - ., 1, 0xcc
 	vector = vector+1
     .endr
 SYM_CODE_END(irq_entries_start)
 
 #ifdef CONFIG_X86_LOCAL_APIC
-	.align 8
+	.align IDT_ALIGN
 SYM_CODE_START(spurious_entries_start)
     vector=FIRST_SYSTEM_VECTOR
     .rept NR_SYSTEM_VECTORS
 	UNWIND_HINT_IRET_REGS
 0 :
+	ENDBR
 	.byte	0x6a, vector
 	jmp	asm_spurious_interrupt
-	nop
-	/* Ensure that the above is 8 bytes max */
-	. = 0b + 8
+	/* Ensure that the above is IDT_ALIGN bytes max */
+	.fill 0b + IDT_ALIGN - ., 1, 0xcc
 	vector = vector+1
     .endr
 SYM_CODE_END(spurious_entries_start)
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -4,6 +4,7 @@
 
 #include <linux/const.h>
 #include <asm/alternative.h>
+#include <asm/ibt.h>
 
 /*
  * Constructor for a conventional segment GDT (or LDT) entry.
@@ -275,7 +276,7 @@ static inline void vdso_read_cpunode(uns
  * vector has no error code (two bytes), a 'push $vector_number' (two
  * bytes), and a jump to the common entry code (up to five bytes).
  */
-#define EARLY_IDT_HANDLER_SIZE 9
+#define EARLY_IDT_HANDLER_SIZE (9 + ENDBR_INSN_SIZE)
 
 /*
  * xen_early_idt_handler_array is for Xen pv guests: for each entry in
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -371,9 +371,11 @@ SYM_CODE_START(early_idt_handler_array)
 	.rept NUM_EXCEPTION_VECTORS
 	.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
 		UNWIND_HINT_IRET_REGS
+		ENDBR
 		pushq $0	# Dummy error code, to make stack frame uniform
 	.else
 		UNWIND_HINT_IRET_REGS offset=8
+		ENDBR
 	.endif
 	pushq $i		# 72(%rsp) Vector number
 	jmp early_idt_handler_common
@@ -381,11 +383,11 @@ SYM_CODE_START(early_idt_handler_array)
 	i = i + 1
 	.fill early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE - ., 1, 0xcc
 	.endr
-	UNWIND_HINT_IRET_REGS offset=16
 SYM_CODE_END(early_idt_handler_array)
 	ANNOTATE_NOENDBR // early_idt_handler_array[NUM_EXCEPTION_VECTORS]
 
 SYM_CODE_START_LOCAL(early_idt_handler_common)
+	UNWIND_HINT_IRET_REGS offset=16
 	/*
 	 * The stack is the hardware frame, an error code or zero, and the
 	 * vector number.
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -10,6 +10,7 @@
 #include <asm/proto.h>
 #include <asm/desc.h>
 #include <asm/hw_irq.h>
+#include <asm/idtentry.h>
 
 #define DPL0		0x0
 #define DPL3		0x3
@@ -272,7 +273,7 @@ void __init idt_setup_apic_and_irq_gates
 	idt_setup_from_table(idt_table, apic_idts, ARRAY_SIZE(apic_idts), true);
 
 	for_each_clear_bit_from(i, system_vectors, FIRST_SYSTEM_VECTOR) {
-		entry = irq_entries_start + 8 * (i - FIRST_EXTERNAL_VECTOR);
+		entry = irq_entries_start + IDT_ALIGN * (i - FIRST_EXTERNAL_VECTOR);
 		set_intr_gate(i, entry);
 	}
 
@@ -283,7 +284,7 @@ void __init idt_setup_apic_and_irq_gates
 		 * system_vectors bitmap. Otherwise they show up in
 		 * /proc/interrupts.
 		 */
-		entry = spurious_entries_start + 8 * (i - FIRST_SYSTEM_VECTOR);
+		entry = spurious_entries_start + IDT_ALIGN * (i - FIRST_SYSTEM_VECTOR);
 		set_intr_gate(i, entry);
 	}
 #endif

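Worked stub geometry, assuming an IBT build where IDT_ALIGN is 16:

	endbr64				# 4 bytes
	.byte 0x6a, vector		# 2 bytes, push imm8
	jmp asm_common_interrupt	# 5 bytes
	<int3 fill>			# 0xcc padding up to 16 bytes

and the lookup side indexes accordingly:

	entry = irq_entries_start + 16 * (i - FIRST_EXTERNAL_VECTOR);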



* [PATCH v3 12/39] x86/linkage: Add ENDBR to SYM_FUNC_START*()
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (10 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 11/39] x86/ibt,entry: Sprinkle ENDBR dust Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 13/39] x86/ibt,paravirt: Sprinkle ENDBR Peter Zijlstra
                   ` (27 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Ensure the ASM functions have ENDBR on for IBT builds; this follows
the ARM64 example. Unlike ARM64, we'll likely end up overwriting them
with poison.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/linkage.h |   39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

--- a/arch/x86/include/asm/linkage.h
+++ b/arch/x86/include/asm/linkage.h
@@ -3,6 +3,7 @@
 #define _ASM_X86_LINKAGE_H
 
 #include <linux/stringify.h>
+#include <asm/ibt.h>
 
 #undef notrace
 #define notrace __attribute__((no_instrument_function))
@@ -34,5 +35,43 @@
 
 #endif /* __ASSEMBLY__ */
 
+/*
+ * compressed and purgatory define this to disable EXPORT,
+ * hijack this same to also not emit ENDBR.
+ */
+#ifndef __DISABLE_EXPORTS
+
+/* SYM_FUNC_START -- use for global functions */
+#define SYM_FUNC_START(name)				\
+	SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)	\
+	ENDBR
+
+/* SYM_FUNC_START_NOALIGN -- use for global functions, w/o alignment */
+#define SYM_FUNC_START_NOALIGN(name)			\
+	SYM_START(name, SYM_L_GLOBAL, SYM_A_NONE)	\
+	ENDBR
+
+/* SYM_FUNC_START_LOCAL -- use for local functions */
+#define SYM_FUNC_START_LOCAL(name)			\
+	SYM_START(name, SYM_L_LOCAL, SYM_A_ALIGN)	\
+	ENDBR
+
+/* SYM_FUNC_START_LOCAL_NOALIGN -- use for local functions, w/o alignment */
+#define SYM_FUNC_START_LOCAL_NOALIGN(name)		\
+	SYM_START(name, SYM_L_LOCAL, SYM_A_NONE)	\
+	ENDBR
+
+/* SYM_FUNC_START_WEAK -- use for weak functions */
+#define SYM_FUNC_START_WEAK(name)			\
+	SYM_START(name, SYM_L_WEAK, SYM_A_ALIGN)	\
+	ENDBR
+
+/* SYM_FUNC_START_WEAK_NOALIGN -- use for weak functions, w/o alignment */
+#define SYM_FUNC_START_WEAK_NOALIGN(name)		\
+	SYM_START(name, SYM_L_WEAK, SYM_A_NONE)		\
+	ENDBR
+
+#endif /* __DISABLE_EXPORTS */
+
 #endif /* _ASM_X86_LINKAGE_H */
 

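Effectively, every asm function in an IBT build now opens with a
landing pad; for a hypothetical my_func:

	SYM_FUNC_START(my_func)		/* my_func: now starts with endbr64 */
	...
	SYM_FUNC_END(my_func)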



* [PATCH v3 13/39] x86/ibt,paravirt: Sprinkle ENDBR
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (11 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 12/39] x86/linkage: Add ENDBR to SYM_FUNC_START*() Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 14/39] x86/ibt,crypto: Add ENDBR for the jump-table entries Peter Zijlstra
                   ` (26 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov


The paravirt thunks and stubs defined in inline asm are reached via
indirect calls through pv_ops; give them ENDBR so they remain valid
landing pads with IBT enabled.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/paravirt.h           |    1 +
 arch/x86/include/asm/qspinlock_paravirt.h |    3 +++
 arch/x86/kernel/kvm.c                     |    3 ++-
 arch/x86/kernel/paravirt.c                |    2 ++
 4 files changed, 8 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -666,6 +666,7 @@ bool __raw_callee_save___native_vcpu_is_
 	    ".globl " PV_THUNK_NAME(func) ";"				\
 	    ".type " PV_THUNK_NAME(func) ", @function;"			\
 	    PV_THUNK_NAME(func) ":"					\
+	    ASM_ENDBR							\
 	    FRAME_BEGIN							\
 	    PV_SAVE_ALL_CALLER_REGS					\
 	    "call " #func ";"						\
--- a/arch/x86/include/asm/qspinlock_paravirt.h
+++ b/arch/x86/include/asm/qspinlock_paravirt.h
@@ -2,6 +2,8 @@
 #ifndef __ASM_QSPINLOCK_PARAVIRT_H
 #define __ASM_QSPINLOCK_PARAVIRT_H
 
+#include <asm/ibt.h>
+
 /*
  * For x86-64, PV_CALLEE_SAVE_REGS_THUNK() saves and restores 8 64-bit
  * registers. For i386, however, only 1 32-bit register needs to be saved
@@ -39,6 +41,7 @@ asm    (".pushsection .text;"
 	".type " PV_UNLOCK ", @function;"
 	".align 4,0x90;"
 	PV_UNLOCK ": "
+	ASM_ENDBR
 	FRAME_BEGIN
 	"push  %rdx;"
 	"mov   $0x1,%eax;"
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1024,10 +1024,11 @@ asm(
 ".global __raw_callee_save___kvm_vcpu_is_preempted;"
 ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
 "__raw_callee_save___kvm_vcpu_is_preempted:"
+ASM_ENDBR
 "movq	__per_cpu_offset(,%rdi,8), %rax;"
 "cmpb	$0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
 "setne	%al;"
-"ret;"
+ASM_RET
 ".size __raw_callee_save___kvm_vcpu_is_preempted, .-__raw_callee_save___kvm_vcpu_is_preempted;"
 ".popsection");
 
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -41,6 +41,7 @@ extern void _paravirt_nop(void);
 asm (".pushsection .entry.text, \"ax\"\n"
      ".global _paravirt_nop\n"
      "_paravirt_nop:\n\t"
+     ASM_ENDBR
      ASM_RET
      ".size _paravirt_nop, . - _paravirt_nop\n\t"
      ".type _paravirt_nop, @function\n\t"
@@ -50,6 +51,7 @@ asm (".pushsection .entry.text, \"ax\"\n
 asm (".pushsection .entry.text, \"ax\"\n"
      ".global paravirt_ret0\n"
      "paravirt_ret0:\n\t"
+     ASM_ENDBR
      "xor %" _ASM_AX ", %" _ASM_AX ";\n\t"
      ASM_RET
      ".size paravirt_ret0, . - paravirt_ret0\n\t"




* [PATCH v3 14/39] x86/ibt,crypto: Add ENDBR for the jump-table entries
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (12 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 13/39] x86/ibt,paravirt: Sprinkle ENDBR Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 15/39] x86/ibt,kvm: Add ENDBR to fastops Peter Zijlstra
                   ` (25 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

The code does:

	## branch into array
	mov     jump_table(,%rax,8), %bufp
	JMP_NOSPEC bufp

which means the jump-table entries need to be marked with ENDBR.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S |    3 +++
 1 file changed, 3 insertions(+)

--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -195,6 +195,7 @@ SYM_FUNC_START(crc_pcl)
 .altmacro
 LABEL crc_ %i
 .noaltmacro
+	ENDBR
 	crc32q   -i*8(block_0), crc_init
 	crc32q   -i*8(block_1), crc1
 	crc32q   -i*8(block_2), crc2
@@ -204,6 +205,7 @@ LABEL crc_ %i
 .altmacro
 LABEL crc_ %i
 .noaltmacro
+	ENDBR
 	crc32q   -i*8(block_0), crc_init
 	crc32q   -i*8(block_1), crc1
 # SKIP  crc32  -i*8(block_2), crc2 ; Don't do this one yet
@@ -237,6 +239,7 @@ LABEL crc_ %i
 	################################################################
 
 LABEL crc_ 0
+	ENDBR
 	mov     tmp, len
 	cmp     $128*24, tmp
 	jae     full_block




* [PATCH v3 15/39] x86/ibt,kvm: Add ENDBR to fastops
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (13 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 14/39] x86/ibt,crypto: Add ENDBR for the jump-table entries Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 16/39] x86/ibt,ftrace: Search for __fentry__ location Peter Zijlstra
                   ` (24 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov


The fastop() machinery dispatches to these stubs with an indirect
call; give each fastop entry an ENDBR and grow FASTOP_SIZE so the
aligned entries still fit.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -189,7 +189,7 @@
 #define X16(x...) X8(x), X8(x)
 
 #define NR_FASTOP (ilog2(sizeof(ulong)) + 1)
-#define FASTOP_SIZE 8
+#define FASTOP_SIZE (8 * (1 + HAS_KERNEL_IBT))
 
 struct opcode {
 	u64 flags;
@@ -311,7 +311,8 @@ static int fastop(struct x86_emulate_ctx
 #define __FOP_FUNC(name) \
 	".align " __stringify(FASTOP_SIZE) " \n\t" \
 	".type " name ", @function \n\t" \
-	name ":\n\t"
+	name ":\n\t" \
+	ASM_ENDBR
 
 #define FOP_FUNC(name) \
 	__FOP_FUNC(#name)
@@ -433,6 +434,7 @@ static int fastop(struct x86_emulate_ctx
 	".align 4 \n\t" \
 	".type " #op ", @function \n\t" \
 	#op ": \n\t" \
+	ASM_ENDBR \
 	#op " %al \n\t" \
 	__FOP_RET(#op)
 

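The size arithmetic, spelled out:

	FASTOP_SIZE = 8 * (1 + HAS_KERNEL_IBT)
	            = 8	without IBT
	            = 16	with IBT

i.e. doubling the aligned slot leaves room for the 4-byte ENDBR in
front of each op plus the op itself and its return.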



* [PATCH v3 16/39] x86/ibt,ftrace: Search for __fentry__ location
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (14 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 15/39] x86/ibt,kvm: Add ENDBR to fastops Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 17/39] x86/livepatch: Validate " Peter Zijlstra
                   ` (23 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Have ftrace_location() search the symbol for the __fentry__ location
when it isn't at func+0 and use this for {,un}register_ftrace_direct().

This avoids a whole bunch of assumptions about __fentry__ being at
func+0.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/trace/ftrace.c |   34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1578,7 +1578,25 @@ unsigned long ftrace_location_range(unsi
  */
 unsigned long ftrace_location(unsigned long ip)
 {
-	return ftrace_location_range(ip, ip);
+	struct dyn_ftrace *rec;
+	unsigned long offset;
+	unsigned long size;
+
+	rec = lookup_rec(ip, ip);
+	if (!rec) {
+		if (!kallsyms_lookup_size_offset(ip, &size, &offset))
+			goto out;
+
+		/* map sym+0 to __fentry__ */
+		if (!offset)
+			rec = lookup_rec(ip, ip + size - 1);
+	}
+
+	if (rec)
+		return rec->ip;
+
+out:
+	return 0;
 }
 
 /**
@@ -4962,7 +4980,8 @@ ftrace_match_addr(struct ftrace_hash *ha
 {
 	struct ftrace_func_entry *entry;
 
-	if (!ftrace_location(ip))
+	ip = ftrace_location(ip);
+	if (!ip)
 		return -EINVAL;
 
 	if (remove) {
@@ -5110,11 +5129,16 @@ int register_ftrace_direct(unsigned long
 	struct ftrace_func_entry *entry;
 	struct ftrace_hash *free_hash = NULL;
 	struct dyn_ftrace *rec;
-	int ret = -EBUSY;
+	int ret = -ENODEV;
 
 	mutex_lock(&direct_mutex);
 
+	ip = ftrace_location(ip);
+	if (!ip)
+		goto out_unlock;
+
 	/* See if there's a direct function at @ip already */
+	ret = -EBUSY;
 	if (ftrace_find_rec_direct(ip))
 		goto out_unlock;
 
@@ -5222,6 +5246,10 @@ int unregister_ftrace_direct(unsigned lo
 
 	mutex_lock(&direct_mutex);
 
+	ip = ftrace_location(ip);
+	if (!ip)
+		goto out_unlock;
+
 	entry = find_direct_entry(&ip, NULL);
 	if (!entry)
 		goto out_unlock;

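The net effect, sketched for an IBT build where the 4-byte ENDBR
pushes __fentry__ past func+0 (my_func is hypothetical):

	unsigned long ip = ftrace_location((unsigned long)my_func);
	/*
	 * Previously: an exact lookup of my_func+0 fails -> 0.
	 * Now: sym+0 maps to the whole symbol range, so this returns
	 * the actual __fentry__ site, e.g. my_func+4.
	 */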



* [PATCH v3 17/39] x86/livepatch: Validate __fentry__ location
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (15 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 16/39] x86/ibt,ftrace: Search for __fentry__ location Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice Peter Zijlstra
                   ` (22 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Currently livepatch assumes __fentry__ lives at func+0, which is most
likely untrue with IBT on. Instead make it use ftrace_location() by
default which both validates and finds the actual ip if there is any
in the same symbol.
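
To illustrate why, a hand-written stand-in for compiler output with
-mfentry plus IBT ('demo_func' is hypothetical):

	asm(
	"	.pushsection .text, \"ax\", @progbits\n"
	"	.globl	demo_func\n"
	"	.type	demo_func, @function\n"
	"demo_func:\n"
	"	endbr64\n"		/* demo_func+0: IBT landing pad */
	"	call	__fentry__\n"	/* demo_func+4: the ftrace location */
	"	xor	%eax, %eax\n"
	"	ret\n"
	"	.size	demo_func, .-demo_func\n"
	"	.popsection\n"
	);

ftrace_location() both validates that demo_func is traceable and
returns demo_func+4 rather than demo_func+0.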

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/powerpc/include/asm/livepatch.h |   10 ----------
 kernel/livepatch/patch.c             |   19 ++-----------------
 2 files changed, 2 insertions(+), 27 deletions(-)

--- a/arch/powerpc/include/asm/livepatch.h
+++ b/arch/powerpc/include/asm/livepatch.h
@@ -19,16 +19,6 @@ static inline void klp_arch_set_pc(struc
 	regs_set_return_ip(regs, ip);
 }
 
-#define klp_get_ftrace_location klp_get_ftrace_location
-static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
-{
-	/*
-	 * Live patch works only with -mprofile-kernel on PPC. In this case,
-	 * the ftrace location is always within the first 16 bytes.
-	 */
-	return ftrace_location_range(faddr, faddr + 16);
-}
-
 static inline void klp_init_thread_info(struct task_struct *p)
 {
 	/* + 1 to account for STACK_END_MAGIC */
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -124,19 +124,6 @@ static void notrace klp_ftrace_handler(u
 	ftrace_test_recursion_unlock(bit);
 }
 
-/*
- * Convert a function address into the appropriate ftrace location.
- *
- * Usually this is just the address of the function, but on some architectures
- * it's more complicated so allow them to provide a custom behaviour.
- */
-#ifndef klp_get_ftrace_location
-static unsigned long klp_get_ftrace_location(unsigned long faddr)
-{
-	return faddr;
-}
-#endif
-
 static void klp_unpatch_func(struct klp_func *func)
 {
 	struct klp_ops *ops;
@@ -153,8 +140,7 @@ static void klp_unpatch_func(struct klp_
 	if (list_is_singular(&ops->func_stack)) {
 		unsigned long ftrace_loc;
 
-		ftrace_loc =
-			klp_get_ftrace_location((unsigned long)func->old_func);
+		ftrace_loc = ftrace_location((unsigned long)func->old_func);
 		if (WARN_ON(!ftrace_loc))
 			return;
 
@@ -186,8 +172,7 @@ static int klp_patch_func(struct klp_fun
 	if (!ops) {
 		unsigned long ftrace_loc;
 
-		ftrace_loc =
-			klp_get_ftrace_location((unsigned long)func->old_func);
+		ftrace_loc = ftrace_location((unsigned long)func->old_func);
 		if (!ftrace_loc) {
 			pr_err("failed to find location for function '%s'\n",
 				func->old_name);




* [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (16 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 17/39] x86/livepatch: Validate " Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-04 17:51   ` Josh Poimboeuf
  2022-03-03 11:23 ` [PATCH v3 19/39] x86/ibt,kprobes: Cure sym+0 equals fentry woes Peter Zijlstra
                   ` (21 subsequent siblings)
  39 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

The return trampoline must not use an indirect branch to return; while
that would preserve the RSB, it is fundamentally incompatible with IBT.
Instead use a retpoline-like ROP gadget that defeats IBT while not
unbalancing the RSB.

And since ftrace_stub is no longer a plain RET, don't use it to copy
from. Since RET is a trivial instruction, poke it directly.
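
A standalone sketch of the gadget ('rop_ret_demo' is hypothetical; the
real thing lives in return_to_handler below). It "returns" to whatever
address is in %rdi without using an indirect JMP, which is the part IBT
would otherwise police:

	asm(
	"	.pushsection .text, \"ax\", @progbits\n"
	"rop_ret_demo:\n"
	"	call	1f\n"		/* pushes the address of the int3 */
	"	int3\n"			/* traps RSB-predicted speculation */
	"1:\n"
	"	mov	%rdi, (%rsp)\n"	/* overwrite return address with %rdi */
	"	ret\n"			/* plain RET; no ENDBR needed at target */
	"	.popsection\n"
	);

The call/ret pair stays balanced, so the RSB is undisturbed, and the
int3 harmlessly catches the mispredicted speculative return.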

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/ftrace.c    |    9 ++-------
 arch/x86/kernel/ftrace_64.S |   16 ++++++++++++----
 2 files changed, 14 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -316,12 +316,12 @@ create_trampoline(struct ftrace_ops *ops
 	unsigned long offset;
 	unsigned long npages;
 	unsigned long size;
-	unsigned long retq;
 	unsigned long *ptr;
 	void *trampoline;
 	void *ip;
 	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
 	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
+	unsigned const char retq[] = { RET_INSN_OPCODE, INT3_INSN_OPCODE };
 	union ftrace_op_code_union op_ptr;
 	int ret;
 
@@ -359,12 +359,7 @@ create_trampoline(struct ftrace_ops *ops
 		goto fail;
 
 	ip = trampoline + size;
-
-	/* The trampoline ends with ret(q) */
-	retq = (unsigned long)ftrace_stub;
-	ret = copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE);
-	if (WARN_ON(ret < 0))
-		goto fail;
+	memcpy(ip, retq, RET_SIZE);
 
 	/* No need to test direct calls on created trampolines */
 	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -176,10 +176,10 @@ SYM_FUNC_END(ftrace_caller);
 SYM_FUNC_START(ftrace_epilogue)
 /*
  * This is weak to keep gas from relaxing the jumps.
- * It is also used to copy the RET for trampolines.
  */
 SYM_INNER_LABEL_ALIGN(ftrace_stub, SYM_L_WEAK)
 	UNWIND_HINT_FUNC
+	ENDBR
 	RET
 SYM_FUNC_END(ftrace_epilogue)
 
@@ -284,6 +284,7 @@ SYM_FUNC_START(__fentry__)
 	jnz trace
 
 SYM_INNER_LABEL(ftrace_stub, SYM_L_GLOBAL)
+	ENDBR
 	RET
 
 trace:
@@ -307,7 +308,7 @@ EXPORT_SYMBOL(__fentry__)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 SYM_FUNC_START(return_to_handler)
-	subq  $24, %rsp
+	subq  $16, %rsp
 
 	/* Save the return values */
 	movq %rax, (%rsp)
@@ -319,7 +320,14 @@ SYM_FUNC_START(return_to_handler)
 	movq %rax, %rdi
 	movq 8(%rsp), %rdx
 	movq (%rsp), %rax
-	addq $24, %rsp
-	JMP_NOSPEC rdi
+
+	addq $16, %rsp
+	ANNOTATE_INTRA_FUNCTION_CALL
+	call .Ldo_rop
+	int3
+.Ldo_rop:
+	mov %rdi, (%rsp)
+	UNWIND_HINT_FUNC
+	RET
 SYM_FUNC_END(return_to_handler)
 #endif




* [PATCH v3 19/39] x86/ibt,kprobes: Cure sym+0 equals fentry woes
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (17 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-07  9:44   ` Masami Hiramatsu
  2022-03-03 11:23 ` [PATCH v3 20/39] x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline Peter Zijlstra
                   ` (20 subsequent siblings)
  39 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Rework _kprobe_addr() to only ever do two kallsyms calls and to require
only a single architecture/weak function -- although currently powerpc
still uses two.
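
A sketch of the resulting x86 semantics ('func' is hypothetical and
starts with a 4-byte ENDBR on an IBT build):

	extern void func(void);

	static void kprobe_addr_demo(void)
	{
		bool on_func_entry;
		kprobe_opcode_t *kp;

		kp = arch_adjust_kprobe_addr((unsigned long)func, 0, &on_func_entry);
		/* kp == (void *)func + 4, on_func_entry == true */

		kp = arch_adjust_kprobe_addr((unsigned long)func, 8, &on_func_entry);
		/* kp == (void *)func + 8, on_func_entry == false */
	}

That is, probes at sym+0 are transparently moved past the ENDBR while
still counting as function entry.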

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/powerpc/kernel/kprobes.c  |   34 +++++++++++++--------
 arch/x86/kernel/kprobes/core.c |   28 ++++++++++++++---
 include/linux/kprobes.h        |    3 +
 kernel/kprobes.c               |   66 ++++++++++++++++++++++++++++++++---------
 4 files changed, 98 insertions(+), 33 deletions(-)

--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -105,6 +105,27 @@ kprobe_opcode_t *kprobe_lookup_name(cons
 	return addr;
 }
 
+static bool arch_kprobe_on_func_entry(unsigned long offset)
+{
+#ifdef PPC64_ELF_ABI_v2
+#ifdef CONFIG_KPROBES_ON_FTRACE
+	return offset <= 16;
+#else
+	return offset <= 8;
+#endif
+#else
+	return !offset;
+#endif
+}
+
+/* XXX try and fold the magic of kprobe_lookup_name() in this */
+kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offset,
+					 bool *on_func_entry)
+{
+	*on_func_entry = arch_kprobe_on_func_entry(offset);
+	return (kprobe_opcode_t *)(addr + offset);
+}
+
 void *alloc_insn_page(void)
 {
 	void *page;
@@ -218,19 +239,6 @@ static nokprobe_inline void set_current_
 	kcb->kprobe_saved_msr = regs->msr;
 }
 
-bool arch_kprobe_on_func_entry(unsigned long offset)
-{
-#ifdef PPC64_ELF_ABI_v2
-#ifdef CONFIG_KPROBES_ON_FTRACE
-	return offset <= 16;
-#else
-	return offset <= 8;
-#endif
-#else
-	return !offset;
-#endif
-}
-
 void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
 {
 	ri->ret_addr = (kprobe_opcode_t *)regs->link;
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -52,6 +52,7 @@
 #include <asm/insn.h>
 #include <asm/debugreg.h>
 #include <asm/set_memory.h>
+#include <asm/ibt.h>
 
 #include "common.h"
 
@@ -197,13 +198,14 @@ __recover_probed_insn(kprobe_opcode_t *b
 
 	kp = get_kprobe((void *)addr);
 	faddr = ftrace_location(addr);
+
 	/*
-	 * Addresses inside the ftrace location are refused by
-	 * arch_check_ftrace_location(). Something went terribly wrong
-	 * if such an address is checked here.
+	 * In case addr maps to sym+0 ftrace_location() might return something
+	 * other than addr. In that case consider it the same as !faddr.
 	 */
-	if (WARN_ON(faddr && faddr != addr))
-		return 0UL;
+	if (faddr && faddr != addr)
+		faddr = 0;
+
 	/*
 	 * Use the current code if it is not modified by Kprobe
 	 * and it cannot be modified by ftrace.
@@ -301,6 +303,22 @@ static int can_probe(unsigned long paddr
 	return (addr == paddr);
 }
 
+/* If x86 supports IBT (ENDBR), the ENDBR at +0 must be skipped. */
+kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offset,
+					 bool *on_func_entry)
+{
+	if (is_endbr(*(u32 *)addr)) {
+		*on_func_entry = !offset || offset == 4;
+		if (*on_func_entry)
+			offset = 4;
+
+	} else {
+		*on_func_entry = !offset;
+	}
+
+	return (kprobe_opcode_t *)(addr + offset);
+}
+
 /*
  * Copy an instruction with recovering modified instruction by kprobes
  * and adjust the displacement if the instruction uses the %rip-relative
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -265,7 +265,6 @@ extern int arch_init_kprobes(void);
 extern void kprobes_inc_nmissed_count(struct kprobe *p);
 extern bool arch_within_kprobe_blacklist(unsigned long addr);
 extern int arch_populate_kprobe_blacklist(void);
-extern bool arch_kprobe_on_func_entry(unsigned long offset);
 extern int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
 
 extern bool within_kprobe_blacklist(unsigned long addr);
@@ -384,6 +383,8 @@ static inline struct kprobe_ctlblk *get_
 }
 
 kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset);
+kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offset, bool *on_func_entry);
+
 int register_kprobe(struct kprobe *p);
 void unregister_kprobe(struct kprobe *p);
 int register_kprobes(struct kprobe **kps, int num);
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1489,24 +1489,68 @@ bool within_kprobe_blacklist(unsigned lo
 }
 
 /*
+ * arch_adjust_kprobe_addr - adjust the address
+ * @addr: symbol base address
+ * @offset: offset within the symbol
+ * @on_func_entry: was this @addr+@offset on the function entry
+ *
+ * Typically returns @addr + @offset, except for special cases where the
+ * function might be prefixed by a CFI landing pad, in that case any offset
+ * inside the landing pad is mapped to the first 'real' instruction of the
+ * symbol.
+ *
+ * Specifically, for things like IBT/BTI, skip the resp. ENDBR/BTI.C
+ * instruction at +0.
+ */
+kprobe_opcode_t *__weak arch_adjust_kprobe_addr(unsigned long addr,
+						unsigned long offset,
+						bool *on_func_entry)
+{
+	*on_func_entry = !offset;
+	return (kprobe_opcode_t *)(addr + offset);
+}
+
+/*
  * If 'symbol_name' is specified, look it up and add the 'offset'
  * to it. This way, we can specify a relative address to a symbol.
  * This returns encoded errors if it fails to look up symbol or invalid
  * combination of parameters.
  */
-static kprobe_opcode_t *_kprobe_addr(kprobe_opcode_t *addr,
-			const char *symbol_name, unsigned int offset)
+static kprobe_opcode_t *
+_kprobe_addr(kprobe_opcode_t *addr, const char *symbol_name,
+	     unsigned long offset, bool *on_func_entry)
 {
 	if ((symbol_name && addr) || (!symbol_name && !addr))
 		goto invalid;
 
 	if (symbol_name) {
+		/*
+		 * Input: @sym + @offset
+		 * Output: @addr + @offset
+		 *
+		 * NOTE: kprobe_lookup_name() does *NOT* fold the offset
+		 *       argument into its output!
+		 */
 		addr = kprobe_lookup_name(symbol_name, offset);
 		if (!addr)
 			return ERR_PTR(-ENOENT);
 	}
 
-	addr = (kprobe_opcode_t *)(((char *)addr) + offset);
+	/*
+	 * So here we have @addr + @offset, displace it into a new
+	 * @addr' + @offset' where @addr' is the symbol start address.
+	 */
+	addr = (void *)addr + offset;
+	if (!kallsyms_lookup_size_offset((unsigned long)addr, NULL, &offset))
+		return ERR_PTR(-ENOENT);
+	addr = (void *)addr - offset;
+
+	/*
+	 * Then ask the architecture to re-combine them, taking care of
+	 * magical function entry details while telling us if this was indeed
+	 * at the start of the function.
+	 */
+	addr = arch_adjust_kprobe_addr((unsigned long)addr, offset, on_func_entry);
 	if (addr)
 		return addr;
 
@@ -1516,7 +1560,8 @@ static kprobe_opcode_t *_kprobe_addr(kpr
 
 static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
 {
-	return _kprobe_addr(p->addr, p->symbol_name, p->offset);
+	bool on_func_entry;
+	return _kprobe_addr(p->addr, p->symbol_name, p->offset, &on_func_entry);
 }
 
 /*
@@ -2047,11 +2092,6 @@ static int pre_handler_kretprobe(struct
 }
 NOKPROBE_SYMBOL(pre_handler_kretprobe);
 
-bool __weak arch_kprobe_on_func_entry(unsigned long offset)
-{
-	return !offset;
-}
-
 /**
  * kprobe_on_func_entry() -- check whether given address is function entry
  * @addr: Target address
@@ -2067,15 +2107,13 @@ bool __weak arch_kprobe_on_func_entry(un
  */
 int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
 {
-	kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset);
+	bool on_func_entry;
+	kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset, &on_func_entry);
 
 	if (IS_ERR(kp_addr))
 		return PTR_ERR(kp_addr);
 
-	if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset))
-		return -ENOENT;
-
-	if (!arch_kprobe_on_func_entry(offset))
+	if (!on_func_entry)
 		return -EINVAL;
 
 	return 0;




* [PATCH v3 20/39] x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (18 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 19/39] x86/ibt,kprobes: Cure sym+0 equals fentry woes Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 21/39] x86/ibt,ftrace: Add ENDBR to samples/ftrace Peter Zijlstra
                   ` (19 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

With IBT-enabled builds we need ENDBR instructions at indirect jump
target sites. Since we start execution of the JIT'ed code through an
indirect jump, the very first instruction needs to be an ENDBR.

Similarly, since eBPF tail-calls use indirect branches, their landing
site needs to be an ENDBR too.

The trampolines need similar adjustment.
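
A sketch of the resulting prologue layout (main program with a
reachable tail call; a hand-written stand-in for the JIT output, with
offsets assuming the 4-byte ENDBR):

	asm(
	"	.pushsection .text, \"ax\", @progbits\n"
	"bpf_prologue_demo:\n"			/* hypothetical */
	"	endbr64\n"			/*  +0: landing pad, indirect entry */
	"	.byte 0x0f,0x1f,0x44,0x00,0x00\n" /*  +4: 5-byte NOP, the patch site */
	"	xor	%eax, %eax\n"		/*  +9: tail_call_cnt = 0 */
	"	push	%rbp\n"			/* +11 */
	"	mov	%rsp, %rbp\n"		/* +12 */
	"	endbr64\n"			/* +15 == X86_TAIL_CALL_OFFSET */
	"	.popsection\n"
	);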

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/net/bpf_jit_comp.c |   16 ++++++++++++++--
 kernel/bpf/trampoline.c     |   20 ++++----------------
 2 files changed, 18 insertions(+), 18 deletions(-)

--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -46,6 +46,12 @@ static u8 *emit_code(u8 *ptr, u32 bytes,
 #define EMIT4_off32(b1, b2, b3, b4, off) \
 	do { EMIT4(b1, b2, b3, b4); EMIT(off, 4); } while (0)
 
+#ifdef CONFIG_X86_KERNEL_IBT
+#define EMIT_ENDBR()	EMIT(gen_endbr(), 4)
+#else
+#define EMIT_ENDBR()
+#endif
+
 static bool is_imm8(int value)
 {
 	return value <= 127 && value >= -128;
@@ -241,7 +247,7 @@ struct jit_context {
 /* Number of bytes emit_patch() needs to generate instructions */
 #define X86_PATCH_SIZE		5
 /* Number of bytes that will be skipped on tailcall */
-#define X86_TAIL_CALL_OFFSET	11
+#define X86_TAIL_CALL_OFFSET	(11 + ENDBR_INSN_SIZE)
 
 static void push_callee_regs(u8 **pprog, bool *callee_regs_used)
 {
@@ -286,6 +292,7 @@ static void emit_prologue(u8 **pprog, u3
 	/* BPF trampoline can be made to work without these nops,
 	 * but let's waste 5 bytes for now and optimize later
 	 */
+	EMIT_ENDBR();
 	memcpy(prog, x86_nops[5], X86_PATCH_SIZE);
 	prog += X86_PATCH_SIZE;
 	if (!ebpf_from_cbpf) {
@@ -296,6 +303,10 @@ static void emit_prologue(u8 **pprog, u3
 	}
 	EMIT1(0x55);             /* push rbp */
 	EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
+
+	/* X86_TAIL_CALL_OFFSET is here */
+	EMIT_ENDBR();
+
 	/* sub rsp, rounded_stack_depth */
 	if (stack_depth)
 		EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8));
@@ -2028,10 +2039,11 @@ int arch_prepare_bpf_trampoline(struct b
 		/* skip patched call instruction and point orig_call to actual
 		 * body of the kernel function.
 		 */
-		orig_call += X86_PATCH_SIZE;
+		orig_call += X86_PATCH_SIZE + ENDBR_INSN_SIZE;
 
 	prog = image;
 
+	EMIT_ENDBR();
 	EMIT1(0x55);		 /* push rbp */
 	EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
 	EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -117,18 +117,6 @@ static void bpf_trampoline_module_put(st
 	tr->mod = NULL;
 }
 
-static int is_ftrace_location(void *ip)
-{
-	long addr;
-
-	addr = ftrace_location((long)ip);
-	if (!addr)
-		return 0;
-	if (WARN_ON_ONCE(addr != (long)ip))
-		return -EFAULT;
-	return 1;
-}
-
 static int unregister_fentry(struct bpf_trampoline *tr, void *old_addr)
 {
 	void *ip = tr->func.addr;
@@ -160,12 +148,12 @@ static int modify_fentry(struct bpf_tram
 static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
 {
 	void *ip = tr->func.addr;
+	unsigned long faddr;
 	int ret;
 
-	ret = is_ftrace_location(ip);
-	if (ret < 0)
-		return ret;
-	tr->func.ftrace_managed = ret;
+	faddr = ftrace_location((unsigned long)ip);
+	if (faddr)
+		tr->func.ftrace_managed = true;
 
 	if (bpf_trampoline_module_get(tr))
 		return -ENOENT;




* [PATCH v3 21/39] x86/ibt,ftrace: Add ENDBR to samples/ftrace
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (19 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 20/39] x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling Peter Zijlstra
                   ` (18 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov


Add the required ENDBR instructions to the samples/ftrace direct-call
trampolines and convert the remaining bare 'ret's to ASM_RET.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 samples/ftrace/ftrace-direct-modify.c       |    5 +++++
 samples/ftrace/ftrace-direct-multi-modify.c |   10 +++++++---
 samples/ftrace/ftrace-direct-multi.c        |    5 ++++-
 samples/ftrace/ftrace-direct-too.c          |    3 +++
 samples/ftrace/ftrace-direct.c              |    3 +++
 5 files changed, 22 insertions(+), 4 deletions(-)

--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -24,20 +24,25 @@ static unsigned long my_ip = (unsigned l
 
 #ifdef CONFIG_X86_64
 
+#include <asm/ibt.h>
+
 asm (
 "	.pushsection    .text, \"ax\", @progbits\n"
 "	.type		my_tramp1, @function\n"
 "	.globl		my_tramp1\n"
 "   my_tramp1:"
+	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
 "	call my_direct_func1\n"
 "	leave\n"
 "	.size		my_tramp1, .-my_tramp1\n"
 	ASM_RET
+
 "	.type		my_tramp2, @function\n"
 "	.globl		my_tramp2\n"
 "   my_tramp2:"
+	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
 "	call my_direct_func2\n"
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -22,11 +22,14 @@ extern void my_tramp2(void *);
 
 #ifdef CONFIG_X86_64
 
+#include <asm/ibt.h>
+
 asm (
 "	.pushsection    .text, \"ax\", @progbits\n"
 "	.type		my_tramp1, @function\n"
 "	.globl		my_tramp1\n"
 "   my_tramp1:"
+	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
 "	pushq %rdi\n"
@@ -34,12 +37,13 @@ asm (
 "	call my_direct_func1\n"
 "	popq %rdi\n"
 "	leave\n"
-"	ret\n"
+	ASM_RET
 "	.size		my_tramp1, .-my_tramp1\n"
+
 "	.type		my_tramp2, @function\n"
-"\n"
 "	.globl		my_tramp2\n"
 "   my_tramp2:"
+	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
 "	pushq %rdi\n"
@@ -47,7 +51,7 @@ asm (
 "	call my_direct_func2\n"
 "	popq %rdi\n"
 "	leave\n"
-"	ret\n"
+	ASM_RET
 "	.size		my_tramp2, .-my_tramp2\n"
 "	.popsection\n"
 );
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -17,11 +17,14 @@ extern void my_tramp(void *);
 
 #ifdef CONFIG_X86_64
 
+#include <asm/ibt.h>
+
 asm (
 "	.pushsection    .text, \"ax\", @progbits\n"
 "	.type		my_tramp, @function\n"
 "	.globl		my_tramp\n"
 "   my_tramp:"
+	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
 "	pushq %rdi\n"
@@ -29,7 +32,7 @@ asm (
 "	call my_direct_func\n"
 "	popq %rdi\n"
 "	leave\n"
-"	ret\n"
+	ASM_RET
 "	.size		my_tramp, .-my_tramp\n"
 "	.popsection\n"
 );
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -19,11 +19,14 @@ extern void my_tramp(void *);
 
 #ifdef CONFIG_X86_64
 
+#include <asm/ibt.h>
+
 asm (
 "	.pushsection    .text, \"ax\", @progbits\n"
 "	.type		my_tramp, @function\n"
 "	.globl		my_tramp\n"
 "   my_tramp:"
+	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
 "	pushq %rdi\n"
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -16,11 +16,14 @@ extern void my_tramp(void *);
 
 #ifdef CONFIG_X86_64
 
+#include <asm/ibt.h>
+
 asm (
 "	.pushsection    .text, \"ax\", @progbits\n"
 "	.type		my_tramp, @function\n"
 "	.globl		my_tramp\n"
 "   my_tramp:"
+	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
 "	pushq %rdi\n"




* [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (20 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 21/39] x86/ibt,ftrace: Add ENDBR to samples/ftrace Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-04 17:57   ` Josh Poimboeuf
  2022-03-04 18:07   ` Josh Poimboeuf
  2022-03-03 11:23 ` [PATCH v3 23/39] x86/alternative: Simplify int3_selftest_ip Peter Zijlstra
                   ` (17 subsequent siblings)
  39 siblings, 2 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

The bits required to make the hardware go. Of note is that, provided
the syscall entry points are covered with ENDBR, #CP doesn't need to
be an IST exception because we'll never hit the syscall gap.
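
For reference, the #CP error codes (abbreviated from the SDM, with
illustrative names): kernel IBT only ever expects 3, hence the
error_code != 3 check in the handler below.

	enum cp_error_code {
		CP_NEAR_RET	= 1,	/* shadow stack: near RET */
		CP_FAR_RET_IRET	= 2,	/* shadow stack: far RET / IRET */
		CP_ENDBR	= 3,	/* missing ENDBR at branch target */
		CP_RSTORSSP	= 4,
		CP_SETSSP	= 5,
	};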

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/cpu.h                  |    4 +
 arch/x86/include/asm/cpufeatures.h          |    1 
 arch/x86/include/asm/idtentry.h             |    5 ++
 arch/x86/include/asm/msr-index.h            |   20 ++++++++-
 arch/x86/include/asm/traps.h                |    2 
 arch/x86/include/uapi/asm/processor-flags.h |    2 
 arch/x86/kernel/cpu/common.c                |   31 +++++++++++++-
 arch/x86/kernel/idt.c                       |    4 +
 arch/x86/kernel/machine_kexec_64.c          |    2 
 arch/x86/kernel/relocate_kernel_64.S        |    8 +++
 arch/x86/kernel/traps.c                     |   61 ++++++++++++++++++++++++++++
 11 files changed, 138 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -7,6 +7,7 @@
 #include <linux/topology.h>
 #include <linux/nodemask.h>
 #include <linux/percpu.h>
+#include <asm/ibt.h>
 
 #ifdef CONFIG_SMP
 
@@ -72,4 +73,7 @@ void init_ia32_feat_ctl(struct cpuinfo_x
 #else
 static inline void init_ia32_feat_ctl(struct cpuinfo_x86 *c) {}
 #endif
+
+extern __noendbr void cet_disable(void);
+
 #endif /* _ASM_X86_CPU_H */
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -387,6 +387,7 @@
 #define X86_FEATURE_TSXLDTRK		(18*32+16) /* TSX Suspend Load Address Tracking */
 #define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
 #define X86_FEATURE_ARCH_LBR		(18*32+19) /* Intel ARCH LBR */
+#define X86_FEATURE_IBT			(18*32+20) /* Indirect Branch Tracking */
 #define X86_FEATURE_AMX_BF16		(18*32+22) /* AMX bf16 Support */
 #define X86_FEATURE_AVX512_FP16		(18*32+23) /* AVX512 FP16 */
 #define X86_FEATURE_AMX_TILE		(18*32+24) /* AMX tile Support */
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -617,6 +617,11 @@ DECLARE_IDTENTRY_DF(X86_TRAP_DF,	exc_dou
 DECLARE_IDTENTRY_RAW_ERRORCODE(X86_TRAP_DF,	xenpv_exc_double_fault);
 #endif
 
+/* #CP */
+#ifdef CONFIG_X86_KERNEL_IBT
+DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP,	exc_control_protection);
+#endif
+
 /* #VC */
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 DECLARE_IDTENTRY_VC(X86_TRAP_VC,	exc_vmm_communication);
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -362,11 +362,29 @@
 #define MSR_ATOM_CORE_TURBO_RATIOS	0x0000066c
 #define MSR_ATOM_CORE_TURBO_VIDS	0x0000066d
 
-
 #define MSR_CORE_PERF_LIMIT_REASONS	0x00000690
 #define MSR_GFX_PERF_LIMIT_REASONS	0x000006B0
 #define MSR_RING_PERF_LIMIT_REASONS	0x000006B1
 
+/* Control-flow Enforcement Technology MSRs */
+#define MSR_IA32_U_CET			0x000006a0 /* user mode cet */
+#define MSR_IA32_S_CET			0x000006a2 /* kernel mode cet */
+#define CET_SHSTK_EN			BIT_ULL(0)
+#define CET_WRSS_EN			BIT_ULL(1)
+#define CET_ENDBR_EN			BIT_ULL(2)
+#define CET_LEG_IW_EN			BIT_ULL(3)
+#define CET_NO_TRACK_EN			BIT_ULL(4)
+#define CET_SUPPRESS_DISABLE		BIT_ULL(5)
+#define CET_RESERVED			(BIT_ULL(6) | BIT_ULL(7) | BIT_ULL(8) | BIT_ULL(9))
+#define CET_SUPPRESS			BIT_ULL(10)
+#define CET_WAIT_ENDBR			BIT_ULL(11)
+
+#define MSR_IA32_PL0_SSP		0x000006a4 /* ring-0 shadow stack pointer */
+#define MSR_IA32_PL1_SSP		0x000006a5 /* ring-1 shadow stack pointer */
+#define MSR_IA32_PL2_SSP		0x000006a6 /* ring-2 shadow stack pointer */
+#define MSR_IA32_PL3_SSP		0x000006a7 /* ring-3 shadow stack pointer */
+#define MSR_IA32_INT_SSP_TAB		0x000006a8 /* exception shadow stack table */
+
 /* Hardware P state interface */
 #define MSR_PPERF			0x0000064e
 #define MSR_PERF_LIMIT_REASONS		0x0000064f
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -18,6 +18,8 @@ void __init trap_init(void);
 asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
 #endif
 
+extern bool ibt_selftest(void);
+
 #ifdef CONFIG_X86_F00F_BUG
 /* For handling the FOOF bug */
 void handle_invalid_op(struct pt_regs *regs);
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -130,6 +130,8 @@
 #define X86_CR4_SMAP		_BITUL(X86_CR4_SMAP_BIT)
 #define X86_CR4_PKE_BIT		22 /* enable Protection Keys support */
 #define X86_CR4_PKE		_BITUL(X86_CR4_PKE_BIT)
+#define X86_CR4_CET_BIT		23 /* enable Control-flow Enforcement Technology */
+#define X86_CR4_CET		_BITUL(X86_CR4_CET_BIT)
 
 /*
  * x86-64 Task Priority Register, CR8
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -59,6 +59,7 @@
 #include <asm/cpu_device_id.h>
 #include <asm/uv/uv.h>
 #include <asm/sigframe.h>
+#include <asm/traps.h>
 
 #include "cpu.h"
 
@@ -438,7 +439,8 @@ static __always_inline void setup_umip(s
 
 /* These bits should not change their value after CPU init is finished. */
 static const unsigned long cr4_pinned_mask =
-	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP | X86_CR4_FSGSBASE;
+	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
+	X86_CR4_FSGSBASE | X86_CR4_CET;
 static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
 static unsigned long cr4_pinned_bits __ro_after_init;
 
@@ -592,6 +594,30 @@ static __init int setup_disable_pku(char
 __setup("nopku", setup_disable_pku);
 #endif /* CONFIG_X86_64 */
 
+static __always_inline void setup_cet(struct cpuinfo_x86 *c)
+{
+	u64 msr = CET_ENDBR_EN;
+
+	if (!HAS_KERNEL_IBT ||
+	    !cpu_feature_enabled(X86_FEATURE_IBT))
+		return;
+
+	wrmsrl(MSR_IA32_S_CET, msr);
+	cr4_set_bits(X86_CR4_CET);
+
+	if (!ibt_selftest()) {
+		pr_err("IBT selftest: Failed!\n");
+		setup_clear_cpu_cap(X86_FEATURE_IBT);
+		return;
+	}
+}
+
+__noendbr void cet_disable(void)
+{
+	if (cpu_feature_enabled(X86_FEATURE_IBT))
+		wrmsrl(MSR_IA32_S_CET, 0);
+}
+
 /*
  * Some CPU features depend on higher CPUID levels, which may not always
  * be available due to CPUID level capping or broken virtualization
@@ -1709,6 +1735,7 @@ static void identify_cpu(struct cpuinfo_
 
 	x86_init_rdrand(c);
 	setup_pku(c);
+	setup_cet(c);
 
 	/*
 	 * Clear/Set all flags overridden by options, need do it
@@ -1777,6 +1804,8 @@ void enable_sep_cpu(void)
 void __init identify_boot_cpu(void)
 {
 	identify_cpu(&boot_cpu_data);
+	if (HAS_KERNEL_IBT && cpu_feature_enabled(X86_FEATURE_IBT))
+		pr_info("CET detected: Indirect Branch Tracking enabled\n");
 #ifdef CONFIG_X86_32
 	sysenter_setup();
 	enable_sep_cpu();
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -104,6 +104,10 @@ static const __initconst struct idt_data
 	ISTG(X86_TRAP_MC,		asm_exc_machine_check, IST_INDEX_MCE),
 #endif
 
+#ifdef CONFIG_X86_KERNEL_IBT
+	INTG(X86_TRAP_CP,		asm_exc_control_protection),
+#endif
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 	ISTG(X86_TRAP_VC,		asm_exc_vmm_communication, IST_INDEX_VC),
 #endif
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -27,6 +27,7 @@
 #include <asm/kexec-bzimage64.h>
 #include <asm/setup.h>
 #include <asm/set_memory.h>
+#include <asm/cpu.h>
 
 #ifdef CONFIG_ACPI
 /*
@@ -310,6 +311,7 @@ void machine_kexec(struct kimage *image)
 	/* Interrupts aren't acceptable while we reboot */
 	local_irq_disable();
 	hw_breakpoint_disable();
+	cet_disable();
 
 	if (image->preserve_context) {
 #ifdef CONFIG_X86_IO_APIC
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -115,6 +115,14 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_ma
 	pushq   %rdx
 
 	/*
+	 * Clear X86_CR4_CET (if it was set) such that we can clear CR0_WP
+	 * below.
+	 */
+	movq	%cr4, %rax
+	andq	$~(X86_CR4_CET), %rax
+	movq	%rax, %cr4
+
+	/*
 	 * Set cr0 to a known state:
 	 *  - Paging enabled
 	 *  - Alignment check disabled
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -210,6 +210,67 @@ DEFINE_IDTENTRY(exc_overflow)
 	do_error_trap(regs, 0, "overflow", X86_TRAP_OF, SIGSEGV, 0, NULL);
 }
 
+#ifdef CONFIG_X86_KERNEL_IBT
+
+static __ro_after_init bool ibt_fatal = true;
+
+void ibt_selftest_ip(void); /* code label defined in asm below */
+
+DEFINE_IDTENTRY_ERRORCODE(exc_control_protection)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_IBT)) {
+		pr_err("Unexpected #CP\n");
+		BUG();
+	}
+
+	if (WARN_ON_ONCE(user_mode(regs) || error_code != 3))
+		return;
+
+	if (unlikely(regs->ip == (unsigned long)ibt_selftest_ip)) {
+		regs->ax = 0;
+		return;
+	}
+
+	pr_err("Missing ENDBR: %pS\n", (void *)instruction_pointer(regs));
+	if (!ibt_fatal) {
+		printk(KERN_DEFAULT CUT_HERE);
+		__warn(__FILE__, __LINE__, (void *)regs->ip, TAINT_WARN, regs, NULL);
+		return;
+	}
+	BUG();
+}
+
+bool ibt_selftest(void)
+{
+	unsigned long ret;
+
+	asm ("	lea ibt_selftest_ip(%%rip), %%rax\n\t"
+	     ANNOTATE_RETPOLINE_SAFE
+	     "	jmp *%%rax\n\t"
+	     ASM_REACHABLE
+	     ANNOTATE_NOENDBR
+	     "ibt_selftest_ip: nop\n\t"
+
+	     : "=a" (ret) : : "memory");
+
+	return !ret;
+}
+
+static int __init ibt_setup(char *str)
+{
+	if (!strcmp(str, "off"))
+		setup_clear_cpu_cap(X86_FEATURE_IBT);
+
+	if (!strcmp(str, "warn"))
+		ibt_fatal = false;
+
+	return 1;
+}
+
+__setup("ibt=", ibt_setup);
+
+#endif /* CONFIG_X86_KERNEL_IBT */
+
 #ifdef CONFIG_X86_F00F_BUG
 void handle_invalid_op(struct pt_regs *regs)
 #else




* [PATCH v3 23/39] x86/alternative: Simplify int3_selftest_ip
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (21 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-04 18:02   ` Josh Poimboeuf
  2022-03-03 11:23 ` [PATCH v3 24/39] x86/ibt: Disable IBT around firmware Peter Zijlstra
                   ` (16 subsequent siblings)
  39 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Apply the same pattern as ibt_selftest_ip: make int3_selftest_ip a
code label inside the asm itself instead of an .init.data variable
holding its address.
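
A sketch of the pattern ('demo_ip' is hypothetical): the IP is a plain
code label declared as a function, so no .init.data variable and no
extra load are needed to compare against it.

	extern void demo_ip(void);

	static bool trapped_at_demo_ip(unsigned long ip)
	{
		return ip == (unsigned long)&demo_ip;
	}

	asm(".pushsection .text, \"ax\", @progbits\n"
	    ".globl demo_ip\n"
	    "demo_ip: int3; nop; nop; nop; nop\n"
	    ".popsection\n");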

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/alternative.c |   13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -719,7 +719,7 @@ asm (
 "	.popsection\n"
 );
 
-extern __initdata unsigned long int3_selftest_ip; /* defined in asm below */
+extern void int3_selftest_ip(void); /* defined in asm below */
 
 static int __init
 int3_exception_notify(struct notifier_block *self, unsigned long val, void *data)
@@ -733,7 +733,7 @@ int3_exception_notify(struct notifier_bl
 	if (val != DIE_INT3)
 		return NOTIFY_DONE;
 
-	if (regs->ip - INT3_INSN_SIZE != int3_selftest_ip)
+	if (regs->ip - INT3_INSN_SIZE != (unsigned long)&int3_selftest_ip)
 		return NOTIFY_DONE;
 
 	int3_emulate_call(regs, (unsigned long)&int3_magic);
@@ -757,14 +757,7 @@ static void __init int3_selftest(void)
 	 * then trigger the INT3, padded with NOPs to match a CALL instruction
 	 * length.
 	 */
-	asm volatile ("1: int3; nop; nop; nop; nop\n\t"
-		      ".pushsection .init.data,\"aw\"\n\t"
-		      ".align " __ASM_SEL(4, 8) "\n\t"
-		      ".type int3_selftest_ip, @object\n\t"
-		      ".size int3_selftest_ip, " __ASM_SEL(4, 8) "\n\t"
-		      "int3_selftest_ip:\n\t"
-		      __ASM_SEL(.long, .quad) " 1b\n\t"
-		      ".popsection\n\t"
+	asm volatile ("int3_selftest_ip: int3; nop; nop; nop; nop\n\t"
 		      : ASM_CALL_CONSTRAINT
 		      : __ASM_SEL_RAW(a, D) (&val)
 		      : "memory");




* [PATCH v3 24/39] x86/ibt: Disable IBT around firmware
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (22 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 23/39] x86/alternative: Simplify int3_selftest_ip Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 25/39] x86/bugs: Disable Retpoline when IBT Peter Zijlstra
                   ` (15 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Assume firmware isn't IBT-clean and disable IBT across firmware calls.
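
A sketch of the resulting pattern, as used by the EFI and APM paths
below ('firmware_service' is hypothetical):

	extern void firmware_service(void);

	static void firmware_call_demo(void)
	{
		u64 ibt = ibt_save();	/* clear CET_ENDBR_EN, return old state */

		firmware_service();	/* may run non-IBT-clean code */

		ibt_restore(ibt);	/* put CET_ENDBR_EN back as it was */
	}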

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/efi.h   |    9 +++++++--
 arch/x86/include/asm/ibt.h   |    6 ++++++
 arch/x86/kernel/apm_32.c     |    7 +++++++
 arch/x86/kernel/cpu/common.c |   28 ++++++++++++++++++++++++++++
 4 files changed, 48 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -7,6 +7,7 @@
 #include <asm/tlb.h>
 #include <asm/nospec-branch.h>
 #include <asm/mmu_context.h>
+#include <asm/ibt.h>
 #include <linux/build_bug.h>
 #include <linux/kernel.h>
 #include <linux/pgtable.h>
@@ -120,8 +121,12 @@ extern asmlinkage u64 __efi_call(void *f
 	efi_enter_mm();							\
 })
 
-#define arch_efi_call_virt(p, f, args...)				\
-	efi_call((void *)p->f, args)					\
+#define arch_efi_call_virt(p, f, args...) ({				\
+	u64 ret, ibt = ibt_save();					\
+	ret = efi_call((void *)p->f, args);				\
+	ibt_restore(ibt);						\
+	ret;								\
+})
 
 #define arch_efi_call_virt_teardown()					\
 ({									\
--- a/arch/x86/include/asm/ibt.h
+++ b/arch/x86/include/asm/ibt.h
@@ -39,6 +39,9 @@ static inline bool is_endbr(u32 val)
 	return val == gen_endbr();
 }
 
+extern __noendbr u64 ibt_save(void);
+extern __noendbr void ibt_restore(u64 save);
+
 #else /* __ASSEMBLY__ */
 
 #ifdef CONFIG_X86_64
@@ -61,6 +64,9 @@ static inline bool is_endbr(u32 val)
 
 static inline bool is_endbr(u32 val) { return false; }
 
+static inline u64 ibt_save(void) { return 0; }
+static inline void ibt_restore(u64 save) { }
+
 #else /* __ASSEMBLY__ */
 
 #define ENDBR
--- a/arch/x86/kernel/apm_32.c
+++ b/arch/x86/kernel/apm_32.c
@@ -232,6 +232,7 @@
 #include <asm/paravirt.h>
 #include <asm/reboot.h>
 #include <asm/nospec-branch.h>
+#include <asm/ibt.h>
 
 #if defined(CONFIG_APM_DISPLAY_BLANK) && defined(CONFIG_VT)
 extern int (*console_blank_hook)(int);
@@ -598,6 +599,7 @@ static long __apm_bios_call(void *_call)
 	struct desc_struct	save_desc_40;
 	struct desc_struct	*gdt;
 	struct apm_bios_call	*call = _call;
+	u64			ibt;
 
 	cpu = get_cpu();
 	BUG_ON(cpu != 0);
@@ -607,11 +609,13 @@ static long __apm_bios_call(void *_call)
 
 	apm_irq_save(flags);
 	firmware_restrict_branch_speculation_start();
+	ibt = ibt_save();
 	APM_DO_SAVE_SEGS;
 	apm_bios_call_asm(call->func, call->ebx, call->ecx,
 			  &call->eax, &call->ebx, &call->ecx, &call->edx,
 			  &call->esi);
 	APM_DO_RESTORE_SEGS;
+	ibt_restore(ibt);
 	firmware_restrict_branch_speculation_end();
 	apm_irq_restore(flags);
 	gdt[0x40 / 8] = save_desc_40;
@@ -676,6 +680,7 @@ static long __apm_bios_call_simple(void
 	struct desc_struct	save_desc_40;
 	struct desc_struct	*gdt;
 	struct apm_bios_call	*call = _call;
+	u64			ibt;
 
 	cpu = get_cpu();
 	BUG_ON(cpu != 0);
@@ -685,10 +690,12 @@ static long __apm_bios_call_simple(void
 
 	apm_irq_save(flags);
 	firmware_restrict_branch_speculation_start();
+	ibt = ibt_save();
 	APM_DO_SAVE_SEGS;
 	error = apm_bios_call_simple_asm(call->func, call->ebx, call->ecx,
 					 &call->eax);
 	APM_DO_RESTORE_SEGS;
+	ibt_restore(ibt);
 	firmware_restrict_branch_speculation_end();
 	apm_irq_restore(flags);
 	gdt[0x40 / 8] = save_desc_40;
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -594,6 +594,34 @@ static __init int setup_disable_pku(char
 __setup("nopku", setup_disable_pku);
 #endif /* CONFIG_X86_64 */
 
+#ifdef CONFIG_X86_KERNEL_IBT
+
+__noendbr u64 ibt_save(void)
+{
+	u64 msr = 0;
+
+	if (cpu_feature_enabled(X86_FEATURE_IBT)) {
+		rdmsrl(MSR_IA32_S_CET, msr);
+		wrmsrl(MSR_IA32_S_CET, msr & ~CET_ENDBR_EN);
+	}
+
+	return msr;
+}
+
+__noendbr void ibt_restore(u64 save)
+{
+	u64 msr;
+
+	if (cpu_feature_enabled(X86_FEATURE_IBT)) {
+		rdmsrl(MSR_IA32_S_CET, msr);
+		msr &= ~CET_ENDBR_EN;
+		msr |= (save & CET_ENDBR_EN);
+		wrmsrl(MSR_IA32_S_CET, msr);
+	}
+}
+
+#endif
+
 static __always_inline void setup_cet(struct cpuinfo_x86 *c)
 {
 	u64 msr = CET_ENDBR_EN;




* [PATCH v3 25/39] x86/bugs: Disable Retpoline when IBT
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (23 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 24/39] x86/ibt: Disable IBT around firmware Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-04 17:39   ` Josh Poimboeuf
  2022-03-03 11:23 ` [PATCH v3 26/39] x86/ibt: Annotate text references Peter Zijlstra
                   ` (14 subsequent siblings)
  39 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Retpoline and IBT are mutually exclusive. IBT relies on indirect
branches (JMP/CALL *%reg) while retpoline avoids them by design.

Demote to the LFENCE-based retpoline on IBT-enabled hardware.
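
The demoted thunk keeps the (IBT-trackable) indirect branch and fences
speculation instead; a sketch of its shape ('lfence_thunk_rax_demo' is
hypothetical, the real thunks live in arch/x86/lib/retpoline.S):

	asm(
	"	.pushsection .text, \"ax\", @progbits\n"
	"lfence_thunk_rax_demo:\n"
	"	lfence\n"		/* no speculative dispatch past this */
	"	jmp	*%rax\n"	/* real indirect JMP, lands on ENDBR */
	"	.popsection\n"
	);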

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/bugs.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -891,6 +891,7 @@ static void __init spectre_v2_select_mit
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
 	enum spectre_v2_mitigation mode = SPECTRE_V2_NONE;
+	bool silent_demote = false;
 
 	/*
 	 * If the CPU is not affected and the command line mode is NONE or AUTO
@@ -906,6 +907,7 @@ static void __init spectre_v2_select_mit
 
 	case SPECTRE_V2_CMD_FORCE:
 	case SPECTRE_V2_CMD_AUTO:
+		silent_demote = true;
 		if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
 			mode = SPECTRE_V2_IBRS_ENHANCED;
 			/* Force it so VMEXIT will restore correctly */
@@ -938,6 +940,7 @@ static void __init spectre_v2_select_mit
 	retpoline_amd:
 		if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) {
 			pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n");
+			silent_demote = false;
 			goto retpoline_generic;
 		}
 		mode = SPECTRE_V2_RETPOLINE_AMD;
@@ -947,6 +950,16 @@ static void __init spectre_v2_select_mit
 	retpoline_generic:
 		mode = SPECTRE_V2_RETPOLINE_GENERIC;
 		setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
+
+		/*
+		 * ROP defeats IBT, make sure not to use Retpolines and IBT together.
+		 */
+		if (HAS_KERNEL_IBT && cpu_feature_enabled(X86_FEATURE_IBT)) {
+			if (!silent_demote)
+				pr_warn("Spectre mitigation: Switching to LFENCE due to IBT\n");
+			mode = SPECTRE_V2_RETPOLINE_AMD;
+			setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD);
+		}
 	}
 
 specv2_set_mode:




* [PATCH v3 26/39] x86/ibt: Annotate text references
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (24 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 25/39] x86/bugs: Disable Retpoline when IBT Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 27/39] x86/ibt,ftrace: Annotate ftrace code patching Peter Zijlstra
                   ` (13 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Annotate away some of the generic code references. These are places
where we take the address of a symbol for exception handling or for
return addresses (e.g. context switch).
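
A sketch of such an annotation ('resume_demo' is hypothetical; assumes
linux/objtool.h is included): the label's address is taken, but it is
only ever reached by direct control flow, so it gets annotated instead
of an ENDBR.

	asm(
	"	.pushsection .text, \"ax\", @progbits\n"
	"resume_demo:\n"
		ANNOTATE_NOENDBR	/* not an indirect branch target */
	"	nop\n"
	"	.popsection\n"
	);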

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_64.S            |    6 ++++++
 arch/x86/entry/entry_64_compat.S     |    1 +
 arch/x86/kernel/alternative.c        |    9 +++++++--
 arch/x86/kernel/head_64.S            |    4 ++++
 arch/x86/kernel/kprobes/core.c       |    1 +
 arch/x86/kernel/relocate_kernel_64.S |    2 ++
 arch/x86/lib/error-inject.c          |    2 ++
 arch/x86/lib/retpoline.S             |    1 +
 8 files changed, 24 insertions(+), 2 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -277,6 +277,7 @@ SYM_FUNC_END(__switch_to_asm)
 .pushsection .text, "ax"
 SYM_CODE_START(ret_from_fork)
 	UNWIND_HINT_EMPTY
+	ANNOTATE_NOENDBR // copy_thread
 	movq	%rax, %rdi
 	call	schedule_tail			/* rdi: 'prev' task parameter */
 
@@ -569,6 +570,7 @@ SYM_CODE_END(\asmsym)
 	.align 16
 	.globl __irqentry_text_end
 __irqentry_text_end:
+	ANNOTATE_NOENDBR
 
 SYM_CODE_START_LOCAL(common_interrupt_return)
 SYM_INNER_LABEL(swapgs_restore_regs_and_return_to_usermode, SYM_L_GLOBAL)
@@ -650,6 +652,7 @@ SYM_INNER_LABEL(early_xen_iret_patch, SY
 #endif
 
 SYM_INNER_LABEL(native_irq_return_iret, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR // exc_double_fault
 	/*
 	 * This may fault.  Non-paranoid faults on return to userspace are
 	 * handled by fixup_bad_iret.  These include #SS, #GP, and #NP.
@@ -744,6 +747,7 @@ SYM_FUNC_START(asm_load_gs_index)
 	FRAME_BEGIN
 	swapgs
 .Lgs_change:
+	ANNOTATE_NOENDBR // error_entry
 	movl	%edi, %gs
 2:	ALTERNATIVE "", "mfence", X86_BUG_SWAPGS_FENCE
 	swapgs
@@ -1322,6 +1326,7 @@ SYM_CODE_START(asm_exc_nmi)
 #endif
 
 repeat_nmi:
+	ANNOTATE_NOENDBR // this code
 	/*
 	 * If there was a nested NMI, the first NMI's iret will return
 	 * here. But NMIs are still enabled and we can take another
@@ -1350,6 +1355,7 @@ SYM_CODE_START(asm_exc_nmi)
 	.endr
 	subq	$(5*8), %rsp
 end_repeat_nmi:
+	ANNOTATE_NOENDBR // this code
 
 	/*
 	 * Everything below this point can be preempted by a nested NMI.
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -148,6 +148,7 @@ SYM_INNER_LABEL(entry_SYSENTER_compat_af
 	popfq
 	jmp	.Lsysenter_flags_fixed
 SYM_INNER_LABEL(__end_entry_SYSENTER_compat, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR // is_sysenter_singlestep
 SYM_CODE_END(entry_SYSENTER_compat)
 
 /*
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -713,6 +713,7 @@ asm (
 "	.pushsection	.init.text, \"ax\", @progbits\n"
 "	.type		int3_magic, @function\n"
 "int3_magic:\n"
+	ANNOTATE_NOENDBR
 "	movl	$1, (%" _ASM_ARG1 ")\n"
 	ASM_RET
 "	.size		int3_magic, .-int3_magic\n"
@@ -724,16 +725,19 @@ extern void int3_selftest_ip(void); /* d
 static int __init
 int3_exception_notify(struct notifier_block *self, unsigned long val, void *data)
 {
+	unsigned long selftest = (unsigned long)&int3_selftest_ip;
 	struct die_args *args = data;
 	struct pt_regs *regs = args->regs;
 
+	OPTIMIZER_HIDE_VAR(selftest);
+
 	if (!regs || user_mode(regs))
 		return NOTIFY_DONE;
 
 	if (val != DIE_INT3)
 		return NOTIFY_DONE;
 
-	if (regs->ip - INT3_INSN_SIZE != (unsigned long)&int3_selftest_ip)
+	if (regs->ip - INT3_INSN_SIZE != selftest)
 		return NOTIFY_DONE;
 
 	int3_emulate_call(regs, (unsigned long)&int3_magic);
@@ -757,7 +761,8 @@ static void __init int3_selftest(void)
 	 * then trigger the INT3, padded with NOPs to match a CALL instruction
 	 * length.
 	 */
-	asm volatile ("int3_selftest_ip: int3; nop; nop; nop; nop\n\t"
+	asm volatile (ANNOTATE_NOENDBR
+		      "int3_selftest_ip: int3; nop; nop; nop; nop\n\t"
 		      : ASM_CALL_CONSTRAINT
 		      : __ASM_SEL_RAW(a, D) (&val)
 		      : "memory");
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -99,6 +99,7 @@ SYM_CODE_END(startup_64)
 
 SYM_CODE_START(secondary_startup_64)
 	UNWIND_HINT_EMPTY
+	ANNOTATE_NOENDBR
 	/*
 	 * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0,
 	 * and someone has loaded a mapped page table.
@@ -127,6 +128,7 @@ SYM_CODE_START(secondary_startup_64)
 	 */
 SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	UNWIND_HINT_EMPTY
+	ANNOTATE_NOENDBR
 
 	/*
 	 * Retrieve the modifier (SME encryption mask if SME is active) to be
@@ -192,6 +194,7 @@ SYM_INNER_LABEL(secondary_startup_64_no_
 	jmp	*%rax
 1:
 	UNWIND_HINT_EMPTY
+	ANNOTATE_NOENDBR // above
 
 	/*
 	 * We must switch to a new descriptor in kernel space for the GDT
@@ -299,6 +302,7 @@ SYM_INNER_LABEL(secondary_startup_64_no_
 	pushq	%rax		# target address in negative space
 	lretq
 .Lafter_lret:
+	ANNOTATE_NOENDBR
 SYM_CODE_END(secondary_startup_64)
 
 #include "verify_cpu.S"
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1040,6 +1040,7 @@ asm(
 	".type __kretprobe_trampoline, @function\n"
 	"__kretprobe_trampoline:\n"
 #ifdef CONFIG_X86_64
+	ANNOTATE_NOENDBR
 	/* Push a fake return address to tell the unwinder it's a kretprobe. */
 	"	pushq $__kretprobe_trampoline\n"
 	UNWIND_HINT_FUNC
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -42,6 +42,7 @@
 	.code64
 SYM_CODE_START_NOALIGN(relocate_kernel)
 	UNWIND_HINT_EMPTY
+	ANNOTATE_NOENDBR
 	/*
 	 * %rdi indirection_page
 	 * %rsi page_list
@@ -223,6 +224,7 @@ SYM_CODE_END(identity_mapped)
 
 SYM_CODE_START_LOCAL_NOALIGN(virtual_mapped)
 	UNWIND_HINT_EMPTY
+	ANNOTATE_NOENDBR // RET target, above
 	movq	RSP(%r8), %rsp
 	movq	CR4(%r8), %rax
 	movq	%rax, %cr4
--- a/arch/x86/lib/error-inject.c
+++ b/arch/x86/lib/error-inject.c
@@ -3,6 +3,7 @@
 #include <linux/linkage.h>
 #include <linux/error-injection.h>
 #include <linux/kprobes.h>
+#include <linux/objtool.h>
 
 asmlinkage void just_return_func(void);
 
@@ -11,6 +12,7 @@ asm(
 	".type just_return_func, @function\n"
 	".globl just_return_func\n"
 	"just_return_func:\n"
+		ANNOTATE_NOENDBR
 		ASM_RET
 	".size just_return_func, .-just_return_func\n"
 );
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -55,6 +55,7 @@ SYM_INNER_LABEL(__x86_indirect_thunk_\re
 
 	.align RETPOLINE_THUNK_SIZE
 SYM_CODE_START(__x86_indirect_thunk_array)
+	ANNOTATE_NOENDBR // apply_retpolines
 
 #define GEN(reg) THUNK reg
 #include <asm/GEN-for-each-reg.h>




* [PATCH v3 27/39] x86/ibt,ftrace: Annotate ftrace code patching
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (25 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 26/39] x86/ibt: Annotate text references Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 28/39] x86/ibt,sev: Annotations Peter Zijlstra
                   ` (12 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

These are code patching sites, not indirect targets.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/ftrace_64.S |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -145,6 +145,7 @@ SYM_FUNC_START(ftrace_caller)
 	movq %rcx, RSP(%rsp)
 
 SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 	/* Load the ftrace_ops into the 3rd parameter */
 	movq function_trace_op(%rip), %rdx
 
@@ -155,6 +156,7 @@ SYM_INNER_LABEL(ftrace_caller_op_ptr, SY
 	movq $0, CS(%rsp)
 
 SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 	call ftrace_stub
 
 	/* Handlers can change the RIP */
@@ -169,6 +171,7 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBA
 	 * layout here.
 	 */
 SYM_INNER_LABEL(ftrace_caller_end, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 
 	jmp ftrace_epilogue
 SYM_FUNC_END(ftrace_caller);
@@ -192,6 +195,7 @@ SYM_FUNC_START(ftrace_regs_caller)
 	/* save_mcount_regs fills in first two parameters */
 
 SYM_INNER_LABEL(ftrace_regs_caller_op_ptr, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 	/* Load the ftrace_ops into the 3rd parameter */
 	movq function_trace_op(%rip), %rdx
 
@@ -221,6 +225,7 @@ SYM_INNER_LABEL(ftrace_regs_caller_op_pt
 	leaq (%rsp), %rcx
 
 SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 	call ftrace_stub
 
 	/* Copy flags back to SS, to restore them */
@@ -248,6 +253,7 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_
 	 */
 	testq	%rax, %rax
 SYM_INNER_LABEL(ftrace_regs_caller_jmp, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 	jnz	1f
 
 	restore_mcount_regs
@@ -261,6 +267,7 @@ SYM_INNER_LABEL(ftrace_regs_caller_jmp,
 	 * to the return.
 	 */
 SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 	jmp ftrace_epilogue
 
 	/* Swap the flags with orig_rax */




* [PATCH v3 28/39] x86/ibt,sev: Annotations
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (26 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 27/39] x86/ibt,ftrace: Annotate ftrace code patching Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 29/39] x86/ibt: Dont generate ENDBR in .discard.text Peter Zijlstra
                   ` (11 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

No IBT on AMD so far... probably correct, who knows.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_64.S        |    1 +
 arch/x86/entry/entry_64_compat.S |    1 +
 arch/x86/kernel/head_64.S        |    2 ++
 3 files changed, 4 insertions(+)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -95,6 +95,7 @@ SYM_CODE_START(entry_SYSCALL_64)
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
 
 SYM_INNER_LABEL(entry_SYSCALL_64_safe_stack, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 
 	/* Construct struct pt_regs on stack */
 	pushq	$__USER_DS				/* pt_regs->ss */
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -214,6 +214,7 @@ SYM_CODE_START(entry_SYSCALL_compat)
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
 
 SYM_INNER_LABEL(entry_SYSCALL_compat_safe_stack, SYM_L_GLOBAL)
+	ANNOTATE_NOENDBR
 
 	/* Construct struct pt_regs on stack */
 	pushq	$__USER32_DS		/* pt_regs->ss */
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -332,6 +332,7 @@ SYM_CODE_END(start_cpu0)
  */
 SYM_CODE_START_NOALIGN(vc_boot_ghcb)
 	UNWIND_HINT_IRET_REGS offset=8
+	ENDBR
 
 	/* Build pt_regs */
 	PUSH_AND_CLEAR_REGS
@@ -439,6 +440,7 @@ SYM_CODE_END(early_idt_handler_common)
  */
 SYM_CODE_START_NOALIGN(vc_no_ghcb)
 	UNWIND_HINT_IRET_REGS offset=8
+	ENDBR
 
 	/* Build pt_regs */
 	PUSH_AND_CLEAR_REGS




* [PATCH v3 29/39] x86/ibt: Dont generate ENDBR in .discard.text
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (27 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 28/39] x86/ibt,sev: Annotations Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 30/39] x86/ibt: Ensure module init/exit points have references Peter Zijlstra
                   ` (10 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Having ENDBR in discarded sections can easily lead to relocations
into discarded sections, which linkers aren't really fond of. Objtool
also shouldn't generate them, but why tempt fate.
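
A sketch of the effect ('discard_stub' is hypothetical): __noendbr
suppresses the ENDBR the compiler would otherwise emit at function
entry, so no ENDBR lands in the discarded section in the first place.

	static void __section(".discard.text") __noendbr __used notrace
	discard_stub(void)
	{
		/* never an indirect branch target */
	}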

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/setup.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -8,6 +8,7 @@
 
 #include <linux/linkage.h>
 #include <asm/page_types.h>
+#include <asm/ibt.h>
 
 #ifdef __i386__
 
@@ -119,7 +120,7 @@ void *extend_brk(size_t size, size_t ali
  * executable.)
  */
 #define RESERVE_BRK(name,sz)						\
-	static void __section(".discard.text") __used notrace		\
+	static void __section(".discard.text") __noendbr __used notrace	\
 	__brk_reservation_fn_##name##__(void) {				\
 		asm volatile (						\
 			".pushsection .brk_reservation,\"aw\",@nobits;" \




* [PATCH v3 30/39] x86/ibt: Ensure module init/exit points have references
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (28 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 29/39] x86/ibt: Dont generate ENDBR in .discard.text Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 31/39] objtool: Rename --duplicate to --lto Peter Zijlstra
                   ` (9 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Since the module init/exit points have only external references, an
LTO-like objtool run on the module will consider them 'unused' and
seal them, leading to an immediate failure on module load.
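
Concretely, for an (assumed) module init function foo_init, the hunk
below makes __CFI_ADDRESSABLE(foo_init, __initdata) expand to a
visible alias that keeps the symbol referenced:

	/* expansion sketch; foo_init is a made-up example function */
	const void *__cfi_jt_foo_init __visible __initdata = (void *)&foo_init;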

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/cfi.h |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/include/linux/cfi.h
+++ b/include/linux/cfi.h
@@ -34,8 +34,17 @@ static inline void cfi_module_remove(str
 
 #else /* !CONFIG_CFI_CLANG */
 
-#define __CFI_ADDRESSABLE(fn, __attr)
+#ifdef CONFIG_X86_KERNEL_IBT
+
+#define __CFI_ADDRESSABLE(fn, __attr) \
+	const void *__cfi_jt_ ## fn __visible __attr = (void *)&fn
+
+#endif /* CONFIG_X86_KERNEL_IBT */
 
 #endif /* CONFIG_CFI_CLANG */
 
+#ifndef __CFI_ADDRESSABLE
+#define __CFI_ADDRESSABLE(fn, __attr)
+#endif
+
 #endif /* _LINUX_CFI_H */




* [PATCH v3 31/39] objtool: Rename --duplicate to --lto
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (29 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 30/39] x86/ibt: Ensure module init/exit points have references Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 32/39] Kbuild: Allow whole module objtool runs Peter Zijlstra
                   ` (8 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

In order to prepare for LTO-like objtool runs for modules, rename the
--duplicate argument to --lto.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 scripts/link-vmlinux.sh                 |    2 +-
 tools/objtool/builtin-check.c           |    4 ++--
 tools/objtool/check.c                   |    7 ++++++-
 tools/objtool/include/objtool/builtin.h |    2 +-
 4 files changed, 10 insertions(+), 5 deletions(-)

--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -115,7 +115,7 @@ objtool_link()
 			objtoolcmd="orc generate"
 		fi
 
-		objtoolopt="${objtoolopt} --duplicate"
+		objtoolopt="${objtoolopt} --lto"
 
 		if is_enabled CONFIG_FTRACE_MCOUNT_USE_OBJTOOL; then
 			objtoolopt="${objtoolopt} --mcount"
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -20,7 +20,7 @@
 #include <objtool/objtool.h>
 
 bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
-     validate_dup, vmlinux, mcount, noinstr, backup, sls, dryrun;
+     lto, vmlinux, mcount, noinstr, backup, sls, dryrun;
 
 static const char * const check_usage[] = {
 	"objtool check [<options>] file.o",
@@ -40,7 +40,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('b', "backtrace", &backtrace, "unwind on error"),
 	OPT_BOOLEAN('a', "uaccess", &uaccess, "enable uaccess checking"),
 	OPT_BOOLEAN('s', "stats", &stats, "print statistics"),
-	OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"),
+	OPT_BOOLEAN(0, "lto", &lto, "whole-archive like runs"),
 	OPT_BOOLEAN('n', "noinstr", &noinstr, "noinstr validation for vmlinux.o"),
 	OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"),
 	OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"),
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3501,6 +3501,11 @@ int check(struct objtool_file *file)
 {
 	int ret, warnings = 0;
 
+	if (lto && !(vmlinux || module)) {
+		fprintf(stderr, "--lto requires: --vmlinux or --module\n");
+		return 1;
+	}
+
 	arch_initial_func_cfi_state(&initial_func_cfi);
 	init_cfi_state(&init_cfi);
 	init_cfi_state(&func_cfi);
@@ -3521,7 +3526,7 @@ int check(struct objtool_file *file)
 	if (list_empty(&file->insn_list))
 		goto out;
 
-	if (vmlinux && !validate_dup) {
+	if (vmlinux && !lto) {
 		ret = validate_vmlinux_functions(file);
 		if (ret < 0)
 			goto out;
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -9,7 +9,7 @@
 
 extern const struct option check_options[];
 extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
-            validate_dup, vmlinux, mcount, noinstr, backup, sls, dryrun;
+	    lto, vmlinux, mcount, noinstr, backup, sls, dryrun;
 
 extern int cmd_parse_options(int argc, const char **argv, const char * const usage[]);
 




* [PATCH v3 32/39] Kbuild: Allow whole module objtool runs
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (30 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 31/39] objtool: Rename --duplicate to --lto Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 33/39] objtool: Read the NOENDBR annotation Peter Zijlstra
                   ` (7 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov, Nathan Chancellor

Just like we have vmlinux.o objtool runs, add the ability to do
whole-module objtool runs.

Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 scripts/Makefile.build    |   44 ++------------------------------------------
 scripts/Makefile.lib      |   45 +++++++++++++++++++++++++++++++++++++++++++++
 scripts/Makefile.modfinal |    1 +
 3 files changed, 48 insertions(+), 42 deletions(-)

--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -223,41 +223,6 @@ cmd_record_mcount = $(if $(findstring $(
 	$(sub_cmd_record_mcount))
 endif # CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
 
-ifdef CONFIG_STACK_VALIDATION
-
-objtool := $(objtree)/tools/objtool/objtool
-
-objtool_args =								\
-	$(if $(CONFIG_UNWINDER_ORC),orc generate,check)			\
-	$(if $(part-of-module), --module)				\
-	$(if $(CONFIG_FRAME_POINTER),, --no-fp)				\
-	$(if $(CONFIG_GCOV_KERNEL)$(CONFIG_LTO_CLANG), --no-unreachable)\
-	$(if $(CONFIG_RETPOLINE), --retpoline)				\
-	$(if $(CONFIG_X86_SMAP), --uaccess)				\
-	$(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount)		\
-	$(if $(CONFIG_SLS), --sls)
-
-cmd_objtool = $(if $(objtool-enabled), ; $(objtool) $(objtool_args) $@)
-cmd_gen_objtooldep = $(if $(objtool-enabled), { echo ; echo '$@: $$(wildcard $(objtool))' ; } >> $(dot-target).cmd)
-
-endif # CONFIG_STACK_VALIDATION
-
-ifdef CONFIG_LTO_CLANG
-
-# Skip objtool for LLVM bitcode
-$(obj)/%.o: objtool-enabled :=
-
-else
-
-# 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
-# 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file
-# 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a file
-
-$(obj)/%.o: objtool-enabled = $(if $(filter-out y%, \
-	$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y)
-
-endif
-
 ifdef CONFIG_TRIM_UNUSED_KSYMS
 cmd_gen_ksymdeps = \
 	$(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> $(dot-target).cmd
@@ -292,21 +257,16 @@ ifdef CONFIG_LTO_CLANG
 # Module .o files may contain LLVM bitcode, compile them into native code
 # before ELF processing
 quiet_cmd_cc_lto_link_modules = LTO [M] $@
-cmd_cc_lto_link_modules =						\
+      cmd_cc_lto_link_modules =						\
 	$(LD) $(ld_flags) -r -o $@					\
 		$(shell [ -s $(@:.lto.o=.o.symversions) ] &&		\
 			echo -T $(@:.lto.o=.o.symversions))		\
 		--whole-archive $(filter-out FORCE,$^)			\
 		$(cmd_objtool)
-
-# objtool was skipped for LLVM bitcode, run it now that we have compiled
-# modules into native code
-$(obj)/%.lto.o: objtool-enabled = y
-$(obj)/%.lto.o: part-of-module := y
+endif
 
 $(obj)/%.lto.o: $(obj)/%.o FORCE
 	$(call if_changed,cc_lto_link_modules)
-endif
 
 cmd_mod = { \
 	echo $(if $($*-objs)$($*-y)$($*-m), $(addprefix $(obj)/, $($*-objs) $($*-y) $($*-m)), $(@:.mod=.o)); \
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -533,3 +533,48 @@ define filechk_offsets
 	 echo ""; \
 	 echo "#endif"
 endef
+
+# objtool
+# ---------------------------------------------------------------------------
+
+ifdef CONFIG_STACK_VALIDATION
+
+objtool := $(objtree)/tools/objtool/objtool
+
+objtool_args =								\
+	$(if $(CONFIG_UNWINDER_ORC),orc generate,check)			\
+	$(if $(part-of-module), --module)				\
+	$(if $(CONFIG_FRAME_POINTER),, --no-fp)				\
+	$(if $(CONFIG_GCOV_KERNEL)$(CONFIG_LTO_CLANG), --no-unreachable)\
+	$(if $(CONFIG_RETPOLINE), --retpoline)				\
+	$(if $(CONFIG_X86_SMAP), --uaccess)				\
+	$(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount)		\
+	$(if $(CONFIG_SLS), --sls)
+
+cmd_objtool = $(if $(objtool-enabled), ; $(objtool) $(objtool_args) $@)
+cmd_objtool_mod = $(if $(objtool-enabled), $(objtool) $(objtool_args) $(@:.ko=.o) ; )
+cmd_gen_objtooldep = $(if $(objtool-enabled), { echo ; echo '$@: $$(wildcard $(objtool))' ; } >> $(dot-target).cmd)
+
+endif # CONFIG_STACK_VALIDATION
+
+ifdef CONFIG_LTO_CLANG
+
+# Skip objtool for LLVM bitcode
+$(obj)/%.o: objtool-enabled :=
+
+# objtool was skipped for LLVM bitcode, run it now that we have compiled
+# modules into native code
+$(obj)/%.lto.o: objtool-enabled = y
+$(obj)/%.lto.o: part-of-module := y
+
+else
+
+# 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
+# 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file
+# 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a file
+
+$(obj)/%.o: objtool-enabled = $(if $(filter-out y%, \
+	$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y)
+
+endif
+
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -32,6 +32,7 @@ ARCH_POSTLINK := $(wildcard $(srctree)/a
 
 quiet_cmd_ld_ko_o = LD [M]  $@
       cmd_ld_ko_o +=							\
+	$(cmd_objtool_mod)						\
 	$(LD) -r $(KBUILD_LDFLAGS)					\
 		$(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE)		\
 		-T scripts/module.lds -o $@ $(filter %.o, $^);		\




* [PATCH v3 33/39] objtool: Read the NOENDBR annotation
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (31 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 32/39] Kbuild: Allow whole module objtool runs Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 34/39] objtool: Have WARN_FUNC fall back to sym+off Peter Zijlstra
                   ` (6 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Read the new NOENDBR annotation. While there, attempt to not bloat
struct instruction.
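
For context, a hedged sketch of the C side of the annotation being
read here (the real definition lives in an earlier patch of this
series; details from memory): it stashes the address of the next
instruction in .discard.noendbr, and the loop below walks that
section's relocations:

	#define ANNOTATE_NOENDBR					\
		"986: \n\t"						\
		".pushsection .discard.noendbr\n\t"			\
		".quad 986b\n\t"					\
		".popsection\n\t"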

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c                 |   27 +++++++++++++++++++++++++++
 tools/objtool/include/objtool/check.h |   13 ++++++++++---
 2 files changed, 37 insertions(+), 3 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1860,6 +1860,29 @@ static int read_unwind_hints(struct objt
 	return 0;
 }
 
+static int read_noendbr_hints(struct objtool_file *file)
+{
+	struct section *sec;
+	struct instruction *insn;
+	struct reloc *reloc;
+
+	sec = find_section_by_name(file->elf, ".rela.discard.noendbr");
+	if (!sec)
+		return 0;
+
+	list_for_each_entry(reloc, &sec->reloc_list, list) {
+		insn = find_insn(file, reloc->sym->sec, reloc->sym->offset + reloc->addend);
+		if (!insn) {
+			WARN("bad .discard.noendbr entry");
+			return -1;
+		}
+
+		insn->noendbr = 1;
+	}
+
+	return 0;
+}
+
 static int read_retpoline_hints(struct objtool_file *file)
 {
 	struct section *sec;
@@ -2097,6 +2120,10 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
+	ret = read_noendbr_hints(file);
+	if (ret)
+		return ret;
+
 	/*
 	 * Must be before add_{jump_call}_destination.
 	 */
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -45,11 +45,18 @@ struct instruction {
 	unsigned int len;
 	enum insn_type type;
 	unsigned long immediate;
-	bool dead_end, ignore, ignore_alts;
-	bool hint;
-	bool retpoline_safe;
+
+	u8 dead_end	: 1,
+	   ignore	: 1,
+	   ignore_alts	: 1,
+	   hint		: 1,
+	   retpoline_safe : 1,
+	   noendbr	: 1;
+		/* 2 bit hole */
 	s8 instr;
 	u8 visited;
+	/* u8 hole */
+
 	struct alt_group *alt_group;
 	struct symbol *call_dest;
 	struct instruction *jump_dest;




* [PATCH v3 34/39] objtool: Have WARN_FUNC fall back to sym+off
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (32 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 33/39] objtool: Read the NOENDBR annotation Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 35/39] objtool: Add IBT/ENDBR decoding Peter Zijlstra
                   ` (5 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Currently WARN_FUNC() prints func+off, or failing that, sec+off; add
an intermediate sym+off fallback. This is useful when playing around
with entry code.
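
Illustrative fallback order (names taken from elsewhere in this
series, offsets made up):

	/*
	 *   "vc_boot_ghcb+0x12"                func+off: inside an STT_FUNC symbol
	 *   "entry_SYSCALL_64_safe_stack+0x4"  sym+off:  the new intermediate case,
	 *                                      e.g. a global label in entry code
	 *   ".entry.text+0x312"                sec+off:  last resort
	 */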

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/include/objtool/warn.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/tools/objtool/include/objtool/warn.h
+++ b/tools/objtool/include/objtool/warn.h
@@ -22,6 +22,8 @@ static inline char *offstr(struct sectio
 	unsigned long name_off;
 
 	func = find_func_containing(sec, offset);
+	if (!func)
+		func = find_symbol_containing(sec, offset);
 	if (func) {
 		name = func->name;
 		name_off = offset - func->offset;




* [PATCH v3 35/39] objtool: Add IBT/ENDBR decoding
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (33 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 34/39] objtool: Have WARN_FUNC fall back to sym+off Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 36/39] objtool: Validate IBT assumptions Peter Zijlstra
                   ` (4 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Decode ENDBR instructions and WARN about NOTRACK prefixes.
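
For reference, the byte patterns involved (the decoder hunk below
matches the 0xf3 prefix, the 0x0f 0x1e opcode and modrm 0xfa/0xfb;
NOTRACK is the 0x3e DS-override prefix, which tells IBT hardware not
to require ENDBR at the target of an indirect JMP/CALL):

	static const unsigned char endbr64_insn[] = { 0xf3, 0x0f, 0x1e, 0xfa };
	static const unsigned char endbr32_insn[] = { 0xf3, 0x0f, 0x1e, 0xfb };
	static const unsigned char notrack_prefix = 0x3e;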

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/arch/x86/decode.c      |   34 +++++++++++++++++++++++++++++-----
 tools/objtool/include/objtool/arch.h |    1 +
 2 files changed, 30 insertions(+), 5 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -103,6 +103,18 @@ unsigned long arch_jump_destination(stru
 #define rm_is_mem(reg)	(mod_is_mem() && !is_RIP() && rm_is(reg))
 #define rm_is_reg(reg)	(mod_is_reg() && modrm_rm == (reg))
 
+static bool has_notrack_prefix(struct insn *insn)
+{
+	int i;
+
+	for (i = 0; i < insn->prefixes.nbytes; i++) {
+		if (insn->prefixes.bytes[i] == 0x3e)
+			return true;
+	}
+
+	return false;
+}
+
 int arch_decode_instruction(struct objtool_file *file, const struct section *sec,
 			    unsigned long offset, unsigned int maxlen,
 			    unsigned int *len, enum insn_type *type,
@@ -112,7 +124,7 @@ int arch_decode_instruction(struct objto
 	const struct elf *elf = file->elf;
 	struct insn insn;
 	int x86_64, ret;
-	unsigned char op1, op2, op3,
+	unsigned char op1, op2, op3, prefix,
 		      rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0,
 		      modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
 		      sib = 0, /* sib_scale = 0, */ sib_index = 0, sib_base = 0;
@@ -137,6 +149,8 @@ int arch_decode_instruction(struct objto
 	if (insn.vex_prefix.nbytes)
 		return 0;
 
+	prefix = insn.prefixes.bytes[0];
+
 	op1 = insn.opcode.bytes[0];
 	op2 = insn.opcode.bytes[1];
 	op3 = insn.opcode.bytes[2];
@@ -492,6 +506,12 @@ int arch_decode_instruction(struct objto
 			/* nopl/nopw */
 			*type = INSN_NOP;
 
+		} else if (op2 == 0x1e) {
+
+			if (prefix == 0xf3 && (modrm == 0xfa || modrm == 0xfb))
+				*type = INSN_ENDBR;
+
+
 		} else if (op2 == 0x38 && op3 == 0xf8) {
 			if (insn.prefixes.nbytes == 1 &&
 			    insn.prefixes.bytes[0] == 0xf2) {
@@ -636,20 +656,24 @@ int arch_decode_instruction(struct objto
 		break;
 
 	case 0xff:
-		if (modrm_reg == 2 || modrm_reg == 3)
+		if (modrm_reg == 2 || modrm_reg == 3) {
 
 			*type = INSN_CALL_DYNAMIC;
+			if (has_notrack_prefix(&insn))
+				WARN("notrack prefix found at %s:0x%lx", sec->name, offset);
 
-		else if (modrm_reg == 4)
+		} else if (modrm_reg == 4) {
 
 			*type = INSN_JUMP_DYNAMIC;
+			if (has_notrack_prefix(&insn))
+				WARN("notrack prefix found at %s:0x%lx", sec->name, offset);
 
-		else if (modrm_reg == 5)
+		} else if (modrm_reg == 5) {
 
 			/* jmpf */
 			*type = INSN_CONTEXT_SWITCH;
 
-		else if (modrm_reg == 6) {
+		} else if (modrm_reg == 6) {
 
 			/* push from mem */
 			ADD_OP(op) {
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -27,6 +27,7 @@ enum insn_type {
 	INSN_STD,
 	INSN_CLD,
 	INSN_TRAP,
+	INSN_ENDBR,
 	INSN_OTHER,
 };
 




* [PATCH v3 36/39] objtool: Validate IBT assumptions
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (34 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 35/39] objtool: Add IBT/ENDBR decoding Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:23 ` [PATCH v3 37/39] objtool: Optionally WARN about unused ANNOTATE_NOENDBR Peter Zijlstra
                   ` (3 subsequent siblings)
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Intel IBT requires that every indirect JMP/CALL target an ENDBR
instruction; failing that, a #CP exception is raised and we die.
Similarly, all exception entries should start with ENDBR.

Find all code relocations and ensure they target either an ENDBR
instruction or an ANNOTATE_NOENDBR site. For the exception entries,
look for an UNWIND_HINT_IRET_REGS at sym+0 that is not an ENDBR.
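
The invariant in miniature, with made-up function names:

	static void handler(void) { }		/* compiler emits ENDBR here  */
	static void (*fp)(void) = handler;	/* code relocation to handler */

	static void call_it(void)
	{
		fp();	/* indirect call: must land on ENDBR, else #CP */
	}

Anything whose address escapes, like 'handler' above, must either
start with ENDBR or be explicitly marked ANNOTATE_NOENDBR; that is
exactly what the relocation walk below enforces.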

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/builtin-check.c           |    4 
 tools/objtool/check.c                   |  210 +++++++++++++++++++++++++++++++-
 tools/objtool/include/objtool/builtin.h |    3 
 tools/objtool/include/objtool/objtool.h |    3 
 4 files changed, 215 insertions(+), 5 deletions(-)

--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -20,7 +20,8 @@
 #include <objtool/objtool.h>
 
 bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
-     lto, vmlinux, mcount, noinstr, backup, sls, dryrun;
+     lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
+     ibt;
 
 static const char * const check_usage[] = {
 	"objtool check [<options>] file.o",
@@ -47,6 +48,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('B', "backup", &backup, "create .orig files before modification"),
 	OPT_BOOLEAN('S', "sls", &sls, "validate straight-line-speculation"),
 	OPT_BOOLEAN(0, "dry-run", &dryrun, "don't write the modifications"),
+	OPT_BOOLEAN(0, "ibt", &ibt, "validate ENDBR placement"),
 	OPT_END(),
 };
 
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -407,8 +407,16 @@ static int decode_instructions(struct ob
 				return -1;
 			}
 
-			sym_for_each_insn(file, func, insn)
+			sym_for_each_insn(file, func, insn) {
 				insn->func = func;
+				if (insn->type == INSN_ENDBR) {
+					if (insn->offset == insn->func->offset) {
+						file->nr_endbr++;
+					} else {
+						file->nr_endbr_int++;
+					}
+				}
+			}
 		}
 	}
 
@@ -1165,6 +1173,19 @@ static void add_retpoline_call(struct ob
 
 	annotate_call_site(file, insn, false);
 }
+
+static bool same_function(struct instruction *insn1, struct instruction *insn2)
+{
+	return insn1->func->pfunc == insn2->func->pfunc;
+}
+
+static bool is_first_func_insn(struct instruction *insn)
+{
+	return insn->offset == insn->func->offset ||
+	       (insn->type == INSN_ENDBR &&
+		insn->offset == insn->func->offset + insn->len);
+}
+
 /*
  * Find the destination instructions for all jumps.
  */
@@ -1245,8 +1266,8 @@ static int add_jump_destinations(struct
 				insn->func->cfunc = insn->jump_dest->func;
 				insn->jump_dest->func->pfunc = insn->func;
 
-			} else if (insn->jump_dest->func->pfunc != insn->func->pfunc &&
-				   insn->jump_dest->offset == insn->jump_dest->func->offset) {
+			} else if (!same_function(insn, insn->jump_dest) &&
+				   is_first_func_insn(insn->jump_dest)) {
 				/* internal sibling call (without reloc) */
 				add_call_dest(file, insn, insn->jump_dest->func, true);
 			}
@@ -1836,6 +1857,16 @@ static int read_unwind_hints(struct objt
 
 		insn->hint = true;
 
+		if (ibt && hint->type == UNWIND_HINT_TYPE_REGS_PARTIAL) {
+			struct symbol *sym = find_symbol_by_offset(insn->sec, insn->offset);
+
+			if (sym && sym->bind == STB_GLOBAL &&
+			    insn->type != INSN_ENDBR && !insn->noendbr) {
+				WARN_FUNC("UNWIND_HINT_IRET_REGS without ENDBR",
+					  insn->sec, insn->offset);
+			}
+		}
+
 		if (hint->type == UNWIND_HINT_TYPE_FUNC) {
 			insn->cfi = &func_cfi;
 			continue;
@@ -1877,6 +1908,9 @@ static int read_noendbr_hints(struct obj
 			return -1;
 		}
 
+		if (insn->type == INSN_ENDBR)
+			WARN_FUNC("ANNOTATE_NOENDBR on ENDBR", insn->sec, insn->offset);
+
 		insn->noendbr = 1;
 	}
 
@@ -2120,6 +2154,9 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before read_unwind_hints() since that needs insn->noendbr.
+	 */
 	ret = read_noendbr_hints(file);
 	if (ret)
 		return ret;
@@ -3053,6 +3090,111 @@ static struct instruction *next_insn_to_
 	return next_insn_same_sec(file, insn);
 }
 
+static struct instruction *
+validate_ibt_reloc(struct objtool_file *file, struct reloc *reloc)
+{
+	struct instruction *dest;
+	struct section *sec;
+	unsigned long off;
+
+	sec = reloc->sym->sec;
+	off = reloc->sym->offset;
+
+	if ((reloc->sec->base->sh.sh_flags & SHF_EXECINSTR) &&
+	    (reloc->type == R_X86_64_PC32 || reloc->type == R_X86_64_PLT32))
+		off += arch_dest_reloc_offset(reloc->addend);
+	else
+		off += reloc->addend;
+
+	dest = find_insn(file, sec, off);
+	if (!dest)
+		return NULL;
+
+	if (dest->type == INSN_ENDBR)
+		return NULL;
+
+	if (reloc->sym->static_call_tramp)
+		return NULL;
+
+	return dest;
+}
+
+static void warn_noendbr(const char *msg, struct section *sec, unsigned long offset,
+			 struct instruction *dest)
+{
+	WARN_FUNC("%srelocation to !ENDBR: %s+0x%lx", sec, offset, msg,
+		  dest->func ? dest->func->name : dest->sec->name,
+		  dest->func ? dest->offset - dest->func->offset : dest->offset);
+}
+
+static void validate_ibt_dest(struct objtool_file *file, struct instruction *insn,
+			      struct instruction *dest)
+{
+	if (dest->func && dest->func == insn->func) {
+		/*
+		 * Anything from->to self is either _THIS_IP_ or IRET-to-self.
+		 *
+		 * There is no sane way to annotate _THIS_IP_ since the compiler treats the
+		 * relocation as a constant and is happy to fold in offsets, skewing any
+		 * annotation we do, leading to vast amounts of false-positives.
+		 *
+		 * There's also compiler generated _THIS_IP_ through KCOV and
+		 * such which we have no hope of annotating.
+		 *
+		 * As such, blanket accept self-references without issue.
+		 */
+		return;
+	}
+
+	if (dest->noendbr)
+		return;
+
+	warn_noendbr("", insn->sec, insn->offset, dest);
+}
+
+static void validate_ibt_insn(struct objtool_file *file, struct instruction *insn)
+{
+	struct instruction *dest;
+	struct reloc *reloc;
+
+	switch (insn->type) {
+	case INSN_CALL:
+	case INSN_CALL_DYNAMIC:
+	case INSN_JUMP_CONDITIONAL:
+	case INSN_JUMP_UNCONDITIONAL:
+	case INSN_JUMP_DYNAMIC:
+	case INSN_JUMP_DYNAMIC_CONDITIONAL:
+	case INSN_RETURN:
+		/*
+		 * We're looking for code references setting up indirect code
+		 * flow. As such, ignore direct code flow and the actual
+		 * dynamic branches.
+		 */
+		return;
+
+	case INSN_NOP:
+		/*
+		 * handle_group_alt() will create INSN_NOP instructions that
+		 * don't belong to any section; ignore all NOPs since they
+		 * won't carry a (useful) relocation anyway.
+		 */
+		return;
+
+	default:
+		break;
+	}
+
+	for (reloc = insn_reloc(file, insn);
+	     reloc;
+	     reloc = find_reloc_by_dest_range(file->elf, insn->sec,
+					      reloc->offset + 1,
+					      (insn->offset + insn->len) - (reloc->offset + 1))) {
+		dest = validate_ibt_reloc(file, reloc);
+		if (dest)
+			validate_ibt_dest(file, insn, dest);
+	}
+}
+
 /*
  * Follow the branch starting at the given instruction, and recursively follow
  * any other branches (jumps).  Meanwhile, track the frame pointer state at
@@ -3264,6 +3406,9 @@ static int validate_branch(struct objtoo
 			break;
 		}
 
+		if (ibt)
+			validate_ibt_insn(file, insn);
+
 		if (insn->dead_end)
 			return 0;
 
@@ -3506,6 +3651,53 @@ static int validate_functions(struct obj
 	return warnings;
 }
 
+static int validate_ibt(struct objtool_file *file)
+{
+	struct section *sec;
+	struct reloc *reloc;
+
+	for_each_sec(file, sec) {
+		bool is_data;
+
+		/* already done in validate_branch() */
+		if (sec->sh.sh_flags & SHF_EXECINSTR)
+			continue;
+
+		if (!sec->reloc)
+			continue;
+
+		if (!strncmp(sec->name, ".orc", 4))
+			continue;
+
+		if (!strncmp(sec->name, ".discard", 8))
+			continue;
+
+		if (!strncmp(sec->name, ".debug", 6))
+			continue;
+
+		if (!strcmp(sec->name, "_error_injection_whitelist"))
+			continue;
+
+		if (!strcmp(sec->name, "_kprobe_blacklist"))
+			continue;
+
+		is_data = strstr(sec->name, ".data") || strstr(sec->name, ".rodata");
+
+		list_for_each_entry(reloc, &sec->reloc->reloc_list, list) {
+			struct instruction *dest;
+
+			dest = validate_ibt_reloc(file, reloc);
+			if (is_data && dest && !dest->noendbr) {
+				warn_noendbr("data ", reloc->sym->sec,
+					     reloc->sym->offset + reloc->addend,
+					     dest);
+			}
+		}
+	}
+
+	return 0;
+}
+
 static int validate_reachable_instructions(struct objtool_file *file)
 {
 	struct instruction *insn;
@@ -3533,6 +3725,11 @@ int check(struct objtool_file *file)
 		return 1;
 	}
 
+	if (ibt && !lto) {
+		fprintf(stderr, "--ibt requires: --lto\n");
+		return 1;
+	}
+
 	arch_initial_func_cfi_state(&initial_func_cfi);
 	init_cfi_state(&init_cfi);
 	init_cfi_state(&func_cfi);
@@ -3579,6 +3776,13 @@ int check(struct objtool_file *file)
 		goto out;
 	warnings += ret;
 
+	if (ibt) {
+		ret = validate_ibt(file);
+		if (ret < 0)
+			goto out;
+		warnings += ret;
+	}
+
 	if (!warnings) {
 		ret = validate_reachable_instructions(file);
 		if (ret < 0)
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -9,7 +9,8 @@
 
 extern const struct option check_options[];
 extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
-	    lto, vmlinux, mcount, noinstr, backup, sls, dryrun;
+	    lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
+	    ibt;
 
 extern int cmd_parse_options(int argc, const char **argv, const char * const usage[]);
 
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -28,6 +28,9 @@ struct objtool_file {
 	struct list_head mcount_loc_list;
 	bool ignore_unreachables, c_file, hints, rodata;
 
+	unsigned int nr_endbr;
+	unsigned int nr_endbr_int;
+
 	unsigned long jl_short, jl_long;
 	unsigned long jl_nop_short, jl_nop_long;
 




* [PATCH v3 37/39] objtool: Optionally WARN about unused ANNOTATE_NOENDBR
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (35 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 36/39] objtool: Validate IBT assumptions Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-04 18:27   ` Josh Poimboeuf
  2022-03-03 11:23 ` [PATCH v3 38/39] objtool: Find unused ENDBR instructions Peter Zijlstra
                   ` (2 subsequent siblings)
  39 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

This is a separate option because it is somewhat prone to false
positives: a bunch of symbols/annotations are not correctly
#ifdef'ed, presumably to avoid clutter.

It can be run manually on allmodconfig builds using OBJTOOL_ARGS.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/builtin-check.c           |    3 +-
 tools/objtool/check.c                   |   45 +++++++++++++++++++++++++++++---
 tools/objtool/include/objtool/builtin.h |    2 -
 tools/objtool/include/objtool/check.h   |    3 +-
 4 files changed, 46 insertions(+), 7 deletions(-)

--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -21,7 +21,7 @@
 
 bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
      lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
-     ibt;
+     ibt, ibt_warn;
 
 static const char * const check_usage[] = {
 	"objtool check [<options>] file.o",
@@ -49,6 +49,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('S', "sls", &sls, "validate straight-line-speculation"),
 	OPT_BOOLEAN(0, "dry-run", &dryrun, "don't write the modifications"),
 	OPT_BOOLEAN(0, "ibt", &ibt, "validate ENDBR placement"),
+	OPT_BOOLEAN(0, "ibt-warn", &ibt_warn, "warn about unused ANNOTATE_NOENDBR"),
 	OPT_END(),
 };
 
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1814,6 +1814,19 @@ static void set_func_state(struct cfi_st
 	state->stack_size = initial_func_cfi.cfa.offset;
 }
 
+static bool insn_is_endbr(struct instruction *insn)
+{
+	if (insn->type == INSN_ENDBR)
+		return true;
+
+	if (insn->noendbr) {
+		insn->noendbr_hit = 1;
+		return true;
+	}
+
+	return false;
+}
+
 static int read_unwind_hints(struct objtool_file *file)
 {
 	struct cfi_state cfi = init_cfi;
@@ -1860,8 +1873,7 @@ static int read_unwind_hints(struct objt
 		if (ibt && hint->type == UNWIND_HINT_TYPE_REGS_PARTIAL) {
 			struct symbol *sym = find_symbol_by_offset(insn->sec, insn->offset);
 
-			if (sym && sym->bind == STB_GLOBAL &&
-			    insn->type != INSN_ENDBR && !insn->noendbr) {
+			if (sym && sym->bind == STB_GLOBAL && !insn_is_endbr(insn)) {
 				WARN_FUNC("UNWIND_HINT_IRET_REGS without ENDBR",
 					  insn->sec, insn->offset);
 			}
@@ -3146,7 +3158,7 @@ static void validate_ibt_dest(struct obj
 		return;
 	}
 
-	if (dest->noendbr)
+	if (insn_is_endbr(dest))
 		return;
 
 	warn_noendbr("", insn->sec, insn->offset, dest);
@@ -3687,7 +3699,7 @@ static int validate_ibt(struct objtool_f
 			struct instruction *dest;
 
 			dest = validate_ibt_reloc(file, reloc);
-			if (is_data && dest && !dest->noendbr) {
+			if (is_data && dest && !insn_is_endbr(dest)) {
 				warn_noendbr("data ", reloc->sym->sec,
 					     reloc->sym->offset + reloc->addend,
 					     dest);
@@ -3695,6 +3707,31 @@ static int validate_ibt(struct objtool_f
 		}
 	}
 
+	if (ibt_warn) {
+		struct symbol *hypercall_page = find_symbol_by_name(file->elf, "hypercall_page");
+		struct instruction *insn;
+
+		for_each_insn(file, insn) {
+			if (!insn->noendbr || insn->noendbr_hit)
+				continue;
+
+			if (hypercall_page) {
+				/*
+				 * The Xen hypercall page contains many
+				 * hypercalls (and unused slots) that are never
+				 * indirectly called. Still every slot has an
+				 * annotation. Suppress complaints.
+				 */
+				if (insn->sec == hypercall_page->sec &&
+				    insn->offset >= hypercall_page->offset &&
+				    insn->offset <  hypercall_page->offset + hypercall_page->len)
+					continue;
+			}
+
+			WARN_FUNC("unused ANNOTATE_NOENDBR", insn->sec, insn->offset);
+		}
+	}
+
 	return 0;
 }
 
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -10,7 +10,7 @@
 extern const struct option check_options[];
 extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
 	    lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
-	    ibt;
+	    ibt, ibt_warn;
 
 extern int cmd_parse_options(int argc, const char **argv, const char * const usage[]);
 
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -51,7 +51,8 @@ struct instruction {
 	   ignore_alts	: 1,
 	   hint		: 1,
 	   retpoline_safe : 1,
-	   noendbr	: 1;
+	   noendbr	: 1,
+	   noendbr_hit  : 1;
 		/* 2 bit hole */
 	s8 instr;
 	u8 visited;




* [PATCH v3 38/39] objtool: Find unused ENDBR instructions
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (36 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 37/39] objtool: Optionally WARN about unused ANNOTATE_NOENDBR Peter Zijlstra
@ 2022-03-03 11:23 ` Peter Zijlstra
  2022-03-03 11:24 ` [PATCH v3 39/39] x86/alternative: Use .ibt_endbr_seal to seal indirect calls Peter Zijlstra
  2022-03-04 19:09 ` [PATCH v3 00/39] x86: Kernel IBT Josh Poimboeuf
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:23 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Find all ENDBR instructions which are never referenced and stick them
in a section such that the kernel can poison them, sealing the
functions from ever being an indirect call target.

This removes about 1-in-4 ENDBR instructions.
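
Layout-wise, .ibt_endbr_seal ends up as an array of 32-bit
PC-relative entries, one per unreferenced ENDBR; a consumer recovers
each site roughly as below (a sketch mirroring the kernel side added
in the next patch):

	extern s32 __ibt_endbr_seal[], __ibt_endbr_seal_end[];

	static void walk_seal_sites(void)
	{
		s32 *s;

		for (s = __ibt_endbr_seal; s < __ibt_endbr_seal_end; s++) {
			void *addr = (void *)s + *s;	/* the sealed ENDBR */
			/* ... poison the 4 bytes at addr ... */
		}
	}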

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/vmlinux.lds.S           |    9 ++++
 tools/objtool/check.c                   |   69 +++++++++++++++++++++++++++++++-
 tools/objtool/include/objtool/objtool.h |    1 
 tools/objtool/objtool.c                 |    1 
 4 files changed, 78 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -285,6 +285,15 @@ SECTIONS
 	}
 #endif
 
+#ifdef CONFIG_X86_KERNEL_IBT
+	. = ALIGN(8);
+	.ibt_endbr_seal : AT(ADDR(.ibt_endbr_seal) - LOAD_OFFSET) {
+		__ibt_endbr_seal = .;
+		*(.ibt_endbr_seal)
+		__ibt_endbr_seal_end = .;
+	}
+#endif
+
 	/*
 	 * struct alt_inst entries. From the header (alternative.h):
 	 * "Alternative instructions for different CPU types or capabilities"
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -380,6 +380,7 @@ static int decode_instructions(struct ob
 			memset(insn, 0, sizeof(*insn));
 			INIT_LIST_HEAD(&insn->alts);
 			INIT_LIST_HEAD(&insn->stack_ops);
+			INIT_LIST_HEAD(&insn->call_node);
 
 			insn->sec = sec;
 			insn->offset = offset;
@@ -409,8 +410,9 @@ static int decode_instructions(struct ob
 
 			sym_for_each_insn(file, func, insn) {
 				insn->func = func;
-				if (insn->type == INSN_ENDBR) {
+				if (insn->type == INSN_ENDBR && list_empty(&insn->call_node)) {
 					if (insn->offset == insn->func->offset) {
+						list_add_tail(&insn->call_node, &file->endbr_list);
 						file->nr_endbr++;
 					} else {
 						file->nr_endbr_int++;
@@ -739,6 +741,58 @@ static int create_retpoline_sites_sectio
 	return 0;
 }
 
+static int create_ibt_endbr_seal_sections(struct objtool_file *file)
+{
+	struct instruction *insn;
+	struct section *sec;
+	int idx;
+
+	sec = find_section_by_name(file->elf, ".ibt_endbr_seal");
+	if (sec) {
+		WARN("file already has .ibt_endbr_seal, skipping");
+		return 0;
+	}
+
+	idx = 0;
+	list_for_each_entry(insn, &file->endbr_list, call_node)
+		idx++;
+
+	if (stats) {
+		printf("ibt: ENDBR at function start: %d\n", file->nr_endbr);
+		printf("ibt: ENDBR inside functions:  %d\n", file->nr_endbr_int);
+		printf("ibt: superfluous ENDBR:       %d\n", idx);
+	}
+
+	if (!idx)
+		return 0;
+
+	sec = elf_create_section(file->elf, ".ibt_endbr_seal", 0,
+				 sizeof(int), idx);
+	if (!sec) {
+		WARN("elf_create_section: .ibt_endbr_seal");
+		return -1;
+	}
+
+	idx = 0;
+	list_for_each_entry(insn, &file->endbr_list, call_node) {
+
+		int *site = (int *)sec->data->d_buf + idx;
+		*site = 0;
+
+		if (elf_add_reloc_to_insn(file->elf, sec,
+					  idx * sizeof(int),
+					  R_X86_64_PC32,
+					  insn->sec, insn->offset)) {
+			WARN("elf_add_reloc_to_insn: .ibt_endbr_seal");
+			return -1;
+		}
+
+		idx++;
+	}
+
+	return 0;
+}
+
 static int create_mcount_loc_sections(struct objtool_file *file)
 {
 	struct section *sec;
@@ -3097,8 +3151,12 @@ validate_ibt_reloc(struct objtool_file *
 	if (!dest)
 		return NULL;
 
-	if (dest->type == INSN_ENDBR)
+	if (dest->type == INSN_ENDBR) {
+		if (!list_empty(&dest->call_node))
+			list_del_init(&dest->call_node);
+
 		return NULL;
+	}
 
 	if (reloc->sym->static_call_tramp)
 		return NULL;
@@ -3805,6 +3863,13 @@ int check(struct objtool_file *file)
 		if (ret < 0)
 			goto out;
 		warnings += ret;
+	}
+
+	if (ibt) {
+		ret = create_ibt_endbr_seal_sections(file);
+		if (ret < 0)
+			goto out;
+		warnings += ret;
 	}
 
 	if (stats) {
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -26,6 +26,7 @@ struct objtool_file {
 	struct list_head retpoline_call_list;
 	struct list_head static_call_list;
 	struct list_head mcount_loc_list;
+	struct list_head endbr_list;
 	bool ignore_unreachables, c_file, hints, rodata;
 
 	unsigned int nr_endbr;
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -128,6 +128,7 @@ struct objtool_file *objtool_open_read(c
 	INIT_LIST_HEAD(&file.retpoline_call_list);
 	INIT_LIST_HEAD(&file.static_call_list);
 	INIT_LIST_HEAD(&file.mcount_loc_list);
+	INIT_LIST_HEAD(&file.endbr_list);
 	file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment");
 	file.ignore_unreachables = no_unreachable;
 	file.hints = false;




* [PATCH v3 39/39] x86/alternative: Use .ibt_endbr_seal to seal indirect calls
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (37 preceding siblings ...)
  2022-03-03 11:23 ` [PATCH v3 38/39] objtool: Find unused ENDBR instructions Peter Zijlstra
@ 2022-03-03 11:24 ` Peter Zijlstra
  2022-03-04 19:09 ` [PATCH v3 00/39] x86: Kernel IBT Josh Poimboeuf
  39 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-03 11:24 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, peterz, ndesaulniers, keescook, samitolvanen,
	mark.rutland, alyssa.milburn, mbenes, rostedt, mhiramat,
	alexei.starovoitov

Objtool's --ibt option generates .ibt_endbr_seal, which lists
superfluous ENDBR instructions, i.e. those instructions whose
function is never indirectly called.

Overwrite these ENDBR instructions with a 4-byte NOP such that these
functions can never be indirectly called, reducing the number of
viable ENDBR targets in the kernel.
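
In numbers, the 4-byte patch looks like this (the poison value is
visible in the asm/ibt.h hunk below; the ENDBR encoding is from
memory, so treat it as an assumption):

	#define ENDBR64_INSN	0xfa1e0ff3U	/* f3 0f 1e fa                */
	#define ENDBR_POISON	0x001f0f66U	/* 66 0f 1f 00, osp nopl (%rax);
						 * deliberately not a regular
						 * NOP4, so former ENDBR sites
						 * stay recognizable */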

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/um/kernel/um_arch.c           |    4 +++
 arch/x86/Kconfig                   |    9 +++++++-
 arch/x86/include/asm/alternative.h |    1 
 arch/x86/include/asm/ibt.h         |   12 +++++++++++
 arch/x86/kernel/alternative.c      |   39 +++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/module.c           |    8 ++++++-
 scripts/Makefile.lib               |   11 ++++++++++
 scripts/link-vmlinux.sh            |   10 +++++++--
 8 files changed, 90 insertions(+), 4 deletions(-)

--- a/arch/um/kernel/um_arch.c
+++ b/arch/um/kernel/um_arch.c
@@ -424,6 +424,10 @@ void __init check_bugs(void)
 	os_check_bugs();
 }
 
+void apply_ibt_endbr(s32 *start, s32 *end)
+{
+}
+
 void apply_retpolines(s32 *start, s32 *end)
 {
 }
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1873,7 +1873,7 @@ config CC_HAS_IBT
 config X86_KERNEL_IBT
 	prompt "Indirect Branch Tracking"
 	bool
-	depends on X86_64 && CC_HAS_IBT
+	depends on X86_64 && CC_HAS_IBT && STACK_VALIDATION
 	help
 	  Build the kernel with support for Indirect Branch Tracking, a
 	  hardware support course-grain forward-edge Control Flow Integrity
@@ -1881,6 +1881,13 @@ config X86_KERNEL_IBT
 	  an ENDBR instruction, as such, the compiler will instrument the
 	  code with them to make this happen.
 
+	  In addition to building the kernel with IBT, seal all functions that
+	  are not indirect call targets, avoiding them ever becoming one.
+
+	  This requires LTO-like objtool runs and will slow down the build. It
+	  does significantly reduce the number of ENDBR instructions in the
+	  kernel image.
+
 config X86_INTEL_MEMORY_PROTECTION_KEYS
 	prompt "Memory Protection Keys"
 	def_bool y
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -76,6 +76,7 @@ extern int alternatives_patched;
 extern void alternative_instructions(void);
 extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void apply_retpolines(s32 *start, s32 *end);
+extern void apply_ibt_endbr(s32 *start, s32 *end);
 
 struct module;
 
--- a/arch/x86/include/asm/ibt.h
+++ b/arch/x86/include/asm/ibt.h
@@ -33,8 +33,20 @@ static inline __attribute_const__ u32 ge
 	return endbr;
 }
 
+static inline __attribute_const__ u32 gen_endbr_poison(void)
+{
+	/*
+	 * 4 byte NOP that isn't NOP4 (in fact it is OSP NOP3), such that it
+	 * will be unique to (former) ENDBR sites.
+	 */
+	return 0x001f0f66; /* osp nopl (%rax) */
+}
+
 static inline bool is_endbr(u32 val)
 {
+	if (val == gen_endbr_poison())
+		return true;
+
 	val &= ~0x01000000U; /* ENDBR32 -> ENDBR64 */
 	return val == gen_endbr();
 }
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -115,6 +115,7 @@ static void __init_or_module add_nops(vo
 }
 
 extern s32 __retpoline_sites[], __retpoline_sites_end[];
+extern s32 __ibt_endbr_seal[], __ibt_endbr_seal_end[];
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
 extern s32 __smp_locks[], __smp_locks_end[];
 void text_poke_early(void *addr, const void *opcode, size_t len);
@@ -512,6 +513,42 @@ void __init_or_module noinline apply_ret
 
 #endif /* CONFIG_RETPOLINE && CONFIG_STACK_VALIDATION */
 
+#ifdef CONFIG_X86_KERNEL_IBT
+
+/*
+ * Generated by: objtool --ibt
+ */
+void __init_or_module noinline apply_ibt_endbr(s32 *start, s32 *end)
+{
+	s32 *s;
+
+	for (s = start; s < end; s++) {
+		u32 endbr, poison = gen_endbr_poison();
+		void *addr = (void *)s + *s;
+
+		if (WARN_ON_ONCE(get_kernel_nofault(endbr, addr)))
+			continue;
+
+		if (WARN_ON_ONCE(!is_endbr(endbr)))
+			continue;
+
+		DPRINTK("ENDBR at: %pS (%px)", addr, addr);
+
+		/*
+		 * When we have IBT, the lack of ENDBR will trigger #CP
+		 */
+		DUMP_BYTES(((u8*)addr), 4, "%px: orig: ", addr);
+		DUMP_BYTES(((u8*)&poison), 4, "%px: repl: ", addr);
+		text_poke_early(addr, &poison, 4);
+	}
+}
+
+#else
+
+void __init_or_module noinline apply_ibt_endbr(s32 *start, s32 *end) { }
+
+#endif /* CONFIG_X86_KERNEL_IBT */
+
 #ifdef CONFIG_SMP
 static void alternatives_smp_lock(const s32 *start, const s32 *end,
 				  u8 *text, u8 *text_end)
@@ -833,6 +870,8 @@ void __init alternative_instructions(voi
 	 */
 	apply_alternatives(__alt_instructions, __alt_instructions_end);
 
+	apply_ibt_endbr(__ibt_endbr_seal, __ibt_endbr_seal_end);
+
 #ifdef CONFIG_SMP
 	/* Patch to UP if other cpus not imminent. */
 	if (!noreplace_smp && (num_present_cpus() == 1 || setup_max_cpus <= 1)) {
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -253,7 +253,7 @@ int module_finalize(const Elf_Ehdr *hdr,
 {
 	const Elf_Shdr *s, *text = NULL, *alt = NULL, *locks = NULL,
 		*para = NULL, *orc = NULL, *orc_ip = NULL,
-		*retpolines = NULL;
+		*retpolines = NULL, *ibt_endbr = NULL;
 	char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
 
 	for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) {
@@ -271,6 +271,8 @@ int module_finalize(const Elf_Ehdr *hdr,
 			orc_ip = s;
 		if (!strcmp(".retpoline_sites", secstrings + s->sh_name))
 			retpolines = s;
+		if (!strcmp(".ibt_endbr_seal", secstrings + s->sh_name))
+			ibt_endbr = s;
 	}
 
 	/*
@@ -290,6 +292,10 @@ int module_finalize(const Elf_Ehdr *hdr,
 		void *aseg = (void *)alt->sh_addr;
 		apply_alternatives(aseg, aseg + alt->sh_size);
 	}
+	if (ibt_endbr) {
+		void *iseg = (void *)ibt_endbr->sh_addr;
+		apply_ibt_endbr(iseg, iseg + ibt_endbr->sh_size);
+	}
 	if (locks && text) {
 		void *lseg = (void *)locks->sh_addr;
 		void *tseg = (void *)text->sh_addr;
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -544,6 +544,7 @@ objtool := $(objtree)/tools/objtool/objt
 objtool_args =								\
 	$(if $(CONFIG_UNWINDER_ORC),orc generate,check)			\
 	$(if $(part-of-module), --module)				\
+	$(if $(CONFIG_X86_KERNEL_IBT), --lto --ibt)	                \
 	$(if $(CONFIG_FRAME_POINTER),, --no-fp)				\
 	$(if $(CONFIG_GCOV_KERNEL)$(CONFIG_LTO_CLANG), --no-unreachable)\
 	$(if $(CONFIG_RETPOLINE), --retpoline)				\
@@ -567,6 +568,16 @@ $(obj)/%.o: objtool-enabled :=
 $(obj)/%.lto.o: objtool-enabled = y
 $(obj)/%.lto.o: part-of-module := y
 
+else ifdef CONFIG_X86_KERNEL_IBT
+
+# Skip objtool on individual files
+$(obj)/%.o: objtool-enabled :=
+
+# instead run objtool on the module as a whole, right before
+# the final link pass with the linker script.
+%.ko: objtool-enabled = y
+%.ko: part-of-module := y
+
 else
 
 # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -108,7 +108,9 @@ objtool_link()
 	local objtoolcmd;
 	local objtoolopt;
 
-	if is_enabled CONFIG_LTO_CLANG && is_enabled CONFIG_STACK_VALIDATION; then
+	if is_enabled CONFIG_STACK_VALIDATION && \
+	   ( is_enabled CONFIG_LTO_CLANG || is_enabled CONFIG_X86_KERNEL_IBT ); then
+
 		# Don't perform vmlinux validation unless explicitly requested,
 		# but run objtool on vmlinux.o now that we have an object file.
 		if is_enabled CONFIG_UNWINDER_ORC; then
@@ -117,6 +119,10 @@ objtool_link()
 
 		objtoolopt="${objtoolopt} --lto"
 
+		if is_enabled CONFIG_X86_KERNEL_IBT; then
+			objtoolopt="${objtoolopt} --ibt"
+		fi
+
 		if is_enabled CONFIG_FTRACE_MCOUNT_USE_OBJTOOL; then
 			objtoolopt="${objtoolopt} --mcount"
 		fi
@@ -168,7 +174,7 @@ vmlinux_link()
 	# skip output file argument
 	shift
 
-	if is_enabled CONFIG_LTO_CLANG; then
+	if is_enabled CONFIG_LTO_CLANG || is_enabled CONFIG_X86_KERNEL_IBT; then
 		# Use vmlinux.o instead of performing the slow LTO link again.
 		objs=vmlinux.o
 		libs=




* Re: [PATCH v3 25/39] x86/bugs: Disable Retpoline when IBT
  2022-03-03 11:23 ` [PATCH v3 25/39] x86/bugs: Disable Retpoline when IBT Peter Zijlstra
@ 2022-03-04 17:39   ` Josh Poimboeuf
  0 siblings, 0 replies; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 17:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:46PM +0100, Peter Zijlstra wrote:
> Retpoline and IBT are mutually exclusive. IBT relies on indirect
> branches (JMP/CALL *%reg) while retpoline avoids them by design.
> 
> Demote to LFENCE on IBT enabled hardware.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kernel/cpu/bugs.c |   13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -891,6 +891,7 @@ static void __init spectre_v2_select_mit
>  {
>  	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
>  	enum spectre_v2_mitigation mode = SPECTRE_V2_NONE;
> +	bool silent_demote = false;
>  
>  	/*
>  	 * If the CPU is not affected and the command line mode is NONE or AUTO
> @@ -906,6 +907,7 @@ static void __init spectre_v2_select_mit
>  
>  	case SPECTRE_V2_CMD_FORCE:
>  	case SPECTRE_V2_CMD_AUTO:
> +		silent_demote = true;
>  		if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
>  			mode = SPECTRE_V2_IBRS_ENHANCED;
>  			/* Force it so VMEXIT will restore correctly */
> @@ -938,6 +940,7 @@ static void __init spectre_v2_select_mit
>  	retpoline_amd:
>  		if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) {
>  			pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n");
> +			silent_demote = false;
>  			goto retpoline_generic;
>  		}
>  		mode = SPECTRE_V2_RETPOLINE_AMD;
> @@ -947,6 +950,16 @@ static void __init spectre_v2_select_mit
>  	retpoline_generic:
>  		mode = SPECTRE_V2_RETPOLINE_GENERIC;
>  		setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
> +
> +		/*
> +		 * ROP defeats IBT, make sure not to use Retpolines and IBT together.
> +		 */
> +		if (HAS_KERNEL_IBT && cpu_feature_enabled(X86_FEATURE_IBT)) {
> +			if (!silent_demote)
> +				pr_warn("Spectre mitigation: Switching to LFENCE due to IBT\n");
> +			mode = SPECTRE_V2_RETPOLINE_AMD;
> +			setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD);
> +		}

This is better.  But AFAIK, the 'silent_demote' case should only happen
on a hypothetically weird/broken virt setup, right?  Why silence the
warning?  It still seems legit.  If you have IBT on non-eIBRS Intel, and
you get silently demoted from retpoline to lfence, it presumably opens
up some Spectre v2 attack vectors, despite IBT's implicit promise of
providing *more* protection overall.

And actually, after thinking about it, my most preferred approach would
be to do the converse of this patch: only enable IBT if
!X86_FEATURE_RETPOLINE.  And do a real warning, like "your setup is
broken, IBT is disabled".  Maybe even make it a real WARN() so we could
find out if it's a real possibility.  Since this feature is much newer
than retpoline, that would probably be the simplest and least surprising
option.

-- 
Josh



* Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice
  2022-03-03 11:23 ` [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice Peter Zijlstra
@ 2022-03-04 17:51   ` Josh Poimboeuf
  2022-03-04 19:48     ` Peter Zijlstra
  0 siblings, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 17:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> +
> +	addq $16, %rsp
> +	ANNOTATE_INTRA_FUNCTION_CALL
> +	call .Ldo_rop
> +	int3
> +.Ldo_rop:
> +	mov %rdi, (%rsp)
> +	UNWIND_HINT_FUNC
> +	RET

Why the int3?

-- 
Josh



* Re: [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling
  2022-03-03 11:23 ` [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling Peter Zijlstra
@ 2022-03-04 17:57   ` Josh Poimboeuf
  2022-03-04 20:38     ` Peter Zijlstra
  2022-03-04 18:07   ` Josh Poimboeuf
  1 sibling, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 17:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:43PM +0100, Peter Zijlstra wrote:
> +bool ibt_selftest(void)
> +{
> +	unsigned long ret;
> +
> +	asm ("	lea ibt_selftest_ip(%%rip), %%rax\n\t"
> +	     ANNOTATE_RETPOLINE_SAFE
> +	     "	jmp *%%rax\n\t"
> +	     ASM_REACHABLE
> +	     ANNOTATE_NOENDBR
> +	     "ibt_selftest_ip: nop\n\t"

Maybe pedantic, but I find the annotations to be less surprising if they
come after the label:

	"ibt_selftest_ip:\n\t"
	ASM_REACHABLE
	ANNOTATE_NOENDBR
	"nop\n\t"

-- 
Josh



* Re: [PATCH v3 23/39] x86/alternative: Simplify int3_selftest_ip
  2022-03-03 11:23 ` [PATCH v3 23/39] x86/alternative: Simplify int3_selftest_ip Peter Zijlstra
@ 2022-03-04 18:02   ` Josh Poimboeuf
  2022-03-04 18:09     ` Josh Poimboeuf
  0 siblings, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 18:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:44PM +0100, Peter Zijlstra wrote:
> @@ -733,7 +733,7 @@ int3_exception_notify(struct notifier_bl
>  	if (val != DIE_INT3)
>  		return NOTIFY_DONE;
>  
> -	if (regs->ip - INT3_INSN_SIZE != int3_selftest_ip)
> +	if (regs->ip - INT3_INSN_SIZE != (unsigned long)&int3_selftest_ip)
>  		return NOTIFY_DONE;
>  
>  	int3_emulate_call(regs, (unsigned long)&int3_magic);
> @@ -757,14 +757,7 @@ static void __init int3_selftest(void)
>  	 * then trigger the INT3, padded with NOPs to match a CALL instruction
>  	 * length.
>  	 */
> -	asm volatile ("1: int3; nop; nop; nop; nop\n\t"
> -		      ".pushsection .init.data,\"aw\"\n\t"
> -		      ".align " __ASM_SEL(4, 8) "\n\t"
> -		      ".type int3_selftest_ip, @object\n\t"
> -		      ".size int3_selftest_ip, " __ASM_SEL(4, 8) "\n\t"
> -		      "int3_selftest_ip:\n\t"
> -		      __ASM_SEL(.long, .quad) " 1b\n\t"
> -		      ".popsection\n\t"
> +	asm volatile ("int3_selftest_ip: int3; nop; nop; nop; nop\n\t"
>  		      : ASM_CALL_CONSTRAINT
>  		      : __ASM_SEL_RAW(a, D) (&val)
>  		      : "memory");

Hm, why doesn't this need ANNOTATE_NOENDBR?

-- 
Josh



* Re: [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling
  2022-03-03 11:23 ` [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling Peter Zijlstra
  2022-03-04 17:57   ` Josh Poimboeuf
@ 2022-03-04 18:07   ` Josh Poimboeuf
  2022-03-04 20:39     ` Peter Zijlstra
  1 sibling, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 18:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:43PM +0100, Peter Zijlstra wrote:
> @@ -310,6 +311,7 @@ void machine_kexec(struct kimage *image)
>  	/* Interrupts aren't acceptable while we reboot */
>  	local_irq_disable();
>  	hw_breakpoint_disable();
> +	cet_disable();
>  
>  	if (image->preserve_context) {
>  #ifdef CONFIG_X86_IO_APIC
> --- a/arch/x86/kernel/relocate_kernel_64.S
> +++ b/arch/x86/kernel/relocate_kernel_64.S
> @@ -115,6 +115,14 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_ma
>  	pushq   %rdx
>  
>  	/*
> +	 * Clear X86_CR4_CET (if it was set) such that we can clear CR0_WP
> +	 * below.
> +	 */
> +	movq	%cr4, %rax
> +	andq	$~(X86_CR4_CET), %rax
> +	movq	%rax, %cr4
> +
> +	/*
>  	 * Set cr0 to a known state:
>  	 *  - Paging enabled
>  	 *  - Alignment check disabled

This probably belongs in a separate patch...

-- 
Josh



* Re: [PATCH v3 23/39] x86/alternative: Simplify int3_selftest_ip
  2022-03-04 18:02   ` Josh Poimboeuf
@ 2022-03-04 18:09     ` Josh Poimboeuf
  0 siblings, 0 replies; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 18:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 10:02:23AM -0800, Josh Poimboeuf wrote:
> On Thu, Mar 03, 2022 at 12:23:44PM +0100, Peter Zijlstra wrote:
> > @@ -733,7 +733,7 @@ int3_exception_notify(struct notifier_bl
> >  	if (val != DIE_INT3)
> >  		return NOTIFY_DONE;
> >  
> > -	if (regs->ip - INT3_INSN_SIZE != int3_selftest_ip)
> > +	if (regs->ip - INT3_INSN_SIZE != (unsigned long)&int3_selftest_ip)
> >  		return NOTIFY_DONE;
> >  
> >  	int3_emulate_call(regs, (unsigned long)&int3_magic);
> > @@ -757,14 +757,7 @@ static void __init int3_selftest(void)
> >  	 * then trigger the INT3, padded with NOPs to match a CALL instruction
> >  	 * length.
> >  	 */
> > -	asm volatile ("1: int3; nop; nop; nop; nop\n\t"
> > -		      ".pushsection .init.data,\"aw\"\n\t"
> > -		      ".align " __ASM_SEL(4, 8) "\n\t"
> > -		      ".type int3_selftest_ip, @object\n\t"
> > -		      ".size int3_selftest_ip, " __ASM_SEL(4, 8) "\n\t"
> > -		      "int3_selftest_ip:\n\t"
> > -		      __ASM_SEL(.long, .quad) " 1b\n\t"
> > -		      ".popsection\n\t"
> > +	asm volatile ("int3_selftest_ip: int3; nop; nop; nop; nop\n\t"
> >  		      : ASM_CALL_CONSTRAINT
> >  		      : __ASM_SEL_RAW(a, D) (&val)
> >  		      : "memory");
> 
> Hm, why doesn't this need ANNOTATE_NOENDBR?

Never mind, I see it's added later... though I still have the same
comment about the ANNOTATE_NOENDBR coming after the label.

-- 
Josh



* Re: [PATCH v3 37/39] objtool: Optionally WARN about unused ANNOTATE_NOENDBR
  2022-03-03 11:23 ` [PATCH v3 37/39] objtool: Optionally WARN about unused ANNOTATE_NOENDBR Peter Zijlstra
@ 2022-03-04 18:27   ` Josh Poimboeuf
  0 siblings, 0 replies; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 18:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:58PM +0100, Peter Zijlstra wrote:
>  bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
>       lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
> -     ibt;
> +     ibt, ibt_warn;
>  
>  static const char * const check_usage[] = {
>  	"objtool check [<options>] file.o",
> @@ -49,6 +49,7 @@ const struct option check_options[] = {
>  	OPT_BOOLEAN('S', "sls", &sls, "validate straight-line-speculation"),
>  	OPT_BOOLEAN(0, "dry-run", &dryrun, "don't write the modifications"),
>  	OPT_BOOLEAN(0, "ibt", &ibt, "validate ENDBR placement"),
> +	OPT_BOOLEAN(0, "ibt-warn", &ibt_warn, "warn about unused ANNOTATE_ENDBR"),

"ibt-warn" doesn't really describe it.  Most options warn.

"ibt-unused-annotation"?

>  	OPT_END(),
>  };
>  
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -1814,6 +1814,19 @@ static void set_func_state(struct cfi_st
>  	state->stack_size = initial_func_cfi.cfa.offset;
>  }
>  
> +static bool insn_is_endbr(struct instruction *insn)
> +{
> +	if (insn->type == INSN_ENDBR)
> +		return true;
> +
> +	if (insn->noendbr) {
> +		insn->noendbr_hit = 1;
> +		return true;
> +	}
> +
> +	return false;
> +}

Ew...

> @@ -3695,6 +3707,31 @@ static int validate_ibt(struct objtool_f
>  		}
>  	}
>  
> +	if (ibt_warn) {
> +		struct symbol *hypercall_page = find_symbol_by_name(file->elf, "hypercall_page");
> +		struct instruction *insn;
> +
> +		for_each_insn(file, insn) {
> +			if (!insn->noendbr || insn->noendbr_hit)
> +				continue;
> +
> +			if (hypercall_page) {
> +				/*
> +				 * The Xen hypercall page contains many
> +				 * hypercalls (and unused slots) that are never
> +				 * indirectly called. Still every slot has an
> +				 * annotation. Suppress complaints.
> +				 */
> +				if (insn->sec == hypercall_page->sec &&
> +				    insn->offset >= hypercall_page->offset &&
> +				    insn->offset <  hypercall_page->offset + hypercall_page->len)
> +					continue;
> +			}
> +
> +			WARN_FUNC("unused ANNOTATE_NOENDBR", insn->sec, insn->offset);
> +		}
> +	}

Ewww... :-/

I can see how this option was useful for development.  But I'm never
going to use it.  Are you?  I'd much rather just drop it.

-- 
Josh
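
For context, as proposed the check stacks on top of --ibt, so a run
would look like (hypothetical invocation):

	./tools/objtool/objtool check --ibt --ibt-warn vmlinux.o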



* Re: [PATCH v3 05/39] x86/ibt: Add ANNOTATE_NOENDBR
  2022-03-03 11:23 ` [PATCH v3 05/39] x86/ibt: Add ANNOTATE_NOENDBR Peter Zijlstra
@ 2022-03-04 18:59   ` Josh Poimboeuf
  2022-03-04 20:22     ` Peter Zijlstra
  0 siblings, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 18:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:26PM +0100, Peter Zijlstra wrote:
> In order to have objtool warn about code references to !ENDBR
> instruction, we need an annotation to allow this for non-control-flow
> instances -- consider text range checks, text patching, or return
> trampolines etc.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> ---
>  include/linux/objtool.h |   16 ++++++++++++++++
>  1 file changed, 16 insertions(+)

Also needs to be copied over to tools/include/linux/objtool.h to avoid
the sync warning.

-- 
Josh
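
The annotation itself follows the usual objtool pattern: record the
address of the next instruction in a discard section that objtool reads
and the linker later throws away. A plausible sketch of the asm-string
flavour (section name and body from memory, not copied from the patch):

	#define ANNOTATE_NOENDBR					\
		"986: \n\t"						\
		".pushsection .discard.noendbr\n\t"			\
		_ASM_PTR " 986b\n\t"					\
		".popsection\n\t"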



* Re: [PATCH v3 00/39] x86: Kernel IBT
  2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
                   ` (38 preceding siblings ...)
  2022-03-03 11:24 ` [PATCH v3 39/39] x86/alternative: Use .ibt_endbr_seal to seal indirect calls Peter Zijlstra
@ 2022-03-04 19:09 ` Josh Poimboeuf
  2022-03-04 20:22   ` Peter Zijlstra
  2022-03-04 21:39   ` Peter Zijlstra
  39 siblings, 2 replies; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 19:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:21PM +0100, Peter Zijlstra wrote:
> Hi, another week, another series.
> 
> Since last time:
> 
>  - fixed and tested kexec (redgecomb)
>  - s/4*HAS_KERNEL_IBT/ENDBR_INSN_SIZE/ (jpoimboe)
>  - re-arranged Xen patches to avoid churn (andyhpp)
>  - folded IBT_SEAL Kconfig and objtool options (jpoimboe)
>  - dropped direct call/jmp rewrite from objtool (jpoimboe)
>  - dropped UD1 poison (jpoimboe)
>  - fixed kprobe selftests (masami,naveen)
>  - fixed ftrace selftests (rostedt)
>  - simplified CET/INT3 selftests (jpoimboe)
>  - boot time msg on IBT (kees)
>  - objtool WARN_FUNC sym+off fallback (jpoimboe)
>  - picked up tags for unchanged patches
>  - probably more
> 
> Supposedly clang-14-rc2 will work on this series, I'll validate the moment the
> Debian package gets updated.
> 
> Patches go on top of tip/master + arm64/for-next/linkage. Also available here:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/wip.ibt

I'm getting some warnings with CONFIG_X86_KERNEL_IBT=n:

  arch/x86/entry/entry_64.o: warning: objtool: irq_entries_start()+0x7: unreachable instruction
  arch/x86/kernel/ftrace_64.o: warning: objtool: return_to_handler()+0x2a: unreachable instruction

And a warning with CONFIG_X86_KERNEL_IBT=y:

  vmlinux.o: warning: objtool: .text+0xaf0: unreachable instruction

And if I remove the per-file limiting on "unreachable instruction"
warnings, I get a boat-load more warnings for vmlinux.o.

The last two patches (IBT sealing) aren't going to be viable until all
the "unreachable instruction" warnings get cleaned up, because that
means we have missing coverage.

-- 
Josh



* Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice
  2022-03-04 17:51   ` Josh Poimboeuf
@ 2022-03-04 19:48     ` Peter Zijlstra
  2022-03-04 21:03       ` Josh Poimboeuf
  2022-03-04 23:08       ` David Laight
  0 siblings, 2 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-04 19:48 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > +
> > +	addq $16, %rsp
> > +	ANNOTATE_INTRA_FUNCTION_CALL
> > +	call .Ldo_rop
> > +	int3
> > +.Ldo_rop:
> > +	mov %rdi, (%rsp)
> > +	UNWIND_HINT_FUNC
> > +	RET
> 
> Why the int3?

Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
without it it's just too close to a retpoline and it doesn't feel right.


* Re: [PATCH v3 00/39] x86: Kernel IBT
  2022-03-04 19:09 ` [PATCH v3 00/39] x86: Kernel IBT Josh Poimboeuf
@ 2022-03-04 20:22   ` Peter Zijlstra
  2022-03-04 21:39   ` Peter Zijlstra
  1 sibling, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-04 20:22 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 11:09:58AM -0800, Josh Poimboeuf wrote:

> I'm getting some warnings with CONFIG_X86_KERNEL_IBT=n:
> 
>   arch/x86/entry/entry_64.o: warning: objtool: irq_entries_start()+0x7: unreachable instruction
>   arch/x86/kernel/ftrace_64.o: warning: objtool: return_to_handler()+0x2a: unreachable instruction

Urgh, lemme go chase that.

> And a warning with CONFIG_X86_KERNEL_IBT=y:
> 
>   vmlinux.o: warning: objtool: .text+0xaf0: unreachable instruction

This is that weak symbol issue :/ We talked about it on IRC, but I've
not yet gotten around to fixing it.

This is mostly a pre-existing issue, only uncovered because we run on
vmlinux more.


* Re: [PATCH v3 05/39] x86/ibt: Add ANNOTATE_NOENDBR
  2022-03-04 18:59   ` Josh Poimboeuf
@ 2022-03-04 20:22     ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-04 20:22 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 10:59:53AM -0800, Josh Poimboeuf wrote:
> On Thu, Mar 03, 2022 at 12:23:26PM +0100, Peter Zijlstra wrote:
> > In order to have objtool warn about code references to !ENDBR
> > instruction, we need an annotation to allow this for non-control-flow
> > instances -- consider text range checks, text patching, or return
> > trampolines etc.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Reviewed-by: Kees Cook <keescook@chromium.org>
> > ---
> >  include/linux/objtool.h |   16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> 
> Also needs to be copied over to tools/include/linux/objtool.h to avoid
> the sync warning.

Urgh yeah, I keep meaning to add that bit and then immediately forget
about it :/


* Re: [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling
  2022-03-04 17:57   ` Josh Poimboeuf
@ 2022-03-04 20:38     ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-04 20:38 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 09:57:33AM -0800, Josh Poimboeuf wrote:
> On Thu, Mar 03, 2022 at 12:23:43PM +0100, Peter Zijlstra wrote:
> > +bool ibt_selftest(void)
> > +{
> > +	unsigned long ret;
> > +
> > +	asm ("	lea ibt_selftest_ip(%%rip), %%rax\n\t"
> > +	     ANNOTATE_RETPOLINE_SAFE
> > +	     "	jmp *%%rax\n\t"
> > +	     ASM_REACHABLE
> > +	     ANNOTATE_NOENDBR
> > +	     "ibt_selftest_ip: nop\n\t"
> 
> Maybe pedantic, but I find the annotations to be less surprising if they
> come after the label:
> 
> 	"ibt_selftest_ip:\n\t"
> 	ASM_REACHABLE
> 	ANNOTATE_NOENDBR
> 	"nop\n\t"

Done.
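
For reference, the sequence then ends up shaped like so (a sketch of the
reordering only; the output constraint is guessed from the quoted
"unsigned long ret", not copied from the final patch):

	asm ("	lea ibt_selftest_ip(%%rip), %%rax\n\t"
	     ANNOTATE_RETPOLINE_SAFE
	     "	jmp *%%rax\n\t"
	     "ibt_selftest_ip:\n\t"
	     ASM_REACHABLE
	     ANNOTATE_NOENDBR
	     "	nop\n\t"
	     : "=a" (ret) : : "memory");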


* Re: [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling
  2022-03-04 18:07   ` Josh Poimboeuf
@ 2022-03-04 20:39     ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-04 20:39 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 10:07:20AM -0800, Josh Poimboeuf wrote:
> On Thu, Mar 03, 2022 at 12:23:43PM +0100, Peter Zijlstra wrote:
> > @@ -310,6 +311,7 @@ void machine_kexec(struct kimage *image)
> >  	/* Interrupts aren't acceptable while we reboot */
> >  	local_irq_disable();
> >  	hw_breakpoint_disable();
> > +	cet_disable();
> >  
> >  	if (image->preserve_context) {
> >  #ifdef CONFIG_X86_IO_APIC
> > --- a/arch/x86/kernel/relocate_kernel_64.S
> > +++ b/arch/x86/kernel/relocate_kernel_64.S
> > @@ -115,6 +115,14 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_ma
> >  	pushq   %rdx
> >  
> >  	/*
> > +	 * Clear X86_CR4_CET (if it was set) such that we can clear CR0_WP
> > +	 * below.
> > +	 */
> > +	movq	%cr4, %rax
> > +	andq	$~(X86_CR4_CET), %rax
> > +	movq	%rax, %cr4
> > +
> > +	/*
> >  	 * Set cr0 to a known state:
> >  	 *  - Paging enabled
> >  	 *  - Alignment check disabled
> 
> This probably belongs in a separate patch...

An x86/ibt,kexec patch has just been born...


* Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice
  2022-03-04 19:48     ` Peter Zijlstra
@ 2022-03-04 21:03       ` Josh Poimboeuf
  2022-03-04 21:44         ` Peter Zijlstra
  2022-03-04 23:08       ` David Laight
  1 sibling, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 21:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 08:48:53PM +0100, Peter Zijlstra wrote:
> On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> > On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > > +
> > > +	addq $16, %rsp
> > > +	ANNOTATE_INTRA_FUNCTION_CALL
> > > +	call .Ldo_rop
> > > +	int3
> > > +.Ldo_rop:
> > > +	mov %rdi, (%rsp)
> > > +	UNWIND_HINT_FUNC
> > > +	RET
> > 
> > Why the int3?
> 
> Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
> without it it's just too close to a retpoline and it doesn't feel right.

Um, it *is* a retpoline :-)

Can you just use the RETPOLINE macro?  Along with a comment stating why
it can't just do a JMP_NOSPEC?

-- 
Josh



* Re: [PATCH v3 00/39] x86: Kernel IBT
  2022-03-04 19:09 ` [PATCH v3 00/39] x86: Kernel IBT Josh Poimboeuf
  2022-03-04 20:22   ` Peter Zijlstra
@ 2022-03-04 21:39   ` Peter Zijlstra
  1 sibling, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-04 21:39 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 11:09:58AM -0800, Josh Poimboeuf wrote:
> On Thu, Mar 03, 2022 at 12:23:21PM +0100, Peter Zijlstra wrote:
> > Hi, another week, another series.
> > 
> > Since last time:
> > 
> >  - fixed and tested kexec (redgecomb)
> >  - s/4*HAS_KERNEL_IBT/ENDBR_INSN_SIZE/ (jpoimboe)
> >  - re-arranged Xen patches to avoid churn (andyhpp)
> >  - folded IBT_SEAL Kconfig and objtool options (jpoimboe)
> >  - dropped direct call/jmp rewrite from objtool (jpoimboe)
> >  - dropped UD1 poison (jpoimboe)
> >  - fixed kprobe selftests (masami,naveen)
> >  - fixed ftrace selftests (rostedt)
> >  - simplified CET/INT3 selftests (jpoimboe)
> >  - boot time msg on IBT (kees)
> >  - objtool WARN_FUNC sym+off fallback (jpoimboe)
> >  - picked up tags for unchanged patches
> >  - probably more
> > 
> > Supposedly clang-14-rc2 will work on this series, I'll validate the moment the
> > Debian package gets updated.
> > 
> > Patches go on top of tip/master + arm64/for-next/linkage. Also available here:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/wip.ibt
> 
> I'm getting some warnings with CONFIG_X86_KERNEL_IBT=n:
> 
>   arch/x86/entry/entry_64.o: warning: objtool: irq_entries_start()+0x7: unreachable instruction
>   arch/x86/kernel/ftrace_64.o: warning: objtool: return_to_handler()+0x2a: unreachable instruction

Blergh, those are INT3 instructions: the first is the LDT stub padding,
while the second is that INT3 you asked about earlier.

I can mark them all using SLS-style rules, but that then triggers:

arch/x86/kernel/reboot.o: warning: objtool: native_machine_emergency_restart()+0x8f: BUG: why am I validating an ignored function?

which does horrible things on purpose to tickle a triple fault in order
to reboot the machine.

Perhaps we should ignore INT3 by default, just like NOP?


* Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice
  2022-03-04 21:03       ` Josh Poimboeuf
@ 2022-03-04 21:44         ` Peter Zijlstra
  2022-03-04 22:05           ` Josh Poimboeuf
  0 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-04 21:44 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 01:03:34PM -0800, Josh Poimboeuf wrote:
> On Fri, Mar 04, 2022 at 08:48:53PM +0100, Peter Zijlstra wrote:
> > On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> > > On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > > > +
> > > > +	addq $16, %rsp
> > > > +	ANNOTATE_INTRA_FUNCTION_CALL
> > > > +	call .Ldo_rop
> > > > +	int3
> > > > +.Ldo_rop:
> > > > +	mov %rdi, (%rsp)
> > > > +	UNWIND_HINT_FUNC
> > > > +	RET
> > > 
> > > Why the int3?
> > 
> > Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
> > without it it's just too close to a retpoline and it doesn't feel right.
> 
> Um, it *is* a retpoline :-)
> 
> Can you just use the RETPOLINE macro?  Along with a comment stating why
> it can't just do a JMP_NOSPEC?

There is no RETPOLINE macro; or rather it is completely contained in
lib/retpoline.S and I'd sorta like to keep it that way.

That said, I can stick a comment on.


* Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice
  2022-03-04 21:44         ` Peter Zijlstra
@ 2022-03-04 22:05           ` Josh Poimboeuf
  0 siblings, 0 replies; 64+ messages in thread
From: Josh Poimboeuf @ 2022-03-04 22:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

On Fri, Mar 04, 2022 at 10:44:54PM +0100, Peter Zijlstra wrote:
> On Fri, Mar 04, 2022 at 01:03:34PM -0800, Josh Poimboeuf wrote:
> > On Fri, Mar 04, 2022 at 08:48:53PM +0100, Peter Zijlstra wrote:
> > > On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> > > > On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > > > > +
> > > > > +	addq $16, %rsp
> > > > > +	ANNOTATE_INTRA_FUNCTION_CALL
> > > > > +	call .Ldo_rop
> > > > > +	int3
> > > > > +.Ldo_rop:
> > > > > +	mov %rdi, (%rsp)
> > > > > +	UNWIND_HINT_FUNC
> > > > > +	RET
> > > > 
> > > > Why the int3?
> > > 
> > > Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
> > > without it it's just too close to a retpoline and it doesn't feel right.
> > 
> > Um, it *is* a retpoline :-)
> > 
> > Can you just use the RETPOLINE macro?  Along with a comment stating why
> > it can't just do a JMP_NOSPEC?
> 
> There is no RETPOLINE macro; or rather it is completely contained in
> lib/retpoline.S and I'd sorta like to keep it that way.
> 
> That said, I can stick a comment on.

The only reason it's in retpoline.S is because nobody else needed it.

It just seems weird to reinvent the wheel, especially with a slightly
different method of stopping speculation.

And I could envision other cases where we might want an unconditional
retpoline.

Your call though...

-- 
Josh
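
For comparison, the thunk in arch/x86/lib/retpoline.S parks speculation
in a pause/lfence loop rather than behind an int3; roughly (from memory,
modulo the macro plumbing that stamps one out per register):

		.align 16
	__x86_indirect_thunk_rax:
		call	.Ldo_rop
	.Lspec_trap:
		pause			/* speculation lands here ...  */
		lfence			/* ... and is stopped dead     */
		jmp	.Lspec_trap
	.Ldo_rop:
		mov	%rax, (%rsp)	/* overwrite the return address */
		ret			/* "returns" to *%rax */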



* RE: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice
  2022-03-04 19:48     ` Peter Zijlstra
  2022-03-04 21:03       ` Josh Poimboeuf
@ 2022-03-04 23:08       ` David Laight
  1 sibling, 0 replies; 64+ messages in thread
From: David Laight @ 2022-03-04 23:08 UTC (permalink / raw)
  To: 'Peter Zijlstra', Josh Poimboeuf
  Cc: x86, joao, hjl.tools, andrew.cooper3, linux-kernel, ndesaulniers,
	keescook, samitolvanen, mark.rutland, alyssa.milburn, mbenes,
	rostedt, mhiramat, alexei.starovoitov

From: Peter Zijlstra
> Sent: 04 March 2022 19:49
> 
> On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> > On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > > +
> > > +	addq $16, %rsp
> > > +	ANNOTATE_INTRA_FUNCTION_CALL
> > > +	call .Ldo_rop
> > > +	int3
> > > +.Ldo_rop:
> > > +	mov %rdi, (%rsp)
> > > +	UNWIND_HINT_FUNC
> > > +	RET
> >
> > Why the int3?
> 
> Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
> without it it's just too close to a retpoline and it doesn't feel right.

Isn't 'jmps .' good enough for a speculation trap?
I'm sure there is a potential issue using 'int 3' because
it is a slow instruction and might take some time to abort.

I actually remember something from a very old Intel doc that
told you not to mix code and data because you didn't want to
'accidentally' execute something like 'atan'.
I can't remember the full context - but it may have been
speculatively executing code after unconditional jumps!
There were certainly bigger problems because the cpu at that
time wouldn't abort the atan - so you had to wait for it to
finish.

I suspect you do need something between the call and label.
The sequence:
	call 1f
1:	pop %rax
is used to get the %pc (especially on 32bit) and is detected
so that it doesn't mess up the return stack.
So you probably want to avoid a call to the next instruction.

	David




* Re: [PATCH v3 19/39] x86/ibt,kprobes: Cure sym+0 equals fentry woes
  2022-03-03 11:23 ` [PATCH v3 19/39] x86/ibt,kprobes: Cure sym+0 equals fentry woes Peter Zijlstra
@ 2022-03-07  9:44   ` Masami Hiramatsu
  0 siblings, 0 replies; 64+ messages in thread
From: Masami Hiramatsu @ 2022-03-07  9:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, joao, hjl.tools, jpoimboe, andrew.cooper3, linux-kernel,
	ndesaulniers, keescook, samitolvanen, mark.rutland,
	alyssa.milburn, mbenes, rostedt, mhiramat, alexei.starovoitov

On Thu, 03 Mar 2022 12:23:40 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> Rework _kprobe_addr() to only ever do 2 kallsyms calls and require
> only a single architecture/weak function -- although currently powerpc
> still uses 2.

OK, and we need to note here that if the kernel enables
X86_KERNEL_IBT, the kprobe address may be sym+4 because kprobes
will skip the ENDBR.
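
For illustration, a module doing the usual register_kprobe() at sym+0
would then see the adjusted address reflected back (hypothetical target
symbol; sketch only):

	#include <linux/module.h>
	#include <linux/kprobes.h>

	static struct kprobe kp = {
		.symbol_name	= "do_sys_open",	/* hypothetical target */
	};

	static int __init demo_init(void)
	{
		int ret = register_kprobe(&kp);

		if (ret < 0)
			return ret;

		/* With X86_KERNEL_IBT=y and the target starting with an
		 * ENDBR, kp.addr now points at do_sys_open+4, not +0. */
		pr_info("probe planted at %pS\n", kp.addr);
		return 0;
	}

	static void __exit demo_exit(void)
	{
		unregister_kprobe(&kp);
	}

	module_init(demo_init);
	module_exit(demo_exit);
	MODULE_LICENSE("GPL");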

The rest looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you,

> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/powerpc/kernel/kprobes.c  |   34 +++++++++++++--------
>  arch/x86/kernel/kprobes/core.c |   28 ++++++++++++++---
>  include/linux/kprobes.h        |    3 +
>  kernel/kprobes.c               |   66 ++++++++++++++++++++++++++++++++---------
>  4 files changed, 98 insertions(+), 33 deletions(-)
> 
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -105,6 +105,27 @@ kprobe_opcode_t *kprobe_lookup_name(cons
>  	return addr;
>  }
>  
> +static bool arch_kprobe_on_func_entry(unsigned long offset)
> +{
> +#ifdef PPC64_ELF_ABI_v2
> +#ifdef CONFIG_KPROBES_ON_FTRACE
> +	return offset <= 16;
> +#else
> +	return offset <= 8;
> +#endif
> +#else
> +	return !offset;
> +#endif
> +}
> +
> +/* XXX try and fold the magic of kprobe_lookup_name() in this */
> +kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offset,
> +					 bool *on_func_entry)
> +{
> +	*on_func_entry = arch_kprobe_on_func_entry(offset);
> +	return (kprobe_opcode_t *)(addr + offset);

Just to make sure,

> +}
> +
>  void *alloc_insn_page(void)
>  {
>  	void *page;
> @@ -218,19 +239,6 @@ static nokprobe_inline void set_current_
>  	kcb->kprobe_saved_msr = regs->msr;
>  }
>  
> -bool arch_kprobe_on_func_entry(unsigned long offset)
> -{
> -#ifdef PPC64_ELF_ABI_v2
> -#ifdef CONFIG_KPROBES_ON_FTRACE
> -	return offset <= 16;
> -#else
> -	return offset <= 8;
> -#endif
> -#else
> -	return !offset;
> -#endif
> -}
> -
>  void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
>  {
>  	ri->ret_addr = (kprobe_opcode_t *)regs->link;
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -52,6 +52,7 @@
>  #include <asm/insn.h>
>  #include <asm/debugreg.h>
>  #include <asm/set_memory.h>
> +#include <asm/ibt.h>
>  
>  #include "common.h"
>  
> @@ -197,13 +198,14 @@ __recover_probed_insn(kprobe_opcode_t *b
>  
>  	kp = get_kprobe((void *)addr);
>  	faddr = ftrace_location(addr);
> +
>  	/*
> -	 * Addresses inside the ftrace location are refused by
> -	 * arch_check_ftrace_location(). Something went terribly wrong
> -	 * if such an address is checked here.
> +	 * In case addr maps to sym+0 ftrace_location() might return something
> +	 * other than addr. In that case consider it the same as !faddr.
>  	 */
> -	if (WARN_ON(faddr && faddr != addr))
> -		return 0UL;
> +	if (faddr && faddr != addr)
> +		faddr = 0;
> +
>  	/*
>  	 * Use the current code if it is not modified by Kprobe
>  	 * and it cannot be modified by ftrace.
> @@ -301,6 +303,22 @@ static int can_probe(unsigned long paddr
>  	return (addr == paddr);
>  }
>  
> +/* If x86 supports IBT, the ENDBR at function entry must be skipped. */
> +kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offset,
> +					 bool *on_func_entry)
> +{
> +	if (is_endbr(*(u32 *)addr)) {
> +		*on_func_entry = !offset || offset == 4;
> +		if (*on_func_entry)
> +			offset = 4;
> +
> +	} else {
> +		*on_func_entry = !offset;
> +	}
> +
> +	return (kprobe_opcode_t *)(addr + offset);
> +}
> +
>  /*
>   * Copy an instruction with recovering modified instruction by kprobes
>   * and adjust the displacement if the instruction uses the %rip-relative
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -265,7 +265,6 @@ extern int arch_init_kprobes(void);
>  extern void kprobes_inc_nmissed_count(struct kprobe *p);
>  extern bool arch_within_kprobe_blacklist(unsigned long addr);
>  extern int arch_populate_kprobe_blacklist(void);
> -extern bool arch_kprobe_on_func_entry(unsigned long offset);
>  extern int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
>  
>  extern bool within_kprobe_blacklist(unsigned long addr);
> @@ -384,6 +383,8 @@ static inline struct kprobe_ctlblk *get_
>  }
>  
>  kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset);
> +kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offset, bool *on_func_entry);
> +
>  int register_kprobe(struct kprobe *p);
>  void unregister_kprobe(struct kprobe *p);
>  int register_kprobes(struct kprobe **kps, int num);
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1489,24 +1489,68 @@ bool within_kprobe_blacklist(unsigned lo
>  }
>  
>  /*
> + * arch_adjust_kprobe_addr - adjust the address
> + * @addr: symbol base address
> + * @offset: offset within the symbol
> + * @on_func_entry: was this @addr+@offset on the function entry
> + *
> + * Typically returns @addr + @offset, except for special cases where the
> + * function might be prefixed by a CFI landing pad, in that case any offset
> + * inside the landing pad is mapped to the first 'real' instruction of the
> + * symbol.
> + *
> + * Specifically, for things like IBT/BTI, skip the resp. ENDBR/BTI.C
> + * instruction at +0.
> + */
> +kprobe_opcode_t *__weak arch_adjust_kprobe_addr(unsigned long addr,
> +						unsigned long offset,
> +						bool *on_func_entry)
> +{
> +	*on_func_entry = !offset;
> +	return (kprobe_opcode_t *)(addr + offset);
> +}
> +
> +/*
>   * If 'symbol_name' is specified, look it up and add the 'offset'
>   * to it. This way, we can specify a relative address to a symbol.
>   * This returns encoded errors if it fails to look up symbol or invalid
>   * combination of parameters.
>   */
> -static kprobe_opcode_t *_kprobe_addr(kprobe_opcode_t *addr,
> -			const char *symbol_name, unsigned int offset)
> +static kprobe_opcode_t *
> +_kprobe_addr(kprobe_opcode_t *addr, const char *symbol_name,
> +	     unsigned long offset, bool *on_func_entry)
>  {
>  	if ((symbol_name && addr) || (!symbol_name && !addr))
>  		goto invalid;
>  
>  	if (symbol_name) {
> +		/*
> +		 * Input: @sym + @offset
> +		 * Output: @addr + @offset
> +		 *
> +		 * NOTE: kprobe_lookup_name() does *NOT* fold the offset
> +	 *       argument into its output!
> +		 */
>  		addr = kprobe_lookup_name(symbol_name, offset);
>  		if (!addr)
>  			return ERR_PTR(-ENOENT);
>  	}
>  
> -	addr = (kprobe_opcode_t *)(((char *)addr) + offset);
> +	/*
> +	 * So here we have @addr + @offset, displace it into a new
> +	 * @addr' + @offset' where @addr' is the symbol start address.
> +	 */
> +	addr = (void *)addr + offset;
> +	if (!kallsyms_lookup_size_offset((unsigned long)addr, NULL, &offset))
> +		return ERR_PTR(-ENOENT);
> +	addr = (void *)addr - offset;
> +
> +	/*
> +	 * Then ask the architecture to re-combine them, taking care of
> +	 * magical function entry details while telling us if this was indeed
> +	 * at the start of the function.
> +	 */
> +	addr = arch_adjust_kprobe_addr((unsigned long)addr, offset, on_func_entry);
>  	if (addr)
>  		return addr;
>  
> @@ -1516,7 +1560,8 @@ static kprobe_opcode_t *_kprobe_addr(kpr
>  
>  static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
>  {
> -	return _kprobe_addr(p->addr, p->symbol_name, p->offset);
> +	bool on_func_entry;
> +	return _kprobe_addr(p->addr, p->symbol_name, p->offset, &on_func_entry);
>  }
>  
>  /*
> @@ -2047,11 +2092,6 @@ static int pre_handler_kretprobe(struct
>  }
>  NOKPROBE_SYMBOL(pre_handler_kretprobe);
>  
> -bool __weak arch_kprobe_on_func_entry(unsigned long offset)
> -{
> -	return !offset;
> -}
> -
>  /**
>   * kprobe_on_func_entry() -- check whether given address is function entry
>   * @addr: Target address
> @@ -2067,15 +2107,13 @@ bool __weak arch_kprobe_on_func_entry(un
>   */
>  int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
>  {
> -	kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset);
> +	bool on_func_entry;
> +	kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset, &on_func_entry);
>  
>  	if (IS_ERR(kp_addr))
>  		return PTR_ERR(kp_addr);
>  
> -	if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset))
> -		return -ENOENT;
> -
> -	if (!arch_kprobe_on_func_entry(offset))
> +	if (!on_func_entry)
>  		return -EINVAL;
>  
>  	return 0;
> 
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH v3 04/39] x86/ibt: Base IBT bits
  2022-03-03 11:23 ` [PATCH v3 04/39] x86/ibt: Base IBT bits Peter Zijlstra
@ 2022-03-07 11:17   ` Peter Zijlstra
  2022-03-07 12:25     ` Joao Moreira
  0 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2022-03-07 11:17 UTC (permalink / raw)
  To: x86, joao, hjl.tools, jpoimboe, andrew.cooper3
  Cc: linux-kernel, ndesaulniers, keescook, samitolvanen, mark.rutland,
	alyssa.milburn, mbenes, rostedt, mhiramat, alexei.starovoitov

On Thu, Mar 03, 2022 at 12:23:25PM +0100, Peter Zijlstra wrote:

> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -62,8 +62,11 @@ export BITS
>  #
>  KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
>  
> -# Intel CET isn't enabled in the kernel
> +ifeq ($(CONFIG_X86_KERNEL_IBT),y)
> +KBUILD_CFLAGS += $(call cc-option,-fcf-protection=branch)
> +else
>  KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
> +endif
>  
>  ifeq ($(CONFIG_X86_32),y)
>          BITS := 32

Joao reported that RETPOLINE=n builds explode; turns out the compilers
default to using NOTRACK prefixes for jump-tables and we explicitly do
not enable that security compromise for the kernel.

Since the compilers don't provide explicit control over NOTRACK generation,
blanket-disable jump-tables when using IBT without RETPOLINE.

Joao will be submitting GCC and Clang bugreports on this shortly.

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index d38c18f4bd53..f80a425e7d29 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -63,7 +63,9 @@ export BITS
 KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
 
 ifeq ($(CONFIG_X86_KERNEL_IBT),y)
-KBUILD_CFLAGS += $(call cc-option,-fcf-protection=branch)
+# Explicitly disable jump-tables, also implied by RETPOLINE=y, for kernel IBT
+# to avoid NOTRACK prefixes.
+KBUILD_CFLAGS += $(call cc-option,-fcf-protection=branch -fno-jump-tables)
 else
 KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
 endif
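
To see the problem the flag papers over, a dense switch is enough; with
branch protection enabled the compilers emit the dispatch as a NOTRACK
indirect jump (sketch; exact codegen varies by compiler and version):

	int classify(int c)
	{
		switch (c) {
		case 0: return 10;
		case 1: return 11;
		case 2: return 12;
		case 3: return 13;
		case 4: return 14;
		case 5: return 15;
		default: return -1;
		}
	}

	/* gcc -O2 -fcf-protection=branch -S classify.c
	 *	-> dispatch via "notrack jmp *..." (IBT not enforced)
	 * gcc -O2 -fcf-protection=branch -fno-jump-tables -S classify.c
	 *	-> compare/branch chain, no indirect jump at all
	 */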


* Re: [PATCH v3 04/39] x86/ibt: Base IBT bits
  2022-03-07 11:17   ` Peter Zijlstra
@ 2022-03-07 12:25     ` Joao Moreira
  2022-03-07 18:19       ` Nick Desaulniers
  0 siblings, 1 reply; 64+ messages in thread
From: Joao Moreira @ 2022-03-07 12:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, hjl.tools, jpoimboe, andrew.cooper3, linux-kernel,
	ndesaulniers, keescook, samitolvanen, mark.rutland,
	alyssa.milburn, mbenes, rostedt, mhiramat, alexei.starovoitov

> 
> Joao will be submitting GCC and Clang bugreports on this shortly.
> 
Reported both on GCC and Clang:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816
https://github.com/llvm/llvm-project/issues/54247

I'm not sure if something should be reported to ClangBuiltLinux, since
the kernel will be built with -fno-jump-tables anyway.

Tks,
Joao


* Re: [PATCH v3 04/39] x86/ibt: Base IBT bits
  2022-03-07 12:25     ` Joao Moreira
@ 2022-03-07 18:19       ` Nick Desaulniers
  0 siblings, 0 replies; 64+ messages in thread
From: Nick Desaulniers @ 2022-03-07 18:19 UTC (permalink / raw)
  To: Joao Moreira
  Cc: Peter Zijlstra, x86, hjl.tools, jpoimboe, andrew.cooper3,
	linux-kernel, keescook, samitolvanen, mark.rutland,
	alyssa.milburn, mbenes, rostedt, mhiramat, alexei.starovoitov

On Mon, Mar 7, 2022 at 4:25 AM Joao Moreira <joao@overdrivepizza.com> wrote:
>
> >
> > Joao will be submitting GCC and Clang bugreports on this shortly.
> >
> Reported both on GCC and Clang:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816
> https://github.com/llvm/llvm-project/issues/54247
>
> I'm not sure if something should be reported to ClangBuiltLinux, since
> the kernel will be built with -fno-jump-tables anyways.

As long as a bug gets filed, that's what matters most to me. It
doesn't matter whether we track it upstream in LLVM's issue tracker or
downstream; having a record of it in either place is what counts.
Thanks for filing!
-- 
Thanks,
~Nick Desaulniers


* [tip: x86/urgent] x86/module: Fix the paravirt vs alternative order
  2022-03-03 11:23 ` [PATCH v3 02/39] x86/module: Fix the paravirt vs alternative order Peter Zijlstra
@ 2022-03-08 13:58   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2022-03-08 13:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     5adf349439d29f92467e864f728dfc23180f3ef9
Gitweb:        https://git.kernel.org/tip/5adf349439d29f92467e864f728dfc23180f3ef9
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Thu, 03 Mar 2022 12:23:23 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Tue, 08 Mar 2022 14:15:25 +01:00

x86/module: Fix the paravirt vs alternative order

Ever since commit

  4e6292114c74 ("x86/paravirt: Add new features for paravirt patching")

there is an ordering dependency between patching paravirt ops and
patching alternatives; the module loader still violates this.

Fixes: 4e6292114c74 ("x86/paravirt: Add new features for paravirt patching")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220303112825.068773913@infradead.org
---
 arch/x86/kernel/module.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 95fa745..96d7c27 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -273,6 +273,14 @@ int module_finalize(const Elf_Ehdr *hdr,
 			retpolines = s;
 	}
 
+	/*
+	 * See alternative_instructions() for the ordering rules between the
+	 * various patching types.
+	 */
+	if (para) {
+		void *pseg = (void *)para->sh_addr;
+		apply_paravirt(pseg, pseg + para->sh_size);
+	}
 	if (retpolines) {
 		void *rseg = (void *)retpolines->sh_addr;
 		apply_retpolines(rseg, rseg + retpolines->sh_size);
@@ -290,11 +298,6 @@ int module_finalize(const Elf_Ehdr *hdr,
 					    tseg, tseg + text->sh_size);
 	}
 
-	if (para) {
-		void *pseg = (void *)para->sh_addr;
-		apply_paravirt(pseg, pseg + para->sh_size);
-	}
-
 	/* make jump label nops */
 	jump_label_apply_nops(me);
 


end of thread, newest: ~2022-03-08 13:59 UTC

Thread overview: 64+ messages
2022-03-03 11:23 [PATCH v3 00/39] x86: Kernel IBT Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 01/39] static_call: Avoid building empty .static_call_sites Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 02/39] x86/module: Fix the paravirt vs alternative order Peter Zijlstra
2022-03-08 13:58   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 03/39] objtool: Add --dry-run Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 04/39] x86/ibt: Base IBT bits Peter Zijlstra
2022-03-07 11:17   ` Peter Zijlstra
2022-03-07 12:25     ` Joao Moreira
2022-03-07 18:19       ` Nick Desaulniers
2022-03-03 11:23 ` [PATCH v3 05/39] x86/ibt: Add ANNOTATE_NOENDBR Peter Zijlstra
2022-03-04 18:59   ` Josh Poimboeuf
2022-03-04 20:22     ` Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 06/39] x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 07/39] x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch() Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 08/39] x86/entry: Cleanup PARAVIRT Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 09/39] x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel() Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 10/39] x86/ibt,xen: Sprinkle the ENDBR Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 11/39] x86/ibt,entry: Sprinkle ENDBR dust Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 12/39] x86/linkage: Add ENDBR to SYM_FUNC_START*() Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 13/39] x86/ibt,paravirt: Sprinkle ENDBR Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 14/39] x86/ibt,crypto: Add ENDBR for the jump-table entries Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 15/39] x86/ibt,kvm: Add ENDBR to fastops Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 16/39] x86/ibt,ftrace: Search for __fentry__ location Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 17/39] x86/livepatch: Validate " Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice Peter Zijlstra
2022-03-04 17:51   ` Josh Poimboeuf
2022-03-04 19:48     ` Peter Zijlstra
2022-03-04 21:03       ` Josh Poimboeuf
2022-03-04 21:44         ` Peter Zijlstra
2022-03-04 22:05           ` Josh Poimboeuf
2022-03-04 23:08       ` David Laight
2022-03-03 11:23 ` [PATCH v3 19/39] x86/ibt,kprobes: Cure sym+0 equals fentry woes Peter Zijlstra
2022-03-07  9:44   ` Masami Hiramatsu
2022-03-03 11:23 ` [PATCH v3 20/39] x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 21/39] x86/ibt,ftrace: Add ENDBR to samples/ftrace Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 22/39] x86/ibt: Add IBT feature, MSR and #CP handling Peter Zijlstra
2022-03-04 17:57   ` Josh Poimboeuf
2022-03-04 20:38     ` Peter Zijlstra
2022-03-04 18:07   ` Josh Poimboeuf
2022-03-04 20:39     ` Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 23/39] x86/alternative: Simplify int3_selftest_ip Peter Zijlstra
2022-03-04 18:02   ` Josh Poimboeuf
2022-03-04 18:09     ` Josh Poimboeuf
2022-03-03 11:23 ` [PATCH v3 24/39] x86/ibt: Disable IBT around firmware Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 25/39] x86/bugs: Disable Retpoline when IBT Peter Zijlstra
2022-03-04 17:39   ` Josh Poimboeuf
2022-03-03 11:23 ` [PATCH v3 26/39] x86/ibt: Annotate text references Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 27/39] x86/ibt,ftrace: Annotate ftrace code patching Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 28/39] x86/ibt,sev: Annotations Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 29/39] x86/ibt: Dont generate ENDBR in .discard.text Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 30/39] x86/ibt: Ensure module init/exit points have references Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 31/39] objtool: Rename --duplicate to --lto Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 32/39] Kbuild: Allow whole module objtool runs Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 33/39] objtool: Read the NOENDBR annotation Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 34/39] objtool: Have WARN_FUNC fall back to sym+off Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 35/39] objtool: Add IBT/ENDBR decoding Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 36/39] objtool: Validate IBT assumptions Peter Zijlstra
2022-03-03 11:23 ` [PATCH v3 37/39] objtool: Optionally WARN about unused ANNOTATE_NOENDBR Peter Zijlstra
2022-03-04 18:27   ` Josh Poimboeuf
2022-03-03 11:23 ` [PATCH v3 38/39] objtool: Find unused ENDBR instructions Peter Zijlstra
2022-03-03 11:24 ` [PATCH v3 39/39] x86/alternative: Use .ibt_endbr_seal to seal indirect calls Peter Zijlstra
2022-03-04 19:09 ` [PATCH v3 00/39] x86: Kernel IBT Josh Poimboeuf
2022-03-04 20:22   ` Peter Zijlstra
2022-03-04 21:39   ` Peter Zijlstra
