This series adds support for Clang's Control-Flow Integrity (CFI) checking to x86_64. With CFI, the compiler injects a runtime check before each indirect function call to ensure the target is a valid function with the correct static type. This restricts possible call targets and makes it more difficult for an attacker to exploit bugs that allow the modification of stored function pointers. For more details, see: https://clang.llvm.org/docs/ControlFlowIntegrity.html The first two patches contain objtool support for CFI, the remaining patches change function declarations to use opaque types, fix type mismatch issues that confuse the compiler, and disable CFI where it can't be used. You can also pull this series from https://github.com/samitolvanen/linux.git x86-cfi-v3 --- Changes in v3: - Dropped Clang requirement to >= 13 after the missing compiler fix was backported there. - Added DEFINE_CFI_IMMEDIATE_RETURN_STUB to address the issue with tp_stub_func in kernel/tracepoint.c. - Renamed asm_func_t to asm_func_ptr. - Changed extable handlers to use __cficanonical instead of disabling CFI for fixup_exception. Changes in v2: - Dropped the first objtool patch as the warnings were fixed in separate patches. - Changed fix_cfi_relocs() in objtool to not rely on jump table symbols, and to return an error if it can't find a relocation. - Fixed a build issue with ASM_STACK_FRAME_NON_STANDARD(). - Dropped workarounds for inline assembly references to address-taken static functions with CFI as this was fixed in the compiler. - Changed the C declarations of non-callable functions to use opaque types and dropped the function_nocfi() patches. - Changed ARCH_SUPPORTS_CFI_CLANG to depend on Clang >=14 for the compiler fixes. Kees Cook (1): x86, relocs: Ignore __typeid__ relocations Sami Tolvanen (15): objtool: Add CONFIG_CFI_CLANG support objtool: Add ASM_STACK_FRAME_NON_STANDARD linkage: Add DECLARE_ASM_FUNC_SYMBOL cfi: Add DEFINE_CFI_IMMEDIATE_RETURN_STUB tracepoint: Exclude tp_stub_func from CFI checking ftrace: Use an opaque type for functions not callable from C lkdtm: Disable UNSET_SMEP with CFI lkdtm: Use an opaque type for lkdtm_rodata_do_nothing x86: Use an opaque type for functions not callable from C x86/extable: Mark handlers __cficanonical x86/purgatory: Disable CFI x86, module: Ignore __typeid__ relocations x86, cpu: Use LTO for cpu.c with CFI x86, kprobes: Fix optprobe_template_func type mismatch x86, build: Allow CONFIG_CFI_CLANG to be selected arch/x86/Kconfig | 1 + arch/x86/include/asm/ftrace.h | 2 +- arch/x86/include/asm/idtentry.h | 10 ++--- arch/x86/include/asm/page_64.h | 7 +-- arch/x86/include/asm/paravirt_types.h | 3 +- arch/x86/include/asm/processor.h | 2 +- arch/x86/include/asm/proto.h | 25 ++++++----- arch/x86/include/asm/uaccess_64.h | 9 ++-- arch/x86/kernel/alternative.c | 2 +- arch/x86/kernel/ftrace.c | 2 +- arch/x86/kernel/kprobes/opt.c | 4 +- arch/x86/kernel/module.c | 4 ++ arch/x86/kernel/paravirt.c | 4 +- arch/x86/kvm/emulate.c | 4 +- arch/x86/kvm/kvm_emulate.h | 9 +--- arch/x86/mm/extable.c | 64 +++++++++++++++------------ arch/x86/power/Makefile | 2 + arch/x86/purgatory/Makefile | 2 +- arch/x86/tools/relocs.c | 7 +++ arch/x86/xen/enlighten_pv.c | 6 +-- arch/x86/xen/xen-ops.h | 10 ++--- drivers/misc/lkdtm/bugs.c | 2 +- drivers/misc/lkdtm/lkdtm.h | 2 +- drivers/misc/lkdtm/perms.c | 2 +- drivers/misc/lkdtm/rodata.c | 2 +- include/asm-generic/vmlinux.lds.h | 11 +++++ include/linux/cfi.h | 14 ++++++ include/linux/ftrace.h | 7 +-- include/linux/linkage.h | 13 ++++++ include/linux/objtool.h | 6 +++ kernel/cfi.c | 24 +++++++++- kernel/tracepoint.c | 5 +-- tools/include/linux/objtool.h | 6 +++ tools/objtool/arch/x86/decode.c | 16 +++++++ tools/objtool/elf.c | 51 +++++++++++++++++++++ tools/objtool/include/objtool/arch.h | 3 ++ tools/objtool/include/objtool/elf.h | 2 +- 37 files changed, 250 insertions(+), 95 deletions(-) base-commit: d0ee23f9d78be5531c4b055ea424ed0b489dfe9b -- 2.33.0.309.g3052b89438-goog
With CONFIG_CFI_CLANG, the compiler replaces function references with references to the CFI jump table, which confuses objtool. This change, based on Josh's initial patch [1], goes through the list of relocations and replaces jump table symbols with the actual function symbols. [1] https://lore.kernel.org/r/d743f4b36e120c06506567a9f87a062ae03da47f.1611263462.git.jpoimboe@redhat.com/ Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- tools/objtool/arch/x86/decode.c | 16 +++++++++ tools/objtool/elf.c | 51 ++++++++++++++++++++++++++++ tools/objtool/include/objtool/arch.h | 3 ++ tools/objtool/include/objtool/elf.h | 2 +- 4 files changed, 71 insertions(+), 1 deletion(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index bc821056aba9..318189c8065e 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -62,6 +62,22 @@ bool arch_callee_saved_reg(unsigned char reg) } } +unsigned long arch_cfi_section_reloc_offset(struct reloc *reloc) +{ + if (!reloc->addend) + return 0; + + if (reloc->type == R_X86_64_PC32 || reloc->type == R_X86_64_PLT32) + return reloc->addend + 4; + + return reloc->addend; +} + +unsigned long arch_cfi_jump_reloc_offset(unsigned long offset) +{ + return offset + 1; +} + unsigned long arch_dest_reloc_offset(int addend) { return addend + 4; diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index 8676c7598728..05a5f51aad2c 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -18,6 +18,7 @@ #include <errno.h> #include <objtool/builtin.h> +#include <objtool/arch.h> #include <objtool/elf.h> #include <objtool/warn.h> @@ -291,6 +292,10 @@ static int read_sections(struct elf *elf) if (sec->sh.sh_flags & SHF_EXECINSTR) elf->text_size += sec->len; + /* Detect -fsanitize=cfi jump table sections */ + if (!strncmp(sec->name, ".text..L.cfi.jumptable", 22)) + sec->cfi_jt = true; + list_add_tail(&sec->list, &elf->sections); elf_hash_add(section, &sec->hash, sec->idx); elf_hash_add(section_name, &sec->name_hash, str_hash(sec->name)); @@ -576,6 +581,49 @@ static int read_rela_reloc(struct section *sec, int i, struct reloc *reloc, unsi return 0; } +/* + * CONFIG_CFI_CLANG replaces function relocations to refer to an intermediate + * jump table. Undo the conversion so objtool can make sense of things. + */ +static int fix_cfi_relocs(const struct elf *elf) +{ + struct section *sec; + struct reloc *reloc; + + list_for_each_entry(sec, &elf->sections, list) { + list_for_each_entry(reloc, &sec->reloc_list, list) { + struct reloc *cfi_reloc; + unsigned long offset; + + if (!reloc->sym->sec->cfi_jt) + continue; + + if (reloc->sym->type == STT_SECTION) + offset = arch_cfi_section_reloc_offset(reloc); + else + offset = reloc->sym->offset; + + /* + * The jump table immediately jumps to the actual function, + * so look up the relocation there. + */ + offset = arch_cfi_jump_reloc_offset(offset); + cfi_reloc = find_reloc_by_dest(elf, reloc->sym->sec, offset); + + if (!cfi_reloc || !cfi_reloc->sym) { + WARN("can't find a CFI jump table relocation at %s+0x%lx", + reloc->sym->sec->name, offset); + return -1; + } + + reloc->sym = cfi_reloc->sym; + reloc->addend = 0; + } + } + + return 0; +} + static int read_relocs(struct elf *elf) { struct section *sec; @@ -639,6 +687,9 @@ static int read_relocs(struct elf *elf) tot_reloc += nr_reloc; } + if (fix_cfi_relocs(elf)) + return -1; + if (stats) { printf("max_reloc: %lu\n", max_reloc); printf("tot_reloc: %lu\n", tot_reloc); diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h index 062bb6e9b865..2205b2b08268 100644 --- a/tools/objtool/include/objtool/arch.h +++ b/tools/objtool/include/objtool/arch.h @@ -81,6 +81,9 @@ unsigned long arch_jump_destination(struct instruction *insn); unsigned long arch_dest_reloc_offset(int addend); +unsigned long arch_cfi_section_reloc_offset(struct reloc *reloc); +unsigned long arch_cfi_jump_reloc_offset(unsigned long offset); + const char *arch_nop_insn(int len); int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg); diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h index e34395047530..d9c1dacc6572 100644 --- a/tools/objtool/include/objtool/elf.h +++ b/tools/objtool/include/objtool/elf.h @@ -39,7 +39,7 @@ struct section { char *name; int idx; unsigned int len; - bool changed, text, rodata, noinstr; + bool changed, text, rodata, noinstr, cfi_jt; }; struct symbol { -- 2.33.0.309.g3052b89438-goog
To use the STACK_FRAME_NON_STANDARD macro for a static symbol defined in inline assembly, we need a C declaration that implies global visibility. This type mismatch confuses the compiler with CONFIG_CFI_CLANG. This change adds an inline assembly version of the macro to avoid the issue. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- include/linux/objtool.h | 6 ++++++ tools/include/linux/objtool.h | 6 ++++++ 2 files changed, 12 insertions(+) diff --git a/include/linux/objtool.h b/include/linux/objtool.h index 7e72d975cb76..080e95174536 100644 --- a/include/linux/objtool.h +++ b/include/linux/objtool.h @@ -66,6 +66,11 @@ struct unwind_hint { static void __used __section(".discard.func_stack_frame_non_standard") \ *__func_stack_frame_non_standard_##func = func +#define ASM_STACK_FRAME_NON_STANDARD(func) \ + ".pushsection .discard.func_stack_frame_non_standard, \"aw\"\n" \ + ".long " __stringify(func) " - .\n" \ + ".popsection\n" + #else /* __ASSEMBLY__ */ /* @@ -127,6 +132,7 @@ struct unwind_hint { #define UNWIND_HINT(sp_reg, sp_offset, type, end) \ "\n\t" #define STACK_FRAME_NON_STANDARD(func) +#define ASM_STACK_FRAME_NON_STANDARD(func) #else #define ANNOTATE_INTRA_FUNCTION_CALL .macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0 diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h index 7e72d975cb76..080e95174536 100644 --- a/tools/include/linux/objtool.h +++ b/tools/include/linux/objtool.h @@ -66,6 +66,11 @@ struct unwind_hint { static void __used __section(".discard.func_stack_frame_non_standard") \ *__func_stack_frame_non_standard_##func = func +#define ASM_STACK_FRAME_NON_STANDARD(func) \ + ".pushsection .discard.func_stack_frame_non_standard, \"aw\"\n" \ + ".long " __stringify(func) " - .\n" \ + ".popsection\n" + #else /* __ASSEMBLY__ */ /* @@ -127,6 +132,7 @@ struct unwind_hint { #define UNWIND_HINT(sp_reg, sp_offset, type, end) \ "\n\t" #define STACK_FRAME_NON_STANDARD(func) +#define ASM_STACK_FRAME_NON_STANDARD(func) #else #define ANNOTATE_INTRA_FUNCTION_CALL .macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0 -- 2.33.0.309.g3052b89438-goog
The kernel has several assembly functions, which are not directly callable from C but need to be referred to from C code. This change adds the DECLARE_ASM_FUNC_SYMBOL macro, which allows us to declare these symbols using an opaque type, which makes misuse harder, and avoids the need to annotate references to the functions for Clang's Control-Flow Integrity (CFI). Suggested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- include/linux/linkage.h | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/include/linux/linkage.h b/include/linux/linkage.h index dbf8506decca..f1eac26b2dd6 100644 --- a/include/linux/linkage.h +++ b/include/linux/linkage.h @@ -48,6 +48,19 @@ #define __PAGE_ALIGNED_DATA .section ".data..page_aligned", "aw" #define __PAGE_ALIGNED_BSS .section ".bss..page_aligned", "aw" +/* + * Declares a function not callable from C using an opaque type. Defined as + * an array to allow the address of the symbol to be taken without '&'. + */ +#ifndef DECLARE_ASM_FUNC_SYMBOL +#define DECLARE_ASM_FUNC_SYMBOL(sym) \ + extern const u8 sym[] +#endif + +#ifndef __ASSEMBLY__ +typedef const u8 *asm_func_ptr; +#endif + /* * This is used by architectures to keep arguments on the stack * untouched by the compiler by keeping them live until the end. -- 2.33.0.309.g3052b89438-goog
This change introduces the DEFINE_CFI_IMMEDIATE_RETURN_STUB macro, which defines a stub function that immediately returns and when defined in the core kernel, always passes indirect call checking with CONFIG_CFI_CLANG. Note that this macro should only be used when a stub cannot be called using the correct function type. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- include/asm-generic/vmlinux.lds.h | 11 +++++++++++ include/linux/cfi.h | 14 ++++++++++++++ kernel/cfi.c | 24 +++++++++++++++++++++++- 3 files changed, 48 insertions(+), 1 deletion(-) diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index f2984af2b85b..5b77284f7221 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -407,6 +407,16 @@ KEEP(*(.static_call_tramp_key)) \ __stop_static_call_tramp_key = .; +#ifdef CONFIG_CFI_CLANG +#define CFI_EXCLUDED_DATA \ + . = ALIGN(8); \ + __start_cfi_excluded = .; \ + KEEP(*(.cfi_excluded_stubs)) \ + __stop_cfi_excluded = .; +#else +#define CFI_EXCLUDED_DATA +#endif + /* * Allow architectures to handle ro_after_init data on their * own by defining an empty RO_AFTER_INIT_DATA. @@ -430,6 +440,7 @@ __start_rodata = .; \ *(.rodata) *(.rodata.*) \ SCHED_DATA \ + CFI_EXCLUDED_DATA \ RO_AFTER_INIT_DATA /* Read only after init */ \ . = ALIGN(8); \ __start___tracepoints_ptrs = .; \ diff --git a/include/linux/cfi.h b/include/linux/cfi.h index 879744aaa6e0..9ebf67a0d421 100644 --- a/include/linux/cfi.h +++ b/include/linux/cfi.h @@ -20,6 +20,18 @@ extern void __cfi_check(uint64_t id, void *ptr, void *diag); #define __CFI_ADDRESSABLE(fn, __attr) \ const void *__cfi_jt_ ## fn __visible __attr = (void *)&fn +/* + * Defines a stub function that returns immediately, and when defined and + * referenced in the core kernel, always passes CFI checking. This should + * be used only for stubs that cannot be called using the correct function + * pointer type, which should be rare. + */ +#define DEFINE_CFI_IMMEDIATE_RETURN_STUB(fn) \ + void fn(void) { return; } \ + const void *__cfi_excl_ ## fn __visible \ + __attribute__((__section__(".cfi_excluded_stubs"))) \ + = (void *)&fn + #ifdef CONFIG_CFI_CLANG_SHADOW extern void cfi_module_add(struct module *mod, unsigned long base_addr); @@ -35,6 +47,8 @@ static inline void cfi_module_remove(struct module *mod, unsigned long base_addr #else /* !CONFIG_CFI_CLANG */ #define __CFI_ADDRESSABLE(fn, __attr) +#define DEFINE_CFI_IMMEDIATE_RETURN_STUB(fn) \ + void fn(void) { return; } #endif /* CONFIG_CFI_CLANG */ diff --git a/kernel/cfi.c b/kernel/cfi.c index 9594cfd1cf2c..8d931089141b 100644 --- a/kernel/cfi.c +++ b/kernel/cfi.c @@ -278,12 +278,34 @@ static inline cfi_check_fn find_module_check_fn(unsigned long ptr) return fn; } +extern unsigned long __start_cfi_excluded[]; +extern unsigned long __stop_cfi_excluded[]; + +static inline bool is_cfi_excluded(unsigned long ptr) +{ + unsigned long *p = __start_cfi_excluded; + + for ( ; p < __stop_cfi_excluded; ++p) + if (*p == ptr) + return true; + + return false; +} + +static void __cfi_pass(uint64_t id, void *ptr, void *diag) +{ +} + static inline cfi_check_fn find_check_fn(unsigned long ptr) { cfi_check_fn fn = NULL; - if (is_kernel_text(ptr)) + if (is_kernel_text(ptr)) { + if (unlikely(is_cfi_excluded(ptr))) + return __cfi_pass; + return __cfi_check; + } /* * Indirect call checks can happen when RCU is not watching. Both -- 2.33.0.309.g3052b89438-goog
If allocate_probes fails, func_remove replaces the old function with a pointer to tp_stub_func, which is called using a mismatching function pointer that's will always trip indirect call checks with CONFIG_CFI_CLANG. Use DEFINE_CFI_IMMEDATE_RETURN_STUB to define tp_stub_func to allow it to pass CFI checking. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- kernel/tracepoint.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 64ea283f2f86..58acc7d86c3f 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -99,10 +99,7 @@ struct tp_probes { }; /* Called in removal of a func but failed to allocate a new tp_funcs */ -static void tp_stub_func(void) -{ - return; -} +static DEFINE_CFI_IMMEDIATE_RETURN_STUB(tp_stub_func); static inline void *allocate_probes(int count) { -- 2.33.0.309.g3052b89438-goog
With CONFIG_CFI_CLANG, the compiler changes function references to point to the CFI jump table. As ftrace_call, ftrace_regs_call, and mcount_call are not called from C, use DECLARE_ASM_FUNC_SYMBOL to declare them. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- include/linux/ftrace.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 832e65f06754..67de28464aeb 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -578,9 +578,10 @@ extern void ftrace_replace_code(int enable); extern int ftrace_update_ftrace_func(ftrace_func_t func); extern void ftrace_caller(void); extern void ftrace_regs_caller(void); -extern void ftrace_call(void); -extern void ftrace_regs_call(void); -extern void mcount_call(void); + +DECLARE_ASM_FUNC_SYMBOL(ftrace_call); +DECLARE_ASM_FUNC_SYMBOL(ftrace_regs_call); +DECLARE_ASM_FUNC_SYMBOL(mcount_call); void ftrace_modify_all_code(int command); -- 2.33.0.309.g3052b89438-goog
Disable the UNSET_SMEP test when CONFIG_CFI_CLANG is enabled as jumping to a call gadget would always trip CFI instead. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- drivers/misc/lkdtm/bugs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c index 4282b625200f..6e8677852262 100644 --- a/drivers/misc/lkdtm/bugs.c +++ b/drivers/misc/lkdtm/bugs.c @@ -367,7 +367,7 @@ void lkdtm_STACK_GUARD_PAGE_TRAILING(void) void lkdtm_UNSET_SMEP(void) { -#if IS_ENABLED(CONFIG_X86_64) && !IS_ENABLED(CONFIG_UML) +#if IS_ENABLED(CONFIG_X86_64) && !IS_ENABLED(CONFIG_UML) && !IS_ENABLED(CONFIG_CFI_CLANG) #define MOV_CR4_DEPTH 64 void (*direct_write_cr4)(unsigned long val); unsigned char *insn; -- 2.33.0.309.g3052b89438-goog
Use an opaque type for lkdtm_rodata_do_nothing to stop the compiler from generating a CFI jump table entry that jumps to .rodata. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- drivers/misc/lkdtm/lkdtm.h | 2 +- drivers/misc/lkdtm/perms.c | 2 +- drivers/misc/lkdtm/rodata.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h index c212a253edde..2da74236c005 100644 --- a/drivers/misc/lkdtm/lkdtm.h +++ b/drivers/misc/lkdtm/lkdtm.h @@ -137,7 +137,7 @@ void lkdtm_REFCOUNT_TIMING(void); void lkdtm_ATOMIC_TIMING(void); /* rodata.c */ -void lkdtm_rodata_do_nothing(void); +DECLARE_ASM_FUNC_SYMBOL(lkdtm_rodata_do_nothing); /* usercopy.c */ void __init lkdtm_usercopy_init(void); diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c index 2dede2ef658f..fa2bd90bd8ee 100644 --- a/drivers/misc/lkdtm/perms.c +++ b/drivers/misc/lkdtm/perms.c @@ -151,7 +151,7 @@ void lkdtm_EXEC_VMALLOC(void) void lkdtm_EXEC_RODATA(void) { - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS); + execute_location((void *)lkdtm_rodata_do_nothing, CODE_AS_IS); } void lkdtm_EXEC_USERSPACE(void) diff --git a/drivers/misc/lkdtm/rodata.c b/drivers/misc/lkdtm/rodata.c index baacb876d1d9..17ed0ad4e6ae 100644 --- a/drivers/misc/lkdtm/rodata.c +++ b/drivers/misc/lkdtm/rodata.c @@ -3,7 +3,7 @@ * This includes functions that are meant to live entirely in .rodata * (via objcopy tricks), to validate the non-executability of .rodata. */ -#include "lkdtm.h" +void lkdtm_rodata_do_nothing(void); void noinstr lkdtm_rodata_do_nothing(void) { -- 2.33.0.309.g3052b89438-goog
The kernel has several assembly functions that are not directly callable from C. Use an opaque type for these function prototypes to make misuse harder, and to avoid the need to annotate references to these functions for Clang's Control-Flow Integrity (CFI). Suggested-by: Andy Lutomirski <luto@amacapital.net> Suggested-by: Alexander Lobakin <alobakin@pm.me> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/include/asm/ftrace.h | 2 +- arch/x86/include/asm/idtentry.h | 10 +++++----- arch/x86/include/asm/page_64.h | 7 ++++--- arch/x86/include/asm/paravirt_types.h | 3 ++- arch/x86/include/asm/processor.h | 2 +- arch/x86/include/asm/proto.h | 25 +++++++++++++------------ arch/x86/include/asm/uaccess_64.h | 9 +++------ arch/x86/kernel/alternative.c | 2 +- arch/x86/kernel/ftrace.c | 2 +- arch/x86/kernel/paravirt.c | 4 ++-- arch/x86/kvm/emulate.c | 4 ++-- arch/x86/kvm/kvm_emulate.h | 9 ++------- arch/x86/xen/enlighten_pv.c | 6 +++--- arch/x86/xen/xen-ops.h | 10 +++++----- 14 files changed, 45 insertions(+), 50 deletions(-) diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h index 9f3130f40807..54d23f421c16 100644 --- a/arch/x86/include/asm/ftrace.h +++ b/arch/x86/include/asm/ftrace.h @@ -17,7 +17,7 @@ #ifndef __ASSEMBLY__ extern atomic_t modifying_ftrace_code; -extern void __fentry__(void); +DECLARE_ASM_FUNC_SYMBOL(__fentry__); static inline unsigned long ftrace_call_adjust(unsigned long addr) { diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index 1345088e9902..2f6d0528bdd2 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -27,8 +27,8 @@ * as well which is used to emit the entry stubs in entry_32/64.S. */ #define DECLARE_IDTENTRY(vector, func) \ - asmlinkage void asm_##func(void); \ - asmlinkage void xen_asm_##func(void); \ + DECLARE_ASM_FUNC_SYMBOL(asm_##func); \ + DECLARE_ASM_FUNC_SYMBOL(xen_asm_##func); \ __visible void func(struct pt_regs *regs) /** @@ -78,8 +78,8 @@ static __always_inline void __##func(struct pt_regs *regs) * C-handler. */ #define DECLARE_IDTENTRY_ERRORCODE(vector, func) \ - asmlinkage void asm_##func(void); \ - asmlinkage void xen_asm_##func(void); \ + DECLARE_ASM_FUNC_SYMBOL(asm_##func); \ + DECLARE_ASM_FUNC_SYMBOL(xen_asm_##func); \ __visible void func(struct pt_regs *regs, unsigned long error_code) /** @@ -386,7 +386,7 @@ static __always_inline void __##func(struct pt_regs *regs) * - The C handler called from the C shim */ #define DECLARE_IDTENTRY_DF(vector, func) \ - asmlinkage void asm_##func(void); \ + DECLARE_ASM_FUNC_SYMBOL(asm_##func); \ __visible void func(struct pt_regs *regs, \ unsigned long error_code, \ unsigned long address) diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h index 4bde0dc66100..d6760b6773de 100644 --- a/arch/x86/include/asm/page_64.h +++ b/arch/x86/include/asm/page_64.h @@ -5,6 +5,7 @@ #include <asm/page_64_types.h> #ifndef __ASSEMBLY__ +#include <linux/linkage.h> #include <asm/alternative.h> /* duplicated to the one in bootmem.h */ @@ -40,9 +41,9 @@ extern unsigned long __phys_addr_symbol(unsigned long); #define pfn_valid(pfn) ((pfn) < max_pfn) #endif -void clear_page_orig(void *page); -void clear_page_rep(void *page); -void clear_page_erms(void *page); +DECLARE_ASM_FUNC_SYMBOL(clear_page_orig); +DECLARE_ASM_FUNC_SYMBOL(clear_page_rep); +DECLARE_ASM_FUNC_SYMBOL(clear_page_erms); static inline void clear_page(void *page) { diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index d9d6b0203ec4..dfaa50d20d6a 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -38,6 +38,7 @@ #include <asm/desc_defs.h> #include <asm/pgtable_types.h> #include <asm/nospec-branch.h> +#include <asm/proto.h> struct page; struct thread_struct; @@ -271,7 +272,7 @@ struct paravirt_patch_template { extern struct pv_info pv_info; extern struct paravirt_patch_template pv_ops; -extern void (*paravirt_iret)(void); +extern asm_func_ptr paravirt_iret; #define PARAVIRT_PATCH(x) \ (offsetof(struct paravirt_patch_template, x) / sizeof(void *)) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9ad2acaaae9b..3f5454c9b121 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -449,7 +449,7 @@ static inline unsigned long cpu_kernelmode_gs_base(int cpu) DECLARE_PER_CPU(void *, hardirq_stack_ptr); DECLARE_PER_CPU(bool, hardirq_stack_inuse); -extern asmlinkage void ignore_sysret(void); +DECLARE_ASM_FUNC_SYMBOL(ignore_sysret); /* Save actual FS/GS selectors and bases to current->thread */ void current_save_fsgs(void); diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h index 8c5d1910a848..a6aa64eb3657 100644 --- a/arch/x86/include/asm/proto.h +++ b/arch/x86/include/asm/proto.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_PROTO_H #define _ASM_X86_PROTO_H +#include <linux/linkage.h> #include <asm/ldt.h> struct task_struct; @@ -11,26 +12,26 @@ struct task_struct; void syscall_init(void); #ifdef CONFIG_X86_64 -void entry_SYSCALL_64(void); -void entry_SYSCALL_64_safe_stack(void); +DECLARE_ASM_FUNC_SYMBOL(entry_SYSCALL_64); +DECLARE_ASM_FUNC_SYMBOL(entry_SYSCALL_64_safe_stack); long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2); #endif #ifdef CONFIG_X86_32 -void entry_INT80_32(void); -void entry_SYSENTER_32(void); -void __begin_SYSENTER_singlestep_region(void); -void __end_SYSENTER_singlestep_region(void); +DECLARE_ASM_FUNC_SYMBOL(entry_INT80_32); +DECLARE_ASM_FUNC_SYMBOL(entry_SYSENTER_32); +DECLARE_ASM_FUNC_SYMBOL(__begin_SYSENTER_singlestep_region); +DECLARE_ASM_FUNC_SYMBOL(__end_SYSENTER_singlestep_region); #endif #ifdef CONFIG_IA32_EMULATION -void entry_SYSENTER_compat(void); -void __end_entry_SYSENTER_compat(void); -void entry_SYSCALL_compat(void); -void entry_SYSCALL_compat_safe_stack(void); -void entry_INT80_compat(void); +DECLARE_ASM_FUNC_SYMBOL(entry_SYSENTER_compat); +DECLARE_ASM_FUNC_SYMBOL(__end_entry_SYSENTER_compat); +DECLARE_ASM_FUNC_SYMBOL(entry_SYSCALL_compat); +DECLARE_ASM_FUNC_SYMBOL(entry_SYSCALL_compat_safe_stack); +DECLARE_ASM_FUNC_SYMBOL(entry_INT80_compat); #ifdef CONFIG_XEN_PV -void xen_entry_INT80_compat(void); +DECLARE_ASM_FUNC_SYMBOL(xen_entry_INT80_compat); #endif #endif diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index 45697e04d771..df2be1efa35e 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -17,12 +17,9 @@ */ /* Handles exceptions in both to and from, but doesn't do access_ok */ -__must_check unsigned long -copy_user_enhanced_fast_string(void *to, const void *from, unsigned len); -__must_check unsigned long -copy_user_generic_string(void *to, const void *from, unsigned len); -__must_check unsigned long -copy_user_generic_unrolled(void *to, const void *from, unsigned len); +DECLARE_ASM_FUNC_SYMBOL(copy_user_enhanced_fast_string); +DECLARE_ASM_FUNC_SYMBOL(copy_user_generic_string); +DECLARE_ASM_FUNC_SYMBOL(copy_user_generic_unrolled); static __always_inline __must_check unsigned long copy_user_generic(void *to, const void *from, unsigned len) diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index e9da3dc71254..0c60a7fa6fa5 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -530,7 +530,7 @@ extern struct paravirt_patch_site __start_parainstructions[], * convention such that we can 'call' it from assembly. */ -extern void int3_magic(unsigned int *ptr); /* defined in asm */ +DECLARE_ASM_FUNC_SYMBOL(int3_magic); asm ( " .pushsection .init.text, \"ax\", @progbits\n" diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index 1b3ce3b4a2a2..9e0c07a82b44 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -589,7 +589,7 @@ void arch_ftrace_trampoline_free(struct ftrace_ops *ops) #ifdef CONFIG_FUNCTION_GRAPH_TRACER #ifdef CONFIG_DYNAMIC_FTRACE -extern void ftrace_graph_call(void); +DECLARE_ASM_FUNC_SYMBOL(ftrace_graph_call); static const char *ftrace_jmp_replace(unsigned long ip, unsigned long addr) { diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 04cafc057bed..4196902527d1 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -138,7 +138,7 @@ void paravirt_set_sched_clock(u64 (*func)(void)) } /* These are in entry.S */ -extern void native_iret(void); +DECLARE_ASM_FUNC_SYMBOL(native_iret); static struct resource reserve_ioports = { .start = 0, @@ -376,7 +376,7 @@ NOKPROBE_SYMBOL(native_get_debugreg); NOKPROBE_SYMBOL(native_set_debugreg); NOKPROBE_SYMBOL(native_load_idt); -void (*paravirt_iret)(void) = native_iret; +asm_func_ptr paravirt_iret = native_iret; #endif EXPORT_SYMBOL(pv_ops); diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 2837110e66ed..1f81f939d982 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -201,7 +201,7 @@ struct opcode { const struct escape *esc; const struct instr_dual *idual; const struct mode_dual *mdual; - void (*fastop)(struct fastop *fake); + fastop_t fastop; } u; int (*check_perm)(struct x86_emulate_ctxt *ctxt); }; @@ -322,7 +322,7 @@ static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop); __FOP_RET(#name) #define FOP_START(op) \ - extern void em_##op(struct fastop *fake); \ + DECLARE_ASM_FUNC_SYMBOL(em_##op); \ asm(".pushsection .text, \"ax\" \n\t" \ ".global em_" #op " \n\t" \ ".align " __stringify(FASTOP_SIZE) " \n\t" \ diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h index 68b420289d7e..44c1a9324e1c 100644 --- a/arch/x86/kvm/kvm_emulate.h +++ b/arch/x86/kvm/kvm_emulate.h @@ -290,13 +290,8 @@ enum x86emul_mode { #define X86EMUL_SMM_MASK (1 << 6) #define X86EMUL_SMM_INSIDE_NMI_MASK (1 << 7) -/* - * fastop functions are declared as taking a never-defined fastop parameter, - * so they can't be called from C directly. - */ -struct fastop; - -typedef void (*fastop_t)(struct fastop *); +/* fastop functions cannot be called from C directly. */ +typedef asm_func_ptr fastop_t; struct x86_emulate_ctxt { void *vcpu; diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 753f63734c13..398ba060185a 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -612,8 +612,8 @@ DEFINE_IDTENTRY_RAW(xenpv_exc_machine_check) #endif struct trap_array_entry { - void (*orig)(void); - void (*xen)(void); + asm_func_ptr orig; + asm_func_ptr xen; bool ist_okay; }; @@ -672,7 +672,7 @@ static bool __ref get_trap_addr(void **addr, unsigned int ist) struct trap_array_entry *entry = trap_array + nr; if (*addr == entry->orig) { - *addr = entry->xen; + *addr = (void *)entry->xen; ist_okay = entry->ist_okay; found = true; break; diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index 8d7ec49a35fb..b5ceb3007cfe 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -8,12 +8,12 @@ #include <xen/xen-ops.h> /* These are code, but not functions. Defined in entry.S */ -extern const char xen_failsafe_callback[]; +DECLARE_ASM_FUNC_SYMBOL(xen_failsafe_callback); -void xen_sysenter_target(void); +DECLARE_ASM_FUNC_SYMBOL(xen_sysenter_target); #ifdef CONFIG_X86_64 -void xen_syscall_target(void); -void xen_syscall32_target(void); +DECLARE_ASM_FUNC_SYMBOL(xen_syscall_target); +DECLARE_ASM_FUNC_SYMBOL(xen_syscall32_target); #endif extern void *xen_initial_gdt; @@ -136,7 +136,7 @@ __visible unsigned long xen_read_cr2(void); __visible unsigned long xen_read_cr2_direct(void); /* These are not functions, and cannot be called normally */ -__visible void xen_iret(void); +DECLARE_ASM_FUNC_SYMBOL(xen_iret); extern int xen_panic_handler_init(void); -- 2.33.0.309.g3052b89438-goog
Exception tables are populated in assembly code, but the handlers are called in fixup_exception, which trips indirect call checking with CONFIG_CFI_CLANG. Mark the handlers __cficanonical to allow addresses taken in assembly to pass CFI checking. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/mm/extable.c | 64 ++++++++++++++++++++++++------------------- 1 file changed, 36 insertions(+), 28 deletions(-) diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index e1664e9f969c..d16912dcbb4e 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -24,16 +24,18 @@ ex_fixup_handler(const struct exception_table_entry *x) return (ex_handler_t)((unsigned long)&x->handler + x->handler); } -__visible bool ex_handler_default(const struct exception_table_entry *fixup, - struct pt_regs *regs, int trapnr, - unsigned long error_code, - unsigned long fault_addr) +__visible __cficanonical +bool ex_handler_default(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) { regs->ip = ex_fixup_addr(fixup); return true; } EXPORT_SYMBOL(ex_handler_default); +__visible __cficanonical __visible bool ex_handler_fault(const struct exception_table_entry *fixup, struct pt_regs *regs, int trapnr, unsigned long error_code, @@ -55,10 +57,11 @@ EXPORT_SYMBOL_GPL(ex_handler_fault); * of vulnerability by restoring from the initial state (essentially, zeroing * out all the FPU registers) if we can't restore from the task's FPU state. */ -__visible bool ex_handler_fprestore(const struct exception_table_entry *fixup, - struct pt_regs *regs, int trapnr, - unsigned long error_code, - unsigned long fault_addr) +__visible __cficanonical +bool ex_handler_fprestore(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) { regs->ip = ex_fixup_addr(fixup); @@ -70,10 +73,11 @@ __visible bool ex_handler_fprestore(const struct exception_table_entry *fixup, } EXPORT_SYMBOL_GPL(ex_handler_fprestore); -__visible bool ex_handler_uaccess(const struct exception_table_entry *fixup, - struct pt_regs *regs, int trapnr, - unsigned long error_code, - unsigned long fault_addr) +__visible __cficanonical +bool ex_handler_uaccess(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) { WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. Non-canonical address?"); regs->ip = ex_fixup_addr(fixup); @@ -81,10 +85,11 @@ __visible bool ex_handler_uaccess(const struct exception_table_entry *fixup, } EXPORT_SYMBOL(ex_handler_uaccess); -__visible bool ex_handler_copy(const struct exception_table_entry *fixup, - struct pt_regs *regs, int trapnr, - unsigned long error_code, - unsigned long fault_addr) +__visible __cficanonical +bool ex_handler_copy(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) { WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. Non-canonical address?"); regs->ip = ex_fixup_addr(fixup); @@ -93,10 +98,11 @@ __visible bool ex_handler_copy(const struct exception_table_entry *fixup, } EXPORT_SYMBOL(ex_handler_copy); -__visible bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup, - struct pt_regs *regs, int trapnr, - unsigned long error_code, - unsigned long fault_addr) +__visible __cficanonical +bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) { if (pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n", (unsigned int)regs->cx, regs->ip, (void *)regs->ip)) @@ -110,10 +116,11 @@ __visible bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup } EXPORT_SYMBOL(ex_handler_rdmsr_unsafe); -__visible bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup, - struct pt_regs *regs, int trapnr, - unsigned long error_code, - unsigned long fault_addr) +__visible __cficanonical +bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) { if (pr_warn_once("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n", (unsigned int)regs->cx, (unsigned int)regs->dx, @@ -126,10 +133,11 @@ __visible bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup } EXPORT_SYMBOL(ex_handler_wrmsr_unsafe); -__visible bool ex_handler_clear_fs(const struct exception_table_entry *fixup, - struct pt_regs *regs, int trapnr, - unsigned long error_code, - unsigned long fault_addr) +__visible __cficanonical +bool ex_handler_clear_fs(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) { if (static_cpu_has(X86_BUG_NULL_SEG)) asm volatile ("mov %0, %%fs" : : "rm" (__USER_DS)); -- 2.33.0.309.g3052b89438-goog
Disable CONFIG_CFI_CLANG for the stand-alone purgatory.ro. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/purgatory/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile index 95ea17a9d20c..ed46ad780130 100644 --- a/arch/x86/purgatory/Makefile +++ b/arch/x86/purgatory/Makefile @@ -31,7 +31,7 @@ KCOV_INSTRUMENT := n # These are adjustments to the compiler flags used for objects that # make up the standalone purgatory.ro -PURGATORY_CFLAGS_REMOVE := -mcmodel=kernel +PURGATORY_CFLAGS_REMOVE := -mcmodel=kernel $(CC_FLAGS_CFI) PURGATORY_CFLAGS := -mcmodel=large -ffreestanding -fno-zero-initialized-in-bss -g0 PURGATORY_CFLAGS += $(DISABLE_STACKLEAK_PLUGIN) -DDISABLE_BRANCH_PROFILING PURGATORY_CFLAGS += -fno-stack-protector -- 2.33.0.309.g3052b89438-goog
From: Kees Cook <keescook@chromium.org> The __typeid__* symbols aren't actually relocations, so they can be ignored during relocation generation. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/tools/relocs.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c index 27c82207d387..5304a6037924 100644 --- a/arch/x86/tools/relocs.c +++ b/arch/x86/tools/relocs.c @@ -51,6 +51,7 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = { "^(xen_irq_disable_direct_reloc$|" "xen_save_fl_direct_reloc$|" "VDSO|" + "__typeid__|" "__crc_)", /* @@ -811,6 +812,12 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym, symname); break; + case R_X86_64_8: + if (!shn_abs || !is_reloc(S_ABS, symname)) + die("Non-whitelisted %s relocation: %s\n", + rel_type(r_type), symname); + break; + case R_X86_64_32: case R_X86_64_32S: case R_X86_64_64: -- 2.33.0.309.g3052b89438-goog
Ignore the __typeid__ relocations generated with CONFIG_CFI_CLANG when loading modules. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/kernel/module.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c index 5e9a34b5bd74..c4aeba237eef 100644 --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -197,6 +197,10 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs, val -= (u64)loc; write(loc, &val, 8); break; + case R_X86_64_8: + if (!strncmp(strtab + sym->st_name, "__typeid__", 10)) + break; + fallthrough; default: pr_err("%s: Unknown rela relocation: %llu\n", me->name, ELF64_R_TYPE(rel[i].r_info)); -- 2.33.0.309.g3052b89438-goog
Allow LTO to be used for cpu.c when CONFIG_CFI_CLANG is enabled to avoid indirect call failures. CFI requires Clang >= 13, which doesn't have the stack protector inlining bug. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/power/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/power/Makefile b/arch/x86/power/Makefile index 379777572bc9..a0532851fed7 100644 --- a/arch/x86/power/Makefile +++ b/arch/x86/power/Makefile @@ -4,9 +4,11 @@ # itself be stack-protected CFLAGS_cpu.o := -fno-stack-protector +ifndef CONFIG_CFI_CLANG # Clang may incorrectly inline functions with stack protector enabled into # __restore_processor_state(): https://bugs.llvm.org/show_bug.cgi?id=47479 CFLAGS_REMOVE_cpu.o := $(CC_FLAGS_LTO) +endif obj-$(CONFIG_PM_SLEEP) += cpu.o obj-$(CONFIG_HIBERNATION) += hibernate_$(BITS).o hibernate_asm_$(BITS).o hibernate.o -- 2.33.0.309.g3052b89438-goog
The optprobe_template_func symbol is defined in inline assembly, but it's not marked global, which conflicts with the C declaration needed for STACK_FRAME_NON_STANDARD and confuses the compiler when CONFIG_CFI_CLANG is enabled. Marking the symbol global would make the compiler happy, but as the compiler also generates a CFI jump table entry for all address-taken functions, the jump table ends up containing a jump to the .rodata section where optprobe_template_func resides, which results in an objtool warning. Use ASM_STACK_FRAME_NON_STANDARD instead to avoid both issues. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/kernel/kprobes/opt.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c index 71425ebba98a..95375ef5deee 100644 --- a/arch/x86/kernel/kprobes/opt.c +++ b/arch/x86/kernel/kprobes/opt.c @@ -103,6 +103,7 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val) asm ( ".pushsection .rodata\n" "optprobe_template_func:\n" + ASM_STACK_FRAME_NON_STANDARD(optprobe_template_func) ".global optprobe_template_entry\n" "optprobe_template_entry:\n" #ifdef CONFIG_X86_64 @@ -154,9 +155,6 @@ asm ( "optprobe_template_end:\n" ".popsection\n"); -void optprobe_template_func(void); -STACK_FRAME_NON_STANDARD(optprobe_template_func); - #define TMPL_CLAC_IDX \ ((long)optprobe_template_clac - (long)optprobe_template_entry) #define TMPL_MOVE_IDX \ -- 2.33.0.309.g3052b89438-goog
Select ARCH_SUPPORTS_CFI_CLANG to allow CFI to be enabled with Clang >= 13. Link: https://bugs.llvm.org/show_bug.cgi?id=51588 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 4e001bbbb425..0df0285d3ed4 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -107,6 +107,7 @@ config X86 select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP if NR_CPUS <= 4096 select ARCH_SUPPORTS_LTO_CLANG select ARCH_SUPPORTS_LTO_CLANG_THIN + select ARCH_SUPPORTS_CFI_CLANG if X86_64 && CLANG_VERSION >= 130000 select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_MEMTEST select ARCH_USE_QUEUED_RWLOCKS -- 2.33.0.309.g3052b89438-goog
On Tue, Sep 14, 2021 at 12:10 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > With CONFIG_CFI_CLANG, the compiler replaces function references with > references to the CFI jump table, which confuses objtool. This change, > based on Josh's initial patch [1], goes through the list of relocations > and replaces jump table symbols with the actual function symbols. > > [1] https://lore.kernel.org/r/d743f4b36e120c06506567a9f87a062ae03da47f.1611263462.git.jpoimboe@redhat.com/ > > Reported-by: Sedat Dilek <sedat.dilek@gmail.com> > Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > tools/objtool/arch/x86/decode.c | 16 +++++++++ > tools/objtool/elf.c | 51 ++++++++++++++++++++++++++++ > tools/objtool/include/objtool/arch.h | 3 ++ > tools/objtool/include/objtool/elf.h | 2 +- > 4 files changed, 71 insertions(+), 1 deletion(-) > > diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c > index bc821056aba9..318189c8065e 100644 > --- a/tools/objtool/arch/x86/decode.c > +++ b/tools/objtool/arch/x86/decode.c > @@ -62,6 +62,22 @@ bool arch_callee_saved_reg(unsigned char reg) > } > } > > +unsigned long arch_cfi_section_reloc_offset(struct reloc *reloc) > +{ > + if (!reloc->addend) > + return 0; > + > + if (reloc->type == R_X86_64_PC32 || reloc->type == R_X86_64_PLT32) > + return reloc->addend + 4; > + > + return reloc->addend; > +} > + > +unsigned long arch_cfi_jump_reloc_offset(unsigned long offset) > +{ > + return offset + 1; > +} > + > unsigned long arch_dest_reloc_offset(int addend) > { > return addend + 4; > diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c > index 8676c7598728..05a5f51aad2c 100644 > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -18,6 +18,7 @@ > #include <errno.h> > #include <objtool/builtin.h> > > +#include <objtool/arch.h> > #include <objtool/elf.h> > #include <objtool/warn.h> > > @@ -291,6 +292,10 @@ static int read_sections(struct elf *elf) > if (sec->sh.sh_flags & SHF_EXECINSTR) > elf->text_size += sec->len; > > + /* Detect -fsanitize=cfi jump table sections */ > + if (!strncmp(sec->name, ".text..L.cfi.jumptable", 22)) > + sec->cfi_jt = true; > + > list_add_tail(&sec->list, &elf->sections); > elf_hash_add(section, &sec->hash, sec->idx); > elf_hash_add(section_name, &sec->name_hash, str_hash(sec->name)); > @@ -576,6 +581,49 @@ static int read_rela_reloc(struct section *sec, int i, struct reloc *reloc, unsi > return 0; > } > > +/* > + * CONFIG_CFI_CLANG replaces function relocations to refer to an intermediate > + * jump table. Undo the conversion so objtool can make sense of things. > + */ > +static int fix_cfi_relocs(const struct elf *elf) > +{ > + struct section *sec; > + struct reloc *reloc; > + > + list_for_each_entry(sec, &elf->sections, list) { > + list_for_each_entry(reloc, &sec->reloc_list, list) { > + struct reloc *cfi_reloc; > + unsigned long offset; > + > + if (!reloc->sym->sec->cfi_jt) > + continue; > + > + if (reloc->sym->type == STT_SECTION) > + offset = arch_cfi_section_reloc_offset(reloc); > + else > + offset = reloc->sym->offset; > + > + /* > + * The jump table immediately jumps to the actual function, > + * so look up the relocation there. > + */ > + offset = arch_cfi_jump_reloc_offset(offset); Sorry, this comment is curious to me, it looks like we jump to the offset+1, not directly to the actual function? Perhaps a comment above arch_cfi_jump_reloc_offset() and/or amending this comment might make it clearer? Sorry if this is obvious to others? Perhaps comments can be cleaned up in a follow up, if this is not a bug? > + cfi_reloc = find_reloc_by_dest(elf, reloc->sym->sec, offset); > + > + if (!cfi_reloc || !cfi_reloc->sym) { > + WARN("can't find a CFI jump table relocation at %s+0x%lx", > + reloc->sym->sec->name, offset); > + return -1; > + } > + > + reloc->sym = cfi_reloc->sym; > + reloc->addend = 0; > + } > + } > + > + return 0; > +} > + > static int read_relocs(struct elf *elf) > { > struct section *sec; > @@ -639,6 +687,9 @@ static int read_relocs(struct elf *elf) > tot_reloc += nr_reloc; > } > > + if (fix_cfi_relocs(elf)) > + return -1; > + > if (stats) { > printf("max_reloc: %lu\n", max_reloc); > printf("tot_reloc: %lu\n", tot_reloc); > diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h > index 062bb6e9b865..2205b2b08268 100644 > --- a/tools/objtool/include/objtool/arch.h > +++ b/tools/objtool/include/objtool/arch.h > @@ -81,6 +81,9 @@ unsigned long arch_jump_destination(struct instruction *insn); > > unsigned long arch_dest_reloc_offset(int addend); > > +unsigned long arch_cfi_section_reloc_offset(struct reloc *reloc); > +unsigned long arch_cfi_jump_reloc_offset(unsigned long offset); > + > const char *arch_nop_insn(int len); > > int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg); > diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h > index e34395047530..d9c1dacc6572 100644 > --- a/tools/objtool/include/objtool/elf.h > +++ b/tools/objtool/include/objtool/elf.h > @@ -39,7 +39,7 @@ struct section { > char *name; > int idx; > unsigned int len; > - bool changed, text, rodata, noinstr; > + bool changed, text, rodata, noinstr, cfi_jt; > }; > > struct symbol { > -- > 2.33.0.309.g3052b89438-goog > -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 12:10:36PM -0700, Sami Tolvanen wrote:
> Disable the UNSET_SMEP test when CONFIG_CFI_CLANG is enabled as
> jumping to a call gadget would always trip CFI instead.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Yeah, (thankfully) this test can't work sanely under CFI.
Acked-by: Kees Cook <keescook@chromium.org>
--
Kees Cook
On Tue, Sep 14, 2021 at 12:10:37PM -0700, Sami Tolvanen wrote:
> Use an opaque type for lkdtm_rodata_do_nothing to stop the compiler
> from generating a CFI jump table entry that jumps to .rodata.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
--
Kees Cook
On Tue, Sep 14, 2021 at 12:10:38PM -0700, Sami Tolvanen wrote:
> The kernel has several assembly functions that are not directly callable
> from C. Use an opaque type for these function prototypes to make misuse
> harder, and to avoid the need to annotate references to these functions
> for Clang's Control-Flow Integrity (CFI).
>
> Suggested-by: Andy Lutomirski <luto@amacapital.net>
> Suggested-by: Alexander Lobakin <alobakin@pm.me>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
This looks clean to me.
Reviewed-by: Kees Cook <keescook@chromium.org>
--
Kees Cook
On Tue, Sep 14, 2021 at 12:10 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > This change introduces the DEFINE_CFI_IMMEDIATE_RETURN_STUB macro, > which defines a stub function that immediately returns and when > defined in the core kernel, always passes indirect call checking > with CONFIG_CFI_CLANG. Note that this macro should only be used when > a stub cannot be called using the correct function type. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > include/asm-generic/vmlinux.lds.h | 11 +++++++++++ > include/linux/cfi.h | 14 ++++++++++++++ > kernel/cfi.c | 24 +++++++++++++++++++++++- > 3 files changed, 48 insertions(+), 1 deletion(-) > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h > index f2984af2b85b..5b77284f7221 100644 > --- a/include/asm-generic/vmlinux.lds.h > +++ b/include/asm-generic/vmlinux.lds.h > @@ -407,6 +407,16 @@ > KEEP(*(.static_call_tramp_key)) \ > __stop_static_call_tramp_key = .; > > +#ifdef CONFIG_CFI_CLANG > +#define CFI_EXCLUDED_DATA \ > + . = ALIGN(8); \ > + __start_cfi_excluded = .; \ > + KEEP(*(.cfi_excluded_stubs)) \ > + __stop_cfi_excluded = .; > +#else > +#define CFI_EXCLUDED_DATA > +#endif > + > /* > * Allow architectures to handle ro_after_init data on their > * own by defining an empty RO_AFTER_INIT_DATA. > @@ -430,6 +440,7 @@ > __start_rodata = .; \ > *(.rodata) *(.rodata.*) \ > SCHED_DATA \ > + CFI_EXCLUDED_DATA \ > RO_AFTER_INIT_DATA /* Read only after init */ \ > . = ALIGN(8); \ > __start___tracepoints_ptrs = .; \ > diff --git a/include/linux/cfi.h b/include/linux/cfi.h > index 879744aaa6e0..9ebf67a0d421 100644 > --- a/include/linux/cfi.h > +++ b/include/linux/cfi.h > @@ -20,6 +20,18 @@ extern void __cfi_check(uint64_t id, void *ptr, void *diag); > #define __CFI_ADDRESSABLE(fn, __attr) \ > const void *__cfi_jt_ ## fn __visible __attr = (void *)&fn > > +/* > + * Defines a stub function that returns immediately, and when defined and > + * referenced in the core kernel, always passes CFI checking. This should > + * be used only for stubs that cannot be called using the correct function > + * pointer type, which should be rare. > + */ > +#define DEFINE_CFI_IMMEDIATE_RETURN_STUB(fn) \ > + void fn(void) { return; } \ > + const void *__cfi_excl_ ## fn __visible \ > + __attribute__((__section__(".cfi_excluded_stubs"))) \ Can we use __section from include/linux/compiler_attributes.h here rather than open coding the function attribute? > + = (void *)&fn > + > #ifdef CONFIG_CFI_CLANG_SHADOW > > extern void cfi_module_add(struct module *mod, unsigned long base_addr); > @@ -35,6 +47,8 @@ static inline void cfi_module_remove(struct module *mod, unsigned long base_addr > #else /* !CONFIG_CFI_CLANG */ > > #define __CFI_ADDRESSABLE(fn, __attr) > +#define DEFINE_CFI_IMMEDIATE_RETURN_STUB(fn) \ > + void fn(void) { return; } > > #endif /* CONFIG_CFI_CLANG */ > > diff --git a/kernel/cfi.c b/kernel/cfi.c > index 9594cfd1cf2c..8d931089141b 100644 > --- a/kernel/cfi.c > +++ b/kernel/cfi.c > @@ -278,12 +278,34 @@ static inline cfi_check_fn find_module_check_fn(unsigned long ptr) > return fn; > } > > +extern unsigned long __start_cfi_excluded[]; > +extern unsigned long __stop_cfi_excluded[]; > + > +static inline bool is_cfi_excluded(unsigned long ptr) > +{ > + unsigned long *p = __start_cfi_excluded; > + > + for ( ; p < __stop_cfi_excluded; ++p) > + if (*p == ptr) > + return true; > + > + return false; > +} > + > +static void __cfi_pass(uint64_t id, void *ptr, void *diag) > +{ > +} > + > static inline cfi_check_fn find_check_fn(unsigned long ptr) > { > cfi_check_fn fn = NULL; > > - if (is_kernel_text(ptr)) > + if (is_kernel_text(ptr)) { > + if (unlikely(is_cfi_excluded(ptr))) > + return __cfi_pass; > + > return __cfi_check; > + } > > /* > * Indirect call checks can happen when RCU is not watching. Both > -- > 2.33.0.309.g3052b89438-goog > -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 12:10:39PM -0700, Sami Tolvanen wrote: > Exception tables are populated in assembly code, but the handlers are > called in fixup_exception, which trips indirect call checking with > CONFIG_CFI_CLANG. Mark the handlers __cficanonical to allow addresses > taken in assembly to pass CFI checking. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > arch/x86/mm/extable.c | 64 ++++++++++++++++++++++++------------------- > 1 file changed, 36 insertions(+), 28 deletions(-) > > diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c > index e1664e9f969c..d16912dcbb4e 100644 > --- a/arch/x86/mm/extable.c > +++ b/arch/x86/mm/extable.c > @@ -24,16 +24,18 @@ ex_fixup_handler(const struct exception_table_entry *x) > return (ex_handler_t)((unsigned long)&x->handler + x->handler); > } > > -__visible bool ex_handler_default(const struct exception_table_entry *fixup, > - struct pt_regs *regs, int trapnr, > - unsigned long error_code, > - unsigned long fault_addr) > +__visible __cficanonical > +bool ex_handler_default(const struct exception_table_entry *fixup, > + struct pt_regs *regs, int trapnr, > + unsigned long error_code, > + unsigned long fault_addr) > { > regs->ip = ex_fixup_addr(fixup); > return true; > } > EXPORT_SYMBOL(ex_handler_default); > > +__visible __cficanonical > __visible bool ex_handler_fault(const struct exception_table_entry *fixup, Double __visible here, but with that fixed: Reviewed-by: Kees Cook <keescook@chromium.org> I would note that given Linus's recent comments on attribute locations, it does seem that __cficanonical is more a function behavior attribute than a storage class... I'm not really sure: https://lore.kernel.org/mm-commits/CAHk-=wiOCLRny5aifWNhr621kYrJwhfURsa0vFPeUEm8mF0ufg@mail.gmail.com -Kees > struct pt_regs *regs, int trapnr, > unsigned long error_code, > @@ -55,10 +57,11 @@ EXPORT_SYMBOL_GPL(ex_handler_fault); > * of vulnerability by restoring from the initial state (essentially, zeroing > * out all the FPU registers) if we can't restore from the task's FPU state. > */ > -__visible bool ex_handler_fprestore(const struct exception_table_entry *fixup, > - struct pt_regs *regs, int trapnr, > - unsigned long error_code, > - unsigned long fault_addr) > +__visible __cficanonical > +bool ex_handler_fprestore(const struct exception_table_entry *fixup, > + struct pt_regs *regs, int trapnr, > + unsigned long error_code, > + unsigned long fault_addr) > { > regs->ip = ex_fixup_addr(fixup); > > @@ -70,10 +73,11 @@ __visible bool ex_handler_fprestore(const struct exception_table_entry *fixup, > } > EXPORT_SYMBOL_GPL(ex_handler_fprestore); > > -__visible bool ex_handler_uaccess(const struct exception_table_entry *fixup, > - struct pt_regs *regs, int trapnr, > - unsigned long error_code, > - unsigned long fault_addr) > +__visible __cficanonical > +bool ex_handler_uaccess(const struct exception_table_entry *fixup, > + struct pt_regs *regs, int trapnr, > + unsigned long error_code, > + unsigned long fault_addr) > { > WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. Non-canonical address?"); > regs->ip = ex_fixup_addr(fixup); > @@ -81,10 +85,11 @@ __visible bool ex_handler_uaccess(const struct exception_table_entry *fixup, > } > EXPORT_SYMBOL(ex_handler_uaccess); > > -__visible bool ex_handler_copy(const struct exception_table_entry *fixup, > - struct pt_regs *regs, int trapnr, > - unsigned long error_code, > - unsigned long fault_addr) > +__visible __cficanonical > +bool ex_handler_copy(const struct exception_table_entry *fixup, > + struct pt_regs *regs, int trapnr, > + unsigned long error_code, > + unsigned long fault_addr) > { > WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. Non-canonical address?"); > regs->ip = ex_fixup_addr(fixup); > @@ -93,10 +98,11 @@ __visible bool ex_handler_copy(const struct exception_table_entry *fixup, > } > EXPORT_SYMBOL(ex_handler_copy); > > -__visible bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup, > - struct pt_regs *regs, int trapnr, > - unsigned long error_code, > - unsigned long fault_addr) > +__visible __cficanonical > +bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup, > + struct pt_regs *regs, int trapnr, > + unsigned long error_code, > + unsigned long fault_addr) > { > if (pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n", > (unsigned int)regs->cx, regs->ip, (void *)regs->ip)) > @@ -110,10 +116,11 @@ __visible bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup > } > EXPORT_SYMBOL(ex_handler_rdmsr_unsafe); > > -__visible bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup, > - struct pt_regs *regs, int trapnr, > - unsigned long error_code, > - unsigned long fault_addr) > +__visible __cficanonical > +bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup, > + struct pt_regs *regs, int trapnr, > + unsigned long error_code, > + unsigned long fault_addr) > { > if (pr_warn_once("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n", > (unsigned int)regs->cx, (unsigned int)regs->dx, > @@ -126,10 +133,11 @@ __visible bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup > } > EXPORT_SYMBOL(ex_handler_wrmsr_unsafe); > > -__visible bool ex_handler_clear_fs(const struct exception_table_entry *fixup, > - struct pt_regs *regs, int trapnr, > - unsigned long error_code, > - unsigned long fault_addr) > +__visible __cficanonical > +bool ex_handler_clear_fs(const struct exception_table_entry *fixup, > + struct pt_regs *regs, int trapnr, > + unsigned long error_code, > + unsigned long fault_addr) > { > if (static_cpu_has(X86_BUG_NULL_SEG)) > asm volatile ("mov %0, %%fs" : : "rm" (__USER_DS)); > -- > 2.33.0.309.g3052b89438-goog > -- Kees Cook
On Tue, Sep 14, 2021 at 12:11 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > If allocate_probes fails, func_remove replaces the old function > with a pointer to tp_stub_func, which is called using a mismatching > function pointer that's will always trip indirect call checks with > CONFIG_CFI_CLANG. Use DEFINE_CFI_IMMEDATE_RETURN_STUB to define > tp_stub_func to allow it to pass CFI checking. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> > --- > kernel/tracepoint.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c > index 64ea283f2f86..58acc7d86c3f 100644 > --- a/kernel/tracepoint.c > +++ b/kernel/tracepoint.c > @@ -99,10 +99,7 @@ struct tp_probes { > }; > > /* Called in removal of a func but failed to allocate a new tp_funcs */ > -static void tp_stub_func(void) > -{ > - return; > -} > +static DEFINE_CFI_IMMEDIATE_RETURN_STUB(tp_stub_func); > > static inline void *allocate_probes(int count) > { > -- > 2.33.0.309.g3052b89438-goog > -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 12:10:44PM -0700, Sami Tolvanen wrote:
> The optprobe_template_func symbol is defined in inline assembly,
> but it's not marked global, which conflicts with the C declaration
> needed for STACK_FRAME_NON_STANDARD and confuses the compiler when
> CONFIG_CFI_CLANG is enabled.
>
> Marking the symbol global would make the compiler happy, but as the
> compiler also generates a CFI jump table entry for all address-taken
> functions, the jump table ends up containing a jump to the .rodata
> section where optprobe_template_func resides, which results in an
> objtool warning.
>
> Use ASM_STACK_FRAME_NON_STANDARD instead to avoid both issues.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
--
Kees Cook
On Tue, Sep 14, 2021 at 12:10:43PM -0700, Sami Tolvanen wrote: > Allow LTO to be used for cpu.c when CONFIG_CFI_CLANG is enabled to avoid > indirect call failures. CFI requires Clang >= 13, which doesn't have the > stack protector inlining bug. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > arch/x86/power/Makefile | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/x86/power/Makefile b/arch/x86/power/Makefile > index 379777572bc9..a0532851fed7 100644 > --- a/arch/x86/power/Makefile > +++ b/arch/x86/power/Makefile > @@ -4,9 +4,11 @@ > # itself be stack-protected > CFLAGS_cpu.o := -fno-stack-protector > > +ifndef CONFIG_CFI_CLANG > # Clang may incorrectly inline functions with stack protector enabled into > # __restore_processor_state(): https://bugs.llvm.org/show_bug.cgi?id=47479 > CFLAGS_REMOVE_cpu.o := $(CC_FLAGS_LTO) > +endif This seems weird -- it's tangential that CONFIG_CFI_CLANG is only on Clang >= 13, but it does seem better than adding a $(shell) to test the Clang version. So: Reviewed-by: Kees Cook <keescook@chromium.org> > > obj-$(CONFIG_PM_SLEEP) += cpu.o > obj-$(CONFIG_HIBERNATION) += hibernate_$(BITS).o hibernate_asm_$(BITS).o hibernate.o > -- > 2.33.0.309.g3052b89438-goog > -- Kees Cook
On Tue, Sep 14, 2021 at 12:11 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > Allow LTO to be used for cpu.c when CONFIG_CFI_CLANG is enabled to avoid > indirect call failures. CFI requires Clang >= 13, which doesn't have the > stack protector inlining bug. True, that was fixed in clang-12. I hope to one day use __attribute__((no_stack_protector)) to clean up this chem spill. Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Technically, GCC still has this bug. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > arch/x86/power/Makefile | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/x86/power/Makefile b/arch/x86/power/Makefile > index 379777572bc9..a0532851fed7 100644 > --- a/arch/x86/power/Makefile > +++ b/arch/x86/power/Makefile > @@ -4,9 +4,11 @@ > # itself be stack-protected > CFLAGS_cpu.o := -fno-stack-protector > > +ifndef CONFIG_CFI_CLANG > # Clang may incorrectly inline functions with stack protector enabled into > # __restore_processor_state(): https://bugs.llvm.org/show_bug.cgi?id=47479 > CFLAGS_REMOVE_cpu.o := $(CC_FLAGS_LTO) > +endif > > obj-$(CONFIG_PM_SLEEP) += cpu.o > obj-$(CONFIG_HIBERNATION) += hibernate_$(BITS).o hibernate_asm_$(BITS).o hibernate.o > -- > 2.33.0.309.g3052b89438-goog > -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 12:11 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > Disable CONFIG_CFI_CLANG for the stand-alone purgatory.ro. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> I kind of prefer the existing convention that has explicit guards on specific configs (ie. CONFIG_FUNCTION_TRACER, CONFIG_STACKPROTECTOR, CONFIG_STACKPROTECTOR_STRONG, CONFIG_RETPOLINE); it's more obvious which configs may introduce which flags that are problematic. This patch is ok as is, but it kind of makes this Makefile more inconsistent. I would prefer we had the explicit checks. Does CFI actually do any instrumentation in these object files? I guess issues in purgatory cause silent/hard to debug kexec failures? > --- > arch/x86/purgatory/Makefile | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile > index 95ea17a9d20c..ed46ad780130 100644 > --- a/arch/x86/purgatory/Makefile > +++ b/arch/x86/purgatory/Makefile > @@ -31,7 +31,7 @@ KCOV_INSTRUMENT := n > # These are adjustments to the compiler flags used for objects that > # make up the standalone purgatory.ro > > -PURGATORY_CFLAGS_REMOVE := -mcmodel=kernel > +PURGATORY_CFLAGS_REMOVE := -mcmodel=kernel $(CC_FLAGS_CFI) > PURGATORY_CFLAGS := -mcmodel=large -ffreestanding -fno-zero-initialized-in-bss -g0 > PURGATORY_CFLAGS += $(DISABLE_STACKLEAK_PLUGIN) -DDISABLE_BRANCH_PROFILING > PURGATORY_CFLAGS += -fno-stack-protector > -- > 2.33.0.309.g3052b89438-goog > -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 1:02 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Sep 14, 2021 at 12:11 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > Disable CONFIG_CFI_CLANG for the stand-alone purgatory.ro. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > I kind of prefer the existing convention that has explicit guards on > specific configs (ie. CONFIG_FUNCTION_TRACER, CONFIG_STACKPROTECTOR, > CONFIG_STACKPROTECTOR_STRONG, CONFIG_RETPOLINE); it's more obvious > which configs may introduce which flags that are problematic. This > patch is ok as is, but it kind of makes this Makefile more > inconsistent. I would prefer we had the explicit checks. The Makefile does already use DISABLE_STACKLEAK_PLUGIN in a similar way, but I don't have a strong preference here. I can move this into an ifdef if it makes things cleaner. > Does CFI actually do any instrumentation in these object files? I > guess issues in purgatory cause silent/hard to debug kexec failures? The compiler shouldn't add any actual CFI instrumentation here right now, but I would prefer to avoid issues in future. Sami
On Tue, Sep 14, 2021 at 12:36 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Sep 14, 2021 at 12:10 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > This change introduces the DEFINE_CFI_IMMEDIATE_RETURN_STUB macro,
> > which defines a stub function that immediately returns and when
> > defined in the core kernel, always passes indirect call checking
> > with CONFIG_CFI_CLANG. Note that this macro should only be used when
> > a stub cannot be called using the correct function type.
> >
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > ---
> > include/asm-generic/vmlinux.lds.h | 11 +++++++++++
> > include/linux/cfi.h | 14 ++++++++++++++
> > kernel/cfi.c | 24 +++++++++++++++++++++++-
> > 3 files changed, 48 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index f2984af2b85b..5b77284f7221 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -407,6 +407,16 @@
> > KEEP(*(.static_call_tramp_key)) \
> > __stop_static_call_tramp_key = .;
> >
> > +#ifdef CONFIG_CFI_CLANG
> > +#define CFI_EXCLUDED_DATA \
> > + . = ALIGN(8); \
> > + __start_cfi_excluded = .; \
> > + KEEP(*(.cfi_excluded_stubs)) \
> > + __stop_cfi_excluded = .;
> > +#else
> > +#define CFI_EXCLUDED_DATA
> > +#endif
> > +
> > /*
> > * Allow architectures to handle ro_after_init data on their
> > * own by defining an empty RO_AFTER_INIT_DATA.
> > @@ -430,6 +440,7 @@
> > __start_rodata = .; \
> > *(.rodata) *(.rodata.*) \
> > SCHED_DATA \
> > + CFI_EXCLUDED_DATA \
> > RO_AFTER_INIT_DATA /* Read only after init */ \
> > . = ALIGN(8); \
> > __start___tracepoints_ptrs = .; \
> > diff --git a/include/linux/cfi.h b/include/linux/cfi.h
> > index 879744aaa6e0..9ebf67a0d421 100644
> > --- a/include/linux/cfi.h
> > +++ b/include/linux/cfi.h
> > @@ -20,6 +20,18 @@ extern void __cfi_check(uint64_t id, void *ptr, void *diag);
> > #define __CFI_ADDRESSABLE(fn, __attr) \
> > const void *__cfi_jt_ ## fn __visible __attr = (void *)&fn
> >
> > +/*
> > + * Defines a stub function that returns immediately, and when defined and
> > + * referenced in the core kernel, always passes CFI checking. This should
> > + * be used only for stubs that cannot be called using the correct function
> > + * pointer type, which should be rare.
> > + */
> > +#define DEFINE_CFI_IMMEDIATE_RETURN_STUB(fn) \
> > + void fn(void) { return; } \
> > + const void *__cfi_excl_ ## fn __visible \
> > + __attribute__((__section__(".cfi_excluded_stubs"))) \
>
> Can we use __section from include/linux/compiler_attributes.h here
> rather than open coding the function attribute?
Sure. I'll change this in v4.
Sami
On Tue, Sep 14, 2021 at 12:37 PM Kees Cook <keescook@chromium.org> wrote: > > On Tue, Sep 14, 2021 at 12:10:39PM -0700, Sami Tolvanen wrote: > > Exception tables are populated in assembly code, but the handlers are > > called in fixup_exception, which trips indirect call checking with > > CONFIG_CFI_CLANG. Mark the handlers __cficanonical to allow addresses > > taken in assembly to pass CFI checking. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > --- > > arch/x86/mm/extable.c | 64 ++++++++++++++++++++++++------------------- > > 1 file changed, 36 insertions(+), 28 deletions(-) > > > > diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c > > index e1664e9f969c..d16912dcbb4e 100644 > > --- a/arch/x86/mm/extable.c > > +++ b/arch/x86/mm/extable.c > > @@ -24,16 +24,18 @@ ex_fixup_handler(const struct exception_table_entry *x) > > return (ex_handler_t)((unsigned long)&x->handler + x->handler); > > } > > > > -__visible bool ex_handler_default(const struct exception_table_entry *fixup, > > - struct pt_regs *regs, int trapnr, > > - unsigned long error_code, > > - unsigned long fault_addr) > > +__visible __cficanonical > > +bool ex_handler_default(const struct exception_table_entry *fixup, > > + struct pt_regs *regs, int trapnr, > > + unsigned long error_code, > > + unsigned long fault_addr) > > { > > regs->ip = ex_fixup_addr(fixup); > > return true; > > } > > EXPORT_SYMBOL(ex_handler_default); > > > > +__visible __cficanonical > > __visible bool ex_handler_fault(const struct exception_table_entry *fixup, > > Double __visible here, but with that fixed: Ah, thanks for noticing that. I'll fix it in the next version. > > Reviewed-by: Kees Cook <keescook@chromium.org> > > I would note that given Linus's recent comments on attribute locations, > it does seem that __cficanonical is more a function behavior attribute > than a storage class... I'm not really sure: > https://lore.kernel.org/mm-commits/CAHk-=wiOCLRny5aifWNhr621kYrJwhfURsa0vFPeUEm8mF0ufg@mail.gmail.com __visible is not really a storage class either, but I thought it would make sense to keep these attributes together. I can certainly move them if anyone has strong feelings about the location. Sami
On Tue, Sep 14, 2021 at 12:29 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Sep 14, 2021 at 12:10 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > With CONFIG_CFI_CLANG, the compiler replaces function references with
> > references to the CFI jump table, which confuses objtool. This change,
> > based on Josh's initial patch [1], goes through the list of relocations
> > and replaces jump table symbols with the actual function symbols.
> >
> > [1] https://lore.kernel.org/r/d743f4b36e120c06506567a9f87a062ae03da47f.1611263462.git.jpoimboe@redhat.com/
> >
> > Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > ---
> > tools/objtool/arch/x86/decode.c | 16 +++++++++
> > tools/objtool/elf.c | 51 ++++++++++++++++++++++++++++
> > tools/objtool/include/objtool/arch.h | 3 ++
> > tools/objtool/include/objtool/elf.h | 2 +-
> > 4 files changed, 71 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
> > index bc821056aba9..318189c8065e 100644
> > --- a/tools/objtool/arch/x86/decode.c
> > +++ b/tools/objtool/arch/x86/decode.c
> > @@ -62,6 +62,22 @@ bool arch_callee_saved_reg(unsigned char reg)
> > }
> > }
> >
> > +unsigned long arch_cfi_section_reloc_offset(struct reloc *reloc)
> > +{
> > + if (!reloc->addend)
> > + return 0;
> > +
> > + if (reloc->type == R_X86_64_PC32 || reloc->type == R_X86_64_PLT32)
> > + return reloc->addend + 4;
> > +
> > + return reloc->addend;
> > +}
> > +
> > +unsigned long arch_cfi_jump_reloc_offset(unsigned long offset)
> > +{
> > + return offset + 1;
> > +}
> > +
> > unsigned long arch_dest_reloc_offset(int addend)
> > {
> > return addend + 4;
> > diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
> > index 8676c7598728..05a5f51aad2c 100644
> > --- a/tools/objtool/elf.c
> > +++ b/tools/objtool/elf.c
> > @@ -18,6 +18,7 @@
> > #include <errno.h>
> > #include <objtool/builtin.h>
> >
> > +#include <objtool/arch.h>
> > #include <objtool/elf.h>
> > #include <objtool/warn.h>
> >
> > @@ -291,6 +292,10 @@ static int read_sections(struct elf *elf)
> > if (sec->sh.sh_flags & SHF_EXECINSTR)
> > elf->text_size += sec->len;
> >
> > + /* Detect -fsanitize=cfi jump table sections */
> > + if (!strncmp(sec->name, ".text..L.cfi.jumptable", 22))
> > + sec->cfi_jt = true;
> > +
> > list_add_tail(&sec->list, &elf->sections);
> > elf_hash_add(section, &sec->hash, sec->idx);
> > elf_hash_add(section_name, &sec->name_hash, str_hash(sec->name));
> > @@ -576,6 +581,49 @@ static int read_rela_reloc(struct section *sec, int i, struct reloc *reloc, unsi
> > return 0;
> > }
> >
> > +/*
> > + * CONFIG_CFI_CLANG replaces function relocations to refer to an intermediate
> > + * jump table. Undo the conversion so objtool can make sense of things.
> > + */
> > +static int fix_cfi_relocs(const struct elf *elf)
> > +{
> > + struct section *sec;
> > + struct reloc *reloc;
> > +
> > + list_for_each_entry(sec, &elf->sections, list) {
> > + list_for_each_entry(reloc, &sec->reloc_list, list) {
> > + struct reloc *cfi_reloc;
> > + unsigned long offset;
> > +
> > + if (!reloc->sym->sec->cfi_jt)
> > + continue;
> > +
> > + if (reloc->sym->type == STT_SECTION)
> > + offset = arch_cfi_section_reloc_offset(reloc);
> > + else
> > + offset = reloc->sym->offset;
> > +
> > + /*
> > + * The jump table immediately jumps to the actual function,
> > + * so look up the relocation there.
> > + */
> > + offset = arch_cfi_jump_reloc_offset(offset);
>
> Sorry, this comment is curious to me, it looks like we jump to the
> offset+1, not directly to the actual function? Perhaps a comment
> above arch_cfi_jump_reloc_offset() and/or amending this comment might
> make it clearer? Sorry if this is obvious to others? Perhaps comments
> can be cleaned up in a follow up, if this is not a bug?
It looks like my response was sent only to Nick, so to summarize the
brief off-list discussion:
arch_cfi_jump_reloc_offset() returns the offset to a relocation when
given the address of a jmp instruction in the CFI jump table. Here's
an example:
Disassembly of section .text..L.cfi.jumptable:
0000000000000000 <_printk.cfi_jt>:
0: e9 00 00 00 00 jmp 0x5 <_printk.cfi_jt+0x5>
0000000000000001: R_X86_64_PLT32 _printk-0x4
5: cc int3
6: cc int3
7: cc int3
We look at the relocation in the jump table to figure out the actual
target function, in this case, _printk. Alternatively, we could look
at the jump table symbol instead (i.e., _printk.cfi_jt) and use the
name to figure out the target. Unfortunately, LLVM doesn't generate
symbols for all the jump table entries, so we can't rely on them in
objtool.
What comes to the magic offset value, it's one because the first byte
of the jmp instruction is the opcode and the relocation only applies
to the rest of the instruction.
I'll add a comment to arch_cfi_jump_reloc_offset() in v4 to clarify this.
Sami
On Tue, Sep 14, 2021 at 1:30 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > On Tue, Sep 14, 2021 at 1:02 PM Nick Desaulniers > <ndesaulniers@google.com> wrote: > > > > On Tue, Sep 14, 2021 at 12:11 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > > > Disable CONFIG_CFI_CLANG for the stand-alone purgatory.ro. > > > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > > > I kind of prefer the existing convention that has explicit guards on > > specific configs (ie. CONFIG_FUNCTION_TRACER, CONFIG_STACKPROTECTOR, > > CONFIG_STACKPROTECTOR_STRONG, CONFIG_RETPOLINE); it's more obvious > > which configs may introduce which flags that are problematic. This > > patch is ok as is, but it kind of makes this Makefile more > > inconsistent. I would prefer we had the explicit checks. > > The Makefile does already use DISABLE_STACKLEAK_PLUGIN in a similar > way, but I don't have a strong preference here. mmm...DISABLE_STACKLEAK_PLUGIN adds to PURGATORY_CFLAGS. This patch adds to PURGATORY_CFLAGS_REMOVE. > I can move this into > an ifdef if it makes things cleaner. > > > Does CFI actually do any instrumentation in these object files? I > > guess issues in purgatory cause silent/hard to debug kexec failures? > > The compiler shouldn't add any actual CFI instrumentation here right > now, but I would prefer to avoid issues in future. Ok, good to know. -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 03:31:14PM -0700, Nick Desaulniers wrote: > On Tue, Sep 14, 2021 at 1:30 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > On Tue, Sep 14, 2021 at 1:02 PM Nick Desaulniers > > <ndesaulniers@google.com> wrote: > > > > > > On Tue, Sep 14, 2021 at 12:11 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > > > > > Disable CONFIG_CFI_CLANG for the stand-alone purgatory.ro. > > > > > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > > > > > I kind of prefer the existing convention that has explicit guards on > > > specific configs (ie. CONFIG_FUNCTION_TRACER, CONFIG_STACKPROTECTOR, > > > CONFIG_STACKPROTECTOR_STRONG, CONFIG_RETPOLINE); it's more obvious > > > which configs may introduce which flags that are problematic. This > > > patch is ok as is, but it kind of makes this Makefile more > > > inconsistent. I would prefer we had the explicit checks. Can you explain your reasoning a bit more? It seems like redundant open-coded logic to me, but I do see this idiom repeated in the kernel. And/or maybe I've misunderstood you? It seems like it's better to have a single variable (like in the proposed patch: CC_FLAGS_CFI) that has all the details internal -- no tests needed. i.e.: instead of this in many places: ifdef CONFIG_FEATURE PURGATORY_CFLAGS_REMOVE += -feature-flag endif do this once: CC_FEATURE_CFLAGS := -feature-flag ... KBUILD_CFLAGS += $(CC_FEATURE_CFLAGS) and only repeat a single line in for targets: CFLAGS_REMOVE_target.o += $(CC_FEATURE_CFLAGS) > > > > The Makefile does already use DISABLE_STACKLEAK_PLUGIN in a similar > > way, but I don't have a strong preference here. > > mmm...DISABLE_STACKLEAK_PLUGIN adds to PURGATORY_CFLAGS. This patch > adds to PURGATORY_CFLAGS_REMOVE. CFI is "simple" in that regard; its options can just be left off. This isn't true for some weirder stuff. Stack protector is a good one, in that just removing the options may not disable it depending on distro patches (which may turn it on by default), so both target_CFLAGS and target_REMOVE are needed there. (In the case of the plugins, yes, I think they could be rearranged to use the target_REMOVE method, but I have a memory of REMOVE not working there for some weird thing? Hmm.) -- Kees Cook