* [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE @ 2020-04-07 7:31 Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre ` (9 more replies) 0 siblings, 10 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre Hi, This is version v2 of this patchset based on the different comments received so far. It now uses and includes PeterZ patch to add UNWIND_HINT_RET_OFFSET. Other changes are described below. Code like retpoline or RSB stuffing, which is used to mitigate some of the speculative execution issues, is currently ignored by objtool with the ANNOTATE_NOSPEC_ALTERNATIVE directive. This series adds support for intra-function calls to objtool so that it can handle such a code. With these changes, we can remove all ANNOTATE_NOSPEC_ALTERNATIVE directives. Changes: - replace RETPOLINE_RET with PeterZ UNWIND_HINT_RET_OFFSET - make objtool intra-function call action architecture dependent - objtool now automatically detects and validates all intra-function calls but it issues a warning if the call was not explicitly tagged - change __FILL_RETURN_BUFFER to work with objtool - add generic ANNOTATE_INTRA_FUNCTION_CALL macro - remove all ANNOTATE_SPEC_ALTERNATIVE (even for __FILL_RETURN_BUFFER) Thanks, alex. ----- Alexandre Chartre (8): objtool: UNWIND_HINT_RET_OFFSET should not check registers objtool: is_fentry_call() crashes if call has no destination objtool: Allow branches within the same alternative. objtool: Add support for intra-function calls x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool x86/speculation: Annotate intra-function calls x86/speculation: Add unwind hint to trampoline return x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Peter Zijlstra (Intel) (1): objtool: Introduce HINT_RET_OFFSET arch/x86/include/asm/nospec-branch.h | 32 ++-- arch/x86/include/asm/orc_types.h | 1 + arch/x86/include/asm/unwind_hints.h | 10 ++ include/linux/frame.h | 11 ++ tools/arch/x86/include/asm/orc_types.h | 1 + .../Documentation/stack-validation.txt | 8 + tools/objtool/arch.h | 2 + tools/objtool/arch/x86/decode.c | 12 ++ tools/objtool/check.c | 152 ++++++++++++++---- tools/objtool/check.h | 6 +- 10 files changed, 192 insertions(+), 43 deletions(-) -- 2.18.2 ^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-04-07 12:53 ` Peter Zijlstra 2020-04-07 7:31 ` [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers Alexandre Chartre ` (8 subsequent siblings) 9 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre From: "Peter Zijlstra (Intel)" <peterz@infradead.org> Normally objtool ensures a function keeps the stack layout invariant. But there is a useful exception, it is possible to stuff the return stack in order to 'inject' a 'call': push $fun ret In this case the invariant mentioned above is violated. Add an objtool HINT to annotate this and allow a function exit with a modified stack frame. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- arch/x86/include/asm/orc_types.h | 1 + arch/x86/include/asm/unwind_hints.h | 10 ++++++++++ tools/arch/x86/include/asm/orc_types.h | 1 + tools/objtool/check.c | 26 ++++++++++++++++++-------- tools/objtool/check.h | 5 ++++- 5 files changed, 34 insertions(+), 9 deletions(-) diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h index 6e060907c163..5f18ca7ac51a 100644 --- a/arch/x86/include/asm/orc_types.h +++ b/arch/x86/include/asm/orc_types.h @@ -60,6 +60,7 @@ #define ORC_TYPE_REGS_IRET 2 #define UNWIND_HINT_TYPE_SAVE 3 #define UNWIND_HINT_TYPE_RESTORE 4 +#define UNWIND_HINT_TYPE_RET_OFFSET 5 #ifndef __ASSEMBLY__ /* diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h index f5e2eb12cb71..aabf7ace0476 100644 --- a/arch/x86/include/asm/unwind_hints.h +++ b/arch/x86/include/asm/unwind_hints.h @@ -94,6 +94,16 @@ UNWIND_HINT type=UNWIND_HINT_TYPE_RESTORE .endm + +/* + * RET_OFFSET: Used on instructions that terminate a function; mostly RETURN + * and sibling calls. On these, sp_offset denotes the expected offset from + * initial_func_cfi. + */ +.macro UNWIND_HINT_RET_OFFSET sp_offset=8 + UNWIND_HINT type=UNWIND_HINT_TYPE_RET_OFFSET sp_offset=\sp_offset +.endm + #else /* !__ASSEMBLY__ */ #define UNWIND_HINT(sp_reg, sp_offset, type, end) \ diff --git a/tools/arch/x86/include/asm/orc_types.h b/tools/arch/x86/include/asm/orc_types.h index 6e060907c163..5f18ca7ac51a 100644 --- a/tools/arch/x86/include/asm/orc_types.h +++ b/tools/arch/x86/include/asm/orc_types.h @@ -60,6 +60,7 @@ #define ORC_TYPE_REGS_IRET 2 #define UNWIND_HINT_TYPE_SAVE 3 #define UNWIND_HINT_TYPE_RESTORE 4 +#define UNWIND_HINT_TYPE_RET_OFFSET 5 #ifndef __ASSEMBLY__ /* diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 4768d91c6d68..bbee26de92ec 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1209,6 +1209,10 @@ static int read_unwind_hints(struct objtool_file *file) insn->restore = true; insn->hint = true; continue; + + } else if (hint->type == UNWIND_HINT_TYPE_RET_OFFSET) { + insn->ret_offset = hint->sp_offset; + continue; } insn->hint = true; @@ -1371,20 +1375,26 @@ static bool is_fentry_call(struct instruction *insn) return false; } -static bool has_modified_stack_frame(struct insn_state *state) +static bool has_modified_stack_frame(struct instruction *insn, + struct insn_state *state) { + u8 ret_offset = insn->ret_offset; int i; - if (state->cfa.base != initial_func_cfi.cfa.base || - state->cfa.offset != initial_func_cfi.cfa.offset || - state->stack_size != initial_func_cfi.cfa.offset || - state->drap) + if (state->cfa.base != initial_func_cfi.cfa.base || state->drap) + return true; + + if (state->cfa.offset != initial_func_cfi.cfa.offset + ret_offset) return true; - for (i = 0; i < CFI_NUM_REGS; i++) + if (state->stack_size != initial_func_cfi.cfa.offset + ret_offset) + return true; + + for (i = 0; i < CFI_NUM_REGS; i++) { if (state->regs[i].base != initial_func_cfi.regs[i].base || state->regs[i].offset != initial_func_cfi.regs[i].offset) return true; + } return false; } @@ -1926,7 +1936,7 @@ static int validate_call(struct instruction *insn, struct insn_state *state) static int validate_sibling_call(struct instruction *insn, struct insn_state *state) { - if (has_modified_stack_frame(state)) { + if (has_modified_stack_frame(insn, state)) { WARN_FUNC("sibling call from callable instruction with modified stack frame", insn->sec, insn->offset); return 1; @@ -2065,7 +2075,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func, return 1; } - if (func && has_modified_stack_frame(&state)) { + if (func && has_modified_stack_frame(insn, &state)) { WARN_FUNC("return with modified stack frame", sec, insn->offset); return 1; diff --git a/tools/objtool/check.h b/tools/objtool/check.h index 6d875ca6fce0..7a91497fee7e 100644 --- a/tools/objtool/check.h +++ b/tools/objtool/check.h @@ -33,9 +33,12 @@ struct instruction { unsigned int len; enum insn_type type; unsigned long immediate; - bool alt_group, dead_end, ignore, hint, save, restore, ignore_alts; + unsigned int alt_group; + bool dead_end, ignore, ignore_alts; + bool hint, save, restore; bool retpoline_safe; u8 visited; + u8 ret_offset; struct symbol *call_dest; struct instruction *jump_dest; struct instruction *first_jump_src; -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET 2020-04-07 7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre @ 2020-04-07 12:53 ` Peter Zijlstra 2020-04-07 13:17 ` Alexandre Chartre 0 siblings, 1 reply; 40+ messages in thread From: Peter Zijlstra @ 2020-04-07 12:53 UTC (permalink / raw) To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx On Tue, Apr 07, 2020 at 09:31:34AM +0200, Alexandre Chartre wrote: > diff --git a/tools/objtool/check.h b/tools/objtool/check.h > index 6d875ca6fce0..7a91497fee7e 100644 > --- a/tools/objtool/check.h > +++ b/tools/objtool/check.h > @@ -33,9 +33,12 @@ struct instruction { > unsigned int len; > enum insn_type type; > unsigned long immediate; > - bool alt_group, dead_end, ignore, hint, save, restore, ignore_alts; > + unsigned int alt_group; ^^ that wants to be in a later patch I'm thinking > + bool dead_end, ignore, ignore_alts; > + bool hint, save, restore; > bool retpoline_safe; > u8 visited; > + u8 ret_offset; > struct symbol *call_dest; > struct instruction *jump_dest; > struct instruction *first_jump_src; > -- > 2.18.2 > ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET 2020-04-07 12:53 ` Peter Zijlstra @ 2020-04-07 13:17 ` Alexandre Chartre 0 siblings, 0 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 13:17 UTC (permalink / raw) To: Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx On 4/7/20 2:53 PM, Peter Zijlstra wrote: > On Tue, Apr 07, 2020 at 09:31:34AM +0200, Alexandre Chartre wrote: >> diff --git a/tools/objtool/check.h b/tools/objtool/check.h >> index 6d875ca6fce0..7a91497fee7e 100644 >> --- a/tools/objtool/check.h >> +++ b/tools/objtool/check.h >> @@ -33,9 +33,12 @@ struct instruction { >> unsigned int len; >> enum insn_type type; >> unsigned long immediate; >> - bool alt_group, dead_end, ignore, hint, save, restore, ignore_alts; >> + unsigned int alt_group; > > ^^ that wants to be in a later patch I'm thinking Correct, rebase error, should be in patch 4. Thanks, alex. >> + bool dead_end, ignore, ignore_alts; >> + bool hint, save, restore; >> bool retpoline_safe; >> u8 visited; >> + u8 ret_offset; >> struct symbol *call_dest; >> struct instruction *jump_dest; >> struct instruction *first_jump_src; >> -- >> 2.18.2 >> ^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-05-01 18:22 ` [tip: objtool/core] " tip-bot2 for Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 3/9] objtool: is_fentry_call() crashes if call has no destination Alexandre Chartre ` (7 subsequent siblings) 9 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre UNWIND_HINT_RET_OFFSET will adjust a modified stack. However if a callee-saved register was pushed on the stack then the stack frame will still appear modified. So stop checking registers when UNWIND_HINT_RET_OFFSET is used. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> --- tools/objtool/check.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/objtool/check.c b/tools/objtool/check.c index bbee26de92ec..c7fcaddfaa8a 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1390,6 +1390,14 @@ static bool has_modified_stack_frame(struct instruction *insn, if (state->stack_size != initial_func_cfi.cfa.offset + ret_offset) return true; + /* + * If there is a ret offset hint then don't check registers + * because a callee-saved register might have been pushed on + * the stack. + */ + if (ret_offset) + return false; + for (i = 0; i < CFI_NUM_REGS; i++) { if (state->regs[i].base != initial_func_cfi.regs[i].base || state->regs[i].offset != initial_func_cfi.regs[i].offset) -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* [tip: objtool/core] objtool: UNWIND_HINT_RET_OFFSET should not check registers 2020-04-07 7:31 ` [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers Alexandre Chartre @ 2020-05-01 18:22 ` tip-bot2 for Alexandre Chartre 0 siblings, 0 replies; 40+ messages in thread From: tip-bot2 for Alexandre Chartre @ 2020-05-01 18:22 UTC (permalink / raw) To: linux-tip-commits Cc: Alexandre Chartre, Peter Zijlstra (Intel), Miroslav Benes, Josh Poimboeuf, x86, LKML The following commit has been merged into the objtool/core branch of tip: Commit-ID: c721b3f80faebc7891211fa82de303eebadfed15 Gitweb: https://git.kernel.org/tip/c721b3f80faebc7891211fa82de303eebadfed15 Author: Alexandre Chartre <alexandre.chartre@oracle.com> AuthorDate: Tue, 07 Apr 2020 09:31:35 +02:00 Committer: Peter Zijlstra <peterz@infradead.org> CommitterDate: Thu, 30 Apr 2020 20:14:32 +02:00 objtool: UNWIND_HINT_RET_OFFSET should not check registers UNWIND_HINT_RET_OFFSET will adjust a modified stack. However if a callee-saved register was pushed on the stack then the stack frame will still appear modified. So stop checking registers when UNWIND_HINT_RET_OFFSET is used. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200407073142.20659-3-alexandre.chartre@oracle.com --- tools/objtool/check.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 8af8de2..068897d 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1507,6 +1507,14 @@ static bool has_modified_stack_frame(struct instruction *insn, struct insn_state if (cfi->stack_size != initial_func_cfi.cfa.offset + ret_offset) return true; + /* + * If there is a ret offset hint then don't check registers + * because a callee-saved register might have been pushed on + * the stack. + */ + if (ret_offset) + return false; + for (i = 0; i < CFI_NUM_REGS; i++) { if (cfi->regs[i].base != initial_func_cfi.regs[i].base || cfi->regs[i].offset != initial_func_cfi.regs[i].offset) ^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH V2 3/9] objtool: is_fentry_call() crashes if call has no destination 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 4/9] objtool: Allow branches within the same alternative Alexandre Chartre ` (6 subsequent siblings) 9 siblings, 0 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre Fix is_fentry_call() so that it works if a call has no destination set (call_dest). This needs to be done in order to support intra- function calls. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> --- tools/objtool/check.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/objtool/check.c b/tools/objtool/check.c index c7fcaddfaa8a..c0da033a40c5 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1367,7 +1367,7 @@ static int decode_sections(struct objtool_file *file) static bool is_fentry_call(struct instruction *insn) { - if (insn->type == INSN_CALL && + if (insn->type == INSN_CALL && insn->call_dest && insn->call_dest->type == STT_NOTYPE && !strcmp(insn->call_dest->name, "__fentry__")) return true; -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH V2 4/9] objtool: Allow branches within the same alternative. 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre ` (2 preceding siblings ...) 2020-04-07 7:31 ` [PATCH V2 3/9] objtool: is_fentry_call() crashes if call has no destination Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 5/9] objtool: Add support for intra-function calls Alexandre Chartre ` (5 subsequent siblings) 9 siblings, 0 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre Currently objtool prevents any branch to an alternative. While preventing branching from the outside to the middle of an alternative makes perfect sense, branching within the same alternative should be allowed. To do so, identify each alternative and check that a branch to an alternative comes from the same alternative. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> --- tools/objtool/check.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/tools/objtool/check.c b/tools/objtool/check.c index c0da033a40c5..ab06e9bdd396 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -712,7 +712,9 @@ static int handle_group_alt(struct objtool_file *file, struct instruction *orig_insn, struct instruction **new_insn) { + static unsigned int alt_group_next_index = 1; struct instruction *last_orig_insn, *last_new_insn, *insn, *fake_jump = NULL; + unsigned int alt_group = alt_group_next_index++; unsigned long dest_off; last_orig_insn = NULL; @@ -721,7 +723,7 @@ static int handle_group_alt(struct objtool_file *file, if (insn->offset >= special_alt->orig_off + special_alt->orig_len) break; - insn->alt_group = true; + insn->alt_group = alt_group; last_orig_insn = insn; } @@ -1960,6 +1962,7 @@ static int validate_sibling_call(struct instruction *insn, struct insn_state *st * tools/objtool/Documentation/stack-validation.txt. */ static int validate_branch(struct objtool_file *file, struct symbol *func, + struct instruction *from, struct instruction *first, struct insn_state state) { struct alternative *alt; @@ -1971,7 +1974,9 @@ static int validate_branch(struct objtool_file *file, struct symbol *func, insn = first; sec = insn->sec; - if (insn->alt_group && list_empty(&insn->alts)) { + if (insn->alt_group && + (!from || from->alt_group != insn->alt_group) && + list_empty(&insn->alts)) { WARN_FUNC("don't know how to handle branch to middle of alternative instruction group", sec, insn->offset); return 1; @@ -2053,7 +2058,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func, if (alt->skip_orig) skip_orig = true; - ret = validate_branch(file, func, alt->insn, state); + ret = validate_branch(file, func, + NULL, alt->insn, state); if (ret) { if (backtrace) BT_FUNC("(alt)", insn); @@ -2123,7 +2129,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func, return ret; } else if (insn->jump_dest) { - ret = validate_branch(file, func, + ret = validate_branch(file, func, insn, insn->jump_dest, state); if (ret) { if (backtrace) @@ -2254,7 +2260,8 @@ static int validate_unwind_hints(struct objtool_file *file) for_each_insn(file, insn) { if (insn->hint && !insn->visited) { - ret = validate_branch(file, insn->func, insn, state); + ret = validate_branch(file, insn->func, + NULL, insn, state); if (ret && backtrace) BT_FUNC("<=== (hint)", insn); warnings += ret; @@ -2395,7 +2402,7 @@ static int validate_functions(struct objtool_file *file) state.uaccess = func->uaccess_safe; - ret = validate_branch(file, func, insn, state); + ret = validate_branch(file, func, NULL, insn, state); if (ret && backtrace) BT_FUNC("<=== (func)", insn); warnings += ret; -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre ` (3 preceding siblings ...) 2020-04-07 7:31 ` [PATCH V2 4/9] objtool: Allow branches within the same alternative Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-04-07 13:07 ` Peter Zijlstra 2020-04-07 7:31 ` [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool Alexandre Chartre ` (4 subsequent siblings) 9 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre Change objtool to support intra-function calls. On x86, an intra-function call is represented in objtool as a push onto the stack (of the return address), and a jump to the destination address. That way the stack information is correctly updated and the call flow is still accurate. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> --- include/linux/frame.h | 11 +++ .../Documentation/stack-validation.txt | 8 ++ tools/objtool/arch.h | 2 + tools/objtool/arch/x86/decode.c | 12 +++ tools/objtool/check.c | 97 ++++++++++++++++--- tools/objtool/check.h | 1 + 6 files changed, 117 insertions(+), 14 deletions(-) diff --git a/include/linux/frame.h b/include/linux/frame.h index 02d3ca2d9598..c7178e6c9d48 100644 --- a/include/linux/frame.h +++ b/include/linux/frame.h @@ -15,9 +15,20 @@ static void __used __section(.discard.func_stack_frame_non_standard) \ *__func_stack_frame_non_standard_##func = func +/* + * This macro indicates that the following intra-function call is valid. + * Any non-annotated intra-function call will cause objtool to issue warning. + */ +#define ANNOTATE_INTRA_FUNCTION_CALL \ + 999: \ + .pushsection .discard.intra_function_call; \ + .long 999b; \ + .popsection; + #else /* !CONFIG_STACK_VALIDATION */ #define STACK_FRAME_NON_STANDARD(func) +#define ANNOTATE_INTRA_FUNCTION_CALL #endif /* CONFIG_STACK_VALIDATION */ diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt index de094670050b..09f863fd32d2 100644 --- a/tools/objtool/Documentation/stack-validation.txt +++ b/tools/objtool/Documentation/stack-validation.txt @@ -290,6 +290,14 @@ they mean, and suggestions for how to fix them. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646 +9. file.o: warning: unsupported intra-function call + + This warning means that a direct call is done to a destination which + is not at the beginning of a function. If this is a legit call, you + can remove this warning by putting the ANNOTATE_INTRA_FUNCTION_CALL + directive right before the call. + + If the error doesn't seem to make sense, it could be a bug in objtool. Feel free to ask the objtool maintainer for help. diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index ced3765c4f44..4d42c12d9957 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -75,4 +75,6 @@ int arch_decode_instruction(struct elf *elf, struct section *sec, bool arch_callee_saved_reg(unsigned char reg); +void arch_configure_intra_function_call(struct stack_op *op); + #endif /* _ARCH_H */ diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index a62e032863a8..7ee1561bf7ad 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state) state->regs[16].base = CFI_CFA; state->regs[16].offset = -8; } + + +void arch_configure_intra_function_call(struct stack_op *op) +{ + /* + * For the impact on the stack, make an intra-function + * call behaves like a push of an immediate value (the + * return address). + */ + op->src.type = OP_SRC_CONST; + op->dest.type = OP_DEST_PUSH; +} diff --git a/tools/objtool/check.c b/tools/objtool/check.c index ab06e9bdd396..bf1e4e94d686 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -644,34 +644,50 @@ static int add_jump_destinations(struct objtool_file *file) return 0; } +static int configure_call(struct objtool_file *file, struct instruction *insn) +{ + unsigned long dest_off; + + dest_off = insn->offset + insn->len + insn->immediate; + insn->call_dest = find_symbol_by_offset(insn->sec, dest_off); + if (insn->call_dest) { + /* regular call */ + return 0; + } + + /* intra-function call */ + if (!insn->intra_function_call) + WARN_FUNC("intra-function call", insn->sec, insn->offset); + + dest_off = insn->offset + insn->len + insn->immediate; + insn->jump_dest = find_insn(file, insn->sec, dest_off); + if (!insn->jump_dest) { + WARN_FUNC("can't find call dest at %s+0x%lx", + insn->sec, insn->offset, + insn->sec->name, dest_off); + return -1; + } + + return 0; +} + /* * Find the destination instructions for all calls. */ static int add_call_destinations(struct objtool_file *file) { struct instruction *insn; - unsigned long dest_off; struct rela *rela; for_each_insn(file, insn) { - if (insn->type != INSN_CALL) + if (insn->type != INSN_CALL || insn->ignore) continue; rela = find_rela_by_dest_range(insn->sec, insn->offset, insn->len); if (!rela) { - dest_off = insn->offset + insn->len + insn->immediate; - insn->call_dest = find_symbol_by_offset(insn->sec, - dest_off); - - if (!insn->call_dest && !insn->ignore) { - WARN_FUNC("unsupported intra-function call", - insn->sec, insn->offset); - if (retpoline) - WARN("If this is a retpoline, please patch it in with alternatives and annotate it with ANNOTATE_NOSPEC_ALTERNATIVE."); + if (configure_call(file, insn)) return -1; - } - } else if (rela->sym->type == STT_SECTION) { insn->call_dest = find_symbol_by_offset(rela->sym->sec, rela->addend+4); @@ -1293,6 +1309,43 @@ static int read_retpoline_hints(struct objtool_file *file) return 0; } +static int read_intra_function_call(struct objtool_file *file) +{ + struct section *sec; + struct instruction *insn; + struct rela *rela; + + sec = find_section_by_name(file->elf, + ".rela.discard.intra_function_call"); + if (!sec) + return 0; + + list_for_each_entry(rela, &sec->rela_list, list) { + if (rela->sym->type != STT_SECTION) { + WARN("unexpected relocation symbol type in %s", + sec->name); + return -1; + } + + insn = find_insn(file, rela->sym->sec, rela->addend); + if (!insn) { + WARN("bad .discard.intra_function_call entry"); + return -1; + } + + if (insn->type != INSN_CALL) { + WARN_FUNC("intra_function_call not a direct call", + insn->sec, insn->offset); + return -1; + } + + arch_configure_intra_function_call(&insn->stack_op); + insn->intra_function_call = true; + } + + return 0; +} + static void mark_rodata(struct objtool_file *file) { struct section *sec; @@ -1348,6 +1401,10 @@ static int decode_sections(struct objtool_file *file) if (ret) return ret; + ret = read_intra_function_call(file); + if (ret) + return ret; + ret = add_call_destinations(file); if (ret) return ret; @@ -2110,7 +2167,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func, return ret; if (!no_fp && func && !is_fentry_call(insn) && - !has_valid_stack_frame(&state)) { + !has_valid_stack_frame(&state) && + !insn->intra_function_call) { WARN_FUNC("call without frame pointer save/setup", sec, insn->offset); return 1; @@ -2119,6 +2177,17 @@ static int validate_branch(struct objtool_file *file, struct symbol *func, if (dead_end_function(file, insn->call_dest)) return 0; + if (insn->intra_function_call) { + update_insn_state(insn, &state); + ret = validate_branch(file, func, insn, + insn->jump_dest, state); + if (ret) { + if (backtrace) + BT_FUNC("(intra-call)", insn); + return ret; + } + } + break; case INSN_JUMP_CONDITIONAL: diff --git a/tools/objtool/check.h b/tools/objtool/check.h index 7a91497fee7e..7c7ec3d8fb00 100644 --- a/tools/objtool/check.h +++ b/tools/objtool/check.h @@ -36,6 +36,7 @@ struct instruction { unsigned int alt_group; bool dead_end, ignore, ignore_alts; bool hint, save, restore; + bool intra_function_call; bool retpoline_safe; u8 visited; u8 ret_offset; -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-07 7:31 ` [PATCH V2 5/9] objtool: Add support for intra-function calls Alexandre Chartre @ 2020-04-07 13:07 ` Peter Zijlstra 2020-04-07 13:28 ` Alexandre Chartre 0 siblings, 1 reply; 40+ messages in thread From: Peter Zijlstra @ 2020-04-07 13:07 UTC (permalink / raw) To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote: > index a62e032863a8..7ee1561bf7ad 100644 > --- a/tools/objtool/arch/x86/decode.c > +++ b/tools/objtool/arch/x86/decode.c > @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state) > state->regs[16].base = CFI_CFA; > state->regs[16].offset = -8; > } > + > + > +void arch_configure_intra_function_call(struct stack_op *op) > +{ > + /* > + * For the impact on the stack, make an intra-function > + * call behaves like a push of an immediate value (the > + * return address). > + */ > + op->src.type = OP_SRC_CONST; > + op->dest.type = OP_DEST_PUSH; > +} An alternative is to always set up stack ops for CALL/RET on decode, but conditionally run update_insn_state() for them. Not sure that makes more logical sense, but the patch would be simpler I think. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-07 13:07 ` Peter Zijlstra @ 2020-04-07 13:28 ` Alexandre Chartre 2020-04-08 14:06 ` Alexandre Chartre 0 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 13:28 UTC (permalink / raw) To: Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx On 4/7/20 3:07 PM, Peter Zijlstra wrote: > On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote: > >> index a62e032863a8..7ee1561bf7ad 100644 >> --- a/tools/objtool/arch/x86/decode.c >> +++ b/tools/objtool/arch/x86/decode.c >> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state) >> state->regs[16].base = CFI_CFA; >> state->regs[16].offset = -8; >> } >> + >> + >> +void arch_configure_intra_function_call(struct stack_op *op) >> +{ >> + /* >> + * For the impact on the stack, make an intra-function >> + * call behaves like a push of an immediate value (the >> + * return address). >> + */ >> + op->src.type = OP_SRC_CONST; >> + op->dest.type = OP_DEST_PUSH; >> +} > > An alternative is to always set up stack ops for CALL/RET on decode, but > conditionally run update_insn_state() for them. > > Not sure that makes more logical sense, but the patch would be simpler I > think. Right, this would avoid adding a new arch dependent function and the patch will be simpler. This probably makes sense as the stack impact is the same for all calls (but objtool will use it only for intra-function calls). Thanks, alex. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-07 13:28 ` Alexandre Chartre @ 2020-04-08 14:06 ` Alexandre Chartre 2020-04-08 14:19 ` Julien Thierry 0 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-08 14:06 UTC (permalink / raw) To: Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx On 4/7/20 3:28 PM, Alexandre Chartre wrote: > > On 4/7/20 3:07 PM, Peter Zijlstra wrote: >> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote: >> >>> index a62e032863a8..7ee1561bf7ad 100644 >>> --- a/tools/objtool/arch/x86/decode.c >>> +++ b/tools/objtool/arch/x86/decode.c >>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state) >>> state->regs[16].base = CFI_CFA; >>> state->regs[16].offset = -8; >>> } >>> + >>> + >>> +void arch_configure_intra_function_call(struct stack_op *op) >>> +{ >>> + /* >>> + * For the impact on the stack, make an intra-function >>> + * call behaves like a push of an immediate value (the >>> + * return address). >>> + */ >>> + op->src.type = OP_SRC_CONST; >>> + op->dest.type = OP_DEST_PUSH; >>> +} >> >> An alternative is to always set up stack ops for CALL/RET on decode, but >> conditionally run update_insn_state() for them. >> >> Not sure that makes more logical sense, but the patch would be simpler I >> think. > > Right, this would avoid adding a new arch dependent function and the patch > will be simpler. This probably makes sense as the stack impact is the same > for all calls (but objtool will use it only for intra-function calls). > Actually the processing of the ret instruction is more complicated than I anticipated with intra-function calls, and so my implementation is not complete at the moment. The issue is to correctly handle how the ret is going to behave depending how the stack (or register on arm) is modified before the ret. Adjusting the stack offset makes the stack state correct, but objtool still needs to correctly figure out where the ret is going to return and where the code flow continues. alex. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-08 14:06 ` Alexandre Chartre @ 2020-04-08 14:19 ` Julien Thierry 2020-04-08 16:03 ` Alexandre Chartre 0 siblings, 1 reply; 40+ messages in thread From: Julien Thierry @ 2020-04-08 14:19 UTC (permalink / raw) To: Alexandre Chartre, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx On 4/8/20 3:06 PM, Alexandre Chartre wrote: > > > On 4/7/20 3:28 PM, Alexandre Chartre wrote: >> >> On 4/7/20 3:07 PM, Peter Zijlstra wrote: >>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote: >>> >>>> index a62e032863a8..7ee1561bf7ad 100644 >>>> --- a/tools/objtool/arch/x86/decode.c >>>> +++ b/tools/objtool/arch/x86/decode.c >>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct >>>> cfi_state *state) >>>> state->regs[16].base = CFI_CFA; >>>> state->regs[16].offset = -8; >>>> } >>>> + >>>> + >>>> +void arch_configure_intra_function_call(struct stack_op *op) >>>> +{ >>>> + /* >>>> + * For the impact on the stack, make an intra-function >>>> + * call behaves like a push of an immediate value (the >>>> + * return address). >>>> + */ >>>> + op->src.type = OP_SRC_CONST; >>>> + op->dest.type = OP_DEST_PUSH; >>>> +} >>> >>> An alternative is to always set up stack ops for CALL/RET on decode, but >>> conditionally run update_insn_state() for them. >>> >>> Not sure that makes more logical sense, but the patch would be simpler I >>> think. >> >> Right, this would avoid adding a new arch dependent function and the >> patch >> will be simpler. This probably makes sense as the stack impact is the >> same >> for all calls (but objtool will use it only for intra-function calls). >> > > Actually the processing of the ret instruction is more complicated than I > anticipated with intra-function calls, and so my implementation is not > complete at the moment. > > The issue is to correctly handle how the ret is going to behave > depending how > the stack (or register on arm) is modified before the ret. Adjusting the > stack > offset makes the stack state correct, but objtool still needs to correctly > figure out where the ret is going to return and where the code flow > continues. > A hint indicating the target "jump" address could be useful. It could be used to add the information on some call/jump dynamic that aren't associated with jump tables. Currently when objtool finds a jump dynamic, if no branches were added to it, it will just return. Having such a hint could help make additional links (at least on arm64). I don't know what Peter and Josh would think of that (if that helps in your case of course). -- Julien Thierry ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-08 14:19 ` Julien Thierry @ 2020-04-08 16:03 ` Alexandre Chartre 2020-04-08 16:04 ` Julien Thierry 0 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-08 16:03 UTC (permalink / raw) To: Julien Thierry, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx On 4/8/20 4:19 PM, Julien Thierry wrote: > > > On 4/8/20 3:06 PM, Alexandre Chartre wrote: >> >> >> On 4/7/20 3:28 PM, Alexandre Chartre wrote: >>> >>> On 4/7/20 3:07 PM, Peter Zijlstra wrote: >>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote: >>>> >>>>> index a62e032863a8..7ee1561bf7ad 100644 >>>>> --- a/tools/objtool/arch/x86/decode.c >>>>> +++ b/tools/objtool/arch/x86/decode.c >>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state) >>>>> state->regs[16].base = CFI_CFA; >>>>> state->regs[16].offset = -8; >>>>> } >>>>> + >>>>> + >>>>> +void arch_configure_intra_function_call(struct stack_op *op) >>>>> +{ >>>>> + /* >>>>> + * For the impact on the stack, make an intra-function >>>>> + * call behaves like a push of an immediate value (the >>>>> + * return address). >>>>> + */ >>>>> + op->src.type = OP_SRC_CONST; >>>>> + op->dest.type = OP_DEST_PUSH; >>>>> +} >>>> >>>> An alternative is to always set up stack ops for CALL/RET on decode, but >>>> conditionally run update_insn_state() for them. >>>> >>>> Not sure that makes more logical sense, but the patch would be simpler I >>>> think. >>> >>> Right, this would avoid adding a new arch dependent function and the patch >>> will be simpler. This probably makes sense as the stack impact is the same >>> for all calls (but objtool will use it only for intra-function calls). >>> >> >> Actually the processing of the ret instruction is more complicated than I >> anticipated with intra-function calls, and so my implementation is not >> complete at the moment. >> >> The issue is to correctly handle how the ret is going to behave depending how >> the stack (or register on arm) is modified before the ret. Adjusting the stack >> offset makes the stack state correct, but objtool still needs to correctly >> figure out where the ret is going to return and where the code flow continues. >> > > A hint indicating the target "jump" address could be useful. It could > be used to add the information on some call/jump dynamic that aren't > associated with jump tables. Currently when objtool finds a jump > dynamic, if no branches were added to it, it will just return. > > Having such a hint could help make additional links (at least on > arm64). I don't know what Peter and Josh would think of that (if that > helps in your case of course). > Yes, I am thinking about tracking intra-function call return address, and having hints to specify a return address changes. For example, on x86, when we push the branch address on the stack we overwrite the last return address (the return address of the last intra-function call). Then the return instruction can figure out where to branch. alex. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-08 16:03 ` Alexandre Chartre @ 2020-04-08 16:04 ` Julien Thierry 2020-04-08 17:06 ` Alexandre Chartre 0 siblings, 1 reply; 40+ messages in thread From: Julien Thierry @ 2020-04-08 16:04 UTC (permalink / raw) To: Alexandre Chartre, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx On 4/8/20 5:03 PM, Alexandre Chartre wrote: > > > On 4/8/20 4:19 PM, Julien Thierry wrote: >> >> >> On 4/8/20 3:06 PM, Alexandre Chartre wrote: >>> >>> >>> On 4/7/20 3:28 PM, Alexandre Chartre wrote: >>>> >>>> On 4/7/20 3:07 PM, Peter Zijlstra wrote: >>>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote: >>>>> >>>>>> index a62e032863a8..7ee1561bf7ad 100644 >>>>>> --- a/tools/objtool/arch/x86/decode.c >>>>>> +++ b/tools/objtool/arch/x86/decode.c >>>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct >>>>>> cfi_state *state) >>>>>> state->regs[16].base = CFI_CFA; >>>>>> state->regs[16].offset = -8; >>>>>> } >>>>>> + >>>>>> + >>>>>> +void arch_configure_intra_function_call(struct stack_op *op) >>>>>> +{ >>>>>> + /* >>>>>> + * For the impact on the stack, make an intra-function >>>>>> + * call behaves like a push of an immediate value (the >>>>>> + * return address). >>>>>> + */ >>>>>> + op->src.type = OP_SRC_CONST; >>>>>> + op->dest.type = OP_DEST_PUSH; >>>>>> +} >>>>> >>>>> An alternative is to always set up stack ops for CALL/RET on >>>>> decode, but >>>>> conditionally run update_insn_state() for them. >>>>> >>>>> Not sure that makes more logical sense, but the patch would be >>>>> simpler I >>>>> think. >>>> >>>> Right, this would avoid adding a new arch dependent function and the >>>> patch >>>> will be simpler. This probably makes sense as the stack impact is >>>> the same >>>> for all calls (but objtool will use it only for intra-function calls). >>>> >>> >>> Actually the processing of the ret instruction is more complicated >>> than I >>> anticipated with intra-function calls, and so my implementation is not >>> complete at the moment. >>> >>> The issue is to correctly handle how the ret is going to behave >>> depending how >>> the stack (or register on arm) is modified before the ret. Adjusting >>> the stack >>> offset makes the stack state correct, but objtool still needs to >>> correctly >>> figure out where the ret is going to return and where the code flow >>> continues. >>> >> >> A hint indicating the target "jump" address could be useful. It could >> be used to add the information on some call/jump dynamic that aren't >> associated with jump tables. Currently when objtool finds a jump >> dynamic, if no branches were added to it, it will just return. >> >> Having such a hint could help make additional links (at least on >> arm64). I don't know what Peter and Josh would think of that (if that >> helps in your case of course). >> > > Yes, I am thinking about tracking intra-function call return address, > and having hints to specify a return address changes. For example, > on x86, when we push the branch address on the stack we overwrite the > last return address (the return address of the last intra-function call). > Then the return instruction can figure out where to branch. I see, I was thinking about a more generic hint, that would just indicate "this instruction actually jumps here". So in your case it would just point that a certain return instruction causes to branch somewhere. This way the hint could also be used for other instructions (e.g. INSN_JUMP_DYNAMIC). -- Julien Thierry ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-08 16:04 ` Julien Thierry @ 2020-04-08 17:06 ` Alexandre Chartre 2020-04-08 17:07 ` Julien Thierry 0 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-08 17:06 UTC (permalink / raw) To: Julien Thierry, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx On 4/8/20 6:04 PM, Julien Thierry wrote: > > > On 4/8/20 5:03 PM, Alexandre Chartre wrote: >> >> >> On 4/8/20 4:19 PM, Julien Thierry wrote: >>> >>> >>> On 4/8/20 3:06 PM, Alexandre Chartre wrote: >>>> >>>> >>>> On 4/7/20 3:28 PM, Alexandre Chartre wrote: >>>>> >>>>> On 4/7/20 3:07 PM, Peter Zijlstra wrote: >>>>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote: >>>>>> >>>>>>> index a62e032863a8..7ee1561bf7ad 100644 >>>>>>> --- a/tools/objtool/arch/x86/decode.c >>>>>>> +++ b/tools/objtool/arch/x86/decode.c >>>>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state) >>>>>>> state->regs[16].base = CFI_CFA; >>>>>>> state->regs[16].offset = -8; >>>>>>> } >>>>>>> + >>>>>>> + >>>>>>> +void arch_configure_intra_function_call(struct stack_op *op) >>>>>>> +{ >>>>>>> + /* >>>>>>> + * For the impact on the stack, make an intra-function >>>>>>> + * call behaves like a push of an immediate value (the >>>>>>> + * return address). >>>>>>> + */ >>>>>>> + op->src.type = OP_SRC_CONST; >>>>>>> + op->dest.type = OP_DEST_PUSH; >>>>>>> +} >>>>>> >>>>>> An alternative is to always set up stack ops for CALL/RET on decode, but >>>>>> conditionally run update_insn_state() for them. >>>>>> >>>>>> Not sure that makes more logical sense, but the patch would be simpler I >>>>>> think. >>>>> >>>>> Right, this would avoid adding a new arch dependent function and the patch >>>>> will be simpler. This probably makes sense as the stack impact is the same >>>>> for all calls (but objtool will use it only for intra-function calls). >>>>> >>>> >>>> Actually the processing of the ret instruction is more complicated than I >>>> anticipated with intra-function calls, and so my implementation is not >>>> complete at the moment. >>>> >>>> The issue is to correctly handle how the ret is going to behave depending how >>>> the stack (or register on arm) is modified before the ret. Adjusting the stack >>>> offset makes the stack state correct, but objtool still needs to correctly >>>> figure out where the ret is going to return and where the code flow continues. >>>> >>> >>> A hint indicating the target "jump" address could be useful. It could >>> be used to add the information on some call/jump dynamic that aren't >>> associated with jump tables. Currently when objtool finds a jump >>> dynamic, if no branches were added to it, it will just return. >>> >>> Having such a hint could help make additional links (at least on >>> arm64). I don't know what Peter and Josh would think of that (if that >>> helps in your case of course). >>> >> >> Yes, I am thinking about tracking intra-function call return address, >> and having hints to specify a return address changes. For example, >> on x86, when we push the branch address on the stack we overwrite the >> last return address (the return address of the last intra-function call). >> Then the return instruction can figure out where to branch. > > I see, I was thinking about a more generic hint, that would just > indicate "this instruction actually jumps here". So in your case it > would just point that a certain return instruction causes to branch > somewhere. I thought about doing that but the problem is that on x86 the same retpoline code can branch differently depending on how it is used. Basically we have a return instruction that will branch differently based on what's on the stack. So we can just tell that this ret instruction will branch/return there. alex. > This way the hint could also be used for other instructions (e.g. > INSN_JUMP_DYNAMIC). > > > ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls 2020-04-08 17:06 ` Alexandre Chartre @ 2020-04-08 17:07 ` Julien Thierry 0 siblings, 0 replies; 40+ messages in thread From: Julien Thierry @ 2020-04-08 17:07 UTC (permalink / raw) To: Alexandre Chartre, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx On 4/8/20 6:06 PM, Alexandre Chartre wrote: > > > On 4/8/20 6:04 PM, Julien Thierry wrote: >> >> >> On 4/8/20 5:03 PM, Alexandre Chartre wrote: >>> >>> >>> On 4/8/20 4:19 PM, Julien Thierry wrote: >>>> >>>> >>>> On 4/8/20 3:06 PM, Alexandre Chartre wrote: >>>>> >>>>> >>>>> On 4/7/20 3:28 PM, Alexandre Chartre wrote: >>>>>> >>>>>> On 4/7/20 3:07 PM, Peter Zijlstra wrote: >>>>>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote: >>>>>>> >>>>>>>> index a62e032863a8..7ee1561bf7ad 100644 >>>>>>>> --- a/tools/objtool/arch/x86/decode.c >>>>>>>> +++ b/tools/objtool/arch/x86/decode.c >>>>>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct >>>>>>>> cfi_state *state) >>>>>>>> state->regs[16].base = CFI_CFA; >>>>>>>> state->regs[16].offset = -8; >>>>>>>> } >>>>>>>> + >>>>>>>> + >>>>>>>> +void arch_configure_intra_function_call(struct stack_op *op) >>>>>>>> +{ >>>>>>>> + /* >>>>>>>> + * For the impact on the stack, make an intra-function >>>>>>>> + * call behaves like a push of an immediate value (the >>>>>>>> + * return address). >>>>>>>> + */ >>>>>>>> + op->src.type = OP_SRC_CONST; >>>>>>>> + op->dest.type = OP_DEST_PUSH; >>>>>>>> +} >>>>>>> >>>>>>> An alternative is to always set up stack ops for CALL/RET on >>>>>>> decode, but >>>>>>> conditionally run update_insn_state() for them. >>>>>>> >>>>>>> Not sure that makes more logical sense, but the patch would be >>>>>>> simpler I >>>>>>> think. >>>>>> >>>>>> Right, this would avoid adding a new arch dependent function and >>>>>> the patch >>>>>> will be simpler. This probably makes sense as the stack impact is >>>>>> the same >>>>>> for all calls (but objtool will use it only for intra-function >>>>>> calls). >>>>>> >>>>> >>>>> Actually the processing of the ret instruction is more complicated >>>>> than I >>>>> anticipated with intra-function calls, and so my implementation is not >>>>> complete at the moment. >>>>> >>>>> The issue is to correctly handle how the ret is going to behave >>>>> depending how >>>>> the stack (or register on arm) is modified before the ret. >>>>> Adjusting the stack >>>>> offset makes the stack state correct, but objtool still needs to >>>>> correctly >>>>> figure out where the ret is going to return and where the code flow >>>>> continues. >>>>> >>>> >>>> A hint indicating the target "jump" address could be useful. It could >>>> be used to add the information on some call/jump dynamic that aren't >>>> associated with jump tables. Currently when objtool finds a jump >>>> dynamic, if no branches were added to it, it will just return. >>>> >>>> Having such a hint could help make additional links (at least on >>>> arm64). I don't know what Peter and Josh would think of that (if that >>>> helps in your case of course). >>>> >>> >>> Yes, I am thinking about tracking intra-function call return address, >>> and having hints to specify a return address changes. For example, >>> on x86, when we push the branch address on the stack we overwrite the >>> last return address (the return address of the last intra-function >>> call). >>> Then the return instruction can figure out where to branch. >> >> I see, I was thinking about a more generic hint, that would just >> indicate "this instruction actually jumps here". So in your case it >> would just point that a certain return instruction causes to branch >> somewhere. > > I thought about doing that but the problem is that on x86 the same > retpoline code can branch differently depending on how it is used. > Basically we have a return instruction that will branch differently > based on what's on the stack. So we can just tell that this ret > instruction will branch/return there. > Oh, I see. Nevermind then! -- Julien Thierry ^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre ` (4 preceding siblings ...) 2020-04-07 7:31 ` [PATCH V2 5/9] objtool: Add support for intra-function calls Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-04-07 13:27 ` Josh Poimboeuf 2020-04-07 7:31 ` [PATCH V2 7/9] x86/speculation: Annotate intra-function calls Alexandre Chartre ` (3 subsequent siblings) 9 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre Change __FILL_RETURN_BUFFER so that the stack state is deterministically defined for each iteration and that objtool can have an accurate view of the stack. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> --- arch/x86/include/asm/nospec-branch.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 5c24a7b35166..9a946fd5e824 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -60,8 +60,8 @@ jmp 775b; \ 774: \ dec reg; \ - jnz 771b; \ - add $(BITS_PER_LONG/8) * nr, sp; + add $(BITS_PER_LONG/8) * 2, sp; \ + jnz 771b; #ifdef __ASSEMBLY__ -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool 2020-04-07 7:31 ` [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool Alexandre Chartre @ 2020-04-07 13:27 ` Josh Poimboeuf 0 siblings, 0 replies; 40+ messages in thread From: Josh Poimboeuf @ 2020-04-07 13:27 UTC (permalink / raw) To: Alexandre Chartre; +Cc: x86, linux-kernel, peterz, jthierry, tglx On Tue, Apr 07, 2020 at 09:31:39AM +0200, Alexandre Chartre wrote: > Change __FILL_RETURN_BUFFER so that the stack state is deterministically > defined for each iteration and that objtool can have an accurate view > of the stack. > > Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> > --- > arch/x86/include/asm/nospec-branch.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h > index 5c24a7b35166..9a946fd5e824 100644 > --- a/arch/x86/include/asm/nospec-branch.h > +++ b/arch/x86/include/asm/nospec-branch.h > @@ -60,8 +60,8 @@ > jmp 775b; \ > 774: \ > dec reg; \ > - jnz 771b; \ > - add $(BITS_PER_LONG/8) * nr, sp; > + add $(BITS_PER_LONG/8) * 2, sp; \ > + jnz 771b; > > #ifdef __ASSEMBLY__ This still isn't a complete fix because the macro is used in an alternative. So in the !X86_FEATURE_RSB_CTXSW case, this code isn't patched in and the ORC data which refers to it is wrong. As I said before I think the easiest fix would be to convert RSB and retpolines to use static branches instead of alternatives. -- Josh ^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH V2 7/9] x86/speculation: Annotate intra-function calls 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre ` (5 preceding siblings ...) 2020-04-07 7:31 ` [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 8/9] x86/speculation: Add unwind hint to trampoline return Alexandre Chartre ` (2 subsequent siblings) 9 siblings, 0 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre Some speculative execution mitigations (like retpoline) use intra- function calls. Provide a macro to annotate such intra-function calls so they can be properly handled by objtool, and use this macro to annotate intra-function calls. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> --- arch/x86/include/asm/nospec-branch.h | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 9a946fd5e824..a345c6fa0541 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -3,6 +3,7 @@ #ifndef _ASM_X86_NOSPEC_BRANCH_H_ #define _ASM_X86_NOSPEC_BRANCH_H_ +#include <linux/frame.h> #include <linux/static_key.h> #include <asm/alternative.h> @@ -19,6 +20,15 @@ #define ANNOTATE_NOSPEC_ALTERNATIVE \ ANNOTATE_IGNORE_ALTERNATIVE +/* + * Intra-function call instruction. This should be used as a substitute + * for the call instruction when doing an intra-function call. It is + * similar to the call instruction but it tells objtool that this is + * an intra-function call. + */ +#define INTRA_FUNCTION_CALL \ + ANNOTATE_INTRA_FUNCTION_CALL call + /* * Fill the CPU return stack buffer. * @@ -47,13 +57,13 @@ #define __FILL_RETURN_BUFFER(reg, nr, sp) \ mov $(nr/2), reg; \ 771: \ - call 772f; \ + INTRA_FUNCTION_CALL 772f; \ 773: /* speculation trap */ \ pause; \ lfence; \ jmp 773b; \ 772: \ - call 774f; \ + INTRA_FUNCTION_CALL 774f; \ 775: /* speculation trap */ \ pause; \ lfence; \ @@ -83,7 +93,7 @@ * invocation below less ugly. */ .macro RETPOLINE_JMP reg:req - call .Ldo_rop_\@ + INTRA_FUNCTION_CALL .Ldo_rop_\@ .Lspec_trap_\@: pause lfence @@ -102,7 +112,7 @@ .Ldo_retpoline_jmp_\@: RETPOLINE_JMP \reg .Ldo_call_\@: - call .Ldo_retpoline_jmp_\@ + INTRA_FUNCTION_CALL .Ldo_retpoline_jmp_\@ .endm /* -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH V2 8/9] x86/speculation: Add unwind hint to trampoline return 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre ` (6 preceding siblings ...) 2020-04-07 7:31 ` [PATCH V2 7/9] x86/speculation: Annotate intra-function calls Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre 2020-04-07 13:35 ` [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Josh Poimboeuf 9 siblings, 0 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre With retpoline, the address to branch to is pushed on the stack and the return instruction (ret) is used to jump to that address. Use the UNWIND_HINT_RET_OFFSET directive to inform objtool that the stack should be adjusted. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> replace RETPOLINE_RET with UNWIND_HINT_RET_OFFSET --- arch/x86/include/asm/nospec-branch.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index a345c6fa0541..5ce2a40a26da 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -10,6 +10,7 @@ #include <asm/alternative-asm.h> #include <asm/cpufeatures.h> #include <asm/msr-index.h> +#include <asm/unwind_hints.h> /* * This should be used immediately before a retpoline alternative. It tells @@ -100,6 +101,7 @@ jmp .Lspec_trap_\@ .Ldo_rop_\@: mov \reg, (%_ASM_SP) + UNWIND_HINT_RET_OFFSET ret .endm -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre ` (7 preceding siblings ...) 2020-04-07 7:31 ` [PATCH V2 8/9] x86/speculation: Add unwind hint to trampoline return Alexandre Chartre @ 2020-04-07 7:31 ` Alexandre Chartre 2020-04-07 13:28 ` Peter Zijlstra 2020-04-07 13:52 ` Peter Zijlstra 2020-04-07 13:35 ` [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Josh Poimboeuf 9 siblings, 2 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 7:31 UTC (permalink / raw) To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre Now that intra-function calls have been annotated and are supported by objtool, that retpoline return instructions have been annotated, and that __FILL_RETURN_BUFFER code is compatible with objtool, then all ANNOTATE_NOSPEC_ALTERNATIVE directives can be removed. Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> --- arch/x86/include/asm/nospec-branch.h | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 5ce2a40a26da..6480db4304a0 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -124,7 +124,6 @@ */ .macro JMP_NOSPEC reg:req #ifdef CONFIG_RETPOLINE - ANNOTATE_NOSPEC_ALTERNATIVE ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD @@ -135,7 +134,6 @@ .macro CALL_NOSPEC reg:req #ifdef CONFIG_RETPOLINE - ANNOTATE_NOSPEC_ALTERNATIVE ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \ __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\ __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD @@ -150,7 +148,6 @@ */ .macro FILL_RETURN_BUFFER reg:req nr:req ftr:req #ifdef CONFIG_RETPOLINE - ANNOTATE_NOSPEC_ALTERNATIVE ALTERNATIVE "jmp .Lskip_rsb_\@", \ __stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)) \ \ftr @@ -174,7 +171,6 @@ * which is ensured when CONFIG_RETPOLINE is defined. */ # define CALL_NOSPEC \ - ANNOTATE_NOSPEC_ALTERNATIVE \ ALTERNATIVE_2( \ ANNOTATE_RETPOLINE_SAFE \ "call *%[thunk_target]\n", \ @@ -193,7 +189,6 @@ * here, anyway. */ # define CALL_NOSPEC \ - ANNOTATE_NOSPEC_ALTERNATIVE \ ALTERNATIVE_2( \ ANNOTATE_RETPOLINE_SAFE \ "call *%[thunk_target]\n", \ @@ -261,8 +256,7 @@ static inline void vmexit_fill_RSB(void) #ifdef CONFIG_RETPOLINE unsigned long loops; - asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE - ALTERNATIVE("jmp 910f", + asm volatile (ALTERNATIVE("jmp 910f", __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)), X86_FEATURE_RETPOLINE) "910:" -- 2.18.2 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre @ 2020-04-07 13:28 ` Peter Zijlstra 2020-04-07 13:34 ` Josh Poimboeuf 2020-04-07 13:52 ` Peter Zijlstra 1 sibling, 1 reply; 40+ messages in thread From: Peter Zijlstra @ 2020-04-07 13:28 UTC (permalink / raw) To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx On Tue, Apr 07, 2020 at 09:31:42AM +0200, Alexandre Chartre wrote: > Now that intra-function calls have been annotated and are supported > by objtool, that retpoline return instructions have been annotated, > and that __FILL_RETURN_BUFFER code is compatible with objtool, then > all ANNOTATE_NOSPEC_ALTERNATIVE directives can be removed. Like Josh said in the previous thread, this isn't going to work right. > - ANNOTATE_NOSPEC_ALTERNATIVE > ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ > __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ > __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD The problem is that while objtool can now understand the code flow and the corresponding stack layout, we only have a single ORC table, one that must be valid for all alternatives. Effectively this means there should not be any orc entries in an alternative range. In practise it _might_ work when the instruction of the various alternatives have unique offsets in the range. But I'm not entirely sure of that. Josh, we should probably have objtool verify it doesn't emit ORC entries in alternative ranges. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 13:28 ` Peter Zijlstra @ 2020-04-07 13:34 ` Josh Poimboeuf 2020-04-07 14:32 ` Alexandre Chartre 0 siblings, 1 reply; 40+ messages in thread From: Josh Poimboeuf @ 2020-04-07 13:34 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Alexandre Chartre, x86, linux-kernel, jthierry, tglx On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote: > Josh, we should probably have objtool verify it doesn't emit ORC entries > in alternative ranges. Agreed, it might be as simple as checking for insn->alt_group in the INSN_STACK check or in update_insn_state(). -- Josh ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 13:34 ` Josh Poimboeuf @ 2020-04-07 14:32 ` Alexandre Chartre 2020-04-07 16:18 ` Alexandre Chartre 0 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 14:32 UTC (permalink / raw) To: Josh Poimboeuf, Peter Zijlstra; +Cc: x86, linux-kernel, jthierry, tglx On 4/7/20 3:34 PM, Josh Poimboeuf wrote: > On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote: >> Josh, we should probably have objtool verify it doesn't emit ORC entries >> in alternative ranges. > > Agreed, it might be as simple as checking for insn->alt_group in the > INSN_STACK check or in update_insn_state(). > We could do that only for the "objtool orc generate" command. That way "objtool check" would still check the alternative, but "objtool orc generate" will just use the first half of the alternative (like it does today with ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE but only use them for "objtool orc generate". alex. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 14:32 ` Alexandre Chartre @ 2020-04-07 16:18 ` Alexandre Chartre 2020-04-07 16:28 ` Josh Poimboeuf 2020-04-07 16:41 ` Peter Zijlstra 0 siblings, 2 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 16:18 UTC (permalink / raw) To: Josh Poimboeuf, Peter Zijlstra; +Cc: x86, linux-kernel, jthierry, tglx On 4/7/20 4:32 PM, Alexandre Chartre wrote: > > On 4/7/20 3:34 PM, Josh Poimboeuf wrote: >> On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote: >>> Josh, we should probably have objtool verify it doesn't emit ORC entries >>> in alternative ranges. >> >> Agreed, it might be as simple as checking for insn->alt_group in the >> INSN_STACK check or in update_insn_state(). >> > > We could do that only for the "objtool orc generate" command. That way > "objtool check" would still check the alternative, but "objtool orc generate" > will just use the first half of the alternative (like it does today with > ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE > but only use them for "objtool orc generate". > I have checked and objtool doesn't emit ORC entries for alternative: decode_instructions() doesn't mark such section with sec->text = true so create_orc_sections() doesn't emit corresponding ORC entries. So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives, this will allow objtool to check the instructions but it still won't emit ORC entries (same behavior as today). In the future, if ORC eventually supports alternative we will be ready to have objtool emit ORC entries. alex. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 16:18 ` Alexandre Chartre @ 2020-04-07 16:28 ` Josh Poimboeuf 2020-04-07 17:01 ` Alexandre Chartre 2020-04-07 17:27 ` Peter Zijlstra 2020-04-07 16:41 ` Peter Zijlstra 1 sibling, 2 replies; 40+ messages in thread From: Josh Poimboeuf @ 2020-04-07 16:28 UTC (permalink / raw) To: Alexandre Chartre; +Cc: Peter Zijlstra, x86, linux-kernel, jthierry, tglx On Tue, Apr 07, 2020 at 06:18:51PM +0200, Alexandre Chartre wrote: > > On 4/7/20 4:32 PM, Alexandre Chartre wrote: > > > > On 4/7/20 3:34 PM, Josh Poimboeuf wrote: > > > On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote: > > > > Josh, we should probably have objtool verify it doesn't emit ORC entries > > > > in alternative ranges. > > > > > > Agreed, it might be as simple as checking for insn->alt_group in the > > > INSN_STACK check or in update_insn_state(). > > > > > > > We could do that only for the "objtool orc generate" command. That way > > "objtool check" would still check the alternative, but "objtool orc generate" > > will just use the first half of the alternative (like it does today with > > ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE > > but only use them for "objtool orc generate". > > > > I have checked and objtool doesn't emit ORC entries for alternative: > decode_instructions() doesn't mark such section with sec->text = true > so create_orc_sections() doesn't emit corresponding ORC entries. > > So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives, > this will allow objtool to check the instructions but it still won't > emit ORC entries (same behavior as today). In the future, if ORC > eventually supports alternative we will be ready to have objtool emit > ORC entries. What's the benefit of removing ANNOTATE_NOSPEC_ALTERNATIVE if there's no ORC support to go along with it? Also I want to avoid adding "ORC alternatives". ORC is nice and simple and we should keep it that way as much as possible. Again, we should warn on stack changes inside alternatives, and then look at converting RSB and retpolines to use static branches so they have deterministic stacks. -- Josh ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 16:28 ` Josh Poimboeuf @ 2020-04-07 17:01 ` Alexandre Chartre 2020-04-07 17:26 ` Peter Zijlstra 2020-04-07 17:27 ` Peter Zijlstra 1 sibling, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 17:01 UTC (permalink / raw) To: Josh Poimboeuf; +Cc: Peter Zijlstra, x86, linux-kernel, jthierry, tglx On 4/7/20 6:28 PM, Josh Poimboeuf wrote: > On Tue, Apr 07, 2020 at 06:18:51PM +0200, Alexandre Chartre wrote: >> >> On 4/7/20 4:32 PM, Alexandre Chartre wrote: >>> >>> On 4/7/20 3:34 PM, Josh Poimboeuf wrote: >>>> On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote: >>>>> Josh, we should probably have objtool verify it doesn't emit ORC entries >>>>> in alternative ranges. >>>> >>>> Agreed, it might be as simple as checking for insn->alt_group in the >>>> INSN_STACK check or in update_insn_state(). >>>> >>> >>> We could do that only for the "objtool orc generate" command. That way >>> "objtool check" would still check the alternative, but "objtool orc generate" >>> will just use the first half of the alternative (like it does today with >>> ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE >>> but only use them for "objtool orc generate". >>> >> >> I have checked and objtool doesn't emit ORC entries for alternative: >> decode_instructions() doesn't mark such section with sec->text = true >> so create_orc_sections() doesn't emit corresponding ORC entries. >> >> So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives, >> this will allow objtool to check the instructions but it still won't >> emit ORC entries (same behavior as today). In the future, if ORC >> eventually supports alternative we will be ready to have objtool emit >> ORC entries. > > What's the benefit of removing ANNOTATE_NOSPEC_ALTERNATIVE if there's no > ORC support to go along with it? To have the code validated by objtool like any other alternative code (which is not tagged with ANNOTATE_NOSPEC_ALTERNATIVE). > Also I want to avoid adding "ORC alternatives". ORC is nice and simple > and we should keep it that way as much as possible. > > Again, we should warn on stack changes inside alternatives, and then > look at converting RSB and retpolines to use static branches so they > have deterministic stacks. > objtool doesn't currently warn on stack changes inside alternatives. The RSB/retpoline alternatives have warning because objtool doesn't support retpoline ret and intra-function calls. If you have an alternative doing stack changes that objtool understand (like push/pop, add/remove to sp) then you won't have a warning. I think that's the case with smap_save: static __always_inline unsigned long smap_save(void) { unsigned long flags; asm volatile (ALTERNATIVE("", "pushf; pop %0; " __ASM_CLAC, X86_FEATURE_SMAP) : "=rm" (flags) : : "memory", "cc"); return flags; } The alternative does change the stack but objtool won't complain because it handles the pushf and pop instruction. alex. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 17:01 ` Alexandre Chartre @ 2020-04-07 17:26 ` Peter Zijlstra 0 siblings, 0 replies; 40+ messages in thread From: Peter Zijlstra @ 2020-04-07 17:26 UTC (permalink / raw) To: Alexandre Chartre; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx On Tue, Apr 07, 2020 at 07:01:43PM +0200, Alexandre Chartre wrote: > I think that's the case with smap_save: > > static __always_inline unsigned long smap_save(void) > { > unsigned long flags; > > asm volatile (ALTERNATIVE("", "pushf; pop %0; " __ASM_CLAC, > X86_FEATURE_SMAP) > : "=rm" (flags) : : "memory", "cc"); > > return flags; > } > > The alternative does change the stack but objtool won't complain > because it handles the pushf and pop instruction. Oh bugger; indeed! That'll wreck unwinding. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 16:28 ` Josh Poimboeuf 2020-04-07 17:01 ` Alexandre Chartre @ 2020-04-07 17:27 ` Peter Zijlstra 2020-04-08 21:35 ` Peter Zijlstra 1 sibling, 1 reply; 40+ messages in thread From: Peter Zijlstra @ 2020-04-07 17:27 UTC (permalink / raw) To: Josh Poimboeuf; +Cc: Alexandre Chartre, x86, linux-kernel, jthierry, tglx On Tue, Apr 07, 2020 at 11:28:38AM -0500, Josh Poimboeuf wrote: > Again, we should warn on stack changes inside alternatives, and then > look at converting RSB and retpolines to use static branches so they > have deterministic stacks. I don't think we need static brancher, we should just out-of-line the whole thing. Let me sort this CFI error Thomas is getting and then I'll attempt a patch along the lines I outlined in earlier emails. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 17:27 ` Peter Zijlstra @ 2020-04-08 21:35 ` Peter Zijlstra 2020-04-09 8:18 ` Alexandre Chartre 0 siblings, 1 reply; 40+ messages in thread From: Peter Zijlstra @ 2020-04-08 21:35 UTC (permalink / raw) To: Josh Poimboeuf; +Cc: Alexandre Chartre, x86, linux-kernel, jthierry, tglx On Tue, Apr 07, 2020 at 07:27:39PM +0200, Peter Zijlstra wrote: > On Tue, Apr 07, 2020 at 11:28:38AM -0500, Josh Poimboeuf wrote: > > Again, we should warn on stack changes inside alternatives, and then > > look at converting RSB and retpolines to use static branches so they > > have deterministic stacks. > > I don't think we need static brancher, we should just out-of-line the > whole thing. > > Let me sort this CFI error Thomas is getting and then I'll attempt a > patch along the lines I outlined in earlier emails. Something like so.. seems to build and boot. --- From: Peter Zijlstra (Intel) <peterz@infradead.org> Subject: x86: Out-of-line retpoline Since GCC generated code already uses out-of-line retpolines and objtool has trouble with retpolines in alternatives, out-of-line them entirely. This will enable objtool (once it's been taught a few more tricks) to generate valid ORC data for the out-of-line copies, which means we can correctly and reliably unwind through a retpoline. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- arch/x86/crypto/aesni-intel_asm.S | 4 +-- arch/x86/crypto/camellia-aesni-avx-asm_64.S | 2 +- arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 2 +- arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 26 ++++++++--------- arch/x86/entry/entry_32.S | 6 ++-- arch/x86/entry/entry_64.S | 2 +- arch/x86/include/asm/asm-prototypes.h | 8 ++++-- arch/x86/include/asm/nospec-branch.h | 42 ++++------------------------ arch/x86/kernel/ftrace_32.S | 2 +- arch/x86/kernel/ftrace_64.S | 4 +-- arch/x86/lib/checksum_32.S | 4 +-- arch/x86/lib/retpoline.S | 27 +++++++++++++++--- arch/x86/platform/efi/efi_stub_64.S | 2 +- 13 files changed, 62 insertions(+), 69 deletions(-) diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S index cad6e1bfa7d5..54e7d15dbd0d 100644 --- a/arch/x86/crypto/aesni-intel_asm.S +++ b/arch/x86/crypto/aesni-intel_asm.S @@ -2758,7 +2758,7 @@ SYM_FUNC_START(aesni_xts_crypt8) pxor INC, STATE4 movdqu IV, 0x30(OUTP) - CALL_NOSPEC %r11 + CALL_NOSPEC r11 movdqu 0x00(OUTP), INC pxor INC, STATE1 @@ -2803,7 +2803,7 @@ SYM_FUNC_START(aesni_xts_crypt8) _aesni_gf128mul_x_ble() movups IV, (IVP) - CALL_NOSPEC %r11 + CALL_NOSPEC r11 movdqu 0x40(OUTP), INC pxor INC, STATE1 diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S index d01ddd73de65..ecc0a9a905c4 100644 --- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S +++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S @@ -1228,7 +1228,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_16way) vpxor 14 * 16(%rax), %xmm15, %xmm14; vpxor 15 * 16(%rax), %xmm15, %xmm15; - CALL_NOSPEC %r9; + CALL_NOSPEC r9; addq $(16 * 16), %rsp; diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S index 563ef6e83cdd..0907243c501c 100644 --- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S +++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S @@ -1339,7 +1339,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_32way) vpxor 14 * 32(%rax), %ymm15, %ymm14; vpxor 15 * 32(%rax), %ymm15, %ymm15; - CALL_NOSPEC %r9; + CALL_NOSPEC r9; addq $(16 * 32), %rsp; diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S index 0e6690e3618c..8501ec4532f4 100644 --- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S +++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S @@ -75,7 +75,7 @@ .text SYM_FUNC_START(crc_pcl) -#define bufp %rdi +#define bufp rdi #define bufp_dw %edi #define bufp_w %di #define bufp_b %dil @@ -105,9 +105,9 @@ SYM_FUNC_START(crc_pcl) ## 1) ALIGN: ################################################################ - mov bufp, bufptmp # rdi = *buf - neg bufp - and $7, bufp # calculate the unalignment amount of + mov %bufp, bufptmp # rdi = *buf + neg %bufp + and $7, %bufp # calculate the unalignment amount of # the address je proc_block # Skip if aligned @@ -123,13 +123,13 @@ SYM_FUNC_START(crc_pcl) do_align: #### Calculate CRC of unaligned bytes of the buffer (if any) movq (bufptmp), tmp # load a quadward from the buffer - add bufp, bufptmp # align buffer pointer for quadword + add %bufp, bufptmp # align buffer pointer for quadword # processing - sub bufp, len # update buffer length + sub %bufp, len # update buffer length align_loop: crc32b %bl, crc_init_dw # compute crc32 of 1-byte shr $8, tmp # get next byte - dec bufp + dec %bufp jne align_loop proc_block: @@ -169,10 +169,10 @@ continue_block: xor crc2, crc2 ## branch into array - lea jump_table(%rip), bufp - movzxw (bufp, %rax, 2), len - lea crc_array(%rip), bufp - lea (bufp, len, 1), bufp + lea jump_table(%rip), %bufp + movzxw (%bufp, %rax, 2), len + lea crc_array(%rip), %bufp + lea (%bufp, len, 1), %bufp JMP_NOSPEC bufp ################################################################ @@ -218,9 +218,9 @@ LABEL crc_ %i ## 4) Combine three results: ################################################################ - lea (K_table-8)(%rip), bufp # first entry is for idx 1 + lea (K_table-8)(%rip), %bufp # first entry is for idx 1 shlq $3, %rax # rax *= 8 - pmovzxdq (bufp,%rax), %xmm0 # 2 consts: K1:K2 + pmovzxdq (%bufp,%rax), %xmm0 # 2 consts: K1:K2 leal (%eax,%eax,2), %eax # rax *= 3 (total *24) subq %rax, tmp # tmp -= rax*24 diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index b67bae7091d7..7e7ffb7a5147 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -816,7 +816,7 @@ SYM_CODE_START(ret_from_fork) /* kernel thread */ 1: movl %edi, %eax - CALL_NOSPEC %ebx + CALL_NOSPEC ebx /* * A kernel thread is allowed to return here after successfully * calling do_execve(). Exit to userspace to complete the execve() @@ -1501,7 +1501,7 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exception_read_cr2) TRACE_IRQS_OFF movl %esp, %eax # pt_regs pointer - CALL_NOSPEC %edi + CALL_NOSPEC edi jmp ret_from_exception SYM_CODE_END(common_exception_read_cr2) @@ -1522,7 +1522,7 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exception) TRACE_IRQS_OFF movl %esp, %eax # pt_regs pointer - CALL_NOSPEC %edi + CALL_NOSPEC edi jmp ret_from_exception SYM_CODE_END(common_exception) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 0e9504fabe52..168b798913bc 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -349,7 +349,7 @@ SYM_CODE_START(ret_from_fork) /* kernel thread */ UNWIND_HINT_EMPTY movq %r12, %rdi - CALL_NOSPEC %rbx + CALL_NOSPEC rbx /* * A kernel thread is allowed to return here after successfully * calling do_execve(). Exit to userspace to complete the execve() diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h index ce92c4acc913..a75195f159cc 100644 --- a/arch/x86/include/asm/asm-prototypes.h +++ b/arch/x86/include/asm/asm-prototypes.h @@ -18,9 +18,13 @@ extern void cmpxchg8b_emu(void); #ifdef CONFIG_RETPOLINE #ifdef CONFIG_X86_32 -#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void); +#define INDIRECT_THUNK(reg) \ + extern asmlinkage void __x86_retpoline_e ## reg(void); \ + extern asmlinkage void __x86_indirect_thunk_e ## reg(void); #else -#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void); +#define INDIRECT_THUNK(reg) \ + extern asmlinkage void __x86_retpoline_r ## reg(void); \ + extern asmlinkage void __x86_indirect_thunk_r ## reg(void); INDIRECT_THUNK(8) INDIRECT_THUNK(9) INDIRECT_THUNK(10) diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 07e95dcb40ad..0a3ceab5e0ec 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -76,34 +76,6 @@ .popsection .endm -/* - * These are the bare retpoline primitives for indirect jmp and call. - * Do not use these directly; they only exist to make the ALTERNATIVE - * invocation below less ugly. - */ -.macro RETPOLINE_JMP reg:req - call .Ldo_rop_\@ -.Lspec_trap_\@: - pause - lfence - jmp .Lspec_trap_\@ -.Ldo_rop_\@: - mov \reg, (%_ASM_SP) - ret -.endm - -/* - * This is a wrapper around RETPOLINE_JMP so the called function in reg - * returns to the instruction after the macro. - */ -.macro RETPOLINE_CALL reg:req - jmp .Ldo_call_\@ -.Ldo_retpoline_jmp_\@: - RETPOLINE_JMP \reg -.Ldo_call_\@: - call .Ldo_retpoline_jmp_\@ -.endm - /* * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple * indirect jmp/call which may be susceptible to the Spectre variant 2 @@ -111,10 +83,9 @@ */ .macro JMP_NOSPEC reg:req #ifdef CONFIG_RETPOLINE - ANNOTATE_NOSPEC_ALTERNATIVE - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ - __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD + ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \ + __stringify(jmp __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE, \ + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD #else jmp *\reg #endif @@ -122,10 +93,9 @@ .macro CALL_NOSPEC reg:req #ifdef CONFIG_RETPOLINE - ANNOTATE_NOSPEC_ALTERNATIVE - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \ - __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\ - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD + ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \ + __stringify(call __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,\ + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD #else call *\reg #endif diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S index e8a9f8370112..e405fe1a8bf4 100644 --- a/arch/x86/kernel/ftrace_32.S +++ b/arch/x86/kernel/ftrace_32.S @@ -189,5 +189,5 @@ return_to_handler: movl %eax, %ecx popl %edx popl %eax - JMP_NOSPEC %ecx + JMP_NOSPEC ecx #endif diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S index 369e61faacfe..2f2b5702e6cf 100644 --- a/arch/x86/kernel/ftrace_64.S +++ b/arch/x86/kernel/ftrace_64.S @@ -303,7 +303,7 @@ trace: * function tracing is enabled. */ movq ftrace_trace_function, %r8 - CALL_NOSPEC %r8 + CALL_NOSPEC r8 restore_mcount_regs jmp fgraph_trace @@ -340,6 +340,6 @@ SYM_CODE_START(return_to_handler) movq 8(%rsp), %rdx movq (%rsp), %rax addq $24, %rsp - JMP_NOSPEC %rdi + JMP_NOSPEC rdi SYM_CODE_END(return_to_handler) #endif diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S index 4742e8fa7ee7..d1d768912368 100644 --- a/arch/x86/lib/checksum_32.S +++ b/arch/x86/lib/checksum_32.S @@ -153,7 +153,7 @@ SYM_FUNC_START(csum_partial) negl %ebx lea 45f(%ebx,%ebx,2), %ebx testl %esi, %esi - JMP_NOSPEC %ebx + JMP_NOSPEC ebx # Handle 2-byte-aligned regions 20: addw (%esi), %ax @@ -436,7 +436,7 @@ SYM_FUNC_START(csum_partial_copy_generic) andl $-32,%edx lea 3f(%ebx,%ebx), %ebx testl %esi, %esi - JMP_NOSPEC %ebx + JMP_NOSPEC ebx 1: addl $64,%esi addl $64,%edi SRC(movb -32(%edx),%bl) ; SRC(movb (%edx),%bl) diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S index 363ec132df7e..d4aef0c856a6 100644 --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -8,14 +8,31 @@ #include <asm/export.h> #include <asm/nospec-branch.h> +/* + * This is the bare retpoline primitive. + */ +.macro RETPOLINE_JMP reg:req + call .Ldo_rop_\@ +.Lspec_trap_\@: + pause + lfence + jmp .Lspec_trap_\@ +.Ldo_rop_\@: + mov \reg, (%_ASM_SP) + ret +.endm + .macro THUNK reg .section .text.__x86.indirect_thunk +SYM_FUNC_START(__x86_retpoline_\reg) + RETPOLINE_JMP %\reg +SYM_FUNC_END(__x86_retpoline_\reg) + SYM_FUNC_START(__x86_indirect_thunk_\reg) - CFI_STARTPROC - JMP_NOSPEC %\reg - CFI_ENDPROC + JMP_NOSPEC \reg SYM_FUNC_END(__x86_indirect_thunk_\reg) + .endm /* @@ -26,7 +43,9 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg) * the simple and nasty way... */ #define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym) -#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg) +#define EXPORT_THUNK(reg) \ + __EXPORT_THUNK(__x86_retpoline_ ## reg); \ + __EXPORT_THUNK(__x86_indirect_thunk_ ## reg) #define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg) GENERATE_THUNK(_ASM_AX) diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S index 15da118f04f0..90380a17ab23 100644 --- a/arch/x86/platform/efi/efi_stub_64.S +++ b/arch/x86/platform/efi/efi_stub_64.S @@ -21,7 +21,7 @@ SYM_FUNC_START(__efi_call) mov %r8, %r9 mov %rcx, %r8 mov %rsi, %rcx - CALL_NOSPEC %rdi + CALL_NOSPEC rdi leave ret SYM_FUNC_END(__efi_call) ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-08 21:35 ` Peter Zijlstra @ 2020-04-09 8:18 ` Alexandre Chartre 2020-04-09 10:34 ` Peter Zijlstra 0 siblings, 1 reply; 40+ messages in thread From: Alexandre Chartre @ 2020-04-09 8:18 UTC (permalink / raw) To: Peter Zijlstra, Josh Poimboeuf; +Cc: x86, linux-kernel, jthierry, tglx On 4/8/20 11:35 PM, Peter Zijlstra wrote: > On Tue, Apr 07, 2020 at 07:27:39PM +0200, Peter Zijlstra wrote: >> On Tue, Apr 07, 2020 at 11:28:38AM -0500, Josh Poimboeuf wrote: >>> Again, we should warn on stack changes inside alternatives, and then >>> look at converting RSB and retpolines to use static branches so they >>> have deterministic stacks. >> >> I don't think we need static brancher, we should just out-of-line the >> whole thing. >> >> Let me sort this CFI error Thomas is getting and then I'll attempt a >> patch along the lines I outlined in earlier emails. > > Something like so.. seems to build and boot. > > --- > From: Peter Zijlstra (Intel) <peterz@infradead.org> > Subject: x86: Out-of-line retpoline > > Since GCC generated code already uses out-of-line retpolines and objtool > has trouble with retpolines in alternatives, out-of-line them entirely. > > This will enable objtool (once it's been taught a few more tricks) to > generate valid ORC data for the out-of-line copies, which means we can > correctly and reliably unwind through a retpoline. > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> > --- > arch/x86/crypto/aesni-intel_asm.S | 4 +-- > arch/x86/crypto/camellia-aesni-avx-asm_64.S | 2 +- > arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 2 +- > arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 26 ++++++++--------- > arch/x86/entry/entry_32.S | 6 ++-- > arch/x86/entry/entry_64.S | 2 +- > arch/x86/include/asm/asm-prototypes.h | 8 ++++-- > arch/x86/include/asm/nospec-branch.h | 42 ++++------------------------ > arch/x86/kernel/ftrace_32.S | 2 +- > arch/x86/kernel/ftrace_64.S | 4 +-- > arch/x86/lib/checksum_32.S | 4 +-- > arch/x86/lib/retpoline.S | 27 +++++++++++++++--- > arch/x86/platform/efi/efi_stub_64.S | 2 +- > 13 files changed, 62 insertions(+), 69 deletions(-) > ... > /* > * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple > * indirect jmp/call which may be susceptible to the Spectre variant 2 > @@ -111,10 +83,9 @@ > */ > .macro JMP_NOSPEC reg:req > #ifdef CONFIG_RETPOLINE > - ANNOTATE_NOSPEC_ALTERNATIVE > - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ > - __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ > - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD > + ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \ > + __stringify(jmp __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE, \ > + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD > #else > jmp *\reg > #endif > @@ -122,10 +93,9 @@ > > .macro CALL_NOSPEC reg:req > #ifdef CONFIG_RETPOLINE > - ANNOTATE_NOSPEC_ALTERNATIVE > - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \ > - __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\ > - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD > + ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \ > + __stringify(call __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,\ > + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD For X86_FEATURE_RETPOLINE_AMD, the call won't be aligned like the others, it will be after the lfence instruction so ORC data won't be at the same place. I am adding some code in objtool to check that alternatives don't change the stack, but I should actually be checking if all alternatives have the same unwind instructions at the same place. Other than that, my only question would be any impact on performances. Retpoline code was added with trying to limit performance impact. Here, JMP_NOSPEC has now an additional (long) jump, and CALL_NOSPEC is doing a long call instead of a near call. But I have no idea if this has a visible impact. alex. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-09 8:18 ` Alexandre Chartre @ 2020-04-09 10:34 ` Peter Zijlstra 2020-04-09 10:40 ` Peter Zijlstra 0 siblings, 1 reply; 40+ messages in thread From: Peter Zijlstra @ 2020-04-09 10:34 UTC (permalink / raw) To: Alexandre Chartre; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx On Thu, Apr 09, 2020 at 10:18:56AM +0200, Alexandre Chartre wrote: > > - ANNOTATE_NOSPEC_ALTERNATIVE > > - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \ > > - __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\ > > - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD > > + ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \ > > + __stringify(call __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,\ > > + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD > > For X86_FEATURE_RETPOLINE_AMD, the call won't be aligned like the others, > it will be after the lfence instruction so ORC data won't be at the same > place. I am adding some code in objtool to check that alternatives don't > change the stack, but I should actually be checking if all alternatives > have the same unwind instructions at the same place. Argh; earlier (20200407135953.GC20730@hirez.programming.kicks-ass.net) I used 2 alternatives but then, when I did the patch yesterday, I forgot why I did that :/ > Other than that, my only question would be any impact on performances. Yeah, that needs testing, I suspect it's a wash. > Retpoline code was added with trying to limit performance impact. > Here, JMP_NOSPEC has now an additional (long) jump, and CALL_NOSPEC > is doing a long call instead of a near call. But I have no idea if this > has a visible impact. The thing is, all the compiler generated code already used the out-of-line copies, and those will now have a near jump extra I think, but that should be to the same $I line, given that __x86_retpoline_ and __x86_indirect_thunk_ are next to one another. We can also play alignment tricks, see below. The only sites that can suffer are those few manual asm uses of JMP_NOSPEC, and I'm not sure we care. The below results in: Disassembly of section .text.__x86.indirect_thunk: 0000000000000000 <__x86_indirect_thunk_rax>: 0: 90 nop 1: 90 nop 2: 90 nop 3: ff e0 jmpq *%rax 5: 90 nop 6: 90 nop 7: 90 nop 0000000000000008 <__x86_retpoline_rax>: 8: e8 07 00 00 00 callq 14 <__x86_retpoline_rax+0xc> d: f3 90 pause f: 0f ae e8 lfence 12: eb f9 jmp d <__x86_retpoline_rax+0x5> 14: 48 89 04 24 mov %rax,(%rsp) 18: c3 retq 19: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 0000000000000020 <__x86_indirect_thunk_rbx>: .... I'm not sure why it thinks the "jmp __x86_retpoline_rax" needs 5 bytes to encode, but those nops don't hurt nothing since we'll not fit in 16 bytes anyway. --- diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S index cad6e1bfa7d5..54e7d15dbd0d 100644 --- a/arch/x86/crypto/aesni-intel_asm.S +++ b/arch/x86/crypto/aesni-intel_asm.S @@ -2758,7 +2758,7 @@ SYM_FUNC_START(aesni_xts_crypt8) pxor INC, STATE4 movdqu IV, 0x30(OUTP) - CALL_NOSPEC %r11 + CALL_NOSPEC r11 movdqu 0x00(OUTP), INC pxor INC, STATE1 @@ -2803,7 +2803,7 @@ SYM_FUNC_START(aesni_xts_crypt8) _aesni_gf128mul_x_ble() movups IV, (IVP) - CALL_NOSPEC %r11 + CALL_NOSPEC r11 movdqu 0x40(OUTP), INC pxor INC, STATE1 diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S index d01ddd73de65..ecc0a9a905c4 100644 --- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S +++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S @@ -1228,7 +1228,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_16way) vpxor 14 * 16(%rax), %xmm15, %xmm14; vpxor 15 * 16(%rax), %xmm15, %xmm15; - CALL_NOSPEC %r9; + CALL_NOSPEC r9; addq $(16 * 16), %rsp; diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S index 563ef6e83cdd..0907243c501c 100644 --- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S +++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S @@ -1339,7 +1339,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_32way) vpxor 14 * 32(%rax), %ymm15, %ymm14; vpxor 15 * 32(%rax), %ymm15, %ymm15; - CALL_NOSPEC %r9; + CALL_NOSPEC r9; addq $(16 * 32), %rsp; diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S index 0e6690e3618c..8501ec4532f4 100644 --- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S +++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S @@ -75,7 +75,7 @@ .text SYM_FUNC_START(crc_pcl) -#define bufp %rdi +#define bufp rdi #define bufp_dw %edi #define bufp_w %di #define bufp_b %dil @@ -105,9 +105,9 @@ SYM_FUNC_START(crc_pcl) ## 1) ALIGN: ################################################################ - mov bufp, bufptmp # rdi = *buf - neg bufp - and $7, bufp # calculate the unalignment amount of + mov %bufp, bufptmp # rdi = *buf + neg %bufp + and $7, %bufp # calculate the unalignment amount of # the address je proc_block # Skip if aligned @@ -123,13 +123,13 @@ SYM_FUNC_START(crc_pcl) do_align: #### Calculate CRC of unaligned bytes of the buffer (if any) movq (bufptmp), tmp # load a quadward from the buffer - add bufp, bufptmp # align buffer pointer for quadword + add %bufp, bufptmp # align buffer pointer for quadword # processing - sub bufp, len # update buffer length + sub %bufp, len # update buffer length align_loop: crc32b %bl, crc_init_dw # compute crc32 of 1-byte shr $8, tmp # get next byte - dec bufp + dec %bufp jne align_loop proc_block: @@ -169,10 +169,10 @@ continue_block: xor crc2, crc2 ## branch into array - lea jump_table(%rip), bufp - movzxw (bufp, %rax, 2), len - lea crc_array(%rip), bufp - lea (bufp, len, 1), bufp + lea jump_table(%rip), %bufp + movzxw (%bufp, %rax, 2), len + lea crc_array(%rip), %bufp + lea (%bufp, len, 1), %bufp JMP_NOSPEC bufp ################################################################ @@ -218,9 +218,9 @@ LABEL crc_ %i ## 4) Combine three results: ################################################################ - lea (K_table-8)(%rip), bufp # first entry is for idx 1 + lea (K_table-8)(%rip), %bufp # first entry is for idx 1 shlq $3, %rax # rax *= 8 - pmovzxdq (bufp,%rax), %xmm0 # 2 consts: K1:K2 + pmovzxdq (%bufp,%rax), %xmm0 # 2 consts: K1:K2 leal (%eax,%eax,2), %eax # rax *= 3 (total *24) subq %rax, tmp # tmp -= rax*24 diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index b67bae7091d7..7e7ffb7a5147 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -816,7 +816,7 @@ SYM_CODE_START(ret_from_fork) /* kernel thread */ 1: movl %edi, %eax - CALL_NOSPEC %ebx + CALL_NOSPEC ebx /* * A kernel thread is allowed to return here after successfully * calling do_execve(). Exit to userspace to complete the execve() @@ -1501,7 +1501,7 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exception_read_cr2) TRACE_IRQS_OFF movl %esp, %eax # pt_regs pointer - CALL_NOSPEC %edi + CALL_NOSPEC edi jmp ret_from_exception SYM_CODE_END(common_exception_read_cr2) @@ -1522,7 +1522,7 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exception) TRACE_IRQS_OFF movl %esp, %eax # pt_regs pointer - CALL_NOSPEC %edi + CALL_NOSPEC edi jmp ret_from_exception SYM_CODE_END(common_exception) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 0e9504fabe52..168b798913bc 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -349,7 +349,7 @@ SYM_CODE_START(ret_from_fork) /* kernel thread */ UNWIND_HINT_EMPTY movq %r12, %rdi - CALL_NOSPEC %rbx + CALL_NOSPEC rbx /* * A kernel thread is allowed to return here after successfully * calling do_execve(). Exit to userspace to complete the execve() diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h index ce92c4acc913..a75195f159cc 100644 --- a/arch/x86/include/asm/asm-prototypes.h +++ b/arch/x86/include/asm/asm-prototypes.h @@ -18,9 +18,13 @@ extern void cmpxchg8b_emu(void); #ifdef CONFIG_RETPOLINE #ifdef CONFIG_X86_32 -#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void); +#define INDIRECT_THUNK(reg) \ + extern asmlinkage void __x86_retpoline_e ## reg(void); \ + extern asmlinkage void __x86_indirect_thunk_e ## reg(void); #else -#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void); +#define INDIRECT_THUNK(reg) \ + extern asmlinkage void __x86_retpoline_r ## reg(void); \ + extern asmlinkage void __x86_indirect_thunk_r ## reg(void); INDIRECT_THUNK(8) INDIRECT_THUNK(9) INDIRECT_THUNK(10) diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 07e95dcb40ad..a180b0fe2fed 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -76,34 +76,6 @@ .popsection .endm -/* - * These are the bare retpoline primitives for indirect jmp and call. - * Do not use these directly; they only exist to make the ALTERNATIVE - * invocation below less ugly. - */ -.macro RETPOLINE_JMP reg:req - call .Ldo_rop_\@ -.Lspec_trap_\@: - pause - lfence - jmp .Lspec_trap_\@ -.Ldo_rop_\@: - mov \reg, (%_ASM_SP) - ret -.endm - -/* - * This is a wrapper around RETPOLINE_JMP so the called function in reg - * returns to the instruction after the macro. - */ -.macro RETPOLINE_CALL reg:req - jmp .Ldo_call_\@ -.Ldo_retpoline_jmp_\@: - RETPOLINE_JMP \reg -.Ldo_call_\@: - call .Ldo_retpoline_jmp_\@ -.endm - /* * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple * indirect jmp/call which may be susceptible to the Spectre variant 2 @@ -111,23 +83,21 @@ */ .macro JMP_NOSPEC reg:req #ifdef CONFIG_RETPOLINE - ANNOTATE_NOSPEC_ALTERNATIVE - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ - __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD + ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD + ALTERNATIVE __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \ + __stringify(jmp __x86_retpoline_\reg), X86_FEATURE_RETPOLINE #else - jmp *\reg + jmp *%\reg #endif .endm .macro CALL_NOSPEC reg:req #ifdef CONFIG_RETPOLINE - ANNOTATE_NOSPEC_ALTERNATIVE - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \ - __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\ - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD + ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD + ALTERNATIVE __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \ + __stringify(call __x86_retpoline_\reg), X86_FEATURE_RETPOLINE #else - call *\reg + call *%\reg #endif .endm diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S index e8a9f8370112..e405fe1a8bf4 100644 --- a/arch/x86/kernel/ftrace_32.S +++ b/arch/x86/kernel/ftrace_32.S @@ -189,5 +189,5 @@ return_to_handler: movl %eax, %ecx popl %edx popl %eax - JMP_NOSPEC %ecx + JMP_NOSPEC ecx #endif diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S index 369e61faacfe..2f2b5702e6cf 100644 --- a/arch/x86/kernel/ftrace_64.S +++ b/arch/x86/kernel/ftrace_64.S @@ -303,7 +303,7 @@ trace: * function tracing is enabled. */ movq ftrace_trace_function, %r8 - CALL_NOSPEC %r8 + CALL_NOSPEC r8 restore_mcount_regs jmp fgraph_trace @@ -340,6 +340,6 @@ SYM_CODE_START(return_to_handler) movq 8(%rsp), %rdx movq (%rsp), %rax addq $24, %rsp - JMP_NOSPEC %rdi + JMP_NOSPEC rdi SYM_CODE_END(return_to_handler) #endif diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S index 4742e8fa7ee7..d1d768912368 100644 --- a/arch/x86/lib/checksum_32.S +++ b/arch/x86/lib/checksum_32.S @@ -153,7 +153,7 @@ SYM_FUNC_START(csum_partial) negl %ebx lea 45f(%ebx,%ebx,2), %ebx testl %esi, %esi - JMP_NOSPEC %ebx + JMP_NOSPEC ebx # Handle 2-byte-aligned regions 20: addw (%esi), %ax @@ -436,7 +436,7 @@ SYM_FUNC_START(csum_partial_copy_generic) andl $-32,%edx lea 3f(%ebx,%ebx), %ebx testl %esi, %esi - JMP_NOSPEC %ebx + JMP_NOSPEC ebx 1: addl $64,%esi addl $64,%edi SRC(movb -32(%edx),%bl) ; SRC(movb (%edx),%bl) diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S index 363ec132df7e..cc44702d0492 100644 --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -7,15 +7,35 @@ #include <asm/alternative-asm.h> #include <asm/export.h> #include <asm/nospec-branch.h> +#include <asm/unwind_hints.h> + +/* + * This is the bare retpoline primitive. + */ +.macro RETPOLINE_JMP reg:req + call .Ldo_rop_\@ +.Lspec_trap_\@: + pause + lfence + jmp .Lspec_trap_\@ +.Ldo_rop_\@: + mov \reg, (%_ASM_SP) + ret +.endm .macro THUNK reg .section .text.__x86.indirect_thunk + .align 32 SYM_FUNC_START(__x86_indirect_thunk_\reg) - CFI_STARTPROC - JMP_NOSPEC %\reg - CFI_ENDPROC + JMP_NOSPEC \reg SYM_FUNC_END(__x86_indirect_thunk_\reg) + +SYM_CODE_START_NOALIGN(__x86_retpoline_\reg) + UNWIND_HINT_EMPTY + RETPOLINE_JMP %\reg +SYM_CODE_END(__x86_retpoline_\reg) + .endm /* @@ -26,7 +46,9 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg) * the simple and nasty way... */ #define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym) -#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg) +#define EXPORT_THUNK(reg) \ + __EXPORT_THUNK(__x86_retpoline_ ## reg); \ + __EXPORT_THUNK(__x86_indirect_thunk_ ## reg) #define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg) GENERATE_THUNK(_ASM_AX) diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S index 15da118f04f0..90380a17ab23 100644 --- a/arch/x86/platform/efi/efi_stub_64.S +++ b/arch/x86/platform/efi/efi_stub_64.S @@ -21,7 +21,7 @@ SYM_FUNC_START(__efi_call) mov %r8, %r9 mov %rcx, %r8 mov %rsi, %rcx - CALL_NOSPEC %rdi + CALL_NOSPEC rdi leave ret SYM_FUNC_END(__efi_call) ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-09 10:34 ` Peter Zijlstra @ 2020-04-09 10:40 ` Peter Zijlstra 0 siblings, 0 replies; 40+ messages in thread From: Peter Zijlstra @ 2020-04-09 10:40 UTC (permalink / raw) To: Alexandre Chartre; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx On Thu, Apr 09, 2020 at 12:34:24PM +0200, Peter Zijlstra wrote: > @@ -111,23 +83,21 @@ > */ > .macro JMP_NOSPEC reg:req > #ifdef CONFIG_RETPOLINE > - ANNOTATE_NOSPEC_ALTERNATIVE > - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ > - __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ > - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD > + ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD > + ALTERNATIVE __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \ > + __stringify(jmp __x86_retpoline_\reg), X86_FEATURE_RETPOLINE Still, I am being an idiot, only the call (below) needs this, this can stay what it was: + ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \ + __stringify(jmp __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE, \ + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD > #else > - jmp *\reg > + jmp *%\reg > #endif > .endm > > .macro CALL_NOSPEC reg:req > #ifdef CONFIG_RETPOLINE > - ANNOTATE_NOSPEC_ALTERNATIVE > - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \ > - __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\ > - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD And then this needs a comment on why it is not ALTERNATIVE_2 like above. > + ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD > + ALTERNATIVE __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \ > + __stringify(call __x86_retpoline_\reg), X86_FEATURE_RETPOLINE > #else > - call *\reg > + call *%\reg > #endif > .endm ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 16:18 ` Alexandre Chartre 2020-04-07 16:28 ` Josh Poimboeuf @ 2020-04-07 16:41 ` Peter Zijlstra 2020-04-07 17:04 ` Alexandre Chartre 1 sibling, 1 reply; 40+ messages in thread From: Peter Zijlstra @ 2020-04-07 16:41 UTC (permalink / raw) To: Alexandre Chartre; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx On Tue, Apr 07, 2020 at 06:18:51PM +0200, Alexandre Chartre wrote: > > On 4/7/20 4:32 PM, Alexandre Chartre wrote: > > > > On 4/7/20 3:34 PM, Josh Poimboeuf wrote: > > > On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote: > > > > Josh, we should probably have objtool verify it doesn't emit ORC entries > > > > in alternative ranges. > > > > > > Agreed, it might be as simple as checking for insn->alt_group in the > > > INSN_STACK check or in update_insn_state(). > > > > > > > We could do that only for the "objtool orc generate" command. That way > > "objtool check" would still check the alternative, but "objtool orc generate" > > will just use the first half of the alternative (like it does today with > > ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE > > but only use them for "objtool orc generate". > > > > I have checked and objtool doesn't emit ORC entries for alternative: > decode_instructions() doesn't mark such section with sec->text = true > so create_orc_sections() doesn't emit corresponding ORC entries. > > So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives, > this will allow objtool to check the instructions but it still won't > emit ORC entries (same behavior as today). In the future, if ORC > eventually supports alternative we will be ready to have objtool emit > ORC entries. I mean, we should make it warn for the case where you remove ANNOTATE_NOSPEC and it would like to generate ORC. Also, what's the point of having objtool grok this code and then not doing anything with it? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 16:41 ` Peter Zijlstra @ 2020-04-07 17:04 ` Alexandre Chartre 0 siblings, 0 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 17:04 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx On 4/7/20 6:41 PM, Peter Zijlstra wrote: > On Tue, Apr 07, 2020 at 06:18:51PM +0200, Alexandre Chartre wrote: >> >> On 4/7/20 4:32 PM, Alexandre Chartre wrote: >>> >>> On 4/7/20 3:34 PM, Josh Poimboeuf wrote: >>>> On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote: >>>>> Josh, we should probably have objtool verify it doesn't emit ORC entries >>>>> in alternative ranges. >>>> >>>> Agreed, it might be as simple as checking for insn->alt_group in the >>>> INSN_STACK check or in update_insn_state(). >>>> >>> >>> We could do that only for the "objtool orc generate" command. That way >>> "objtool check" would still check the alternative, but "objtool orc generate" >>> will just use the first half of the alternative (like it does today with >>> ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE >>> but only use them for "objtool orc generate". >>> >> >> I have checked and objtool doesn't emit ORC entries for alternative: >> decode_instructions() doesn't mark such section with sec->text = true >> so create_orc_sections() doesn't emit corresponding ORC entries. >> >> So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives, >> this will allow objtool to check the instructions but it still won't >> emit ORC entries (same behavior as today). In the future, if ORC >> eventually supports alternative we will be ready to have objtool emit >> ORC entries. > > I mean, we should make it warn for the case where you remove > ANNOTATE_NOSPEC and it would like to generate ORC. Okay, so check if an alternative is changing the stack and warn if it is. alex. > Also, what's the point of having objtool grok this code and then not > doing anything with it? > ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre 2020-04-07 13:28 ` Peter Zijlstra @ 2020-04-07 13:52 ` Peter Zijlstra 2020-04-07 13:59 ` Peter Zijlstra 1 sibling, 1 reply; 40+ messages in thread From: Peter Zijlstra @ 2020-04-07 13:52 UTC (permalink / raw) To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx On Tue, Apr 07, 2020 at 09:31:42AM +0200, Alexandre Chartre wrote: > - ANNOTATE_NOSPEC_ALTERNATIVE > ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ > __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ > __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD Possibly we can write this like: ALTERNATIVE("", "lfence", X86_FEATURE_RETPOLINE_AMD); ALTERNATIVE("jmp *\reg", "jmp __x86_retpoline_jmp_\reg", X86_FEATURE_RETPOLINE); With an out-of-line copy of the retpoline, just like the THUNKs the compiler uses, except of course, it can't be those, because we actually want to use the alternative to implement those. By moving the retpoline magic out-of-line we ensure it has a unique address and the ORC stuff should work. I'm just not sure what to do about the RETPOLINE_CALL variant. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives 2020-04-07 13:52 ` Peter Zijlstra @ 2020-04-07 13:59 ` Peter Zijlstra 0 siblings, 0 replies; 40+ messages in thread From: Peter Zijlstra @ 2020-04-07 13:59 UTC (permalink / raw) To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx On Tue, Apr 07, 2020 at 03:52:11PM +0200, Peter Zijlstra wrote: > On Tue, Apr 07, 2020 at 09:31:42AM +0200, Alexandre Chartre wrote: > > > - ANNOTATE_NOSPEC_ALTERNATIVE > > ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ > > __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ > > __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD > > Possibly we can write this like: .macro OOL_RETPOLINE_JMP reg:req SYM_FUNC_START(__x86_retpoline_jmp_\reg) CFI_STARTPROC RETPOLINE_JMP \reg CFI_ENDPROC SYM_FUNC_END(__x86_retpoline_jmp_\reg) .endm > ALTERNATIVE("", "lfence", X86_FEATURE_RETPOLINE_AMD); > ALTERNATIVE("jmp *\reg", "jmp __x86_retpoline_jmp_\reg", X86_FEATURE_RETPOLINE); > > With an out-of-line copy of the retpoline, just like the THUNKs the > compiler uses, except of course, it can't be those, because we actually > want to use the alternative to implement those. > > By moving the retpoline magic out-of-line we ensure it has a unique > address and the ORC stuff should work. > > I'm just not sure what to do about the RETPOLINE_CALL variant. Duh, something like so: ALTERNATIVE("", "lfence", X86_FEATURE_RETPOLINE_AMD); ALTERNATIVE("call *\reg", "call __x86_retpoline_jmp_\reg", X86_FEATURE_RETPOLINE); ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre ` (8 preceding siblings ...) 2020-04-07 7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre @ 2020-04-07 13:35 ` Josh Poimboeuf 2020-04-07 14:02 ` Alexandre Chartre 9 siblings, 1 reply; 40+ messages in thread From: Josh Poimboeuf @ 2020-04-07 13:35 UTC (permalink / raw) To: Alexandre Chartre; +Cc: x86, linux-kernel, peterz, jthierry, tglx On Tue, Apr 07, 2020 at 09:31:33AM +0200, Alexandre Chartre wrote: > Hi, > > This is version v2 of this patchset based on the different comments > received so far. It now uses and includes PeterZ patch to add > UNWIND_HINT_RET_OFFSET. Other changes are described below. > > Code like retpoline or RSB stuffing, which is used to mitigate some of > the speculative execution issues, is currently ignored by objtool with > the ANNOTATE_NOSPEC_ALTERNATIVE directive. This series adds support > for intra-function calls to objtool so that it can handle such a code. > With these changes, we can remove all ANNOTATE_NOSPEC_ALTERNATIVE > directives. > > Changes: > - replace RETPOLINE_RET with PeterZ UNWIND_HINT_RET_OFFSET > - make objtool intra-function call action architecture dependent > - objtool now automatically detects and validates all intra-function > calls but it issues a warning if the call was not explicitly tagged > - change __FILL_RETURN_BUFFER to work with objtool > - add generic ANNOTATE_INTRA_FUNCTION_CALL macro > - remove all ANNOTATE_SPEC_ALTERNATIVE (even for __FILL_RETURN_BUFFER) I had trouble applying the patches. What branch are they based on? In general the latest tip/master is good. -- Josh ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE 2020-04-07 13:35 ` [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Josh Poimboeuf @ 2020-04-07 14:02 ` Alexandre Chartre 0 siblings, 0 replies; 40+ messages in thread From: Alexandre Chartre @ 2020-04-07 14:02 UTC (permalink / raw) To: Josh Poimboeuf; +Cc: x86, linux-kernel, peterz, jthierry, tglx On 4/7/20 3:35 PM, Josh Poimboeuf wrote: > On Tue, Apr 07, 2020 at 09:31:33AM +0200, Alexandre Chartre wrote: >> Hi, >> >> This is version v2 of this patchset based on the different comments >> received so far. It now uses and includes PeterZ patch to add >> UNWIND_HINT_RET_OFFSET. Other changes are described below. >> >> Code like retpoline or RSB stuffing, which is used to mitigate some of >> the speculative execution issues, is currently ignored by objtool with >> the ANNOTATE_NOSPEC_ALTERNATIVE directive. This series adds support >> for intra-function calls to objtool so that it can handle such a code. >> With these changes, we can remove all ANNOTATE_NOSPEC_ALTERNATIVE >> directives. >> >> Changes: >> - replace RETPOLINE_RET with PeterZ UNWIND_HINT_RET_OFFSET >> - make objtool intra-function call action architecture dependent >> - objtool now automatically detects and validates all intra-function >> calls but it issues a warning if the call was not explicitly tagged >> - change __FILL_RETURN_BUFFER to work with objtool >> - add generic ANNOTATE_INTRA_FUNCTION_CALL macro >> - remove all ANNOTATE_SPEC_ALTERNATIVE (even for __FILL_RETURN_BUFFER) > > I had trouble applying the patches. What branch are they based on? In > general the latest tip/master is good. Oups, this is on based on 5.5. I didn't realize I was that late, I though I recently rebased but it looks like I didn't. Sorry about that, I will make sure the next version is more recent. alex. ^ permalink raw reply [flat|nested] 40+ messages in thread
end of thread, other threads:[~2020-05-01 18:23 UTC | newest] Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-04-07 7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre 2020-04-07 12:53 ` Peter Zijlstra 2020-04-07 13:17 ` Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers Alexandre Chartre 2020-05-01 18:22 ` [tip: objtool/core] " tip-bot2 for Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 3/9] objtool: is_fentry_call() crashes if call has no destination Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 4/9] objtool: Allow branches within the same alternative Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 5/9] objtool: Add support for intra-function calls Alexandre Chartre 2020-04-07 13:07 ` Peter Zijlstra 2020-04-07 13:28 ` Alexandre Chartre 2020-04-08 14:06 ` Alexandre Chartre 2020-04-08 14:19 ` Julien Thierry 2020-04-08 16:03 ` Alexandre Chartre 2020-04-08 16:04 ` Julien Thierry 2020-04-08 17:06 ` Alexandre Chartre 2020-04-08 17:07 ` Julien Thierry 2020-04-07 7:31 ` [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool Alexandre Chartre 2020-04-07 13:27 ` Josh Poimboeuf 2020-04-07 7:31 ` [PATCH V2 7/9] x86/speculation: Annotate intra-function calls Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 8/9] x86/speculation: Add unwind hint to trampoline return Alexandre Chartre 2020-04-07 7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre 2020-04-07 13:28 ` Peter Zijlstra 2020-04-07 13:34 ` Josh Poimboeuf 2020-04-07 14:32 ` Alexandre Chartre 2020-04-07 16:18 ` Alexandre Chartre 2020-04-07 16:28 ` Josh Poimboeuf 2020-04-07 17:01 ` Alexandre Chartre 2020-04-07 17:26 ` Peter Zijlstra 2020-04-07 17:27 ` Peter Zijlstra 2020-04-08 21:35 ` Peter Zijlstra 2020-04-09 8:18 ` Alexandre Chartre 2020-04-09 10:34 ` Peter Zijlstra 2020-04-09 10:40 ` Peter Zijlstra 2020-04-07 16:41 ` Peter Zijlstra 2020-04-07 17:04 ` Alexandre Chartre 2020-04-07 13:52 ` Peter Zijlstra 2020-04-07 13:59 ` Peter Zijlstra 2020-04-07 13:35 ` [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Josh Poimboeuf 2020-04-07 14:02 ` Alexandre Chartre
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).