linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE
@ 2020-04-07  7:31 Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre
                   ` (9 more replies)
  0 siblings, 10 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

Hi,

This is version v2 of this patchset based on the different comments
received so far. It now uses and includes PeterZ patch to add
UNWIND_HINT_RET_OFFSET. Other changes are described below.

Code like retpoline or RSB stuffing, which is used to mitigate some of
the speculative execution issues, is currently ignored by objtool with
the ANNOTATE_NOSPEC_ALTERNATIVE directive. This series adds support
for intra-function calls to objtool so that it can handle such a code.
With these changes, we can remove all ANNOTATE_NOSPEC_ALTERNATIVE
directives.

Changes:
 - replace RETPOLINE_RET with PeterZ UNWIND_HINT_RET_OFFSET
 - make objtool intra-function call action architecture dependent
 - objtool now automatically detects and validates all intra-function
   calls but it issues a warning if the call was not explicitly tagged
 - change __FILL_RETURN_BUFFER to work with objtool
 - add generic ANNOTATE_INTRA_FUNCTION_CALL macro
 - remove all ANNOTATE_SPEC_ALTERNATIVE (even for __FILL_RETURN_BUFFER)

Thanks,

alex.

-----

Alexandre Chartre (8):
  objtool: UNWIND_HINT_RET_OFFSET should not check registers
  objtool: is_fentry_call() crashes if call has no destination
  objtool: Allow branches within the same alternative.
  objtool: Add support for intra-function calls
  x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool
  x86/speculation: Annotate intra-function calls
  x86/speculation: Add unwind hint to trampoline return
  x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives

Peter Zijlstra (Intel) (1):
  objtool: Introduce HINT_RET_OFFSET

 arch/x86/include/asm/nospec-branch.h          |  32 ++--
 arch/x86/include/asm/orc_types.h              |   1 +
 arch/x86/include/asm/unwind_hints.h           |  10 ++
 include/linux/frame.h                         |  11 ++
 tools/arch/x86/include/asm/orc_types.h        |   1 +
 .../Documentation/stack-validation.txt        |   8 +
 tools/objtool/arch.h                          |   2 +
 tools/objtool/arch/x86/decode.c               |  12 ++
 tools/objtool/check.c                         | 152 ++++++++++++++----
 tools/objtool/check.h                         |   6 +-
 10 files changed, 192 insertions(+), 43 deletions(-)

-- 
2.18.2


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-04-07 12:53   ` Peter Zijlstra
  2020-04-07  7:31 ` [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers Alexandre Chartre
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

From: "Peter Zijlstra (Intel)" <peterz@infradead.org>

Normally objtool ensures a function keeps the stack layout invariant.
But there is a useful exception, it is possible to stuff the return
stack in order to 'inject' a 'call':

        push $fun
        ret

In this case the invariant mentioned above is violated.

Add an objtool HINT to annotate this and allow a function exit with a
modified stack frame.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/orc_types.h       |  1 +
 arch/x86/include/asm/unwind_hints.h    | 10 ++++++++++
 tools/arch/x86/include/asm/orc_types.h |  1 +
 tools/objtool/check.c                  | 26 ++++++++++++++++++--------
 tools/objtool/check.h                  |  5 ++++-
 5 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h
index 6e060907c163..5f18ca7ac51a 100644
--- a/arch/x86/include/asm/orc_types.h
+++ b/arch/x86/include/asm/orc_types.h
@@ -60,6 +60,7 @@
 #define ORC_TYPE_REGS_IRET		2
 #define UNWIND_HINT_TYPE_SAVE		3
 #define UNWIND_HINT_TYPE_RESTORE	4
+#define UNWIND_HINT_TYPE_RET_OFFSET	5
 
 #ifndef __ASSEMBLY__
 /*
diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index f5e2eb12cb71..aabf7ace0476 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -94,6 +94,16 @@
 	UNWIND_HINT type=UNWIND_HINT_TYPE_RESTORE
 .endm
 
+
+/*
+ * RET_OFFSET: Used on instructions that terminate a function; mostly RETURN
+ * and sibling calls. On these, sp_offset denotes the expected offset from
+ * initial_func_cfi.
+ */
+.macro UNWIND_HINT_RET_OFFSET sp_offset=8
+	UNWIND_HINT type=UNWIND_HINT_TYPE_RET_OFFSET sp_offset=\sp_offset
+.endm
+
 #else /* !__ASSEMBLY__ */
 
 #define UNWIND_HINT(sp_reg, sp_offset, type, end)		\
diff --git a/tools/arch/x86/include/asm/orc_types.h b/tools/arch/x86/include/asm/orc_types.h
index 6e060907c163..5f18ca7ac51a 100644
--- a/tools/arch/x86/include/asm/orc_types.h
+++ b/tools/arch/x86/include/asm/orc_types.h
@@ -60,6 +60,7 @@
 #define ORC_TYPE_REGS_IRET		2
 #define UNWIND_HINT_TYPE_SAVE		3
 #define UNWIND_HINT_TYPE_RESTORE	4
+#define UNWIND_HINT_TYPE_RET_OFFSET	5
 
 #ifndef __ASSEMBLY__
 /*
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 4768d91c6d68..bbee26de92ec 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1209,6 +1209,10 @@ static int read_unwind_hints(struct objtool_file *file)
 			insn->restore = true;
 			insn->hint = true;
 			continue;
+
+		} else if (hint->type == UNWIND_HINT_TYPE_RET_OFFSET) {
+			insn->ret_offset = hint->sp_offset;
+			continue;
 		}
 
 		insn->hint = true;
@@ -1371,20 +1375,26 @@ static bool is_fentry_call(struct instruction *insn)
 	return false;
 }
 
-static bool has_modified_stack_frame(struct insn_state *state)
+static bool has_modified_stack_frame(struct instruction *insn,
+				     struct insn_state *state)
 {
+	u8 ret_offset = insn->ret_offset;
 	int i;
 
-	if (state->cfa.base != initial_func_cfi.cfa.base ||
-	    state->cfa.offset != initial_func_cfi.cfa.offset ||
-	    state->stack_size != initial_func_cfi.cfa.offset ||
-	    state->drap)
+	if (state->cfa.base != initial_func_cfi.cfa.base || state->drap)
+		return true;
+
+	if (state->cfa.offset != initial_func_cfi.cfa.offset + ret_offset)
 		return true;
 
-	for (i = 0; i < CFI_NUM_REGS; i++)
+	if (state->stack_size != initial_func_cfi.cfa.offset + ret_offset)
+		return true;
+
+	for (i = 0; i < CFI_NUM_REGS; i++) {
 		if (state->regs[i].base != initial_func_cfi.regs[i].base ||
 		    state->regs[i].offset != initial_func_cfi.regs[i].offset)
 			return true;
+	}
 
 	return false;
 }
@@ -1926,7 +1936,7 @@ static int validate_call(struct instruction *insn, struct insn_state *state)
 
 static int validate_sibling_call(struct instruction *insn, struct insn_state *state)
 {
-	if (has_modified_stack_frame(state)) {
+	if (has_modified_stack_frame(insn, state)) {
 		WARN_FUNC("sibling call from callable instruction with modified stack frame",
 				insn->sec, insn->offset);
 		return 1;
@@ -2065,7 +2075,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 				return 1;
 			}
 
-			if (func && has_modified_stack_frame(&state)) {
+			if (func && has_modified_stack_frame(insn, &state)) {
 				WARN_FUNC("return with modified stack frame",
 					  sec, insn->offset);
 				return 1;
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 6d875ca6fce0..7a91497fee7e 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -33,9 +33,12 @@ struct instruction {
 	unsigned int len;
 	enum insn_type type;
 	unsigned long immediate;
-	bool alt_group, dead_end, ignore, hint, save, restore, ignore_alts;
+	unsigned int alt_group;
+	bool dead_end, ignore, ignore_alts;
+	bool hint, save, restore;
 	bool retpoline_safe;
 	u8 visited;
+	u8 ret_offset;
 	struct symbol *call_dest;
 	struct instruction *jump_dest;
 	struct instruction *first_jump_src;
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-05-01 18:22   ` [tip: objtool/core] " tip-bot2 for Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 3/9] objtool: is_fentry_call() crashes if call has no destination Alexandre Chartre
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

UNWIND_HINT_RET_OFFSET will adjust a modified stack. However if a
callee-saved register was pushed on the stack then the stack frame
will still appear modified. So stop checking registers when
UNWIND_HINT_RET_OFFSET is used.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
---
 tools/objtool/check.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index bbee26de92ec..c7fcaddfaa8a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1390,6 +1390,14 @@ static bool has_modified_stack_frame(struct instruction *insn,
 	if (state->stack_size != initial_func_cfi.cfa.offset + ret_offset)
 		return true;
 
+	/*
+	 * If there is a ret offset hint then don't check registers
+	 * because a callee-saved register might have been pushed on
+	 * the stack.
+	 */
+	if (ret_offset)
+		return false;
+
 	for (i = 0; i < CFI_NUM_REGS; i++) {
 		if (state->regs[i].base != initial_func_cfi.regs[i].base ||
 		    state->regs[i].offset != initial_func_cfi.regs[i].offset)
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V2 3/9] objtool: is_fentry_call() crashes if call has no destination
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 4/9] objtool: Allow branches within the same alternative Alexandre Chartre
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

Fix is_fentry_call() so that it works if a call has no destination
set (call_dest). This needs to be done in order to support intra-
function calls.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
---
 tools/objtool/check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index c7fcaddfaa8a..c0da033a40c5 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1367,7 +1367,7 @@ static int decode_sections(struct objtool_file *file)
 
 static bool is_fentry_call(struct instruction *insn)
 {
-	if (insn->type == INSN_CALL &&
+	if (insn->type == INSN_CALL && insn->call_dest &&
 	    insn->call_dest->type == STT_NOTYPE &&
 	    !strcmp(insn->call_dest->name, "__fentry__"))
 		return true;
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V2 4/9] objtool: Allow branches within the same alternative.
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
                   ` (2 preceding siblings ...)
  2020-04-07  7:31 ` [PATCH V2 3/9] objtool: is_fentry_call() crashes if call has no destination Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 5/9] objtool: Add support for intra-function calls Alexandre Chartre
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

Currently objtool prevents any branch to an alternative. While preventing
branching from the outside to the middle of an alternative makes perfect
sense, branching within the same alternative should be allowed. To do so,
identify each alternative and check that a branch to an alternative comes
from the same alternative.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
---
 tools/objtool/check.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index c0da033a40c5..ab06e9bdd396 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -712,7 +712,9 @@ static int handle_group_alt(struct objtool_file *file,
 			    struct instruction *orig_insn,
 			    struct instruction **new_insn)
 {
+	static unsigned int alt_group_next_index = 1;
 	struct instruction *last_orig_insn, *last_new_insn, *insn, *fake_jump = NULL;
+	unsigned int alt_group = alt_group_next_index++;
 	unsigned long dest_off;
 
 	last_orig_insn = NULL;
@@ -721,7 +723,7 @@ static int handle_group_alt(struct objtool_file *file,
 		if (insn->offset >= special_alt->orig_off + special_alt->orig_len)
 			break;
 
-		insn->alt_group = true;
+		insn->alt_group = alt_group;
 		last_orig_insn = insn;
 	}
 
@@ -1960,6 +1962,7 @@ static int validate_sibling_call(struct instruction *insn, struct insn_state *st
  * tools/objtool/Documentation/stack-validation.txt.
  */
 static int validate_branch(struct objtool_file *file, struct symbol *func,
+			   struct instruction *from,
 			   struct instruction *first, struct insn_state state)
 {
 	struct alternative *alt;
@@ -1971,7 +1974,9 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 	insn = first;
 	sec = insn->sec;
 
-	if (insn->alt_group && list_empty(&insn->alts)) {
+	if (insn->alt_group &&
+	    (!from || from->alt_group != insn->alt_group) &&
+	    list_empty(&insn->alts)) {
 		WARN_FUNC("don't know how to handle branch to middle of alternative instruction group",
 			  sec, insn->offset);
 		return 1;
@@ -2053,7 +2058,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 				if (alt->skip_orig)
 					skip_orig = true;
 
-				ret = validate_branch(file, func, alt->insn, state);
+				ret = validate_branch(file, func,
+						      NULL, alt->insn, state);
 				if (ret) {
 					if (backtrace)
 						BT_FUNC("(alt)", insn);
@@ -2123,7 +2129,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 					return ret;
 
 			} else if (insn->jump_dest) {
-				ret = validate_branch(file, func,
+				ret = validate_branch(file, func, insn,
 						      insn->jump_dest, state);
 				if (ret) {
 					if (backtrace)
@@ -2254,7 +2260,8 @@ static int validate_unwind_hints(struct objtool_file *file)
 
 	for_each_insn(file, insn) {
 		if (insn->hint && !insn->visited) {
-			ret = validate_branch(file, insn->func, insn, state);
+			ret = validate_branch(file, insn->func,
+					      NULL, insn, state);
 			if (ret && backtrace)
 				BT_FUNC("<=== (hint)", insn);
 			warnings += ret;
@@ -2395,7 +2402,7 @@ static int validate_functions(struct objtool_file *file)
 
 			state.uaccess = func->uaccess_safe;
 
-			ret = validate_branch(file, func, insn, state);
+			ret = validate_branch(file, func, NULL, insn, state);
 			if (ret && backtrace)
 				BT_FUNC("<=== (func)", insn);
 			warnings += ret;
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
                   ` (3 preceding siblings ...)
  2020-04-07  7:31 ` [PATCH V2 4/9] objtool: Allow branches within the same alternative Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-04-07 13:07   ` Peter Zijlstra
  2020-04-07  7:31 ` [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool Alexandre Chartre
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

Change objtool to support intra-function calls. On x86, an intra-function
call is represented in objtool as a push onto the stack (of the return
address), and a jump to the destination address. That way the stack
information is correctly updated and the call flow is still accurate.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
---
 include/linux/frame.h                         | 11 +++
 .../Documentation/stack-validation.txt        |  8 ++
 tools/objtool/arch.h                          |  2 +
 tools/objtool/arch/x86/decode.c               | 12 +++
 tools/objtool/check.c                         | 97 ++++++++++++++++---
 tools/objtool/check.h                         |  1 +
 6 files changed, 117 insertions(+), 14 deletions(-)

diff --git a/include/linux/frame.h b/include/linux/frame.h
index 02d3ca2d9598..c7178e6c9d48 100644
--- a/include/linux/frame.h
+++ b/include/linux/frame.h
@@ -15,9 +15,20 @@
 	static void __used __section(.discard.func_stack_frame_non_standard) \
 		*__func_stack_frame_non_standard_##func = func
 
+/*
+ * This macro indicates that the following intra-function call is valid.
+ * Any non-annotated intra-function call will cause objtool to issue warning.
+ */
+#define ANNOTATE_INTRA_FUNCTION_CALL				\
+	999:							\
+	.pushsection .discard.intra_function_call;		\
+	.long 999b;						\
+	.popsection;
+
 #else /* !CONFIG_STACK_VALIDATION */
 
 #define STACK_FRAME_NON_STANDARD(func)
+#define ANNOTATE_INTRA_FUNCTION_CALL
 
 #endif /* CONFIG_STACK_VALIDATION */
 
diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt
index de094670050b..09f863fd32d2 100644
--- a/tools/objtool/Documentation/stack-validation.txt
+++ b/tools/objtool/Documentation/stack-validation.txt
@@ -290,6 +290,14 @@ they mean, and suggestions for how to fix them.
       https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
 
 
+9. file.o: warning: unsupported intra-function call
+
+   This warning means that a direct call is done to a destination which
+   is not at the beginning of a function. If this is a legit call, you
+   can remove this warning by putting the ANNOTATE_INTRA_FUNCTION_CALL
+   directive right before the call.
+
+
 If the error doesn't seem to make sense, it could be a bug in objtool.
 Feel free to ask the objtool maintainer for help.
 
diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index ced3765c4f44..4d42c12d9957 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -75,4 +75,6 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
 
 bool arch_callee_saved_reg(unsigned char reg);
 
+void arch_configure_intra_function_call(struct stack_op *op);
+
 #endif /* _ARCH_H */
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index a62e032863a8..7ee1561bf7ad 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state)
 	state->regs[16].base = CFI_CFA;
 	state->regs[16].offset = -8;
 }
+
+
+void arch_configure_intra_function_call(struct stack_op *op)
+{
+	/*
+	 * For the impact on the stack, make an intra-function
+	 * call behaves like a push of an immediate value (the
+	 * return address).
+	 */
+	op->src.type = OP_SRC_CONST;
+	op->dest.type = OP_DEST_PUSH;
+}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index ab06e9bdd396..bf1e4e94d686 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -644,34 +644,50 @@ static int add_jump_destinations(struct objtool_file *file)
 	return 0;
 }
 
+static int configure_call(struct objtool_file *file, struct instruction *insn)
+{
+	unsigned long dest_off;
+
+	dest_off = insn->offset + insn->len + insn->immediate;
+	insn->call_dest = find_symbol_by_offset(insn->sec, dest_off);
+	if (insn->call_dest) {
+		/* regular call */
+		return 0;
+	}
+
+	/* intra-function call */
+	if (!insn->intra_function_call)
+		WARN_FUNC("intra-function call", insn->sec, insn->offset);
+
+	dest_off = insn->offset + insn->len + insn->immediate;
+	insn->jump_dest = find_insn(file, insn->sec, dest_off);
+	if (!insn->jump_dest) {
+		WARN_FUNC("can't find call dest at %s+0x%lx",
+			  insn->sec, insn->offset,
+			  insn->sec->name, dest_off);
+		return -1;
+	}
+
+	return 0;
+}
+
 /*
  * Find the destination instructions for all calls.
  */
 static int add_call_destinations(struct objtool_file *file)
 {
 	struct instruction *insn;
-	unsigned long dest_off;
 	struct rela *rela;
 
 	for_each_insn(file, insn) {
-		if (insn->type != INSN_CALL)
+		if (insn->type != INSN_CALL || insn->ignore)
 			continue;
 
 		rela = find_rela_by_dest_range(insn->sec, insn->offset,
 					       insn->len);
 		if (!rela) {
-			dest_off = insn->offset + insn->len + insn->immediate;
-			insn->call_dest = find_symbol_by_offset(insn->sec,
-								dest_off);
-
-			if (!insn->call_dest && !insn->ignore) {
-				WARN_FUNC("unsupported intra-function call",
-					  insn->sec, insn->offset);
-				if (retpoline)
-					WARN("If this is a retpoline, please patch it in with alternatives and annotate it with ANNOTATE_NOSPEC_ALTERNATIVE.");
+			if (configure_call(file, insn))
 				return -1;
-			}
-
 		} else if (rela->sym->type == STT_SECTION) {
 			insn->call_dest = find_symbol_by_offset(rela->sym->sec,
 								rela->addend+4);
@@ -1293,6 +1309,43 @@ static int read_retpoline_hints(struct objtool_file *file)
 	return 0;
 }
 
+static int read_intra_function_call(struct objtool_file *file)
+{
+	struct section *sec;
+	struct instruction *insn;
+	struct rela *rela;
+
+	sec = find_section_by_name(file->elf,
+				   ".rela.discard.intra_function_call");
+	if (!sec)
+		return 0;
+
+	list_for_each_entry(rela, &sec->rela_list, list) {
+		if (rela->sym->type != STT_SECTION) {
+			WARN("unexpected relocation symbol type in %s",
+			     sec->name);
+			return -1;
+		}
+
+		insn = find_insn(file, rela->sym->sec, rela->addend);
+		if (!insn) {
+			WARN("bad .discard.intra_function_call entry");
+			return -1;
+		}
+
+		if (insn->type != INSN_CALL) {
+			WARN_FUNC("intra_function_call not a direct call",
+				  insn->sec, insn->offset);
+			return -1;
+		}
+
+		arch_configure_intra_function_call(&insn->stack_op);
+		insn->intra_function_call = true;
+	}
+
+	return 0;
+}
+
 static void mark_rodata(struct objtool_file *file)
 {
 	struct section *sec;
@@ -1348,6 +1401,10 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	ret = read_intra_function_call(file);
+	if (ret)
+		return ret;
+
 	ret = add_call_destinations(file);
 	if (ret)
 		return ret;
@@ -2110,7 +2167,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 				return ret;
 
 			if (!no_fp && func && !is_fentry_call(insn) &&
-			    !has_valid_stack_frame(&state)) {
+			    !has_valid_stack_frame(&state) &&
+			    !insn->intra_function_call) {
 				WARN_FUNC("call without frame pointer save/setup",
 					  sec, insn->offset);
 				return 1;
@@ -2119,6 +2177,17 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 			if (dead_end_function(file, insn->call_dest))
 				return 0;
 
+			if (insn->intra_function_call) {
+				update_insn_state(insn, &state);
+				ret = validate_branch(file, func, insn,
+						      insn->jump_dest, state);
+				if (ret) {
+					if (backtrace)
+						BT_FUNC("(intra-call)", insn);
+					return ret;
+				}
+			}
+
 			break;
 
 		case INSN_JUMP_CONDITIONAL:
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 7a91497fee7e..7c7ec3d8fb00 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -36,6 +36,7 @@ struct instruction {
 	unsigned int alt_group;
 	bool dead_end, ignore, ignore_alts;
 	bool hint, save, restore;
+	bool intra_function_call;
 	bool retpoline_safe;
 	u8 visited;
 	u8 ret_offset;
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
                   ` (4 preceding siblings ...)
  2020-04-07  7:31 ` [PATCH V2 5/9] objtool: Add support for intra-function calls Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-04-07 13:27   ` Josh Poimboeuf
  2020-04-07  7:31 ` [PATCH V2 7/9] x86/speculation: Annotate intra-function calls Alexandre Chartre
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

Change __FILL_RETURN_BUFFER so that the stack state is deterministically
defined for each iteration and that objtool can have an accurate view
of the stack.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
---
 arch/x86/include/asm/nospec-branch.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 5c24a7b35166..9a946fd5e824 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -60,8 +60,8 @@
 	jmp	775b;				\
 774:						\
 	dec	reg;				\
-	jnz	771b;				\
-	add	$(BITS_PER_LONG/8) * nr, sp;
+	add	$(BITS_PER_LONG/8) * 2, sp;	\
+	jnz	771b;
 
 #ifdef __ASSEMBLY__
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V2 7/9] x86/speculation: Annotate intra-function calls
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
                   ` (5 preceding siblings ...)
  2020-04-07  7:31 ` [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 8/9] x86/speculation: Add unwind hint to trampoline return Alexandre Chartre
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

Some speculative execution mitigations (like retpoline) use intra-
function calls. Provide a macro to annotate such intra-function calls
so they can be properly handled by objtool, and use this macro to
annotate intra-function calls.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
---
 arch/x86/include/asm/nospec-branch.h | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 9a946fd5e824..a345c6fa0541 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -3,6 +3,7 @@
 #ifndef _ASM_X86_NOSPEC_BRANCH_H_
 #define _ASM_X86_NOSPEC_BRANCH_H_
 
+#include <linux/frame.h>
 #include <linux/static_key.h>
 
 #include <asm/alternative.h>
@@ -19,6 +20,15 @@
 #define ANNOTATE_NOSPEC_ALTERNATIVE \
 	ANNOTATE_IGNORE_ALTERNATIVE
 
+/*
+ * Intra-function call instruction. This should be used as a substitute
+ * for the call instruction when doing an intra-function call. It is
+ * similar to the call instruction but it tells objtool that this is
+ * an intra-function call.
+ */
+#define INTRA_FUNCTION_CALL \
+	ANNOTATE_INTRA_FUNCTION_CALL call
+
 /*
  * Fill the CPU return stack buffer.
  *
@@ -47,13 +57,13 @@
 #define __FILL_RETURN_BUFFER(reg, nr, sp)	\
 	mov	$(nr/2), reg;			\
 771:						\
-	call	772f;				\
+	INTRA_FUNCTION_CALL 772f;		\
 773:	/* speculation trap */			\
 	pause;					\
 	lfence;					\
 	jmp	773b;				\
 772:						\
-	call	774f;				\
+	INTRA_FUNCTION_CALL 774f;		\
 775:	/* speculation trap */			\
 	pause;					\
 	lfence;					\
@@ -83,7 +93,7 @@
  * invocation below less ugly.
  */
 .macro RETPOLINE_JMP reg:req
-	call	.Ldo_rop_\@
+	INTRA_FUNCTION_CALL .Ldo_rop_\@
 .Lspec_trap_\@:
 	pause
 	lfence
@@ -102,7 +112,7 @@
 .Ldo_retpoline_jmp_\@:
 	RETPOLINE_JMP \reg
 .Ldo_call_\@:
-	call	.Ldo_retpoline_jmp_\@
+	INTRA_FUNCTION_CALL .Ldo_retpoline_jmp_\@
 .endm
 
 /*
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V2 8/9] x86/speculation: Add unwind hint to trampoline return
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
                   ` (6 preceding siblings ...)
  2020-04-07  7:31 ` [PATCH V2 7/9] x86/speculation: Annotate intra-function calls Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-04-07  7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre
  2020-04-07 13:35 ` [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Josh Poimboeuf
  9 siblings, 0 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

With retpoline, the address to branch to is pushed on the stack and
the return instruction (ret) is used to jump to that address. Use the
UNWIND_HINT_RET_OFFSET directive to inform objtool that the stack
should be adjusted.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>

replace RETPOLINE_RET with UNWIND_HINT_RET_OFFSET
---
 arch/x86/include/asm/nospec-branch.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index a345c6fa0541..5ce2a40a26da 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -10,6 +10,7 @@
 #include <asm/alternative-asm.h>
 #include <asm/cpufeatures.h>
 #include <asm/msr-index.h>
+#include <asm/unwind_hints.h>
 
 /*
  * This should be used immediately before a retpoline alternative. It tells
@@ -100,6 +101,7 @@
 	jmp	.Lspec_trap_\@
 .Ldo_rop_\@:
 	mov	\reg, (%_ASM_SP)
+	UNWIND_HINT_RET_OFFSET
 	ret
 .endm
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
                   ` (7 preceding siblings ...)
  2020-04-07  7:31 ` [PATCH V2 8/9] x86/speculation: Add unwind hint to trampoline return Alexandre Chartre
@ 2020-04-07  7:31 ` Alexandre Chartre
  2020-04-07 13:28   ` Peter Zijlstra
  2020-04-07 13:52   ` Peter Zijlstra
  2020-04-07 13:35 ` [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Josh Poimboeuf
  9 siblings, 2 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07  7:31 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, jpoimboe, peterz, jthierry, tglx, alexandre.chartre

Now that intra-function calls have been annotated and are supported
by objtool, that retpoline return instructions have been annotated,
and that __FILL_RETURN_BUFFER code is compatible with objtool, then
all ANNOTATE_NOSPEC_ALTERNATIVE directives can be removed.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
---
 arch/x86/include/asm/nospec-branch.h | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 5ce2a40a26da..6480db4304a0 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -124,7 +124,6 @@
  */
 .macro JMP_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
-	ANNOTATE_NOSPEC_ALTERNATIVE
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg),	\
 		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
 		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
@@ -135,7 +134,6 @@
 
 .macro CALL_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
-	ANNOTATE_NOSPEC_ALTERNATIVE
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg),	\
 		__stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
 		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD
@@ -150,7 +148,6 @@
   */
 .macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
 #ifdef CONFIG_RETPOLINE
-	ANNOTATE_NOSPEC_ALTERNATIVE
 	ALTERNATIVE "jmp .Lskip_rsb_\@",				\
 		__stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP))	\
 		\ftr
@@ -174,7 +171,6 @@
  * which is ensured when CONFIG_RETPOLINE is defined.
  */
 # define CALL_NOSPEC						\
-	ANNOTATE_NOSPEC_ALTERNATIVE				\
 	ALTERNATIVE_2(						\
 	ANNOTATE_RETPOLINE_SAFE					\
 	"call *%[thunk_target]\n",				\
@@ -193,7 +189,6 @@
  * here, anyway.
  */
 # define CALL_NOSPEC						\
-	ANNOTATE_NOSPEC_ALTERNATIVE				\
 	ALTERNATIVE_2(						\
 	ANNOTATE_RETPOLINE_SAFE					\
 	"call *%[thunk_target]\n",				\
@@ -261,8 +256,7 @@ static inline void vmexit_fill_RSB(void)
 #ifdef CONFIG_RETPOLINE
 	unsigned long loops;
 
-	asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
-		      ALTERNATIVE("jmp 910f",
+	asm volatile (ALTERNATIVE("jmp 910f",
 				  __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
 				  X86_FEATURE_RETPOLINE)
 		      "910:"
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET
  2020-04-07  7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre
@ 2020-04-07 12:53   ` Peter Zijlstra
  2020-04-07 13:17     ` Alexandre Chartre
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-07 12:53 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx

On Tue, Apr 07, 2020 at 09:31:34AM +0200, Alexandre Chartre wrote:
> diff --git a/tools/objtool/check.h b/tools/objtool/check.h
> index 6d875ca6fce0..7a91497fee7e 100644
> --- a/tools/objtool/check.h
> +++ b/tools/objtool/check.h
> @@ -33,9 +33,12 @@ struct instruction {
>  	unsigned int len;
>  	enum insn_type type;
>  	unsigned long immediate;
> -	bool alt_group, dead_end, ignore, hint, save, restore, ignore_alts;
> +	unsigned int alt_group;

^^ that wants to be in a later patch I'm thinking

> +	bool dead_end, ignore, ignore_alts;
> +	bool hint, save, restore;
>  	bool retpoline_safe;
>  	u8 visited;
> +	u8 ret_offset;
>  	struct symbol *call_dest;
>  	struct instruction *jump_dest;
>  	struct instruction *first_jump_src;
> -- 
> 2.18.2
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-07  7:31 ` [PATCH V2 5/9] objtool: Add support for intra-function calls Alexandre Chartre
@ 2020-04-07 13:07   ` Peter Zijlstra
  2020-04-07 13:28     ` Alexandre Chartre
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-07 13:07 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx

On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote:

> index a62e032863a8..7ee1561bf7ad 100644
> --- a/tools/objtool/arch/x86/decode.c
> +++ b/tools/objtool/arch/x86/decode.c
> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state)
>  	state->regs[16].base = CFI_CFA;
>  	state->regs[16].offset = -8;
>  }
> +
> +
> +void arch_configure_intra_function_call(struct stack_op *op)
> +{
> +	/*
> +	 * For the impact on the stack, make an intra-function
> +	 * call behaves like a push of an immediate value (the
> +	 * return address).
> +	 */
> +	op->src.type = OP_SRC_CONST;
> +	op->dest.type = OP_DEST_PUSH;
> +}

An alternative is to always set up stack ops for CALL/RET on decode, but
conditionally run update_insn_state() for them.

Not sure that makes more logical sense, but the patch would be simpler I
think.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET
  2020-04-07 12:53   ` Peter Zijlstra
@ 2020-04-07 13:17     ` Alexandre Chartre
  0 siblings, 0 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07 13:17 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx


On 4/7/20 2:53 PM, Peter Zijlstra wrote:
> On Tue, Apr 07, 2020 at 09:31:34AM +0200, Alexandre Chartre wrote:
>> diff --git a/tools/objtool/check.h b/tools/objtool/check.h
>> index 6d875ca6fce0..7a91497fee7e 100644
>> --- a/tools/objtool/check.h
>> +++ b/tools/objtool/check.h
>> @@ -33,9 +33,12 @@ struct instruction {
>>   	unsigned int len;
>>   	enum insn_type type;
>>   	unsigned long immediate;
>> -	bool alt_group, dead_end, ignore, hint, save, restore, ignore_alts;
>> +	unsigned int alt_group;
> 
> ^^ that wants to be in a later patch I'm thinking

Correct, rebase error, should be in patch 4.

Thanks,

alex.

>> +	bool dead_end, ignore, ignore_alts;
>> +	bool hint, save, restore;
>>   	bool retpoline_safe;
>>   	u8 visited;
>> +	u8 ret_offset;
>>   	struct symbol *call_dest;
>>   	struct instruction *jump_dest;
>>   	struct instruction *first_jump_src;
>> -- 
>> 2.18.2
>>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool
  2020-04-07  7:31 ` [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool Alexandre Chartre
@ 2020-04-07 13:27   ` Josh Poimboeuf
  0 siblings, 0 replies; 40+ messages in thread
From: Josh Poimboeuf @ 2020-04-07 13:27 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: x86, linux-kernel, peterz, jthierry, tglx

On Tue, Apr 07, 2020 at 09:31:39AM +0200, Alexandre Chartre wrote:
> Change __FILL_RETURN_BUFFER so that the stack state is deterministically
> defined for each iteration and that objtool can have an accurate view
> of the stack.
> 
> Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
> ---
>  arch/x86/include/asm/nospec-branch.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index 5c24a7b35166..9a946fd5e824 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -60,8 +60,8 @@
>  	jmp	775b;				\
>  774:						\
>  	dec	reg;				\
> -	jnz	771b;				\
> -	add	$(BITS_PER_LONG/8) * nr, sp;
> +	add	$(BITS_PER_LONG/8) * 2, sp;	\
> +	jnz	771b;
>  
>  #ifdef __ASSEMBLY__

This still isn't a complete fix because the macro is used in an
alternative.  So in the !X86_FEATURE_RSB_CTXSW case, this code isn't
patched in and the ORC data which refers to it is wrong.

As I said before I think the easiest fix would be to convert RSB and
retpolines to use static branches instead of alternatives.

-- 
Josh


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-07 13:07   ` Peter Zijlstra
@ 2020-04-07 13:28     ` Alexandre Chartre
  2020-04-08 14:06       ` Alexandre Chartre
  0 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07 13:28 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx


On 4/7/20 3:07 PM, Peter Zijlstra wrote:
> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote:
> 
>> index a62e032863a8..7ee1561bf7ad 100644
>> --- a/tools/objtool/arch/x86/decode.c
>> +++ b/tools/objtool/arch/x86/decode.c
>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state)
>>   	state->regs[16].base = CFI_CFA;
>>   	state->regs[16].offset = -8;
>>   }
>> +
>> +
>> +void arch_configure_intra_function_call(struct stack_op *op)
>> +{
>> +	/*
>> +	 * For the impact on the stack, make an intra-function
>> +	 * call behaves like a push of an immediate value (the
>> +	 * return address).
>> +	 */
>> +	op->src.type = OP_SRC_CONST;
>> +	op->dest.type = OP_DEST_PUSH;
>> +}
> 
> An alternative is to always set up stack ops for CALL/RET on decode, but
> conditionally run update_insn_state() for them.
> 
> Not sure that makes more logical sense, but the patch would be simpler I
> think.

Right, this would avoid adding a new arch dependent function and the patch
will be simpler. This probably makes sense as the stack impact is the same
for all calls (but objtool will use it only for intra-function calls).

Thanks,

alex.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07  7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre
@ 2020-04-07 13:28   ` Peter Zijlstra
  2020-04-07 13:34     ` Josh Poimboeuf
  2020-04-07 13:52   ` Peter Zijlstra
  1 sibling, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-07 13:28 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx

On Tue, Apr 07, 2020 at 09:31:42AM +0200, Alexandre Chartre wrote:
> Now that intra-function calls have been annotated and are supported
> by objtool, that retpoline return instructions have been annotated,
> and that __FILL_RETURN_BUFFER code is compatible with objtool, then
> all ANNOTATE_NOSPEC_ALTERNATIVE directives can be removed.

Like Josh said in the previous thread, this isn't going to work right.

> -	ANNOTATE_NOSPEC_ALTERNATIVE
>  	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg),	\
>  		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
>  		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD

The problem is that while objtool can now understand the code flow and
the corresponding stack layout, we only have a single ORC table, one
that must be valid for all alternatives.

Effectively this means there should not be any orc entries in an
alternative range.

In practise it _might_ work when the instruction of the various
alternatives have unique offsets in the range. But I'm not entirely sure
of that.

Josh, we should probably have objtool verify it doesn't emit ORC entries
in alternative ranges.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 13:28   ` Peter Zijlstra
@ 2020-04-07 13:34     ` Josh Poimboeuf
  2020-04-07 14:32       ` Alexandre Chartre
  0 siblings, 1 reply; 40+ messages in thread
From: Josh Poimboeuf @ 2020-04-07 13:34 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Alexandre Chartre, x86, linux-kernel, jthierry, tglx

On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote:
> Josh, we should probably have objtool verify it doesn't emit ORC entries
> in alternative ranges.

Agreed, it might be as simple as checking for insn->alt_group in the
INSN_STACK check or in update_insn_state().

-- 
Josh


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE
  2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
                   ` (8 preceding siblings ...)
  2020-04-07  7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre
@ 2020-04-07 13:35 ` Josh Poimboeuf
  2020-04-07 14:02   ` Alexandre Chartre
  9 siblings, 1 reply; 40+ messages in thread
From: Josh Poimboeuf @ 2020-04-07 13:35 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: x86, linux-kernel, peterz, jthierry, tglx

On Tue, Apr 07, 2020 at 09:31:33AM +0200, Alexandre Chartre wrote:
> Hi,
> 
> This is version v2 of this patchset based on the different comments
> received so far. It now uses and includes PeterZ patch to add
> UNWIND_HINT_RET_OFFSET. Other changes are described below.
> 
> Code like retpoline or RSB stuffing, which is used to mitigate some of
> the speculative execution issues, is currently ignored by objtool with
> the ANNOTATE_NOSPEC_ALTERNATIVE directive. This series adds support
> for intra-function calls to objtool so that it can handle such a code.
> With these changes, we can remove all ANNOTATE_NOSPEC_ALTERNATIVE
> directives.
> 
> Changes:
>  - replace RETPOLINE_RET with PeterZ UNWIND_HINT_RET_OFFSET
>  - make objtool intra-function call action architecture dependent
>  - objtool now automatically detects and validates all intra-function
>    calls but it issues a warning if the call was not explicitly tagged
>  - change __FILL_RETURN_BUFFER to work with objtool
>  - add generic ANNOTATE_INTRA_FUNCTION_CALL macro
>  - remove all ANNOTATE_SPEC_ALTERNATIVE (even for __FILL_RETURN_BUFFER)

I had trouble applying the patches.  What branch are they based on?  In
general the latest tip/master is good.

-- 
Josh


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07  7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre
  2020-04-07 13:28   ` Peter Zijlstra
@ 2020-04-07 13:52   ` Peter Zijlstra
  2020-04-07 13:59     ` Peter Zijlstra
  1 sibling, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-07 13:52 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx

On Tue, Apr 07, 2020 at 09:31:42AM +0200, Alexandre Chartre wrote:

> -	ANNOTATE_NOSPEC_ALTERNATIVE
>  	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg),	\
>  		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
>  		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD

Possibly we can write this like:

	ALTERNATIVE("", "lfence", X86_FEATURE_RETPOLINE_AMD);
	ALTERNATIVE("jmp *\reg", "jmp __x86_retpoline_jmp_\reg", X86_FEATURE_RETPOLINE);

With an out-of-line copy of the retpoline, just like the THUNKs the
compiler uses, except of course, it can't be those, because we actually
want to use the alternative to implement those.

By moving the retpoline magic out-of-line we ensure it has a unique
address and the ORC stuff should work.

I'm just not sure what to do about the RETPOLINE_CALL variant.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 13:52   ` Peter Zijlstra
@ 2020-04-07 13:59     ` Peter Zijlstra
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-07 13:59 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx

On Tue, Apr 07, 2020 at 03:52:11PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 07, 2020 at 09:31:42AM +0200, Alexandre Chartre wrote:
> 
> > -	ANNOTATE_NOSPEC_ALTERNATIVE
> >  	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg),	\
> >  		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
> >  		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
> 
> Possibly we can write this like:

.macro OOL_RETPOLINE_JMP reg:req
SYM_FUNC_START(__x86_retpoline_jmp_\reg)
	CFI_STARTPROC
	RETPOLINE_JMP \reg
	CFI_ENDPROC
SYM_FUNC_END(__x86_retpoline_jmp_\reg)
.endm

> 	ALTERNATIVE("", "lfence", X86_FEATURE_RETPOLINE_AMD);
> 	ALTERNATIVE("jmp *\reg", "jmp __x86_retpoline_jmp_\reg", X86_FEATURE_RETPOLINE);
> 
> With an out-of-line copy of the retpoline, just like the THUNKs the
> compiler uses, except of course, it can't be those, because we actually
> want to use the alternative to implement those.
> 
> By moving the retpoline magic out-of-line we ensure it has a unique
> address and the ORC stuff should work.
> 
> I'm just not sure what to do about the RETPOLINE_CALL variant.

Duh, something like so:

	ALTERNATIVE("", "lfence", X86_FEATURE_RETPOLINE_AMD);
	ALTERNATIVE("call *\reg", "call __x86_retpoline_jmp_\reg", X86_FEATURE_RETPOLINE);



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE
  2020-04-07 13:35 ` [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Josh Poimboeuf
@ 2020-04-07 14:02   ` Alexandre Chartre
  0 siblings, 0 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07 14:02 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, peterz, jthierry, tglx


On 4/7/20 3:35 PM, Josh Poimboeuf wrote:
> On Tue, Apr 07, 2020 at 09:31:33AM +0200, Alexandre Chartre wrote:
>> Hi,
>>
>> This is version v2 of this patchset based on the different comments
>> received so far. It now uses and includes PeterZ patch to add
>> UNWIND_HINT_RET_OFFSET. Other changes are described below.
>>
>> Code like retpoline or RSB stuffing, which is used to mitigate some of
>> the speculative execution issues, is currently ignored by objtool with
>> the ANNOTATE_NOSPEC_ALTERNATIVE directive. This series adds support
>> for intra-function calls to objtool so that it can handle such a code.
>> With these changes, we can remove all ANNOTATE_NOSPEC_ALTERNATIVE
>> directives.
>>
>> Changes:
>>   - replace RETPOLINE_RET with PeterZ UNWIND_HINT_RET_OFFSET
>>   - make objtool intra-function call action architecture dependent
>>   - objtool now automatically detects and validates all intra-function
>>     calls but it issues a warning if the call was not explicitly tagged
>>   - change __FILL_RETURN_BUFFER to work with objtool
>>   - add generic ANNOTATE_INTRA_FUNCTION_CALL macro
>>   - remove all ANNOTATE_SPEC_ALTERNATIVE (even for __FILL_RETURN_BUFFER)
> 
> I had trouble applying the patches.  What branch are they based on?  In
> general the latest tip/master is good.

Oups, this is on based on 5.5. I didn't realize I was that late, I though
I recently rebased but it looks like I didn't. Sorry about that, I will
make sure the next version is more recent.

alex.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 13:34     ` Josh Poimboeuf
@ 2020-04-07 14:32       ` Alexandre Chartre
  2020-04-07 16:18         ` Alexandre Chartre
  0 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07 14:32 UTC (permalink / raw)
  To: Josh Poimboeuf, Peter Zijlstra; +Cc: x86, linux-kernel, jthierry, tglx


On 4/7/20 3:34 PM, Josh Poimboeuf wrote:
> On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote:
>> Josh, we should probably have objtool verify it doesn't emit ORC entries
>> in alternative ranges.
> 
> Agreed, it might be as simple as checking for insn->alt_group in the
> INSN_STACK check or in update_insn_state().
> 

We could do that only for the "objtool orc generate" command. That way
"objtool check" would still check the alternative, but "objtool orc generate"
will just use the first half of the alternative (like it does today with
ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE
but only use them for "objtool orc generate".

alex.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 14:32       ` Alexandre Chartre
@ 2020-04-07 16:18         ` Alexandre Chartre
  2020-04-07 16:28           ` Josh Poimboeuf
  2020-04-07 16:41           ` Peter Zijlstra
  0 siblings, 2 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07 16:18 UTC (permalink / raw)
  To: Josh Poimboeuf, Peter Zijlstra; +Cc: x86, linux-kernel, jthierry, tglx


On 4/7/20 4:32 PM, Alexandre Chartre wrote:
> 
> On 4/7/20 3:34 PM, Josh Poimboeuf wrote:
>> On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote:
>>> Josh, we should probably have objtool verify it doesn't emit ORC entries
>>> in alternative ranges.
>>
>> Agreed, it might be as simple as checking for insn->alt_group in the
>> INSN_STACK check or in update_insn_state().
>>
> 
> We could do that only for the "objtool orc generate" command. That way
> "objtool check" would still check the alternative, but "objtool orc generate"
> will just use the first half of the alternative (like it does today with
> ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE
> but only use them for "objtool orc generate".
> 

I have checked and objtool doesn't emit ORC entries for alternative:
decode_instructions() doesn't mark such section with sec->text = true
so create_orc_sections() doesn't emit corresponding ORC entries.

So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives,
this will allow objtool to check the instructions but it still won't
emit ORC entries (same behavior as today). In the future, if ORC
eventually supports alternative we will be ready to have objtool emit
ORC entries.

alex.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 16:18         ` Alexandre Chartre
@ 2020-04-07 16:28           ` Josh Poimboeuf
  2020-04-07 17:01             ` Alexandre Chartre
  2020-04-07 17:27             ` Peter Zijlstra
  2020-04-07 16:41           ` Peter Zijlstra
  1 sibling, 2 replies; 40+ messages in thread
From: Josh Poimboeuf @ 2020-04-07 16:28 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: Peter Zijlstra, x86, linux-kernel, jthierry, tglx

On Tue, Apr 07, 2020 at 06:18:51PM +0200, Alexandre Chartre wrote:
> 
> On 4/7/20 4:32 PM, Alexandre Chartre wrote:
> > 
> > On 4/7/20 3:34 PM, Josh Poimboeuf wrote:
> > > On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote:
> > > > Josh, we should probably have objtool verify it doesn't emit ORC entries
> > > > in alternative ranges.
> > > 
> > > Agreed, it might be as simple as checking for insn->alt_group in the
> > > INSN_STACK check or in update_insn_state().
> > > 
> > 
> > We could do that only for the "objtool orc generate" command. That way
> > "objtool check" would still check the alternative, but "objtool orc generate"
> > will just use the first half of the alternative (like it does today with
> > ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE
> > but only use them for "objtool orc generate".
> > 
> 
> I have checked and objtool doesn't emit ORC entries for alternative:
> decode_instructions() doesn't mark such section with sec->text = true
> so create_orc_sections() doesn't emit corresponding ORC entries.
> 
> So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives,
> this will allow objtool to check the instructions but it still won't
> emit ORC entries (same behavior as today). In the future, if ORC
> eventually supports alternative we will be ready to have objtool emit
> ORC entries.

What's the benefit of removing ANNOTATE_NOSPEC_ALTERNATIVE if there's no
ORC support to go along with it?

Also I want to avoid adding "ORC alternatives".  ORC is nice and simple
and we should keep it that way as much as possible.

Again, we should warn on stack changes inside alternatives, and then
look at converting RSB and retpolines to use static branches so they
have deterministic stacks.

-- 
Josh


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 16:18         ` Alexandre Chartre
  2020-04-07 16:28           ` Josh Poimboeuf
@ 2020-04-07 16:41           ` Peter Zijlstra
  2020-04-07 17:04             ` Alexandre Chartre
  1 sibling, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-07 16:41 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx

On Tue, Apr 07, 2020 at 06:18:51PM +0200, Alexandre Chartre wrote:
> 
> On 4/7/20 4:32 PM, Alexandre Chartre wrote:
> > 
> > On 4/7/20 3:34 PM, Josh Poimboeuf wrote:
> > > On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote:
> > > > Josh, we should probably have objtool verify it doesn't emit ORC entries
> > > > in alternative ranges.
> > > 
> > > Agreed, it might be as simple as checking for insn->alt_group in the
> > > INSN_STACK check or in update_insn_state().
> > > 
> > 
> > We could do that only for the "objtool orc generate" command. That way
> > "objtool check" would still check the alternative, but "objtool orc generate"
> > will just use the first half of the alternative (like it does today with
> > ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE
> > but only use them for "objtool orc generate".
> > 
> 
> I have checked and objtool doesn't emit ORC entries for alternative:
> decode_instructions() doesn't mark such section with sec->text = true
> so create_orc_sections() doesn't emit corresponding ORC entries.
> 
> So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives,
> this will allow objtool to check the instructions but it still won't
> emit ORC entries (same behavior as today). In the future, if ORC
> eventually supports alternative we will be ready to have objtool emit
> ORC entries.

I mean, we should make it warn for the case where you remove
ANNOTATE_NOSPEC and it would like to generate ORC.

Also, what's the point of having objtool grok this code and then not
doing anything with it?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 16:28           ` Josh Poimboeuf
@ 2020-04-07 17:01             ` Alexandre Chartre
  2020-04-07 17:26               ` Peter Zijlstra
  2020-04-07 17:27             ` Peter Zijlstra
  1 sibling, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07 17:01 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: Peter Zijlstra, x86, linux-kernel, jthierry, tglx



On 4/7/20 6:28 PM, Josh Poimboeuf wrote:
> On Tue, Apr 07, 2020 at 06:18:51PM +0200, Alexandre Chartre wrote:
>>
>> On 4/7/20 4:32 PM, Alexandre Chartre wrote:
>>>
>>> On 4/7/20 3:34 PM, Josh Poimboeuf wrote:
>>>> On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote:
>>>>> Josh, we should probably have objtool verify it doesn't emit ORC entries
>>>>> in alternative ranges.
>>>>
>>>> Agreed, it might be as simple as checking for insn->alt_group in the
>>>> INSN_STACK check or in update_insn_state().
>>>>
>>>
>>> We could do that only for the "objtool orc generate" command. That way
>>> "objtool check" would still check the alternative, but "objtool orc generate"
>>> will just use the first half of the alternative (like it does today with
>>> ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE
>>> but only use them for "objtool orc generate".
>>>
>>
>> I have checked and objtool doesn't emit ORC entries for alternative:
>> decode_instructions() doesn't mark such section with sec->text = true
>> so create_orc_sections() doesn't emit corresponding ORC entries.
>>
>> So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives,
>> this will allow objtool to check the instructions but it still won't
>> emit ORC entries (same behavior as today). In the future, if ORC
>> eventually supports alternative we will be ready to have objtool emit
>> ORC entries.
> 
> What's the benefit of removing ANNOTATE_NOSPEC_ALTERNATIVE if there's no
> ORC support to go along with it?

To have the code validated by objtool like any other alternative code
(which is not tagged with ANNOTATE_NOSPEC_ALTERNATIVE).

> Also I want to avoid adding "ORC alternatives".  ORC is nice and simple
> and we should keep it that way as much as possible.
> 
> Again, we should warn on stack changes inside alternatives, and then
> look at converting RSB and retpolines to use static branches so they
> have deterministic stacks.
> 
objtool doesn't currently warn on stack changes inside alternatives.
The RSB/retpoline alternatives have warning because objtool doesn't
support retpoline ret and intra-function calls. If you have an alternative
doing stack changes that objtool understand (like push/pop, add/remove
to sp) then you won't have a warning.

I think that's the case with smap_save:

static __always_inline unsigned long smap_save(void)
{
         unsigned long flags;

         asm volatile (ALTERNATIVE("", "pushf; pop %0; " __ASM_CLAC,
                                   X86_FEATURE_SMAP)
                       : "=rm" (flags) : : "memory", "cc");

         return flags;
}

The alternative does change the stack but objtool won't complain
because it handles the pushf and pop instruction.

alex.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 16:41           ` Peter Zijlstra
@ 2020-04-07 17:04             ` Alexandre Chartre
  0 siblings, 0 replies; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-07 17:04 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx


On 4/7/20 6:41 PM, Peter Zijlstra wrote:
> On Tue, Apr 07, 2020 at 06:18:51PM +0200, Alexandre Chartre wrote:
>>
>> On 4/7/20 4:32 PM, Alexandre Chartre wrote:
>>>
>>> On 4/7/20 3:34 PM, Josh Poimboeuf wrote:
>>>> On Tue, Apr 07, 2020 at 03:28:37PM +0200, Peter Zijlstra wrote:
>>>>> Josh, we should probably have objtool verify it doesn't emit ORC entries
>>>>> in alternative ranges.
>>>>
>>>> Agreed, it might be as simple as checking for insn->alt_group in the
>>>> INSN_STACK check or in update_insn_state().
>>>>
>>>
>>> We could do that only for the "objtool orc generate" command. That way
>>> "objtool check" would still check the alternative, but "objtool orc generate"
>>> will just use the first half of the alternative (like it does today with
>>> ANNOTATE_NOSPEC_ALTERNATIVE). We can even keep all ANNOTATE_NOSPEC_ALTERNATIVE
>>> but only use them for "objtool orc generate".
>>>
>>
>> I have checked and objtool doesn't emit ORC entries for alternative:
>> decode_instructions() doesn't mark such section with sec->text = true
>> so create_orc_sections() doesn't emit corresponding ORC entries.
>>
>> So I think we can remove the ANNOTATE_NOSPEC_ALTERNATIVE directives,
>> this will allow objtool to check the instructions but it still won't
>> emit ORC entries (same behavior as today). In the future, if ORC
>> eventually supports alternative we will be ready to have objtool emit
>> ORC entries.
> 
> I mean, we should make it warn for the case where you remove
> ANNOTATE_NOSPEC and it would like to generate ORC.

Okay, so check if an alternative is changing the stack and warn if it is.

alex.

> Also, what's the point of having objtool grok this code and then not
> doing anything with it?
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 17:01             ` Alexandre Chartre
@ 2020-04-07 17:26               ` Peter Zijlstra
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-07 17:26 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx

On Tue, Apr 07, 2020 at 07:01:43PM +0200, Alexandre Chartre wrote:

> I think that's the case with smap_save:
> 
> static __always_inline unsigned long smap_save(void)
> {
>         unsigned long flags;
> 
>         asm volatile (ALTERNATIVE("", "pushf; pop %0; " __ASM_CLAC,
>                                   X86_FEATURE_SMAP)
>                       : "=rm" (flags) : : "memory", "cc");
> 
>         return flags;
> }
> 
> The alternative does change the stack but objtool won't complain
> because it handles the pushf and pop instruction.

Oh bugger; indeed! That'll wreck unwinding.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 16:28           ` Josh Poimboeuf
  2020-04-07 17:01             ` Alexandre Chartre
@ 2020-04-07 17:27             ` Peter Zijlstra
  2020-04-08 21:35               ` Peter Zijlstra
  1 sibling, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-07 17:27 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: Alexandre Chartre, x86, linux-kernel, jthierry, tglx

On Tue, Apr 07, 2020 at 11:28:38AM -0500, Josh Poimboeuf wrote:
> Again, we should warn on stack changes inside alternatives, and then
> look at converting RSB and retpolines to use static branches so they
> have deterministic stacks.

I don't think we need static brancher, we should just out-of-line the
whole thing.

Let me sort this CFI error Thomas is getting and then I'll attempt a
patch along the lines I outlined in earlier emails.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-07 13:28     ` Alexandre Chartre
@ 2020-04-08 14:06       ` Alexandre Chartre
  2020-04-08 14:19         ` Julien Thierry
  0 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-08 14:06 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, jthierry, tglx



On 4/7/20 3:28 PM, Alexandre Chartre wrote:
> 
> On 4/7/20 3:07 PM, Peter Zijlstra wrote:
>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote:
>>
>>> index a62e032863a8..7ee1561bf7ad 100644
>>> --- a/tools/objtool/arch/x86/decode.c
>>> +++ b/tools/objtool/arch/x86/decode.c
>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state)
>>>       state->regs[16].base = CFI_CFA;
>>>       state->regs[16].offset = -8;
>>>   }
>>> +
>>> +
>>> +void arch_configure_intra_function_call(struct stack_op *op)
>>> +{
>>> +    /*
>>> +     * For the impact on the stack, make an intra-function
>>> +     * call behaves like a push of an immediate value (the
>>> +     * return address).
>>> +     */
>>> +    op->src.type = OP_SRC_CONST;
>>> +    op->dest.type = OP_DEST_PUSH;
>>> +}
>>
>> An alternative is to always set up stack ops for CALL/RET on decode, but
>> conditionally run update_insn_state() for them.
>>
>> Not sure that makes more logical sense, but the patch would be simpler I
>> think.
> 
> Right, this would avoid adding a new arch dependent function and the patch
> will be simpler. This probably makes sense as the stack impact is the same
> for all calls (but objtool will use it only for intra-function calls).
> 

Actually the processing of the ret instruction is more complicated than I
anticipated with intra-function calls, and so my implementation is not
complete at the moment.

The issue is to correctly handle how the ret is going to behave depending how
the stack (or register on arm) is modified before the ret. Adjusting the stack
offset makes the stack state correct, but objtool still needs to correctly
figure out where the ret is going to return and where the code flow continues.

alex.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-08 14:06       ` Alexandre Chartre
@ 2020-04-08 14:19         ` Julien Thierry
  2020-04-08 16:03           ` Alexandre Chartre
  0 siblings, 1 reply; 40+ messages in thread
From: Julien Thierry @ 2020-04-08 14:19 UTC (permalink / raw)
  To: Alexandre Chartre, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx



On 4/8/20 3:06 PM, Alexandre Chartre wrote:
> 
> 
> On 4/7/20 3:28 PM, Alexandre Chartre wrote:
>>
>> On 4/7/20 3:07 PM, Peter Zijlstra wrote:
>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote:
>>>
>>>> index a62e032863a8..7ee1561bf7ad 100644
>>>> --- a/tools/objtool/arch/x86/decode.c
>>>> +++ b/tools/objtool/arch/x86/decode.c
>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct 
>>>> cfi_state *state)
>>>>       state->regs[16].base = CFI_CFA;
>>>>       state->regs[16].offset = -8;
>>>>   }
>>>> +
>>>> +
>>>> +void arch_configure_intra_function_call(struct stack_op *op)
>>>> +{
>>>> +    /*
>>>> +     * For the impact on the stack, make an intra-function
>>>> +     * call behaves like a push of an immediate value (the
>>>> +     * return address).
>>>> +     */
>>>> +    op->src.type = OP_SRC_CONST;
>>>> +    op->dest.type = OP_DEST_PUSH;
>>>> +}
>>>
>>> An alternative is to always set up stack ops for CALL/RET on decode, but
>>> conditionally run update_insn_state() for them.
>>>
>>> Not sure that makes more logical sense, but the patch would be simpler I
>>> think.
>>
>> Right, this would avoid adding a new arch dependent function and the 
>> patch
>> will be simpler. This probably makes sense as the stack impact is the 
>> same
>> for all calls (but objtool will use it only for intra-function calls).
>>
> 
> Actually the processing of the ret instruction is more complicated than I
> anticipated with intra-function calls, and so my implementation is not
> complete at the moment.
> 
> The issue is to correctly handle how the ret is going to behave 
> depending how
> the stack (or register on arm) is modified before the ret. Adjusting the 
> stack
> offset makes the stack state correct, but objtool still needs to correctly
> figure out where the ret is going to return and where the code flow 
> continues.
> 

A hint indicating the target "jump" address could be useful. It could be 
used to add the information on some call/jump dynamic that aren't 
associated with jump tables. Currently when objtool finds a jump 
dynamic, if no branches were added to it, it will just return.

Having such a hint could help make additional links (at least on arm64). 
I don't know what Peter and Josh would think of that (if that helps in 
your case of course).

-- 
Julien Thierry


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-08 14:19         ` Julien Thierry
@ 2020-04-08 16:03           ` Alexandre Chartre
  2020-04-08 16:04             ` Julien Thierry
  0 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-08 16:03 UTC (permalink / raw)
  To: Julien Thierry, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx



On 4/8/20 4:19 PM, Julien Thierry wrote:
> 
> 
> On 4/8/20 3:06 PM, Alexandre Chartre wrote:
>>
>>
>> On 4/7/20 3:28 PM, Alexandre Chartre wrote:
>>>
>>> On 4/7/20 3:07 PM, Peter Zijlstra wrote:
>>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote:
>>>>
>>>>> index a62e032863a8..7ee1561bf7ad 100644
>>>>> --- a/tools/objtool/arch/x86/decode.c
>>>>> +++ b/tools/objtool/arch/x86/decode.c
>>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state)
>>>>>       state->regs[16].base = CFI_CFA;
>>>>>       state->regs[16].offset = -8;
>>>>>   }
>>>>> +
>>>>> +
>>>>> +void arch_configure_intra_function_call(struct stack_op *op)
>>>>> +{
>>>>> +    /*
>>>>> +     * For the impact on the stack, make an intra-function
>>>>> +     * call behaves like a push of an immediate value (the
>>>>> +     * return address).
>>>>> +     */
>>>>> +    op->src.type = OP_SRC_CONST;
>>>>> +    op->dest.type = OP_DEST_PUSH;
>>>>> +}
>>>>
>>>> An alternative is to always set up stack ops for CALL/RET on decode, but
>>>> conditionally run update_insn_state() for them.
>>>>
>>>> Not sure that makes more logical sense, but the patch would be simpler I
>>>> think.
>>>
>>> Right, this would avoid adding a new arch dependent function and the patch
>>> will be simpler. This probably makes sense as the stack impact is the same
>>> for all calls (but objtool will use it only for intra-function calls).
>>>
>>
>> Actually the processing of the ret instruction is more complicated than I
>> anticipated with intra-function calls, and so my implementation is not
>> complete at the moment.
>>
>> The issue is to correctly handle how the ret is going to behave depending how
>> the stack (or register on arm) is modified before the ret. Adjusting the stack
>> offset makes the stack state correct, but objtool still needs to correctly
>> figure out where the ret is going to return and where the code flow continues.
>>
> 
> A hint indicating the target "jump" address could be useful. It could
> be used to add the information on some call/jump dynamic that aren't
> associated with jump tables. Currently when objtool finds a jump
> dynamic, if no branches were added to it, it will just return.
> 
> Having such a hint could help make additional links (at least on
> arm64). I don't know what Peter and Josh would think of that (if that
> helps in your case of course).
> 

Yes, I am thinking about tracking intra-function call return address,
and having hints to specify a return address changes. For example,
on x86, when we push the branch address on the stack we overwrite the
last return address (the return address of the last intra-function call).
Then the return instruction can figure out where to branch.

alex.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-08 16:03           ` Alexandre Chartre
@ 2020-04-08 16:04             ` Julien Thierry
  2020-04-08 17:06               ` Alexandre Chartre
  0 siblings, 1 reply; 40+ messages in thread
From: Julien Thierry @ 2020-04-08 16:04 UTC (permalink / raw)
  To: Alexandre Chartre, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx



On 4/8/20 5:03 PM, Alexandre Chartre wrote:
> 
> 
> On 4/8/20 4:19 PM, Julien Thierry wrote:
>>
>>
>> On 4/8/20 3:06 PM, Alexandre Chartre wrote:
>>>
>>>
>>> On 4/7/20 3:28 PM, Alexandre Chartre wrote:
>>>>
>>>> On 4/7/20 3:07 PM, Peter Zijlstra wrote:
>>>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote:
>>>>>
>>>>>> index a62e032863a8..7ee1561bf7ad 100644
>>>>>> --- a/tools/objtool/arch/x86/decode.c
>>>>>> +++ b/tools/objtool/arch/x86/decode.c
>>>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct 
>>>>>> cfi_state *state)
>>>>>>       state->regs[16].base = CFI_CFA;
>>>>>>       state->regs[16].offset = -8;
>>>>>>   }
>>>>>> +
>>>>>> +
>>>>>> +void arch_configure_intra_function_call(struct stack_op *op)
>>>>>> +{
>>>>>> +    /*
>>>>>> +     * For the impact on the stack, make an intra-function
>>>>>> +     * call behaves like a push of an immediate value (the
>>>>>> +     * return address).
>>>>>> +     */
>>>>>> +    op->src.type = OP_SRC_CONST;
>>>>>> +    op->dest.type = OP_DEST_PUSH;
>>>>>> +}
>>>>>
>>>>> An alternative is to always set up stack ops for CALL/RET on 
>>>>> decode, but
>>>>> conditionally run update_insn_state() for them.
>>>>>
>>>>> Not sure that makes more logical sense, but the patch would be 
>>>>> simpler I
>>>>> think.
>>>>
>>>> Right, this would avoid adding a new arch dependent function and the 
>>>> patch
>>>> will be simpler. This probably makes sense as the stack impact is 
>>>> the same
>>>> for all calls (but objtool will use it only for intra-function calls).
>>>>
>>>
>>> Actually the processing of the ret instruction is more complicated 
>>> than I
>>> anticipated with intra-function calls, and so my implementation is not
>>> complete at the moment.
>>>
>>> The issue is to correctly handle how the ret is going to behave 
>>> depending how
>>> the stack (or register on arm) is modified before the ret. Adjusting 
>>> the stack
>>> offset makes the stack state correct, but objtool still needs to 
>>> correctly
>>> figure out where the ret is going to return and where the code flow 
>>> continues.
>>>
>>
>> A hint indicating the target "jump" address could be useful. It could
>> be used to add the information on some call/jump dynamic that aren't
>> associated with jump tables. Currently when objtool finds a jump
>> dynamic, if no branches were added to it, it will just return.
>>
>> Having such a hint could help make additional links (at least on
>> arm64). I don't know what Peter and Josh would think of that (if that
>> helps in your case of course).
>>
> 
> Yes, I am thinking about tracking intra-function call return address,
> and having hints to specify a return address changes. For example,
> on x86, when we push the branch address on the stack we overwrite the
> last return address (the return address of the last intra-function call).
> Then the return instruction can figure out where to branch.

I see, I was thinking about a more generic hint, that would just 
indicate "this instruction actually jumps here". So in your case it 
would just point that a certain return instruction causes to branch 
somewhere.

This way the hint could also be used for other instructions (e.g. 
INSN_JUMP_DYNAMIC).



-- 
Julien Thierry


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-08 16:04             ` Julien Thierry
@ 2020-04-08 17:06               ` Alexandre Chartre
  2020-04-08 17:07                 ` Julien Thierry
  0 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-08 17:06 UTC (permalink / raw)
  To: Julien Thierry, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx



On 4/8/20 6:04 PM, Julien Thierry wrote:
> 
> 
> On 4/8/20 5:03 PM, Alexandre Chartre wrote:
>>
>>
>> On 4/8/20 4:19 PM, Julien Thierry wrote:
>>>
>>>
>>> On 4/8/20 3:06 PM, Alexandre Chartre wrote:
>>>>
>>>>
>>>> On 4/7/20 3:28 PM, Alexandre Chartre wrote:
>>>>>
>>>>> On 4/7/20 3:07 PM, Peter Zijlstra wrote:
>>>>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote:
>>>>>>
>>>>>>> index a62e032863a8..7ee1561bf7ad 100644
>>>>>>> --- a/tools/objtool/arch/x86/decode.c
>>>>>>> +++ b/tools/objtool/arch/x86/decode.c
>>>>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct cfi_state *state)
>>>>>>>       state->regs[16].base = CFI_CFA;
>>>>>>>       state->regs[16].offset = -8;
>>>>>>>   }
>>>>>>> +
>>>>>>> +
>>>>>>> +void arch_configure_intra_function_call(struct stack_op *op)
>>>>>>> +{
>>>>>>> +    /*
>>>>>>> +     * For the impact on the stack, make an intra-function
>>>>>>> +     * call behaves like a push of an immediate value (the
>>>>>>> +     * return address).
>>>>>>> +     */
>>>>>>> +    op->src.type = OP_SRC_CONST;
>>>>>>> +    op->dest.type = OP_DEST_PUSH;
>>>>>>> +}
>>>>>>
>>>>>> An alternative is to always set up stack ops for CALL/RET on decode, but
>>>>>> conditionally run update_insn_state() for them.
>>>>>>
>>>>>> Not sure that makes more logical sense, but the patch would be simpler I
>>>>>> think.
>>>>>
>>>>> Right, this would avoid adding a new arch dependent function and the patch
>>>>> will be simpler. This probably makes sense as the stack impact is the same
>>>>> for all calls (but objtool will use it only for intra-function calls).
>>>>>
>>>>
>>>> Actually the processing of the ret instruction is more complicated than I
>>>> anticipated with intra-function calls, and so my implementation is not
>>>> complete at the moment.
>>>>
>>>> The issue is to correctly handle how the ret is going to behave depending how
>>>> the stack (or register on arm) is modified before the ret. Adjusting the stack
>>>> offset makes the stack state correct, but objtool still needs to correctly
>>>> figure out where the ret is going to return and where the code flow continues.
>>>>
>>>
>>> A hint indicating the target "jump" address could be useful. It could
>>> be used to add the information on some call/jump dynamic that aren't
>>> associated with jump tables. Currently when objtool finds a jump
>>> dynamic, if no branches were added to it, it will just return.
>>>
>>> Having such a hint could help make additional links (at least on
>>> arm64). I don't know what Peter and Josh would think of that (if that
>>> helps in your case of course).
>>>
>>
>> Yes, I am thinking about tracking intra-function call return address,
>> and having hints to specify a return address changes. For example,
>> on x86, when we push the branch address on the stack we overwrite the
>> last return address (the return address of the last intra-function call).
>> Then the return instruction can figure out where to branch.
> 
> I see, I was thinking about a more generic hint, that would just
> indicate "this instruction actually jumps here". So in your case it
> would just point that a certain return instruction causes to branch
> somewhere.

I thought about doing that but the problem is that on x86 the same
retpoline code can branch differently depending on how it is used.
Basically we have a return instruction that will branch differently
based on what's on the stack. So we can just tell that this ret
instruction will branch/return there.

alex.

> This way the hint could also be used for other instructions (e.g.
> INSN_JUMP_DYNAMIC).
> 
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 5/9] objtool: Add support for intra-function calls
  2020-04-08 17:06               ` Alexandre Chartre
@ 2020-04-08 17:07                 ` Julien Thierry
  0 siblings, 0 replies; 40+ messages in thread
From: Julien Thierry @ 2020-04-08 17:07 UTC (permalink / raw)
  To: Alexandre Chartre, Peter Zijlstra; +Cc: x86, linux-kernel, jpoimboe, tglx



On 4/8/20 6:06 PM, Alexandre Chartre wrote:
> 
> 
> On 4/8/20 6:04 PM, Julien Thierry wrote:
>>
>>
>> On 4/8/20 5:03 PM, Alexandre Chartre wrote:
>>>
>>>
>>> On 4/8/20 4:19 PM, Julien Thierry wrote:
>>>>
>>>>
>>>> On 4/8/20 3:06 PM, Alexandre Chartre wrote:
>>>>>
>>>>>
>>>>> On 4/7/20 3:28 PM, Alexandre Chartre wrote:
>>>>>>
>>>>>> On 4/7/20 3:07 PM, Peter Zijlstra wrote:
>>>>>>> On Tue, Apr 07, 2020 at 09:31:38AM +0200, Alexandre Chartre wrote:
>>>>>>>
>>>>>>>> index a62e032863a8..7ee1561bf7ad 100644
>>>>>>>> --- a/tools/objtool/arch/x86/decode.c
>>>>>>>> +++ b/tools/objtool/arch/x86/decode.c
>>>>>>>> @@ -497,3 +497,15 @@ void arch_initial_func_cfi_state(struct 
>>>>>>>> cfi_state *state)
>>>>>>>>       state->regs[16].base = CFI_CFA;
>>>>>>>>       state->regs[16].offset = -8;
>>>>>>>>   }
>>>>>>>> +
>>>>>>>> +
>>>>>>>> +void arch_configure_intra_function_call(struct stack_op *op)
>>>>>>>> +{
>>>>>>>> +    /*
>>>>>>>> +     * For the impact on the stack, make an intra-function
>>>>>>>> +     * call behaves like a push of an immediate value (the
>>>>>>>> +     * return address).
>>>>>>>> +     */
>>>>>>>> +    op->src.type = OP_SRC_CONST;
>>>>>>>> +    op->dest.type = OP_DEST_PUSH;
>>>>>>>> +}
>>>>>>>
>>>>>>> An alternative is to always set up stack ops for CALL/RET on 
>>>>>>> decode, but
>>>>>>> conditionally run update_insn_state() for them.
>>>>>>>
>>>>>>> Not sure that makes more logical sense, but the patch would be 
>>>>>>> simpler I
>>>>>>> think.
>>>>>>
>>>>>> Right, this would avoid adding a new arch dependent function and 
>>>>>> the patch
>>>>>> will be simpler. This probably makes sense as the stack impact is 
>>>>>> the same
>>>>>> for all calls (but objtool will use it only for intra-function 
>>>>>> calls).
>>>>>>
>>>>>
>>>>> Actually the processing of the ret instruction is more complicated 
>>>>> than I
>>>>> anticipated with intra-function calls, and so my implementation is not
>>>>> complete at the moment.
>>>>>
>>>>> The issue is to correctly handle how the ret is going to behave 
>>>>> depending how
>>>>> the stack (or register on arm) is modified before the ret. 
>>>>> Adjusting the stack
>>>>> offset makes the stack state correct, but objtool still needs to 
>>>>> correctly
>>>>> figure out where the ret is going to return and where the code flow 
>>>>> continues.
>>>>>
>>>>
>>>> A hint indicating the target "jump" address could be useful. It could
>>>> be used to add the information on some call/jump dynamic that aren't
>>>> associated with jump tables. Currently when objtool finds a jump
>>>> dynamic, if no branches were added to it, it will just return.
>>>>
>>>> Having such a hint could help make additional links (at least on
>>>> arm64). I don't know what Peter and Josh would think of that (if that
>>>> helps in your case of course).
>>>>
>>>
>>> Yes, I am thinking about tracking intra-function call return address,
>>> and having hints to specify a return address changes. For example,
>>> on x86, when we push the branch address on the stack we overwrite the
>>> last return address (the return address of the last intra-function 
>>> call).
>>> Then the return instruction can figure out where to branch.
>>
>> I see, I was thinking about a more generic hint, that would just
>> indicate "this instruction actually jumps here". So in your case it
>> would just point that a certain return instruction causes to branch
>> somewhere.
> 
> I thought about doing that but the problem is that on x86 the same
> retpoline code can branch differently depending on how it is used.
> Basically we have a return instruction that will branch differently
> based on what's on the stack. So we can just tell that this ret
> instruction will branch/return there.
> 

Oh, I see. Nevermind then!

-- 
Julien Thierry


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-07 17:27             ` Peter Zijlstra
@ 2020-04-08 21:35               ` Peter Zijlstra
  2020-04-09  8:18                 ` Alexandre Chartre
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-08 21:35 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: Alexandre Chartre, x86, linux-kernel, jthierry, tglx

On Tue, Apr 07, 2020 at 07:27:39PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 07, 2020 at 11:28:38AM -0500, Josh Poimboeuf wrote:
> > Again, we should warn on stack changes inside alternatives, and then
> > look at converting RSB and retpolines to use static branches so they
> > have deterministic stacks.
> 
> I don't think we need static brancher, we should just out-of-line the
> whole thing.
> 
> Let me sort this CFI error Thomas is getting and then I'll attempt a
> patch along the lines I outlined in earlier emails.

Something like so.. seems to build and boot.

---
From: Peter Zijlstra (Intel) <peterz@infradead.org>
Subject: x86: Out-of-line retpoline

Since GCC generated code already uses out-of-line retpolines and objtool
has trouble with retpolines in alternatives, out-of-line them entirely.

This will enable objtool (once it's been taught a few more tricks) to
generate valid ORC data for the out-of-line copies, which means we can
correctly and reliably unwind through a retpoline.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/crypto/aesni-intel_asm.S            |  4 +--
 arch/x86/crypto/camellia-aesni-avx-asm_64.S  |  2 +-
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S |  2 +-
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S    | 26 ++++++++---------
 arch/x86/entry/entry_32.S                    |  6 ++--
 arch/x86/entry/entry_64.S                    |  2 +-
 arch/x86/include/asm/asm-prototypes.h        |  8 ++++--
 arch/x86/include/asm/nospec-branch.h         | 42 ++++------------------------
 arch/x86/kernel/ftrace_32.S                  |  2 +-
 arch/x86/kernel/ftrace_64.S                  |  4 +--
 arch/x86/lib/checksum_32.S                   |  4 +--
 arch/x86/lib/retpoline.S                     | 27 +++++++++++++++---
 arch/x86/platform/efi/efi_stub_64.S          |  2 +-
 13 files changed, 62 insertions(+), 69 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index cad6e1bfa7d5..54e7d15dbd0d 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -2758,7 +2758,7 @@ SYM_FUNC_START(aesni_xts_crypt8)
 	pxor INC, STATE4
 	movdqu IV, 0x30(OUTP)

-	CALL_NOSPEC %r11
+	CALL_NOSPEC r11

 	movdqu 0x00(OUTP), INC
 	pxor INC, STATE1
@@ -2803,7 +2803,7 @@ SYM_FUNC_START(aesni_xts_crypt8)
 	_aesni_gf128mul_x_ble()
 	movups IV, (IVP)

-	CALL_NOSPEC %r11
+	CALL_NOSPEC r11

 	movdqu 0x40(OUTP), INC
 	pxor INC, STATE1
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index d01ddd73de65..ecc0a9a905c4 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -1228,7 +1228,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_16way)
 	vpxor 14 * 16(%rax), %xmm15, %xmm14;
 	vpxor 15 * 16(%rax), %xmm15, %xmm15;

-	CALL_NOSPEC %r9;
+	CALL_NOSPEC r9;

 	addq $(16 * 16), %rsp;

diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 563ef6e83cdd..0907243c501c 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -1339,7 +1339,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_32way)
 	vpxor 14 * 32(%rax), %ymm15, %ymm14;
 	vpxor 15 * 32(%rax), %ymm15, %ymm15;

-	CALL_NOSPEC %r9;
+	CALL_NOSPEC r9;

 	addq $(16 * 32), %rsp;

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 0e6690e3618c..8501ec4532f4 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -75,7 +75,7 @@

 .text
 SYM_FUNC_START(crc_pcl)
-#define    bufp		%rdi
+#define    bufp		rdi
 #define    bufp_dw	%edi
 #define    bufp_w	%di
 #define    bufp_b	%dil
@@ -105,9 +105,9 @@ SYM_FUNC_START(crc_pcl)
 	## 1) ALIGN:
 	################################################################

-	mov     bufp, bufptmp		# rdi = *buf
-	neg     bufp
-	and     $7, bufp		# calculate the unalignment amount of
+	mov     %bufp, bufptmp		# rdi = *buf
+	neg     %bufp
+	and     $7, %bufp		# calculate the unalignment amount of
 					# the address
 	je      proc_block		# Skip if aligned

@@ -123,13 +123,13 @@ SYM_FUNC_START(crc_pcl)
 do_align:
 	#### Calculate CRC of unaligned bytes of the buffer (if any)
 	movq    (bufptmp), tmp		# load a quadward from the buffer
-	add     bufp, bufptmp		# align buffer pointer for quadword
+	add     %bufp, bufptmp		# align buffer pointer for quadword
 					# processing
-	sub     bufp, len		# update buffer length
+	sub     %bufp, len		# update buffer length
 align_loop:
 	crc32b  %bl, crc_init_dw 	# compute crc32 of 1-byte
 	shr     $8, tmp			# get next byte
-	dec     bufp
+	dec     %bufp
 	jne     align_loop

 proc_block:
@@ -169,10 +169,10 @@ continue_block:
 	xor     crc2, crc2

 	## branch into array
-	lea	jump_table(%rip), bufp
-	movzxw  (bufp, %rax, 2), len
-	lea	crc_array(%rip), bufp
-	lea     (bufp, len, 1), bufp
+	lea	jump_table(%rip), %bufp
+	movzxw  (%bufp, %rax, 2), len
+	lea	crc_array(%rip), %bufp
+	lea     (%bufp, len, 1), %bufp
 	JMP_NOSPEC bufp

 	################################################################
@@ -218,9 +218,9 @@ LABEL crc_ %i
 	## 4) Combine three results:
 	################################################################

-	lea	(K_table-8)(%rip), bufp		# first entry is for idx 1
+	lea	(K_table-8)(%rip), %bufp		# first entry is for idx 1
 	shlq    $3, %rax			# rax *= 8
-	pmovzxdq (bufp,%rax), %xmm0		# 2 consts: K1:K2
+	pmovzxdq (%bufp,%rax), %xmm0		# 2 consts: K1:K2
 	leal	(%eax,%eax,2), %eax		# rax *= 3 (total *24)
 	subq    %rax, tmp			# tmp -= rax*24

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index b67bae7091d7..7e7ffb7a5147 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -816,7 +816,7 @@ SYM_CODE_START(ret_from_fork)

 	/* kernel thread */
 1:	movl	%edi, %eax
-	CALL_NOSPEC %ebx
+	CALL_NOSPEC ebx
 	/*
 	 * A kernel thread is allowed to return here after successfully
 	 * calling do_execve().  Exit to userspace to complete the execve()
@@ -1501,7 +1501,7 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exception_read_cr2)

 	TRACE_IRQS_OFF
 	movl	%esp, %eax			# pt_regs pointer
-	CALL_NOSPEC %edi
+	CALL_NOSPEC edi
 	jmp	ret_from_exception
 SYM_CODE_END(common_exception_read_cr2)

@@ -1522,7 +1522,7 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exception)

 	TRACE_IRQS_OFF
 	movl	%esp, %eax			# pt_regs pointer
-	CALL_NOSPEC %edi
+	CALL_NOSPEC edi
 	jmp	ret_from_exception
 SYM_CODE_END(common_exception)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0e9504fabe52..168b798913bc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -349,7 +349,7 @@ SYM_CODE_START(ret_from_fork)
 	/* kernel thread */
 	UNWIND_HINT_EMPTY
 	movq	%r12, %rdi
-	CALL_NOSPEC %rbx
+	CALL_NOSPEC rbx
 	/*
 	 * A kernel thread is allowed to return here after successfully
 	 * calling do_execve().  Exit to userspace to complete the execve()
diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index ce92c4acc913..a75195f159cc 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -18,9 +18,13 @@ extern void cmpxchg8b_emu(void);

 #ifdef CONFIG_RETPOLINE
 #ifdef CONFIG_X86_32
-#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
+#define INDIRECT_THUNK(reg) \
+	extern asmlinkage void __x86_retpoline_e ## reg(void); \
+	extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
 #else
-#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
+#define INDIRECT_THUNK(reg) \
+	extern asmlinkage void __x86_retpoline_r ## reg(void); \
+	extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
 INDIRECT_THUNK(8)
 INDIRECT_THUNK(9)
 INDIRECT_THUNK(10)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 07e95dcb40ad..0a3ceab5e0ec 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -76,34 +76,6 @@
 	.popsection
 .endm

-/*
- * These are the bare retpoline primitives for indirect jmp and call.
- * Do not use these directly; they only exist to make the ALTERNATIVE
- * invocation below less ugly.
- */
-.macro RETPOLINE_JMP reg:req
-	call	.Ldo_rop_\@
-.Lspec_trap_\@:
-	pause
-	lfence
-	jmp	.Lspec_trap_\@
-.Ldo_rop_\@:
-	mov	\reg, (%_ASM_SP)
-	ret
-.endm
-
-/*
- * This is a wrapper around RETPOLINE_JMP so the called function in reg
- * returns to the instruction after the macro.
- */
-.macro RETPOLINE_CALL reg:req
-	jmp	.Ldo_call_\@
-.Ldo_retpoline_jmp_\@:
-	RETPOLINE_JMP \reg
-.Ldo_call_\@:
-	call	.Ldo_retpoline_jmp_\@
-.endm
-
 /*
  * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
  * indirect jmp/call which may be susceptible to the Spectre variant 2
@@ -111,10 +83,9 @@
  */
 .macro JMP_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
-	ANNOTATE_NOSPEC_ALTERNATIVE
-	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg),	\
-		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
-		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
+	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg),	\
+		__stringify(jmp __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,	\
+		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
 #else
 	jmp	*\reg
 #endif
@@ -122,10 +93,9 @@

 .macro CALL_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
-	ANNOTATE_NOSPEC_ALTERNATIVE
-	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg),	\
-		__stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
-		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD
+	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg),	\
+		__stringify(call __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,\
+		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD
 #else
 	call	*\reg
 #endif
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index e8a9f8370112..e405fe1a8bf4 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -189,5 +189,5 @@ return_to_handler:
 	movl	%eax, %ecx
 	popl	%edx
 	popl	%eax
-	JMP_NOSPEC %ecx
+	JMP_NOSPEC ecx
 #endif
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 369e61faacfe..2f2b5702e6cf 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -303,7 +303,7 @@ trace:
 	 * function tracing is enabled.
 	 */
 	movq ftrace_trace_function, %r8
-	CALL_NOSPEC %r8
+	CALL_NOSPEC r8
 	restore_mcount_regs

 	jmp fgraph_trace
@@ -340,6 +340,6 @@ SYM_CODE_START(return_to_handler)
 	movq 8(%rsp), %rdx
 	movq (%rsp), %rax
 	addq $24, %rsp
-	JMP_NOSPEC %rdi
+	JMP_NOSPEC rdi
 SYM_CODE_END(return_to_handler)
 #endif
diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index 4742e8fa7ee7..d1d768912368 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -153,7 +153,7 @@ SYM_FUNC_START(csum_partial)
 	negl %ebx
 	lea 45f(%ebx,%ebx,2), %ebx
 	testl %esi, %esi
-	JMP_NOSPEC %ebx
+	JMP_NOSPEC ebx

 	# Handle 2-byte-aligned regions
 20:	addw (%esi), %ax
@@ -436,7 +436,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	andl $-32,%edx
 	lea 3f(%ebx,%ebx), %ebx
 	testl %esi, %esi
-	JMP_NOSPEC %ebx
+	JMP_NOSPEC ebx
 1:	addl $64,%esi
 	addl $64,%edi
 	SRC(movb -32(%edx),%bl)	; SRC(movb (%edx),%bl)
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 363ec132df7e..d4aef0c856a6 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -8,14 +8,31 @@
 #include <asm/export.h>
 #include <asm/nospec-branch.h>

+/*
+ * This is the bare retpoline primitive.
+ */
+.macro RETPOLINE_JMP reg:req
+	call	.Ldo_rop_\@
+.Lspec_trap_\@:
+	pause
+	lfence
+	jmp	.Lspec_trap_\@
+.Ldo_rop_\@:
+	mov	\reg, (%_ASM_SP)
+	ret
+.endm
+
 .macro THUNK reg
 	.section .text.__x86.indirect_thunk

+SYM_FUNC_START(__x86_retpoline_\reg)
+	RETPOLINE_JMP %\reg
+SYM_FUNC_END(__x86_retpoline_\reg)
+
 SYM_FUNC_START(__x86_indirect_thunk_\reg)
-	CFI_STARTPROC
-	JMP_NOSPEC %\reg
-	CFI_ENDPROC
+	JMP_NOSPEC \reg
 SYM_FUNC_END(__x86_indirect_thunk_\reg)
+
 .endm

 /*
@@ -26,7 +43,9 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
  * the simple and nasty way...
  */
 #define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
-#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
+#define EXPORT_THUNK(reg)				\
+	__EXPORT_THUNK(__x86_retpoline_ ## reg);	\
+	__EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
 #define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)

 GENERATE_THUNK(_ASM_AX)
diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S
index 15da118f04f0..90380a17ab23 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -21,7 +21,7 @@ SYM_FUNC_START(__efi_call)
 	mov %r8, %r9
 	mov %rcx, %r8
 	mov %rsi, %rcx
-	CALL_NOSPEC %rdi
+	CALL_NOSPEC rdi
 	leave
 	ret
 SYM_FUNC_END(__efi_call)


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-08 21:35               ` Peter Zijlstra
@ 2020-04-09  8:18                 ` Alexandre Chartre
  2020-04-09 10:34                   ` Peter Zijlstra
  0 siblings, 1 reply; 40+ messages in thread
From: Alexandre Chartre @ 2020-04-09  8:18 UTC (permalink / raw)
  To: Peter Zijlstra, Josh Poimboeuf; +Cc: x86, linux-kernel, jthierry, tglx


On 4/8/20 11:35 PM, Peter Zijlstra wrote:
> On Tue, Apr 07, 2020 at 07:27:39PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 07, 2020 at 11:28:38AM -0500, Josh Poimboeuf wrote:
>>> Again, we should warn on stack changes inside alternatives, and then
>>> look at converting RSB and retpolines to use static branches so they
>>> have deterministic stacks.
>>
>> I don't think we need static brancher, we should just out-of-line the
>> whole thing.
>>
>> Let me sort this CFI error Thomas is getting and then I'll attempt a
>> patch along the lines I outlined in earlier emails.
> 
> Something like so.. seems to build and boot.
> 
> ---
> From: Peter Zijlstra (Intel) <peterz@infradead.org>
> Subject: x86: Out-of-line retpoline
> 
> Since GCC generated code already uses out-of-line retpolines and objtool
> has trouble with retpolines in alternatives, out-of-line them entirely.
> 
> This will enable objtool (once it's been taught a few more tricks) to
> generate valid ORC data for the out-of-line copies, which means we can
> correctly and reliably unwind through a retpoline.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>   arch/x86/crypto/aesni-intel_asm.S            |  4 +--
>   arch/x86/crypto/camellia-aesni-avx-asm_64.S  |  2 +-
>   arch/x86/crypto/camellia-aesni-avx2-asm_64.S |  2 +-
>   arch/x86/crypto/crc32c-pcl-intel-asm_64.S    | 26 ++++++++---------
>   arch/x86/entry/entry_32.S                    |  6 ++--
>   arch/x86/entry/entry_64.S                    |  2 +-
>   arch/x86/include/asm/asm-prototypes.h        |  8 ++++--
>   arch/x86/include/asm/nospec-branch.h         | 42 ++++------------------------
>   arch/x86/kernel/ftrace_32.S                  |  2 +-
>   arch/x86/kernel/ftrace_64.S                  |  4 +--
>   arch/x86/lib/checksum_32.S                   |  4 +--
>   arch/x86/lib/retpoline.S                     | 27 +++++++++++++++---
>   arch/x86/platform/efi/efi_stub_64.S          |  2 +-
>   13 files changed, 62 insertions(+), 69 deletions(-)
> 
...
>   /*
>    * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
>    * indirect jmp/call which may be susceptible to the Spectre variant 2
> @@ -111,10 +83,9 @@
>    */
>   .macro JMP_NOSPEC reg:req
>   #ifdef CONFIG_RETPOLINE
> -	ANNOTATE_NOSPEC_ALTERNATIVE
> -	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg),	\
> -		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
> -		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
> +	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg),	\
> +		__stringify(jmp __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,	\
> +		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
>   #else
>   	jmp	*\reg
>   #endif
> @@ -122,10 +93,9 @@
> 
>   .macro CALL_NOSPEC reg:req
>   #ifdef CONFIG_RETPOLINE
> -	ANNOTATE_NOSPEC_ALTERNATIVE
> -	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg),	\
> -		__stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
> -		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD
> +	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg),	\
> +		__stringify(call __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,\
> +		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD

For X86_FEATURE_RETPOLINE_AMD, the call won't be aligned like the others,
it will be after the lfence instruction so ORC data won't be at the same
place. I am adding some code in objtool to check that alternatives don't
change the stack, but I should actually be checking if all alternatives
have the same unwind instructions at the same place.

Other than that, my only question would be any impact on performances.
Retpoline code was added with trying to limit performance impact.
Here, JMP_NOSPEC has now an additional (long) jump, and CALL_NOSPEC
is doing a long call instead of a near call. But I have no idea if this
has a visible impact.

alex.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-09  8:18                 ` Alexandre Chartre
@ 2020-04-09 10:34                   ` Peter Zijlstra
  2020-04-09 10:40                     ` Peter Zijlstra
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-09 10:34 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx

On Thu, Apr 09, 2020 at 10:18:56AM +0200, Alexandre Chartre wrote:

> > -	ANNOTATE_NOSPEC_ALTERNATIVE
> > -	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg),	\
> > -		__stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
> > -		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD
> > +	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg),	\
> > +		__stringify(call __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,\
> > +		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD
> 
> For X86_FEATURE_RETPOLINE_AMD, the call won't be aligned like the others,
> it will be after the lfence instruction so ORC data won't be at the same
> place. I am adding some code in objtool to check that alternatives don't
> change the stack, but I should actually be checking if all alternatives
> have the same unwind instructions at the same place.

Argh; earlier (20200407135953.GC20730@hirez.programming.kicks-ass.net) I
used 2 alternatives but then, when I did the patch yesterday, I forgot
why I did that :/

> Other than that, my only question would be any impact on performances.

Yeah, that needs testing, I suspect it's a wash.

> Retpoline code was added with trying to limit performance impact.
> Here, JMP_NOSPEC has now an additional (long) jump, and CALL_NOSPEC
> is doing a long call instead of a near call. But I have no idea if this
> has a visible impact.

The thing is, all the compiler generated code already used the
out-of-line copies, and those will now have a near jump extra I think,
but that should be to the same $I line, given that __x86_retpoline_ and
__x86_indirect_thunk_ are next to one another.

We can also play alignment tricks, see below.

The only sites that can suffer are those few manual asm uses of
JMP_NOSPEC, and I'm not sure we care.


The below results in:

Disassembly of section .text.__x86.indirect_thunk:

0000000000000000 <__x86_indirect_thunk_rax>:
   0:	90                   	nop
   1:	90                   	nop
   2:	90                   	nop
   3:	ff e0                	jmpq   *%rax
   5:	90                   	nop
   6:	90                   	nop
   7:	90                   	nop

0000000000000008 <__x86_retpoline_rax>:
   8:	e8 07 00 00 00       	callq  14 <__x86_retpoline_rax+0xc>
   d:	f3 90                	pause  
   f:	0f ae e8             	lfence 
  12:	eb f9                	jmp    d <__x86_retpoline_rax+0x5>
  14:	48 89 04 24          	mov    %rax,(%rsp)
  18:	c3                   	retq   
  19:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

0000000000000020 <__x86_indirect_thunk_rbx>:
  ....


I'm not sure why it thinks the "jmp __x86_retpoline_rax" needs 5 bytes
to encode, but those nops don't hurt nothing since we'll not fit in 16
bytes anyway.

---
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index cad6e1bfa7d5..54e7d15dbd0d 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -2758,7 +2758,7 @@ SYM_FUNC_START(aesni_xts_crypt8)
 	pxor INC, STATE4
 	movdqu IV, 0x30(OUTP)
 
-	CALL_NOSPEC %r11
+	CALL_NOSPEC r11
 
 	movdqu 0x00(OUTP), INC
 	pxor INC, STATE1
@@ -2803,7 +2803,7 @@ SYM_FUNC_START(aesni_xts_crypt8)
 	_aesni_gf128mul_x_ble()
 	movups IV, (IVP)
 
-	CALL_NOSPEC %r11
+	CALL_NOSPEC r11
 
 	movdqu 0x40(OUTP), INC
 	pxor INC, STATE1
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index d01ddd73de65..ecc0a9a905c4 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -1228,7 +1228,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_16way)
 	vpxor 14 * 16(%rax), %xmm15, %xmm14;
 	vpxor 15 * 16(%rax), %xmm15, %xmm15;
 
-	CALL_NOSPEC %r9;
+	CALL_NOSPEC r9;
 
 	addq $(16 * 16), %rsp;
 
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 563ef6e83cdd..0907243c501c 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -1339,7 +1339,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_32way)
 	vpxor 14 * 32(%rax), %ymm15, %ymm14;
 	vpxor 15 * 32(%rax), %ymm15, %ymm15;
 
-	CALL_NOSPEC %r9;
+	CALL_NOSPEC r9;
 
 	addq $(16 * 32), %rsp;
 
diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 0e6690e3618c..8501ec4532f4 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -75,7 +75,7 @@
 
 .text
 SYM_FUNC_START(crc_pcl)
-#define    bufp		%rdi
+#define    bufp		rdi
 #define    bufp_dw	%edi
 #define    bufp_w	%di
 #define    bufp_b	%dil
@@ -105,9 +105,9 @@ SYM_FUNC_START(crc_pcl)
 	## 1) ALIGN:
 	################################################################
 
-	mov     bufp, bufptmp		# rdi = *buf
-	neg     bufp
-	and     $7, bufp		# calculate the unalignment amount of
+	mov     %bufp, bufptmp		# rdi = *buf
+	neg     %bufp
+	and     $7, %bufp		# calculate the unalignment amount of
 					# the address
 	je      proc_block		# Skip if aligned
 
@@ -123,13 +123,13 @@ SYM_FUNC_START(crc_pcl)
 do_align:
 	#### Calculate CRC of unaligned bytes of the buffer (if any)
 	movq    (bufptmp), tmp		# load a quadward from the buffer
-	add     bufp, bufptmp		# align buffer pointer for quadword
+	add     %bufp, bufptmp		# align buffer pointer for quadword
 					# processing
-	sub     bufp, len		# update buffer length
+	sub     %bufp, len		# update buffer length
 align_loop:
 	crc32b  %bl, crc_init_dw 	# compute crc32 of 1-byte
 	shr     $8, tmp			# get next byte
-	dec     bufp
+	dec     %bufp
 	jne     align_loop
 
 proc_block:
@@ -169,10 +169,10 @@ continue_block:
 	xor     crc2, crc2
 
 	## branch into array
-	lea	jump_table(%rip), bufp
-	movzxw  (bufp, %rax, 2), len
-	lea	crc_array(%rip), bufp
-	lea     (bufp, len, 1), bufp
+	lea	jump_table(%rip), %bufp
+	movzxw  (%bufp, %rax, 2), len
+	lea	crc_array(%rip), %bufp
+	lea     (%bufp, len, 1), %bufp
 	JMP_NOSPEC bufp
 
 	################################################################
@@ -218,9 +218,9 @@ LABEL crc_ %i
 	## 4) Combine three results:
 	################################################################
 
-	lea	(K_table-8)(%rip), bufp		# first entry is for idx 1
+	lea	(K_table-8)(%rip), %bufp		# first entry is for idx 1
 	shlq    $3, %rax			# rax *= 8
-	pmovzxdq (bufp,%rax), %xmm0		# 2 consts: K1:K2
+	pmovzxdq (%bufp,%rax), %xmm0		# 2 consts: K1:K2
 	leal	(%eax,%eax,2), %eax		# rax *= 3 (total *24)
 	subq    %rax, tmp			# tmp -= rax*24
 
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index b67bae7091d7..7e7ffb7a5147 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -816,7 +816,7 @@ SYM_CODE_START(ret_from_fork)
 
 	/* kernel thread */
 1:	movl	%edi, %eax
-	CALL_NOSPEC %ebx
+	CALL_NOSPEC ebx
 	/*
 	 * A kernel thread is allowed to return here after successfully
 	 * calling do_execve().  Exit to userspace to complete the execve()
@@ -1501,7 +1501,7 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exception_read_cr2)
 
 	TRACE_IRQS_OFF
 	movl	%esp, %eax			# pt_regs pointer
-	CALL_NOSPEC %edi
+	CALL_NOSPEC edi
 	jmp	ret_from_exception
 SYM_CODE_END(common_exception_read_cr2)
 
@@ -1522,7 +1522,7 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exception)
 
 	TRACE_IRQS_OFF
 	movl	%esp, %eax			# pt_regs pointer
-	CALL_NOSPEC %edi
+	CALL_NOSPEC edi
 	jmp	ret_from_exception
 SYM_CODE_END(common_exception)
 
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0e9504fabe52..168b798913bc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -349,7 +349,7 @@ SYM_CODE_START(ret_from_fork)
 	/* kernel thread */
 	UNWIND_HINT_EMPTY
 	movq	%r12, %rdi
-	CALL_NOSPEC %rbx
+	CALL_NOSPEC rbx
 	/*
 	 * A kernel thread is allowed to return here after successfully
 	 * calling do_execve().  Exit to userspace to complete the execve()
diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index ce92c4acc913..a75195f159cc 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -18,9 +18,13 @@ extern void cmpxchg8b_emu(void);
 
 #ifdef CONFIG_RETPOLINE
 #ifdef CONFIG_X86_32
-#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
+#define INDIRECT_THUNK(reg) \
+	extern asmlinkage void __x86_retpoline_e ## reg(void); \
+	extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
 #else
-#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
+#define INDIRECT_THUNK(reg) \
+	extern asmlinkage void __x86_retpoline_r ## reg(void); \
+	extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
 INDIRECT_THUNK(8)
 INDIRECT_THUNK(9)
 INDIRECT_THUNK(10)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 07e95dcb40ad..a180b0fe2fed 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -76,34 +76,6 @@
 	.popsection
 .endm
 
-/*
- * These are the bare retpoline primitives for indirect jmp and call.
- * Do not use these directly; they only exist to make the ALTERNATIVE
- * invocation below less ugly.
- */
-.macro RETPOLINE_JMP reg:req
-	call	.Ldo_rop_\@
-.Lspec_trap_\@:
-	pause
-	lfence
-	jmp	.Lspec_trap_\@
-.Ldo_rop_\@:
-	mov	\reg, (%_ASM_SP)
-	ret
-.endm
-
-/*
- * This is a wrapper around RETPOLINE_JMP so the called function in reg
- * returns to the instruction after the macro.
- */
-.macro RETPOLINE_CALL reg:req
-	jmp	.Ldo_call_\@
-.Ldo_retpoline_jmp_\@:
-	RETPOLINE_JMP \reg
-.Ldo_call_\@:
-	call	.Ldo_retpoline_jmp_\@
-.endm
-
 /*
  * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
  * indirect jmp/call which may be susceptible to the Spectre variant 2
@@ -111,23 +83,21 @@
  */
 .macro JMP_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
-	ANNOTATE_NOSPEC_ALTERNATIVE
-	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg),	\
-		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
-		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
+	ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
+	ALTERNATIVE __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
+		    __stringify(jmp __x86_retpoline_\reg), X86_FEATURE_RETPOLINE
 #else
-	jmp	*\reg
+	jmp	*%\reg
 #endif
 .endm
 
 .macro CALL_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
-	ANNOTATE_NOSPEC_ALTERNATIVE
-	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg),	\
-		__stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
-		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD
+	ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
+	ALTERNATIVE __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \
+		    __stringify(call __x86_retpoline_\reg), X86_FEATURE_RETPOLINE
 #else
-	call	*\reg
+	call	*%\reg
 #endif
 .endm
 
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index e8a9f8370112..e405fe1a8bf4 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -189,5 +189,5 @@ return_to_handler:
 	movl	%eax, %ecx
 	popl	%edx
 	popl	%eax
-	JMP_NOSPEC %ecx
+	JMP_NOSPEC ecx
 #endif
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 369e61faacfe..2f2b5702e6cf 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -303,7 +303,7 @@ trace:
 	 * function tracing is enabled.
 	 */
 	movq ftrace_trace_function, %r8
-	CALL_NOSPEC %r8
+	CALL_NOSPEC r8
 	restore_mcount_regs
 
 	jmp fgraph_trace
@@ -340,6 +340,6 @@ SYM_CODE_START(return_to_handler)
 	movq 8(%rsp), %rdx
 	movq (%rsp), %rax
 	addq $24, %rsp
-	JMP_NOSPEC %rdi
+	JMP_NOSPEC rdi
 SYM_CODE_END(return_to_handler)
 #endif
diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index 4742e8fa7ee7..d1d768912368 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -153,7 +153,7 @@ SYM_FUNC_START(csum_partial)
 	negl %ebx
 	lea 45f(%ebx,%ebx,2), %ebx
 	testl %esi, %esi
-	JMP_NOSPEC %ebx
+	JMP_NOSPEC ebx
 
 	# Handle 2-byte-aligned regions
 20:	addw (%esi), %ax
@@ -436,7 +436,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	andl $-32,%edx
 	lea 3f(%ebx,%ebx), %ebx
 	testl %esi, %esi 
-	JMP_NOSPEC %ebx
+	JMP_NOSPEC ebx
 1:	addl $64,%esi
 	addl $64,%edi 
 	SRC(movb -32(%edx),%bl)	; SRC(movb (%edx),%bl)
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 363ec132df7e..cc44702d0492 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -7,15 +7,35 @@
 #include <asm/alternative-asm.h>
 #include <asm/export.h>
 #include <asm/nospec-branch.h>
+#include <asm/unwind_hints.h>
+
+/*
+ * This is the bare retpoline primitive.
+ */
+.macro RETPOLINE_JMP reg:req
+	call	.Ldo_rop_\@
+.Lspec_trap_\@:
+	pause
+	lfence
+	jmp	.Lspec_trap_\@
+.Ldo_rop_\@:
+	mov	\reg, (%_ASM_SP)
+	ret
+.endm
 
 .macro THUNK reg
 	.section .text.__x86.indirect_thunk
 
+	.align 32
 SYM_FUNC_START(__x86_indirect_thunk_\reg)
-	CFI_STARTPROC
-	JMP_NOSPEC %\reg
-	CFI_ENDPROC
+	JMP_NOSPEC \reg
 SYM_FUNC_END(__x86_indirect_thunk_\reg)
+
+SYM_CODE_START_NOALIGN(__x86_retpoline_\reg)
+	UNWIND_HINT_EMPTY
+	RETPOLINE_JMP %\reg
+SYM_CODE_END(__x86_retpoline_\reg)
+
 .endm
 
 /*
@@ -26,7 +46,9 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
  * the simple and nasty way...
  */
 #define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
-#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
+#define EXPORT_THUNK(reg)				\
+	__EXPORT_THUNK(__x86_retpoline_ ## reg);	\
+	__EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
 #define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
 
 GENERATE_THUNK(_ASM_AX)
diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S
index 15da118f04f0..90380a17ab23 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -21,7 +21,7 @@ SYM_FUNC_START(__efi_call)
 	mov %r8, %r9
 	mov %rcx, %r8
 	mov %rsi, %rcx
-	CALL_NOSPEC %rdi
+	CALL_NOSPEC rdi
 	leave
 	ret
 SYM_FUNC_END(__efi_call)

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives
  2020-04-09 10:34                   ` Peter Zijlstra
@ 2020-04-09 10:40                     ` Peter Zijlstra
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Zijlstra @ 2020-04-09 10:40 UTC (permalink / raw)
  To: Alexandre Chartre; +Cc: Josh Poimboeuf, x86, linux-kernel, jthierry, tglx

On Thu, Apr 09, 2020 at 12:34:24PM +0200, Peter Zijlstra wrote:
> @@ -111,23 +83,21 @@
>   */
>  .macro JMP_NOSPEC reg:req
>  #ifdef CONFIG_RETPOLINE
> -	ANNOTATE_NOSPEC_ALTERNATIVE
> -	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg),	\
> -		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
> -		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
> +	ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
> +	ALTERNATIVE __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
> +		    __stringify(jmp __x86_retpoline_\reg), X86_FEATURE_RETPOLINE

Still, I am being an idiot, only the call (below) needs this, this can
stay what it was:

+       ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
+                     __stringify(jmp __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE, \
+                     __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD

>  #else
> -	jmp	*\reg
> +	jmp	*%\reg
>  #endif
>  .endm
>  
>  .macro CALL_NOSPEC reg:req
>  #ifdef CONFIG_RETPOLINE
> -	ANNOTATE_NOSPEC_ALTERNATIVE
> -	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg),	\
> -		__stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
> -		__stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD

And then this needs a comment on why it is not ALTERNATIVE_2 like above.

> +	ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
> +	ALTERNATIVE __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \
> +		    __stringify(call __x86_retpoline_\reg), X86_FEATURE_RETPOLINE
>  #else
> -	call	*\reg
> +	call	*%\reg
>  #endif
>  .endm

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [tip: objtool/core] objtool: UNWIND_HINT_RET_OFFSET should not check registers
  2020-04-07  7:31 ` [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers Alexandre Chartre
@ 2020-05-01 18:22   ` tip-bot2 for Alexandre Chartre
  0 siblings, 0 replies; 40+ messages in thread
From: tip-bot2 for Alexandre Chartre @ 2020-05-01 18:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Alexandre Chartre, Peter Zijlstra (Intel),
	Miroslav Benes, Josh Poimboeuf, x86, LKML

The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     c721b3f80faebc7891211fa82de303eebadfed15
Gitweb:        https://git.kernel.org/tip/c721b3f80faebc7891211fa82de303eebadfed15
Author:        Alexandre Chartre <alexandre.chartre@oracle.com>
AuthorDate:    Tue, 07 Apr 2020 09:31:35 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 30 Apr 2020 20:14:32 +02:00

objtool: UNWIND_HINT_RET_OFFSET should not check registers

UNWIND_HINT_RET_OFFSET will adjust a modified stack. However if a
callee-saved register was pushed on the stack then the stack frame
will still appear modified. So stop checking registers when
UNWIND_HINT_RET_OFFSET is used.

Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200407073142.20659-3-alexandre.chartre@oracle.com
---
 tools/objtool/check.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 8af8de2..068897d 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1507,6 +1507,14 @@ static bool has_modified_stack_frame(struct instruction *insn, struct insn_state
 	if (cfi->stack_size != initial_func_cfi.cfa.offset + ret_offset)
 		return true;
 
+	/*
+	 * If there is a ret offset hint then don't check registers
+	 * because a callee-saved register might have been pushed on
+	 * the stack.
+	 */
+	if (ret_offset)
+		return false;
+
 	for (i = 0; i < CFI_NUM_REGS; i++) {
 		if (cfi->regs[i].base != initial_func_cfi.regs[i].base ||
 		    cfi->regs[i].offset != initial_func_cfi.regs[i].offset)

^ permalink raw reply related	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2020-05-01 18:23 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-07  7:31 [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Alexandre Chartre
2020-04-07  7:31 ` [PATCH V2 1/9] objtool: Introduce HINT_RET_OFFSET Alexandre Chartre
2020-04-07 12:53   ` Peter Zijlstra
2020-04-07 13:17     ` Alexandre Chartre
2020-04-07  7:31 ` [PATCH V2 2/9] objtool: UNWIND_HINT_RET_OFFSET should not check registers Alexandre Chartre
2020-05-01 18:22   ` [tip: objtool/core] " tip-bot2 for Alexandre Chartre
2020-04-07  7:31 ` [PATCH V2 3/9] objtool: is_fentry_call() crashes if call has no destination Alexandre Chartre
2020-04-07  7:31 ` [PATCH V2 4/9] objtool: Allow branches within the same alternative Alexandre Chartre
2020-04-07  7:31 ` [PATCH V2 5/9] objtool: Add support for intra-function calls Alexandre Chartre
2020-04-07 13:07   ` Peter Zijlstra
2020-04-07 13:28     ` Alexandre Chartre
2020-04-08 14:06       ` Alexandre Chartre
2020-04-08 14:19         ` Julien Thierry
2020-04-08 16:03           ` Alexandre Chartre
2020-04-08 16:04             ` Julien Thierry
2020-04-08 17:06               ` Alexandre Chartre
2020-04-08 17:07                 ` Julien Thierry
2020-04-07  7:31 ` [PATCH V2 6/9] x86/speculation: Change __FILL_RETURN_BUFFER to work with objtool Alexandre Chartre
2020-04-07 13:27   ` Josh Poimboeuf
2020-04-07  7:31 ` [PATCH V2 7/9] x86/speculation: Annotate intra-function calls Alexandre Chartre
2020-04-07  7:31 ` [PATCH V2 8/9] x86/speculation: Add unwind hint to trampoline return Alexandre Chartre
2020-04-07  7:31 ` [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives Alexandre Chartre
2020-04-07 13:28   ` Peter Zijlstra
2020-04-07 13:34     ` Josh Poimboeuf
2020-04-07 14:32       ` Alexandre Chartre
2020-04-07 16:18         ` Alexandre Chartre
2020-04-07 16:28           ` Josh Poimboeuf
2020-04-07 17:01             ` Alexandre Chartre
2020-04-07 17:26               ` Peter Zijlstra
2020-04-07 17:27             ` Peter Zijlstra
2020-04-08 21:35               ` Peter Zijlstra
2020-04-09  8:18                 ` Alexandre Chartre
2020-04-09 10:34                   ` Peter Zijlstra
2020-04-09 10:40                     ` Peter Zijlstra
2020-04-07 16:41           ` Peter Zijlstra
2020-04-07 17:04             ` Alexandre Chartre
2020-04-07 13:52   ` Peter Zijlstra
2020-04-07 13:59     ` Peter Zijlstra
2020-04-07 13:35 ` [PATCH V2 0/9] objtool changes to remove all ANNOTATE_NOSPEC_ALTERNATIVE Josh Poimboeuf
2020-04-07 14:02   ` Alexandre Chartre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).