* [RFC][PATCH 0/3] objtool vs irqstack swizzles
@ 2020-05-07 16:10 Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 1/3] x86/entry: Collapse the 3 IRQ stack instances into a single macro Peter Zijlstra
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-07 16:10 UTC (permalink / raw)
To: tglx; +Cc: jpoimboe, x86, linux-kernel, luto, peterz
These few patches are on top of the very latest tarball tglx gave me and do not
apply to hit posted patches or his git-tree (although I suspect that might
change at some point in the near future).
Aside from relying on tip/objtool/core, this also relies on the patch I just
pushed into tip/objtool/urgent.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC][PATCH 1/3] x86/entry: Collapse the 3 IRQ stack instances into a single macro
2020-05-07 16:10 [RFC][PATCH 0/3] objtool vs irqstack swizzles Peter Zijlstra
@ 2020-05-07 16:10 ` Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 2/3] x86/entry: Provide ASM_INSTR_{BEGIN,END} Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles Peter Zijlstra
2 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-07 16:10 UTC (permalink / raw)
To: tglx; +Cc: jpoimboe, x86, linux-kernel, luto, peterz
Because 1 copy of this magic is plenty.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
--- a/arch/x86/include/asm/irq_stack.h
+++ b/arch/x86/include/asm/irq_stack.h
@@ -12,10 +12,8 @@ static __always_inline bool irqstack_act
return __this_cpu_read(irq_count) != -1;
}
-/*
- * Macro to emit code for running @func on the irq stack.
- */
-#define RUN_ON_IRQSTACK(func) { \
+#define __RUN_ON_IRQSTACK(_asm, ...) \
+do { \
unsigned long tos; \
\
lockdep_assert_irqs_disabled(); \
@@ -30,45 +28,27 @@ static __always_inline bool irqstack_act
" .pushsection .discard.instr_begin \n" \
" .long 1b - . \n" \
" .popsection \n" \
- "call " __ASM_FORM(func) " \n" \
+ _asm " \n" \
"2: \n" \
" .pushsection .discard.instr_end \n" \
" .long 2b - . \n" \
" .popsection \n" \
"popq %%rsp \n" \
: \
- : [ts] "r" (tos) \
+ : [ts] "r" (tos), ##__VA_ARGS__ \
: "memory" \
); \
__this_cpu_sub(irq_count, 1); \
-}
+} while (0)
-#define RUN_ON_IRQSTACK_ARG1(func, arg) { \
- unsigned long tos; \
- \
- tos = ((unsigned long)__this_cpu_read(hardirq_stack_ptr)) - 8; \
- \
- __this_cpu_add(irq_count, 1); \
- asm volatile( \
- "movq %%rsp, (%[ts]) \n" \
- "movq %[ts], %%rsp \n" \
- "1: \n" \
- " .pushsection .discard.instr_begin \n" \
- " .long 1b - . \n" \
- " .popsection \n" \
- "call " __ASM_FORM(func) " \n" \
- "2: \n" \
- " .pushsection .discard.instr_end \n" \
- " .long 2b - . \n" \
- " .popsection \n" \
- "popq %%rsp \n" \
- : \
- : [ts] "r" (tos), \
- "D" (arg) \
- : "memory" \
- ); \
- __this_cpu_sub(irq_count, 1); \
-}
+/*
+ * Macro to emit code for running @func on the irq stack.
+ */
+#define RUN_ON_IRQSTACK(func) \
+ __RUN_ON_IRQSTACK("call" __ASM_FORM(func))
+
+#define RUN_ON_IRQSTACK_ARG1(func, arg) \
+ __RUN_ON_IRQSTACK("call" __ASM_FORM(func), "D" (arg))
#else /* CONFIG_X86_64 */
static __always_inline bool irqstack_active(void) { return false; }
--- a/arch/x86/kernel/irq_64.c
+++ b/arch/x86/kernel/irq_64.c
@@ -74,35 +74,7 @@ int irq_init_percpu_irqstack(unsigned in
static noinstr void handle_irq_on_irqstack(struct irq_desc *desc)
{
- unsigned long tos;
-
- tos = (unsigned long) __this_cpu_read(hardirq_stack_ptr) - 8;
-
- __this_cpu_add(irq_count, 1);
- /*
- * The unwinder requires that the top of the IRQ stack links back
- * to the previous stack and RBP is set up.
- */
- asm volatile(
- "movq %%rsp, (%[ts]) \n"
- "movq %[ts], %%rsp \n"
- "1: \n"
- " .pushsection .discard.instr_begin \n"
- " .long 1b - . \n"
- " .popsection \n"
- CALL_NOSPEC
- "2: \n"
- " .pushsection .discard.instr_end \n"
- " .long 2b - . \n"
- " .popsection \n"
- "popq %%rsp \n"
- :
- : [ts] "r" (tos),
- [thunk_target] "r" (desc->handle_irq),
- "D" (desc)
- : "memory"
- );
- __this_cpu_sub(irq_count, 1);
+ __RUN_ON_IRQSTACK(CALL_NOSPEC, THUNK_TARGET(desc->handle_irq), "D" (desc));
}
void handle_irq(struct irq_desc *desc, struct pt_regs *regs)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC][PATCH 2/3] x86/entry: Provide ASM_INSTR_{BEGIN,END}
2020-05-07 16:10 [RFC][PATCH 0/3] objtool vs irqstack swizzles Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 1/3] x86/entry: Collapse the 3 IRQ stack instances into a single macro Peter Zijlstra
@ 2020-05-07 16:10 ` Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles Peter Zijlstra
2 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-07 16:10 UTC (permalink / raw)
To: tglx; +Cc: jpoimboe, x86, linux-kernel, luto, peterz
To reduce clutter.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/include/asm/irq_stack.h | 10 ++--------
arch/x86/include/asm/kvm_host.h | 10 ++--------
include/linux/compiler.h | 14 ++++++++++++++
3 files changed, 18 insertions(+), 16 deletions(-)
--- a/arch/x86/include/asm/irq_stack.h
+++ b/arch/x86/include/asm/irq_stack.h
@@ -24,15 +24,9 @@ do { \
asm volatile( \
"movq %%rsp, (%[ts]) \n" \
"movq %[ts], %%rsp \n" \
- "1: \n" \
- " .pushsection .discard.instr_begin \n" \
- " .long 1b - . \n" \
- " .popsection \n" \
+ ASM_INSTR_BEGIN \
_asm " \n" \
- "2: \n" \
- " .pushsection .discard.instr_end \n" \
- " .long 2b - . \n" \
- " .popsection \n" \
+ ASM_INSTR_END \
"popq %%rsp \n" \
: \
: [ts] "r" (tos), ##__VA_ARGS__ \
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1601,15 +1601,9 @@ asmlinkage void kvm_spurious_fault(void)
insn "\n\t" \
"jmp 668f \n\t" \
"667: \n\t" \
- "1: \n\t" \
- ".pushsection .discard.instr_begin \n\t" \
- ".long 1b - . \n\t" \
- ".popsection \n\t" \
+ ASM_INSTR_BEGIN \
"call kvm_spurious_fault \n\t" \
- "1: \n\t" \
- ".pushsection .discard.instr_end \n\t" \
- ".long 1b - . \n\t" \
- ".popsection \n\t" \
+ ASM_INSTR_END \
"668: \n\t" \
_ASM_EXTABLE(666b, 667b)
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -135,12 +135,26 @@ void ftrace_likely_update(struct ftrace_
".popsection\n\t" : : "i" (__COUNTER__)); \
})
+#define ASM_INSTR_BEGIN \
+ "999:\n\t" \
+ ".pushsection .discard.instr_begin\n\t" \
+ ".long 999b - .\n\t" \
+ ".popsection\n\t"
+
+#define ASM_INSTR_END \
+ "999:\n\t" \
+ ".pushsection .discard.instr_end\n\t" \
+ ".long 999b - .\n\t" \
+ ".popsection\n\t"
+
#else
#define annotate_reachable()
#define annotate_unreachable()
#define __annotate_jump_table
#define instr_begin() do { } while(0)
#define instr_end() do { } while(0)
+#define ASM_INSTR_BEGIN
+#define ASM_INSTR_END
#endif
#ifndef ASM_UNREACHABLE
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles
2020-05-07 16:10 [RFC][PATCH 0/3] objtool vs irqstack swizzles Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 1/3] x86/entry: Collapse the 3 IRQ stack instances into a single macro Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 2/3] x86/entry: Provide ASM_INSTR_{BEGIN,END} Peter Zijlstra
@ 2020-05-07 16:10 ` Peter Zijlstra
2020-05-07 17:38 ` Peter Zijlstra
2 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-07 16:10 UTC (permalink / raw)
To: tglx; +Cc: jpoimboe, x86, linux-kernel, luto, peterz
Thomas would very much like objtool to understand and generate correct
ORC unwind information for the minimal stack swizzle sequence:
mov %rsp, (%[ts])
mov %[ts], %rsp
...
pop %rsp
This sequence works for the fp and guess unwinders -- all they need is
that top-of-stack link set up by the first instruction.
The previous entry_64.S code worked with "UNWIND_HINT_REGS indirect=1"
hints to inform the unwinder about the stack-swizzle, but because
we've now already entered C, we can no longer point to a REGS. In
fact, due to being in C we don't even have a reliable sp_offset to
anything.
None of the existing UNWIND_HINT() functionality is quite sufficient
to generate the right thing, but SP_INDIRECT is still the closest, so
extend it.
When SP_INDIRECT is combined with .end=1 (which is otherwise unused,
except for sp_reg == UNDEFINED):
- change it from (sp+sp_offset) to (sp)+sp_offset
- have objtool preserve sp_offset from the previous state
- change "pop %rsp" handling to restore the CFI state from before the
hint.
NOTES:
- We now have an instruction with stackops and a hint; make hint take
precedence over stackops.
- Due to the reverse search in "pop %rsp" we must
fill_alternative_cfi() before validate_branch().
- This all isn't really pretty, but it works and gets Thomas the code
sequence he wants.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/include/asm/irq_stack.h | 1
arch/x86/include/asm/unwind_hints.h | 41 +++++++++++++++++++++++++
arch/x86/kernel/unwind_orc.c | 11 +++++-
tools/objtool/check.c | 58 +++++++++++++++++++++++++++---------
tools/objtool/orc_dump.c | 22 ++++++++-----
5 files changed, 110 insertions(+), 23 deletions(-)
--- a/arch/x86/include/asm/irq_stack.h
+++ b/arch/x86/include/asm/irq_stack.h
@@ -23,6 +23,7 @@ do { \
__this_cpu_add(irq_count, 1); \
asm volatile( \
"movq %%rsp, (%[ts]) \n" \
+ UNWIND_HINT_STACK_EMPTY \
"movq %[ts], %%rsp \n" \
ASM_INSTR_BEGIN \
_asm " \n" \
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -95,6 +95,47 @@
UNWIND_HINT type=UNWIND_HINT_TYPE_RET_OFFSET sp_offset=\sp_offset
.endm
+#else
+
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "987: \n\t" \
+ ".pushsection .discard.unwind_hints\n\t" \
+ /* struct unwind_hint */ \
+ ".long 987b - .\n\t" \
+ ".short " __stringify(sp_offset) "\n\t" \
+ ".byte " __stringify(sp_reg) "\n\t" \
+ ".byte " __stringify(type) "\n\t" \
+ ".byte " __stringify(end) "\n\t" \
+ ".balign 4 \n\t" \
+ ".popsection\n\t"
+
+/*
+ * Stack swizzling vs objtool/ORC:
+ *
+ * The canonical way of swizzling stack is:
+ *
+ * 1: mov %%rsp, (%[ts])
+ * 2: mov %[ts], %%rsp
+ * ...
+ * 3: pop %%rsp
+ *
+ * Where:
+ *
+ * 1 - places a pointer to the previous stack at the top of the new stack;
+ * also see the unwinders.
+ *
+ * 2 - switches to the new stack, but to avoid hitting the CFA_UNDEFINED case,
+ * we need to tell objtool the stack pointer can be found at (%%rsp),
+ * UNWIND_HINT_STACK_EMPTY does so.
+ *
+ * 3 - restores the previous stack by popping the value stored by 1 into %%rsp,
+ * ....
+ *
+ * See arch/x86/include/asm/irq_stack.h
+ */
+#define UNWIND_HINT_STACK_EMPTY \
+ UNWIND_HINT(ORC_REG_SP_INDIRECT, 0, ORC_TYPE_CALL, 1)
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_UNWIND_HINTS_H */
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -435,12 +435,16 @@ bool unwind_next_frame(struct unwind_sta
break;
case ORC_REG_SP_INDIRECT:
- sp = state->sp + orc->sp_offset;
+ sp = state->sp;
+ if (!orc->end)
+ sp += orc->sp_offset;
indirect = true;
break;
case ORC_REG_BP_INDIRECT:
- sp = state->bp + orc->sp_offset;
+ sp = state->bp;
+ if (!orc->end)
+ sp += orc->sp_offset;
indirect = true;
break;
@@ -489,6 +493,9 @@ bool unwind_next_frame(struct unwind_sta
if (indirect) {
if (!deref_stack_reg(state, sp, &sp))
goto err;
+
+ if (orc->end)
+ sp += orc->sp_offset;
}
/* Find IP, SP and possibly regs: */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1720,8 +1720,8 @@ static void restore_reg(struct cfi_state
* 41 5d pop %r13
* c3 retq
*/
-static int update_cfi_state(struct instruction *insn, struct cfi_state *cfi,
- struct stack_op *op)
+static int update_cfi_state(struct objtool_file *file, struct instruction *insn,
+ struct cfi_state *cfi, struct stack_op *op)
{
struct cfi_reg *cfa = &cfi->cfa;
struct cfi_reg *regs = cfi->regs;
@@ -1898,6 +1898,29 @@ static int update_cfi_state(struct instr
case OP_SRC_POP:
case OP_SRC_POPF:
+
+ if (op->dest.reg == CFI_SP && cfa->base == CFI_SP_INDIRECT && cfi->end) {
+ /* pop %rsp from a stack swizzle */
+
+ struct instruction *prev;
+
+ for (prev = prev_insn_same_sym(file, insn);
+ prev; prev = prev_insn_same_sym(file, prev)) {
+
+ if (prev->cfi.cfa.base != CFI_SP_INDIRECT) {
+ *cfi = prev->cfi;
+ break;
+ }
+ }
+
+ if (!prev) {
+ WARN_FUNC("failed RESTORE_STACK", insn->sec, insn->offset);
+ return -1;
+ }
+
+ break;
+ }
+
if (!cfi->drap && op->dest.reg == cfa->base) {
/* pop %rbp */
@@ -2073,7 +2096,7 @@ static int update_cfi_state(struct instr
return 0;
}
-static int handle_insn_ops(struct instruction *insn, struct insn_state *state)
+static int handle_insn_ops(struct objtool_file *file, struct instruction *insn, struct insn_state *state)
{
struct stack_op *op;
@@ -2085,9 +2108,11 @@ static int handle_insn_ops(struct instru
return -1;
}
- res = update_cfi_state(insn, &state->cfi, op);
- if (res)
- return res;
+ if (!insn->hint) {
+ res = update_cfi_state(file, insn, &state->cfi, op);
+ if (res)
+ return res;
+ }
if (op->dest.type == OP_DEST_PUSHF) {
if (!state->uaccess_stack) {
@@ -2319,16 +2344,26 @@ static int validate_branch(struct objtoo
if (state.noinstr)
state.instr += insn->instr;
- if (insn->hint)
- state.cfi = insn->cfi;
- else
+ if (insn->hint) {
+ if (insn->cfi.cfa.base == CFI_SP_INDIRECT && insn->cfi.end) {
+ state.cfi.cfa.base = CFI_SP_INDIRECT;
+ state.cfi.end = true;
+ insn->cfi = state.cfi;
+ } else {
+ state.cfi = insn->cfi;
+ }
+ } else {
insn->cfi = state.cfi;
+ }
insn->visited |= visited;
if (!insn->ignore_alts && !list_empty(&insn->alts)) {
bool skip_orig = false;
+ if (insn->alt_group)
+ fill_alternative_cfi(file, insn);
+
list_for_each_entry(alt, &insn->alts, list) {
if (alt->skip_orig)
skip_orig = true;
@@ -2341,14 +2376,11 @@ static int validate_branch(struct objtoo
}
}
- if (insn->alt_group)
- fill_alternative_cfi(file, insn);
-
if (skip_orig)
return 0;
}
- if (handle_insn_ops(insn, &state))
+ if (handle_insn_ops(file, insn, &state))
return 1;
switch (insn->type) {
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/orc_dump.c
@@ -47,13 +47,19 @@ static const char *orc_type_name(unsigne
}
}
-static void print_reg(unsigned int reg, int offset)
+static void print_reg(unsigned int reg, int offset, bool end)
{
- if (reg == ORC_REG_BP_INDIRECT)
- printf("(bp%+d)", offset);
- else if (reg == ORC_REG_SP_INDIRECT)
- printf("(sp%+d)", offset);
- else if (reg == ORC_REG_UNDEFINED)
+ if (reg == ORC_REG_BP_INDIRECT) {
+ if (end)
+ printf("(bp)%+d", offset);
+ else
+ printf("(bp%+d)", offset);
+ } else if (reg == ORC_REG_SP_INDIRECT) {
+ if (end)
+ printf("(sp)%+d", offset);
+ else
+ printf("(sp%+d)", offset);
+ } else if (reg == ORC_REG_UNDEFINED)
printf("(und)");
else
printf("%s%+d", reg_name(reg), offset);
@@ -195,11 +201,11 @@ int orc_dump(const char *_objname)
printf(" sp:");
- print_reg(orc[i].sp_reg, orc[i].sp_offset);
+ print_reg(orc[i].sp_reg, orc[i].sp_offset, orc[i].end);
printf(" bp:");
- print_reg(orc[i].bp_reg, orc[i].bp_offset);
+ print_reg(orc[i].bp_reg, orc[i].bp_offset, false);
printf(" type:%s end:%d\n",
orc_type_name(orc[i].type), orc[i].end);
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles
2020-05-07 16:10 ` [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles Peter Zijlstra
@ 2020-05-07 17:38 ` Peter Zijlstra
2020-05-07 18:30 ` Josh Poimboeuf
0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-07 17:38 UTC (permalink / raw)
To: tglx; +Cc: jpoimboe, x86, linux-kernel, luto
On Thu, May 07, 2020 at 06:10:23PM +0200, Peter Zijlstra wrote:
> Thomas would very much like objtool to understand and generate correct
> ORC unwind information for the minimal stack swizzle sequence:
>
> mov %rsp, (%[ts])
> mov %[ts], %rsp
> ...
> pop %rsp
>
> This sequence works for the fp and guess unwinders -- all they need is
> that top-of-stack link set up by the first instruction.
>
> The previous entry_64.S code worked with "UNWIND_HINT_REGS indirect=1"
> hints to inform the unwinder about the stack-swizzle, but because
> we've now already entered C, we can no longer point to a REGS. In
> fact, due to being in C we don't even have a reliable sp_offset to
> anything.
>
> None of the existing UNWIND_HINT() functionality is quite sufficient
> to generate the right thing, but SP_INDIRECT is still the closest, so
> extend it.
>
> When SP_INDIRECT is combined with .end=1 (which is otherwise unused,
> except for sp_reg == UNDEFINED):
>
> - change it from (sp+sp_offset) to (sp)+sp_offset
> - have objtool preserve sp_offset from the previous state
> - change "pop %rsp" handling to restore the CFI state from before the
> hint.
>
> NOTES:
>
> - We now have an instruction with stackops and a hint; make hint take
> precedence over stackops.
>
> - Due to the reverse search in "pop %rsp" we must
> fill_alternative_cfi() before validate_branch().
>
> - This all isn't really pretty, but it works and gets Thomas the code
> sequence he wants.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
Much simpler, also works.
---
--- a/arch/x86/include/asm/irq_stack.h
+++ b/arch/x86/include/asm/irq_stack.h
@@ -23,6 +23,7 @@ do { \
__this_cpu_add(irq_count, 1); \
asm volatile( \
"movq %%rsp, (%[ts]) \n" \
+ UNWIND_HINT_INDIRECT \
"movq %[ts], %%rsp \n" \
ASM_INSTR_BEGIN \
_asm " \n" \
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -95,6 +95,47 @@
UNWIND_HINT type=UNWIND_HINT_TYPE_RET_OFFSET sp_offset=\sp_offset
.endm
+#else
+
+#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
+ "987: \n\t" \
+ ".pushsection .discard.unwind_hints\n\t" \
+ /* struct unwind_hint */ \
+ ".long 987b - .\n\t" \
+ ".short " __stringify(sp_offset) "\n\t" \
+ ".byte " __stringify(sp_reg) "\n\t" \
+ ".byte " __stringify(type) "\n\t" \
+ ".byte " __stringify(end) "\n\t" \
+ ".balign 4 \n\t" \
+ ".popsection\n\t"
+
+/*
+ * Stack swizzling vs objtool/ORC:
+ *
+ * The canonical way of swizzling stack is:
+ *
+ * 1: mov %%rsp, (%[ts])
+ * 2: mov %[ts], %%rsp
+ * ...
+ * 3: pop %%rsp
+ *
+ * Where:
+ *
+ * 1 - places a pointer to the previous stack at the top of the new stack;
+ * also see the unwinders.
+ *
+ * 2 - switches to the new stack, but to avoid hitting the CFA_UNDEFINED case,
+ * we need to tell objtool the stack pointer can be found at (%%rsp),
+ * UNWIND_HINT_INDIRECT does so.
+ *
+ * 3 - restores the previous stack by popping the value stored by 1 into %%rsp,
+ * ....
+ *
+ * See arch/x86/include/asm/irq_stack.h
+ */
+#define UNWIND_HINT_INDIRECT \
+ UNWIND_HINT(ORC_REG_SP_INDIRECT, 0, ORC_TYPE_CALL, 0)
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_UNWIND_HINTS_H */
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -435,12 +435,12 @@ bool unwind_next_frame(struct unwind_sta
break;
case ORC_REG_SP_INDIRECT:
- sp = state->sp + orc->sp_offset;
+ sp = state->sp;
indirect = true;
break;
case ORC_REG_BP_INDIRECT:
- sp = state->bp + orc->sp_offset;
+ sp = state->bp;
indirect = true;
break;
@@ -489,6 +489,8 @@ bool unwind_next_frame(struct unwind_sta
if (indirect) {
if (!deref_stack_reg(state, sp, &sp))
goto err;
+
+ sp += orc->sp_offset;
}
/* Find IP, SP and possibly regs: */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1720,8 +1720,7 @@ static void restore_reg(struct cfi_state
* 41 5d pop %r13
* c3 retq
*/
-static int update_cfi_state(struct instruction *insn, struct cfi_state *cfi,
- struct stack_op *op)
+static int update_cfi_state(struct instruction *insn, struct cfi_state *cfi, struct stack_op *op)
{
struct cfi_reg *cfa = &cfi->cfa;
struct cfi_reg *regs = cfi->regs;
@@ -1898,6 +1897,13 @@ static int update_cfi_state(struct instr
case OP_SRC_POP:
case OP_SRC_POPF:
+ if (op->dest.reg == CFI_SP && cfa->base == CFI_SP_INDIRECT) {
+
+ /* pop %rsp from a stack swizzle */
+ cfa->base = CFI_SP;
+ break;
+ }
+
if (!cfi->drap && op->dest.reg == cfa->base) {
/* pop %rbp */
@@ -2085,9 +2091,11 @@ static int handle_insn_ops(struct instru
return -1;
}
- res = update_cfi_state(insn, &state->cfi, op);
- if (res)
- return res;
+ if (!insn->hint) {
+ res = update_cfi_state(insn, &state->cfi, op);
+ if (res)
+ return res;
+ }
if (op->dest.type == OP_DEST_PUSHF) {
if (!state->uaccess_stack) {
@@ -2319,16 +2327,25 @@ static int validate_branch(struct objtoo
if (state.noinstr)
state.instr += insn->instr;
- if (insn->hint)
- state.cfi = insn->cfi;
- else
+ if (insn->hint) {
+ if (insn->cfi.cfa.base == CFI_SP_INDIRECT) {
+ state.cfi.cfa.base = CFI_SP_INDIRECT;
+ insn->cfi = state.cfi;
+ } else {
+ state.cfi = insn->cfi;
+ }
+ } else {
insn->cfi = state.cfi;
+ }
insn->visited |= visited;
if (!insn->ignore_alts && !list_empty(&insn->alts)) {
bool skip_orig = false;
+ if (insn->alt_group)
+ fill_alternative_cfi(file, insn);
+
list_for_each_entry(alt, &insn->alts, list) {
if (alt->skip_orig)
skip_orig = true;
@@ -2341,9 +2358,6 @@ static int validate_branch(struct objtoo
}
}
- if (insn->alt_group)
- fill_alternative_cfi(file, insn);
-
if (skip_orig)
return 0;
}
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/orc_dump.c
@@ -50,9 +50,9 @@ static const char *orc_type_name(unsigne
static void print_reg(unsigned int reg, int offset)
{
if (reg == ORC_REG_BP_INDIRECT)
- printf("(bp%+d)", offset);
+ printf("(bp)%+d", offset);
else if (reg == ORC_REG_SP_INDIRECT)
- printf("(sp%+d)", offset);
+ printf("(sp)%+d", offset);
else if (reg == ORC_REG_UNDEFINED)
printf("(und)");
else
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles
2020-05-07 17:38 ` Peter Zijlstra
@ 2020-05-07 18:30 ` Josh Poimboeuf
2020-05-07 21:24 ` Thomas Gleixner
0 siblings, 1 reply; 11+ messages in thread
From: Josh Poimboeuf @ 2020-05-07 18:30 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: tglx, x86, linux-kernel, luto
On Thu, May 07, 2020 at 07:38:09PM +0200, Peter Zijlstra wrote:
> On Thu, May 07, 2020 at 06:10:23PM +0200, Peter Zijlstra wrote:
> > Thomas would very much like objtool to understand and generate correct
> > ORC unwind information for the minimal stack swizzle sequence:
> >
> > mov %rsp, (%[ts])
> > mov %[ts], %rsp
> > ...
> > pop %rsp
> >
> > This sequence works for the fp and guess unwinders -- all they need is
> > that top-of-stack link set up by the first instruction.
> >
> > The previous entry_64.S code worked with "UNWIND_HINT_REGS indirect=1"
> > hints to inform the unwinder about the stack-swizzle, but because
> > we've now already entered C, we can no longer point to a REGS. In
> > fact, due to being in C we don't even have a reliable sp_offset to
> > anything.
> >
> > None of the existing UNWIND_HINT() functionality is quite sufficient
> > to generate the right thing, but SP_INDIRECT is still the closest, so
> > extend it.
> >
> > When SP_INDIRECT is combined with .end=1 (which is otherwise unused,
> > except for sp_reg == UNDEFINED):
> >
> > - change it from (sp+sp_offset) to (sp)+sp_offset
> > - have objtool preserve sp_offset from the previous state
> > - change "pop %rsp" handling to restore the CFI state from before the
> > hint.
> >
> > NOTES:
> >
> > - We now have an instruction with stackops and a hint; make hint take
> > precedence over stackops.
> >
> > - Due to the reverse search in "pop %rsp" we must
> > fill_alternative_cfi() before validate_branch().
> >
> > - This all isn't really pretty, but it works and gets Thomas the code
> > sequence he wants.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
>
> Much simpler, also works.
Doing the stack switch in inline asm is just nasty.
Also, a frame pointer makes this SO much easier for ORC/objtool, no
funky hints needed. In fact maybe we can get rid of the indirect hint
things altogether, which means even more deleted code.
This is much cleaner, and it boots:
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 3f9b2555e6fb..4a25f72f969f 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -718,15 +718,6 @@ __visible void __xen_pv_evtchn_do_upcall(void)
irq_exit_rcu();
}
-/*
- * Separate function as objtool is unhappy about having
- * the macro at the call site.
- */
-static noinstr void run_on_irqstack(void)
-{
- RUN_ON_IRQSTACK(__xen_pv_evtchn_do_upcall);
-}
-
__visible noinstr void xen_pv_evtchn_do_upcall(struct pt_regs *regs)
{
struct pt_regs *old_regs;
@@ -739,7 +730,7 @@ __visible noinstr void xen_pv_evtchn_do_upcall(struct pt_regs *regs)
__xen_pv_evtchn_do_upcall();
instr_end();
} else {
- run_on_irqstack();
+ RUN_ON_IRQSTACK(__xen_pv_evtchn_do_upcall);
}
set_irq_regs(old_regs);
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3046dfc69b8c..d036dc831a23 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1295,3 +1295,31 @@ SYM_CODE_START(rewind_stack_do_exit)
call do_exit
SYM_CODE_END(rewind_stack_do_exit)
.popsection
+
+/*
+ * rdi: new stack pointer
+ * rsi: function pointer
+ * rdx: arg1 (can be NULL if none)
+ */
+SYM_FUNC_START(call_on_stack)
+ /*
+ * Save the frame pointer unconditionally. This allows the ORC
+ * unwinder to handle the stack switch.
+ */
+ pushq %rbp
+ mov %rsp, %rbp
+
+ /*
+ * The unwinder relies on the word at the end of the new stack page
+ * linking back to the previous RSP.
+ */
+ mov %rsp, -8(%rdi)
+ lea -8(%rdi), %rsp
+ mov %rdx, %rdi
+ CALL_NOSPEC rsi
+
+ /* Restore the previous stack pointer from RBP. */
+ leaveq
+
+ ret
+SYM_FUNC_END(call_on_stack)
diff --git a/arch/x86/include/asm/irq_stack.h b/arch/x86/include/asm/irq_stack.h
index f307d4c27f84..131a6c689b1c 100644
--- a/arch/x86/include/asm/irq_stack.h
+++ b/arch/x86/include/asm/irq_stack.h
@@ -7,42 +7,26 @@
#include <asm/processor.h>
#ifdef CONFIG_X86_64
+
+void call_on_stack(void *stack, void *func, void *arg);
+
static __always_inline bool irqstack_active(void)
{
return __this_cpu_read(irq_count) != -1;
}
-#define __RUN_ON_IRQSTACK(_asm, ...) \
+#define __RUN_ON_IRQSTACK(func, arg) \
do { \
- unsigned long tos; \
- \
lockdep_assert_irqs_disabled(); \
- \
- tos = ((unsigned long)__this_cpu_read(hardirq_stack_ptr)) - 8; \
- \
- __this_cpu_add(irq_count, 1); \
- asm volatile( \
- "movq %%rsp, (%[ts]) \n" \
- "movq %[ts], %%rsp \n" \
- ASM_INSTR_BEGIN \
- _asm " \n" \
- ASM_INSTR_END \
- "popq %%rsp \n" \
- : \
- : [ts] "r" (tos), ##__VA_ARGS__ \
- : "memory" \
- ); \
+ call_on_stack(__this_cpu_read(hardirq_stack_ptr), func, arg); \
__this_cpu_sub(irq_count, 1); \
} while (0)
-/*
- * Macro to emit code for running @func on the irq stack.
- */
#define RUN_ON_IRQSTACK(func) \
- __RUN_ON_IRQSTACK("call" __ASM_FORM(func))
+ __RUN_ON_IRQSTACK(func, NULL)
#define RUN_ON_IRQSTACK_ARG1(func, arg) \
- __RUN_ON_IRQSTACK("call" __ASM_FORM(func), "D" (arg))
+ __RUN_ON_IRQSTACK(func, arg)
#else /* CONFIG_X86_64 */
static __always_inline bool irqstack_active(void) { return false; }
diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c
index c41b0a2859d7..30b6ddf64dc7 100644
--- a/arch/x86/kernel/irq_64.c
+++ b/arch/x86/kernel/irq_64.c
@@ -74,7 +74,7 @@ int irq_init_percpu_irqstack(unsigned int cpu)
static noinstr void handle_irq_on_irqstack(struct irq_desc *desc)
{
- __RUN_ON_IRQSTACK(CALL_NOSPEC, THUNK_TARGET(desc->handle_irq), "D" (desc));
+ RUN_ON_IRQSTACK_ARG1(desc->handle_irq, desc);
}
void handle_irq(struct irq_desc *desc, struct pt_regs *regs)
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles
2020-05-07 18:30 ` Josh Poimboeuf
@ 2020-05-07 21:24 ` Thomas Gleixner
2020-05-08 10:12 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2020-05-07 21:24 UTC (permalink / raw)
To: Josh Poimboeuf, Peter Zijlstra; +Cc: x86, linux-kernel, luto
Josh Poimboeuf <jpoimboe@redhat.com> writes:
> On Thu, May 07, 2020 at 07:38:09PM +0200, Peter Zijlstra wrote:
>> On Thu, May 07, 2020 at 06:10:23PM +0200, Peter Zijlstra wrote:
>> Much simpler, also works.
>
> Doing the stack switch in inline asm is just nasty.
Matter of personal preference :)
> Also, a frame pointer makes this SO much easier for ORC/objtool, no
> funky hints needed. In fact maybe we can get rid of the indirect hint
> things altogether, which means even more deleted code.
>
> This is much cleaner, and it boots:
Well, yes. I thought about that and it clearly works fine. There is a
downside because it adds the indirect call to all system vectors.
The device interrupt part is fine, it has an indirect call already, but
for system vectors and in particular the rescheduling IPI its bad. Not
bad because of the indirect call per se, but bad because of the
ratpoutine mitigation mess.
My other alternative was to emit switcheroo into ASM for each system
vector. Not pretty either
But over our IRC conversation I came up with a 3rd variant:
For most of the vectors the indirect call overhead is just noise, so
we can run them through the ASM switcher, but for the resched IPI
we can just use a separate direct call stub in ASM.
I can live with that. I might have to pay up for Peter's headaches to
teach objtool, but that's a different story. Let me check how many beers
he owes me first ...
Thanks,
tglx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles
2020-05-07 21:24 ` Thomas Gleixner
@ 2020-05-08 10:12 ` Peter Zijlstra
2020-05-08 12:26 ` Thomas Gleixner
0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-08 10:12 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Josh Poimboeuf, x86, linux-kernel, luto
On Thu, May 07, 2020 at 11:24:49PM +0200, Thomas Gleixner wrote:
> But over our IRC conversation I came up with a 3rd variant:
>
> For most of the vectors the indirect call overhead is just noise, so
> we can run them through the ASM switcher, but for the resched IPI
> we can just use a separate direct call stub in ASM.
Are we sure the rat-poison crap is noise for all the other system
vectors? I suppose it is for most since they'll do indirect calls
themselves anyway, right?
> I can live with that. I might have to pay up for Peter's headaches to
> teach objtool, but that's a different story. Let me check how many beers
> he owes me first ...
We're going to be so massively drunk if we ever settle that :-) For now
I'll just have to live with knowing more about the unwinders.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles
2020-05-08 10:12 ` Peter Zijlstra
@ 2020-05-08 12:26 ` Thomas Gleixner
2020-05-08 12:40 ` Peter Zijlstra
2020-05-08 12:44 ` Thomas Gleixner
0 siblings, 2 replies; 11+ messages in thread
From: Thomas Gleixner @ 2020-05-08 12:26 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Josh Poimboeuf, x86, linux-kernel, luto
Peter Zijlstra <peterz@infradead.org> writes:
> On Thu, May 07, 2020 at 11:24:49PM +0200, Thomas Gleixner wrote:
>> But over our IRC conversation I came up with a 3rd variant:
>>
>> For most of the vectors the indirect call overhead is just noise, so
>> we can run them through the ASM switcher, but for the resched IPI
>> we can just use a separate direct call stub in ASM.
>
> Are we sure the rat-poison crap is noise for all the other system
> vectors? I suppose it is for most since they'll do indirect calls
> themselves anyway, right?
We have different categories:
1) Uninteresting
SPURIOUS_APIC_VECTOR, ERROR_APIC_VECTOR, THERMAL_APIC_VECTOR,
THRESHOLD_APIC_VECTOR, REBOOT_VECTOR, DEFERRED_ERROR_VECTOR
2) Indirect call poisoned
LOCAL_TIMER_VECTOR
X86_PLATFORM_IPI_VECTOR
IRQ_WORK_VECTOR
HYPERV_STIMER0_VECTOR
HYPERVISOR_CALLBACK_VECTOR
POSTED_INTERRUPT_WAKEUP_VECTOR.
CALL_FUNCTION_VECTOR
CALL_FUNCTION_SINGLE_VECTOR
3) Quick
RESCHEDULE_VECTOR
POSTED_INTR_VECTOR
POSTED_INTR_NESTED_VECTOR
These two postit ones are weird because they are both empty and
just increment different irq counts.
HYPERV_REENLIGHTENMENT_VECTOR
schedules delayed work, i,e. arms a timer which should be
straight forward, but does it matter?
4) Others
UV_BAU_MESSAGE - The TLB flushes are probably more expensive than
ratpoutine
Hmm?
Thanks,
tglx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles
2020-05-08 12:26 ` Thomas Gleixner
@ 2020-05-08 12:40 ` Peter Zijlstra
2020-05-08 12:44 ` Thomas Gleixner
1 sibling, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-08 12:40 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Josh Poimboeuf, x86, linux-kernel, luto
On Fri, May 08, 2020 at 02:26:32PM +0200, Thomas Gleixner wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
>
> > On Thu, May 07, 2020 at 11:24:49PM +0200, Thomas Gleixner wrote:
> >> But over our IRC conversation I came up with a 3rd variant:
> >>
> >> For most of the vectors the indirect call overhead is just noise, so
> >> we can run them through the ASM switcher, but for the resched IPI
> >> we can just use a separate direct call stub in ASM.
> >
> > Are we sure the rat-poison crap is noise for all the other system
> > vectors? I suppose it is for most since they'll do indirect calls
> > themselves anyway, right?
>
> We have different categories:
>
> 1) Uninteresting
>
> SPURIOUS_APIC_VECTOR, ERROR_APIC_VECTOR, THERMAL_APIC_VECTOR,
> THRESHOLD_APIC_VECTOR, REBOOT_VECTOR, DEFERRED_ERROR_VECTOR
>
> 2) Indirect call poisoned
>
> LOCAL_TIMER_VECTOR
> X86_PLATFORM_IPI_VECTOR
> IRQ_WORK_VECTOR
> HYPERV_STIMER0_VECTOR
> HYPERVISOR_CALLBACK_VECTOR
> POSTED_INTERRUPT_WAKEUP_VECTOR.
> CALL_FUNCTION_VECTOR
> CALL_FUNCTION_SINGLE_VECTOR
>
> 3) Quick
>
> RESCHEDULE_VECTOR
>
> POSTED_INTR_VECTOR
> POSTED_INTR_NESTED_VECTOR
>
> These two postit ones are weird because they are both empty and
> just increment different irq counts.
>
> HYPERV_REENLIGHTENMENT_VECTOR
>
> schedules delayed work, i,e. arms a timer which should be
> straight forward, but does it matter?
>
> 4) Others
>
> UV_BAU_MESSAGE - The TLB flushes are probably more expensive than
> ratpoutine
>
> Hmm?
As we just agreed on IRC, 3) can run without changing stack, and then
the rest can use the indirect thing.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles
2020-05-08 12:26 ` Thomas Gleixner
2020-05-08 12:40 ` Peter Zijlstra
@ 2020-05-08 12:44 ` Thomas Gleixner
1 sibling, 0 replies; 11+ messages in thread
From: Thomas Gleixner @ 2020-05-08 12:44 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Josh Poimboeuf, x86, linux-kernel, luto, Wei Liu, Michael Kelley
Thomas Gleixner <tglx@linutronix.de> writes:
> Peter Zijlstra <peterz@infradead.org> writes:
>> Are we sure the rat-poison crap is noise for all the other system
>> vectors? I suppose it is for most since they'll do indirect calls
>> themselves anyway, right?
>
> 3) Quick
>
> RESCHEDULE_VECTOR
>
> POSTED_INTR_VECTOR
> POSTED_INTR_NESTED_VECTOR
>
> These two postit ones are weird because they are both empty and
> just increment different irq counts.
For those 3 it's also pointless to run them on IST stack at all.
> HYPERV_REENLIGHTENMENT_VECTOR
>
> schedules delayed work, i,e. arms a timer which should be
> straight forward, but does it matter?
This one shouldn't have an issue when running on task stack either, but
we can run it through the regular indirect path for now and switch it
over when it matters performance wise.
Thanks,
tglx
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2020-05-08 13:21 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-07 16:10 [RFC][PATCH 0/3] objtool vs irqstack swizzles Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 1/3] x86/entry: Collapse the 3 IRQ stack instances into a single macro Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 2/3] x86/entry: Provide ASM_INSTR_{BEGIN,END} Peter Zijlstra
2020-05-07 16:10 ` [RFC][PATCH 3/3] x86/entry, ORC: Teach objtool/unwind_orc about stack irq swizzles Peter Zijlstra
2020-05-07 17:38 ` Peter Zijlstra
2020-05-07 18:30 ` Josh Poimboeuf
2020-05-07 21:24 ` Thomas Gleixner
2020-05-08 10:12 ` Peter Zijlstra
2020-05-08 12:26 ` Thomas Gleixner
2020-05-08 12:40 ` Peter Zijlstra
2020-05-08 12:44 ` Thomas Gleixner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).