* [PATCH 0/2] Stackleak for arm64 [not found] <dc76a745-3fa7-4023-dcc1-3df18c9461a6@redhat.com> @ 2018-02-21 1:13 ` Laura Abbott 2018-02-21 1:13 ` [PATCH 1/2] stackleak: Update " Laura Abbott ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Laura Abbott @ 2018-02-21 1:13 UTC (permalink / raw) To: Alexander Popov, Kees Cook, Mark Rutland, Ard Biesheuvel Cc: Laura Abbott, kernel-hardening, linux-arm-kernel, linux-kernel This is the arm64 version of the STACKLEAK plugin originall from grsecurity. See https://marc.info/?l=kernel-hardening&m=151880470609808 for the full x86 version. This is based on top of Kees' branch for stackleak and has been cleaned up to use a few macros from that branch. Comments welcome, if there are no major objections Kees will queue this up to get some CI testing. This passed both of the LKDTM tests. Laura Abbott (2): stackleak: Update for arm64 arm64: Clear the stack arch/arm64/Kconfig | 1 + arch/arm64/include/asm/processor.h | 6 ++ arch/arm64/kernel/asm-offsets.c | 3 + arch/arm64/kernel/entry.S | 108 +++++++++++++++++++++++++++++++++ arch/arm64/kernel/process.c | 16 +++++ drivers/firmware/efi/libstub/Makefile | 3 +- scripts/Makefile.gcc-plugins | 5 +- scripts/gcc-plugins/stackleak_plugin.c | 5 ++ 8 files changed, 145 insertions(+), 2 deletions(-) -- 2.14.3 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] stackleak: Update for arm64 2018-02-21 1:13 ` [PATCH 0/2] Stackleak for arm64 Laura Abbott @ 2018-02-21 1:13 ` Laura Abbott 2018-02-22 16:58 ` Will Deacon 2018-02-21 1:13 ` [PATCH 2/2] arm64: Clear the stack Laura Abbott 2018-02-21 14:48 ` [PATCH 0/2] Stackleak for arm64 Alexander Popov 2 siblings, 1 reply; 15+ messages in thread From: Laura Abbott @ 2018-02-21 1:13 UTC (permalink / raw) To: Alexander Popov, Kees Cook, Mark Rutland, Ard Biesheuvel Cc: Laura Abbott, kernel-hardening, linux-arm-kernel, linux-kernel arm64 has another layer of indirection in the RTL. Account for this in the plugin. Signed-off-by: Laura Abbott <labbott@redhat.com> --- scripts/gcc-plugins/stackleak_plugin.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c index 6fc991c98d8b..7dfaa027423f 100644 --- a/scripts/gcc-plugins/stackleak_plugin.c +++ b/scripts/gcc-plugins/stackleak_plugin.c @@ -244,6 +244,11 @@ static unsigned int stackleak_final_execute(void) * that insn. */ body = PATTERN(insn); + /* arm64 is different */ + if (GET_CODE(body) == PARALLEL) { + body = XEXP(body, 0); + body = XEXP(body, 0); + } if (GET_CODE(body) != CALL) continue; -- 2.14.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] stackleak: Update for arm64 2018-02-21 1:13 ` [PATCH 1/2] stackleak: Update " Laura Abbott @ 2018-02-22 16:58 ` Will Deacon 2018-02-22 19:22 ` Alexander Popov 2018-02-22 19:38 ` Laura Abbott 0 siblings, 2 replies; 15+ messages in thread From: Will Deacon @ 2018-02-22 16:58 UTC (permalink / raw) To: Laura Abbott Cc: Alexander Popov, Kees Cook, Mark Rutland, Ard Biesheuvel, linux-kernel, linux-arm-kernel, kernel-hardening, richard.sandiford Hi Laura, On Tue, Feb 20, 2018 at 05:13:02PM -0800, Laura Abbott wrote: > > arm64 has another layer of indirection in the RTL. > Account for this in the plugin. > > Signed-off-by: Laura Abbott <labbott@redhat.com> > --- > scripts/gcc-plugins/stackleak_plugin.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c > index 6fc991c98d8b..7dfaa027423f 100644 > --- a/scripts/gcc-plugins/stackleak_plugin.c > +++ b/scripts/gcc-plugins/stackleak_plugin.c > @@ -244,6 +244,11 @@ static unsigned int stackleak_final_execute(void) > * that insn. > */ > body = PATTERN(insn); > + /* arm64 is different */ > + if (GET_CODE(body) == PARALLEL) { > + body = XEXP(body, 0); > + body = XEXP(body, 0); > + } Like most kernel developers, I don't know the first thing about GCC internals so I asked our GCC team and Richard (CC'd) reckons this should be: if (GET_CODE(body) == PARALLEL) body = XVECEXP(body, 0, 0); instead of the hunk above. Can you give that a go instead, please? Cheers, Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] stackleak: Update for arm64 2018-02-22 16:58 ` Will Deacon @ 2018-02-22 19:22 ` Alexander Popov 2018-02-27 10:21 ` Richard Sandiford 2018-02-22 19:38 ` Laura Abbott 1 sibling, 1 reply; 15+ messages in thread From: Alexander Popov @ 2018-02-22 19:22 UTC (permalink / raw) To: Will Deacon, Laura Abbott Cc: Kees Cook, Mark Rutland, Ard Biesheuvel, linux-kernel, linux-arm-kernel, kernel-hardening, richard.sandiford Hello Will, Richard and GCC folks! On 22.02.2018 19:58, Will Deacon wrote: > On Tue, Feb 20, 2018 at 05:13:02PM -0800, Laura Abbott wrote: >> >> arm64 has another layer of indirection in the RTL. >> Account for this in the plugin. >> >> Signed-off-by: Laura Abbott <labbott@redhat.com> >> --- >> scripts/gcc-plugins/stackleak_plugin.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c >> index 6fc991c98d8b..7dfaa027423f 100644 >> --- a/scripts/gcc-plugins/stackleak_plugin.c >> +++ b/scripts/gcc-plugins/stackleak_plugin.c >> @@ -244,6 +244,11 @@ static unsigned int stackleak_final_execute(void) >> * that insn. >> */ >> body = PATTERN(insn); >> + /* arm64 is different */ >> + if (GET_CODE(body) == PARALLEL) { >> + body = XEXP(body, 0); >> + body = XEXP(body, 0); >> + } > > Like most kernel developers, I don't know the first thing about GCC internals > so I asked our GCC team and Richard (CC'd) reckons this should be: > > if (GET_CODE(body) == PARALLEL) > body = XVECEXP(body, 0, 0); > > instead of the hunk above. Can you give that a go instead, please? Thanks a lot! Would you be so kind to take a look at the whole STACKLEAK plugin? http://www.openwall.com/lists/kernel-hardening/2018/02/16/4 https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/gcc-plugin/stackleak&id=57a0a6763b12e82dd462593d0f42be610e93cdc9 It's not very big. I documented it in detail. I would be really grateful for the review! Best regards, Alexander ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] stackleak: Update for arm64 2018-02-22 19:22 ` Alexander Popov @ 2018-02-27 10:21 ` Richard Sandiford 2018-02-28 15:09 ` Alexander Popov 0 siblings, 1 reply; 15+ messages in thread From: Richard Sandiford @ 2018-02-27 10:21 UTC (permalink / raw) To: Alexander Popov Cc: Will Deacon, Laura Abbott, Kees Cook, Mark Rutland, Ard Biesheuvel, linux-kernel, linux-arm-kernel, kernel-hardening Hi Alexander, Sorry for the slow reply, been caught up in an office move. Alexander Popov <alex.popov@linux.com> writes: > Hello Will, Richard and GCC folks! > > On 22.02.2018 19:58, Will Deacon wrote: >> On Tue, Feb 20, 2018 at 05:13:02PM -0800, Laura Abbott wrote: >>> >>> arm64 has another layer of indirection in the RTL. >>> Account for this in the plugin. >>> >>> Signed-off-by: Laura Abbott <labbott@redhat.com> >>> --- >>> scripts/gcc-plugins/stackleak_plugin.c | 5 +++++ >>> 1 file changed, 5 insertions(+) >>> >>> diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c >>> index 6fc991c98d8b..7dfaa027423f 100644 >>> --- a/scripts/gcc-plugins/stackleak_plugin.c >>> +++ b/scripts/gcc-plugins/stackleak_plugin.c >>> @@ -244,6 +244,11 @@ static unsigned int stackleak_final_execute(void) >>> * that insn. >>> */ >>> body = PATTERN(insn); >>> + /* arm64 is different */ >>> + if (GET_CODE(body) == PARALLEL) { >>> + body = XEXP(body, 0); >>> + body = XEXP(body, 0); >>> + } >> >> Like most kernel developers, I don't know the first thing about GCC internals >> so I asked our GCC team and Richard (CC'd) reckons this should be: >> >> if (GET_CODE(body) == PARALLEL) >> body = XVECEXP(body, 0, 0); >> >> instead of the hunk above. Can you give that a go instead, please? > > Thanks a lot! > > Would you be so kind to take a look at the whole STACKLEAK plugin? > http://www.openwall.com/lists/kernel-hardening/2018/02/16/4 > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/gcc-plugin/stackleak&id=57a0a6763b12e82dd462593d0f42be610e93cdc9 > > It's not very big. I documented it in detail. I would be really grateful for the > review! Looks good to me FWIW. Just a couple of minor things: > + /* > + * 1. Loop through the GIMPLE statements in each of cfun basic blocks. > + * cfun is a global variable which represents the function that is > + * currently processed. > + */ > + FOR_EACH_BB_FN(bb, cfun) { > + for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) { > + gimple stmt; > + > + stmt = gsi_stmt(gsi); > + > + /* Leaf function is a function which makes no calls */ > + if (is_gimple_call(stmt)) > + is_leaf = false; It's probably not going to matter in practice, but it might be worth emphasising in the comments that this leafness is only approximate. It will sometimes be a false positive, because we could still end up creating calls to libgcc functions from non-call statements (or for target-specific reasons). It can also be a false negative, since call statements can be to built-in or internal functions that end up being open-coded. > + /* > + * The stackleak_final pass should be executed before the "final" pass, > + * which turns the RTL (Register Transfer Language) into assembly. > + */ > + PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE); This might be too late, since it happens e.g. after addresses have been calculated for branch ranges, and after machine-specific passes (e.g. bundling on ia64). The stack size is final after reload, so inserting the pass after that might be better. Thanks, Richard ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] stackleak: Update for arm64 2018-02-27 10:21 ` Richard Sandiford @ 2018-02-28 15:09 ` Alexander Popov 2018-03-01 10:33 ` Richard Sandiford 0 siblings, 1 reply; 15+ messages in thread From: Alexander Popov @ 2018-02-28 15:09 UTC (permalink / raw) To: Will Deacon, Laura Abbott, Kees Cook, Mark Rutland, Ard Biesheuvel, linux-kernel, linux-arm-kernel, kernel-hardening, richard.sandiford On 27.02.2018 13:21, Richard Sandiford wrote: > Hi Alexander, > > Sorry for the slow reply, been caught up in an office move. Thank you very much for the review, Richard! > Alexander Popov <alex.popov@linux.com> writes: >> Would you be so kind to take a look at the whole STACKLEAK plugin? >> http://www.openwall.com/lists/kernel-hardening/2018/02/16/4 >> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/gcc-plugin/stackleak&id=57a0a6763b12e82dd462593d0f42be610e93cdc9 >> >> It's not very big. I documented it in detail. I would be really grateful for the >> review! > > Looks good to me FWIW. Just a couple of minor things: > >> + /* >> + * 1. Loop through the GIMPLE statements in each of cfun basic blocks. >> + * cfun is a global variable which represents the function that is >> + * currently processed. >> + */ >> + FOR_EACH_BB_FN(bb, cfun) { >> + for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) { >> + gimple stmt; >> + >> + stmt = gsi_stmt(gsi); >> + >> + /* Leaf function is a function which makes no calls */ >> + if (is_gimple_call(stmt)) >> + is_leaf = false; > > It's probably not going to matter in practice, but it might be worth > emphasising in the comments that this leafness is only approximate. That's important, thank you! May I ask why you think it's not going to matter in practice? > It will sometimes be a false positive, because we could still > end up creating calls to libgcc functions from non-call statements > (or for target-specific reasons). It can also be a false negative, > since call statements can be to built-in or internal functions that > end up being open-coded. Oh, that raises the question: how does this leafness inaccuracy affect in my particular case? is_leaf is currently used for finding the special cases to skip the track_stack() call insertion: /* * Special cases to skip the instrumentation. * * Taking the address of static inline functions materializes them, * but we mustn't instrument some of them as the resulting stack * alignment required by the function call ABI will break other * assumptions regarding the expected (but not otherwise enforced) * register clobbering ABI. * * Case in point: native_save_fl on amd64 when optimized for size * clobbers rdx if it were instrumented here. * * TODO: any more special cases? */ if (is_leaf && !TREE_PUBLIC(current_function_decl) && DECL_DECLARED_INLINE_P(current_function_decl)) { return 0; } And now it seems to me that the stackleak plugin should not instrument all static inline functions, regardless of is_leaf. Do you agree? >> + /* >> + * The stackleak_final pass should be executed before the "final" pass, >> + * which turns the RTL (Register Transfer Language) into assembly. >> + */ >> + PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE); > > This might be too late, since it happens e.g. after addresses have > been calculated for branch ranges, and after machine-specific passes > (e.g. bundling on ia64). > > The stack size is final after reload, so inserting the pass after that > might be better. Thanks for that notice. May I ask for the additional clarification? I specified -fdump-passes and see a lot of passes between reload and final: ... rtl-sched1 : OFF rtl-ira : ON rtl-reload : ON rtl-vzeroupper : OFF *all-postreload : OFF rtl-postreload : OFF rtl-gcse2 : OFF rtl-split2 : ON rtl-ree : ON rtl-cmpelim : OFF rtl-btl1 : OFF rtl-pro_and_epilogue : ON rtl-dse2 : ON rtl-csa : ON rtl-jump2 : ON rtl-compgotos : ON rtl-sched_fusion : OFF rtl-peephole2 : ON rtl-ce3 : ON rtl-rnreg : OFF rtl-cprop_hardreg : ON rtl-rtl_dce : ON rtl-bbro : ON rtl-btl2 : OFF *leaf_regs : ON rtl-split4 : ON rtl-sched2 : ON *stack_regs : ON rtl-split3 : OFF rtl-stack : ON *all-late_compilation : OFF rtl-alignments : ON rtl-vartrack : ON *free_cfg : ON rtl-mach : ON rtl-barriers : ON rtl-dbr : OFF rtl-split5 : OFF rtl-eh_ranges : OFF rtl-shorten : ON rtl-nothrow : ON rtl-dwarf2 : ON rtl-stackleak_final : ON rtl-final : ON rtl-dfinish : ON clean_state : ON Where exactly would you recommend me to insert the stackleak_final pass, which removes the unneeded track_stack() calls? Best regards, Alexander ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] stackleak: Update for arm64 2018-02-28 15:09 ` Alexander Popov @ 2018-03-01 10:33 ` Richard Sandiford 2018-03-02 11:14 ` Alexander Popov 0 siblings, 1 reply; 15+ messages in thread From: Richard Sandiford @ 2018-03-01 10:33 UTC (permalink / raw) To: Alexander Popov Cc: Will Deacon, Laura Abbott, Kees Cook, Mark Rutland, Ard Biesheuvel, linux-kernel, linux-arm-kernel, kernel-hardening Alexander Popov <alex.popov@linux.com> writes: > On 27.02.2018 13:21, Richard Sandiford wrote: >> Hi Alexander, >> >> Sorry for the slow reply, been caught up in an office move. > > Thank you very much for the review, Richard! > >> Alexander Popov <alex.popov@linux.com> writes: >>> Would you be so kind to take a look at the whole STACKLEAK plugin? >>> http://www.openwall.com/lists/kernel-hardening/2018/02/16/4 >>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/gcc-plugin/stackleak&id=57a0a6763b12e82dd462593d0f42be610e93cdc9 >>> >>> It's not very big. I documented it in detail. I would be really >>> grateful for the >>> review! >> >> Looks good to me FWIW. Just a couple of minor things: >> >>> + /* >>> + * 1. Loop through the GIMPLE statements in each of cfun basic blocks. >>> + * cfun is a global variable which represents the function that is >>> + * currently processed. >>> + */ >>> + FOR_EACH_BB_FN(bb, cfun) { >>> + for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) { >>> + gimple stmt; >>> + >>> + stmt = gsi_stmt(gsi); >>> + >>> + /* Leaf function is a function which makes no calls */ >>> + if (is_gimple_call(stmt)) >>> + is_leaf = false; >> >> It's probably not going to matter in practice, but it might be worth >> emphasising in the comments that this leafness is only approximate. > > That's important, thank you! May I ask why you think it's not going to matter in > practice? I just thought the kind of calls it misses are going to have very shallow frames, but from what you said later I guess that isn't the point. It also might be a bit too hand-wavy for something like this :-) >> It will sometimes be a false positive, because we could still >> end up creating calls to libgcc functions from non-call statements >> (or for target-specific reasons). It can also be a false negative, >> since call statements can be to built-in or internal functions that >> end up being open-coded. > > Oh, that raises the question: how does this leafness inaccuracy affect in my > particular case? > > is_leaf is currently used for finding the special cases to skip the > track_stack() call insertion: > > /* > * Special cases to skip the instrumentation. > * > * Taking the address of static inline functions materializes them, > * but we mustn't instrument some of them as the resulting stack > * alignment required by the function call ABI will break other > * assumptions regarding the expected (but not otherwise enforced) > * register clobbering ABI. > * > * Case in point: native_save_fl on amd64 when optimized for size > * clobbers rdx if it were instrumented here. > * > * TODO: any more special cases? > */ > if (is_leaf && > !TREE_PUBLIC(current_function_decl) && > DECL_DECLARED_INLINE_P(current_function_decl)) { > return 0; > } > > > And now it seems to me that the stackleak plugin should not instrument all > static inline functions, regardless of is_leaf. Do you agree? OK. I'd missed that this was just a heuristic to detect certain kinds of linux function, so it's probably fine as it is. Not sure whether it's safe to punt for general static inline functions. E.g. couldn't you have a static inline function that just provides a more convenient interface to another function? But I guess it's a linux-specific heuristic, so I can't really say. TBH the paravirt save_fl stuff seems like dancing on the edge, but that's another story. :-) >>> + /* >>> + * The stackleak_final pass should be executed before the "final" pass, >>> + * which turns the RTL (Register Transfer Language) into assembly. >>> + */ >>> + PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE); >> >> This might be too late, since it happens e.g. after addresses have >> been calculated for branch ranges, and after machine-specific passes >> (e.g. bundling on ia64). >> >> The stack size is final after reload, so inserting the pass after that >> might be better. > > Thanks for that notice. May I ask for the additional clarification? > > I specified -fdump-passes and see a lot of passes between reload and final: > ... > rtl-sched1 : OFF > rtl-ira : ON > rtl-reload : ON > rtl-vzeroupper : OFF > *all-postreload : OFF > rtl-postreload : OFF > rtl-gcse2 : OFF > rtl-split2 : ON > rtl-ree : ON > rtl-cmpelim : OFF > rtl-btl1 : OFF > rtl-pro_and_epilogue : ON > rtl-dse2 : ON > rtl-csa : ON > rtl-jump2 : ON > rtl-compgotos : ON > rtl-sched_fusion : OFF > rtl-peephole2 : ON > rtl-ce3 : ON > rtl-rnreg : OFF > rtl-cprop_hardreg : ON > rtl-rtl_dce : ON > rtl-bbro : ON > rtl-btl2 : OFF > *leaf_regs : ON > rtl-split4 : ON > rtl-sched2 : ON > *stack_regs : ON > rtl-split3 : OFF > rtl-stack : ON > *all-late_compilation : OFF > rtl-alignments : ON > rtl-vartrack : ON > *free_cfg : ON > rtl-mach : ON > rtl-barriers : ON > rtl-dbr : OFF > rtl-split5 : OFF > rtl-eh_ranges : OFF > rtl-shorten : ON > rtl-nothrow : ON > rtl-dwarf2 : ON > rtl-stackleak_final : ON > rtl-final : ON > rtl-dfinish : ON > clean_state : ON > > Where exactly would you recommend me to insert the stackleak_final pass, which > removes the unneeded track_stack() calls? Directly after rtl-reload seems best. That's the first point at which the frame size is final, and reload is one of the few rtl passes that always runs. Doing it there could also help with things like shrink wrapping (part of rtl-pro_and_epilogue). Thanks, Richard ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] stackleak: Update for arm64 2018-03-01 10:33 ` Richard Sandiford @ 2018-03-02 11:14 ` Alexander Popov 0 siblings, 0 replies; 15+ messages in thread From: Alexander Popov @ 2018-03-02 11:14 UTC (permalink / raw) To: Will Deacon, Laura Abbott, Kees Cook, Mark Rutland, Ard Biesheuvel, linux-kernel, linux-arm-kernel, kernel-hardening, richard.sandiford Cc: PaX Team Thanks for your reply, Richard! On 01.03.2018 13:33, Richard Sandiford wrote: > Alexander Popov <alex.popov@linux.com> writes: >> On 27.02.2018 13:21, Richard Sandiford wrote: >>> Alexander Popov <alex.popov@linux.com> writes: >>>> Would you be so kind to take a look at the whole STACKLEAK plugin? >>>> http://www.openwall.com/lists/kernel-hardening/2018/02/16/4 >>>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/gcc-plugin/stackleak&id=57a0a6763b12e82dd462593d0f42be610e93cdc9 >>>> >>>> It's not very big. I documented it in detail. I would be really >>>> grateful for the >>>> review! >>> >>> Looks good to me FWIW. Just a couple of minor things: >>> >>>> + /* >>>> + * 1. Loop through the GIMPLE statements in each of cfun basic blocks. >>>> + * cfun is a global variable which represents the function that is >>>> + * currently processed. >>>> + */ >>>> + FOR_EACH_BB_FN(bb, cfun) { >>>> + for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) { >>>> + gimple stmt; >>>> + >>>> + stmt = gsi_stmt(gsi); >>>> + >>>> + /* Leaf function is a function which makes no calls */ >>>> + if (is_gimple_call(stmt)) >>>> + is_leaf = false; >>> >>> It's probably not going to matter in practice, but it might be worth >>> emphasising in the comments that this leafness is only approximate. >> >> That's important, thank you! May I ask why you think it's not going to matter in >> practice? > > I just thought the kind of calls it misses are going to have very > shallow frames, but from what you said later I guess that isn't the > point. It also might be a bit too hand-wavy for something like this :-) > >>> It will sometimes be a false positive, because we could still >>> end up creating calls to libgcc functions from non-call statements >>> (or for target-specific reasons). It can also be a false negative, >>> since call statements can be to built-in or internal functions that >>> end up being open-coded. >> >> Oh, that raises the question: how does this leafness inaccuracy affect in my >> particular case? >> >> is_leaf is currently used for finding the special cases to skip the >> track_stack() call insertion: >> >> /* >> * Special cases to skip the instrumentation. >> * >> * Taking the address of static inline functions materializes them, >> * but we mustn't instrument some of them as the resulting stack >> * alignment required by the function call ABI will break other >> * assumptions regarding the expected (but not otherwise enforced) >> * register clobbering ABI. >> * >> * Case in point: native_save_fl on amd64 when optimized for size >> * clobbers rdx if it were instrumented here. >> * >> * TODO: any more special cases? >> */ >> if (is_leaf && >> !TREE_PUBLIC(current_function_decl) && >> DECL_DECLARED_INLINE_P(current_function_decl)) { >> return 0; >> } >> >> >> And now it seems to me that the stackleak plugin should not instrument all >> static inline functions, regardless of is_leaf. Do you agree? > > OK. I'd missed that this was just a heuristic to detect certain kinds > of linux function, so it's probably fine as it is. > > Not sure whether it's safe to punt for general static inline functions. > E.g. couldn't you have a static inline function that just provides a > more convenient interface to another function? But I guess it's a > linux-specific heuristic, so I can't really say. Huh, I got the insight! I think that the current approach (originally by PaX Team) should work fine despite the false positives which you described: If some static inline function already does explicit calls (so is_leaf is false), adding the track_stack() call will not introduce anything special that can break the aforementioned register clobbering ABI in that function. Does it sound reasonable? However, I don't know what to with false negatives. > TBH the paravirt save_fl stuff seems like dancing on the edge, > but that's another story. :-) That's interesting. Could you elaborate on that? >>>> + /* >>>> + * The stackleak_final pass should be executed before the "final" pass, >>>> + * which turns the RTL (Register Transfer Language) into assembly. >>>> + */ >>>> + PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE); >>> >>> This might be too late, since it happens e.g. after addresses have >>> been calculated for branch ranges, and after machine-specific passes >>> (e.g. bundling on ia64). >>> >>> The stack size is final after reload, so inserting the pass after that >>> might be better. >> >> Thanks for that notice. May I ask for the additional clarification? >> >> I specified -fdump-passes and see a lot of passes between reload and final: ... >> >> Where exactly would you recommend me to insert the stackleak_final pass, which >> removes the unneeded track_stack() calls? > > Directly after rtl-reload seems best. That's the first point at which > the frame size is final, and reload is one of the few rtl passes that > always runs. Doing it there could also help with things like shrink > wrapping (part of rtl-pro_and_epilogue). Thanks a lot for your detailed answer. I'll follow your advice in the next version of the patch series. Best regards, Alexander ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] stackleak: Update for arm64 2018-02-22 16:58 ` Will Deacon 2018-02-22 19:22 ` Alexander Popov @ 2018-02-22 19:38 ` Laura Abbott 1 sibling, 0 replies; 15+ messages in thread From: Laura Abbott @ 2018-02-22 19:38 UTC (permalink / raw) To: Will Deacon Cc: Alexander Popov, Kees Cook, Mark Rutland, Ard Biesheuvel, linux-kernel, linux-arm-kernel, kernel-hardening, richard.sandiford On 02/22/2018 08:58 AM, Will Deacon wrote: > Hi Laura, > > On Tue, Feb 20, 2018 at 05:13:02PM -0800, Laura Abbott wrote: >> >> arm64 has another layer of indirection in the RTL. >> Account for this in the plugin. >> >> Signed-off-by: Laura Abbott <labbott@redhat.com> >> --- >> scripts/gcc-plugins/stackleak_plugin.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c >> index 6fc991c98d8b..7dfaa027423f 100644 >> --- a/scripts/gcc-plugins/stackleak_plugin.c >> +++ b/scripts/gcc-plugins/stackleak_plugin.c >> @@ -244,6 +244,11 @@ static unsigned int stackleak_final_execute(void) >> * that insn. >> */ >> body = PATTERN(insn); >> + /* arm64 is different */ >> + if (GET_CODE(body) == PARALLEL) { >> + body = XEXP(body, 0); >> + body = XEXP(body, 0); >> + } > > Like most kernel developers, I don't know the first thing about GCC internals > so I asked our GCC team and Richard (CC'd) reckons this should be: > > if (GET_CODE(body) == PARALLEL) > body = XVECEXP(body, 0, 0); > > instead of the hunk above. Can you give that a go instead, please? > > Cheers, > > Will > Yep, seems to work fine and makes sense from my understanding of gcc internals. I'll fix it up for the next version. Thanks for the review! Laura ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/2] arm64: Clear the stack 2018-02-21 1:13 ` [PATCH 0/2] Stackleak for arm64 Laura Abbott 2018-02-21 1:13 ` [PATCH 1/2] stackleak: Update " Laura Abbott @ 2018-02-21 1:13 ` Laura Abbott 2018-02-21 15:38 ` Mark Rutland 2018-02-21 14:48 ` [PATCH 0/2] Stackleak for arm64 Alexander Popov 2 siblings, 1 reply; 15+ messages in thread From: Laura Abbott @ 2018-02-21 1:13 UTC (permalink / raw) To: Alexander Popov, Kees Cook, Mark Rutland, Ard Biesheuvel Cc: Laura Abbott, kernel-hardening, linux-arm-kernel, linux-kernel Implementation of stackleak based heavily on the x86 version Signed-off-by: Laura Abbott <labbott@redhat.com> --- arch/arm64/Kconfig | 1 + arch/arm64/include/asm/processor.h | 6 ++ arch/arm64/kernel/asm-offsets.c | 3 + arch/arm64/kernel/entry.S | 108 ++++++++++++++++++++++++++++++++++ arch/arm64/kernel/process.c | 16 +++++ drivers/firmware/efi/libstub/Makefile | 3 +- scripts/Makefile.gcc-plugins | 5 +- 7 files changed, 140 insertions(+), 2 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 7381eeb7ef8e..dcadcae674a7 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -92,6 +92,7 @@ config ARM64 select HAVE_ARCH_MMAP_RND_BITS select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_SECCOMP_FILTER + select HAVE_ARCH_STACKLEAK select HAVE_ARCH_THREAD_STRUCT_WHITELIST select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRANSPARENT_HUGEPAGE diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index fce604e3e599..4b309101ac83 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -114,6 +114,12 @@ struct thread_struct { unsigned long fault_address; /* fault info */ unsigned long fault_code; /* ESR_EL1 value */ struct debug_info debug; /* debugging */ +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK + unsigned long lowest_stack; +#ifdef CONFIG_STACKLEAK_METRICS + unsigned long prev_lowest_stack; +#endif +#endif }; /* diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index 1303e04110cd..b5c6100e8b14 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -45,6 +45,9 @@ int main(void) DEFINE(TSK_TI_TTBR0, offsetof(struct task_struct, thread_info.ttbr0)); #endif DEFINE(TSK_STACK, offsetof(struct task_struct, stack)); +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK + DEFINE(TSK_TI_LOWEST_STACK, offsetof(struct task_struct, thread.lowest_stack)); +#endif BLANK(); DEFINE(THREAD_CPU_CONTEXT, offsetof(struct task_struct, thread.cpu_context)); BLANK(); diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index ec2ee720e33e..b909b436293a 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -401,6 +401,11 @@ tsk .req x28 // current thread_info .text + .macro erase_kstack +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK + bl __erase_kstack +#endif + .endm /* * Exception vectors. */ @@ -901,6 +906,7 @@ work_pending: */ ret_to_user: disable_daif + erase_kstack ldr x1, [tsk, #TSK_TI_FLAGS] and x2, x1, #_TIF_WORK_MASK cbnz x2, work_pending @@ -1337,3 +1343,105 @@ alternative_else_nop_endif ENDPROC(__sdei_asm_handler) NOKPROBE(__sdei_asm_handler) #endif /* CONFIG_ARM_SDE_INTERFACE */ + +/* + * This is what the stack looks like + * + * +---+ <- task_stack_page(p) + THREAD_SIZE + * | | + * +---+ <- task_stack_page(p) + THREAD_START_SP + * | | + * | | + * +---+ <- task_pt_regs(p) + * | | + * | | + * | | <- current_sp + * ~~~~~ + * + * ~~~~~ + * | | <- lowest_stack + * | | + * | | + * +---+ <- task_stack_page(p) + * + * This function is desgned to poison the memory between the lowest_stack + * and the current stack pointer. After clearing the stack, the lowest + * stack is reset. + */ + +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK +ENTRY(__erase_kstack) + mov x10, x0 // save x0 for the fast path + + get_thread_info x0 + ldr x1, [x0, #TSK_TI_LOWEST_STACK] + + /* get the number of bytes to check for lowest stack */ + mov x3, x1 + and x3, x3, #THREAD_SIZE - 1 + lsr x3, x3, #3 + + /* generate addresses from the bottom of the stack */ + mov x4, sp + movn x2, #THREAD_SIZE - 1 + and x1, x4, x2 + + mov x2, #STACKLEAK_POISON + + mov x5, #0 +1: + /* + * As borrowed from the x86 logic, start from the lowest_stack + * and go to the bottom to find the poison value. + * The check of 16 is to hopefully avoid false positives. + */ + cbz x3, 4f + ldr x4, [x1, x3, lsl #3] + cmp x4, x2 + csinc x5, xzr, x5, ne + tbnz x5, #STACKLEAK_POISON_CHECK_DEPTH/4, 4f // found 16 poisons? + sub x3, x3, #1 + b 1b + +4: + /* total number of bytes to poison */ + add x5, x1, x3, lsl #3 + mov x4, sp + sub x8, x4, x5 + + cmp x8, #THREAD_SIZE // sanity check the range + b.lo 5f + ASM_BUG() + +5: + /* + * We may have hit a path where the stack did not get used, + * no need to do anything here + */ + cbz x8, 7f + + sub x8, x8, #1 // don't poison the current stack pointer + + lsr x8, x8, #3 + add x3, x3, x8 + + /* + * The logic of this loop ensures the last stack word isn't + * ovewritten. + */ +6: + cbz x8, 7f + str x2, [x1, x3, lsl #3] + sub x3, x3, #1 + sub x8, x8, #1 + b 6b + + /* Reset the lowest stack to the top of the stack */ +7: + mov x1, sp + str x1, [x0, #TSK_TI_LOWEST_STACK] + + mov x0, x10 + ret +ENDPROC(__erase_kstack) +#endif diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index ad8aeb098b31..fd0528db6772 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -357,6 +357,9 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start, p->thread.cpu_context.pc = (unsigned long)ret_from_fork; p->thread.cpu_context.sp = (unsigned long)childregs; +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK + p->thread.lowest_stack = (unsigned long)task_stack_page(p); +#endif ptrace_hw_copy_thread(p); return 0; @@ -486,3 +489,16 @@ void arch_setup_new_exec(void) { current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0; } + +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK +void __used check_alloca(unsigned long size) +{ + unsigned long sp, stack_left; + + sp = current_stack_pointer; + + stack_left = sp & (THREAD_SIZE - 1); + BUG_ON(stack_left < 256 || size >= stack_left - 256); +} +EXPORT_SYMBOL(check_alloca); +#endif diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile index 7b3ba40f0745..35ebbc1b17ff 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -20,7 +20,8 @@ cflags-$(CONFIG_EFI_ARMSTUB) += -I$(srctree)/scripts/dtc/libfdt KBUILD_CFLAGS := $(cflags-y) -DDISABLE_BRANCH_PROFILING \ -D__NO_FORTIFY \ $(call cc-option,-ffreestanding) \ - $(call cc-option,-fno-stack-protector) + $(call cc-option,-fno-stack-protector) \ + $(DISABLE_STACKLEAK_PLUGIN) GCOV_PROFILE := n KASAN_SANITIZE := n diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins index 8d6070fc538f..6cc0e35d3324 100644 --- a/scripts/Makefile.gcc-plugins +++ b/scripts/Makefile.gcc-plugins @@ -37,11 +37,14 @@ ifdef CONFIG_GCC_PLUGINS gcc-plugin-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak_plugin.so gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STACKLEAK) += -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-min-size=$(CONFIG_STACKLEAK_TRACK_MIN_SIZE) + ifdef CONFIG_GCC_PLUGIN_STACKLEAK + DISABLE_STACKLEAK_PLUGIN += -fplugin-arg-stackleak_plugin-disable + endif GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR - export SANCOV_PLUGIN DISABLE_LATENT_ENTROPY_PLUGIN + export SANCOV_PLUGIN DISABLE_LATENT_ENTROPY_PLUGIN DISABLE_STACKLEAK_PLUGIN ifneq ($(PLUGINCC),) # SANCOV_PLUGIN can be only in CFLAGS_KCOV because avoid duplication. -- 2.14.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] arm64: Clear the stack 2018-02-21 1:13 ` [PATCH 2/2] arm64: Clear the stack Laura Abbott @ 2018-02-21 15:38 ` Mark Rutland 2018-02-21 23:53 ` Laura Abbott 0 siblings, 1 reply; 15+ messages in thread From: Mark Rutland @ 2018-02-21 15:38 UTC (permalink / raw) To: Laura Abbott Cc: Alexander Popov, Kees Cook, Ard Biesheuvel, kernel-hardening, linux-arm-kernel, linux-kernel Hi Laura, On Tue, Feb 20, 2018 at 05:13:03PM -0800, Laura Abbott wrote: > Implementation of stackleak based heavily on the x86 version Neat! > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S > index ec2ee720e33e..b909b436293a 100644 > --- a/arch/arm64/kernel/entry.S > +++ b/arch/arm64/kernel/entry.S > @@ -401,6 +401,11 @@ tsk .req x28 // current thread_info > > .text > > + .macro erase_kstack > +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK > + bl __erase_kstack > +#endif > + .endm > /* > * Exception vectors. > */ > @@ -901,6 +906,7 @@ work_pending: > */ > ret_to_user: > disable_daif > + erase_kstack I *think* this should happen in finish_ret_to_user a few lines down, since we can call C code if we branch to work_pending, dirtying the stack. > ldr x1, [tsk, #TSK_TI_FLAGS] > and x2, x1, #_TIF_WORK_MASK > cbnz x2, work_pending > @@ -1337,3 +1343,105 @@ alternative_else_nop_endif > ENDPROC(__sdei_asm_handler) > NOKPROBE(__sdei_asm_handler) > #endif /* CONFIG_ARM_SDE_INTERFACE */ > + > +/* > + * This is what the stack looks like > + * > + * +---+ <- task_stack_page(p) + THREAD_SIZE > + * | | > + * +---+ <- task_stack_page(p) + THREAD_START_SP > + * | | > + * | | > + * +---+ <- task_pt_regs(p) THREAD_START_SP got killed off in commit 34be98f4944f9907 as part of the VMAP_STACK rework, so this can be: +---+ <- task_stack_page(p) + THREAD_SIZE | | | | +---+ <- task_pt_regs(p) ... > + * | | > + * | | > + * | | <- current_sp > + * ~~~~~ > + * > + * ~~~~~ > + * | | <- lowest_stack > + * | | > + * | | > + * +---+ <- task_stack_page(p) > + * > + * This function is desgned to poison the memory between the lowest_stack > + * and the current stack pointer. After clearing the stack, the lowest > + * stack is reset. > + */ > + > +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK > +ENTRY(__erase_kstack) > + mov x10, x0 // save x0 for the fast path AFAICT, we only call this from ret_to_user, where x0 doesn't need to be preserved. Is that for ret_fast_syscall? In some cases, ret_fast_syscall can bypass ret_to_user and calls kernel_exit directly, so we might need a call there. > + > + get_thread_info x0 > + ldr x1, [x0, #TSK_TI_LOWEST_STACK] > + > + /* get the number of bytes to check for lowest stack */ > + mov x3, x1 > + and x3, x3, #THREAD_SIZE - 1 > + lsr x3, x3, #3 > + > + /* generate addresses from the bottom of the stack */ > + mov x4, sp > + movn x2, #THREAD_SIZE - 1 > + and x1, x4, x2 Can we replace the MOVN;AND with a single instruction to clear the low bits? e.g. mov x4, sp bic x1, x4, #THREAD_SIZE - 1 ... IIUC BIC is an alias for the bitfield instructions, though I can't recall exactly which one(s). > + > + mov x2, #STACKLEAK_POISON > + > + mov x5, #0 > +1: > + /* > + * As borrowed from the x86 logic, start from the lowest_stack > + * and go to the bottom to find the poison value. > + * The check of 16 is to hopefully avoid false positives. > + */ > + cbz x3, 4f > + ldr x4, [x1, x3, lsl #3] > + cmp x4, x2 > + csinc x5, xzr, x5, ne > + tbnz x5, #STACKLEAK_POISON_CHECK_DEPTH/4, 4f // found 16 poisons? > + sub x3, x3, #1 > + b 1b > + > +4: > + /* total number of bytes to poison */ > + add x5, x1, x3, lsl #3 > + mov x4, sp > + sub x8, x4, x5 > + > + cmp x8, #THREAD_SIZE // sanity check the range > + b.lo 5f > + ASM_BUG() > + > +5: > + /* > + * We may have hit a path where the stack did not get used, > + * no need to do anything here > + */ > + cbz x8, 7f > + > + sub x8, x8, #1 // don't poison the current stack pointer > + > + lsr x8, x8, #3 > + add x3, x3, x8 > + > + /* > + * The logic of this loop ensures the last stack word isn't > + * ovewritten. > + */ Is that to ensure that we don't clobber the word at the current sp value? > +6: > + cbz x8, 7f > + str x2, [x1, x3, lsl #3] > + sub x3, x3, #1 > + sub x8, x8, #1 > + b 6b > + > + /* Reset the lowest stack to the top of the stack */ > +7: > + mov x1, sp > + str x1, [x0, #TSK_TI_LOWEST_STACK] > + > + mov x0, x10 > + ret > +ENDPROC(__erase_kstack) > +#endif [...] > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile > index 7b3ba40f0745..35ebbc1b17ff 100644 > --- a/drivers/firmware/efi/libstub/Makefile > +++ b/drivers/firmware/efi/libstub/Makefile > @@ -20,7 +20,8 @@ cflags-$(CONFIG_EFI_ARMSTUB) += -I$(srctree)/scripts/dtc/libfdt > KBUILD_CFLAGS := $(cflags-y) -DDISABLE_BRANCH_PROFILING \ > -D__NO_FORTIFY \ > $(call cc-option,-ffreestanding) \ > - $(call cc-option,-fno-stack-protector) > + $(call cc-option,-fno-stack-protector) \ > + $(DISABLE_STACKLEAK_PLUGIN) I believe the KVM hyp code will also need to opt-out of this. Thanks, Mark. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] arm64: Clear the stack 2018-02-21 15:38 ` Mark Rutland @ 2018-02-21 23:53 ` Laura Abbott 2018-02-22 1:35 ` Laura Abbott 0 siblings, 1 reply; 15+ messages in thread From: Laura Abbott @ 2018-02-21 23:53 UTC (permalink / raw) To: Mark Rutland Cc: Alexander Popov, Kees Cook, Ard Biesheuvel, kernel-hardening, linux-arm-kernel, linux-kernel On 02/21/2018 07:38 AM, Mark Rutland wrote: > Hi Laura, > > On Tue, Feb 20, 2018 at 05:13:03PM -0800, Laura Abbott wrote: >> Implementation of stackleak based heavily on the x86 version > > Neat! > >> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S >> index ec2ee720e33e..b909b436293a 100644 >> --- a/arch/arm64/kernel/entry.S >> +++ b/arch/arm64/kernel/entry.S >> @@ -401,6 +401,11 @@ tsk .req x28 // current thread_info >> >> .text >> >> + .macro erase_kstack >> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK >> + bl __erase_kstack >> +#endif >> + .endm >> /* >> * Exception vectors. >> */ >> @@ -901,6 +906,7 @@ work_pending: >> */ >> ret_to_user: >> disable_daif >> + erase_kstack > > I *think* this should happen in finish_ret_to_user a few lines down, since we > can call C code if we branch to work_pending, dirtying the stack. > I think you're right but this didn't immediately work when I tried it. I'll have to dig into this some more. >> ldr x1, [tsk, #TSK_TI_FLAGS] >> and x2, x1, #_TIF_WORK_MASK >> cbnz x2, work_pending >> @@ -1337,3 +1343,105 @@ alternative_else_nop_endif >> ENDPROC(__sdei_asm_handler) >> NOKPROBE(__sdei_asm_handler) >> #endif /* CONFIG_ARM_SDE_INTERFACE */ >> + >> +/* >> + * This is what the stack looks like >> + * >> + * +---+ <- task_stack_page(p) + THREAD_SIZE >> + * | | >> + * +---+ <- task_stack_page(p) + THREAD_START_SP >> + * | | >> + * | | >> + * +---+ <- task_pt_regs(p) > > THREAD_START_SP got killed off in commit 34be98f4944f9907 as part of the > VMAP_STACK rework, so this can be: > > +---+ <- task_stack_page(p) + THREAD_SIZE > | | > | | > +---+ <- task_pt_regs(p) > ... > Good point. >> + * | | >> + * | | >> + * | | <- current_sp >> + * ~~~~~ >> + * >> + * ~~~~~ >> + * | | <- lowest_stack >> + * | | >> + * | | >> + * +---+ <- task_stack_page(p) >> + * >> + * This function is desgned to poison the memory between the lowest_stack >> + * and the current stack pointer. After clearing the stack, the lowest >> + * stack is reset. >> + */ >> + >> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK >> +ENTRY(__erase_kstack) >> + mov x10, x0 // save x0 for the fast path > > AFAICT, we only call this from ret_to_user, where x0 doesn't need to be > preserved. > > Is that for ret_fast_syscall? In some cases, ret_fast_syscall can bypass > ret_to_user and calls kernel_exit directly, so we might need a call there. > This was a hold over when I was experimenting with calling erase_kstack more places, one of which came through ret_fast_syscall. I realized later that the erase was unnecessary but accidentally kept the saving in. I'll see about removing it assuming we don't decide later to put a call on the fast path. >> + >> + get_thread_info x0 >> + ldr x1, [x0, #TSK_TI_LOWEST_STACK] >> + >> + /* get the number of bytes to check for lowest stack */ >> + mov x3, x1 >> + and x3, x3, #THREAD_SIZE - 1 >> + lsr x3, x3, #3 >> + >> + /* generate addresses from the bottom of the stack */ >> + mov x4, sp >> + movn x2, #THREAD_SIZE - 1 >> + and x1, x4, x2 > > Can we replace the MOVN;AND with a single instruction to clear the low bits? > e.g. > > mov x4, sp > bic x1, x4, #THREAD_SIZE - 1 > > ... IIUC BIC is an alias for the bitfield instructions, though I can't recall > exactly which one(s). > Good suggestion. >> + >> + mov x2, #STACKLEAK_POISON >> + >> + mov x5, #0 >> +1: >> + /* >> + * As borrowed from the x86 logic, start from the lowest_stack >> + * and go to the bottom to find the poison value. >> + * The check of 16 is to hopefully avoid false positives. >> + */ >> + cbz x3, 4f >> + ldr x4, [x1, x3, lsl #3] >> + cmp x4, x2 >> + csinc x5, xzr, x5, ne >> + tbnz x5, #STACKLEAK_POISON_CHECK_DEPTH/4, 4f // found 16 poisons? >> + sub x3, x3, #1 >> + b 1b >> + >> +4: >> + /* total number of bytes to poison */ >> + add x5, x1, x3, lsl #3 >> + mov x4, sp >> + sub x8, x4, x5 >> + >> + cmp x8, #THREAD_SIZE // sanity check the range >> + b.lo 5f >> + ASM_BUG() >> + >> +5: >> + /* >> + * We may have hit a path where the stack did not get used, >> + * no need to do anything here >> + */ >> + cbz x8, 7f >> + >> + sub x8, x8, #1 // don't poison the current stack pointer >> + >> + lsr x8, x8, #3 >> + add x3, x3, x8 >> + >> + /* >> + * The logic of this loop ensures the last stack word isn't >> + * ovewritten. >> + */ > > Is that to ensure that we don't clobber the word at the current sp value? > Correct. >> +6: >> + cbz x8, 7f >> + str x2, [x1, x3, lsl #3] >> + sub x3, x3, #1 >> + sub x8, x8, #1 >> + b 6b >> + >> + /* Reset the lowest stack to the top of the stack */ >> +7: >> + mov x1, sp >> + str x1, [x0, #TSK_TI_LOWEST_STACK] >> + >> + mov x0, x10 >> + ret >> +ENDPROC(__erase_kstack) >> +#endif > > [...] > >> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile >> index 7b3ba40f0745..35ebbc1b17ff 100644 >> --- a/drivers/firmware/efi/libstub/Makefile >> +++ b/drivers/firmware/efi/libstub/Makefile >> @@ -20,7 +20,8 @@ cflags-$(CONFIG_EFI_ARMSTUB) += -I$(srctree)/scripts/dtc/libfdt >> KBUILD_CFLAGS := $(cflags-y) -DDISABLE_BRANCH_PROFILING \ >> -D__NO_FORTIFY \ >> $(call cc-option,-ffreestanding) \ >> - $(call cc-option,-fno-stack-protector) >> + $(call cc-option,-fno-stack-protector) \ >> + $(DISABLE_STACKLEAK_PLUGIN) > > I believe the KVM hyp code will also need to opt-out of this. > I'll double check that. > Thanks, > Mark. > Thanks, Laura ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] arm64: Clear the stack 2018-02-21 23:53 ` Laura Abbott @ 2018-02-22 1:35 ` Laura Abbott 0 siblings, 0 replies; 15+ messages in thread From: Laura Abbott @ 2018-02-22 1:35 UTC (permalink / raw) To: Mark Rutland Cc: Alexander Popov, Kees Cook, Ard Biesheuvel, kernel-hardening, linux-arm-kernel, linux-kernel On 02/21/2018 03:53 PM, Laura Abbott wrote: >> I *think* this should happen in finish_ret_to_user a few lines down, since we >> can call C code if we branch to work_pending, dirtying the stack. >> > > I think you're right but this didn't immediately work when I tried it. > I'll have to dig into this some more. Okay I figured this out. Not corrupting registers works wonders. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] Stackleak for arm64 2018-02-21 1:13 ` [PATCH 0/2] Stackleak for arm64 Laura Abbott 2018-02-21 1:13 ` [PATCH 1/2] stackleak: Update " Laura Abbott 2018-02-21 1:13 ` [PATCH 2/2] arm64: Clear the stack Laura Abbott @ 2018-02-21 14:48 ` Alexander Popov 2 siblings, 0 replies; 15+ messages in thread From: Alexander Popov @ 2018-02-21 14:48 UTC (permalink / raw) To: Laura Abbott, Kees Cook, Mark Rutland, Ard Biesheuvel Cc: kernel-hardening, linux-arm-kernel, linux-kernel On 21.02.2018 04:13, Laura Abbott wrote: > This is the arm64 version of the STACKLEAK plugin originall from > grsecurity. See > https://marc.info/?l=kernel-hardening&m=151880470609808 for the > full x86 version. This is based on top of Kees' branch for stackleak > and has been cleaned up to use a few macros from that branch. > > Comments welcome, if there are no major objections Kees will queue this > up to get some CI testing. This passed both of the LKDTM tests. Hello, Laura, Thank you. I'll take some time to learn your patches and test them on my LeMaker HiKey board. I'll return with the feedback. Best regards, Alexander ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 0/2] Stackleak for arm64 @ 2018-05-02 20:33 Laura Abbott 2018-05-02 20:33 ` [PATCH 1/2] stackleak: Update " Laura Abbott 0 siblings, 1 reply; 15+ messages in thread From: Laura Abbott @ 2018-05-02 20:33 UTC (permalink / raw) To: Alexander Popov, Kees Cook, Mark Rutland, Ard Biesheuvel Cc: Laura Abbott, kernel-hardening, linux-arm-kernel, linux-kernel This is an extension of the arm64 stackleak plugin, based on top of the v11 version for x86. The biggest change from the previous version is the conversion from assembly to C. It's mostly taken straight from the x86 version modulo a few cleanups and seems to work just fine. If this gets Acks, I'd like Kees to start carrying in his tree for -next. Laura Abbott (2): stackleak: Update for arm64 arm64: Clear the stack arch/arm64/Kconfig | 1 + arch/arm64/include/asm/processor.h | 6 ++++ arch/arm64/kernel/Makefile | 3 ++ arch/arm64/kernel/entry.S | 6 ++++ arch/arm64/kernel/erase.c | 55 ++++++++++++++++++++++++++++++++++ arch/arm64/kernel/process.c | 16 ++++++++++ drivers/firmware/efi/libstub/Makefile | 3 +- scripts/Makefile.gcc-plugins | 5 +++- scripts/gcc-plugins/stackleak_plugin.c | 4 +++ 9 files changed, 97 insertions(+), 2 deletions(-) create mode 100644 arch/arm64/kernel/erase.c -- 2.14.3 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] stackleak: Update for arm64 2018-05-02 20:33 Laura Abbott @ 2018-05-02 20:33 ` Laura Abbott 0 siblings, 0 replies; 15+ messages in thread From: Laura Abbott @ 2018-05-02 20:33 UTC (permalink / raw) To: Alexander Popov, Kees Cook, Mark Rutland, Ard Biesheuvel Cc: Laura Abbott, kernel-hardening, linux-arm-kernel, linux-kernel arm64 has another layer of indirection in the RTL. Account for this in the plugin. Signed-off-by: Laura Abbott <labbott@redhat.com> --- Fixed from previous version to be a vector expression. --- scripts/gcc-plugins/stackleak_plugin.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c index 6ac2a055ec61..0a55ecaf44df 100644 --- a/scripts/gcc-plugins/stackleak_plugin.c +++ b/scripts/gcc-plugins/stackleak_plugin.c @@ -253,6 +253,10 @@ static unsigned int stackleak_cleanup_execute(void) * that insn. */ body = PATTERN(insn); + /* arm64 is different */ + if (GET_CODE(body) == PARALLEL) + body = XVECEXP(body, 0, 0); + if (GET_CODE(body) != CALL) continue; -- 2.14.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
end of thread, other threads:[~2018-05-02 20:33 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <dc76a745-3fa7-4023-dcc1-3df18c9461a6@redhat.com> 2018-02-21 1:13 ` [PATCH 0/2] Stackleak for arm64 Laura Abbott 2018-02-21 1:13 ` [PATCH 1/2] stackleak: Update " Laura Abbott 2018-02-22 16:58 ` Will Deacon 2018-02-22 19:22 ` Alexander Popov 2018-02-27 10:21 ` Richard Sandiford 2018-02-28 15:09 ` Alexander Popov 2018-03-01 10:33 ` Richard Sandiford 2018-03-02 11:14 ` Alexander Popov 2018-02-22 19:38 ` Laura Abbott 2018-02-21 1:13 ` [PATCH 2/2] arm64: Clear the stack Laura Abbott 2018-02-21 15:38 ` Mark Rutland 2018-02-21 23:53 ` Laura Abbott 2018-02-22 1:35 ` Laura Abbott 2018-02-21 14:48 ` [PATCH 0/2] Stackleak for arm64 Alexander Popov 2018-05-02 20:33 Laura Abbott 2018-05-02 20:33 ` [PATCH 1/2] stackleak: Update " Laura Abbott
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).