On Mon, Oct 05, 2020 at 01:18:04PM -0500, Jeremy Linton wrote: > The AES code uses a 'br x7' as part of a function called by > a macro, that ends up needing a BTI_J as a target. Lets > define SYN_CODE_START_LOCAL() for this and replace the > SYM_FUNC_START_LOCAL with a SYM_FUNC_CODE_LOCAL in the AES block. Really what the subject here should say is that this code is not a standard function and therefore should not be annotated as such - it's wrong with or without BTI, BTI just makes it very apparent. It'd also be better to split the change in linkage.h out into a separate patch, that'd make things clearer for review. > CPU: 1 PID: 265 Comm: cryptomgr_test Not tainted 5.8.11-300.fc33.aarch64 #1 > pstate: 20400c05 (nzCv daif +PAN -UAO BTYPE=j-) > pc : aesbs_encrypt8+0x0/0x5f0 [aes_neon_bs] > lr : aesbs_xts_encrypt+0x48/0xe0 [aes_neon_bs] > sp : ffff80001052b730 > x29: ffff80001052b730 x28: 0000000000000001 > x27: ffff0001ec8f4000 x26: ffff0001ec5d27b0 Please think hard before including complete backtraces in upstream reports, they are very large and contain almost no useful information relative to their size so often obscure the relevant content in your message. If part of the backtrace is usefully illustrative (it often is for search engines if nothing else) then it's usually better to pull out the relevant sections. > -SYM_FUNC_START_LOCAL(aesbs_encrypt8) > +SYM_CODE_START_LOCAL(aesbs_encrypt8) > ldr q9, [bskey], #16 // round 0 key > ldr q8, M0SR > ldr q24, SR > @@ -488,10 +488,10 @@ SYM_FUNC_START_LOCAL(aesbs_encrypt8) > eor v2.16b, v2.16b, v12.16b > eor v5.16b, v5.16b, v12.16b > ret > -SYM_FUNC_END(aesbs_encrypt8) > +SYM_END(aesbs_encrypt8) This should be SYM_CODE_END() to match the opening. However... > * When using in-kernel BTI we need to ensure that PCS-conformant assembly > @@ -42,6 +43,9 @@ > SYM_START(name, SYM_L_WEAK, SYM_A_NONE) \ > BTI_C > > +#define SYM_CODE_START_LOCAL(name) \ > + SYM_START(name, SYM_L_LOCAL, SYM_A_ALIGN) \ > + BTI_JC ...this is going to cause problems, SYM_CODE means that we should assemble *exactly* what was written since it's some non-standard thing - we use it for the vectors table for example. Looking at the code it's not 100% clear that the best approach here isn't just to change the call to a regular function call, this isn't a fast path or anything as far as I can see so it's unclear to me why we need to tail call. Failing that I think we need an annotation for tail called functions, that'd need to be a new thing as I am not seeing anything appropriate in the current generic annotations.