On 07/19/2017 01:08 AM, Ard Biesheuvel wrote:
> On 18 July 2017 at 22:53, Laura Abbott wrote:
>> On 07/15/2017 05:03 PM, Ard Biesheuvel wrote:
>>> On 14 July 2017 at 22:27, Mark Rutland wrote:
>>>> On Fri, Jul 14, 2017 at 03:06:06PM +0100, Mark Rutland wrote:
>>>>> On Fri, Jul 14, 2017 at 01:27:14PM +0100, Ard Biesheuvel wrote:
>>>>>> On 14 July 2017 at 11:48, Ard Biesheuvel wrote:
>>>>>>> On 14 July 2017 at 11:32, Mark Rutland wrote:
>>>>>>>> On Thu, Jul 13, 2017 at 07:28:48PM +0100, Ard Biesheuvel wrote:
>>>>>
>>>>>>>>> OK, so here's a crazy idea: what if we
>>>>>>>>> a) carve out a dedicated range in the VMALLOC area for stacks
>>>>>>>>> b) for each stack, allocate a naturally aligned window of 2x the stack
>>>>>>>>> size, and map the stack inside it, leaving the remaining space
>>>>>>>>> unmapped
>>>>>
>>>>>>>> The logical ops (TST) and conditional branches (TB(N)Z, CB(N)Z) operate
>>>>>>>> on XZR rather than SP, so to do this we need to get the SP value into a
>>>>>>>> GPR.
>>>>>>>>
>>>>>>>> Previously, I assumed this meant we needed to corrupt a GPR (and hence
>>>>>>>> stash that GPR in a sysreg), so I started writing code to free sysregs.
>>>>>>>>
>>>>>>>> However, I now realise I was being thick, since we can stash the GPR
>>>>>>>> in the SP:
>>>>>>>>
>>>>>>>>     sub sp, sp, x0 // sp = orig_sp - x0
>>>>>>>>     add x0, sp, x0 // x0 = x0 - (orig_sp - x0) == orig_sp
>>>>>
>>>>> That comment is off, and should say x0 = x0 + (orig_sp - x0) == orig_sp
>>>>>
>>>>>>>>     sub x0, x0, #S_FRAME_SIZE
>>>>>>>>     tb(nz) x0, #THREAD_SHIFT, overflow
>>>>>>>>     add x0, x0, #S_FRAME_SIZE
>>>>>>>>     sub x0, sp, x0
>>>>>>
>>>>>> You need a neg x0, x0 here I think
>>>>>
>>>>> Oh, whoops. I'd mis-simplified things.
>>>>>
>>>>> We can avoid that by storing orig_sp + orig_x0 in sp:
>>>>>
>>>>>     add sp, sp, x0 // sp = orig_sp + orig_x0
>>>>>     sub x0, sp, x0 // x0 = orig_sp
>>>>>     < check >
>>>>>     sub x0, sp, x0 // x0 = orig_x0
>>>>>     sub sp, sp, x0 // sp = orig_sp
>>>>>
>>>>> ...
>>>>> which works in a locally-built kernel where I've aligned all the
>>>>> stacks.
>>>>
>>>> FWIW, I've pushed out a somewhat cleaned-up (and slightly broken!)
>>>> version of said kernel source to my arm64/vmap-stack-align branch [1].
>>>> That's still missing the backtrace handling, IRQ stack alignment is
>>>> broken at least on 64K pages, and there's still more cleanup and rework
>>>> to do.
>>>>
>>>
>>> I have spent some time addressing the issues mentioned in the commit
>>> log. Please take a look.
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git vmap-arm64-mark
>>>
>>
>> I used vmap-arm64-mark to compile kernels for a few days. It seemed to
>> work well enough.
>>
>
> Thanks for giving this a spin. Any comments on the performance impact?
> (if you happened to notice any)
>

I didn't notice any performance impact, but I also wasn't trying that hard.
I did try this with a different configuration and ran into stack space
errors almost immediately:

[ 0.358026] smp: Brought up 1 node, 8 CPUs
[ 0.359359] SMP: Total of 8 processors activated.
[ 0.359542] CPU features: detected feature: 32-bit EL0 Support
[ 0.361781] Insufficient stack space to handle exception!
[ 0.362075] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.12.0-00018-ge9cf49d604ef-dirty #23
[ 0.362538] Hardware name: linux,dummy-virt (DT)
[ 0.362844] task: ffffffc03a8a3200 task.stack: ffffff8008e80000
[ 0.363389] PC is at __do_softirq+0x88/0x210
[ 0.363585] LR is at __do_softirq+0x78/0x210
[ 0.363859] pc : [] lr : [] pstate: 80000145
[ 0.364109] sp : ffffffc03bf65ea0
[ 0.364253] x29: ffffffc03bf66830 x28: 0000000000000002
[ 0.364547] x27: ffffff8008e83e20 x26: 00000000fffedb5a
[ 0.364777] x25: 0000000000000001 x24: 0000000000000000
[ 0.365017] x23: ffffff8008dc5900 x22: ffffff8008c37000
[ 0.365242] x21: 0000000000000003 x20: 0000000000000000
[ 0.365557] x19: ffffff8008d02000 x18: 0000000000040000
[ 0.365991] x17: 0000000000000000 x16: 0000000000000008
[ 0.366148] x15: ffffffc03a400228 x14: 0000000000000000
[ 0.366296] x13: ffffff8008a50b98 x12: ffffffc03a916480
[ 0.366442] x11: ffffff8008a50ba0 x10: 0000000000000008
[ 0.366624] x9 : 0000000000000004 x8 : ffffffc03bf6f630
[ 0.366779] x7 : 0000000000000020 x6 : 00000000fffedb5a
[ 0.366924] x5 : 00000000ffffffff x4 : 000000403326a000
[ 0.367071] x3 : 0000000000000101 x2 : ffffff8008ce8000
[ 0.367218] x1 : ffffff8008dc5900 x0 : 0000000000000200
[ 0.367382] Task stack: [0xffffff8008e80000..0xffffff8008e84000]
[ 0.367519] IRQ stack: [0xffffffc03bf62000..0xffffffc03bf66000]
[ 0.367687] ESR: 0x00000000 -- Unknown/Uncategorized
[ 0.367868] FAR: 0x0000000000000000
[ 0.368059] Kernel panic - not syncing: kernel stack overflow
[ 0.368252] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.12.0-00018-ge9cf49d604ef-dirty #23
[ 0.368427] Hardware name: linux,dummy-virt (DT)
[ 0.368612] Call trace:
[ 0.368774] [] dump_backtrace+0x0/0x228
[ 0.368979] [] show_stack+0x10/0x20
[ 0.369270] [] dump_stack+0x88/0xac
[ 0.369459] [] panic+0x120/0x278
[ 0.369582] [] handle_bad_stack+0xd0/0xd8
[ 0.369799] [] __do_softirq+0x74/0x210
[ 0.370560] SMP: stopping secondary CPUs
[ 0.384269] Rebooting in 5 seconds..

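As a rough sanity check of the numbers in the trace above (a sketch only, not a confirmed diagnosis), the Python below tests whether the reported sp lies inside the reported IRQ stack, and whether each stack base is aligned to twice the stack size, as the aligned-window scheme quoted earlier requires. THREAD_SHIFT = 14 is an assumption inferred from the 16 KiB ranges in the log, and the helper names are made up for illustration.

```python
# Values copied from the panic output above.
SP = 0xffffffc03bf65ea0
TASK_STACK = (0xffffff8008e80000, 0xffffff8008e84000)
IRQ_STACK = (0xffffffc03bf62000, 0xffffffc03bf66000)

# Assumption: THREAD_SHIFT is 14, inferred from the 16 KiB stack ranges.
THREAD_SHIFT = 14
THREAD_SIZE = 1 << THREAD_SHIFT


def in_stack(addr, stack):
    """True if addr lies within [low, high) of the given stack range."""
    low, high = stack
    return low <= addr < high


def window_aligned(stack):
    """True if the stack base is aligned to 2 * THREAD_SIZE."""
    return stack[0] % (2 * THREAD_SIZE) == 0


def overflow_bit_values(stack):
    """Values that bit THREAD_SHIFT takes over every 16-byte slot of the stack."""
    low, high = stack
    return {(addr >> THREAD_SHIFT) & 1 for addr in range(low, high, 16)}


for name, stack in (("task", TASK_STACK), ("irq", IRQ_STACK)):
    print(name,
          "contains sp:", in_stack(SP, stack),
          "window aligned:", window_aligned(stack),
          "bit values:", sorted(overflow_bit_values(stack)))
```

With these values, sp sits inside the IRQ stack, and the IRQ stack base is not aligned to 2 * THREAD_SIZE, so bit THREAD_SHIFT changes partway through that stack. If the overflow check is the single-bit test from the quoted sequence, that alone could make it fire on a stack with plenty of room left.
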
The config is based on what I use for booting my HiKey Android board. I
haven't been able to narrow down exactly which set of config options
triggers this.

Thanks,
Laura