* [PATCH v4 0/7] ARM: add vmap'ed stack support @ 2021-11-22 9:28 Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 1/7] ARM: memcpy: use frame pointer as unwind anchor Ard Biesheuvel ` (6 more replies) 0 siblings, 7 replies; 27+ messages in thread From: Ard Biesheuvel @ 2021-11-22 9:28 UTC (permalink / raw) To: linux-arm-kernel Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren This series enables support on ARM for vmap'ed task and IRQ stacks in the kernel. This is an important hardening feature that terminates tasks on inadvertent or deliberate accesses past the stack pointer, which might otherwise go completely unnoticed. Since having an accurate backtrace is especially important in such cases, this series includes some enhancements to the unwinder and to some hand rolled unwind info to increase the likelihood that a backtrace can be generated when relying on the ARM unwinder. The frame pointer unwinder turns out to be rather bullet proof in this context, and does not need any such enhancements. According to a quick survey I did, compiler generated code puts a single stack push as the first instruction in about 2/3 of the cases, which the unwinder can deal with after applying patch #4, even if this push faulted because of a stack overflow. In the remaining cases, the compiler tends to fall back to R11 or R7 as the frame pointer (on ARM or Thumb-2, respectively), or emit partial unwind frames for the part of the function that runs before the stack frame is set up, and the part that runs inside the stack frame. In either case, the unwinder can deal with such occurrences as they don't rely on the stack pointer directly. Changes since v3: - avoid using the wrong virtual to physical translation on the stack pointer in the suspend/cpuidle code path, - check whether SP points into the linear map rather than whether it points into the overflow stack specifically, so that other stacks are disregarded as well, - use a per-CPU pointer rather than a per-CPU allocation for the overflow stack, so the stack itself can be allocated via the page allocator, - avoid deliberately corrupting any task userland state, by repurposing the padding in the per-mode stacks as scratch space to hold a single GPR value, and rejigging the __bad_stack handler to only require a single GPR to load the overflow stack address into SP. Changes since v2: - rebase onto v5.16-rc1 - incorporate Nico's review feedback Changes since v1: - handle a missed corner case in svc_entry code, and while at it, streamline it a bit, especially for Thumb-2, which no longer needs to move SP into R0 twice to do the overflow check and the alignment check, - improve the memcpy patch so that we no longer need to push the frame pointer separately, - add Keith's tested-by Patches #1, #2 and #3 update the ARM asm string routines to align more closely with the compiler's approach in terms of unwind tables, increasing the likelihood that we can unwind them in case of a stack overflow. Patches #5 and #6 do some preparatory refactoring for the entry and switch_to code, to reduce clutter in patch #7. Patch #7 wires up the generic support, and adds the entry code to detect and deal with stack overflows. 
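To make the prologue survey above concrete, here are the two shapes involved, sketched in kernel-style assembly (function names, registers and frame sizes are made up for illustration):

	@ Common case (roughly 2/3): a single push is the very first
	@ instruction. If that store faults on an overflowing stack,
	@ patch #4 below lets the unwinder proceed by taking the return
	@ address directly from LR.
func_simple:
	push	{r4, r5, lr}
	@ ... body ...
	pop	{r4, r5, pc}

	@ Remaining cases: a frame pointer is set up (R11 on ARM, R7 on
	@ Thumb-2), after which unwinding no longer depends on the
	@ changing SP value.
func_framed:
	push	{r7, lr}
	mov	r7, sp			@ Thumb-2 frame pointer
	sub	sp, sp, #24		@ local variables
	@ ... body ...
	mov	sp, r7
	pop	{r7, pc}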
This series applies onto my IRQ stacks series sent out earlier: https://lore.kernel.org/linux-arm-kernel/20211115084732.3704393-1-ardb@kernel.org/ Cc: Russell King <linux@armlinux.org.uk> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Keith Packard <keithpac@amazon.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tony Lindgren <tony@atomide.com> Ard Biesheuvel (7): ARM: memcpy: use frame pointer as unwind anchor ARM: memmove: use frame pointer as unwind anchor ARM: memset: clean up unwind annotations ARM: unwind: disregard unwind info before stack frame is set up ARM: switch_to: clean up Thumb2 code path ARM: entry: rework stack realignment code in svc_entry ARM: implement support for vmap'ed stacks arch/arm/Kconfig | 1 + arch/arm/include/asm/page.h | 4 + arch/arm/include/asm/thread_info.h | 8 ++ arch/arm/kernel/entry-armv.S | 139 +++++++++++++++++--- arch/arm/kernel/entry-header.S | 37 ++++++ arch/arm/kernel/irq.c | 9 +- arch/arm/kernel/setup.c | 8 +- arch/arm/kernel/sleep.S | 8 ++ arch/arm/kernel/traps.c | 80 ++++++++++- arch/arm/kernel/unwind.c | 19 ++- arch/arm/kernel/vmlinux.lds.S | 4 +- arch/arm/lib/copy_from_user.S | 13 +- arch/arm/lib/copy_template.S | 67 ++++------ arch/arm/lib/copy_to_user.S | 13 +- arch/arm/lib/memcpy.S | 13 +- arch/arm/lib/memmove.S | 60 +++------ arch/arm/lib/memset.S | 7 +- 17 files changed, 349 insertions(+), 141 deletions(-) -- 2.30.2 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v4 1/7] ARM: memcpy: use frame pointer as unwind anchor 2021-11-22 9:28 [PATCH v4 0/7] ARM: add vmap'ed stack support Ard Biesheuvel @ 2021-11-22 9:28 ` Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 2/7] ARM: memmove: " Ard Biesheuvel ` (5 subsequent siblings) 6 siblings, 0 replies; 27+ messages in thread From: Ard Biesheuvel @ 2021-11-22 9:28 UTC (permalink / raw) To: linux-arm-kernel Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren The memcpy template is a bit unusual in the way it manages the stack pointer: depending on the execution path through the function, the SP assumes different values as different subsets of the register file are preserved and restored again. This is problematic when it comes to EHABI unwind info, as it is not instruction accurate, and does not allow tracking the SP value as it changes. Commit 279f487e0b471 ("ARM: 8225/1: Add unwinding support for memory copy functions") addressed this by carving up the function into different chunks as far as the unwinder is concerned, and keeping a set of unwind directives for each of them, each corresponding to the state of the stack pointer during execution of the chunk in question. This not only duplicates unwind info unnecessarily, but it also complicates unwinding the stack upon overflow. Instead, let's do what the compiler does when the SP is updated halfway through a function, which is to use a frame pointer and emit the appropriate unwind directives to communicate this to the unwinder. Note that Thumb-2 uses R7 for this, while ARM uses R11 aka FP. So let's avoid touching R7 in the body of the template, so that Thumb-2 can use it as the frame pointer. R11 was not modified in the first place. 
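Reduced to its essence, the annotation pattern adopted here looks like this (a minimal sketch, assuming the fpreg alias for R11/R7 provided earlier in this series; the real changes to the copy template follow below):

UNWIND( .fnstart	)
UNWIND( .save	{r0, r4, fpreg, lr} )
	stmdb	sp!, {r0, r4, fpreg, lr}
UNWIND( .setfp	fpreg, sp )
UNWIND( mov	fpreg, sp )

	@ SP may now move freely: the frame pointer anchors the frame,
	@ so the extra pushes and pops below need no unwind regions of
	@ their own
	stmfd	sp!, {r5, r6, r8, r9}
	@ ...
	ldmfd	sp!, {r5, r6, r8, r9}

	ldmfd	sp!, {r0, r4, fpreg, pc}
UNWIND( .fnend	)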
Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> --- arch/arm/lib/copy_from_user.S | 13 ++-- arch/arm/lib/copy_template.S | 67 +++++++------------- arch/arm/lib/copy_to_user.S | 13 ++-- arch/arm/lib/memcpy.S | 13 ++-- 4 files changed, 38 insertions(+), 68 deletions(-) diff --git a/arch/arm/lib/copy_from_user.S b/arch/arm/lib/copy_from_user.S index 480a20766137..270de7debd0f 100644 --- a/arch/arm/lib/copy_from_user.S +++ b/arch/arm/lib/copy_from_user.S @@ -91,18 +91,15 @@ strb\cond \reg, [\ptr], #1 .endm - .macro enter reg1 reg2 + .macro enter regs:vararg mov r3, #0 - stmdb sp!, {r0, r2, r3, \reg1, \reg2} +UNWIND( .save {r0, r2, r3, \regs} ) + stmdb sp!, {r0, r2, r3, \regs} .endm - .macro usave reg1 reg2 - UNWIND( .save {r0, r2, r3, \reg1, \reg2} ) - .endm - - .macro exit reg1 reg2 + .macro exit regs:vararg add sp, sp, #8 - ldmfd sp!, {r0, \reg1, \reg2} + ldmfd sp!, {r0, \regs} .endm .text diff --git a/arch/arm/lib/copy_template.S b/arch/arm/lib/copy_template.S index 810a805d36dc..8fbafb074fe9 100644 --- a/arch/arm/lib/copy_template.S +++ b/arch/arm/lib/copy_template.S @@ -69,13 +69,10 @@ * than one 32bit instruction in Thumb-2) */ - - UNWIND( .fnstart ) - enter r4, lr - UNWIND( .fnend ) - UNWIND( .fnstart ) - usave r4, lr @ in first stmdb block + enter r4, UNWIND(fpreg,) lr + UNWIND( .setfp fpreg, sp ) + UNWIND( mov fpreg, sp ) subs r2, r2, #4 blt 8f @@ -86,12 +83,7 @@ bne 10f 1: subs r2, r2, #(28) - stmfd sp!, {r5 - r8} - UNWIND( .fnend ) - - UNWIND( .fnstart ) - usave r4, lr - UNWIND( .save {r5 - r8} ) @ in second stmfd block + stmfd sp!, {r5, r6, r8, r9} blt 5f CALGN( ands ip, r0, #31 ) @@ -110,9 +102,9 @@ PLD( pld [r1, #92] ) 3: PLD( pld [r1, #124] ) -4: ldr8w r1, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f +4: ldr8w r1, r3, r4, r5, r6, r8, r9, ip, lr, abort=20f subs r2, r2, #32 - str8w r0, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f + str8w r0, r3, r4, r5, r6, r8, r9, ip, lr, abort=20f bge 3b PLD( cmn r2, #96 ) PLD( bge 4b ) @@ -132,8 +124,8 @@ ldr1w r1, r4, abort=20f ldr1w r1, r5, abort=20f ldr1w r1, r6, abort=20f - ldr1w r1, r7, abort=20f ldr1w r1, r8, abort=20f + ldr1w r1, r9, abort=20f ldr1w r1, lr, abort=20f #if LDR1W_SHIFT < STR1W_SHIFT @@ -150,17 +142,14 @@ str1w r0, r4, abort=20f str1w r0, r5, abort=20f str1w r0, r6, abort=20f - str1w r0, r7, abort=20f str1w r0, r8, abort=20f + str1w r0, r9, abort=20f str1w r0, lr, abort=20f CALGN( bcs 2b ) -7: ldmfd sp!, {r5 - r8} - UNWIND( .fnend ) @ end of second stmfd block +7: ldmfd sp!, {r5, r6, r8, r9} - UNWIND( .fnstart ) - usave r4, lr @ still in first stmdb block 8: movs r2, r2, lsl #31 ldr1b r1, r3, ne, abort=21f ldr1b r1, r4, cs, abort=21f @@ -169,7 +158,7 @@ str1b r0, r4, cs, abort=21f str1b r0, ip, cs, abort=21f - exit r4, pc + exit r4, UNWIND(fpreg,) pc 9: rsb ip, ip, #4 cmp ip, #2 @@ -189,13 +178,10 @@ ldr1w r1, lr, abort=21f beq 17f bgt 18f - UNWIND( .fnend ) .macro forward_copy_shift pull push - UNWIND( .fnstart ) - usave r4, lr @ still in first stmdb block subs r2, r2, #28 blt 14f @@ -205,12 +191,8 @@ CALGN( subcc r2, r2, ip ) CALGN( bcc 15f ) -11: stmfd sp!, {r5 - r9} - UNWIND( .fnend ) +11: stmfd sp!, {r5, r6, r8 - r10} - UNWIND( .fnstart ) - usave r4, lr - UNWIND( .save {r5 - r9} ) @ in new second stmfd block PLD( pld [r1, #0] ) PLD( subs r2, r2, #96 ) PLD( pld [r1, #28] ) @@ -219,35 +201,32 @@ PLD( pld [r1, #92] ) 12: PLD( pld [r1, #124] ) -13: ldr4w r1, r4, r5, r6, r7, abort=19f +13: ldr4w r1, r4, r5, r6, r8, abort=19f mov r3, lr, lspull #\pull subs r2, r2, #32 - ldr4w r1, r8, 
r9, ip, lr, abort=19f + ldr4w r1, r9, r10, ip, lr, abort=19f orr r3, r3, r4, lspush #\push mov r4, r4, lspull #\pull orr r4, r4, r5, lspush #\push mov r5, r5, lspull #\pull orr r5, r5, r6, lspush #\push mov r6, r6, lspull #\pull - orr r6, r6, r7, lspush #\push - mov r7, r7, lspull #\pull - orr r7, r7, r8, lspush #\push + orr r6, r6, r8, lspush #\push mov r8, r8, lspull #\pull orr r8, r8, r9, lspush #\push mov r9, r9, lspull #\pull - orr r9, r9, ip, lspush #\push + orr r9, r9, r10, lspush #\push + mov r10, r10, lspull #\pull + orr r10, r10, ip, lspush #\push mov ip, ip, lspull #\pull orr ip, ip, lr, lspush #\push - str8w r0, r3, r4, r5, r6, r7, r8, r9, ip, abort=19f + str8w r0, r3, r4, r5, r6, r8, r9, r10, ip, abort=19f bge 12b PLD( cmn r2, #96 ) PLD( bge 13b ) - ldmfd sp!, {r5 - r9} - UNWIND( .fnend ) @ end of the second stmfd block + ldmfd sp!, {r5, r6, r8 - r10} - UNWIND( .fnstart ) - usave r4, lr @ still in first stmdb block 14: ands ip, r2, #28 beq 16f @@ -262,7 +241,6 @@ 16: sub r1, r1, #(\push / 8) b 8b - UNWIND( .fnend ) .endm @@ -273,6 +251,7 @@ 18: forward_copy_shift pull=24 push=8 + UNWIND( .fnend ) /* * Abort preamble and completion macros. @@ -282,13 +261,13 @@ */ .macro copy_abort_preamble -19: ldmfd sp!, {r5 - r9} +19: ldmfd sp!, {r5, r6, r8 - r10} b 21f -20: ldmfd sp!, {r5 - r8} +20: ldmfd sp!, {r5, r6, r8, r9} 21: .endm .macro copy_abort_end - ldmfd sp!, {r4, pc} + ldmfd sp!, {r4, UNWIND(fpreg,) pc} .endm diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S index 842ea5ede485..fac49e57cc0b 100644 --- a/arch/arm/lib/copy_to_user.S +++ b/arch/arm/lib/copy_to_user.S @@ -90,18 +90,15 @@ strusr \reg, \ptr, 1, \cond, abort=\abort .endm - .macro enter reg1 reg2 + .macro enter regs:vararg mov r3, #0 - stmdb sp!, {r0, r2, r3, \reg1, \reg2} +UNWIND( .save {r0, r2, r3, \regs} ) + stmdb sp!, {r0, r2, r3, \regs} .endm - .macro usave reg1 reg2 - UNWIND( .save {r0, r2, r3, \reg1, \reg2} ) - .endm - - .macro exit reg1 reg2 + .macro exit regs:vararg add sp, sp, #8 - ldmfd sp!, {r0, \reg1, \reg2} + ldmfd sp!, {r0, \regs} .endm .text diff --git a/arch/arm/lib/memcpy.S b/arch/arm/lib/memcpy.S index e4caf48c089f..90f2b645aa0d 100644 --- a/arch/arm/lib/memcpy.S +++ b/arch/arm/lib/memcpy.S @@ -42,16 +42,13 @@ strb\cond \reg, [\ptr], #1 .endm - .macro enter reg1 reg2 - stmdb sp!, {r0, \reg1, \reg2} + .macro enter regs:vararg +UNWIND( .save {r0, \regs} ) + stmdb sp!, {r0, \regs} .endm - .macro usave reg1 reg2 - UNWIND( .save {r0, \reg1, \reg2} ) - .endm - - .macro exit reg1 reg2 - ldmfd sp!, {r0, \reg1, \reg2} + .macro exit regs:vararg + ldmfd sp!, {r0, \regs} .endm .text -- 2.30.2 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 27+ messages in thread
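One detail of the enter/exit changes in the patch above worth spelling out: a gas macro parameter declared :vararg absorbs the remainder of the argument list, commas included, which is what lets the optional frame pointer be spliced in through the UNWIND() preprocessor macro (it expands to its argument when CONFIG_ARM_UNWIND is enabled, and to nothing otherwise). A minimal sketch:

	.macro	enter regs:vararg
UNWIND( .save	{r0, \regs}	)
	stmdb	sp!, {r0, \regs}
	.endm

	@ With unwinding enabled, UNWIND(fpreg,) leaves "fpreg," behind,
	@ so this expands to:     stmdb sp!, {r0, r4, fpreg, lr}
	@ Without it, it becomes: stmdb sp!, {r0, r4, lr}
	enter	r4, UNWIND(fpreg,) lr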
* [PATCH v4 2/7] ARM: memmove: use frame pointer as unwind anchor 2021-11-22 9:28 [PATCH v4 0/7] ARM: add vmap'ed stack support Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 1/7] ARM: memcpy: use frame pointer as unwind anchor Ard Biesheuvel @ 2021-11-22 9:28 ` Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 3/7] ARM: memset: clean up unwind annotations Ard Biesheuvel ` (4 subsequent siblings) 6 siblings, 0 replies; 27+ messages in thread From: Ard Biesheuvel @ 2021-11-22 9:28 UTC (permalink / raw) To: linux-arm-kernel Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren The memmove routine is a bit unusual in the way it manages the stack pointer: depending on the execution path through the function, the SP assumes different values as different subsets of the register file are preserved and restored again. This is problematic when it comes to EHABI unwind info, as it is not instruction accurate, and does not allow tracking the SP value as it changes. Commit 207a6cb06990c ("ARM: 8224/1: Add unwinding support for memmove function") addressed this by carving up the function in different chunks as far as the unwinder is concerned, and keeping a set of unwind directives for each of them, each corresponding with the state of the stack pointer during execution of the chunk in question. This not only duplicates unwind info unnecessarily, but it also complicates unwinding the stack upon overflow. Instead, let's do what the compiler does when the SP is updated halfway through a function, which is to use a frame pointer and emit the appropriate unwind directives to communicate this to the unwinder. Note that Thumb-2 uses R7 for this, while ARM uses R11 aka FP. So let's avoid touching R7 in the body of the function, so that Thumb-2 can use it as the frame pointer. R11 was not modified in the first place. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> --- arch/arm/lib/memmove.S | 60 +++++++------------- 1 file changed, 20 insertions(+), 40 deletions(-) diff --git a/arch/arm/lib/memmove.S b/arch/arm/lib/memmove.S index 6fecc12a1f51..6410554039fd 100644 --- a/arch/arm/lib/memmove.S +++ b/arch/arm/lib/memmove.S @@ -31,12 +31,13 @@ WEAK(memmove) subs ip, r0, r1 cmphi r2, ip bls __memcpy - - stmfd sp!, {r0, r4, lr} UNWIND( .fnend ) UNWIND( .fnstart ) - UNWIND( .save {r0, r4, lr} ) @ in first stmfd block + UNWIND( .save {r0, r4, fpreg, lr} ) + stmfd sp!, {r0, r4, UNWIND(fpreg,) lr} + UNWIND( .setfp fpreg, sp ) + UNWIND( mov fpreg, sp ) add r1, r1, r2 add r0, r0, r2 subs r2, r2, #4 @@ -48,12 +49,7 @@ WEAK(memmove) bne 10f 1: subs r2, r2, #(28) - stmfd sp!, {r5 - r8} - UNWIND( .fnend ) - - UNWIND( .fnstart ) - UNWIND( .save {r0, r4, lr} ) - UNWIND( .save {r5 - r8} ) @ in second stmfd block + stmfd sp!, {r5, r6, r8, r9} blt 5f CALGN( ands ip, r0, #31 ) @@ -72,9 +68,9 @@ WEAK(memmove) PLD( pld [r1, #-96] ) 3: PLD( pld [r1, #-128] ) -4: ldmdb r1!, {r3, r4, r5, r6, r7, r8, ip, lr} +4: ldmdb r1!, {r3, r4, r5, r6, r8, r9, ip, lr} subs r2, r2, #32 - stmdb r0!, {r3, r4, r5, r6, r7, r8, ip, lr} + stmdb r0!, {r3, r4, r5, r6, r8, r9, ip, lr} bge 3b PLD( cmn r2, #96 ) PLD( bge 4b ) @@ -88,8 +84,8 @@ WEAK(memmove) W(ldr) r4, [r1, #-4]! W(ldr) r5, [r1, #-4]! W(ldr) r6, [r1, #-4]! - W(ldr) r7, [r1, #-4]! W(ldr) r8, [r1, #-4]! + W(ldr) r9, [r1, #-4]! W(ldr) lr, [r1, #-4]! add pc, pc, ip @@ -99,17 +95,13 @@ WEAK(memmove) W(str) r4, [r0, #-4]! W(str) r5, [r0, #-4]! W(str) r6, [r0, #-4]! 
- W(str) r7, [r0, #-4]! W(str) r8, [r0, #-4]! + W(str) r9, [r0, #-4]! W(str) lr, [r0, #-4]! CALGN( bcs 2b ) -7: ldmfd sp!, {r5 - r8} - UNWIND( .fnend ) @ end of second stmfd block - - UNWIND( .fnstart ) - UNWIND( .save {r0, r4, lr} ) @ still in first stmfd block +7: ldmfd sp!, {r5, r6, r8, r9} 8: movs r2, r2, lsl #31 ldrbne r3, [r1, #-1]! @@ -118,7 +110,7 @@ WEAK(memmove) strbne r3, [r0, #-1]! strbcs r4, [r0, #-1]! strbcs ip, [r0, #-1] - ldmfd sp!, {r0, r4, pc} + ldmfd sp!, {r0, r4, UNWIND(fpreg,) pc} 9: cmp ip, #2 ldrbgt r3, [r1, #-1]! @@ -137,13 +129,10 @@ WEAK(memmove) ldr r3, [r1, #0] beq 17f blt 18f - UNWIND( .fnend ) .macro backward_copy_shift push pull - UNWIND( .fnstart ) - UNWIND( .save {r0, r4, lr} ) @ still in first stmfd block subs r2, r2, #28 blt 14f @@ -152,12 +141,7 @@ WEAK(memmove) CALGN( subcc r2, r2, ip ) CALGN( bcc 15f ) -11: stmfd sp!, {r5 - r9} - UNWIND( .fnend ) - - UNWIND( .fnstart ) - UNWIND( .save {r0, r4, lr} ) - UNWIND( .save {r5 - r9} ) @ in new second stmfd block +11: stmfd sp!, {r5, r6, r8 - r10} PLD( pld [r1, #-4] ) PLD( subs r2, r2, #96 ) @@ -167,35 +151,31 @@ WEAK(memmove) PLD( pld [r1, #-96] ) 12: PLD( pld [r1, #-128] ) -13: ldmdb r1!, {r7, r8, r9, ip} +13: ldmdb r1!, {r8, r9, r10, ip} mov lr, r3, lspush #\push subs r2, r2, #32 ldmdb r1!, {r3, r4, r5, r6} orr lr, lr, ip, lspull #\pull mov ip, ip, lspush #\push - orr ip, ip, r9, lspull #\pull + orr ip, ip, r10, lspull #\pull + mov r10, r10, lspush #\push + orr r10, r10, r9, lspull #\pull mov r9, r9, lspush #\push orr r9, r9, r8, lspull #\pull mov r8, r8, lspush #\push - orr r8, r8, r7, lspull #\pull - mov r7, r7, lspush #\push - orr r7, r7, r6, lspull #\pull + orr r8, r8, r6, lspull #\pull mov r6, r6, lspush #\push orr r6, r6, r5, lspull #\pull mov r5, r5, lspush #\push orr r5, r5, r4, lspull #\pull mov r4, r4, lspush #\push orr r4, r4, r3, lspull #\pull - stmdb r0!, {r4 - r9, ip, lr} + stmdb r0!, {r4 - r6, r8 - r10, ip, lr} bge 12b PLD( cmn r2, #96 ) PLD( bge 13b ) - ldmfd sp!, {r5 - r9} - UNWIND( .fnend ) @ end of the second stmfd block - - UNWIND( .fnstart ) - UNWIND( .save {r0, r4, lr} ) @ still in first stmfd block + ldmfd sp!, {r5, r6, r8 - r10} 14: ands ip, r2, #28 beq 16f @@ -211,7 +191,6 @@ WEAK(memmove) 16: add r1, r1, #(\pull / 8) b 8b - UNWIND( .fnend ) .endm @@ -222,5 +201,6 @@ WEAK(memmove) 18: backward_copy_shift push=24 pull=8 + UNWIND( .fnend ) ENDPROC(memmove) ENDPROC(__memmove) -- 2.30.2 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v4 3/7] ARM: memset: clean up unwind annotations 2021-11-22 9:28 [PATCH v4 0/7] ARM: add vmap'ed stack support Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 1/7] ARM: memcpy: use frame pointer as unwind anchor Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 2/7] ARM: memmove: " Ard Biesheuvel @ 2021-11-22 9:28 ` Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 4/7] ARM: unwind: disregard unwind info before stack frame is set up Ard Biesheuvel ` (3 subsequent siblings) 6 siblings, 0 replies; 27+ messages in thread From: Ard Biesheuvel @ 2021-11-22 9:28 UTC (permalink / raw) To: linux-arm-kernel Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren The memset implementation carves up the code in different sections, each covered with their own unwind info. In this case, it is done in a way similar to how the compiler might do it, to disambiguate between parts where the return address is in LR and the SP is unmodified, and parts where a stack frame is live, and the unwinder needs to know the size of the stack frame and the location of the return address within it. Only the placement of the unwind directives is slightly odd: the stack pushes are placed in the wrong sections, which may confuse the unwinder when attempting to unwind with PC pointing at the stack push in question. So let's fix this up, by reordering the directives and instructions as appropriate. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> --- arch/arm/lib/memset.S | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S index 9817cb258c1a..d71ab61430b2 100644 --- a/arch/arm/lib/memset.S +++ b/arch/arm/lib/memset.S @@ -28,16 +28,16 @@ UNWIND( .fnstart ) mov r3, r1 7: cmp r2, #16 blt 4f +UNWIND( .fnend ) #if ! CALGN(1)+0 /* * We need 2 extra registers for this loop - use r8 and the LR */ - stmfd sp!, {r8, lr} -UNWIND( .fnend ) UNWIND( .fnstart ) UNWIND( .save {r8, lr} ) + stmfd sp!, {r8, lr} mov r8, r1 mov lr, r3 @@ -66,10 +66,9 @@ UNWIND( .fnend ) * whole cache lines at once. */ - stmfd sp!, {r4-r8, lr} -UNWIND( .fnend ) UNWIND( .fnstart ) UNWIND( .save {r4-r8, lr} ) + stmfd sp!, {r4-r8, lr} mov r4, r1 mov r5, r3 mov r6, r1 -- 2.30.2 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 27+ messages in thread
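Schematically, the reordering applied by the memset patch above (not the literal context; see the hunks):

	@ Before: the push still falls under the previous region's
	@ annotations, so a PC pointing at the stmfd is matched with
	@ directives that know nothing about {r8, lr}.
	stmfd	sp!, {r8, lr}
UNWIND( .fnend	)
UNWIND( .fnstart	)
UNWIND( .save	{r8, lr}	)

	@ After: the push is the first instruction of its own region,
	@ which also lets the first-instruction special case added in
	@ the next patch kick in when unwinding from the push itself.
UNWIND( .fnend	)
UNWIND( .fnstart	)
UNWIND( .save	{r8, lr}	)
	stmfd	sp!, {r8, lr}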
* [PATCH v4 4/7] ARM: unwind: disregard unwind info before stack frame is set up 2021-11-22 9:28 [PATCH v4 0/7] ARM: add vmap'ed stack support Ard Biesheuvel ` (2 preceding siblings ...) 2021-11-22 9:28 ` [PATCH v4 3/7] ARM: memset: clean up unwind annotations Ard Biesheuvel @ 2021-11-22 9:28 ` Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 5/7] ARM: switch_to: clean up Thumb2 code path Ard Biesheuvel ` (2 subsequent siblings) 6 siblings, 0 replies; 27+ messages in thread From: Ard Biesheuvel @ 2021-11-22 9:28 UTC (permalink / raw) To: linux-arm-kernel Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren When unwinding the stack from a stack overflow, we are likely to start from a stack push instruction, given that this is the most common way to grow the stack for compiler emitted code. This push instruction rarely appears anywhere else than at offset 0x0 of the function, and if it doesn't, the compiler tends to split up the unwind annotations, given that the stack frame layout is apparently not the same throughout the function. This means that, in the general case, if the frame's PC points at the first instruction covered by a certain unwind entry, there is no way the stack frame that the unwind entry describes could have been created yet, and so we are still on the stack frame of the caller in that case. So treat this as a special case, and return with the new PC taken from the frame's LR, without applying the unwind transformations to the virtual register set. This permits us to unwind the call stack on stack overflow when the overflow was caused by a stack push on function entry. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> --- arch/arm/kernel/unwind.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c index b7a6141c342f..e8d729975f12 100644 --- a/arch/arm/kernel/unwind.c +++ b/arch/arm/kernel/unwind.c @@ -411,7 +411,21 @@ int unwind_frame(struct stackframe *frame) if (idx->insn == 1) /* can't unwind */ return -URC_FAILURE; - else if ((idx->insn & 0x80000000) == 0) + else if (frame->pc == prel31_to_addr(&idx->addr_offset)) { + /* + * Unwinding is tricky when we're halfway through the prologue, + * since the stack frame that the unwinder expects may not be + * fully set up yet. However, one thing we do know for sure is + * that if we are unwinding from the very first instruction of + * a function, we are still effectively in the stack frame of + * the caller, and the unwind info has no relevance yet. + */ + if (frame->pc == frame->lr) + return -URC_FAILURE; + frame->sp_low = frame->sp; + frame->pc = frame->lr; + return URC_OK; + } else if ((idx->insn & 0x80000000) == 0) /* prel31 to the unwind table */ ctrl.insn = (unsigned long *)prel31_to_addr(&idx->insn); else if ((idx->insn & 0xff000000) == 0x80000000) -- 2.30.2 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 27+ messages in thread
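A worked example of the case the unwinder change above caters for (hypothetical function):

foo:
	push	{r4, lr}	@ first instruction; faults here if the
				@ stack is about to overflow
	@ ...

	@ The unwind entry covering foo starts at the push and records
	@ that R4 and LR live on the stack. With PC == &foo, however,
	@ nothing has been pushed yet: the return address is still in LR
	@ and SP still belongs to the caller, which is exactly what the
	@ special case reports. The frame->pc == frame->lr test guards
	@ against looping forever when LR happens to point at foo too.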
* [PATCH v4 5/7] ARM: switch_to: clean up Thumb2 code path 2021-11-22 9:28 [PATCH v4 0/7] ARM: add vmap'ed stack support Ard Biesheuvel ` (3 preceding siblings ...) 2021-11-22 9:28 ` [PATCH v4 4/7] ARM: unwind: disregard unwind info before stack frame is set up Ard Biesheuvel @ 2021-11-22 9:28 ` Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 6/7] ARM: entry: rework stack realignment code in svc_entry Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 7/7] ARM: implement support for vmap'ed stacks Ard Biesheuvel 6 siblings, 0 replies; 27+ messages in thread From: Ard Biesheuvel @ 2021-11-22 9:28 UTC (permalink / raw) To: linux-arm-kernel Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren The load-multiple instruction that essentially performs the switch_to operation in ARM mode, by loading all callee-save registers as well as the stack pointer and the program counter, is split into 3 separate loads for Thumb-2, with the IP register used as a temporary to capture the value of R4 before it gets overwritten. We can clean this up a bit, by sticking with a single LDMIA instruction, but one that pops SP and PC into IP and LR, respectively, and by using ordinary move register and branch instructions to get those values into SP and PC. This also allows us to move the set_current call closer to the assignment of SP, reducing the window where those are mutually out of sync. This is especially relevant for CONFIG_VMAP_STACK, which is being introduced in a subsequent patch, where we need to issue a load that might fault from the new stack while running from the old one, to ensure that stale PMD entries in the VMALLOC space are synced up. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> --- arch/arm/kernel/entry-armv.S | 23 +++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 1c7590eef712..ce8ca29461de 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -823,13 +823,26 @@ ENTRY(__switch_to) #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP) str r7, [r8] #endif - THUMB( mov ip, r4 ) mov r0, r5 +#if !defined(CONFIG_THUMB2_KERNEL) set_current r7 - ARM( ldmia r4, {r4 - sl, fp, sp, pc} ) @ Load all regs saved previously - THUMB( ldmia ip!, {r4 - sl, fp} ) @ Load all regs saved previously - THUMB( ldr sp, [ip], #4 ) - THUMB( ldr pc, [ip] ) + ldmia r4, {r4 - sl, fp, sp, pc} @ Load all regs saved previously +#else + mov r1, r7 + ldmia r4, {r4 - sl, fp, ip, lr} @ Load all regs saved previously + + @ When CONFIG_THREAD_INFO_IN_TASK=n, the update of SP itself is what + @ effectuates the task switch, as that is what causes the observable + @ values of current and current_thread_info to change. When + @ CONFIG_THREAD_INFO_IN_TASK=y, setting current (and therefore + @ current_thread_info) is done explicitly, and the update of SP just + @ switches us to another stack, with few other side effects. In order + @ to prevent this distinction from causing any inconsistencies, let's + @ keep the 'set_current' call as close as we can to the update of SP. 
+ set_current r1 + mov sp, ip + ret lr +#endif UNWIND(.fnend ) ENDPROC(__switch_to) -- 2.30.2 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v4 6/7] ARM: entry: rework stack realignment code in svc_entry 2021-11-22 9:28 [PATCH v4 0/7] ARM: add vmap'ed stack support Ard Biesheuvel ` (4 preceding siblings ...) 2021-11-22 9:28 ` [PATCH v4 5/7] ARM: switch_to: clean up Thumb2 code path Ard Biesheuvel @ 2021-11-22 9:28 ` Ard Biesheuvel 2021-11-22 9:28 ` [PATCH v4 7/7] ARM: implement support for vmap'ed stacks Ard Biesheuvel 6 siblings, 0 replies; 27+ messages in thread From: Ard Biesheuvel @ 2021-11-22 9:28 UTC (permalink / raw) To: linux-arm-kernel Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren The original Thumb-2 enablement patches updated the stack realignment code in svc_entry to work around the lack of a STMIB instruction in Thumb-2, by subtracting 4 from the frame size, inverting the sense of the misalignment check, and changing to a STMIA instruction and a final stack push of a 4-byte quantity that results in the stack becoming aligned at the end of the sequence. It also pushes and pops R0 to the stack in order to have a temp register that Thumb-2 allows in general-purpose ALU instructions, as TST using SP is not permitted. Both are a bit problematic for vmap'ed stacks, as using the stack is only permitted after we decide that we did not overflow the stack, or have already switched to the overflow stack. As for the alignment check: the current approach creates a corner case where, if the initial SUB of SP ends up right at the start of the stack, we will end up subtracting another 8 bytes and overflowing it. This means we would need to add the overflow check *after* the SUB that deliberately misaligns the stack. However, this would require us to keep local state (i.e., whether we performed the subtract or not) across the overflow check, but without any GPRs or stack available. So let's switch to an approach where we don't use the stack, and where the alignment check of the stack pointer occurs in the usual way, as this is guaranteed not to result in overflow. This means we will be able to do the overflow check first. While at it, switch to R1 so the mode stack pointer in R0 remains accessible. 
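For reference, the resulting Thumb-2 sequence can be traced symbolically; S is the SP value on entry to the sequence (i.e., after the frame has been reserved) and R is the caller's R1:

	add	sp, r1		@ SP = S + R
	sub	r1, sp, r1	@ R1 = (S + R) - R = S, old SP in a GPR
	tst	r1, #4		@ alignment test on S; SP untouched
	sub	r1, sp, r1	@ R1 = (S + R) - S = R, R1 restored
	sub	sp, r1		@ SP = (S + R) - R = S, SP restored

No stack access and no extra temp register are needed, and R0 stays free to hold the mode stack pointer.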
Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> --- arch/arm/kernel/entry-armv.S | 25 +++++++++++--------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index ce8ca29461de..b447f7d0708c 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -191,24 +191,27 @@ ENDPROC(__und_invalid) .macro svc_entry, stack_hole=0, trace=1, uaccess=1 UNWIND(.fnstart ) UNWIND(.save {r0 - pc} ) - sub sp, sp, #(SVC_REGS_SIZE + \stack_hole - 4) + sub sp, sp, #(SVC_REGS_SIZE + \stack_hole) #ifdef CONFIG_THUMB2_KERNEL - SPFIX( str r0, [sp] ) @ temporarily saved - SPFIX( mov r0, sp ) - SPFIX( tst r0, #4 ) @ test original stack alignment - SPFIX( ldr r0, [sp] ) @ restored + add sp, r1 @ get SP in a GPR without + sub r1, sp, r1 @ using a temp register + tst r1, #4 @ test stack pointer alignment + sub r1, sp, r1 @ restore original R0 + sub sp, r1 @ restore original SP #else SPFIX( tst sp, #4 ) #endif - SPFIX( subeq sp, sp, #4 ) - stmia sp, {r1 - r12} + SPFIX( subne sp, sp, #4 ) + + ARM( stmib sp, {r1 - r12} ) + THUMB( stmia sp, {r0 - r12} ) @ No STMIB in Thumb-2 ldmia r0, {r3 - r5} - add r7, sp, #S_SP - 4 @ here for interlock avoidance + add r7, sp, #S_SP @ here for interlock avoidance mov r6, #-1 @ "" "" "" "" - add r2, sp, #(SVC_REGS_SIZE + \stack_hole - 4) - SPFIX( addeq r2, r2, #4 ) - str r3, [sp, #-4]! @ save the "real" r0 copied + add r2, sp, #(SVC_REGS_SIZE + \stack_hole) + SPFIX( addne r2, r2, #4 ) + str r3, [sp] @ save the "real" r0 copied @ from the exception stack mov r3, lr -- 2.30.2 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-11-22 9:28 [PATCH v4 0/7] ARM: add vmap'ed stack support Ard Biesheuvel ` (5 preceding siblings ...) 2021-11-22 9:28 ` [PATCH v4 6/7] ARM: entry: rework stack realignment code in svc_entry Ard Biesheuvel @ 2021-11-22 9:28 ` Ard Biesheuvel [not found] ` <CGME20211221103854eucas1p2592e38fcc84c1c3506fce87f1dab6739@eucas1p2.samsung.com> 6 siblings, 1 reply; 27+ messages in thread From: Ard Biesheuvel @ 2021-11-22 9:28 UTC (permalink / raw) To: linux-arm-kernel Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren Wire up the generic support for managing task stack allocations via vmalloc, and implement the entry code that detects whether we faulted because of a stack overrun (or future stack overrun caused by pushing the pt_regs array) While this adds a fair amount of tricky entry asm code, it should be noted that it only adds a TST + branch to the svc_entry path. The code implementing the non-trivial handling of the overflow stack is emitted out-of-line into the .text section. Since on ARM, we rely on do_translation_fault() to keep PMD level page table entries that cover the vmalloc region up to date, we need to ensure that we don't hit such a stale PMD entry when accessing the stack. So we do a dummy read from the new stack while still running from the old one on the context switch path, and bump the vmalloc_seq counter when PMD level entries in the vmalloc range are modified, so that the MM switch fetches the latest version of the entries. Note that we need to increase the per-mode stack by 1 word, to gain some space to stash a GPR until we know it is safe to touch the stack. However, due to the cacheline alignment of the struct, this does not actually increase the memory footprint of the struct stack array at all. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> --- arch/arm/Kconfig | 1 + arch/arm/include/asm/page.h | 4 + arch/arm/include/asm/thread_info.h | 8 ++ arch/arm/kernel/entry-armv.S | 97 +++++++++++++++++++- arch/arm/kernel/entry-header.S | 37 ++++++++ arch/arm/kernel/irq.c | 9 +- arch/arm/kernel/setup.c | 8 +- arch/arm/kernel/sleep.S | 8 ++ arch/arm/kernel/traps.c | 80 +++++++++++++++- arch/arm/kernel/unwind.c | 3 +- arch/arm/kernel/vmlinux.lds.S | 4 +- 11 files changed, 244 insertions(+), 15 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index b1eba1b4168c..7a0853bd298f 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -127,6 +127,7 @@ config ARM select RTC_LIB select SYS_SUPPORTS_APM_EMULATION select THREAD_INFO_IN_TASK if CURRENT_POINTER_IN_TPIDRURO + select HAVE_ARCH_VMAP_STACK if THREAD_INFO_IN_TASK && (!LD_IS_LLD || LLD_VERSION >= 140000) select TRACE_IRQFLAGS_SUPPORT if !CPU_V7M # Above selects are sorted alphabetically; please add new ones # according to that. Thanks. 
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h index 11b058a72a5b..7b871ed99ccf 100644 --- a/arch/arm/include/asm/page.h +++ b/arch/arm/include/asm/page.h @@ -149,6 +149,10 @@ extern void copy_page(void *to, const void *from); #include <asm/pgtable-2level-types.h> #endif +#ifdef CONFIG_VMAP_STACK +#define ARCH_PAGE_TABLE_SYNC_MASK PGTBL_PMD_MODIFIED +#endif + #endif /* CONFIG_MMU */ typedef struct page *pgtable_t; diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h index 164e15f26485..004b89d86224 100644 --- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -25,6 +25,14 @@ #define THREAD_SIZE (PAGE_SIZE << THREAD_SIZE_ORDER) #define THREAD_START_SP (THREAD_SIZE - 8) +#ifdef CONFIG_VMAP_STACK +#define THREAD_ALIGN (2 * THREAD_SIZE) +#else +#define THREAD_ALIGN THREAD_SIZE +#endif + +#define OVERFLOW_STACK_SIZE SZ_4K + #ifndef __ASSEMBLY__ struct task_struct; diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index b447f7d0708c..54210dce80e1 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -57,6 +57,10 @@ UNWIND( .setfp fpreg, sp ) @ subs r2, sp, r0 @ SP above bottom of IRQ stack? rsbscs r2, r2, #THREAD_SIZE @ ... and below the top? +#ifdef CONFIG_VMAP_STACK + ldr_l r2, high_memory, cc @ End of the linear region + cmpcc r2, r0 @ Stack pointer was below it? +#endif movcs sp, r0 @ If so, revert to incoming SP #ifndef CONFIG_UNWINDER_ARM @@ -188,13 +192,18 @@ ENDPROC(__und_invalid) #define SPFIX(code...) #endif - .macro svc_entry, stack_hole=0, trace=1, uaccess=1 + .macro svc_entry, stack_hole=0, trace=1, uaccess=1, overflow_check=1 UNWIND(.fnstart ) - UNWIND(.save {r0 - pc} ) sub sp, sp, #(SVC_REGS_SIZE + \stack_hole) + THUMB( add sp, r1 ) @ get SP in a GPR without + THUMB( sub r1, sp, r1 ) @ using a temp register + + .if \overflow_check + UNWIND(.save {r0 - pc} ) + do_overflow_check (SVC_REGS_SIZE + \stack_hole) + .endif + #ifdef CONFIG_THUMB2_KERNEL - add sp, r1 @ get SP in a GPR without - sub r1, sp, r1 @ using a temp register tst r1, #4 @ test stack pointer alignment sub r1, sp, r1 @ restore original R0 sub sp, r1 @ restore original SP @@ -827,12 +836,20 @@ ENTRY(__switch_to) str r7, [r8] #endif mov r0, r5 -#if !defined(CONFIG_THUMB2_KERNEL) +#if !defined(CONFIG_THUMB2_KERNEL) && !defined(CONFIG_VMAP_STACK) set_current r7 ldmia r4, {r4 - sl, fp, sp, pc} @ Load all regs saved previously #else mov r1, r7 ldmia r4, {r4 - sl, fp, ip, lr} @ Load all regs saved previously +#ifdef CONFIG_VMAP_STACK + @ + @ Do a dummy read from the new stack while running from the old one so + @ that we can rely on do_translation_fault() to fix up any stale PMD + @ entries covering the vmalloc region. + @ + ldr r2, [ip] +#endif @ When CONFIG_THREAD_INFO_IN_TASK=n, the update of SP itself is what @ effectuates the task switch, as that is what causes the observable @@ -849,6 +866,76 @@ ENTRY(__switch_to) UNWIND(.fnend ) ENDPROC(__switch_to) +#ifdef CONFIG_VMAP_STACK + .text + .align 2 +__bad_stack: + @ + @ We've just detected an overflow. We need to load the address of this + @ CPU's overflow stack into the stack pointer register. We have only one + @ scratch register so let's use a sequence of ADDs including one + @ involving the PC, and decorate them with PC-relative group + @ relocations. As these are ARM only, switch to ARM mode first. + @ + @ We enter here with IP clobbered and its value stashed on the mode + @ stack. 
+ @ +THUMB( bx pc ) +THUMB( nop ) +THUMB( .arm ) + mrc p15, 0, ip, c13, c0, 4 @ Get per-CPU offset + + .globl overflow_stack_ptr + .reloc 0f, R_ARM_ALU_PC_G0_NC, overflow_stack_ptr + .reloc 1f, R_ARM_ALU_PC_G1_NC, overflow_stack_ptr + .reloc 2f, R_ARM_LDR_PC_G2, overflow_stack_ptr + add ip, ip, pc +0: add ip, ip, #-4 +1: add ip, ip, #0 +2: ldr ip, [ip, #4] + + str sp, [ip, #-4]! @ Preserve original SP value + mov sp, ip @ Switch to overflow stack + pop {ip} @ Original SP in IP + +#if defined(CONFIG_UNWINDER_FRAME_POINTER) && defined(CONFIG_CC_IS_GCC) + mov ip, ip @ mov expected by unwinder + push {fp, ip, lr, pc} @ GCC flavor frame record +#else + str ip, [sp, #-8]! @ store original SP + push {fpreg, lr} @ Clang flavor frame record +#endif +UNWIND( ldr ip, [r0, #4] ) @ load exception LR +UNWIND( str ip, [sp, #12] ) @ store in the frame record + ldr ip, [r0, #12] @ reload IP + + @ Store the original GPRs to the new stack. + svc_entry uaccess=0, overflow_check=0 + +UNWIND( .save {sp, pc} ) +UNWIND( .save {fpreg, lr} ) +UNWIND( .setfp fpreg, sp ) + + ldr fpreg, [sp, #S_SP] @ Add our frame record + @ to the linked list +#if defined(CONFIG_UNWINDER_FRAME_POINTER) && defined(CONFIG_CC_IS_GCC) + ldr r1, [fp, #4] @ reload SP at entry + add fp, fp, #12 +#else + ldr r1, [fpreg, #8] +#endif + str r1, [sp, #S_SP] @ store in pt_regs + + @ Stash the regs for handle_bad_stack + mov r0, sp + + @ Time to die + bl handle_bad_stack + nop +UNWIND( .fnend ) +ENDPROC(__bad_stack) +#endif + __INIT /* diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S index ae24dd54e9ef..81df2a3561ca 100644 --- a/arch/arm/kernel/entry-header.S +++ b/arch/arm/kernel/entry-header.S @@ -423,3 +423,40 @@ scno .req r7 @ syscall number tbl .req r8 @ syscall table pointer why .req r8 @ Linux syscall (!= 0) tsk .req r9 @ current thread_info + + .macro do_overflow_check, frame_size:req +#ifdef CONFIG_VMAP_STACK + @ + @ Test whether the SP has overflowed. Task and IRQ stacks are aligned + @ so that SP & BIT(THREAD_SIZE_ORDER + PAGE_SHIFT) should always be + @ zero. + @ +ARM( tst sp, #1 << (THREAD_SIZE_ORDER + PAGE_SHIFT) ) +THUMB( tst r1, #1 << (THREAD_SIZE_ORDER + PAGE_SHIFT) ) +THUMB( it ne ) + bne .Lstack_overflow_check\@ + + .pushsection .text +.Lstack_overflow_check\@: + @ + @ The stack pointer is not pointing to a valid vmap'ed stack, but it + @ may be pointing into the linear map instead, which may happen if we + @ are already running from the overflow stack. We cannot detect overflow + @ in such cases so just carry on. + @ + str ip, [r0, #12] @ Stash IP on the mode stack + ldr_l ip, high_memory @ Start of VMALLOC space +ARM( cmp sp, ip ) @ SP in vmalloc space? 
+THUMB( cmp r1, ip ) +THUMB( itt lo ) + ldrlo ip, [r0, #12] @ Restore IP + blo .Lout\@ @ Carry on + +THUMB( sub r1, sp, r1 ) @ Restore original R1 +THUMB( sub sp, r1 ) @ Restore original SP + add sp, sp, #\frame_size @ Undo svc_entry's SP change + b __bad_stack @ Handle VMAP stack overflow + .popsection +.Lout\@: +#endif + .endm diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c index e05219bca218..5deb40f39999 100644 --- a/arch/arm/kernel/irq.c +++ b/arch/arm/kernel/irq.c @@ -56,7 +56,14 @@ static void __init init_irq_stacks(void) int cpu; for_each_possible_cpu(cpu) { - stack = (u8 *)__get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER); + if (!IS_ENABLED(CONFIG_VMAP_STACK)) + stack = (u8 *)__get_free_pages(GFP_KERNEL, + THREAD_SIZE_ORDER); + else + stack = __vmalloc_node(THREAD_SIZE, THREAD_ALIGN, + THREADINFO_GFP, NUMA_NO_NODE, + __builtin_return_address(0)); + if (WARN_ON(!stack)) break; per_cpu(irq_stack_ptr, cpu) = &stack[THREAD_SIZE]; diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index 284a80c0b6e1..039feb7cd590 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -141,10 +141,10 @@ EXPORT_SYMBOL(outer_cache); int __cpu_architecture __read_mostly = CPU_ARCH_UNKNOWN; struct stack { - u32 irq[3]; - u32 abt[3]; - u32 und[3]; - u32 fiq[3]; + u32 irq[4]; + u32 abt[4]; + u32 und[4]; + u32 fiq[4]; } ____cacheline_aligned; #ifndef CONFIG_CPU_V7M diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S index 43077e11dafd..803b51e5cba0 100644 --- a/arch/arm/kernel/sleep.S +++ b/arch/arm/kernel/sleep.S @@ -67,6 +67,14 @@ ENTRY(__cpu_suspend) ldr r4, =cpu_suspend_size #endif mov r5, sp @ current virtual SP +#ifdef CONFIG_VMAP_STACK + @ Run the suspend code from the overflow stack so we don't have to rely + @ on vmalloc-to-phys conversions anywhere in the arch suspend code. + @ The original SP value captured in R5 will be restored on the way out. 
+ mov_l r6, overflow_stack_ptr @ Base pointer + mrc p15, 0, r7, c13, c0, 4 @ Get per-CPU offset + ldr sp, [r6, r7] @ Address of this CPU's overflow stack +#endif add r4, r4, #12 @ Space for pgd, virt sp, phys resume fn sub sp, sp, r4 @ allocate CPU state on stack ldr r3, =sleep_save_sp diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index b42c446cec9a..b28a705c49cb 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -121,7 +121,8 @@ void dump_backtrace_stm(u32 *stack, u32 instruction, const char *loglvl) static int verify_stack(unsigned long sp) { if (sp < PAGE_OFFSET || - (sp > (unsigned long)high_memory && high_memory != NULL)) + (!IS_ENABLED(CONFIG_VMAP_STACK) && + sp > (unsigned long)high_memory && high_memory != NULL)) return -EFAULT; return 0; @@ -291,7 +292,8 @@ static int __die(const char *str, int err, struct pt_regs *regs) if (!user_mode(regs) || in_interrupt()) { dump_mem(KERN_EMERG, "Stack: ", regs->ARM_sp, - ALIGN(regs->ARM_sp, THREAD_SIZE)); + ALIGN(regs->ARM_sp - THREAD_SIZE, THREAD_ALIGN) + + THREAD_SIZE); dump_backtrace(regs, tsk, KERN_EMERG); dump_instr(KERN_EMERG, regs); } @@ -838,3 +840,77 @@ void __init early_trap_init(void *vectors_base) */ #endif } + +#ifdef CONFIG_VMAP_STACK + +DECLARE_PER_CPU(u8 *, irq_stack_ptr); + +asmlinkage DEFINE_PER_CPU(u8 *, overflow_stack_ptr); + +static int __init allocate_overflow_stacks(void) +{ + u8 *stack; + int cpu; + + for_each_possible_cpu(cpu) { + stack = (u8 *)__get_free_page(GFP_KERNEL); + if (WARN_ON(!stack)) + return -ENOMEM; + per_cpu(overflow_stack_ptr, cpu) = &stack[OVERFLOW_STACK_SIZE]; + } + return 0; +} +early_initcall(allocate_overflow_stacks); + +asmlinkage void handle_bad_stack(struct pt_regs *regs) +{ + unsigned long tsk_stk = (unsigned long)current->stack; + unsigned long irq_stk = (unsigned long)this_cpu_read(irq_stack_ptr); + unsigned long ovf_stk = (unsigned long)this_cpu_read(overflow_stack_ptr); + + console_verbose(); + pr_emerg("Insufficient stack space to handle exception!"); + + pr_emerg("Task stack: [0x%08lx..0x%08lx]\n", + tsk_stk, tsk_stk + THREAD_SIZE); + pr_emerg("IRQ stack: [0x%08lx..0x%08lx]\n", + irq_stk - THREAD_SIZE, irq_stk); + pr_emerg("Overflow stack: [0x%08lx..0x%08lx]\n", + ovf_stk - OVERFLOW_STACK_SIZE, ovf_stk); + + die("kernel stack overflow", regs, 0); +} + +/* + * Normally, we rely on the logic in do_translation_fault() to update stale PMD + * entries covering the vmalloc space in a task's page tables when it first + * accesses the region in question. Unfortunately, this is not sufficient when + * the task stack resides in the vmalloc region, as do_translation_fault() is a + * C function that needs a stack to run. + * + * So we need to ensure that these PMD entries are up to date *before* the MM + * switch. As we already have some logic in the MM switch path that takes care + * of this, let's trigger it by bumping the counter every time the core vmalloc + * code modifies a PMD entry in the vmalloc region. + */ +void arch_sync_kernel_mappings(unsigned long start, unsigned long end) +{ + if (start > VMALLOC_END || end < VMALLOC_START) + return; + + /* + * This hooks into the core vmalloc code to receive notifications of + * any PMD level changes that have been made to the kernel page tables. + * This means it should only be triggered once for every MiB worth of + * vmalloc space, given that we don't support huge vmalloc/vmap on ARM, + * and that kernel PMD level table entries are rarely (if ever) + * updated. 
+ * + * This means that the counter is going to max out at ~250 for the + * typical case. If it overflows, something entirely unexpected has + * occurred so let's throw a warning if that happens. + */ + WARN_ON(++init_mm.context.vmalloc_seq == UINT_MAX); +} + +#endif diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c index e8d729975f12..c5ea328c428d 100644 --- a/arch/arm/kernel/unwind.c +++ b/arch/arm/kernel/unwind.c @@ -389,7 +389,8 @@ int unwind_frame(struct stackframe *frame) /* store the highest address on the stack to avoid crossing it*/ ctrl.sp_low = frame->sp; - ctrl.sp_high = ALIGN(ctrl.sp_low, THREAD_SIZE); + ctrl.sp_high = ALIGN(ctrl.sp_low - THREAD_SIZE, THREAD_ALIGN) + + THREAD_SIZE; pr_debug("%s(pc = %08lx lr = %08lx sp = %08lx)\n", __func__, frame->pc, frame->lr, frame->sp); diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S index f02d617e3359..aa12b65a7fd6 100644 --- a/arch/arm/kernel/vmlinux.lds.S +++ b/arch/arm/kernel/vmlinux.lds.S @@ -138,12 +138,12 @@ SECTIONS #ifdef CONFIG_STRICT_KERNEL_RWX . = ALIGN(1<<SECTION_SHIFT); #else - . = ALIGN(THREAD_SIZE); + . = ALIGN(THREAD_ALIGN); #endif __init_end = .; _sdata = .; - RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE) + RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_ALIGN) _edata = .; BSS_SECTION(0, 0, 0) -- 2.30.2 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 27+ messages in thread
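The doubled THREAD_ALIGN in the patch above is what makes the single-TST overflow test in do_overflow_check sound. A worked example, assuming 8 KiB stacks (THREAD_SIZE_ORDER = 1, 4 KiB pages) aligned to 16 KiB, with a made-up base address and label (ARM flavour shown; the Thumb-2 path tests R1 instead):

	@ A valid stack spans [base, base + 8K) with base 16K-aligned,
	@ so bit 13 of every in-range SP is zero. Descending below base
	@ sets bit 13 for a full THREAD_SIZE worth of addresses:
	@
	@   base = 0xf0804000:  SP = 0xf0805ff0 -> bit 13 clear, OK
	@                       SP = 0xf0803ff0 -> bit 13 set, overflow
	@
	tst	sp, #1 << 13		@ THREAD_SIZE_ORDER + PAGE_SHIFT
	bne	.Lstack_overflow	@ handle the overflow out of line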
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks [not found] ` <CGME20211221103854eucas1p2592e38fcc84c1c3506fce87f1dab6739@eucas1p2.samsung.com> @ 2021-12-21 10:38 ` Marek Szyprowski 2021-12-21 10:42 ` Krzysztof Kozlowski 2021-12-21 10:44 ` Ard Biesheuvel 0 siblings, 2 replies; 27+ messages in thread From: Marek Szyprowski @ 2021-12-21 10:38 UTC (permalink / raw) To: Ard Biesheuvel, linux-arm-kernel Cc: Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, 'Linux Samsung SOC' Hi, On 22.11.2021 10:28, Ard Biesheuvel wrote: > Wire up the generic support for managing task stack allocations via vmalloc, > and implement the entry code that detects whether we faulted because of a > stack overrun (or future stack overrun caused by pushing the pt_regs array) > > While this adds a fair amount of tricky entry asm code, it should be > noted that it only adds a TST + branch to the svc_entry path. The code > implementing the non-trivial handling of the overflow stack is emitted > out-of-line into the .text section. > > Since on ARM, we rely on do_translation_fault() to keep PMD level page > table entries that cover the vmalloc region up to date, we need to > ensure that we don't hit such a stale PMD entry when accessing the > stack. So we do a dummy read from the new stack while still running from > the old one on the context switch path, and bump the vmalloc_seq counter > when PMD level entries in the vmalloc range are modified, so that the MM > switch fetches the latest version of the entries. > > Note that we need to increase the per-mode stack by 1 word, to gain some > space to stash a GPR until we know it is safe to touch the stack. > However, due to the cacheline alignment of the struct, this does not > actually increase the memory footprint of the struct stack array at all. > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > Tested-by: Keith Packard <keithpac@amazon.com> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 ("ARM: implement support for vmap'ed stacks"). Sadly it breaks suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the suspend/resume related code must be updated somehow (it partially works on physical addresses and disabled MMU), but I didn't analyze it yet. If you have any hints, let me know. > --- > arch/arm/Kconfig | 1 + > arch/arm/include/asm/page.h | 4 + > arch/arm/include/asm/thread_info.h | 8 ++ > arch/arm/kernel/entry-armv.S | 97 +++++++++++++++++++- > arch/arm/kernel/entry-header.S | 37 ++++++++ > arch/arm/kernel/irq.c | 9 +- > arch/arm/kernel/setup.c | 8 +- > arch/arm/kernel/sleep.S | 8 ++ > arch/arm/kernel/traps.c | 80 +++++++++++++++- > arch/arm/kernel/unwind.c | 3 +- > arch/arm/kernel/vmlinux.lds.S | 4 +- > 11 files changed, 244 insertions(+), 15 deletions(-) > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index b1eba1b4168c..7a0853bd298f 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -127,6 +127,7 @@ config ARM > select RTC_LIB > select SYS_SUPPORTS_APM_EMULATION > select THREAD_INFO_IN_TASK if CURRENT_POINTER_IN_TPIDRURO > + select HAVE_ARCH_VMAP_STACK if THREAD_INFO_IN_TASK && (!LD_IS_LLD || LLD_VERSION >= 140000) > select TRACE_IRQFLAGS_SUPPORT if !CPU_V7M > # Above selects are sorted alphabetically; please add new ones > # according to that. Thanks. 
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h > index 11b058a72a5b..7b871ed99ccf 100644 > --- a/arch/arm/include/asm/page.h > +++ b/arch/arm/include/asm/page.h > @@ -149,6 +149,10 @@ extern void copy_page(void *to, const void *from); > #include <asm/pgtable-2level-types.h> > #endif > > +#ifdef CONFIG_VMAP_STACK > +#define ARCH_PAGE_TABLE_SYNC_MASK PGTBL_PMD_MODIFIED > +#endif > + > #endif /* CONFIG_MMU */ > > typedef struct page *pgtable_t; > diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h > index 164e15f26485..004b89d86224 100644 > --- a/arch/arm/include/asm/thread_info.h > +++ b/arch/arm/include/asm/thread_info.h > @@ -25,6 +25,14 @@ > #define THREAD_SIZE (PAGE_SIZE << THREAD_SIZE_ORDER) > #define THREAD_START_SP (THREAD_SIZE - 8) > > +#ifdef CONFIG_VMAP_STACK > +#define THREAD_ALIGN (2 * THREAD_SIZE) > +#else > +#define THREAD_ALIGN THREAD_SIZE > +#endif > + > +#define OVERFLOW_STACK_SIZE SZ_4K > + > #ifndef __ASSEMBLY__ > > struct task_struct; > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > index b447f7d0708c..54210dce80e1 100644 > --- a/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -57,6 +57,10 @@ UNWIND( .setfp fpreg, sp ) > @ > subs r2, sp, r0 @ SP above bottom of IRQ stack? > rsbscs r2, r2, #THREAD_SIZE @ ... and below the top? > +#ifdef CONFIG_VMAP_STACK > + ldr_l r2, high_memory, cc @ End of the linear region > + cmpcc r2, r0 @ Stack pointer was below it? > +#endif > movcs sp, r0 @ If so, revert to incoming SP > > #ifndef CONFIG_UNWINDER_ARM > @@ -188,13 +192,18 @@ ENDPROC(__und_invalid) > #define SPFIX(code...) > #endif > > - .macro svc_entry, stack_hole=0, trace=1, uaccess=1 > + .macro svc_entry, stack_hole=0, trace=1, uaccess=1, overflow_check=1 > UNWIND(.fnstart ) > - UNWIND(.save {r0 - pc} ) > sub sp, sp, #(SVC_REGS_SIZE + \stack_hole) > + THUMB( add sp, r1 ) @ get SP in a GPR without > + THUMB( sub r1, sp, r1 ) @ using a temp register > + > + .if \overflow_check > + UNWIND(.save {r0 - pc} ) > + do_overflow_check (SVC_REGS_SIZE + \stack_hole) > + .endif > + > #ifdef CONFIG_THUMB2_KERNEL > - add sp, r1 @ get SP in a GPR without > - sub r1, sp, r1 @ using a temp register > tst r1, #4 @ test stack pointer alignment > sub r1, sp, r1 @ restore original R0 > sub sp, r1 @ restore original SP > @@ -827,12 +836,20 @@ ENTRY(__switch_to) > str r7, [r8] > #endif > mov r0, r5 > -#if !defined(CONFIG_THUMB2_KERNEL) > +#if !defined(CONFIG_THUMB2_KERNEL) && !defined(CONFIG_VMAP_STACK) > set_current r7 > ldmia r4, {r4 - sl, fp, sp, pc} @ Load all regs saved previously > #else > mov r1, r7 > ldmia r4, {r4 - sl, fp, ip, lr} @ Load all regs saved previously > +#ifdef CONFIG_VMAP_STACK > + @ > + @ Do a dummy read from the new stack while running from the old one so > + @ that we can rely on do_translation_fault() to fix up any stale PMD > + @ entries covering the vmalloc region. > + @ > + ldr r2, [ip] > +#endif > > @ When CONFIG_THREAD_INFO_IN_TASK=n, the update of SP itself is what > @ effectuates the task switch, as that is what causes the observable > @@ -849,6 +866,76 @@ ENTRY(__switch_to) > UNWIND(.fnend ) > ENDPROC(__switch_to) > > +#ifdef CONFIG_VMAP_STACK > + .text > + .align 2 > +__bad_stack: > + @ > + @ We've just detected an overflow. We need to load the address of this > + @ CPU's overflow stack into the stack pointer register. 
We have only one > + @ scratch register so let's use a sequence of ADDs including one > + @ involving the PC, and decorate them with PC-relative group > + @ relocations. As these are ARM only, switch to ARM mode first. > + @ > + @ We enter here with IP clobbered and its value stashed on the mode > + @ stack. > + @ > +THUMB( bx pc ) > +THUMB( nop ) > +THUMB( .arm ) > + mrc p15, 0, ip, c13, c0, 4 @ Get per-CPU offset > + > + .globl overflow_stack_ptr > + .reloc 0f, R_ARM_ALU_PC_G0_NC, overflow_stack_ptr > + .reloc 1f, R_ARM_ALU_PC_G1_NC, overflow_stack_ptr > + .reloc 2f, R_ARM_LDR_PC_G2, overflow_stack_ptr > + add ip, ip, pc > +0: add ip, ip, #-4 > +1: add ip, ip, #0 > +2: ldr ip, [ip, #4] > + > + str sp, [ip, #-4]! @ Preserve original SP value > + mov sp, ip @ Switch to overflow stack > + pop {ip} @ Original SP in IP > + > +#if defined(CONFIG_UNWINDER_FRAME_POINTER) && defined(CONFIG_CC_IS_GCC) > + mov ip, ip @ mov expected by unwinder > + push {fp, ip, lr, pc} @ GCC flavor frame record > +#else > + str ip, [sp, #-8]! @ store original SP > + push {fpreg, lr} @ Clang flavor frame record > +#endif > +UNWIND( ldr ip, [r0, #4] ) @ load exception LR > +UNWIND( str ip, [sp, #12] ) @ store in the frame record > + ldr ip, [r0, #12] @ reload IP > + > + @ Store the original GPRs to the new stack. > + svc_entry uaccess=0, overflow_check=0 > + > +UNWIND( .save {sp, pc} ) > +UNWIND( .save {fpreg, lr} ) > +UNWIND( .setfp fpreg, sp ) > + > + ldr fpreg, [sp, #S_SP] @ Add our frame record > + @ to the linked list > +#if defined(CONFIG_UNWINDER_FRAME_POINTER) && defined(CONFIG_CC_IS_GCC) > + ldr r1, [fp, #4] @ reload SP at entry > + add fp, fp, #12 > +#else > + ldr r1, [fpreg, #8] > +#endif > + str r1, [sp, #S_SP] @ store in pt_regs > + > + @ Stash the regs for handle_bad_stack > + mov r0, sp > + > + @ Time to die > + bl handle_bad_stack > + nop > +UNWIND( .fnend ) > +ENDPROC(__bad_stack) > +#endif > + > __INIT > > /* > diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S > index ae24dd54e9ef..81df2a3561ca 100644 > --- a/arch/arm/kernel/entry-header.S > +++ b/arch/arm/kernel/entry-header.S > @@ -423,3 +423,40 @@ scno .req r7 @ syscall number > tbl .req r8 @ syscall table pointer > why .req r8 @ Linux syscall (!= 0) > tsk .req r9 @ current thread_info > + > + .macro do_overflow_check, frame_size:req > +#ifdef CONFIG_VMAP_STACK > + @ > + @ Test whether the SP has overflowed. Task and IRQ stacks are aligned > + @ so that SP & BIT(THREAD_SIZE_ORDER + PAGE_SHIFT) should always be > + @ zero. > + @ > +ARM( tst sp, #1 << (THREAD_SIZE_ORDER + PAGE_SHIFT) ) > +THUMB( tst r1, #1 << (THREAD_SIZE_ORDER + PAGE_SHIFT) ) > +THUMB( it ne ) > + bne .Lstack_overflow_check\@ > + > + .pushsection .text > +.Lstack_overflow_check\@: > + @ > + @ The stack pointer is not pointing to a valid vmap'ed stack, but it > + @ may be pointing into the linear map instead, which may happen if we > + @ are already running from the overflow stack. We cannot detect overflow > + @ in such cases so just carry on. > + @ > + str ip, [r0, #12] @ Stash IP on the mode stack > + ldr_l ip, high_memory @ Start of VMALLOC space > +ARM( cmp sp, ip ) @ SP in vmalloc space? 
> +THUMB( cmp r1, ip ) > +THUMB( itt lo ) > + ldrlo ip, [r0, #12] @ Restore IP > + blo .Lout\@ @ Carry on > + > +THUMB( sub r1, sp, r1 ) @ Restore original R1 > +THUMB( sub sp, r1 ) @ Restore original SP > + add sp, sp, #\frame_size @ Undo svc_entry's SP change > + b __bad_stack @ Handle VMAP stack overflow > + .popsection > +.Lout\@: > +#endif > + .endm > diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c > index e05219bca218..5deb40f39999 100644 > --- a/arch/arm/kernel/irq.c > +++ b/arch/arm/kernel/irq.c > @@ -56,7 +56,14 @@ static void __init init_irq_stacks(void) > int cpu; > > for_each_possible_cpu(cpu) { > - stack = (u8 *)__get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER); > + if (!IS_ENABLED(CONFIG_VMAP_STACK)) > + stack = (u8 *)__get_free_pages(GFP_KERNEL, > + THREAD_SIZE_ORDER); > + else > + stack = __vmalloc_node(THREAD_SIZE, THREAD_ALIGN, > + THREADINFO_GFP, NUMA_NO_NODE, > + __builtin_return_address(0)); > + > if (WARN_ON(!stack)) > break; > per_cpu(irq_stack_ptr, cpu) = &stack[THREAD_SIZE]; > diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c > index 284a80c0b6e1..039feb7cd590 100644 > --- a/arch/arm/kernel/setup.c > +++ b/arch/arm/kernel/setup.c > @@ -141,10 +141,10 @@ EXPORT_SYMBOL(outer_cache); > int __cpu_architecture __read_mostly = CPU_ARCH_UNKNOWN; > > struct stack { > - u32 irq[3]; > - u32 abt[3]; > - u32 und[3]; > - u32 fiq[3]; > + u32 irq[4]; > + u32 abt[4]; > + u32 und[4]; > + u32 fiq[4]; > } ____cacheline_aligned; > > #ifndef CONFIG_CPU_V7M > diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S > index 43077e11dafd..803b51e5cba0 100644 > --- a/arch/arm/kernel/sleep.S > +++ b/arch/arm/kernel/sleep.S > @@ -67,6 +67,14 @@ ENTRY(__cpu_suspend) > ldr r4, =cpu_suspend_size > #endif > mov r5, sp @ current virtual SP > +#ifdef CONFIG_VMAP_STACK > + @ Run the suspend code from the overflow stack so we don't have to rely > + @ on vmalloc-to-phys conversions anywhere in the arch suspend code. > + @ The original SP value captured in R5 will be restored on the way out. 
> + mov_l r6, overflow_stack_ptr @ Base pointer > + mrc p15, 0, r7, c13, c0, 4 @ Get per-CPU offset > + ldr sp, [r6, r7] @ Address of this CPU's overflow stack > +#endif > add r4, r4, #12 @ Space for pgd, virt sp, phys resume fn > sub sp, sp, r4 @ allocate CPU state on stack > ldr r3, =sleep_save_sp > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > index b42c446cec9a..b28a705c49cb 100644 > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -121,7 +121,8 @@ void dump_backtrace_stm(u32 *stack, u32 instruction, const char *loglvl) > static int verify_stack(unsigned long sp) > { > if (sp < PAGE_OFFSET || > - (sp > (unsigned long)high_memory && high_memory != NULL)) > + (!IS_ENABLED(CONFIG_VMAP_STACK) && > + sp > (unsigned long)high_memory && high_memory != NULL)) > return -EFAULT; > > return 0; > @@ -291,7 +292,8 @@ static int __die(const char *str, int err, struct pt_regs *regs) > > if (!user_mode(regs) || in_interrupt()) { > dump_mem(KERN_EMERG, "Stack: ", regs->ARM_sp, > - ALIGN(regs->ARM_sp, THREAD_SIZE)); > + ALIGN(regs->ARM_sp - THREAD_SIZE, THREAD_ALIGN) > + + THREAD_SIZE); > dump_backtrace(regs, tsk, KERN_EMERG); > dump_instr(KERN_EMERG, regs); > } > @@ -838,3 +840,77 @@ void __init early_trap_init(void *vectors_base) > */ > #endif > } > + > +#ifdef CONFIG_VMAP_STACK > + > +DECLARE_PER_CPU(u8 *, irq_stack_ptr); > + > +asmlinkage DEFINE_PER_CPU(u8 *, overflow_stack_ptr); > + > +static int __init allocate_overflow_stacks(void) > +{ > + u8 *stack; > + int cpu; > + > + for_each_possible_cpu(cpu) { > + stack = (u8 *)__get_free_page(GFP_KERNEL); > + if (WARN_ON(!stack)) > + return -ENOMEM; > + per_cpu(overflow_stack_ptr, cpu) = &stack[OVERFLOW_STACK_SIZE]; > + } > + return 0; > +} > +early_initcall(allocate_overflow_stacks); > + > +asmlinkage void handle_bad_stack(struct pt_regs *regs) > +{ > + unsigned long tsk_stk = (unsigned long)current->stack; > + unsigned long irq_stk = (unsigned long)this_cpu_read(irq_stack_ptr); > + unsigned long ovf_stk = (unsigned long)this_cpu_read(overflow_stack_ptr); > + > + console_verbose(); > + pr_emerg("Insufficient stack space to handle exception!"); > + > + pr_emerg("Task stack: [0x%08lx..0x%08lx]\n", > + tsk_stk, tsk_stk + THREAD_SIZE); > + pr_emerg("IRQ stack: [0x%08lx..0x%08lx]\n", > + irq_stk - THREAD_SIZE, irq_stk); > + pr_emerg("Overflow stack: [0x%08lx..0x%08lx]\n", > + ovf_stk - OVERFLOW_STACK_SIZE, ovf_stk); > + > + die("kernel stack overflow", regs, 0); > +} > + > +/* > + * Normally, we rely on the logic in do_translation_fault() to update stale PMD > + * entries covering the vmalloc space in a task's page tables when it first > + * accesses the region in question. Unfortunately, this is not sufficient when > + * the task stack resides in the vmalloc region, as do_translation_fault() is a > + * C function that needs a stack to run. > + * > + * So we need to ensure that these PMD entries are up to date *before* the MM > + * switch. As we already have some logic in the MM switch path that takes care > + * of this, let's trigger it by bumping the counter every time the core vmalloc > + * code modifies a PMD entry in the vmalloc region. > + */ > +void arch_sync_kernel_mappings(unsigned long start, unsigned long end) > +{ > + if (start > VMALLOC_END || end < VMALLOC_START) > + return; > + > + /* > + * This hooks into the core vmalloc code to receive notifications of > + * any PMD level changes that have been made to the kernel page tables. 
> + * This means it should only be triggered once for every MiB worth of > + * vmalloc space, given that we don't support huge vmalloc/vmap on ARM, > + * and that kernel PMD level table entries are rarely (if ever) > + * updated. > + * > + * This means that the counter is going to max out at ~250 for the > + * typical case. If it overflows, something entirely unexpected has > + * occurred so let's throw a warning if that happens. > + */ > + WARN_ON(++init_mm.context.vmalloc_seq == UINT_MAX); > +} > + > +#endif > diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c > index e8d729975f12..c5ea328c428d 100644 > --- a/arch/arm/kernel/unwind.c > +++ b/arch/arm/kernel/unwind.c > @@ -389,7 +389,8 @@ int unwind_frame(struct stackframe *frame) > > /* store the highest address on the stack to avoid crossing it*/ > ctrl.sp_low = frame->sp; > - ctrl.sp_high = ALIGN(ctrl.sp_low, THREAD_SIZE); > + ctrl.sp_high = ALIGN(ctrl.sp_low - THREAD_SIZE, THREAD_ALIGN) > + + THREAD_SIZE; > > pr_debug("%s(pc = %08lx lr = %08lx sp = %08lx)\n", __func__, > frame->pc, frame->lr, frame->sp); > diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S > index f02d617e3359..aa12b65a7fd6 100644 > --- a/arch/arm/kernel/vmlinux.lds.S > +++ b/arch/arm/kernel/vmlinux.lds.S > @@ -138,12 +138,12 @@ SECTIONS > #ifdef CONFIG_STRICT_KERNEL_RWX > . = ALIGN(1<<SECTION_SHIFT); > #else > - . = ALIGN(THREAD_SIZE); > + . = ALIGN(THREAD_ALIGN); > #endif > __init_end = .; > > _sdata = .; > - RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE) > + RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_ALIGN) > _edata = .; > > BSS_SECTION(0, 0, 0) Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
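[Annotation] The arch_sync_kernel_mappings() hook in the hunk above is only the producer side of the vmalloc_seq scheme: it bumps a counter. The consumer runs on the context-switch path, where a counter mismatch against init_mm causes the kernel's top-level table entries covering the vmalloc region to be copied into the incoming mm. A rough sketch of that consumer, modeled on __check_vmalloc_seq() in arch/arm/mm/ioremap.c as it stood in mainline at the time (an approximation, not code from this series):

void __check_vmalloc_seq(struct mm_struct *mm)
{
        unsigned int seq;

        do {
                /* Snapshot the kernel's current idea of the vmalloc mappings */
                seq = init_mm.context.vmalloc_seq;
                /* Copy the top-level entries covering vmalloc space from init_mm */
                memcpy(pgd_offset(mm, VMALLOC_START),
                       pgd_offset_k(VMALLOC_START),
                       sizeof(pgd_t) * (pgd_index(VMALLOC_END) -
                                        pgd_index(VMALLOC_START)));
                mm->context.vmalloc_seq = seq;
                /* Retry if a PMD update raced with the copy */
        } while (seq != init_mm.context.vmalloc_seq);
}

The WARN_ON in the hunk above then merely guards the producer counter against wrapping.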
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 10:38 ` Marek Szyprowski @ 2021-12-21 10:42 ` Krzysztof Kozlowski 2021-12-21 10:46 ` Marek Szyprowski 2021-12-21 10:44 ` Ard Biesheuvel 1 sibling, 1 reply; 27+ messages in thread From: Krzysztof Kozlowski @ 2021-12-21 10:42 UTC (permalink / raw) To: Marek Szyprowski, Ard Biesheuvel, linux-arm-kernel Cc: Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, 'Linux Samsung SOC' On 21/12/2021 11:38, Marek Szyprowski wrote: > Hi, > > On 22.11.2021 10:28, Ard Biesheuvel wrote: >> Wire up the generic support for managing task stack allocations via vmalloc, >> and implement the entry code that detects whether we faulted because of a >> stack overrun (or future stack overrun caused by pushing the pt_regs array) >> >> While this adds a fair amount of tricky entry asm code, it should be >> noted that it only adds a TST + branch to the svc_entry path. The code >> implementing the non-trivial handling of the overflow stack is emitted >> out-of-line into the .text section. >> >> Since on ARM, we rely on do_translation_fault() to keep PMD level page >> table entries that cover the vmalloc region up to date, we need to >> ensure that we don't hit such a stale PMD entry when accessing the >> stack. So we do a dummy read from the new stack while still running from >> the old one on the context switch path, and bump the vmalloc_seq counter >> when PMD level entries in the vmalloc range are modified, so that the MM >> switch fetches the latest version of the entries. >> >> Note that we need to increase the per-mode stack by 1 word, to gain some >> space to stash a GPR until we know it is safe to touch the stack. >> However, due to the cacheline alignment of the struct, this does not >> actually increase the memory footprint of the struct stack array at all. >> >> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> >> Tested-by: Keith Packard <keithpac@amazon.com> > > > This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 > ("ARM: implement support for vmap'ed stacks"). Sadly it breaks > suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the > suspend/resume related code must be updated somehow (it partially works > on physical addresses and disabled MMU), but I didn't analyze it yet. If > you have any hints, let me know. > > Maybe this one would help? https://lore.kernel.org/lkml/20211218085843.212497-2-cuigaosheng1@huawei.com/ Best regards, Krzysztof _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
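[Annotation] The "TST + branch" that the quoted commit message refers to is cheap because CONFIG_VMAP_STACK aligns every THREAD_SIZE stack to THREAD_ALIGN (2 * THREAD_SIZE, per the thread_info.h hunk earlier in the thread), so a single bit of SP distinguishes a live stack from an overflowed one. An illustrative C rendering of the test (a sketch only; the real check is the tst/bne pair in the do_overflow_check macro):

/*
 * Sketch of the overflow test. With THREAD_SIZE == PAGE_SIZE <<
 * THREAD_SIZE_ORDER and stacks aligned to twice that, bit
 * (THREAD_SIZE_ORDER + PAGE_SHIFT) of SP is 0 everywhere inside a
 * valid stack and becomes 1 once SP underflows into the guard area
 * below it. SPs that never pointed into a vmap'ed stack (e.g. the
 * overflow stack itself) can trip this test, which is why the macro
 * follows up with the high_memory comparison before declaring overflow.
 */
static inline bool stack_overflowed(unsigned long sp)
{
        return sp & (1UL << (THREAD_SIZE_ORDER + PAGE_SHIFT));
}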
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 10:42 ` Krzysztof Kozlowski @ 2021-12-21 10:46 ` Marek Szyprowski 0 siblings, 0 replies; 27+ messages in thread From: Marek Szyprowski @ 2021-12-21 10:46 UTC (permalink / raw) To: Krzysztof Kozlowski, Ard Biesheuvel, linux-arm-kernel Cc: Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, 'Linux Samsung SOC' Hi Krzysztof, On 21.12.2021 11:42, Krzysztof Kozlowski wrote: > On 21/12/2021 11:38, Marek Szyprowski wrote: >> On 22.11.2021 10:28, Ard Biesheuvel wrote: >>> Wire up the generic support for managing task stack allocations via vmalloc, >>> and implement the entry code that detects whether we faulted because of a >>> stack overrun (or future stack overrun caused by pushing the pt_regs array) >>> >>> While this adds a fair amount of tricky entry asm code, it should be >>> noted that it only adds a TST + branch to the svc_entry path. The code >>> implementing the non-trivial handling of the overflow stack is emitted >>> out-of-line into the .text section. >>> >>> Since on ARM, we rely on do_translation_fault() to keep PMD level page >>> table entries that cover the vmalloc region up to date, we need to >>> ensure that we don't hit such a stale PMD entry when accessing the >>> stack. So we do a dummy read from the new stack while still running from >>> the old one on the context switch path, and bump the vmalloc_seq counter >>> when PMD level entries in the vmalloc range are modified, so that the MM >>> switch fetches the latest version of the entries. >>> >>> Note that we need to increase the per-mode stack by 1 word, to gain some >>> space to stash a GPR until we know it is safe to touch the stack. >>> However, due to the cacheline alignment of the struct, this does not >>> actually increase the memory footprint of the struct stack array at all. >>> >>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> >>> Tested-by: Keith Packard <keithpac@amazon.com> >> >> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 >> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks >> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the >> suspend/resume related code must be updated somehow (it partially works >> on physical addresses and disabled MMU), but I didn't analyze it yet. If >> you have any hints, let me know. > Maybe this one would help? > https://lore.kernel.org/lkml/20211218085843.212497-2-cuigaosheng1@huawei.com/ I forgot to mention. I've already checked it and it doesn't change/fix anything. It also doesn't break the old (pre-a1c510d0adc) code though. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 10:38 ` Marek Szyprowski 2021-12-21 10:42 ` Krzysztof Kozlowski @ 2021-12-21 10:44 ` Ard Biesheuvel 2021-12-21 11:15 ` Marek Szyprowski 1 sibling, 1 reply; 27+ messages in thread From: Ard Biesheuvel @ 2021-12-21 10:44 UTC (permalink / raw) To: Marek Szyprowski Cc: linux-arm-kernel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > Hi, > > On 22.11.2021 10:28, Ard Biesheuvel wrote: > > Wire up the generic support for managing task stack allocations via vmalloc, > > and implement the entry code that detects whether we faulted because of a > > stack overrun (or future stack overrun caused by pushing the pt_regs array) > > > > While this adds a fair amount of tricky entry asm code, it should be > > noted that it only adds a TST + branch to the svc_entry path. The code > > implementing the non-trivial handling of the overflow stack is emitted > > out-of-line into the .text section. > > > > Since on ARM, we rely on do_translation_fault() to keep PMD level page > > table entries that cover the vmalloc region up to date, we need to > > ensure that we don't hit such a stale PMD entry when accessing the > > stack. So we do a dummy read from the new stack while still running from > > the old one on the context switch path, and bump the vmalloc_seq counter > > when PMD level entries in the vmalloc range are modified, so that the MM > > switch fetches the latest version of the entries. > > > > Note that we need to increase the per-mode stack by 1 word, to gain some > > space to stash a GPR until we know it is safe to touch the stack. > > However, due to the cacheline alignment of the struct, this does not > > actually increase the memory footprint of the struct stack array at all. > > > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > > Tested-by: Keith Packard <keithpac@amazon.com> > > > This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 > ("ARM: implement support for vmap'ed stacks"). Sadly it breaks > suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the > suspend/resume related code must be updated somehow (it partially works > on physical addresses and disabled MMU), but I didn't analyze it yet. If > you have any hints, let me know. > Are there any such systems in KernelCI? We caught a suspend/resume related issue in development, which is why the hunk below was added. In general, any virt-to-phys translation involving an address on the stack will become problematic. Could you please confirm whether the issue persists with the patch applied but with CONFIG_VMAP_STACK turned off? Just so we know we are looking in the right place? > diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S > index 43077e11dafd..803b51e5cba0 100644 > --- a/arch/arm/kernel/sleep.S > +++ b/arch/arm/kernel/sleep.S > @@ -67,6 +67,14 @@ ENTRY(__cpu_suspend) > ldr r4, =cpu_suspend_size > #endif > mov r5, sp @ current virtual SP > +#ifdef CONFIG_VMAP_STACK > + @ Run the suspend code from the overflow stack so we don't have to rely > + @ on vmalloc-to-phys conversions anywhere in the arch suspend code. > + @ The original SP value captured in R5 will be restored on the way out.
> + mov_l r6, overflow_stack_ptr @ Base pointer > + mrc p15, 0, r7, c13, c0, 4 @ Get per-CPU offset > + ldr sp, [r6, r7] @ Address of this CPU's overflow stack > +#endif > add r4, r4, #12 @ Space for pgd, virt sp, phys resume fn > sub sp, sp, r4 @ allocate CPU state on stack > ldr r3, =sleep_save_sp _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
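[Annotation] To make the "virt-to-phys translation involving an address on the stack" hazard concrete: the linear-map translation is a constant offset, valid only for lowmem, so feeding it a vmalloc address (which is what SP now holds) silently yields a wrong physical address. A conceptual sketch, ignoring the runtime phys/virt offset patching ARM can apply (not the literal mainline macro):

/*
 * Conceptual linear-map translation: lowmem only. Applied to a
 * vmalloc address it produces garbage, which is fatal for suspend
 * code that stashes SP-derived physical addresses for use while the
 * MMU is off -- hence the switch to the overflow stack above.
 */
static inline phys_addr_t lowmem_virt_to_phys(unsigned long va)
{
        return (phys_addr_t)va - PAGE_OFFSET + PHYS_OFFSET;
}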
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 10:44 ` Ard Biesheuvel @ 2021-12-21 11:15 ` Marek Szyprowski 2021-12-21 13:34 ` Ard Biesheuvel 0 siblings, 1 reply; 27+ messages in thread From: Marek Szyprowski @ 2021-12-21 11:15 UTC (permalink / raw) To: Ard Biesheuvel Cc: linux-arm-kernel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC Hi Ard, On 21.12.2021 11:44, Ard Biesheuvel wrote: > On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >> On 22.11.2021 10:28, Ard Biesheuvel wrote: >>> Wire up the generic support for managing task stack allocations via vmalloc, >>> and implement the entry code that detects whether we faulted because of a >>> stack overrun (or future stack overrun caused by pushing the pt_regs array) >>> >>> While this adds a fair amount of tricky entry asm code, it should be >>> noted that it only adds a TST + branch to the svc_entry path. The code >>> implementing the non-trivial handling of the overflow stack is emitted >>> out-of-line into the .text section. >>> >>> Since on ARM, we rely on do_translation_fault() to keep PMD level page >>> table entries that cover the vmalloc region up to date, we need to >>> ensure that we don't hit such a stale PMD entry when accessing the >>> stack. So we do a dummy read from the new stack while still running from >>> the old one on the context switch path, and bump the vmalloc_seq counter >>> when PMD level entries in the vmalloc range are modified, so that the MM >>> switch fetches the latest version of the entries. >>> >>> Note that we need to increase the per-mode stack by 1 word, to gain some >>> space to stash a GPR until we know it is safe to touch the stack. >>> However, due to the cacheline alignment of the struct, this does not >>> actually increase the memory footprint of the struct stack array at all. >>> >>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> >>> Tested-by: Keith Packard <keithpac@amazon.com> >> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 >> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks >> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the >> suspend/resume related code must be updated somehow (it partially works >> on physical addresses and disabled MMU), but I didn't analyze it yet. If >> you have any hints, let me know. >> > Are there any such systems in KernelCI? We caught a suspend/resume > related issue in development, which is why the hunk below was added. I think that some Exynos-based Odroids (U3 and XU3) were some time ago available in KernelCI, but I don't know if they are still there. > In general, any virt-to-phys translation involving and address on the > stack will become problematic. > > Could you please confirm whether the issue persists with the patch > applied but with CONFIG_VMAP_STACK turned off? Just so we know we are > looking in the right place? I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume works fine both on commit a1c510d0adc6 and linux-next 20211220. 
>> diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S >> index 43077e11dafd..803b51e5cba0 100644 >> --- a/arch/arm/kernel/sleep.S >> +++ b/arch/arm/kernel/sleep.S >> @@ -67,6 +67,14 @@ ENTRY(__cpu_suspend) >> ldr r4, =cpu_suspend_size >> #endif >> mov r5, sp @ current virtual SP >> +#ifdef CONFIG_VMAP_STACK >> + @ Run the suspend code from the overflow stack so we don't have to rely >> + @ on vmalloc-to-phys conversions anywhere in the arch suspend code. >> + @ The original SP value captured in R5 will be restored on the way out. >> + mov_l r6, overflow_stack_ptr @ Base pointer >> + mrc p15, 0, r7, c13, c0, 4 @ Get per-CPU offset >> + ldr sp, [r6, r7] @ Address of this CPU's overflow stack >> +#endif >> add r4, r4, #12 @ Space for pgd, virt sp, phys resume fn >> sub sp, sp, r4 @ allocate CPU state on stack >> ldr r3, =sleep_save_sp Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 11:15 ` Marek Szyprowski @ 2021-12-21 13:34 ` Ard Biesheuvel 2021-12-21 13:51 ` Marek Szyprowski 0 siblings, 1 reply; 27+ messages in thread From: Ard Biesheuvel @ 2021-12-21 13:34 UTC (permalink / raw) To: Marek Szyprowski Cc: linux-arm-kernel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > Hi Ard, > > On 21.12.2021 11:44, Ard Biesheuvel wrote: > > On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > >> On 22.11.2021 10:28, Ard Biesheuvel wrote: > >>> Wire up the generic support for managing task stack allocations via vmalloc, > >>> and implement the entry code that detects whether we faulted because of a > >>> stack overrun (or future stack overrun caused by pushing the pt_regs array) > >>> > >>> While this adds a fair amount of tricky entry asm code, it should be > >>> noted that it only adds a TST + branch to the svc_entry path. The code > >>> implementing the non-trivial handling of the overflow stack is emitted > >>> out-of-line into the .text section. > >>> > >>> Since on ARM, we rely on do_translation_fault() to keep PMD level page > >>> table entries that cover the vmalloc region up to date, we need to > >>> ensure that we don't hit such a stale PMD entry when accessing the > >>> stack. So we do a dummy read from the new stack while still running from > >>> the old one on the context switch path, and bump the vmalloc_seq counter > >>> when PMD level entries in the vmalloc range are modified, so that the MM > >>> switch fetches the latest version of the entries. > >>> > >>> Note that we need to increase the per-mode stack by 1 word, to gain some > >>> space to stash a GPR until we know it is safe to touch the stack. > >>> However, due to the cacheline alignment of the struct, this does not > >>> actually increase the memory footprint of the struct stack array at all. > >>> > >>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > >>> Tested-by: Keith Packard <keithpac@amazon.com> > >> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 > >> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks > >> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the > >> suspend/resume related code must be updated somehow (it partially works > >> on physical addresses and disabled MMU), but I didn't analyze it yet. If > >> you have any hints, let me know. > >> > > Are there any such systems in KernelCI? We caught a suspend/resume > > related issue in development, which is why the hunk below was added. > > > I think that some Exynos-based Odroids (U3 and XU3) were some time ago > available in KernelCI, but I don't know if they are still there. > > > > In general, any virt-to-phys translation involving and address on the > > stack will become problematic. > > > > Could you please confirm whether the issue persists with the patch > > applied but with CONFIG_VMAP_STACK turned off? Just so we know we are > > looking in the right place? > > > I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume > works fine both on commit a1c510d0adc6 and linux-next 20211220. > Thanks. Any other context you can provide beyond 'does not work' ? 
_______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 13:34 ` Ard Biesheuvel @ 2021-12-21 13:51 ` Marek Szyprowski 2021-12-21 16:20 ` Ard Biesheuvel 0 siblings, 1 reply; 27+ messages in thread From: Marek Szyprowski @ 2021-12-21 13:51 UTC (permalink / raw) To: Ard Biesheuvel Cc: linux-arm-kernel, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC Hi, On 21.12.2021 14:34, Ard Biesheuvel wrote: > On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >> Hi Ard, >> >> On 21.12.2021 11:44, Ard Biesheuvel wrote: >>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>> On 22.11.2021 10:28, Ard Biesheuvel wrote: >>>>> Wire up the generic support for managing task stack allocations via vmalloc, >>>>> and implement the entry code that detects whether we faulted because of a >>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array) >>>>> >>>>> While this adds a fair amount of tricky entry asm code, it should be >>>>> noted that it only adds a TST + branch to the svc_entry path. The code >>>>> implementing the non-trivial handling of the overflow stack is emitted >>>>> out-of-line into the .text section. >>>>> >>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page >>>>> table entries that cover the vmalloc region up to date, we need to >>>>> ensure that we don't hit such a stale PMD entry when accessing the >>>>> stack. So we do a dummy read from the new stack while still running from >>>>> the old one on the context switch path, and bump the vmalloc_seq counter >>>>> when PMD level entries in the vmalloc range are modified, so that the MM >>>>> switch fetches the latest version of the entries. >>>>> >>>>> Note that we need to increase the per-mode stack by 1 word, to gain some >>>>> space to stash a GPR until we know it is safe to touch the stack. >>>>> However, due to the cacheline alignment of the struct, this does not >>>>> actually increase the memory footprint of the struct stack array at all. >>>>> >>>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> >>>>> Tested-by: Keith Packard <keithpac@amazon.com> >>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 >>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks >>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the >>>> suspend/resume related code must be updated somehow (it partially works >>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If >>>> you have any hints, let me know. >>>> >>> Are there any such systems in KernelCI? We caught a suspend/resume >>> related issue in development, which is why the hunk below was added. >> >> I think that some Exynos-based Odroids (U3 and XU3) were some time ago >> available in KernelCI, but I don't know if they are still there. >> >> >>> In general, any virt-to-phys translation involving and address on the >>> stack will become problematic. >>> >>> Could you please confirm whether the issue persists with the patch >>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are >>> looking in the right place? >> >> I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume >> works fine both on commit a1c510d0adc6 and linux-next 20211220. >> > Thanks. Any other context you can provide beyond 'does not work' ? 
Well, the board properly suspends, but it doesn't wake then (tested remotely with rtcwake command). So far I cannot provide anything more. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 13:51 ` Marek Szyprowski @ 2021-12-21 16:20 ` Ard Biesheuvel 2021-12-21 21:56 ` Marek Szyprowski 0 siblings, 1 reply; 27+ messages in thread From: Ard Biesheuvel @ 2021-12-21 16:20 UTC (permalink / raw) To: Marek Szyprowski Cc: Linux ARM, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC On Tue, 21 Dec 2021 at 14:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > Hi, > > On 21.12.2021 14:34, Ard Biesheuvel wrote: > > On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > >> Hi Ard, > >> > >> On 21.12.2021 11:44, Ard Biesheuvel wrote: > >>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > >>>> On 22.11.2021 10:28, Ard Biesheuvel wrote: > >>>>> Wire up the generic support for managing task stack allocations via vmalloc, > >>>>> and implement the entry code that detects whether we faulted because of a > >>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array) > >>>>> > >>>>> While this adds a fair amount of tricky entry asm code, it should be > >>>>> noted that it only adds a TST + branch to the svc_entry path. The code > >>>>> implementing the non-trivial handling of the overflow stack is emitted > >>>>> out-of-line into the .text section. > >>>>> > >>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page > >>>>> table entries that cover the vmalloc region up to date, we need to > >>>>> ensure that we don't hit such a stale PMD entry when accessing the > >>>>> stack. So we do a dummy read from the new stack while still running from > >>>>> the old one on the context switch path, and bump the vmalloc_seq counter > >>>>> when PMD level entries in the vmalloc range are modified, so that the MM > >>>>> switch fetches the latest version of the entries. > >>>>> > >>>>> Note that we need to increase the per-mode stack by 1 word, to gain some > >>>>> space to stash a GPR until we know it is safe to touch the stack. > >>>>> However, due to the cacheline alignment of the struct, this does not > >>>>> actually increase the memory footprint of the struct stack array at all. > >>>>> > >>>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > >>>>> Tested-by: Keith Packard <keithpac@amazon.com> > >>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 > >>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks > >>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the > >>>> suspend/resume related code must be updated somehow (it partially works > >>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If > >>>> you have any hints, let me know. > >>>> > >>> Are there any such systems in KernelCI? We caught a suspend/resume > >>> related issue in development, which is why the hunk below was added. > >> > >> I think that some Exynos-based Odroids (U3 and XU3) were some time ago > >> available in KernelCI, but I don't know if they are still there. > >> > >> > >>> In general, any virt-to-phys translation involving and address on the > >>> stack will become problematic. > >>> > >>> Could you please confirm whether the issue persists with the patch > >>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are > >>> looking in the right place? > >> > >> I've just checked. 
After disabling CONFIG_VMAP_STACK suspend/resume > >> works fine both on commit a1c510d0adc6 and linux-next 20211220. > >> > > Thanks. Any other context you can provide beyond 'does not work' ? > > Well, the board properly suspends, but it doesn't wake then (tested > remotely with rtcwake command). So far I cannot provide anything more. > Thanks. Does the below help? Or otherwise, could you try doubling the size of the overflow stack at arch/arm/include/asm/thread_info.h:34? diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S index b062b3738bc6..a59bd03a3f2e 100644 --- a/arch/arm/kernel/sleep.S +++ b/arch/arm/kernel/sleep.S @@ -67,7 +67,7 @@ ENTRY(__cpu_suspend) ldr r4, =cpu_suspend_size #endif mov r5, sp @ current virtual SP -#ifdef CONFIG_VMAP_STACK +#if 0 //def CONFIG_VMAP_STACK @ Run the suspend code from the overflow stack so we don't have to rely @ on vmalloc-to-phys conversions anywhere in the arch suspend code. @ The original SP value captured in R5 will be restored on the way out. diff --git a/arch/arm/kernel/suspend.c b/arch/arm/kernel/suspend.c index 43f0a3ebf390..ab1218ac5b4a 100644 --- a/arch/arm/kernel/suspend.c +++ b/arch/arm/kernel/suspend.c @@ -76,7 +76,9 @@ void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr) { u32 *ctx = ptr; - *save_ptr = virt_to_phys(ptr); + *save_ptr = IS_ENABLED(CONFIG_VMAP_STACK) + ? __pfn_to_phys(vmalloc_to_pfn(ptr)) + offset_in_page(ptr) + : virt_to_phys(ptr); /* This must correspond to the LDM in cpu_resume() assembly */ *ptr++ = virt_to_phys(idmap_pgd); _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 27+ messages in thread
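[Annotation] The suspend.c hunk above avoids the fixed-offset assumption by walking the page tables: vmalloc_to_pfn() resolves the real mapping of the vmap'ed stack slot. The same idea as a standalone helper, for illustration (the name kva_to_phys and its packaging are hypothetical, not part of the diff):

#include <linux/mm.h>
#include <linux/vmalloc.h>

/*
 * Hypothetical helper (illustration only): translate a kernel virtual
 * address to a physical address whether it lives in the linear map or
 * in vmalloc space, e.g. on a vmap'ed stack.
 */
static phys_addr_t kva_to_phys(const void *ptr)
{
        if (is_vmalloc_addr(ptr))
                return __pfn_to_phys(vmalloc_to_pfn(ptr)) +
                       offset_in_page(ptr);
        return virt_to_phys(ptr);
}

The #if 0 in the sleep.S hunk, by contrast, is purely diagnostic: it disables the overflow-stack switch so it can be ruled in or out as the culprit.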
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 16:20 ` Ard Biesheuvel @ 2021-12-21 21:56 ` Marek Szyprowski 2021-12-23 14:23 ` Ard Biesheuvel 0 siblings, 1 reply; 27+ messages in thread From: Marek Szyprowski @ 2021-12-21 21:56 UTC (permalink / raw) To: Ard Biesheuvel Cc: Linux ARM, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC Hi, On 21.12.2021 17:20, Ard Biesheuvel wrote: > On Tue, 21 Dec 2021 at 14:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >> On 21.12.2021 14:34, Ard Biesheuvel wrote: >>> On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>> On 21.12.2021 11:44, Ard Biesheuvel wrote: >>>>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>>>> On 22.11.2021 10:28, Ard Biesheuvel wrote: >>>>>>> Wire up the generic support for managing task stack allocations via vmalloc, >>>>>>> and implement the entry code that detects whether we faulted because of a >>>>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array) >>>>>>> >>>>>>> While this adds a fair amount of tricky entry asm code, it should be >>>>>>> noted that it only adds a TST + branch to the svc_entry path. The code >>>>>>> implementing the non-trivial handling of the overflow stack is emitted >>>>>>> out-of-line into the .text section. >>>>>>> >>>>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page >>>>>>> table entries that cover the vmalloc region up to date, we need to >>>>>>> ensure that we don't hit such a stale PMD entry when accessing the >>>>>>> stack. So we do a dummy read from the new stack while still running from >>>>>>> the old one on the context switch path, and bump the vmalloc_seq counter >>>>>>> when PMD level entries in the vmalloc range are modified, so that the MM >>>>>>> switch fetches the latest version of the entries. >>>>>>> >>>>>>> Note that we need to increase the per-mode stack by 1 word, to gain some >>>>>>> space to stash a GPR until we know it is safe to touch the stack. >>>>>>> However, due to the cacheline alignment of the struct, this does not >>>>>>> actually increase the memory footprint of the struct stack array at all. >>>>>>> >>>>>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> >>>>>>> Tested-by: Keith Packard <keithpac@amazon.com> >>>>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 >>>>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks >>>>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the >>>>>> suspend/resume related code must be updated somehow (it partially works >>>>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If >>>>>> you have any hints, let me know. >>>>>> >>>>> Are there any such systems in KernelCI? We caught a suspend/resume >>>>> related issue in development, which is why the hunk below was added. >>>> I think that some Exynos-based Odroids (U3 and XU3) were some time ago >>>> available in KernelCI, but I don't know if they are still there. >>>> >>>> >>>>> In general, any virt-to-phys translation involving and address on the >>>>> stack will become problematic. >>>>> >>>>> Could you please confirm whether the issue persists with the patch >>>>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are >>>>> looking in the right place? >>>> I've just checked. 
After disabling CONFIG_VMAP_STACK suspend/resume >>>> works fine both on commit a1c510d0adc6 and linux-next 20211220. >>>> >>> Thanks. Any other context you can provide beyond 'does not work' ? >> Well, the board properly suspends, but it doesn't wake then (tested >> remotely with rtcwake command). So far I cannot provide anything more. >> > Thanks. Does the below help? Or otherwise, could you try doubling the > size of the overflow stack at arch/arm/include/asm/thread_info.h:34? I've tried both (but not at the same time) on the current linux-next and neither helped. This must be something else... :/ Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-21 21:56 ` Marek Szyprowski @ 2021-12-23 14:23 ` Ard Biesheuvel 2021-12-28 14:39 ` Geert Uytterhoeven 0 siblings, 1 reply; 27+ messages in thread From: Ard Biesheuvel @ 2021-12-23 14:23 UTC (permalink / raw) To: Marek Szyprowski Cc: Linux ARM, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC On Tue, 21 Dec 2021 at 22:56, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > Hi, > > On 21.12.2021 17:20, Ard Biesheuvel wrote: > > On Tue, 21 Dec 2021 at 14:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > >> On 21.12.2021 14:34, Ard Biesheuvel wrote: > >>> On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > >>>> On 21.12.2021 11:44, Ard Biesheuvel wrote: > >>>>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > >>>>>> On 22.11.2021 10:28, Ard Biesheuvel wrote: > >>>>>>> Wire up the generic support for managing task stack allocations via vmalloc, > >>>>>>> and implement the entry code that detects whether we faulted because of a > >>>>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array) > >>>>>>> > >>>>>>> While this adds a fair amount of tricky entry asm code, it should be > >>>>>>> noted that it only adds a TST + branch to the svc_entry path. The code > >>>>>>> implementing the non-trivial handling of the overflow stack is emitted > >>>>>>> out-of-line into the .text section. > >>>>>>> > >>>>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page > >>>>>>> table entries that cover the vmalloc region up to date, we need to > >>>>>>> ensure that we don't hit such a stale PMD entry when accessing the > >>>>>>> stack. So we do a dummy read from the new stack while still running from > >>>>>>> the old one on the context switch path, and bump the vmalloc_seq counter > >>>>>>> when PMD level entries in the vmalloc range are modified, so that the MM > >>>>>>> switch fetches the latest version of the entries. > >>>>>>> > >>>>>>> Note that we need to increase the per-mode stack by 1 word, to gain some > >>>>>>> space to stash a GPR until we know it is safe to touch the stack. > >>>>>>> However, due to the cacheline alignment of the struct, this does not > >>>>>>> actually increase the memory footprint of the struct stack array at all. > >>>>>>> > >>>>>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > >>>>>>> Tested-by: Keith Packard <keithpac@amazon.com> > >>>>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 > >>>>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks > >>>>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the > >>>>>> suspend/resume related code must be updated somehow (it partially works > >>>>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If > >>>>>> you have any hints, let me know. > >>>>>> > >>>>> Are there any such systems in KernelCI? We caught a suspend/resume > >>>>> related issue in development, which is why the hunk below was added. > >>>> I think that some Exynos-based Odroids (U3 and XU3) were some time ago > >>>> available in KernelCI, but I don't know if they are still there. > >>>> > >>>> > >>>>> In general, any virt-to-phys translation involving and address on the > >>>>> stack will become problematic. 
> >>>>> stack will become problematic. > >>>>> > >>>>> Could you please confirm whether the issue persists with the patch > >>>>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are > >>>>> looking in the right place? > >>>> I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume > >>>> works fine both on commit a1c510d0adc6 and linux-next 20211220. > >>>> > >>> Thanks. Any other context you can provide beyond 'does not work' ? > >> Well, the board properly suspends, but it doesn't wake then (tested > >> remotely with rtcwake command). So far I cannot provide anything more. > >> > > Thanks. Does the below help? Or otherwise, could you try doubling the > > size of the overflow stack at arch/arm/include/asm/thread_info.h:34? > > I've tried both (but not at the same time) on the current linux-next and > none helped. This must be something else... :/ > Thanks. As I don't have access to this hardware, I am going to have to rely on someone who does to debug this further. The only alternative is marking CONFIG_VMAP_STACK broken on MACH_EXYNOS, but that would be unfortunate. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-23 14:23 ` Ard Biesheuvel @ 2021-12-28 14:39 ` Geert Uytterhoeven 2021-12-28 16:12 ` Geert Uytterhoeven 2022-01-05 11:08 ` Jon Hunter 0 siblings, 2 replies; 27+ messages in thread From: Geert Uytterhoeven @ 2021-12-28 14:39 UTC (permalink / raw) To: Ard Biesheuvel Cc: Marek Szyprowski, Linux ARM, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC, Linux-Renesas Hi Ard, On Thu, Dec 23, 2021 at 3:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > On Tue, 21 Dec 2021 at 22:56, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > On 21.12.2021 17:20, Ard Biesheuvel wrote: > > > On Tue, 21 Dec 2021 at 14:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > >> On 21.12.2021 14:34, Ard Biesheuvel wrote: > > >>> On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > >>>> On 21.12.2021 11:44, Ard Biesheuvel wrote: > > >>>>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > >>>>>> On 22.11.2021 10:28, Ard Biesheuvel wrote: > > >>>>>>> Wire up the generic support for managing task stack allocations via vmalloc, > > >>>>>>> and implement the entry code that detects whether we faulted because of a > > >>>>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array) > > >>>>>>> > > >>>>>>> While this adds a fair amount of tricky entry asm code, it should be > > >>>>>>> noted that it only adds a TST + branch to the svc_entry path. The code > > >>>>>>> implementing the non-trivial handling of the overflow stack is emitted > > >>>>>>> out-of-line into the .text section. > > >>>>>>> > > >>>>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page > > >>>>>>> table entries that cover the vmalloc region up to date, we need to > > >>>>>>> ensure that we don't hit such a stale PMD entry when accessing the > > >>>>>>> stack. So we do a dummy read from the new stack while still running from > > >>>>>>> the old one on the context switch path, and bump the vmalloc_seq counter > > >>>>>>> when PMD level entries in the vmalloc range are modified, so that the MM > > >>>>>>> switch fetches the latest version of the entries. > > >>>>>>> > > >>>>>>> Note that we need to increase the per-mode stack by 1 word, to gain some > > >>>>>>> space to stash a GPR until we know it is safe to touch the stack. > > >>>>>>> However, due to the cacheline alignment of the struct, this does not > > >>>>>>> actually increase the memory footprint of the struct stack array at all. > > >>>>>>> > > >>>>>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > > >>>>>>> Tested-by: Keith Packard <keithpac@amazon.com> > > >>>>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 > > >>>>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks > > >>>>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the > > >>>>>> suspend/resume related code must be updated somehow (it partially works > > >>>>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If > > >>>>>> you have any hints, let me know. > > >>>>>> > > >>>>> Are there any such systems in KernelCI? We caught a suspend/resume > > >>>>> related issue in development, which is why the hunk below was added. > > >>>> I think that some Exynos-based Odroids (U3 and XU3) were some time ago > > >>>> available in KernelCI, but I don't know if they are still there. 
> > >>>> > > >>>> > > >>>>> In general, any virt-to-phys translation involving and address on the > > >>>>> stack will become problematic. > > >>>>> > > >>>>> Could you please confirm whether the issue persists with the patch > > >>>>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are > > >>>>> looking in the right place? > > >>>> I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume > > >>>> works fine both on commit a1c510d0adc6 and linux-next 20211220. > > >>>> > > >>> Thanks. Any other context you can provide beyond 'does not work' ? > > >> Well, the board properly suspends, but it doesn't wake then (tested > > >> remotely with rtcwake command). So far I cannot provide anything more. > > >> > > > Thanks. Does the below help? Or otherwise, could you try doubling the > > > size of the overflow stack at arch/arm/include/asm/thread_info.h:34? > > > > I've tried both (but not at the same time) on the current linux-next and > > none helped. This must be something else... :/ > > > > Thanks. > > As i don't have access to this hardware, I am going to have to rely on > someone who does to debug this further. The only alternative is > marking CONFIG_VMAP_STACK broken on MACH_EXYNOS but that would be > unfortunate. Wish I had seen this thread before... I've just bisected a resume after s2ram failure on R-Car Gen2 to the same commit a1c510d0adc604bb ("ARM: implement support for vmap'ed stacks") in arm/for-next. Expected output: PM: suspend entry (deep) Filesystems sync: 0.000 seconds Freezing user space processes ... (elapsed 0.010 seconds) done. OOM killer disabled. Freezing remaining freezable tasks ... (elapsed 0.009 seconds) done. Disabling non-boot CPUs ... [system suspended, this is also where it hangs on failure] Enabling non-boot CPUs ... CPU1 is up sh-eth ee700000.ethernet eth0: Link is Down Micrel KSZ8041RNLI ee700000.ethernet-ffffffff:01: attached PHY driver (mii_bus:phy_addr=ee700000.ethernet-ffffffff:01, irq=193) OOM killer enabled. Restarting tasks ... done. PM: suspend exit Both wake-on-LAN and wake-up by gpio-keys fail. Nothing interesting in the kernel log, cfr. above. Disabling CONFIG_VMAP_STACK fixes the issue for me. Just like arch/arm/mach-exynos/ (and others), arch/arm/mach-shmobile/ has several *.S files related to secondary CPU bringup. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-28 14:39 ` Geert Uytterhoeven @ 2021-12-28 16:12 ` Geert Uytterhoeven 2021-12-28 16:27 ` Ard Biesheuvel 2022-01-05 11:08 ` Jon Hunter 1 sibling, 1 reply; 27+ messages in thread From: Geert Uytterhoeven @ 2021-12-28 16:12 UTC (permalink / raw) To: Ard Biesheuvel Cc: Marek Szyprowski, Linux ARM, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC, Linux-Renesas On Tue, Dec 28, 2021 at 3:39 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote: > On Thu, Dec 23, 2021 at 3:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Tue, 21 Dec 2021 at 22:56, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > On 21.12.2021 17:20, Ard Biesheuvel wrote: > > > > On Tue, 21 Dec 2021 at 14:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > >> On 21.12.2021 14:34, Ard Biesheuvel wrote: > > > >>> On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > >>>> On 21.12.2021 11:44, Ard Biesheuvel wrote: > > > >>>>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > >>>>>> On 22.11.2021 10:28, Ard Biesheuvel wrote: > > > >>>>>>> Wire up the generic support for managing task stack allocations via vmalloc, > > > >>>>>>> and implement the entry code that detects whether we faulted because of a > > > >>>>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array) > > > >>>>>>> > > > >>>>>>> While this adds a fair amount of tricky entry asm code, it should be > > > >>>>>>> noted that it only adds a TST + branch to the svc_entry path. The code > > > >>>>>>> implementing the non-trivial handling of the overflow stack is emitted > > > >>>>>>> out-of-line into the .text section. > > > >>>>>>> > > > >>>>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page > > > >>>>>>> table entries that cover the vmalloc region up to date, we need to > > > >>>>>>> ensure that we don't hit such a stale PMD entry when accessing the > > > >>>>>>> stack. So we do a dummy read from the new stack while still running from > > > >>>>>>> the old one on the context switch path, and bump the vmalloc_seq counter > > > >>>>>>> when PMD level entries in the vmalloc range are modified, so that the MM > > > >>>>>>> switch fetches the latest version of the entries. > > > >>>>>>> > > > >>>>>>> Note that we need to increase the per-mode stack by 1 word, to gain some > > > >>>>>>> space to stash a GPR until we know it is safe to touch the stack. > > > >>>>>>> However, due to the cacheline alignment of the struct, this does not > > > >>>>>>> actually increase the memory footprint of the struct stack array at all. > > > >>>>>>> > > > >>>>>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > > > >>>>>>> Tested-by: Keith Packard <keithpac@amazon.com> > > > >>>>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 > > > >>>>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks > > > >>>>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the > > > >>>>>> suspend/resume related code must be updated somehow (it partially works > > > >>>>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If > > > >>>>>> you have any hints, let me know. > > > >>>>>> > > > >>>>> Are there any such systems in KernelCI? We caught a suspend/resume > > > >>>>> related issue in development, which is why the hunk below was added. 
> > > >>>> I think that some Exynos-based Odroids (U3 and XU3) were some time ago > > > >>>> available in KernelCI, but I don't know if they are still there. > > > >>>> > > > >>>> > > > >>>>> In general, any virt-to-phys translation involving and address on the > > > >>>>> stack will become problematic. > > > >>>>> > > > >>>>> Could you please confirm whether the issue persists with the patch > > > >>>>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are > > > >>>>> looking in the right place? > > > >>>> I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume > > > >>>> works fine both on commit a1c510d0adc6 and linux-next 20211220. > > > >>>> > > > >>> Thanks. Any other context you can provide beyond 'does not work' ? > > > >> Well, the board properly suspends, but it doesn't wake then (tested > > > >> remotely with rtcwake command). So far I cannot provide anything more. > > > >> > > > > Thanks. Does the below help? Or otherwise, could you try doubling the > > > > size of the overflow stack at arch/arm/include/asm/thread_info.h:34? > > > > > > I've tried both (but not at the same time) on the current linux-next and > > > none helped. This must be something else... :/ > > > > > > > Thanks. > > > > As i don't have access to this hardware, I am going to have to rely on > > someone who does to debug this further. The only alternative is > > marking CONFIG_VMAP_STACK broken on MACH_EXYNOS but that would be > > unfortunate. > > Wish I had seen this thread before... > > I've just bisected a resume after s2ram failure on R-Car Gen2 to the same > commit a1c510d0adc604bb ("ARM: implement support for vmap'ed stacks") > in arm/for-next. > > Expected output: > > PM: suspend entry (deep) > Filesystems sync: 0.000 seconds > Freezing user space processes ... (elapsed 0.010 seconds) done. > OOM killer disabled. > Freezing remaining freezable tasks ... (elapsed 0.009 seconds) done. > Disabling non-boot CPUs ... > > [system suspended, this is also where it hangs on failure] > > Enabling non-boot CPUs ... > CPU1 is up > sh-eth ee700000.ethernet eth0: Link is Down > Micrel KSZ8041RNLI ee700000.ethernet-ffffffff:01: attached PHY > driver (mii_bus:phy_addr=ee700000.ethernet-ffffffff:01, irq=193) > OOM killer enabled. > Restarting tasks ... done. > PM: suspend exit > > Both wake-on-LAN and wake-up by gpio-keys fail. > Nothing interesting in the kernel log, cfr. above. > > Disabling CONFIG_VMAP_STACK fixes the issue for me. Enabling CONFIG_ARM_LPAE also fixes the issue, but is not an option for shmobile_defconfig, as that would break systems with a Cortex-A9. > Just like arch/arm/mach-exynos/ (and others), arch/arm/mach-shmobile/ > has several *.S files related to secondary CPU bringup. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks 2021-12-28 16:12 ` Geert Uytterhoeven @ 2021-12-28 16:27 ` Ard Biesheuvel 0 siblings, 0 replies; 27+ messages in thread From: Ard Biesheuvel @ 2021-12-28 16:27 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Marek Szyprowski, Linux ARM, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski, Linux Samsung SOC, Linux-Renesas On Tue, 28 Dec 2021 at 17:13, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > On Tue, Dec 28, 2021 at 3:39 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > On Thu, Dec 23, 2021 at 3:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > On Tue, 21 Dec 2021 at 22:56, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > > On 21.12.2021 17:20, Ard Biesheuvel wrote: > > > > > On Tue, 21 Dec 2021 at 14:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > > >> On 21.12.2021 14:34, Ard Biesheuvel wrote: > > > > >>> On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > > >>>> On 21.12.2021 11:44, Ard Biesheuvel wrote: > > > > >>>>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > > >>>>>> On 22.11.2021 10:28, Ard Biesheuvel wrote: > > > > >>>>>>> Wire up the generic support for managing task stack allocations via vmalloc, > > > > >>>>>>> and implement the entry code that detects whether we faulted because of a > > > > >>>>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array) > > > > >>>>>>> > > > > >>>>>>> While this adds a fair amount of tricky entry asm code, it should be > > > > >>>>>>> noted that it only adds a TST + branch to the svc_entry path. The code > > > > >>>>>>> implementing the non-trivial handling of the overflow stack is emitted > > > > >>>>>>> out-of-line into the .text section. > > > > >>>>>>> > > > > >>>>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page > > > > >>>>>>> table entries that cover the vmalloc region up to date, we need to > > > > >>>>>>> ensure that we don't hit such a stale PMD entry when accessing the > > > > >>>>>>> stack. So we do a dummy read from the new stack while still running from > > > > >>>>>>> the old one on the context switch path, and bump the vmalloc_seq counter > > > > >>>>>>> when PMD level entries in the vmalloc range are modified, so that the MM > > > > >>>>>>> switch fetches the latest version of the entries. > > > > >>>>>>> > > > > >>>>>>> Note that we need to increase the per-mode stack by 1 word, to gain some > > > > >>>>>>> space to stash a GPR until we know it is safe to touch the stack. > > > > >>>>>>> However, due to the cacheline alignment of the struct, this does not > > > > >>>>>>> actually increase the memory footprint of the struct stack array at all. > > > > >>>>>>> > > > > >>>>>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > > > > >>>>>>> Tested-by: Keith Packard <keithpac@amazon.com> > > > > >>>>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6 > > > > >>>>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks > > > > >>>>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the > > > > >>>>>> suspend/resume related code must be updated somehow (it partially works > > > > >>>>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If > > > > >>>>>> you have any hints, let me know. > > > > >>>>>> > > > > >>>>> Are there any such systems in KernelCI? 
> > > > >>>>> We caught a suspend/resume related issue in development, which is
> > > > >>>>> why the hunk below was added.

...

> > Wish I had seen this thread before...
> >
> > I've just bisected a failure to resume after s2ram on R-Car Gen2 to the
> > same commit a1c510d0adc604bb ("ARM: implement support for vmap'ed
> > stacks") in arm/for-next.
> >

...

> > Both wake-on-LAN and wake-up by gpio-keys fail.
> > Nothing interesting in the kernel log, cf. above.
> >
> > Disabling CONFIG_VMAP_STACK fixes the issue for me.
>
> Enabling CONFIG_ARM_LPAE also fixes the issue, but is not an option
> for shmobile_defconfig, as that would break systems with a Cortex-A9.
>

Thanks Geert. As you have confirmed on #armlinux, the issue also goes
away when booting with 'nosmp'. So this looks like an issue with the
virtual mapping of the stack in the secondary boot path on !LPAE.

That really narrows it down, so hopefully I will be able to fix this
shortly.

Marek: could you please confirm whether or not enabling LPAE (on cores
that support it, of course) and/or booting with 'nosmp' makes the issue
go away?
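[Illustrative aside: the two precautions described in the commit message
quoted above - the dummy read from the new stack on the context-switch
path, and the vmalloc_seq counter - can be pictured roughly as the
following C sketch. The names probe_new_stack(), check_vmalloc_seq_sketch()
and sync_vmalloc_pmds() are illustrative stand-ins, not the series'
actual code, which does this work on the assembler switch_to/MM-switch
paths:

    #include <linux/sched.h>
    #include <linux/sched/task_stack.h>

    static void sync_vmalloc_pmds(struct mm_struct *mm); /* hypothetical */

    /*
     * Runs while SP still points at the outgoing task's stack. If the
     * PMD covering the new stack is stale in the current page tables,
     * this read faults *now*, while there is still a valid stack for
     * the fault handler to run on. Faulting after the switch would
     * recurse: the fault handler would need the very stack that is
     * not mapped yet.
     */
    static inline void probe_new_stack(struct task_struct *next)
    {
            READ_ONCE(*(unsigned long *)task_stack_page(next));
    }

    /*
     * Runs on the MM-switch path: if PMD-level entries in the vmalloc
     * region changed since this mm last ran, copy them over from
     * init_mm before resuming it.
     */
    static inline void check_vmalloc_seq_sketch(struct mm_struct *mm)
    {
            if (READ_ONCE(mm->context.vmalloc_seq) !=
                READ_ONCE(init_mm.context.vmalloc_seq))
                    sync_vmalloc_pmds(mm);
    }

The secondary-CPU angle discussed above appears to be precisely the case
these hooks do not cover: a CPU powered up afresh has no old stack to
fault from and no warm TLB to hide a stale entry.]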
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks
  2021-12-28 14:39 ` Geert Uytterhoeven
  2021-12-28 16:12 ` Geert Uytterhoeven
@ 2022-01-05 11:08   ` Jon Hunter
  2022-01-05 11:12     ` Ard Biesheuvel
  1 sibling, 1 reply; 27+ messages in thread
From: Jon Hunter @ 2022-01-05 11:08 UTC (permalink / raw)
  To: Geert Uytterhoeven, Ard Biesheuvel
  Cc: Marek Szyprowski, Linux ARM, Russell King, Nicolas Pitre,
      Arnd Bergmann, Kees Cook, Keith Packard, Linus Walleij,
      Nick Desaulniers, Tony Lindgren, Krzysztof Kozlowski,
      Linux Samsung SOC, Linux-Renesas, linux-tegra

Hi Ard,

On 28/12/2021 14:39, Geert Uytterhoeven wrote:

...

>> As I don't have access to this hardware, I am going to have to rely on
>> someone who does to debug this further. The only alternative is
>> marking CONFIG_VMAP_STACK broken on MACH_EXYNOS, but that would be
>> unfortunate.
>
> Wish I had seen this thread before...
>
> I've just bisected a failure to resume after s2ram on R-Car Gen2 to the
> same commit a1c510d0adc604bb ("ARM: implement support for vmap'ed
> stacks") in arm/for-next.
>

...

> Both wake-on-LAN and wake-up by gpio-keys fail.
> Nothing interesting in the kernel log, cf. above.
>
> Disabling CONFIG_VMAP_STACK fixes the issue for me.
>
> Just like arch/arm/mach-exynos/ (and others), arch/arm/mach-shmobile/
> has several *.S files related to secondary CPU bringup.


This is also breaking suspend on our 32-bit Tegra platforms. Reverting
this change on top of -next fixes the problem.

Cheers
Jon

--
nvpublic
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks
  2022-01-05 11:08 ` Jon Hunter
@ 2022-01-05 11:12   ` Ard Biesheuvel
  2022-01-05 11:33     ` Jon Hunter
  2022-01-05 16:49     ` Jon Hunter
  0 siblings, 2 replies; 27+ messages in thread
From: Ard Biesheuvel @ 2022-01-05 11:12 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Geert Uytterhoeven, Marek Szyprowski, Linux ARM, Russell King,
      Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard,
      Linus Walleij, Nick Desaulniers, Tony Lindgren,
      Krzysztof Kozlowski, Linux Samsung SOC, Linux-Renesas,
      linux-tegra

On Wed, 5 Jan 2022 at 12:08, Jon Hunter <jonathanh@nvidia.com> wrote:
>
> Hi Ard,
>
> On 28/12/2021 14:39, Geert Uytterhoeven wrote:

...

> > Disabling CONFIG_VMAP_STACK fixes the issue for me.
> >
> > Just like arch/arm/mach-exynos/ (and others), arch/arm/mach-shmobile/
> > has several *.S files related to secondary CPU bringup.
>
> This is also breaking suspend on our 32-bit Tegra platforms. Reverting
> this change on top of -next fixes the problem.
>

Thanks for the report.

It would be helpful if you could provide some more context:
- does it happen on an LPAE build too?
- does it only happen on SMP capable systems?
- does it reproduce on such systems when using only a single CPU?
  (i.e., pass 'nosmp' on the kernel command line)
- when passing 'no_console_suspend' on the kernel command line, are
  any useful diagnostics produced?
- is there any way you could tell whether the crash/hang (assuming
  that is what you are observing) occurs on the suspend path or on
  resume?
- any other observations that could narrow this down?

Thanks,
Ard.
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks
  2022-01-05 11:12 ` Ard Biesheuvel
@ 2022-01-05 11:33   ` Jon Hunter
  2022-01-05 13:53     ` Russell King (Oracle)
  0 siblings, 1 reply; 27+ messages in thread
From: Jon Hunter @ 2022-01-05 11:33 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Geert Uytterhoeven, Marek Szyprowski, Linux ARM, Russell King,
      Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard,
      Linus Walleij, Nick Desaulniers, Tony Lindgren,
      Krzysztof Kozlowski, Linux Samsung SOC, Linux-Renesas,
      linux-tegra

On 05/01/2022 11:12, Ard Biesheuvel wrote:

...

> Thanks for the report.
>
> It would be helpful if you could provide some more context:
> - does it happen on an LPAE build too?
> - does it only happen on SMP capable systems?

These are all SMP systems.

> - does it reproduce on such systems when using only a single CPU?
>   (i.e., pass 'nosmp' on the kernel command line)

I would need to try this.

> - when passing 'no_console_suspend' on the kernel command line, are
>   any useful diagnostics produced?
> - is there any way you could tell whether the crash/hang (assuming
>   that is what you are observing) occurs on the suspend path or on
>   resume?
> - any other observations that could narrow this down?

I can run the above and let you know what I find.

Cheers
Jon

--
nvpublic
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks
  2022-01-05 11:33 ` Jon Hunter
@ 2022-01-05 13:53   ` Russell King (Oracle)
  0 siblings, 0 replies; 27+ messages in thread
From: Russell King (Oracle) @ 2022-01-05 13:53 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Ard Biesheuvel, Geert Uytterhoeven, Marek Szyprowski, Linux ARM,
      Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard,
      Linus Walleij, Nick Desaulniers, Tony Lindgren,
      Krzysztof Kozlowski, Linux Samsung SOC, Linux-Renesas,
      linux-tegra

On Wed, Jan 05, 2022 at 11:33:48AM +0000, Jon Hunter wrote:
> On 05/01/2022 11:12, Ard Biesheuvel wrote:
> > Thanks for the report.
> >
> > It would be helpful if you could provide some more context:
> > - does it happen on an LPAE build too?
> > - does it only happen on SMP capable systems?
>
> These are all SMP systems.
>
> > - does it reproduce on such systems when using only a single CPU?
> >   (i.e., pass 'nosmp' on the kernel command line)
>
> I would need to try this.

Please note that I want an answer on the vmap stack patches by the end
of today (UK time - so about five hours after this email has been sent)
as we have only tonight's and tomorrow's linux-next before the probable
opening of the merge window.

The options are:

1. The problem gets fixed today, and I merge the fix today so it can
   get tested in linux-next over the next few days by the various build
   farms and test setups.

2. We postpone merging this until the very end of the merge window to
   give more time to sort out this mess - but that means keeping it in
   linux-next and keeping various platforms broken during that period.
   However, this is really not fair to other people, and some would say
   it isn't even an option.

3. We drop the entire series for this merge window, meaning it gets
   dropped from linux-next, and have another go for the next merge
   window.

Sorry for being so demanding, but we're far too close to the merge
window to be trying to debug a feature that is clearly causing a
regression for several platforms.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks
  2022-01-05 11:12 ` Ard Biesheuvel
  2022-01-05 11:33 ` Jon Hunter
@ 2022-01-05 16:49   ` Jon Hunter
  2022-01-05 17:02     ` Ard Biesheuvel
  1 sibling, 1 reply; 27+ messages in thread
From: Jon Hunter @ 2022-01-05 16:49 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Geert Uytterhoeven, Marek Szyprowski, Linux ARM, Russell King,
      Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard,
      Linus Walleij, Nick Desaulniers, Tony Lindgren,
      Krzysztof Kozlowski, Linux Samsung SOC, Linux-Renesas,
      linux-tegra

On 05/01/2022 11:12, Ard Biesheuvel wrote:

...

> Thanks for the report.
>
> It would be helpful if you could provide some more context:
> - does it happen on an LPAE build too?

Enabling CONFIG_ARM_LPAE makes suspend/resume work.

> - does it only happen on SMP capable systems?
> - does it reproduce on such systems when using only a single CPU?
>   (i.e., pass 'nosmp' on the kernel command line)

Adding 'nosmp' does not help.

> - when passing 'no_console_suspend' on the kernel command line, are
>   any useful diagnostics produced?

Adding 'no_console_suspend' does not produce any interesting logs.

> - is there any way you could tell whether the crash/hang (assuming
>   that is what you are observing) occurs on the suspend path or on
>   resume?

That is not clear. I see it entering suspend, but it is not clear
whether it is failing on entering suspend or on resuming.

Cheers
Jon

--
nvpublic
* Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks
  2022-01-05 16:49 ` Jon Hunter
@ 2022-01-05 17:02   ` Ard Biesheuvel
  0 siblings, 0 replies; 27+ messages in thread
From: Ard Biesheuvel @ 2022-01-05 17:02 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Geert Uytterhoeven, Marek Szyprowski, Linux ARM, Russell King,
      Nicolas Pitre, Arnd Bergmann, Kees Cook, Keith Packard,
      Linus Walleij, Nick Desaulniers, Tony Lindgren,
      Krzysztof Kozlowski, Linux Samsung SOC, Linux-Renesas,
      linux-tegra

On Wed, 5 Jan 2022 at 17:50, Jon Hunter <jonathanh@nvidia.com> wrote:
>
> On 05/01/2022 11:12, Ard Biesheuvel wrote:
>
> ...
>
> > Thanks for the report.
> >
> > It would be helpful if you could provide some more context:
> > - does it happen on an LPAE build too?
>
> Enabling CONFIG_ARM_LPAE makes suspend/resume work.
>
> > - does it only happen on SMP capable systems?
> > - does it reproduce on such systems when using only a single CPU?
> >   (i.e., pass 'nosmp' on the kernel command line)
>
> Adding 'nosmp' does not help.
>
> > - when passing 'no_console_suspend' on the kernel command line, are
> >   any useful diagnostics produced?
>
> Adding 'no_console_suspend' does not produce any interesting logs.
>
> > - is there any way you could tell whether the crash/hang (assuming
> >   that is what you are observing) occurs on the suspend path or on
> >   resume?
>
> That is not clear. I see it entering suspend, but it is not clear
> whether it is failing on entering suspend or on resuming.
>

Thanks a lot for providing this info.

The fact that enabling LPAE makes the issue go away is a fairly strong
hint that one of the CPUs comes up running in an address space that
lacks the stack's vmapping in its copy of the swapper_pg_dir region.
LPAE builds map swapper_pg_dir directly, so they can never go out of
sync there.

Given that vmappings are global, and are therefore cached in the TLB
across context switches, it is not unlikely that the task whose page
tables lack the stack's vmapping runs before suspend, but does not
cause any issues until after the CPU has been reset completely (which
takes the cached TLB entries down with it).

So in summary, this gives me something to chew on, and hopefully I
will be able to provide a proper fix shortly.
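[Illustrative aside: the "stale copy of swapper_pg_dir" failure mode
described above can be made concrete with a sketch of the lazy repair
the kernel normally performs. On !LPAE ARM, every process carries a
private copy of the first-level table, cloned from swapper_pg_dir; a
new kernel mapping (such as a freshly vmap'ed stack) initially exists
only in init_mm's copy, and other tasks pick it up on a translation
fault. This is a simplification of the logic in arch/arm/mm/fault.c -
the real code walks and copies PMD entries, and the helper name here is
hypothetical:

    #include <linux/errno.h>
    #include <linux/mm.h>
    #include <linux/pgtable.h>

    /*
     * Copy a missing vmalloc-area first-level entry from the master
     * table (swapper_pg_dir, owned by init_mm) into this task's
     * private copy. Until this runs, the task's page tables simply
     * do not contain a newer task's vmap'ed stack.
     */
    static int sync_vmalloc_entry(struct mm_struct *mm, unsigned long addr)
    {
            unsigned int idx = pgd_index(addr);
            pgd_t *pgd   = mm->pgd + idx;        /* this task's copy */
            pgd_t *pgd_k = init_mm.pgd + idx;    /* swapper_pg_dir   */

            if (pgd_none(*pgd_k))
                    return -EFAULT;              /* genuinely bad address */

            if (pgd_none(*pgd))
                    *pgd = *pgd_k;               /* propagate the entry */

            return 0;
    }

The catch, presumably, is that this repair runs from the translation
fault handler, which itself needs a usable stack; once a reset has
flushed the TLB entries that were papering over the stale copy, a
secondary CPU can find that its first access to the stack faults with
nothing to fall back on.]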