linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] arm: Remove early stack deallocation from restore_user_regs
@ 2014-12-12 11:11 Daniel Thompson
  2015-01-05 15:12 ` [PATCH] " Daniel Thompson
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Thompson @ 2014-12-12 11:11 UTC (permalink / raw)
  To: linux-arm-kernel

Currently restore_user_regs deallocates the SVC stack early in
its execution and relies on no exception being taken between
the deallocation and the registers being restored. The introduction
of a default FIQ handler that also uses the SVC stack breaks this
assumption and can result in corrupted register state.

This patch works around the problem by removing the early
stack deallocation and using r2 as a temporary instead. I have
not found a way to do this without introducing an extra mov
instruction to the macro.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---

Notes:
    I have recently started to hook up the PMU via FIQ (although
    its slightly hacky at present) and was seeing random userspace
    SEGVs when perf was running (after ~100,000 or so FIQs).
    
    Instrumenting the code eventually revealed that in almost all
    cases the last FIQ handler to run prior the SEGV had interrupted
    ret_to_user_from_irq or ret_fast_syscall. Very occasionally it was
    in the fault handling code (because that code runs as part of SEGV
    handling and the PMU is instrumenting that too).
    
    No SEGV problems have been observed since fixing the issue. This
    version of the patch has seen >7M FIQs and an older version (based
    on cpsid f) ran overnight.
    

 arch/arm/kernel/entry-header.S | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index 4176df721bf0..1a0045abead7 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -253,21 +253,22 @@
 	.endm

 	.macro	restore_user_regs, fast = 0, offset = 0
-	ldr	r1, [sp, #\offset + S_PSR]	@ get calling cpsr
-	ldr	lr, [sp, #\offset + S_PC]!	@ get pc
+	mov	r2, sp
+	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
+	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
 	msr	spsr_cxsf, r1			@ save in spsr_svc
 #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
 	@ We must avoid clrex due to Cortex-A15 erratum #830321
-	strex	r1, r2, [sp]			@ clear the exclusive monitor
+	strex	r1, r2, [r2]			@ clear the exclusive monitor
 #endif
 	.if	\fast
-	ldmdb	sp, {r1 - lr}^			@ get calling r1 - lr
+	ldmdb	r2, {r1 - lr}^			@ get calling r1 - lr
 	.else
-	ldmdb	sp, {r0 - lr}^			@ get calling r0 - lr
+	ldmdb	r2, {r0 - lr}^			@ get calling r0 - lr
 	.endif
 	mov	r0, r0				@ ARMv5T and earlier require a nop
 						@ after ldm {}^
-	add	sp, sp, #S_FRAME_SIZE - S_PC
+	add	sp, sp, #\offset + S_FRAME_SIZE
 	movs	pc, lr				@ return & move spsr_svc into cpsr
 	.endm

--
1.9.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] arm: Remove early stack deallocation from restore_user_regs
  2014-12-12 11:11 [RFC PATCH] arm: Remove early stack deallocation from restore_user_regs Daniel Thompson
@ 2015-01-05 15:12 ` Daniel Thompson
  2015-01-09 16:46   ` Russell King - ARM Linux
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Thompson @ 2015-01-05 15:12 UTC (permalink / raw)
  To: linux-arm-kernel

Currently restore_user_regs deallocates the SVC stack early in
its execution and relies on no exception being taken between
the deallocation and the registers being restored. The introduction
of a default FIQ handler that also uses the SVC stack breaks this
assumption and can result in corrupted register state.

This patch works around the problem by removing the early
stack deallocation and using r2 as a temporary instead. I have
not found a way to do this without introducing an extra mov
instruction to the macro.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---

Notes:
    [This patch has not been modified since its original posting as an RFC].
    
    I have recently started to hook up the PMU via FIQ (although
    it's slightly hacky at present) and was seeing random userspace
    SEGVs when perf was running (after ~100,000 or so FIQs).
    
    Instrumenting the code eventually revealed that in almost all
    cases the last FIQ handler to run prior the SEGV had interrupted
    ret_to_user_from_irq or ret_fast_syscall. Very occasionally it was
    in the fault handling code (because that code runs as part of SEGV
    handling and the PMU is instrumenting that too).
    
    No SEGV problems have been observed since fixing the issue. This
    version of the patch has seen >7M FIQs and an older version (based
    on cpsid f) ran overnight.
    

 arch/arm/kernel/entry-header.S | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index 4176df721bf0..1a0045abead7 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -253,21 +253,22 @@
 	.endm

 	.macro	restore_user_regs, fast = 0, offset = 0
-	ldr	r1, [sp, #\offset + S_PSR]	@ get calling cpsr
-	ldr	lr, [sp, #\offset + S_PC]!	@ get pc
+	mov	r2, sp
+	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
+	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
 	msr	spsr_cxsf, r1			@ save in spsr_svc
 #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
 	@ We must avoid clrex due to Cortex-A15 erratum #830321
-	strex	r1, r2, [sp]			@ clear the exclusive monitor
+	strex	r1, r2, [r2]			@ clear the exclusive monitor
 #endif
 	.if	\fast
-	ldmdb	sp, {r1 - lr}^			@ get calling r1 - lr
+	ldmdb	r2, {r1 - lr}^			@ get calling r1 - lr
 	.else
-	ldmdb	sp, {r0 - lr}^			@ get calling r0 - lr
+	ldmdb	r2, {r0 - lr}^			@ get calling r0 - lr
 	.endif
 	mov	r0, r0				@ ARMv5T and earlier require a nop
 						@ after ldm {}^
-	add	sp, sp, #S_FRAME_SIZE - S_PC
+	add	sp, sp, #\offset + S_FRAME_SIZE
 	movs	pc, lr				@ return & move spsr_svc into cpsr
 	.endm

--
1.9.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] arm: Remove early stack deallocation from restore_user_regs
  2015-01-05 15:12 ` [PATCH] " Daniel Thompson
@ 2015-01-09 16:46   ` Russell King - ARM Linux
  2015-01-09 17:06     ` Daniel Thompson
  0 siblings, 1 reply; 6+ messages in thread
From: Russell King - ARM Linux @ 2015-01-09 16:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 05, 2015 at 03:12:38PM +0000, Daniel Thompson wrote:
> Currently restore_user_regs deallocates the SVC stack early in
> its execution and relies on no exception being taken between
> the deallocation and the registers being restored. The introduction
> of a default FIQ handler that also uses the SVC stack breaks this
> assumption and can result in corrupted register state.
> 
> This patch works around the problem by removing the early
> stack deallocation and using r2 as a temporary instead. I have
> not found a way to do this without introducing an extra mov
> instruction to the macro.
> 
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> ---

Please put it in the patch system, thanks.  I think we should queue
this one for stable too, as I think we need this for v3.18
(as a result of c0e7f7ee717e2b4c5791e7422424c96b5008c39e,
ARM: 8150/3: fiq: Replace default FIQ handler)?

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] arm: Remove early stack deallocation from restore_user_regs
  2015-01-09 16:46   ` Russell King - ARM Linux
@ 2015-01-09 17:06     ` Daniel Thompson
  2015-01-09 17:20       ` Russell King - ARM Linux
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Thompson @ 2015-01-09 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/01/15 16:46, Russell King - ARM Linux wrote:
> On Mon, Jan 05, 2015 at 03:12:38PM +0000, Daniel Thompson wrote:
>> Currently restore_user_regs deallocates the SVC stack early in
>> its execution and relies on no exception being taken between
>> the deallocation and the registers being restored. The introduction
>> of a default FIQ handler that also uses the SVC stack breaks this
>> assumption and can result in corrupted register state.
>>
>> This patch works around the problem by removing the early
>> stack deallocation and using r2 as a temporary instead. I have
>> not found a way to do this without introducing an extra mov
>> instruction to the macro.
>>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>> ---
> 
> Please put it in the patch system, thanks.

Will do.


> I think we should queue
> this one for stable too, as I think we need this for v3.18
> (as a result of c0e7f7ee717e2b4c5791e7422424c96b5008c39e,
> ARM: 8150/3: fiq: Replace default FIQ handler)?

It's a close call.

Before 8150/3 the system would probably crash if the default FIQ handler
ran. After 8150/3 the system is also likely to crash since there's no
code hooked into the handler in v3.18 that can clear the source of FIQ
leaving us stuck re-entering the FIQ handler.

Nevertheless, this is a nasty gremlin to leave for backporters (it
wasn't easy to find) so I'd be very happy to Cc: stable and see what
they think.


Daniel.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] arm: Remove early stack deallocation from restore_user_regs
  2015-01-09 17:06     ` Daniel Thompson
@ 2015-01-09 17:20       ` Russell King - ARM Linux
  2015-01-09 19:16         ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: Russell King - ARM Linux @ 2015-01-09 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 09, 2015 at 05:06:54PM +0000, Daniel Thompson wrote:
> On 09/01/15 16:46, Russell King - ARM Linux wrote:
> > On Mon, Jan 05, 2015 at 03:12:38PM +0000, Daniel Thompson wrote:
> >> Currently restore_user_regs deallocates the SVC stack early in
> >> its execution and relies on no exception being taken between
> >> the deallocation and the registers being restored. The introduction
> >> of a default FIQ handler that also uses the SVC stack breaks this
> >> assumption and can result in corrupted register state.
> >>
> >> This patch works around the problem by removing the early
> >> stack deallocation and using r2 as a temporary instead. I have
> >> not found a way to do this without introducing an extra mov
> >> instruction to the macro.
> >>
> >> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> >> ---
> > 
> > Please put it in the patch system, thanks.
> 
> Will do.
> 
> 
> > I think we should queue
> > this one for stable too, as I think we need this for v3.18
> > (as a result of c0e7f7ee717e2b4c5791e7422424c96b5008c39e,
> > ARM: 8150/3: fiq: Replace default FIQ handler)?
> 
> It's a close call.

I agree.

> Before 8150/3 the system would probably crash if the default FIQ handler
> ran. After 8150/3 the system is also likely to crash since there's no
> code hooked into the handler in v3.18 that can clear the source of FIQ
> leaving us stuck re-entering the FIQ handler.
> 
> Nevertheless, this is a nasty gremlin to leave for backporters (it
> wasn't easy to find) so I'd be very happy to Cc: stable and see what
> they think.

Well, we could ask Greg now.  It's not a regression (as nothing makes
use of the previously merged changes yet), but it is a nasty latent bug
which we could easily solve.

Having thought about it some more, I'm tempted to say... leave the
stable tag off it, and we can make the decision later - after it's had
a chance to really prove itself to a much wider audience.  That'd be
the lowest risk for the 3.18 stable tree.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] arm: Remove early stack deallocation from restore_user_regs
  2015-01-09 17:20       ` Russell King - ARM Linux
@ 2015-01-09 19:16         ` Greg KH
  0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2015-01-09 19:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 09, 2015 at 05:20:35PM +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 09, 2015 at 05:06:54PM +0000, Daniel Thompson wrote:
> > On 09/01/15 16:46, Russell King - ARM Linux wrote:
> > > On Mon, Jan 05, 2015 at 03:12:38PM +0000, Daniel Thompson wrote:
> > >> Currently restore_user_regs deallocates the SVC stack early in
> > >> its execution and relies on no exception being taken between
> > >> the deallocation and the registers being restored. The introduction
> > >> of a default FIQ handler that also uses the SVC stack breaks this
> > >> assumption and can result in corrupted register state.
> > >>
> > >> This patch works around the problem by removing the early
> > >> stack deallocation and using r2 as a temporary instead. I have
> > >> not found a way to do this without introducing an extra mov
> > >> instruction to the macro.
> > >>
> > >> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> > >> ---
> > > 
> > > Please put it in the patch system, thanks.
> > 
> > Will do.
> > 
> > 
> > > I think we should queue
> > > this one for stable too, as I think we need this for v3.18
> > > (as a result of c0e7f7ee717e2b4c5791e7422424c96b5008c39e,
> > > ARM: 8150/3: fiq: Replace default FIQ handler)?
> > 
> > It's a close call.
> 
> I agree.
> 
> > Before 8150/3 the system would probably crash if the default FIQ handler
> > ran. After 8150/3 the system is also likely to crash since there's no
> > code hooked into the handler in v3.18 that can clear the source of FIQ
> > leaving us stuck re-entering the FIQ handler.
> > 
> > Nevertheless, this is a nasty gremlin to leave for backporters (it
> > wasn't easy to find) so I'd be very happy to Cc: stable and see what
> > they think.
> 
> Well, we could ask Greg now.  It's not a regression (as nothing makes
> use of the previously merged changes yet), but it is a nasty latent bug
> which we could easily solve.
> 
> Having thought about it some more, I'm tempted to say... leave the
> stable tag off it, and we can make the decision later - after it's had
> a chance to really prove itself to a much wider audience.  That'd be
> the lowest risk for the 3.18 stable tree.

That sounds like a good idea, you can always email stable@ to let us
know of a patch we should add at a later time.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-01-09 19:16 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-12 11:11 [RFC PATCH] arm: Remove early stack deallocation from restore_user_regs Daniel Thompson
2015-01-05 15:12 ` [PATCH] " Daniel Thompson
2015-01-09 16:46   ` Russell King - ARM Linux
2015-01-09 17:06     ` Daniel Thompson
2015-01-09 17:20       ` Russell King - ARM Linux
2015-01-09 19:16         ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).