From: u.kleine-koenig@pengutronix.de (Uwe Kleine-König)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 5/5] Cortex-M3: Add support for exception handling
Date: Tue, 13 Mar 2012 21:39:30 +0100
Message-ID: <20120313203930.GD10400@pengutronix.de>
In-Reply-To: <20120309171026.GF14148@arm.com>
Hello Catalin,
On Fri, Mar 09, 2012 at 05:10:26PM +0000, Catalin Marinas wrote:
> On Mon, Mar 05, 2012 at 05:04:02PM +0000, Uwe Kleine-König wrote:
> > To save r0, I'd readd OLD_R0 at the end of pt_regs (plus one buffer word
> > to get even alignment). Or would that already be unacceptable because
> > it's an ABI change, too?
>
> If we preserve the first part of the pt_regs structure, we could add the
> exception return at the end (with two additional 32-bit words to
> preserve the 64-bit alignment).
I will do that.
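To make the size constraint concrete, here is a minimal C sketch of what
such a layout could look like. The struct and field names are made up
for illustration and are not the kernel's definition; the only things
taken as given are that the existing first part is 18 words and that the
total must stay a multiple of 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only: the first 18 words mirror
 * the existing ARM pt_regs (r0-r15, cpsr, ORIG_r0); the two trailing
 * words are the proposed addition.
 */
struct v7m_pt_regs_sketch {
	uint32_t uregs[18];	/* unchanged first part, 72 bytes */
	uint32_t excret;	/* EXC_RETURN value seen on exception entry */
	uint32_t pad;		/* keeps the size a multiple of 8 bytes */
};

/* The new word sits directly behind the preserved part. */
static_assert(offsetof(struct v7m_pt_regs_sketch, excret) == 72,
	      "excret must follow the preserved 18 words");

/* Two extra words keep the frame 64-bit aligned. */
static_assert(sizeof(struct v7m_pt_regs_sketch) % 8 == 0,
	      "frame size must be a multiple of 8 bytes");
```

With only one extra word the size would be 76 bytes, which is why a
second word is needed.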
> > --- a/arch/arm/kernel/entry-common.S
> > +++ b/arch/arm/kernel/entry-common.S
> > @@ -473,7 +480,7 @@ __sys_trace:
> >
> > adr lr, BSYM(__sys_trace_return) @ return address
> > mov scno, r0 @ syscall number (possibly new)
> > - add r1, sp, #S_R0 + S_OFF @ pointer to regs
> > + add r1, sp, #S_OFF @ pointer to regs
>
> This change is no longer needed since S_R0 is now 0.
hmm, I thought I had removed them. Seems I didn't.
> > --- a/arch/arm/kernel/entry-header.S
> > +++ b/arch/arm/kernel/entry-header.S
> > @@ -26,7 +26,7 @@
> > * The SWI code relies on the fact that R0 is at the bottom of the stack
> > * (due to slow/fast restore user regs).
> > */
> > -#if S_R0 != 0
> > +#if S_R0 != 0 && !defined(CONFIG_CPU_V7M)
>
> Same here.
Same here ;-)
>
> > +#ifdef CONFIG_CPU_V7M
> > +/*
> > + * ARMv7-M exception entry/exit macros.
> > + *
> > + * xPSR, ReturnAddress(), LR (R14), R12, R3, R2, R1, and R0 are
> > + * automatically saved on the current stack (8 words) before
> > + * switching to the exception stack (SP_main).
> > + *
> > + * If exception is taken while in user mode, SP_main is
> > + * empty. Otherwise, SP_main is aligned to 64 bit automatically
> > + * (CCR.STKALIGN set).
> > + *
> > + * Linux assumes that the interrupts are disabled when entering an
> > + * exception handler and it may BUG if this is not the case. Interrupts
> > + * are disabled during entry and reenabled in the exit macro.
> > + *
> > + * v7m_exception_fast_exit is used when returning from interrupts.
> > + *
> > + * v7m_exception_slow_exit is used when returning from SVC or PendSV.
> > + * When returning to kernel mode, we don't return from exception.
> > + */
> > + .macro v7m_exception_entry
> > + cpsid i
> > + sub sp, #S_FRAME_SIZE
> > + stmia sp, {r0-r12}
> > +
> > + @ set r0 to the location of the registers saved by the core during
> > + @ exception entry. Depending on the mode the cpu was in when the
> > + @ exception happened that is either on the main or the process stack.
> > + @ Bit 2 of EXC_RETURN stored in the lr register specifies which stack
> > + @ was used.
> > + tst lr, #0x4
> > + mrsne r0, psp
> > + addeq r0, sp, #S_FRAME_SIZE
>
> Could we use some other registers here like r8-r12 so that we keep r0-r7
> for syscall handling later and avoid another ldmia?
One upside of using r0 is that
	addeq r0, sp, #S_FRAME_SIZE
can be encoded in 16 bits, while with r8-r12 this is not possible and 32
bits are needed. And I wonder if it's allowed to corrupt r8-r12?
> > + add r0, r0, #20 @ skip over r0-r3, r12
> > + ldmia r0!, {r1-r3} @ load saved lr, return address and xPSR
> > +
> > + @ calculate the original stack pointer value.
> > + @ r0 currently points to the memory location just above the auto saved
> > + @ xPSR. If the FP extension is implemented and bit 4 of EXC_RETURN is 0
> > + @ then space was allocated for FP state. That is space for 18 32-bit
> > + @ values. (If FP extension is unimplemented, bit 4 is 1.)
> > + @ Additionally the cpu might automatically 8-byte align the stack. Bit 9
> > + @ of the saved xPSR specifies if stack aligning took place. In this case
> > + @ another 32-bit value is included in the stack.
> > +
> > + tst lr, #0x10
> > + addeq r0, r0, #576
>
> I think you can ignore VFP for now. We could change it to do lazy
> save/restore and avoid the automatic state saving. If it does this while
> in kernel, it takes a bit of extra stack space.
If I understood correctly this will never trigger for kernel tasks,
will it?
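For reference, the stack pointer calculation described in the quoted
comment can be written down as a small C sketch. This is only an
illustration with made-up names, not code from the patch. It takes the
comment at face value: 8 words for the basic frame, 18 words (72 bytes)
for the optional FP state, and one padding word if bit 9 of the saved
xPSR is set.

```c
#include <stdint.h>

/* Names below are hypothetical; the bit positions follow the comment. */
#define EXC_RETURN_NO_FP_FRAME	(1u << 4)	/* set: no FP state stacked */
#define XPSR_STACK_ALIGNED	(1u << 9)	/* set: one padding word added */
#define BASIC_FRAME_BYTES	(8u * 4u)	/* r0-r3, r12, lr, pc, xPSR */
#define FP_FRAME_BYTES		(18u * 4u)	/* 18 32-bit values */

/*
 * frame_base is the address of the auto-saved r0. The result is the
 * value the stack pointer had before the exception was taken.
 */
static inline uint32_t original_sp(uint32_t frame_base, uint32_t exc_return,
				   uint32_t saved_xpsr)
{
	uint32_t sp = frame_base + BASIC_FRAME_BYTES;

	if (!(exc_return & EXC_RETURN_NO_FP_FRAME))
		sp += FP_FRAME_BYTES;	/* skip the stacked FP state */
	if (saved_xpsr & XPSR_STACK_ALIGNED)
		sp += 4;		/* skip the alignment padding */

	return sp;
}
```

If the FP case is ignored for now, only the first addition and the
alignment check remain.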
> > + @ save original sp, lr, return address, xPSR and EXC_RETURN
> > + add r12, sp, #52
>
> Just use S_ARM_SP or whatever asm offsets, it's easier to read.
OK, right.
> > + stmia r12, {r0-r3, lr}
> > +
> > + @ restore registers for system calls
> > + ldmia sp, {r0-r12}
>
> We could avoid reloading r0-r7 (that's what we actually need for
> syscalls) if we don't corrupt them.
See question above. And I noticed that because of tail-chaining we need
to load the register values from the exception frame at least once
anyhow. I will try to optimise here.
> > + .endm
> > +
> > + .macro v7m_exception_fast_exit
> > + @ read r12, sp, lr, return address, xPSR and EXC_RETURN
> > + add r12, sp, #48
>
> S_ARM_R12?
>
> > + ldmia r12, {r1-r5, lr}
> > +
> > + tst r5, #0x100
> > + subne r2, r2, #4
> > +
> > + tst lr, #0x10
> > + subeq r2, r2, #576
> > +
> > + stmdb r2!, {r1, r3-r5} @ write saved r12, lr, return address and xPSR
> > +
> > + ldmia sp, {r1, r3-r5} @ read saved r0-r3
> > + stmdb r2!, {r1, r3-r5} @ write r0-r3 to basic exception frame
> > +
> > + tst lr, #0x4
> > + msrne psp, r2
> > +
> > + ldmia sp, {r0-r12}
> > + add sp, #S_FRAME_SIZE
> > + cpsie i
> > + bx lr
> > + .endm
>
> In the context v7m_exception_fast_exit is used (return from interrupts),
> the previously saved r0-r4,r12 etc. are still on the return stack and
> restored from there. There is no need for the ldmia/stmdb to re-create
> the interrupted process stack. Here we only need to make sure that we
> restore the registers that were not automatically saved and also move
> the kernel SP back to the original location (add S_FRAME_SIZE).
>
> We handle rescheduling and work pending by raising PendSV, so we get
> another interrupt which would be returned via the slow_exit macro.
I was not sure whether a task could be preempted during an exception, so
I chose to play it safe.
> > + .macro v7m_exception_slow_exit ret_r0
> > + cpsid i
> > + ldr lr, [sp, #S_EXCRET] @ read exception LR
> > + tst lr, #0x8
> > + bne 1f @ go to thread mode using exception return
> > +
> > + /*
> > + * return to kernel thread
> > + * sp is already set up (and might be unset in pt_regs), so only
> > + * restore r0-r12 and pc
> > + */
> > + ldmia sp, {r0-r12}
> > + ldr lr, [sp, #S_PC]
> > + add sp, sp, #S_FRAME_SIZE
> > + cpsie i
> > + bx lr
> > + /*
> > + * return to userspace
> > + */
> > +1:
> > + @ read original r12, sp, lr, pc, xPSR
> > + add r12, sp, #48
> > + ldmia r12, {r1-r5}
> > +
> > + /* stack aligning */
> > + tst r5, #0x100
> > + subne r2, r2, #4
> > +
> > + /* skip over stack space for fp extension */
> > + tst lr, #0x10
> > + subeq r2, r2, #576
> > +
> > + /* write basic exception frame */
> > + stmdb r2!, {r1, r3-r5} @ saved r12, lr, return address and xPSR
> > + ldmia sp, {r1, r3-r5} @ read saved r0-r3
> > + .if \ret_r0
> > + stmdb r2!, {r0, r3-r5} @ restore r0-r3
> > + .else
> > + stmdb r2!, {r1, r3-r5} @ restore r0-r3
> > + .endif
>
> This looks fine.
>
> > + msr psp, r2
> > +
> > + ldmia sp, {r0-r12} @ restore original r4-r11
>
> Isn't this reading too much (optimisation)?
yeah, I think r12 isn't needed. For r0-r3 I'm not sure which is better:
	ldmia sp, {r0-r11}
	add sp, #S_FRAME_SIZE

vs.

	add sp, #S_R4
	ldmia sp, {r4-r11}
	add sp, #S_FRAME_SIZE - S_R4

It's not described in the V7-M reference. Do you know a document where I
can look that up?
> > + add sp, #S_FRAME_SIZE @ restore the original MSP
> > + cpsie i
> > + bx lr
> > + .endm
> > +#endif /* CONFIG_CPU_V7M */
> ...
> > --- a/arch/arm/kernel/process.c
> > +++ b/arch/arm/kernel/process.c
> > @@ -452,7 +452,11 @@ asm( ".pushsection .text\n"
> > #ifdef CONFIG_TRACE_IRQFLAGS
> > " bl trace_hardirqs_on\n"
> > #endif
> > +#ifdef CONFIG_CPU_V7M
> > +" msr primask, r7\n"
> > +#else
> > " msr cpsr_c, r7\n"
> > +#endif
> > " mov r0, r4\n"
> > " mov lr, r6\n"
> > " mov pc, r5\n"
> > @@ -491,6 +495,10 @@ pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
> > regs.ARM_r7 = SVC_MODE | PSR_ENDSTATE | PSR_ISETSTATE;
> > regs.ARM_pc = (unsigned long)kernel_thread_helper;
> > regs.ARM_cpsr = regs.ARM_r7 | PSR_I_BIT;
> > +#ifdef CONFIG_CPU_V7M
> > + /* Return to Handler mode */
> > + regs.ARM_EXCRET = 0xfffffff1L;
> > +#endif
>
> BTW, do we need to set this here? We don't return to a kernel thread via
> exception return.
But we need it to decide in slow_exit whether we should use exception
return or not. So yes, I think it's needed.
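Spelled out in C, the decision looks roughly like this. It is a sketch
with invented helper names, not kernel code; the bit positions are the
ones tested in the macros (bit 3 in the slow exit path, bit 2 in the
entry macro).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the EXC_RETURN bits tested in the macros. */
#define EXC_RETURN_THREAD_MODE	(1u << 3)	/* set: return to thread mode */
#define EXC_RETURN_USE_PSP	(1u << 2)	/* set: frame is on the process stack */

/* True if the slow exit path should do a real exception return. */
static inline bool returns_to_thread_mode(uint32_t exc_return)
{
	return (exc_return & EXC_RETURN_THREAD_MODE) != 0;
}

/* True if the auto-saved frame lives on the process stack (PSP). */
static inline bool frame_on_process_stack(uint32_t exc_return)
{
	return (exc_return & EXC_RETURN_USE_PSP) != 0;
}
```

With regs.ARM_EXCRET = 0xfffffff1 both helpers return false, so a kernel
thread takes the plain "bx lr" path; a value such as 0xfffffffd (thread
mode, process stack) selects the exception return.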
> I currently cannot see any obvious bugs in the code. Since you said it
> doesn't work, what are the symptoms?
I found two problems. One is that I wasn't aware that r0-r3 and r12
don't necessarily match the values on exception entry (because of
tail-chaining). The other is that decreasing sp in start_thread is wrong.
I will now reimplement the three macros with the lessons learned.
Best regards and thanks for your input (here and on irc)
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |