* [PATCH] arm: kprobes: Align stack to 8-bytes in test code
@ 2017-03-16 13:53 Jon Medhurst
2017-03-17 12:10 ` Russell King - ARM Linux
0 siblings, 1 reply; 7+ messages in thread
From: Jon Medhurst @ 2017-03-16 13:53 UTC (permalink / raw)
To: linux-arm-kernel
kprobes test cases need to have a stack that is aligned to an 8-byte
boundary because they call other functions (and the ARM ABI mandates
that alignment) and because test cases include 64-bit accesses to the
stack. Unfortunately, GCC doesn't ensure this alignment for inline
assembler and for the code in question seems to always misalign it by
pushing just the LR register onto the stack. We therefore need to
explicitly perform stack alignment at the start of each test case.
Without this fix, some test cases will generate alignment faults on
systems where alignment is enforced. Even if the kernel is configured to
handle these faults in software, triggering them is ugly. It also
exposes limitations in the fault handling code which doesn't cope with
writes to the stack. E.g. when handling this instruction
strd r6, [sp, #-64]!
the fault handling code will write to a stack location below the SP
value at the point the fault occurred, which coincides with where the
exception handler has pushed the saved register context. This results in
corruption of those registers.
Signed-off-by: Jon Medhurst <tixy@linaro.org>
---
I'm assuming the fact the alignment exception handler doesn't cope with
instructions that push things to the stack isn't a problem that we need
to be concerned about, given that compiler generated code and handwitten
assembler shouldn't trigger this unless it's buggy?
Russell, this is the last of several issues [1] [2] I found when testing
Masami Hiramatsu's kprobe changes [3]. That is a total of 4 kprobes
patches and 3 fixes around code patching. Assuming these are acceptable
I can create a branch and a pull request, or feed them into the patch
tracker, let me know.
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/494365.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/494370.html
[3] https://lkml.org/lkml/2017/2/14/709
arch/arm/probes/kprobes/test-core.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/arm/probes/kprobes/test-core.c b/arch/arm/probes/kprobes/test-core.c
index c893726aa52d..1c98a87786ca 100644
--- a/arch/arm/probes/kprobes/test-core.c
+++ b/arch/arm/probes/kprobes/test-core.c
@@ -977,7 +977,10 @@ static void coverage_end(void)
void __naked __kprobes_test_case_start(void)
{
__asm__ __volatile__ (
- "stmdb sp!, {r4-r11} \n\t"
+ "mov r2, sp \n\t"
+ "bic r3, r2, #7 \n\t"
+ "mov sp, r3 \n\t"
+ "stmdb sp!, {r2-r11} \n\t"
"sub sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"
"bic r0, lr, #1 @ r0 = inline data \n\t"
"mov r1, sp \n\t"
@@ -997,7 +1000,8 @@ void __naked __kprobes_test_case_end_32(void)
"movne pc, r0 \n\t"
"mov r0, r4 \n\t"
"add sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"
- "ldmia sp!, {r4-r11} \n\t"
+ "ldmia sp!, {r2-r11} \n\t"
+ "mov sp, r2 \n\t"
"mov pc, r0 \n\t"
);
}
@@ -1013,7 +1017,8 @@ void __naked __kprobes_test_case_end_16(void)
"bxne r0 \n\t"
"mov r0, r4 \n\t"
"add sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"
- "ldmia sp!, {r4-r11} \n\t"
+ "ldmia sp!, {r2-r11} \n\t"
+ "mov sp, r2 \n\t"
"bx r0 \n\t"
);
}
--
2.11.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH] arm: kprobes: Align stack to 8-bytes in test code
2017-03-16 13:53 [PATCH] arm: kprobes: Align stack to 8-bytes in test code Jon Medhurst
@ 2017-03-17 12:10 ` Russell King - ARM Linux
2017-03-17 12:59 ` Jon Medhurst (Tixy)
0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2017-03-17 12:10 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 16, 2017 at 01:53:59PM +0000, Jon Medhurst wrote:
> Without this fix, some test cases will generate alignment faults on
> systems where alignment is enforced. Even if the kernel is configured to
> handle these faults in software, triggering them is ugly. It also
> exposes limitations in the fault handling code which doesn't cope with
> writes to the stack. E.g. when handling this instruction
>
> strd r6, [sp, #-64]!
>
> the fault handling code will write to a stack location below the SP
> value at the point the fault occurred, which coincides with where the
> exception handler has pushed the saved register context. This results in
> corruption of those registers.
The general rule today is that the stack must always be 64-bit aligned,
so an even number of registers must always be pushed to the stack.
> diff --git a/arch/arm/probes/kprobes/test-core.c b/arch/arm/probes/kprobes/test-core.c
> index c893726aa52d..1c98a87786ca 100644
> --- a/arch/arm/probes/kprobes/test-core.c
> +++ b/arch/arm/probes/kprobes/test-core.c
> @@ -977,7 +977,10 @@ static void coverage_end(void)
> void __naked __kprobes_test_case_start(void)
> {
> __asm__ __volatile__ (
> - "stmdb sp!, {r4-r11} \n\t"
> + "mov r2, sp \n\t"
> + "bic r3, r2, #7 \n\t"
> + "mov sp, r3 \n\t"
> + "stmdb sp!, {r2-r11} \n\t"
I'm not sure these is where the problem is - on entry, the stack
should be 64-bit aligned. You're pushing an even number of registers
onto the stack, so it should remain 64-bit aligned.
> "sub sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"
This looks to be 256 bytes, so that should be fine.
I think the real question is... how is the stack becoming misaligned?
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] arm: kprobes: Align stack to 8-bytes in test code
2017-03-17 12:10 ` Russell King - ARM Linux
@ 2017-03-17 12:59 ` Jon Medhurst (Tixy)
2017-03-17 14:06 ` Russell King - ARM Linux
0 siblings, 1 reply; 7+ messages in thread
From: Jon Medhurst (Tixy) @ 2017-03-17 12:59 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, 2017-03-17 at 12:10 +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 16, 2017 at 01:53:59PM +0000, Jon Medhurst wrote:
> > Without this fix, some test cases will generate alignment faults on
> > systems where alignment is enforced. Even if the kernel is configured to
> > handle these faults in software, triggering them is ugly. It also
> > exposes limitations in the fault handling code which doesn't cope with
> > writes to the stack. E.g. when handling this instruction
> >
> > strd r6, [sp, #-64]!
> >
> > the fault handling code will write to a stack location below the SP
> > value at the point the fault occurred, which coincides with where the
> > exception handler has pushed the saved register context. This results in
> > corruption of those registers.
>
> The general rule today is that the stack must always be 64-bit aligned,
> so an even number of registers must always be pushed to the stack.
>
> > diff --git a/arch/arm/probes/kprobes/test-core.c b/arch/arm/probes/kprobes/test-core.c
> > index c893726aa52d..1c98a87786ca 100644
> > --- a/arch/arm/probes/kprobes/test-core.c
> > +++ b/arch/arm/probes/kprobes/test-core.c
> > @@ -977,7 +977,10 @@ static void coverage_end(void)
> > void __naked __kprobes_test_case_start(void)
> > {
> > __asm__ __volatile__ (
> > - "stmdb sp!, {r4-r11} \n\t"
> > + "mov r2, sp \n\t"
> > + "bic r3, r2, #7 \n\t"
> > + "mov sp, r3 \n\t"
> > + "stmdb sp!, {r2-r11} \n\t"
>
> I'm not sure these is where the problem is - on entry, the stack
> should be 64-bit aligned
It isn't, because GCC turns code like this
void foo(void)
{
asm volatile("bl __kprobes_test_case_start"
: : : "r0", "r1", "r2", "r3", "ip", "lr", "memory", "cc");
}
into this...
8010e4ac <foo>:
8010e4ac: e52de004 push {lr} ; (str lr, [sp, #-4]!)
8010e4b0: eb002c99 bl 8011971c <__kprobes_test_case_start>
8010e4b4: e49df004 pop {pc} ; (ldr pc, [sp], #4)
Perhaps we need a way of telling GCC we are using the stack but I've not
managed to spot a way of doing that.
--
Tixy
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] arm: kprobes: Align stack to 8-bytes in test code
2017-03-17 12:59 ` Jon Medhurst (Tixy)
@ 2017-03-17 14:06 ` Russell King - ARM Linux
2017-03-17 14:42 ` Jon Medhurst (Tixy)
0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2017-03-17 14:06 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Mar 17, 2017 at 12:59:02PM +0000, Jon Medhurst (Tixy) wrote:
> On Fri, 2017-03-17 at 12:10 +0000, Russell King - ARM Linux wrote:
> > On Thu, Mar 16, 2017 at 01:53:59PM +0000, Jon Medhurst wrote:
> > > Without this fix, some test cases will generate alignment faults on
> > > systems where alignment is enforced. Even if the kernel is configured to
> > > handle these faults in software, triggering them is ugly. It also
> > > exposes limitations in the fault handling code which doesn't cope with
> > > writes to the stack. E.g. when handling this instruction
> > >
> > > strd r6, [sp, #-64]!
> > >
> > > the fault handling code will write to a stack location below the SP
> > > value at the point the fault occurred, which coincides with where the
> > > exception handler has pushed the saved register context. This results in
> > > corruption of those registers.
> >
> > The general rule today is that the stack must always be 64-bit aligned,
> > so an even number of registers must always be pushed to the stack.
> >
> > > diff --git a/arch/arm/probes/kprobes/test-core.c b/arch/arm/probes/kprobes/test-core.c
> > > index c893726aa52d..1c98a87786ca 100644
> > > --- a/arch/arm/probes/kprobes/test-core.c
> > > +++ b/arch/arm/probes/kprobes/test-core.c
> > > @@ -977,7 +977,10 @@ static void coverage_end(void)
> > > void __naked __kprobes_test_case_start(void)
> > > {
> > > __asm__ __volatile__ (
> > > - "stmdb sp!, {r4-r11} \n\t"
> > > + "mov r2, sp \n\t"
> > > + "bic r3, r2, #7 \n\t"
> > > + "mov sp, r3 \n\t"
> > > + "stmdb sp!, {r2-r11} \n\t"
> >
> > I'm not sure these is where the problem is - on entry, the stack
> > should be 64-bit aligned
>
> It isn't, because GCC turns code like this
>
> void foo(void)
> {
> asm volatile("bl __kprobes_test_case_start"
> : : : "r0", "r1", "r2", "r3", "ip", "lr", "memory", "cc");
> }
>
> into this...
>
> 8010e4ac <foo>:
> 8010e4ac: e52de004 push {lr} ; (str lr, [sp, #-4]!)
> 8010e4b0: eb002c99 bl 8011971c <__kprobes_test_case_start>
> 8010e4b4: e49df004 pop {pc} ; (ldr pc, [sp], #4)
>
> Perhaps we need a way of telling GCC we are using the stack but I've not
> managed to spot a way of doing that.
Using which compiler options?
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] arm: kprobes: Align stack to 8-bytes in test code
2017-03-17 14:06 ` Russell King - ARM Linux
@ 2017-03-17 14:42 ` Jon Medhurst (Tixy)
2017-03-17 15:05 ` Russell King - ARM Linux
0 siblings, 1 reply; 7+ messages in thread
From: Jon Medhurst (Tixy) @ 2017-03-17 14:42 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, 2017-03-17 at 14:06 +0000, Russell King - ARM Linux wrote:
> On Fri, Mar 17, 2017 at 12:59:02PM +0000, Jon Medhurst (Tixy) wrote:
> >
[...]
> > It isn't, because GCC turns code like this
> >
> > void foo(void)
> > {
> > asm volatile("bl __kprobes_test_case_start"
> > : : : "r0", "r1", "r2", "r3", "ip", "lr", "memory", "cc");
> > }
> >
> > into this...
> >
> > 8010e4ac <foo>:
> > 8010e4ac: e52de004 push {lr} ; (str lr, [sp, #-4]!)
> > 8010e4b0: eb002c99 bl 8011971c <__kprobes_test_case_start>
> > 8010e4b4: e49df004 pop {pc} ; (ldr pc, [sp], #4)
> >
> > Perhaps we need a way of telling GCC we are using the stack but I've not
> > managed to spot a way of doing that.
>
> Using which compiler options?
The ones the Linux makefile picks when building with vexpress_defconfig.
I hacked a kernel file to build the above example but the behaviour is
what I observed with the real kprobes code. I've pasted the commanline
produced by building with V=1 at the end of this email.
One thing I've noticed playing around just now is that if I add "sp" to
the clobber list, and use a newer GCC (gcc-linaro-6.2.1-2016.11) then it
does allign the stack correctly. "sp" makes no difference with GCC 5.3 or
4.8.
That commandline ...
/data/arm32/aarch32-toolchain/gcc-linaro-5.3.1-2016.05-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc
-Wp,-MD,arch/arm/kernel/.patch.o.d -nostdinc -isystem
/data/arm32/aarch32-toolchain/gcc-linaro-5.3.1-2016.05-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/5.3.1/include
-I./arch/arm/include -I./arch/arm/include/generated/uapi
-I./arch/arm/include/generated -I./include -I./arch/arm/include/uapi
-I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h
-D__KERNEL__ -mlittle-endian -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs
-fno-strict-aliasing -fno-common -Werror-implicit-function-declaration
-Wno-format-security -std=gnu89 -fno-PIE -fno-dwarf2-cfi-asm -fno-ipa-sra
-mabi=aapcs-linux -mno-thumb-interwork -mfpu=vfp -funwind-tables -marm
-D__LINUX_ARM_ARCH__=7 -march=armv7-a -msoft-float -Uarm
-fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0
-Wframe-larger-than=1024 -fno-stack-protector -Wno-unused-but-set-variable
-fomit-frame-pointer -fno-var-tracking-assignments -g
-fno-inline-functions-called-once -Wdeclaration-after-statement
-Wno-pointer-sign -fno-strict-overflow -fconserve-stack -Werror=implicit-int
-Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types
-DCC_HAVE_ASM_GOTO -DKBUILD_BASENAME='"patch"' -DKBUILD_MODNAME='"patch"' -c
-o arch/arm/kernel/patch.o arch/arm/kernel/patch.c
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] arm: kprobes: Align stack to 8-bytes in test code
2017-03-17 14:42 ` Jon Medhurst (Tixy)
@ 2017-03-17 15:05 ` Russell King - ARM Linux
2017-03-17 17:50 ` Jon Medhurst (Tixy)
0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2017-03-17 15:05 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Mar 17, 2017 at 02:42:09PM +0000, Jon Medhurst (Tixy) wrote:
> On Fri, 2017-03-17 at 14:06 +0000, Russell King - ARM Linux wrote:
> > On Fri, Mar 17, 2017 at 12:59:02PM +0000, Jon Medhurst (Tixy) wrote:
> > >
> [...]
> > > It isn't, because GCC turns code like this
> > >
> > > void foo(void)
> > > {
> > > asm volatile("bl __kprobes_test_case_start"
> > > : : : "r0", "r1", "r2", "r3", "ip", "lr", "memory", "cc");
> > > }
> > >
> > > into this...
> > >
> > > 8010e4ac <foo>:
> > > 8010e4ac: e52de004 push {lr} ; (str lr, [sp, #-4]!)
> > > 8010e4b0: eb002c99 bl 8011971c <__kprobes_test_case_start>
> > > 8010e4b4: e49df004 pop {pc} ; (ldr pc, [sp], #4)
> > >
> > > Perhaps we need a way of telling GCC we are using the stack but I've not
> > > managed to spot a way of doing that.
> >
> > Using which compiler options?
>
> The ones the Linux makefile picks when building with vexpress_defconfig.
> I hacked a kernel file to build the above example but the behaviour is
> what I observed with the real kprobes code. I've pasted the commanline
> produced by building with V=1 at the end of this email.
>
> One thing I've noticed playing around just now is that if I add "sp" to
> the clobber list, and use a newer GCC (gcc-linaro-6.2.1-2016.11) then it
> does allign the stack correctly. "sp" makes no difference with GCC 5.3 or
> 4.8.
Meanwhile the kernel images I have here all have code to align the stack
after pushing registers in functions, except for __csum_ipv6_magic and
__memzero, which are both assembly. That's gcc 4.7.4 building iMX6 only
(which is also armv7-a.)
I suspect we're into compiler behavioural differences, which can't be
relied upon, so I guess we do need to do something in
__kprobes_test_case_start() to work around it.
I'd do:
mov ip, sp
tst sp, #4
subeq sp, sp, #4 @ need to mis-align as we don't save
stmfd sp!, {r4 - r11, ip} @ an even number of registers to end
sub sp, sp, #size @ up with an aligned stack here
and when restoring:
add ip, sp, #size
ldmfd ip, {r4 - r11, sp}
bx r0
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] arm: kprobes: Align stack to 8-bytes in test code
2017-03-17 15:05 ` Russell King - ARM Linux
@ 2017-03-17 17:50 ` Jon Medhurst (Tixy)
0 siblings, 0 replies; 7+ messages in thread
From: Jon Medhurst (Tixy) @ 2017-03-17 17:50 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, 2017-03-17 at 15:05 +0000, Russell King - ARM Linux wrote
[...]
> I suspect we're into compiler behavioural differences, which can't be
> relied upon, so I guess we do need to do something in
> __kprobes_test_case_start() to work around it.
>
> I'd do:
>
> mov ip, sp
> tst sp, #4
> subeq sp, sp, #4 @ need to mis-align as we don't save
> stmfd sp!, {r4 - r11, ip} @ an even number of registers to end
> sub sp, sp, #size @ up with an aligned stack here
>
> and when restoring:
>
> add ip, sp, #size
> ldmfd ip, {r4 - r11, sp}
> bx r0
Thumb doesn't allow a lot of operations with SP (like that 'tst sp, #4'
or having SP in the reg list for LDM) which is why I ended up producing
slightly convoluted code. Some of these are deprecated for ARM too,
which means they might not be supported on newer architecture versions,
oh joy.
--
Tixy
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-03-17 17:50 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-16 13:53 [PATCH] arm: kprobes: Align stack to 8-bytes in test code Jon Medhurst
2017-03-17 12:10 ` Russell King - ARM Linux
2017-03-17 12:59 ` Jon Medhurst (Tixy)
2017-03-17 14:06 ` Russell King - ARM Linux
2017-03-17 14:42 ` Jon Medhurst (Tixy)
2017-03-17 15:05 ` Russell King - ARM Linux
2017-03-17 17:50 ` Jon Medhurst (Tixy)
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.