All of lore.kernel.org
 help / color / mirror / Atom feed
* do page fault in atomic bug on arm
@ 2017-11-21 13:06 Alex Shi
  2017-11-21 13:20 ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Shi @ 2017-11-21 13:06 UTC (permalink / raw)
  To: linux-arm-kernel

Hi All,

LKFT occasionally found a kernel bug in x15 platform, which is a armv7 board. 
The bug caught on kernel commit f82786d v4.9.55, but panic could happens in 
upstream, since there is no much change on the function call chain.

The function call chain is vector___pabt_svc -> do_PrefetchAbort -> 
	do_page_fault -> might_sleep()

The trick thing is LKFT team can not reproduce the bug. But from the kernel
panic info, we know the irq_disabled() is 128, that would be the only reason,
we got the panic -- the code can not return since irqs_disabled() = 128.
The preempt_offset and preempt_count are both 0 here.

line 7726 in kernel/sched/core.c: in function ___might_sleep():
       if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
             !is_idle_task(current)) ||
            system_state != SYSTEM_RUNNING || oops_in_progress)
                return;

I have no more idea on this issue. Any hints are appreciated!

Regards
Alex

 BUG: sleeping function called from invalid context at /srv/oe/build/tmp-rpb-glibc/work-shared/am57xx-evm/kernel-source/arch/arm/mm/fault.c:303
[   53.264908] in_atomic(): 0, irqs_disabled(): 128, pid: 1691, name: ftracetest
[   53.272074] 1 lock held by ftracetest/1691:
[   53.276273]  #0:  (&mm->mmap_sem){++++++}, at: [<c0d60cfc>] do_page_fault+0x90/0x428
[   53.284095] irq event stamp: 12924
[   53.287514] hardirqs last  enabled at (12923): [<c0307f10>] no_work_pending+0x4/0x30
[   53.295289] hardirqs last disabled at (12924): [<c0d605a0>] __pabt_svc+0x60/0xa0
[   53.302718] softirqs last  enabled at (11474): [<c034c5d0>] __do_softirq+0x280/0x5ac
[   53.310494] softirqs last disabled at (11433): [<c034cc98>] irq_exit+0xf4/0x158
[   53.317837] CPU: 0 PID: 1691 Comm: ftracetest Not tainted 4.9.55-dirty #1
[   53.324652] Hardware name: Generic DRA74X (Flattened Device Tree)
[   53.330857] [<c03114d8>] (unwind_backtrace) from [<c030cb18>] (show_stack+0x10/0x14)
[   53.338644] [<c030cb18>] (show_stack) from [<c067e604>] (dump_stack+0xa4/0xd0)
[   53.345908] [<c067e604>] (dump_stack) from [<c0373808>] (___might_sleep+0x1ac/0x2a0)
[   53.353694] [<c0373808>] (___might_sleep) from [<c0d60ec8>] (do_page_fault+0x25c/0x428)
[   53.361739] [<c0d60ec8>] (do_page_fault) from [<c03013e8>] (do_PrefetchAbort+0x38/0x9c)
[   53.369780] [<c03013e8>] (do_PrefetchAbort) from [<c0d605a8>] (__pabt_svc+0x68/0xa0)
[   53.377557] Exception stack(0xec6fbfa8 to 0xec6fbff0)
[   53.382629] bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
[   53.390845] bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
[   53.399055] bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
[   53.404137] [<c0d605a8>] (__pabt_svc) from [<b6e6d49c>] (0xb6e6d49c)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-21 13:06 do page fault in atomic bug on arm Alex Shi
@ 2017-11-21 13:20 ` Russell King - ARM Linux
  2017-11-24 15:09   ` Alex Shi
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2017-11-21 13:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 21, 2017 at 09:06:27PM +0800, Alex Shi wrote:
> Hi All,
> 
> LKFT occasionally found a kernel bug in x15 platform, which is a armv7 board. 
> The bug caught on kernel commit f82786d v4.9.55, but panic could happens in 
> upstream, since there is no much change on the function call chain.
> 
> The function call chain is vector___pabt_svc -> do_PrefetchAbort -> 
> 	do_page_fault -> might_sleep()
> 
> The trick thing is LKFT team can not reproduce the bug. But from the kernel
> panic info, we know the irq_disabled() is 128, that would be the only reason,
> we got the panic -- the code can not return since irqs_disabled() = 128.
> The preempt_offset and preempt_count are both 0 here.
> 
> line 7726 in kernel/sched/core.c: in function ___might_sleep():
>        if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
>              !is_idle_task(current)) ||
>             system_state != SYSTEM_RUNNING || oops_in_progress)
>                 return;
> 
> I have no more idea on this issue. Any hints are appreciated!
> 
> Regards
> Alex
> 
>  BUG: sleeping function called from invalid context at /srv/oe/build/tmp-rpb-glibc/work-shared/am57xx-evm/kernel-source/arch/arm/mm/fault.c:303
> [   53.264908] in_atomic(): 0, irqs_disabled(): 128, pid: 1691, name: ftracetest
> [   53.272074] 1 lock held by ftracetest/1691:
> [   53.276273]  #0:  (&mm->mmap_sem){++++++}, at: [<c0d60cfc>] do_page_fault+0x90/0x428
> [   53.284095] irq event stamp: 12924
> [   53.287514] hardirqs last  enabled at (12923): [<c0307f10>] no_work_pending+0x4/0x30
> [   53.295289] hardirqs last disabled at (12924): [<c0d605a0>] __pabt_svc+0x60/0xa0

Unfortunately, this doesn't help, because on entry to __pabt_svc, we
tell the IRQ context tracker that IRQs are now disabled, wiping out
the previous recording of where IRQs were disabled...

> [   53.302718] softirqs last  enabled at (11474): [<c034c5d0>] __do_softirq+0x280/0x5ac
> [   53.310494] softirqs last disabled at (11433): [<c034cc98>] irq_exit+0xf4/0x158
> [   53.317837] CPU: 0 PID: 1691 Comm: ftracetest Not tainted 4.9.55-dirty #1
> [   53.324652] Hardware name: Generic DRA74X (Flattened Device Tree)
> [   53.330857] [<c03114d8>] (unwind_backtrace) from [<c030cb18>] (show_stack+0x10/0x14)
> [   53.338644] [<c030cb18>] (show_stack) from [<c067e604>] (dump_stack+0xa4/0xd0)
> [   53.345908] [<c067e604>] (dump_stack) from [<c0373808>] (___might_sleep+0x1ac/0x2a0)
> [   53.353694] [<c0373808>] (___might_sleep) from [<c0d60ec8>] (do_page_fault+0x25c/0x428)
> [   53.361739] [<c0d60ec8>] (do_page_fault) from [<c03013e8>] (do_PrefetchAbort+0x38/0x9c)
> [   53.369780] [<c03013e8>] (do_PrefetchAbort) from [<c0d605a8>] (__pabt_svc+0x68/0xa0)
> [   53.377557] Exception stack(0xec6fbfa8 to 0xec6fbff0)
> [   53.382629] bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
> [   53.390845] bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
> [   53.399055] bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
> [   53.404137] [<c0d605a8>] (__pabt_svc) from [<b6e6d49c>] (0xb6e6d49c)

It also doesn't help that the backtrace stops at this point, and it looks
very strange:

1. the value of PC looks like it's outside of the module space.
2. the CPSR indicates that the CPU was in SVC mode in the parent context
   with IRQs disabled.
3. We're right at the top of the kernel stack, which suggests no further
   stack frames above this.

We should never be in SVC mode without further stack frames on the kernel
stack.

We don't seem to have overflowed the kernel stack, as the thread info
seems correct - and it would also be unlikely that the saved SP value
would end in ff8 in the exception stack frame.

I suspect something nasty is going on in the ftrace code, causing some
stacked state corruption, which then leads to us returning from a
kernel exception with state that leaves the CPU in SVC mode with
IRQs disabled, and with a LR & PC value of 0xb6e6d49c - a page that
doesn't exist.  That the leads to a prefetch abort, and this error.

In other words, the real problem is that something has gone wrong in
the ftrace code... what that is, I've no idea.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-21 13:20 ` Russell King - ARM Linux
@ 2017-11-24 15:09   ` Alex Shi
  2017-11-24 15:56     ` Russell King - ARM Linux
  2017-11-24 19:27     ` Russell King - ARM Linux
  0 siblings, 2 replies; 19+ messages in thread
From: Alex Shi @ 2017-11-24 15:09 UTC (permalink / raw)
  To: linux-arm-kernel


>> [   53.302718] softirqs last  enabled at (11474): [<c034c5d0>] __do_softirq+0x280/0x5ac
>> [   53.310494] softirqs last disabled at (11433): [<c034cc98>] irq_exit+0xf4/0x158
>> [   53.317837] CPU: 0 PID: 1691 Comm: ftracetest Not tainted 4.9.55-dirty #1
>> [   53.324652] Hardware name: Generic DRA74X (Flattened Device Tree)
>> [   53.330857] [<c03114d8>] (unwind_backtrace) from [<c030cb18>] (show_stack+0x10/0x14)
>> [   53.338644] [<c030cb18>] (show_stack) from [<c067e604>] (dump_stack+0xa4/0xd0)
>> [   53.345908] [<c067e604>] (dump_stack) from [<c0373808>] (___might_sleep+0x1ac/0x2a0)
>> [   53.353694] [<c0373808>] (___might_sleep) from [<c0d60ec8>] (do_page_fault+0x25c/0x428)
>> [   53.361739] [<c0d60ec8>] (do_page_fault) from [<c03013e8>] (do_PrefetchAbort+0x38/0x9c)
>> [   53.369780] [<c03013e8>] (do_PrefetchAbort) from [<c0d605a8>] (__pabt_svc+0x68/0xa0)
>> [   53.377557] Exception stack(0xec6fbfa8 to 0xec6fbff0)
>> [   53.382629] bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
>> [   53.390845] bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
>> [   53.399055] bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
>> [   53.404137] [<c0d605a8>] (__pabt_svc) from [<b6e6d49c>] (0xb6e6d49c)
> 
> It also doesn't help that the backtrace stops at this point, and it looks
> very strange:
> 
> 1. the value of PC looks like it's outside of the module space.
> 2. the CPSR indicates that the CPU was in SVC mode in the parent context
>    with IRQs disabled.
> 3. We're right at the top of the kernel stack, which suggests no further
>    stack frames above this.
> 
> We should never be in SVC mode without further stack frames on the kernel
> stack.
> 
> We don't seem to have overflowed the kernel stack, as the thread info
> seems correct - and it would also be unlikely that the saved SP value
> would end in ff8 in the exception stack frame.

Hi Russell,

Sorry for response late!
Is this SP was stained by sth? As my understand, SP should be times of
32bits. But why stack print out correct with a incorrect SP?

> 
> I suspect something nasty is going on in the ftrace code, causing some
> stacked state corruption, which then leads to us returning from a
> kernel exception with state that leaves the CPU in SVC mode with
> IRQs disabled, and with a LR & PC value of 0xb6e6d49c - a page that
> doesn't exist.  That the leads to a prefetch abort, and this error.
> 
> In other words, the real problem is that something has gone wrong in
> the ftrace code... what that is, I've no idea.
> 

Full agree with your analysis. Is it possible to stain PC value with
heavy stress on thermal or sth else? the ARM64 board run well with
ftracetest of LTP.

Thanks a lot!
Alex

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-24 15:09   ` Alex Shi
@ 2017-11-24 15:56     ` Russell King - ARM Linux
  2017-11-26 14:58       ` Alex Shi
  2017-11-24 19:27     ` Russell King - ARM Linux
  1 sibling, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2017-11-24 15:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 24, 2017 at 11:09:30PM +0800, Alex Shi wrote:
> 
> >> [   53.302718] softirqs last  enabled at (11474): [<c034c5d0>] __do_softirq+0x280/0x5ac
> >> [   53.310494] softirqs last disabled at (11433): [<c034cc98>] irq_exit+0xf4/0x158
> >> [   53.317837] CPU: 0 PID: 1691 Comm: ftracetest Not tainted 4.9.55-dirty #1
> >> [   53.324652] Hardware name: Generic DRA74X (Flattened Device Tree)
> >> [   53.330857] [<c03114d8>] (unwind_backtrace) from [<c030cb18>] (show_stack+0x10/0x14)
> >> [   53.338644] [<c030cb18>] (show_stack) from [<c067e604>] (dump_stack+0xa4/0xd0)
> >> [   53.345908] [<c067e604>] (dump_stack) from [<c0373808>] (___might_sleep+0x1ac/0x2a0)
> >> [   53.353694] [<c0373808>] (___might_sleep) from [<c0d60ec8>] (do_page_fault+0x25c/0x428)
> >> [   53.361739] [<c0d60ec8>] (do_page_fault) from [<c03013e8>] (do_PrefetchAbort+0x38/0x9c)
> >> [   53.369780] [<c03013e8>] (do_PrefetchAbort) from [<c0d605a8>] (__pabt_svc+0x68/0xa0)
> >> [   53.377557] Exception stack(0xec6fbfa8 to 0xec6fbff0)
> >> [   53.382629] bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
> >> [   53.390845] bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
> >> [   53.399055] bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
> >> [   53.404137] [<c0d605a8>] (__pabt_svc) from [<b6e6d49c>] (0xb6e6d49c)
> > 
> > It also doesn't help that the backtrace stops at this point, and it looks
> > very strange:
> > 
> > 1. the value of PC looks like it's outside of the module space.
> > 2. the CPSR indicates that the CPU was in SVC mode in the parent context
> >    with IRQs disabled.
> > 3. We're right at the top of the kernel stack, which suggests no further
> >    stack frames above this.
> > 
> > We should never be in SVC mode without further stack frames on the kernel
> > stack.
> > 
> > We don't seem to have overflowed the kernel stack, as the thread info
> > seems correct - and it would also be unlikely that the saved SP value
> > would end in ff8 in the exception stack frame.
> 
> Hi Russell,
> 
> Sorry for response late!
> Is this SP was stained by sth? As my understand, SP should be times of
> 32bits. But why stack print out correct with a incorrect SP?

There's nothing wrong with SP.

Looking deeper at this, I think that the kernel stack somehow got
corrupted earlier:

irq event stamp: 12924
hardirqs last  enabled at (12923): [<c0307f10>] no_work_pending+0x4/0x30
hardirqs last disabled at (12924): [<c0d605a0>] __pabt_svc+0x60/0xa0

The hard IRQ disable is as a result of taking a prefetch abort in
SVC mode.  The saved context agrees with that:

                        R0       R1       R2       R3       R4       R5
bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
      R6       R7       R8       R9       R10      FP       IP       SP
bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
      LR       PC       PSR
bfe0: b6e6d49c b6e6d49c 40070093 ffffffff

The PSR lower 5 bits are 0x13, which is SVC mode.  Bit 7 set means IRQs
disabled.  The PC address was 0xb6e6d49c.

The last record we have of interrupts being enabled was in
no_work_pending, which is the exit path to usermode - but if we were
returning to usermode:

(a) how did we get into SVC mode instead
(b) why are interrupts disabled
(c) why was mm->mmap_sem still held

Can you try the following patch to try and catch the problem earlier?
I haven't tested it myself, and adding code may move things around in
the kernel and make this bug disappear.

diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index d523cd8439a3..ff577177b286 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -299,6 +299,8 @@
 	@ ARM mode restore
 	mov	r2, sp
 	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
+	tst	r1, #0xcf
+	bne	oops
 	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
 	msr	spsr_cxsf, r1			@ save in spsr_svc
 #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
@@ -314,6 +316,15 @@
 						@ after ldm {}^
 	add	sp, sp, #\offset + PT_REGS_SIZE
 	movs	pc, lr				@ return & move spsr_svc into cpsr
+oops:	.word	0xe7f001f2
+	.pushsection .rodata.str, "aMS", %progbits, 1
+2:	.asciz	"Returning to usermode but unexpected PSR bits set?"
+	.popsection
+	.pushsection __bug_table, "aw"
+	.align	2
+	.word	oops, 2b
+	.hword	\@
+	.popsection
 #elif defined(CONFIG_CPU_V7M)
 	@ V7M restore.
 	@ Note that we don't need to do clrex here as clearing the local


-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-24 15:09   ` Alex Shi
  2017-11-24 15:56     ` Russell King - ARM Linux
@ 2017-11-24 19:27     ` Russell King - ARM Linux
  2017-11-24 20:25       ` Russell King - ARM Linux
  2017-11-26 12:07       ` Alex Shi
  1 sibling, 2 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2017-11-24 19:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 24, 2017 at 11:09:30PM +0800, Alex Shi wrote:
> Full agree with your analysis. Is it possible to stain PC value with
> heavy stress on thermal or sth else? the ARM64 board run well with
> ftracetest of LTP.

In your first email, you said "x15 platform, which is a armv7 board."
Here you say "ARM64 board" which isn't armv7.  There's x15 DTS under
arch/arm/boot/dts, so I guess you mean 32-bit ARM, but who knows...

Anyway, I've tried running ftracetest on an OMAP4430 SDP board, and
after a while with the patch I sent you, I get:

Internal error: Oops - BUG: 0 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 2948 Comm: ftracetest Not tainted 4.14.0+ #557
Hardware name: Generic OMAP4 (Flattened Device Tree)
task: ce41c100 task.stack: cc7b8000
PC is at oops+0x0/0x4
LR is at trace_hardirqs_on_caller+0x154/0x1e0
pc : [<c0015adc>]    lr : [<c0086840>]    psr: 20000193
sp : cc7b9fb0  ip : cc7b9f80  fp : 00000000
r10: 00000000  r9 : cc7b8000  r8 : c0015c28
r7 : 00000006  r6 : 00000004  r5 : 0009fed4  r4 : 00000001
r3 : 00000000  r2 : cc7b9fb0  r1 : 60000193  r0 : 00000001
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 8e7b804a  DAC: 00000055
Process ftracetest (pid: 2948, stack limit = 0xcc7b8210)
Stack: (0xcc7b9fb0 to 0xcc7ba000)
9fa0:                                     00000000 00000000 0009b008 0000000d
9fc0: 00000001 0009fed4 00000004 00000006 0009b3e0 000a17d4 00000000 beaf6e3c
9fe0: 0009fed0 beaf6e20 000319e0 b6e7199c 60000193 00000001 6b6b6b6b a56b6b6b
Backtrace: no frame pointer
Code: e9527fff e1a00000 e28dd048 e1b0f00e (e7f001f2)
---[ end trace 390efe5843605357 ]---

The other CPU also oopses:

Internal error: Oops - BUG: 0 [#3] SMP ARM
Modules linked in:
CPU: 1 PID: 1 Comm: init Tainted: G      D         4.14.0+ #557
Hardware name: Generic OMAP4 (Flattened Device Tree)
task: ced04c00 task.stack: ced06000
PC is at oops+0x0/0x4
LR is at trace_hardirqs_on+0x14/0x18
pc : [<c0015adc>]    lr : [<c00868e0>]    psr: 20000193
sp : ced07fb0  ip : ced07fa0  fp : 00000000
r10: 00000000  r9 : ced06000  r8 : c0015c28
r7 : 0000004e  r6 : bec0acd4  r5 : 000176b4  r4 : bec0ac3c
r3 : 00000000  r2 : ced07fb0  r1 : 60000193  r0 : c0015aa8
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 8e2d804a  DAC: 00000055
Process init (pid: 1, stack limit = 0xced06210)
Stack: (0xced07fb0 to 0xced08000)
7fa0:                                     00000000 00000000 00000000 00000000
7fc0: bec0ac3c 000176b4 bec0acd4 0000004e 10000000 00000000 0000a1d0 bec0ac44
7fe0: bec0acd8 bec0ac28 b6e54544 b6e5456c 60000193 bec0ac28 005d5555 00020201
Backtrace: no frame pointer
Code: e9527fff e1a00000 e28dd048 e1b0f00e (e7f001f2)
---[ end trace 390efe5843605358 ]---

which is exactly your bug, but caught a bit earlier.

This happens while executing this ftrace test:

[28] Register/unregister many kprobe events

and needs a kernel with ftrace and kprobes enabled.

Unfortunately, the debug is immediately after a call to
trace_hardirqs_on() in no_work_pending, so the LR value is
meaningless.

So, now that we know it's tracing kprobes triggering it - it's
trying to set tracepoints on the first 256 symbols in the kernel's
kallsyms, which includes all sorts of things.

With some extra debug, this doesn't look clever:

trace_kprobe: Inserting kprobe at ret_fast_syscall+0
trace_kprobe: Inserting kprobe at slow_work_pending+0
trace_kprobe: Inserting kprobe at ret_slow_syscall+0
trace_kprobe: Could not insert probe at ret_slow_syscall+0: -22
trace_kprobe: Inserting kprobe at ret_to_user+0
trace_kprobe: Could not insert probe at ret_to_user+0: -22
trace_kprobe: Inserting kprobe at ret_to_user_from_irq+0
trace_kprobe: Inserting kprobe at no_work_pending+0
trace_kprobe: Inserting kprobe at oops+0
trace_kprobe: Could not insert probe at oops+0: -22
trace_kprobe: Inserting kprobe at ret_from_fork+0
trace_kprobe: Inserting kprobe at vector_swi+0
trace_kprobe: Inserting kprobe at local_restart+0
trace_kprobe: Inserting kprobe at __sys_trace+0
trace_kprobe: Inserting kprobe at __sys_trace_return+0
trace_kprobe: Inserting kprobe at __sys_trace_return_nosave+0
trace_kprobe: Could not insert probe at __sys_trace_return_nosave+0: -22
trace_kprobe: Inserting kprobe at __cr_alignment+0
trace_kprobe: Could not insert probe at __cr_alignment+0: -22
trace_kprobe: Inserting kprobe at sys_call_table+0
trace_kprobe: Inserting kprobe at sys_syscall+0
trace_kprobe: Inserting kprobe at sys_sigreturn_wrapper+0
trace_kprobe: Inserting kprobe at sys_rt_sigreturn_wrapper+0
trace_kprobe: Inserting kprobe at sys_statfs64_wrapper+0
trace_kprobe: Inserting kprobe at sys_fstatfs64_wrapper+0

I wouldn't be surprised if some of those were the cause of it -
for example, what guarantee do we have that a trace kprobe inserted
at ret_fast_syscall which starts with this:

c0015a40:       e5ad0008        str     r0, [sp, #8]!

will be handled correctly?  I can't say, I've virtually no knowledge
about kprobes, but I guess it isn't - especially as there's this
comment in the ARM kprobes code:

         * Never instrument insn like 'str r0, [sp, +/-r1]'. Also, insn likes
         * 'str r0, [sp, #-68]' should also be prohibited.

Clearly, that's not the case as the kprobes insert on
ret_fast_syscall succeeded.

Adding Tixy, as he's more knowledgable in this area - I suggest
someone knowledgable in this area runs

	ftracetest test.d/kprobe/multiple_kprobes.tc

and fixes these bugs... also running the entire ftracetest suite
would probably also be a very good idea.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-24 19:27     ` Russell King - ARM Linux
@ 2017-11-24 20:25       ` Russell King - ARM Linux
  2017-11-24 22:20         ` Russell King - ARM Linux
  2017-11-26 14:59         ` Alex Shi
  2017-11-26 12:07       ` Alex Shi
  1 sibling, 2 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2017-11-24 20:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 24, 2017 at 07:27:00PM +0000, Russell King - ARM Linux wrote:
> Adding Tixy, as he's more knowledgable in this area - I suggest
> someone knowledgable in this area runs
> 
> 	ftracetest test.d/kprobe/multiple_kprobes.tc
> 
> and fixes these bugs... also running the entire ftracetest suite
> would probably also be a very good idea.

There's some other stupidities as well:

trace_kprobe: Inserting kprobe at __error_lpae+0
trace_kprobe: Inserting kprobe at str_lpae+0
trace_kprobe: Inserting kprobe at __error_p+0
trace_kprobe: Inserting kprobe at str_p1+0
trace_kprobe: Inserting kprobe at str_p2+0
trace_kprobe: Could not insert probe at str_p2+0: -22

str_lpae: .asciz "\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
str_p1: .asciz  "\nError: unrecognized/unsupported processor variant (0x"
str_p2: .asciz  ").\n"

So we successfully placed a kprobe on some ASCII strings, which are
used prior to the kernel being properly up and running, which means
we have to use relative references to these strings, and relative
references to strings in other sections is not simple.

More worrying:

trace_kprobe: Inserting kprobe at __turn_mmu_on+0
trace_kprobe: Inserting kprobe at __idmap_text_start+0
trace_kprobe: Inserting kprobe at __turn_mmu_on_end+0
...
trace_kprobe: Inserting kprobe at __idmap_text_end+0
...
trace_kprobe: Inserting kprobe at secondary_startup+0
trace_kprobe: Inserting kprobe at secondary_startup_arm+0
trace_kprobe: Inserting kprobe at __secondary_switched+0
trace_kprobe: Inserting kprobe at __secondary_data+0
trace_kprobe: Inserting kprobe at __enable_mmu+0
trace_kprobe: Inserting kprobe at __do_fixup_smp_on_up+0
trace_kprobe: Inserting kprobe at __fixup_a_pv_table+0
trace_kprobe: Inserting kprobe at fixup_pv_table+0
trace_kprobe: Inserting kprobe at __lookup_processor_type+0
trace_kprobe: Inserting kprobe at __lookup_processor_type_data+0

These are definitely a no-no for tracing, because they're part of the
early startup code for the kernel when no stacks are available.

Some of these can't live in the special "do not use kprobes here"
section as they need to be in other sections (like .idmap) because
they need to be part of the identity-mapped code.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-24 20:25       ` Russell King - ARM Linux
@ 2017-11-24 22:20         ` Russell King - ARM Linux
  2017-11-26 15:02           ` Alex Shi
  2017-11-26 14:59         ` Alex Shi
  1 sibling, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2017-11-24 22:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 24, 2017 at 08:25:53PM +0000, Russell King - ARM Linux wrote:
> On Fri, Nov 24, 2017 at 07:27:00PM +0000, Russell King - ARM Linux wrote:
> > Adding Tixy, as he's more knowledgable in this area - I suggest
> > someone knowledgable in this area runs
> > 
> > 	ftracetest test.d/kprobe/multiple_kprobes.tc
> > 
> > and fixes these bugs... also running the entire ftracetest suite
> > would probably also be a very good idea.
> 
> There's some other stupidities as well:
> 
> trace_kprobe: Inserting kprobe at __error_lpae+0
> trace_kprobe: Inserting kprobe at str_lpae+0
> trace_kprobe: Inserting kprobe at __error_p+0
> trace_kprobe: Inserting kprobe at str_p1+0
> trace_kprobe: Inserting kprobe at str_p2+0
> trace_kprobe: Could not insert probe at str_p2+0: -22
> 
> str_lpae: .asciz "\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
> str_p1: .asciz  "\nError: unrecognized/unsupported processor variant (0x"
> str_p2: .asciz  ").\n"
> 
> So we successfully placed a kprobe on some ASCII strings, which are
> used prior to the kernel being properly up and running, which means
> we have to use relative references to these strings, and relative
> references to strings in other sections is not simple.
> 
> More worrying:
> 
> trace_kprobe: Inserting kprobe at __turn_mmu_on+0
> trace_kprobe: Inserting kprobe at __idmap_text_start+0
> trace_kprobe: Inserting kprobe at __turn_mmu_on_end+0
> ...
> trace_kprobe: Inserting kprobe at __idmap_text_end+0
> ...
> trace_kprobe: Inserting kprobe at secondary_startup+0
> trace_kprobe: Inserting kprobe at secondary_startup_arm+0
> trace_kprobe: Inserting kprobe at __secondary_switched+0
> trace_kprobe: Inserting kprobe at __secondary_data+0
> trace_kprobe: Inserting kprobe at __enable_mmu+0
> trace_kprobe: Inserting kprobe at __do_fixup_smp_on_up+0
> trace_kprobe: Inserting kprobe at __fixup_a_pv_table+0
> trace_kprobe: Inserting kprobe at fixup_pv_table+0
> trace_kprobe: Inserting kprobe at __lookup_processor_type+0
> trace_kprobe: Inserting kprobe at __lookup_processor_type_data+0
> 
> These are definitely a no-no for tracing, because they're part of the
> early startup code for the kernel when no stacks are available.
> 
> Some of these can't live in the special "do not use kprobes here"
> section as they need to be in other sections (like .idmap) because
> they need to be part of the identity-mapped code.

Okay, this is what I came up with, and seems to solve the problem
here.  It's quite a large patch though, as it shuffles around how
we deal with exception entry, and knowing whether we should dump
the stacked registers.  This particular patch also contains a few
debugging bits too.

 arch/arm/include/asm/exception.h |  3 +--
 arch/arm/include/asm/sections.h  | 21 +++++++++++++++++++++
 arch/arm/include/asm/traps.h     | 12 ------------
 arch/arm/kernel/entry-armv.S     |  6 +-----
 arch/arm/kernel/entry-common.S   |  1 +
 arch/arm/kernel/entry-header.S   | 13 +++++++++++++
 arch/arm/kernel/head-common.S    | 12 ++++++------
 arch/arm/kernel/head.S           |  2 +-
 arch/arm/kernel/stacktrace.c     | 14 ++------------
 arch/arm/kernel/traps.c          |  4 ++--
 arch/arm/kernel/vmlinux.lds.S    |  6 +++---
 arch/arm/mm/fault.c              |  5 ++---
 arch/arm/probes/kprobes/core.c   | 14 +++++++++++---
 kernel/trace/trace_kprobe.c      |  3 +++
 14 files changed, 67 insertions(+), 49 deletions(-)

diff --git a/arch/arm/include/asm/exception.h b/arch/arm/include/asm/exception.h
index a7273ad9587a..58e039a851af 100644
--- a/arch/arm/include/asm/exception.h
+++ b/arch/arm/include/asm/exception.h
@@ -10,11 +10,10 @@
 
 #include <linux/interrupt.h>
 
-#define __exception	__attribute__((section(".exception.text")))
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 #define __exception_irq_entry	__irq_entry
 #else
-#define __exception_irq_entry	__exception
+#define __exception_irq_entry
 #endif
 
 #endif /* __ASM_ARM_EXCEPTION_H */
diff --git a/arch/arm/include/asm/sections.h b/arch/arm/include/asm/sections.h
index 63dfe1f10335..4ceb4f757d4d 100644
--- a/arch/arm/include/asm/sections.h
+++ b/arch/arm/include/asm/sections.h
@@ -6,4 +6,25 @@
 
 extern char _exiprom[];
 
+extern char __idmap_text_start[];
+extern char __idmap_text_end[];
+extern char __entry_text_start[];
+extern char __entry_text_end[];
+extern char __hyp_idmap_text_start[];
+extern char __hyp_idmap_text_end[];
+
+static inline bool in_entry_text(unsigned long addr)
+{
+	return memory_contains(__entry_text_start, __entry_text_end,
+			       (void *)addr, 1);
+}
+
+static inline bool in_idmap_text(unsigned long addr)
+{
+	void *a = (void *)addr;
+	return memory_contains(__idmap_text_start, __idmap_text_end, a, 1) ||
+	       memory_contains(__hyp_idmap_text_start, __hyp_idmap_text_end,
+			       a, 1);
+}
+
 #endif	/* _ASM_ARM_SECTIONS_H */
diff --git a/arch/arm/include/asm/traps.h b/arch/arm/include/asm/traps.h
index f9a6c5fc3fd1..a00288d75ee6 100644
--- a/arch/arm/include/asm/traps.h
+++ b/arch/arm/include/asm/traps.h
@@ -28,18 +28,6 @@ static inline int __in_irqentry_text(unsigned long ptr)
 	       ptr < (unsigned long)&__irqentry_text_end;
 }
 
-static inline int in_exception_text(unsigned long ptr)
-{
-	extern char __exception_text_start[];
-	extern char __exception_text_end[];
-	int in;
-
-	in = ptr >= (unsigned long)&__exception_text_start &&
-	     ptr < (unsigned long)&__exception_text_end;
-
-	return in ? : __in_irqentry_text(ptr);
-}
-
 extern void __init early_trap_init(void *);
 extern void dump_backtrace_entry(unsigned long where, unsigned long from, unsigned long frame);
 extern void ptrace_break(struct task_struct *tsk, struct pt_regs *regs);
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 10ad44f3886a..b8ab97dfdd17 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -82,11 +82,7 @@
 #endif
 	.endm
 
-#ifdef CONFIG_KPROBES
-	.section	.kprobes.text,"ax",%progbits
-#else
-	.text
-#endif
+	.section	.entry.text,"ax",%progbits
 
 /*
  * Invalid mode handlers
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index e655dcd0a933..3c4f88701f22 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -37,6 +37,7 @@ saved_pc	.req	lr
 #define TRACE(x...)
 #endif
 
+	.section .entry.text,"ax",%progbits
 	.align	5
 #if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING))
 /*
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index d523cd8439a3..e2d216ad70f0 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -300,6 +300,8 @@
 	mov	r2, sp
 	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
 	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
+	tst	r1, #0xcf
+	bne	oops
 	msr	spsr_cxsf, r1			@ save in spsr_svc
 #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
 	@ We must avoid clrex due to Cortex-A15 erratum #830321
@@ -314,6 +316,17 @@
 						@ after ldm {}^
 	add	sp, sp, #\offset + PT_REGS_SIZE
 	movs	pc, lr				@ return & move spsr_svc into cpsr
+oops:	.word	0xe7f001f2
+#ifdef CONFIG_DEBUG_BUGVERBOSE
+	.pushsection .rodata.str, "aMS", %progbits, 1
+2:	.asciz	"Returning to usermode but unexpected PSR bits set?"
+	.popsection
+	.pushsection __bug_table, "aw"
+	.align	2
+	.word	oops, 2b
+	.hword	\@
+	.popsection
+#endif
 #elif defined(CONFIG_CPU_V7M)
 	@ V7M restore.
 	@ Note that we don't need to do clrex here as clearing the local
diff --git a/arch/arm/kernel/head-common.S b/arch/arm/kernel/head-common.S
index 21dde771a7dd..a989d84c4c38 100644
--- a/arch/arm/kernel/head-common.S
+++ b/arch/arm/kernel/head-common.S
@@ -201,10 +201,10 @@ ENDPROC(__lookup_processor_type)
 
 __error_lpae:
 #ifdef CONFIG_DEBUG_LL
-	adr	r0, str_lpae
+	adr	r0, 1f
 	bl 	printascii
 	b	__error
-str_lpae: .asciz "\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
+1:	.asciz	"\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
 #else
 	b	__error
 #endif
@@ -213,15 +213,15 @@ ENDPROC(__error_lpae)
 
 __error_p:
 #ifdef CONFIG_DEBUG_LL
-	adr	r0, str_p1
+	adr	r0, 1f
 	bl	printascii
 	mov	r0, r9
 	bl	printhex8
-	adr	r0, str_p2
+	adr	r0, 2f
 	bl	printascii
 	b	__error
-str_p1:	.asciz	"\nError: unrecognized/unsupported processor variant (0x"
-str_p2:	.asciz	").\n"
+1:	.asciz	"\nError: unrecognized/unsupported processor variant (0x"
+2:	.asciz	").\n"
 	.align
 #endif
 ENDPROC(__error_p)
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 6b1148cafffd..7ea72ce41329 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -415,7 +415,7 @@ ENDPROC(secondary_startup_arm)
 	/*
 	 * r6  = &secondary_data
 	 */
-ENTRY(__secondary_switched)
+__secondary_switched:
 	ldr	sp, [r7, #12]			@ get secondary_data.stack
 	mov	fp, #0
 	b	secondary_start_kernel
diff --git a/arch/arm/kernel/stacktrace.c b/arch/arm/kernel/stacktrace.c
index 65228bf4c6df..a56e7c856ab5 100644
--- a/arch/arm/kernel/stacktrace.c
+++ b/arch/arm/kernel/stacktrace.c
@@ -3,6 +3,7 @@
 #include <linux/sched/debug.h>
 #include <linux/stacktrace.h>
 
+#include <asm/sections.h>
 #include <asm/stacktrace.h>
 #include <asm/traps.h>
 
@@ -63,7 +64,6 @@ EXPORT_SYMBOL(walk_stackframe);
 #ifdef CONFIG_STACKTRACE
 struct stack_trace_data {
 	struct stack_trace *trace;
-	unsigned long last_pc;
 	unsigned int no_sched_functions;
 	unsigned int skip;
 };
@@ -87,16 +87,7 @@ static int save_trace(struct stackframe *frame, void *d)
 	if (trace->nr_entries >= trace->max_entries)
 		return 1;
 
-	/*
-	 * in_exception_text() is designed to test if the PC is one of
-	 * the functions which has an exception stack above it, but
-	 * unfortunately what is in frame->pc is the return LR value,
-	 * not the saved PC value.  So, we need to track the previous
-	 * frame PC value when doing this.
-	 */
-	addr = data->last_pc;
-	data->last_pc = frame->pc;
-	if (!in_exception_text(addr))
+	if (!in_entry_text(frame->pc))
 		return 0;
 
 	regs = (struct pt_regs *)frame->sp;
@@ -114,7 +105,6 @@ static noinline void __save_stack_trace(struct task_struct *tsk,
 	struct stackframe frame;
 
 	data.trace = trace;
-	data.last_pc = ULONG_MAX;
 	data.skip = trace->skip;
 	data.no_sched_functions = nosched;
 
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 5de2bc46153f..95978073db53 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -73,7 +73,7 @@ void dump_backtrace_entry(unsigned long where, unsigned long from, unsigned long
 	printk("Function entered at [<%08lx>] from [<%08lx>]\n", where, from);
 #endif
 
-	if (in_exception_text(where))
+	if (in_entry_text(from))
 		dump_mem("", "Exception stack", frame + 4, frame + 4 + sizeof(struct pt_regs));
 }
 
@@ -434,7 +434,7 @@ static int call_undef_hook(struct pt_regs *regs, unsigned int instr)
 	return fn ? fn(regs, instr) : 1;
 }
 
-asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
+asmlinkage void do_undefinstr(struct pt_regs *regs)
 {
 	unsigned int instr;
 	siginfo_t info;
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index ee53f6518872..84a1ae3ce46e 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -105,9 +105,9 @@ SECTIONS
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
 			IDMAP_TEXT
-			__exception_text_start = .;
-			*(.exception.text)
-			__exception_text_end = .;
+			__entry_text_start = .;
+			*(.entry.text)
+			__entry_text_end = .;
 			IRQENTRY_TEXT
 			SOFTIRQENTRY_TEXT
 			TEXT_TEXT
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 42f585379e19..b75eada23d0a 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -21,7 +21,6 @@
 #include <linux/highmem.h>
 #include <linux/perf_event.h>
 
-#include <asm/exception.h>
 #include <asm/pgtable.h>
 #include <asm/system_misc.h>
 #include <asm/system_info.h>
@@ -545,7 +544,7 @@ hook_fault_code(int nr, int (*fn)(unsigned long, unsigned int, struct pt_regs *)
 /*
  * Dispatch a data abort to the relevant handler.
  */
-asmlinkage void __exception
+asmlinkage void
 do_DataAbort(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 {
 	const struct fsr_info *inf = fsr_info + fsr_fs(fsr);
@@ -578,7 +577,7 @@ hook_ifault_code(int nr, int (*fn)(unsigned long, unsigned int, struct pt_regs *
 	ifsr_info[nr].name = name;
 }
 
-asmlinkage void __exception
+asmlinkage void
 do_PrefetchAbort(unsigned long addr, unsigned int ifsr, struct pt_regs *regs)
 {
 	const struct fsr_info *inf = ifsr_info + fsr_fs(ifsr);
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index 52d1cd14fda4..e90cc8a08186 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -32,6 +32,7 @@
 #include <linux/percpu.h>
 #include <linux/bug.h>
 #include <asm/patch.h>
+#include <asm/sections.h>
 
 #include "../decode-arm.h"
 #include "../decode-thumb.h"
@@ -64,9 +65,6 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
 	int is;
 	const struct decode_checker **checkers;
 
-	if (in_exception_text(addr))
-		return -EINVAL;
-
 #ifdef CONFIG_THUMB2_KERNEL
 	thumb = true;
 	addr &= ~1; /* Bit 0 would normally be set to indicate Thumb code */
@@ -680,3 +678,13 @@ int __init arch_init_kprobes()
 #endif
 	return 0;
 }
+
+bool arch_within_kprobe_blacklist(unsigned long addr)
+{
+	void *a = (void *)addr;
+
+	return __in_irqentry_text(addr) ||
+	       in_entry_text(addr) ||
+	       in_idmap_text(addr) ||
+	       memory_contains(__kprobes_text_start, __kprobes_text_end, a, 1);
+}
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 8a907e12b6b9..b0a7068afa2d 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -472,6 +472,9 @@ static int __register_trace_kprobe(struct trace_kprobe *tk)
 	else
 		tk->rp.kp.flags |= KPROBE_FLAG_DISABLED;
 
+	pr_info("Inserting kprobe at %s+%lu\n",
+			trace_kprobe_symbol(tk), trace_kprobe_offset(tk));
+
 	if (trace_kprobe_is_return(tk))
 		ret = register_kretprobe(&tk->rp);
 	else


-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-24 19:27     ` Russell King - ARM Linux
  2017-11-24 20:25       ` Russell King - ARM Linux
@ 2017-11-26 12:07       ` Alex Shi
  2017-11-27  1:34         ` Masami Hiramatsu
  1 sibling, 1 reply; 19+ messages in thread
From: Alex Shi @ 2017-11-26 12:07 UTC (permalink / raw)
  To: linux-arm-kernel

CC Masami Hiramatsu for ftracetest part.

Hi Russell King,

Thanks a lot for quick response!

Regards
Alex

On 11/25/2017 03:27 AM, Russell King - ARM Linux wrote:
> On Fri, Nov 24, 2017 at 11:09:30PM +0800, Alex Shi wrote:
>> Full agree with your analysis. Is it possible to stain PC value with
>> heavy stress on thermal or sth else? the ARM64 board run well with
>> ftracetest of LTP.
> 
> In your first email, you said "x15 platform, which is a armv7 board."
> Here you say "ARM64 board" which isn't armv7.  There's x15 DTS under
> arch/arm/boot/dts, so I guess you mean 32-bit ARM, but who knows...
> 
> Anyway, I've tried running ftracetest on an OMAP4430 SDP board, and
> after a while with the patch I sent you, I get:
> 
> Internal error: Oops - BUG: 0 [#1] SMP ARM
> Modules linked in:
> CPU: 1 PID: 2948 Comm: ftracetest Not tainted 4.14.0+ #557
> Hardware name: Generic OMAP4 (Flattened Device Tree)
> task: ce41c100 task.stack: cc7b8000
> PC is at oops+0x0/0x4
> LR is at trace_hardirqs_on_caller+0x154/0x1e0
> pc : [<c0015adc>]    lr : [<c0086840>]    psr: 20000193
> sp : cc7b9fb0  ip : cc7b9f80  fp : 00000000
> r10: 00000000  r9 : cc7b8000  r8 : c0015c28
> r7 : 00000006  r6 : 00000004  r5 : 0009fed4  r4 : 00000001
> r3 : 00000000  r2 : cc7b9fb0  r1 : 60000193  r0 : 00000001
> Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c5387d  Table: 8e7b804a  DAC: 00000055
> Process ftracetest (pid: 2948, stack limit = 0xcc7b8210)
> Stack: (0xcc7b9fb0 to 0xcc7ba000)
> 9fa0:                                     00000000 00000000 0009b008 0000000d
> 9fc0: 00000001 0009fed4 00000004 00000006 0009b3e0 000a17d4 00000000 beaf6e3c
> 9fe0: 0009fed0 beaf6e20 000319e0 b6e7199c 60000193 00000001 6b6b6b6b a56b6b6b
> Backtrace: no frame pointer
> Code: e9527fff e1a00000 e28dd048 e1b0f00e (e7f001f2)
> ---[ end trace 390efe5843605357 ]---
> 
> The other CPU also oopses:
> 
> Internal error: Oops - BUG: 0 [#3] SMP ARM
> Modules linked in:
> CPU: 1 PID: 1 Comm: init Tainted: G      D         4.14.0+ #557
> Hardware name: Generic OMAP4 (Flattened Device Tree)
> task: ced04c00 task.stack: ced06000
> PC is at oops+0x0/0x4
> LR is at trace_hardirqs_on+0x14/0x18
> pc : [<c0015adc>]    lr : [<c00868e0>]    psr: 20000193
> sp : ced07fb0  ip : ced07fa0  fp : 00000000
> r10: 00000000  r9 : ced06000  r8 : c0015c28
> r7 : 0000004e  r6 : bec0acd4  r5 : 000176b4  r4 : bec0ac3c
> r3 : 00000000  r2 : ced07fb0  r1 : 60000193  r0 : c0015aa8
> Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c5387d  Table: 8e2d804a  DAC: 00000055
> Process init (pid: 1, stack limit = 0xced06210)
> Stack: (0xced07fb0 to 0xced08000)
> 7fa0:                                     00000000 00000000 00000000 00000000
> 7fc0: bec0ac3c 000176b4 bec0acd4 0000004e 10000000 00000000 0000a1d0 bec0ac44
> 7fe0: bec0acd8 bec0ac28 b6e54544 b6e5456c 60000193 bec0ac28 005d5555 00020201
> Backtrace: no frame pointer
> Code: e9527fff e1a00000 e28dd048 e1b0f00e (e7f001f2)
> ---[ end trace 390efe5843605358 ]---
> 
> which is exactly your bug, but caught a bit earlier.
> 
> This happens while executing this ftrace test:
> 
> [28] Register/unregister many kprobe events
> 
> and needs a kernel with ftrace and kprobes enabled.
> 
> Unfortunately, the debug is immediately after a call to
> trace_hardirqs_on() in no_work_pending, so the LR value is
> meaningless.
> 
> So, now that we know it's tracing kprobes triggering it - it's
> trying to set tracepoints on the first 256 symbols in the kernel's
> kallsyms, which includes all sorts of things.
> 
> With some extra debug, this doesn't look clever:
> 
> trace_kprobe: Inserting kprobe at ret_fast_syscall+0
> trace_kprobe: Inserting kprobe at slow_work_pending+0
> trace_kprobe: Inserting kprobe at ret_slow_syscall+0
> trace_kprobe: Could not insert probe at ret_slow_syscall+0: -22
> trace_kprobe: Inserting kprobe at ret_to_user+0
> trace_kprobe: Could not insert probe at ret_to_user+0: -22
> trace_kprobe: Inserting kprobe at ret_to_user_from_irq+0
> trace_kprobe: Inserting kprobe at no_work_pending+0
> trace_kprobe: Inserting kprobe at oops+0
> trace_kprobe: Could not insert probe at oops+0: -22
> trace_kprobe: Inserting kprobe at ret_from_fork+0
> trace_kprobe: Inserting kprobe at vector_swi+0
> trace_kprobe: Inserting kprobe at local_restart+0
> trace_kprobe: Inserting kprobe at __sys_trace+0
> trace_kprobe: Inserting kprobe at __sys_trace_return+0
> trace_kprobe: Inserting kprobe at __sys_trace_return_nosave+0
> trace_kprobe: Could not insert probe at __sys_trace_return_nosave+0: -22
> trace_kprobe: Inserting kprobe at __cr_alignment+0
> trace_kprobe: Could not insert probe at __cr_alignment+0: -22
> trace_kprobe: Inserting kprobe at sys_call_table+0
> trace_kprobe: Inserting kprobe at sys_syscall+0
> trace_kprobe: Inserting kprobe at sys_sigreturn_wrapper+0
> trace_kprobe: Inserting kprobe at sys_rt_sigreturn_wrapper+0
> trace_kprobe: Inserting kprobe at sys_statfs64_wrapper+0
> trace_kprobe: Inserting kprobe at sys_fstatfs64_wrapper+0
> 
> I wouldn't be surprised if some of those were the cause of it -
> for example, what guarantee do we have that a trace kprobe inserted
> at ret_fast_syscall which starts with this:
> 
> c0015a40:       e5ad0008        str     r0, [sp, #8]!
> 
> will be handled correctly?  I can't say, I've virtually no knowledge
> about kprobes, but I guess it isn't - especially as there's this
> comment in the ARM kprobes code:
> 
>          * Never instrument insn like 'str r0, [sp, +/-r1]'. Also, insn likes
>          * 'str r0, [sp, #-68]' should also be prohibited.
> 
> Clearly, that's not the case as the kprobes insert on
> ret_fast_syscall succeeded.
> 
> Adding Tixy, as he's more knowledgable in this area - I suggest
> someone knowledgable in this area runs
> 
> 	ftracetest test.d/kprobe/multiple_kprobes.tc
> 
> and fixes these bugs... also running the entire ftracetest suite
> would probably also be a very good idea.
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-24 15:56     ` Russell King - ARM Linux
@ 2017-11-26 14:58       ` Alex Shi
  2017-11-26 15:23         ` Alex Shi
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Shi @ 2017-11-26 14:58 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/24/2017 11:56 PM, Russell King - ARM Linux wrote:
> On Fri, Nov 24, 2017 at 11:09:30PM +0800, Alex Shi wrote:
>>
>>>> [   53.302718] softirqs last  enabled at (11474): [<c034c5d0>] __do_softirq+0x280/0x5ac
>>>> [   53.310494] softirqs last disabled at (11433): [<c034cc98>] irq_exit+0xf4/0x158
>>>> [   53.317837] CPU: 0 PID: 1691 Comm: ftracetest Not tainted 4.9.55-dirty #1
>>>> [   53.324652] Hardware name: Generic DRA74X (Flattened Device Tree)
>>>> [   53.330857] [<c03114d8>] (unwind_backtrace) from [<c030cb18>] (show_stack+0x10/0x14)
>>>> [   53.338644] [<c030cb18>] (show_stack) from [<c067e604>] (dump_stack+0xa4/0xd0)
>>>> [   53.345908] [<c067e604>] (dump_stack) from [<c0373808>] (___might_sleep+0x1ac/0x2a0)
>>>> [   53.353694] [<c0373808>] (___might_sleep) from [<c0d60ec8>] (do_page_fault+0x25c/0x428)
>>>> [   53.361739] [<c0d60ec8>] (do_page_fault) from [<c03013e8>] (do_PrefetchAbort+0x38/0x9c)
>>>> [   53.369780] [<c03013e8>] (do_PrefetchAbort) from [<c0d605a8>] (__pabt_svc+0x68/0xa0)
>>>> [   53.377557] Exception stack(0xec6fbfa8 to 0xec6fbff0)
>>>> [   53.382629] bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
>>>> [   53.390845] bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
>>>> [   53.399055] bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
>>>> [   53.404137] [<c0d605a8>] (__pabt_svc) from [<b6e6d49c>] (0xb6e6d49c)
>>>
>>> It also doesn't help that the backtrace stops at this point, and it looks
>>> very strange:
>>>
>>> 1. the value of PC looks like it's outside of the module space.
>>> 2. the CPSR indicates that the CPU was in SVC mode in the parent context
>>>    with IRQs disabled.
>>> 3. We're right at the top of the kernel stack, which suggests no further
>>>    stack frames above this.
>>>
>>> We should never be in SVC mode without further stack frames on the kernel
>>> stack.
>>>
>>> We don't seem to have overflowed the kernel stack, as the thread info
>>> seems correct - and it would also be unlikely that the saved SP value
>>> would end in ff8 in the exception stack frame.
>>
>> Hi Russell,
>>
>> Sorry for response late!
>> Is this SP was stained by sth? As my understand, SP should be times of
>> 32bits. But why stack print out correct with a incorrect SP?
> 
> There's nothing wrong with SP.

Got it. Thanks a lot!
> 
> Looking deeper at this, I think that the kernel stack somehow got
> corrupted earlier:
> 
> irq event stamp: 12924
> hardirqs last  enabled at (12923): [<c0307f10>] no_work_pending+0x4/0x30
> hardirqs last disabled at (12924): [<c0d605a0>] __pabt_svc+0x60/0xa0
> 
> The hard IRQ disable is as a result of taking a prefetch abort in
> SVC mode.  The saved context agrees with that:
> 
>                         R0       R1       R2       R3       R4       R5
> bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
>       R6       R7       R8       R9       R10      FP       IP       SP
> bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
>       LR       PC       PSR
> bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
> 
> The PSR lower 5 bits are 0x13, which is SVC mode.  Bit 7 set means IRQs
> disabled.  The PC address was 0xb6e6d49c.
> 

Thank you very much for detailed explanations! :)

> The last record we have of interrupts being enabled was in
> no_work_pending, which is the exit path to usermode - but if we were
> returning to usermode:
> 
> (a) how did we get into SVC mode instead

We return to user mode correctly in no_work_pending, the irq enabled
then we get a pabt_svc?

> (b) why are interrupts disabled

it was disabled in __pabt_svc, objdump show a extra irq disable with arm
defconfig.
...
 2a0:   ebfffffe        bl      0 <v7_pabort>
 2a4:   f10c0080        cpsid   i
...
> (c) why was mm->mmap_sem still held

the bug was caught here, just with mmap_sem, the pabt_svc happens.
> 
> Can you try the following patch to try and catch the problem earlier?
> I haven't tested it myself, and adding code may move things around in
> the kernel and make this bug disappear.

Do we still need to try this patch? I saw you tried it and do further more.
> 
> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> index d523cd8439a3..ff577177b286 100644
> --- a/arch/arm/kernel/entry-header.S
> +++ b/arch/arm/kernel/entry-header.S
> @@ -299,6 +299,8 @@
>  	@ ARM mode restore
>  	mov	r2, sp
>  	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
> +	tst	r1, #0xcf
> +	bne	oops
>  	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
>  	msr	spsr_cxsf, r1			@ save in spsr_svc
>  #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
> @@ -314,6 +316,15 @@
>  						@ after ldm {}^
>  	add	sp, sp, #\offset + PT_REGS_SIZE
>  	movs	pc, lr				@ return & move spsr_svc into cpsr
> +oops:	.word	0xe7f001f2

This oops cause allnoconfig with arm failed in build. but it's fine for
a multi_v7_defconfig
arch/arm/kernel/entry-common.S:106: Error: symbol `oops' is already defined

> +	.pushsection .rodata.str, "aMS", %progbits, 1
> +2:	.asciz	"Returning to usermode but unexpected PSR bits set?"
> +	.popsection
> +	.pushsection __bug_table, "aw"
> +	.align	2
> +	.word	oops, 2b
> +	.hword	\@
> +	.popsection
>  #elif defined(CONFIG_CPU_V7M)
>  	@ V7M restore.
>  	@ Note that we don't need to do clrex here as clearing the local
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-24 20:25       ` Russell King - ARM Linux
  2017-11-24 22:20         ` Russell King - ARM Linux
@ 2017-11-26 14:59         ` Alex Shi
  2017-11-27  1:40           ` Masami Hiramatsu
  1 sibling, 1 reply; 19+ messages in thread
From: Alex Shi @ 2017-11-26 14:59 UTC (permalink / raw)
  To: linux-arm-kernel

cc mhiramat at kernel.org

Thanks a lot for look into this! :)

Regards
Alex

On 11/25/2017 04:25 AM, Russell King - ARM Linux wrote:
> On Fri, Nov 24, 2017 at 07:27:00PM +0000, Russell King - ARM Linux wrote:
>> Adding Tixy, as he's more knowledgable in this area - I suggest
>> someone knowledgable in this area runs
>>
>> 	ftracetest test.d/kprobe/multiple_kprobes.tc
>>
>> and fixes these bugs... also running the entire ftracetest suite
>> would probably also be a very good idea.
> 
> There's some other stupidities as well:
> 
> trace_kprobe: Inserting kprobe at __error_lpae+0
> trace_kprobe: Inserting kprobe at str_lpae+0
> trace_kprobe: Inserting kprobe at __error_p+0
> trace_kprobe: Inserting kprobe at str_p1+0
> trace_kprobe: Inserting kprobe at str_p2+0
> trace_kprobe: Could not insert probe at str_p2+0: -22
> 
> str_lpae: .asciz "\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
> str_p1: .asciz  "\nError: unrecognized/unsupported processor variant (0x"
> str_p2: .asciz  ").\n"
> 
> So we successfully placed a kprobe on some ASCII strings, which are
> used prior to the kernel being properly up and running, which means
> we have to use relative references to these strings, and relative
> references to strings in other sections is not simple.
> 
> More worrying:
> 
> trace_kprobe: Inserting kprobe at __turn_mmu_on+0
> trace_kprobe: Inserting kprobe at __idmap_text_start+0
> trace_kprobe: Inserting kprobe at __turn_mmu_on_end+0
> ...
> trace_kprobe: Inserting kprobe at __idmap_text_end+0
> ...
> trace_kprobe: Inserting kprobe at secondary_startup+0
> trace_kprobe: Inserting kprobe at secondary_startup_arm+0
> trace_kprobe: Inserting kprobe at __secondary_switched+0
> trace_kprobe: Inserting kprobe at __secondary_data+0
> trace_kprobe: Inserting kprobe at __enable_mmu+0
> trace_kprobe: Inserting kprobe at __do_fixup_smp_on_up+0
> trace_kprobe: Inserting kprobe at __fixup_a_pv_table+0
> trace_kprobe: Inserting kprobe at fixup_pv_table+0
> trace_kprobe: Inserting kprobe at __lookup_processor_type+0
> trace_kprobe: Inserting kprobe at __lookup_processor_type_data+0
> 
> These are definitely a no-no for tracing, because they're part of the
> early startup code for the kernel when no stacks are available.
> 
> Some of these can't live in the special "do not use kprobes here"
> section as they need to be in other sections (like .idmap) because
> they need to be part of the identity-mapped code.
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-24 22:20         ` Russell King - ARM Linux
@ 2017-11-26 15:02           ` Alex Shi
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Shi @ 2017-11-26 15:02 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/25/2017 06:20 AM, Russell King - ARM Linux wrote:
> Okay, this is what I came up with, and seems to solve the problem
> here.  It's quite a large patch though, as it shuffles around how
> we deal with exception entry, and knowing whether we should dump
> the stacked registers.  This particular patch also contains a few
> debugging bits too.

We will try your newer 2 patches: Fix ftractest issues
So guess we could skip this patch. :)

Thanks
Alex

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-26 14:58       ` Alex Shi
@ 2017-11-26 15:23         ` Alex Shi
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Shi @ 2017-11-26 15:23 UTC (permalink / raw)
  To: linux-arm-kernel

Sorry, this patch should be passed off, since we got your new fixing
patches. I shouldn't mentioned here.

Regards
Alex

On 11/26/2017 10:58 PM, Alex Shi wrote:
>> Can you try the following patch to try and catch the problem earlier?
>> I haven't tested it myself, and adding code may move things around in
>> the kernel and make this bug disappear.
> Do we still need to try this patch? I saw you tried it and do further more.
>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>> index d523cd8439a3..ff577177b286 100644
>> --- a/arch/arm/kernel/entry-header.S
>> +++ b/arch/arm/kernel/entry-header.S
>> @@ -299,6 +299,8 @@
>>  	@ ARM mode restore
>>  	mov	r2, sp
>>  	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
>> +	tst	r1, #0xcf
>> +	bne	oops
>>  	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
>>  	msr	spsr_cxsf, r1			@ save in spsr_svc
>>  #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
>> @@ -314,6 +316,15 @@
>>  						@ after ldm {}^
>>  	add	sp, sp, #\offset + PT_REGS_SIZE
>>  	movs	pc, lr				@ return & move spsr_svc into cpsr
>> +oops:	.word	0xe7f001f2
> This oops cause allnoconfig with arm failed in build. but it's fine for
> a multi_v7_defconfig
> arch/arm/kernel/entry-common.S:106: Error: symbol `oops' is already defined
> 
>> +	.pushsection .rodata.str, "aMS", %progbits, 1
>> +2:	.asciz	"Returning to usermode but unexpected PSR bits set?"
>> +	.popsection
>> +	.pushsection __bug_table, "aw"
>> +	.align	2
>> +	.word	oops, 2b
>> +	.hword	\@
>> +	.popsection
>>  #elif defined(CONFIG_CPU_V7M)
>>  	@ V7M restore.
>>  	@ Note that we don't need to do clrex here as clearing the local

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-26 12:07       ` Alex Shi
@ 2017-11-27  1:34         ` Masami Hiramatsu
  0 siblings, 0 replies; 19+ messages in thread
From: Masami Hiramatsu @ 2017-11-27  1:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 26 Nov 2017 20:07:43 +0800
Alex Shi <alex.shi@linaro.org> wrote:

> CC Masami Hiramatsu for ftracetest part.
> 
> Hi Russell King,
> 
> Thanks a lot for quick response!
> 
> Regards
> Alex
> 
> On 11/25/2017 03:27 AM, Russell King - ARM Linux wrote:
> > On Fri, Nov 24, 2017 at 11:09:30PM +0800, Alex Shi wrote:
> >> Full agree with your analysis. Is it possible to stain PC value with
> >> heavy stress on thermal or sth else? the ARM64 board run well with
> >> ftracetest of LTP.
> > 
> > In your first email, you said "x15 platform, which is a armv7 board."
> > Here you say "ARM64 board" which isn't armv7.  There's x15 DTS under
> > arch/arm/boot/dts, so I guess you mean 32-bit ARM, but who knows...
> > 
> > Anyway, I've tried running ftracetest on an OMAP4430 SDP board, and
> > after a while with the patch I sent you, I get:
> > 
> > Internal error: Oops - BUG: 0 [#1] SMP ARM
> > Modules linked in:
> > CPU: 1 PID: 2948 Comm: ftracetest Not tainted 4.14.0+ #557
> > Hardware name: Generic OMAP4 (Flattened Device Tree)
> > task: ce41c100 task.stack: cc7b8000
> > PC is at oops+0x0/0x4
> > LR is at trace_hardirqs_on_caller+0x154/0x1e0
> > pc : [<c0015adc>]    lr : [<c0086840>]    psr: 20000193
> > sp : cc7b9fb0  ip : cc7b9f80  fp : 00000000
> > r10: 00000000  r9 : cc7b8000  r8 : c0015c28
> > r7 : 00000006  r6 : 00000004  r5 : 0009fed4  r4 : 00000001
> > r3 : 00000000  r2 : cc7b9fb0  r1 : 60000193  r0 : 00000001
> > Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
> > Control: 10c5387d  Table: 8e7b804a  DAC: 00000055
> > Process ftracetest (pid: 2948, stack limit = 0xcc7b8210)
> > Stack: (0xcc7b9fb0 to 0xcc7ba000)
> > 9fa0:                                     00000000 00000000 0009b008 0000000d
> > 9fc0: 00000001 0009fed4 00000004 00000006 0009b3e0 000a17d4 00000000 beaf6e3c
> > 9fe0: 0009fed0 beaf6e20 000319e0 b6e7199c 60000193 00000001 6b6b6b6b a56b6b6b
> > Backtrace: no frame pointer
> > Code: e9527fff e1a00000 e28dd048 e1b0f00e (e7f001f2)
> > ---[ end trace 390efe5843605357 ]---
> > 
> > The other CPU also oopses:
> > 
> > Internal error: Oops - BUG: 0 [#3] SMP ARM
> > Modules linked in:
> > CPU: 1 PID: 1 Comm: init Tainted: G      D         4.14.0+ #557
> > Hardware name: Generic OMAP4 (Flattened Device Tree)
> > task: ced04c00 task.stack: ced06000
> > PC is at oops+0x0/0x4
> > LR is at trace_hardirqs_on+0x14/0x18
> > pc : [<c0015adc>]    lr : [<c00868e0>]    psr: 20000193
> > sp : ced07fb0  ip : ced07fa0  fp : 00000000
> > r10: 00000000  r9 : ced06000  r8 : c0015c28
> > r7 : 0000004e  r6 : bec0acd4  r5 : 000176b4  r4 : bec0ac3c
> > r3 : 00000000  r2 : ced07fb0  r1 : 60000193  r0 : c0015aa8
> > Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
> > Control: 10c5387d  Table: 8e2d804a  DAC: 00000055
> > Process init (pid: 1, stack limit = 0xced06210)
> > Stack: (0xced07fb0 to 0xced08000)
> > 7fa0:                                     00000000 00000000 00000000 00000000
> > 7fc0: bec0ac3c 000176b4 bec0acd4 0000004e 10000000 00000000 0000a1d0 bec0ac44
> > 7fe0: bec0acd8 bec0ac28 b6e54544 b6e5456c 60000193 bec0ac28 005d5555 00020201
> > Backtrace: no frame pointer
> > Code: e9527fff e1a00000 e28dd048 e1b0f00e (e7f001f2)
> > ---[ end trace 390efe5843605358 ]---
> > 
> > which is exactly your bug, but caught a bit earlier.
> > 
> > This happens while executing this ftrace test:
> > 
> > [28] Register/unregister many kprobe events
> > 
> > and needs a kernel with ftrace and kprobes enabled.
> > 
> > Unfortunately, the debug is immediately after a call to
> > trace_hardirqs_on() in no_work_pending, so the LR value is
> > meaningless.
> > 
> > So, now that we know it's tracing kprobes triggering it - it's
> > trying to set tracepoints on the first 256 symbols in the kernel's
> > kallsyms, which includes all sorts of things.
> > 
> > With some extra debug, this doesn't look clever:
> > 
> > trace_kprobe: Inserting kprobe at ret_fast_syscall+0
> > trace_kprobe: Inserting kprobe at slow_work_pending+0
> > trace_kprobe: Inserting kprobe at ret_slow_syscall+0
> > trace_kprobe: Could not insert probe at ret_slow_syscall+0: -22
> > trace_kprobe: Inserting kprobe at ret_to_user+0
> > trace_kprobe: Could not insert probe at ret_to_user+0: -22
> > trace_kprobe: Inserting kprobe at ret_to_user_from_irq+0
> > trace_kprobe: Inserting kprobe at no_work_pending+0
> > trace_kprobe: Inserting kprobe at oops+0
> > trace_kprobe: Could not insert probe at oops+0: -22
> > trace_kprobe: Inserting kprobe at ret_from_fork+0
> > trace_kprobe: Inserting kprobe at vector_swi+0
> > trace_kprobe: Inserting kprobe at local_restart+0
> > trace_kprobe: Inserting kprobe at __sys_trace+0
> > trace_kprobe: Inserting kprobe at __sys_trace_return+0
> > trace_kprobe: Inserting kprobe at __sys_trace_return_nosave+0
> > trace_kprobe: Could not insert probe at __sys_trace_return_nosave+0: -22
> > trace_kprobe: Inserting kprobe at __cr_alignment+0
> > trace_kprobe: Could not insert probe at __cr_alignment+0: -22
> > trace_kprobe: Inserting kprobe at sys_call_table+0
> > trace_kprobe: Inserting kprobe at sys_syscall+0
> > trace_kprobe: Inserting kprobe at sys_sigreturn_wrapper+0
> > trace_kprobe: Inserting kprobe at sys_rt_sigreturn_wrapper+0
> > trace_kprobe: Inserting kprobe at sys_statfs64_wrapper+0
> > trace_kprobe: Inserting kprobe at sys_fstatfs64_wrapper+0
> > 
> > I wouldn't be surprised if some of those were the cause of it -
> > for example, what guarantee do we have that a trace kprobe inserted
> > at ret_fast_syscall which starts with this:
> > 
> > c0015a40:       e5ad0008        str     r0, [sp, #8]!
> > 
> > will be handled correctly?  I can't say, I've virtually no knowledge
> > about kprobes, but I guess it isn't - especially as there's this
> > comment in the ARM kprobes code:
> > 
> >          * Never instrument insn like 'str r0, [sp, +/-r1]'. Also, insn likes
> >          * 'str r0, [sp, #-68]' should also be prohibited.
> > 
> > Clearly, that's not the case as the kprobes insert on
> > ret_fast_syscall succeeded.
> > 
> > Adding Tixy, as he's more knowledgable in this area - I suggest
> > someone knowledgable in this area runs
> > 
> > 	ftracetest test.d/kprobe/multiple_kprobes.tc
> > 
> > and fixes these bugs... also running the entire ftracetest suite
> > would probably also be a very good idea.

This test case is to test whether we can safely put multiple kprobes
on different functions or not. If it causes kernel crashes, it means
kprobes has a bug on that function (the last "Inserting kprobe at ...").

Since the ftrace kprobe event interface is opened to userspace (but
also requires root priv), it must be fixed.

BTW, the test is not care about failure or blacklisted functions.
I'll update it to check blacklist functions if available.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-26 14:59         ` Alex Shi
@ 2017-11-27  1:40           ` Masami Hiramatsu
  2017-11-27 13:36             ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: Masami Hiramatsu @ 2017-11-27  1:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 26 Nov 2017 22:59:58 +0800
Alex Shi <alex.shi@linaro.org> wrote:

> cc mhiramat at kernel.org
> 
> Thanks a lot for look into this! :)
> 
> Regards
> Alex
> 
> On 11/25/2017 04:25 AM, Russell King - ARM Linux wrote:
> > On Fri, Nov 24, 2017 at 07:27:00PM +0000, Russell King - ARM Linux wrote:
> >> Adding Tixy, as he's more knowledgable in this area - I suggest
> >> someone knowledgable in this area runs
> >>
> >> 	ftracetest test.d/kprobe/multiple_kprobes.tc
> >>
> >> and fixes these bugs... also running the entire ftracetest suite
> >> would probably also be a very good idea.
> > 
> > There's some other stupidities as well:
> > 
> > trace_kprobe: Inserting kprobe at __error_lpae+0
> > trace_kprobe: Inserting kprobe at str_lpae+0
> > trace_kprobe: Inserting kprobe at __error_p+0
> > trace_kprobe: Inserting kprobe at str_p1+0
> > trace_kprobe: Inserting kprobe at str_p2+0
> > trace_kprobe: Could not insert probe at str_p2+0: -22
> > 
> > str_lpae: .asciz "\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
> > str_p1: .asciz  "\nError: unrecognized/unsupported processor variant (0x"
> > str_p2: .asciz  ").\n"
> > 
> > So we successfully placed a kprobe on some ASCII strings, which are
> > used prior to the kernel being properly up and running, which means
> > we have to use relative references to these strings, and relative
> > references to strings in other sections is not simple.

Oops, that's my mistake. It should pick only text symbols.

> > 
> > More worrying:
> > 
> > trace_kprobe: Inserting kprobe at __turn_mmu_on+0
> > trace_kprobe: Inserting kprobe at __idmap_text_start+0
> > trace_kprobe: Inserting kprobe at __turn_mmu_on_end+0
> > ...
> > trace_kprobe: Inserting kprobe at __idmap_text_end+0
> > ...
> > trace_kprobe: Inserting kprobe at secondary_startup+0
> > trace_kprobe: Inserting kprobe at secondary_startup_arm+0
> > trace_kprobe: Inserting kprobe at __secondary_switched+0
> > trace_kprobe: Inserting kprobe at __secondary_data+0
> > trace_kprobe: Inserting kprobe at __enable_mmu+0
> > trace_kprobe: Inserting kprobe at __do_fixup_smp_on_up+0
> > trace_kprobe: Inserting kprobe at __fixup_a_pv_table+0
> > trace_kprobe: Inserting kprobe at fixup_pv_table+0
> > trace_kprobe: Inserting kprobe at __lookup_processor_type+0
> > trace_kprobe: Inserting kprobe at __lookup_processor_type_data+0
> > 
> > These are definitely a no-no for tracing, because they're part of the
> > early startup code for the kernel when no stacks are available.

It should be rejected by kprobes arch dependent code.

> > Some of these can't live in the special "do not use kprobes here"
> > section as they need to be in other sections (like .idmap) because
> > they need to be part of the identity-mapped code.

kprobes already has a blacklist on x86, so it is a good time to start
making it on arm/arm64 too.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-27  1:40           ` Masami Hiramatsu
@ 2017-11-27 13:36             ` Andrew Lunn
  2017-11-27 13:55               ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2017-11-27 13:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 27, 2017 at 10:40:34AM +0900, Masami Hiramatsu wrote:
> On Sun, 26 Nov 2017 22:59:58 +0800
> Alex Shi <alex.shi@linaro.org> wrote:
> 
> > cc mhiramat at kernel.org
> > 
> > Thanks a lot for look into this! :)
> > 
> > Regards
> > Alex
> > 
> > On 11/25/2017 04:25 AM, Russell King - ARM Linux wrote:
> > > On Fri, Nov 24, 2017 at 07:27:00PM +0000, Russell King - ARM Linux wrote:
> > >> Adding Tixy, as he's more knowledgable in this area - I suggest
> > >> someone knowledgable in this area runs
> > >>
> > >> 	ftracetest test.d/kprobe/multiple_kprobes.tc
> > >>
> > >> and fixes these bugs... also running the entire ftracetest suite
> > >> would probably also be a very good idea.
> > > 
> > > There's some other stupidities as well:
> > > 
> > > trace_kprobe: Inserting kprobe at __error_lpae+0
> > > trace_kprobe: Inserting kprobe at str_lpae+0
> > > trace_kprobe: Inserting kprobe at __error_p+0
> > > trace_kprobe: Inserting kprobe at str_p1+0
> > > trace_kprobe: Inserting kprobe at str_p2+0
> > > trace_kprobe: Could not insert probe at str_p2+0: -22
> > > 
> > > str_lpae: .asciz "\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
> > > str_p1: .asciz  "\nError: unrecognized/unsupported processor variant (0x"
> > > str_p2: .asciz  ").\n"
> > > 
> > > So we successfully placed a kprobe on some ASCII strings, which are
> > > used prior to the kernel being properly up and running, which means
> > > we have to use relative references to these strings, and relative
> > > references to strings in other sections is not simple.
> 
> Oops, that's my mistake. It should pick only text symbols.

Hi Masami, Russell

Does the fact you are allowed to put a kprobe on an ASCII string from
userspace indicate a real problem? I would of thought the kernel core
kprobe could would be looking at the type of a symbol and rejecting
such requests. So fix the core, keep this test and make sure you get
an EINVAL back?

   Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-27 13:36             ` Andrew Lunn
@ 2017-11-27 13:55               ` Russell King - ARM Linux
  2017-11-28  5:52                 ` Masami Hiramatsu
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2017-11-27 13:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 27, 2017 at 02:36:31PM +0100, Andrew Lunn wrote:
> On Mon, Nov 27, 2017 at 10:40:34AM +0900, Masami Hiramatsu wrote:
> > On Sun, 26 Nov 2017 22:59:58 +0800
> > Alex Shi <alex.shi@linaro.org> wrote:
> > 
> > > cc mhiramat at kernel.org
> > > 
> > > Thanks a lot for look into this! :)
> > > 
> > > Regards
> > > Alex
> > > 
> > > On 11/25/2017 04:25 AM, Russell King - ARM Linux wrote:
> > > > On Fri, Nov 24, 2017 at 07:27:00PM +0000, Russell King - ARM Linux wrote:
> > > >> Adding Tixy, as he's more knowledgable in this area - I suggest
> > > >> someone knowledgable in this area runs
> > > >>
> > > >> 	ftracetest test.d/kprobe/multiple_kprobes.tc
> > > >>
> > > >> and fixes these bugs... also running the entire ftracetest suite
> > > >> would probably also be a very good idea.
> > > > 
> > > > There's some other stupidities as well:
> > > > 
> > > > trace_kprobe: Inserting kprobe at __error_lpae+0
> > > > trace_kprobe: Inserting kprobe at str_lpae+0
> > > > trace_kprobe: Inserting kprobe at __error_p+0
> > > > trace_kprobe: Inserting kprobe at str_p1+0
> > > > trace_kprobe: Inserting kprobe at str_p2+0
> > > > trace_kprobe: Could not insert probe at str_p2+0: -22
> > > > 
> > > > str_lpae: .asciz "\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
> > > > str_p1: .asciz  "\nError: unrecognized/unsupported processor variant (0x"
> > > > str_p2: .asciz  ").\n"
> > > > 
> > > > So we successfully placed a kprobe on some ASCII strings, which are
> > > > used prior to the kernel being properly up and running, which means
> > > > we have to use relative references to these strings, and relative
> > > > references to strings in other sections is not simple.
> > 
> > Oops, that's my mistake. It should pick only text symbols.
> 
> Hi Masami, Russell
> 
> Does the fact you are allowed to put a kprobe on an ASCII string from
> userspace indicate a real problem? I would of thought the kernel core
> kprobe could would be looking at the type of a symbol and rejecting
> such requests. So fix the core, keep this test and make sure you get
> an EINVAL back?

As far as I'm aware, the kernel doesn't know whether the address that
userspace wants to set a kprobe is code or any kind of data.  All that
it can do is:

1. Translate the address to a ksym and offset, looking it up in
   blacklists/blacklisted sections.
2. Look at the "instruction" and reject if it thinks the instruction
   is unsuitable.

Making the kernel reject placing a kprobe on a local function ("t" in
nm's or /proc/kallsyms output) would severely restrict the usefulness
of kprobes - that would mean you couldn't ever set a kprobe on a static
function.

So no, I don't agree with you.

Normally, there won't be strings in the .text section, but we have some
exceptions in ARM code where we have to have them to make early kernel
entry code sane without having to jump through excessive hoops.

As I've already said, we should not be placing kprobes even on this
code.  Think about a kprobe placed on the secondary CPU entry point,
where the CPU is running with the MMU off and there's no exception
table in place.  The kprobe instruction is a CPU undefined instruction,
so the CPU will try and take the undefined instruction vector, which
won't be present - the result will be an instant crash.

The same is true of the identity-mapped code - which again can run
with the MMU off, and suffer from exactly the same issue.

Then we have the exception code itself.  Consider the implications of
placing the kprobe undefined instruction somewhere in the undefined
instruction exception handling code - that would result in recursive
faulting if placed on the SVC paths.

So no, this problem has nothing to do with symbol types - it's about
what code is safe for kprobes to be placed.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-27 13:55               ` Russell King - ARM Linux
@ 2017-11-28  5:52                 ` Masami Hiramatsu
  2017-11-28  9:52                   ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Masami Hiramatsu @ 2017-11-28  5:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 27 Nov 2017 13:55:23 +0000
Russell King - ARM Linux <linux@armlinux.org.uk> wrote:

> On Mon, Nov 27, 2017 at 02:36:31PM +0100, Andrew Lunn wrote:
> > On Mon, Nov 27, 2017 at 10:40:34AM +0900, Masami Hiramatsu wrote:
> > > On Sun, 26 Nov 2017 22:59:58 +0800
> > > Alex Shi <alex.shi@linaro.org> wrote:
> > > 
> > > > cc mhiramat at kernel.org
> > > > 
> > > > Thanks a lot for look into this! :)
> > > > 
> > > > Regards
> > > > Alex
> > > > 
> > > > On 11/25/2017 04:25 AM, Russell King - ARM Linux wrote:
> > > > > On Fri, Nov 24, 2017 at 07:27:00PM +0000, Russell King - ARM Linux wrote:
> > > > >> Adding Tixy, as he's more knowledgable in this area - I suggest
> > > > >> someone knowledgable in this area runs
> > > > >>
> > > > >> 	ftracetest test.d/kprobe/multiple_kprobes.tc
> > > > >>
> > > > >> and fixes these bugs... also running the entire ftracetest suite
> > > > >> would probably also be a very good idea.
> > > > > 
> > > > > There's some other stupidities as well:
> > > > > 
> > > > > trace_kprobe: Inserting kprobe at __error_lpae+0
> > > > > trace_kprobe: Inserting kprobe at str_lpae+0
> > > > > trace_kprobe: Inserting kprobe at __error_p+0
> > > > > trace_kprobe: Inserting kprobe at str_p1+0
> > > > > trace_kprobe: Inserting kprobe at str_p2+0
> > > > > trace_kprobe: Could not insert probe at str_p2+0: -22
> > > > > 
> > > > > str_lpae: .asciz "\nError: Kernel with LPAE support, but CPU does not support LPAE.\n"
> > > > > str_p1: .asciz  "\nError: unrecognized/unsupported processor variant (0x"
> > > > > str_p2: .asciz  ").\n"
> > > > > 
> > > > > So we successfully placed a kprobe on some ASCII strings, which are
> > > > > used prior to the kernel being properly up and running, which means
> > > > > we have to use relative references to these strings, and relative
> > > > > references to strings in other sections is not simple.
> > > 
> > > Oops, that's my mistake. It should pick only text symbols.
> > 
> > Hi Masami, Russell
> > 
> > Does the fact you are allowed to put a kprobe on an ASCII string from
> > userspace indicate a real problem? I would of thought the kernel core
> > kprobe could would be looking at the type of a symbol and rejecting
> > such requests. So fix the core, keep this test and make sure you get
> > an EINVAL back?
> 
> As far as I'm aware, the kernel doesn't know whether the address that
> userspace wants to set a kprobe is code or any kind of data.  All that
> it can do is:
> 
> 1. Translate the address to a ksym and offset, looking it up in
>    blacklists/blacklisted sections.
> 2. Look at the "instruction" and reject if it thinks the instruction
>    is unsuitable.
> 
> Making the kernel reject placing a kprobe on a local function ("t" in
> nm's or /proc/kallsyms output) would severely restrict the usefulness
> of kprobes - that would mean you couldn't ever set a kprobe on a static
> function.

Right, and anyway kprobes can put on a specified address via userspace.

In general, kprobe itself checks probe point is in the .text section,
so most of cases are safe as Russell explained.

> 
> So no, I don't agree with you.
> 
> Normally, there won't be strings in the .text section, but we have some
> exceptions in ARM code where we have to have them to make early kernel
> entry code sane without having to jump through excessive hoops.
> 

It's arch-specific reason, so it should be filtered out in arch
specific kprobe code.

> As I've already said, we should not be placing kprobes even on this
> code.  Think about a kprobe placed on the secondary CPU entry point,
> where the CPU is running with the MMU off and there's no exception
> tuable in place.  The kprobe instruction is a CPU undefined instruction,
> so the CPU will try and take the undefined instruction vector, which
> won't be present - the result will be an instant crash.

OK.

> 
> The same is true of the identity-mapped code - which again can run
> with the MMU off, and suffer from exactly the same issue.
> 
> Then we have the exception code itself.  Consider the implications of
> placing the kprobe undefined instruction somewhere in the undefined
> instruction exception handling code - that would result in recursive
> faulting if placed on the SVC paths.
> 
> So no, this problem has nothing to do with symbol types - it's about
> what code is safe for kprobes to be placed.

So there is already NOKPROBE_SYMBOL() macro to make a blacklist so that
we can avoid such recursive faults. We need to identify such place and
put the symbols in it as I did on x86.

Thank you,



-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-28  5:52                 ` Masami Hiramatsu
@ 2017-11-28  9:52                   ` Russell King - ARM Linux
  2017-11-30  2:41                     ` Masami Hiramatsu
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2017-11-28  9:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 28, 2017 at 02:52:21PM +0900, Masami Hiramatsu wrote:
> So there is already NOKPROBE_SYMBOL() macro to make a blacklist so that
> we can avoid such recursive faults. We need to identify such place and
> put the symbols in it as I did on x86.

Only if they are C functions.  If they're assembly, NOKPROBE_SYMBOL()
doesn't work.  The answer there is to move them into a section that
is excluded from kprobing.  That's what the second patch I've already
sent does.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply	[flat|nested] 19+ messages in thread

* do page fault in atomic bug on arm
  2017-11-28  9:52                   ` Russell King - ARM Linux
@ 2017-11-30  2:41                     ` Masami Hiramatsu
  0 siblings, 0 replies; 19+ messages in thread
From: Masami Hiramatsu @ 2017-11-30  2:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 28 Nov 2017 09:52:05 +0000
Russell King - ARM Linux <linux@armlinux.org.uk> wrote:

> On Tue, Nov 28, 2017 at 02:52:21PM +0900, Masami Hiramatsu wrote:
> > So there is already NOKPROBE_SYMBOL() macro to make a blacklist so that
> > we can avoid such recursive faults. We need to identify such place and
> > put the symbols in it as I did on x86.
> 
> Only if they are C functions.  If they're assembly, NOKPROBE_SYMBOL()
> doesn't work.  The answer there is to move them into a section that
> is excluded from kprobing.  That's what the second patch I've already
> sent does.

Thanks!

And if you need to put some symbols in assembly which already in 
another section, you can use _ASM_NOKPROBE(symbol) macro :)


> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
> According to speedtest.net: 8.21Mbps down 510kbps up


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-11-30  2:41 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-21 13:06 do page fault in atomic bug on arm Alex Shi
2017-11-21 13:20 ` Russell King - ARM Linux
2017-11-24 15:09   ` Alex Shi
2017-11-24 15:56     ` Russell King - ARM Linux
2017-11-26 14:58       ` Alex Shi
2017-11-26 15:23         ` Alex Shi
2017-11-24 19:27     ` Russell King - ARM Linux
2017-11-24 20:25       ` Russell King - ARM Linux
2017-11-24 22:20         ` Russell King - ARM Linux
2017-11-26 15:02           ` Alex Shi
2017-11-26 14:59         ` Alex Shi
2017-11-27  1:40           ` Masami Hiramatsu
2017-11-27 13:36             ` Andrew Lunn
2017-11-27 13:55               ` Russell King - ARM Linux
2017-11-28  5:52                 ` Masami Hiramatsu
2017-11-28  9:52                   ` Russell King - ARM Linux
2017-11-30  2:41                     ` Masami Hiramatsu
2017-11-26 12:07       ` Alex Shi
2017-11-27  1:34         ` Masami Hiramatsu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.