* [PATCH v2 0/3] powerpc: build out kprobes blacklist
@ 2017-04-27 8:36 Naveen N. Rao
2017-04-27 8:36 ` [PATCH v2 1/3] powerpc/kprobes: cleanup system_call_common and blacklist it from kprobes Naveen N. Rao
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Naveen N. Rao @ 2017-04-27 8:36 UTC (permalink / raw)
To: Michael Ellerman
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
v2 changes:
- Patches 3 and 4 from the previous series have been merged.
- Updated to no longer blacklist functions involved with stolen time
accounting.
v1:
https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg117514.html
--
This is the second in the series of patches to build out an appropriate
kprobes blacklist. This series blacklists system_call() and functions
involved when handling the trap itself. Not everything is covered, but
this is the first set of functions that I have tested with. More
patches to follow once I expand my tests.
I have converted many labels into private -- these are labels that I
felt are not necessary to read stack traces. If any of those are
important to have, please let me know.
- Naveen
Naveen N. Rao (3):
powerpc/kprobes: cleanup system_call_common and blacklist it from
kprobes
powerpc/kprobes: un-blacklist system_call() from kprobes
powerpc/kprobes: blacklist functions invoked on a trap
arch/powerpc/kernel/entry_64.S | 94 +++++++++++++++++++-----------------
arch/powerpc/kernel/exceptions-64s.S | 1 +
arch/powerpc/kernel/traps.c | 3 ++
3 files changed, 55 insertions(+), 43 deletions(-)
--
2.12.2
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 1/3] powerpc/kprobes: cleanup system_call_common and blacklist it from kprobes
2017-04-27 8:36 [PATCH v2 0/3] powerpc: build out kprobes blacklist Naveen N. Rao
@ 2017-04-27 8:36 ` Naveen N. Rao
2017-04-27 8:36 ` [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() " Naveen N. Rao
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Naveen N. Rao @ 2017-04-27 8:36 UTC (permalink / raw)
To: Michael Ellerman
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
Convert some of the labels into private labels and blacklist
system_call_common() and system_call() from kprobes. We can't take a
trap at parts of these functions as either MSR_RI is unset or the
kernel stack pointer is not yet setup.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/kernel/entry_64.S | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 9b541d22595a..380361c0bb6a 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -52,12 +52,11 @@ exception_marker:
.section ".text"
.align 7
- .globl system_call_common
-system_call_common:
+_GLOBAL(system_call_common)
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
BEGIN_FTR_SECTION
extrdi. r10, r12, 1, (63-MSR_TS_T_LG) /* transaction active? */
- bne tabort_syscall
+ bne .Ltabort_syscall
END_FTR_SECTION_IFSET(CPU_FTR_TM)
#endif
andi. r10,r12,MSR_PR
@@ -152,9 +151,9 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
CURRENT_THREAD_INFO(r11, r1)
ld r10,TI_FLAGS(r11)
andi. r11,r10,_TIF_SYSCALL_DOTRACE
- bne syscall_dotrace /* does not return */
+ bne .Lsyscall_dotrace /* does not return */
cmpldi 0,r0,NR_syscalls
- bge- syscall_enosys
+ bge- .Lsyscall_enosys
system_call: /* label this so stack traces look sane */
/*
@@ -208,7 +207,7 @@ system_call: /* label this so stack traces look sane */
ld r9,TI_FLAGS(r12)
li r11,-MAX_ERRNO
andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
- bne- syscall_exit_work
+ bne- .Lsyscall_exit_work
andi. r0,r8,MSR_FP
beq 2f
@@ -232,7 +231,7 @@ system_call: /* label this so stack traces look sane */
3: cmpld r3,r11
ld r5,_CCR(r1)
- bge- syscall_error
+ bge- .Lsyscall_error
.Lsyscall_error_cont:
ld r7,_NIP(r1)
BEGIN_FTR_SECTION
@@ -258,14 +257,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
RFI
b . /* prevent speculative execution */
-syscall_error:
+.Lsyscall_error:
oris r5,r5,0x1000 /* Set SO bit in CR */
neg r3,r3
std r5,_CCR(r1)
b .Lsyscall_error_cont
/* Traced system call support */
-syscall_dotrace:
+.Lsyscall_dotrace:
bl save_nvgprs
addi r3,r1,STACK_FRAME_OVERHEAD
bl do_syscall_trace_enter
@@ -298,11 +297,11 @@ syscall_dotrace:
b .Lsyscall_exit
-syscall_enosys:
+.Lsyscall_enosys:
li r3,-ENOSYS
b .Lsyscall_exit
-syscall_exit_work:
+.Lsyscall_exit_work:
#ifdef CONFIG_PPC_BOOK3S
li r10,MSR_RI
mtmsrd r10,1 /* Restore RI */
@@ -362,7 +361,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
b ret_from_except
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-tabort_syscall:
+.Ltabort_syscall:
/* Firstly we need to enable TM in the kernel */
mfmsr r10
li r9, 1
@@ -388,6 +387,8 @@ tabort_syscall:
rfid
b . /* prevent speculative execution */
#endif
+_ASM_NOKPROBE_SYMBOL(system_call_common);
+_ASM_NOKPROBE_SYMBOL(system_call);
/* Save non-volatile GPRs, if not already saved. */
_GLOBAL(save_nvgprs)
--
2.12.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() from kprobes
2017-04-27 8:36 [PATCH v2 0/3] powerpc: build out kprobes blacklist Naveen N. Rao
2017-04-27 8:36 ` [PATCH v2 1/3] powerpc/kprobes: cleanup system_call_common and blacklist it from kprobes Naveen N. Rao
@ 2017-04-27 8:36 ` Naveen N. Rao
2017-04-27 10:19 ` Michael Ellerman
2017-04-27 8:36 ` [PATCH v2 3/3] powerpc/kprobes: blacklist functions invoked on a trap Naveen N. Rao
2017-05-04 4:59 ` [PATCH v2 0/3] powerpc: build out kprobes blacklist Naveen N. Rao
3 siblings, 1 reply; 11+ messages in thread
From: Naveen N. Rao @ 2017-04-27 8:36 UTC (permalink / raw)
To: Michael Ellerman
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
It is actually safe to probe system_call() in entry_64.S, but only till
.Lsyscall_exit. To allow this, convert .Lsyscall_exit to a non-local
symbol __system_call() and blacklist that symbol, rather than
system_call().
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/kernel/entry_64.S | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 380361c0bb6a..e030ce34dd66 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -176,7 +176,7 @@ system_call: /* label this so stack traces look sane */
mtctr r12
bctrl /* Call handler */
-.Lsyscall_exit:
+__system_call:
std r3,RESULT(r1)
CURRENT_THREAD_INFO(r12, r1)
@@ -294,12 +294,12 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
blt+ system_call
/* Return code is already in r3 thanks to do_syscall_trace_enter() */
- b .Lsyscall_exit
+ b __system_call
.Lsyscall_enosys:
li r3,-ENOSYS
- b .Lsyscall_exit
+ b __system_call
.Lsyscall_exit_work:
#ifdef CONFIG_PPC_BOOK3S
@@ -388,7 +388,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
b . /* prevent speculative execution */
#endif
_ASM_NOKPROBE_SYMBOL(system_call_common);
-_ASM_NOKPROBE_SYMBOL(system_call);
+_ASM_NOKPROBE_SYMBOL(__system_call);
/* Save non-volatile GPRs, if not already saved. */
_GLOBAL(save_nvgprs)
@@ -413,38 +413,38 @@ _GLOBAL(save_nvgprs)
_GLOBAL(ppc_fork)
bl save_nvgprs
bl sys_fork
- b .Lsyscall_exit
+ b __system_call
_GLOBAL(ppc_vfork)
bl save_nvgprs
bl sys_vfork
- b .Lsyscall_exit
+ b __system_call
_GLOBAL(ppc_clone)
bl save_nvgprs
bl sys_clone
- b .Lsyscall_exit
+ b __system_call
_GLOBAL(ppc32_swapcontext)
bl save_nvgprs
bl compat_sys_swapcontext
- b .Lsyscall_exit
+ b __system_call
_GLOBAL(ppc64_swapcontext)
bl save_nvgprs
bl sys_swapcontext
- b .Lsyscall_exit
+ b __system_call
_GLOBAL(ppc_switch_endian)
bl save_nvgprs
bl sys_switch_endian
- b .Lsyscall_exit
+ b __system_call
_GLOBAL(ret_from_fork)
bl schedule_tail
REST_NVGPRS(r1)
li r3,0
- b .Lsyscall_exit
+ b __system_call
_GLOBAL(ret_from_kernel_thread)
bl schedule_tail
@@ -456,7 +456,7 @@ _GLOBAL(ret_from_kernel_thread)
#endif
blrl
li r3,0
- b .Lsyscall_exit
+ b __system_call
/*
* This routine switches between two different tasks. The process
--
2.12.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 3/3] powerpc/kprobes: blacklist functions invoked on a trap
2017-04-27 8:36 [PATCH v2 0/3] powerpc: build out kprobes blacklist Naveen N. Rao
2017-04-27 8:36 ` [PATCH v2 1/3] powerpc/kprobes: cleanup system_call_common and blacklist it from kprobes Naveen N. Rao
2017-04-27 8:36 ` [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() " Naveen N. Rao
@ 2017-04-27 8:36 ` Naveen N. Rao
2017-05-04 4:59 ` [PATCH v2 0/3] powerpc: build out kprobes blacklist Naveen N. Rao
3 siblings, 0 replies; 11+ messages in thread
From: Naveen N. Rao @ 2017-04-27 8:36 UTC (permalink / raw)
To: Michael Ellerman
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
Blacklist all functions involved while handling a trap. We:
- convert some of the labels into private labels,
- remove the duplicate 'restore' label, and
- blacklist most functions involved while handling a trap.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/kernel/entry_64.S | 47 +++++++++++++++++++++---------------
arch/powerpc/kernel/exceptions-64s.S | 1 +
arch/powerpc/kernel/traps.c | 3 +++
3 files changed, 31 insertions(+), 20 deletions(-)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index e030ce34dd66..e7e05eb590a5 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -184,7 +184,7 @@ __system_call:
#ifdef CONFIG_PPC_BOOK3S
/* No MSR:RI on BookE */
andi. r10,r8,MSR_RI
- beq- unrecov_restore
+ beq- .Lunrecov_restore
#endif
/*
* Disable interrupts so current_thread_info()->flags can't change,
@@ -399,6 +399,7 @@ _GLOBAL(save_nvgprs)
clrrdi r0,r11,1
std r0,_TRAP(r1)
blr
+_ASM_NOKPROBE_SYMBOL(save_nvgprs);
/*
@@ -642,18 +643,18 @@ _GLOBAL(ret_from_except_lite)
* Use the internal debug mode bit to do this.
*/
andis. r0,r3,DBCR0_IDM@h
- beq restore
+ beq fast_exc_return_irq
mfmsr r0
rlwinm r0,r0,0,~MSR_DE /* Clear MSR.DE */
mtmsr r0
mtspr SPRN_DBCR0,r3
li r10, -1
mtspr SPRN_DBSR,r10
- b restore
+ b fast_exc_return_irq
#else
addi r3,r1,STACK_FRAME_OVERHEAD
bl restore_math
- b restore
+ b fast_exc_return_irq
#endif
1: andi. r0,r4,_TIF_NEED_RESCHED
beq 2f
@@ -666,7 +667,7 @@ _GLOBAL(ret_from_except_lite)
bne 3f /* only restore TM if nothing else to do */
addi r3,r1,STACK_FRAME_OVERHEAD
bl restore_tm_state
- b restore
+ b fast_exc_return_irq
3:
#endif
bl save_nvgprs
@@ -718,14 +719,14 @@ resume_kernel:
#ifdef CONFIG_PREEMPT
/* Check if we need to preempt */
andi. r0,r4,_TIF_NEED_RESCHED
- beq+ restore
+ beq+ fast_exc_return_irq
/* Check that preempt_count() == 0 and interrupts are enabled */
lwz r8,TI_PREEMPT(r9)
cmpwi cr1,r8,0
ld r0,SOFTE(r1)
cmpdi r0,0
crandc eq,cr1*4+eq,eq
- bne restore
+ bne fast_exc_return_irq
/*
* Here we are preempting the current task. We want to make
@@ -756,7 +757,6 @@ resume_kernel:
.globl fast_exc_return_irq
fast_exc_return_irq:
-restore:
/*
* This is the main kernel exit path. First we check if we
* are about to re-enable interrupts
@@ -764,11 +764,11 @@ restore:
ld r5,SOFTE(r1)
lbz r6,PACASOFTIRQEN(r13)
cmpwi cr0,r5,0
- beq restore_irq_off
+ beq .Lrestore_irq_off
/* We are enabling, were we already enabled ? Yes, just return */
cmpwi cr0,r6,1
- beq cr0,do_restore
+ beq cr0,.Ldo_restore
/*
* We are about to soft-enable interrupts (we are hard disabled
@@ -777,14 +777,14 @@ restore:
*/
lbz r0,PACAIRQHAPPENED(r13)
cmpwi cr0,r0,0
- bne- restore_check_irq_replay
+ bne- .Lrestore_check_irq_replay
/*
* Get here when nothing happened while soft-disabled, just
* soft-enable and move-on. We will hard-enable as a side
* effect of rfi
*/
-restore_no_replay:
+.Lrestore_no_replay:
TRACE_ENABLE_INTS
li r0,1
stb r0,PACASOFTIRQEN(r13);
@@ -792,7 +792,7 @@ restore_no_replay:
/*
* Final return path. BookE is handled in a different file
*/
-do_restore:
+.Ldo_restore:
#ifdef CONFIG_PPC_BOOK3E
b exception_return_book3e
#else
@@ -826,7 +826,7 @@ fast_exception_return:
REST_8GPRS(5, r1)
andi. r0,r3,MSR_RI
- beq- unrecov_restore
+ beq- .Lunrecov_restore
/* Load PPR from thread struct before we clear MSR:RI */
BEGIN_FTR_SECTION
@@ -884,7 +884,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
* make sure that in this case, we also clear PACA_IRQ_HARD_DIS
* or that bit can get out of sync and bad things will happen
*/
-restore_irq_off:
+.Lrestore_irq_off:
ld r3,_MSR(r1)
lbz r7,PACAIRQHAPPENED(r13)
andi. r0,r3,MSR_EE
@@ -894,13 +894,13 @@ restore_irq_off:
1: li r0,0
stb r0,PACASOFTIRQEN(r13);
TRACE_DISABLE_INTS
- b do_restore
+ b .Ldo_restore
/*
* Something did happen, check if a re-emit is needed
* (this also clears paca->irq_happened)
*/
-restore_check_irq_replay:
+.Lrestore_check_irq_replay:
/* XXX: We could implement a fast path here where we check
* for irq_happened being just 0x01, in which case we can
* clear it and return. That means that we would potentially
@@ -910,7 +910,7 @@ restore_check_irq_replay:
*/
bl __check_irq_replay
cmpwi cr0,r3,0
- beq restore_no_replay
+ beq .Lrestore_no_replay
/*
* We need to re-emit an interrupt. We do so by re-using our
@@ -959,10 +959,17 @@ restore_check_irq_replay:
#endif /* CONFIG_PPC_DOORBELL */
1: b ret_from_except /* What else to do here ? */
-unrecov_restore:
+.Lunrecov_restore:
addi r3,r1,STACK_FRAME_OVERHEAD
bl unrecoverable_exception
- b unrecov_restore
+ b .Lunrecov_restore
+
+_ASM_NOKPROBE_SYMBOL(ret_from_except);
+_ASM_NOKPROBE_SYMBOL(ret_from_except_lite);
+_ASM_NOKPROBE_SYMBOL(resume_kernel);
+_ASM_NOKPROBE_SYMBOL(fast_exc_return_irq);
+_ASM_NOKPROBE_SYMBOL(fast_exception_return);
+
#ifdef CONFIG_PPC_RTAS
/*
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 28f8d7bed6b1..aff23b2288f2 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1534,6 +1534,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
1: addi r3,r1,STACK_FRAME_OVERHEAD
bl kernel_bad_stack
b 1b
+_ASM_NOKPROBE_SYMBOL(bad_stack);
/*
* Called from arch_local_irq_enable when an interrupt needs
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 76f6045b021b..8ce51787b2ba 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -237,6 +237,7 @@ void die(const char *str, struct pt_regs *regs, long err)
err = 0;
oops_end(flags, regs, err);
}
+NOKPROBE_SYMBOL(die);
void user_single_step_siginfo(struct task_struct *tsk,
struct pt_regs *regs, siginfo_t *info)
@@ -1951,6 +1952,7 @@ void unrecoverable_exception(struct pt_regs *regs)
regs->trap, regs->nip);
die("Unrecoverable exception", regs, SIGABRT);
}
+NOKPROBE_SYMBOL(unrecoverable_exception);
#if defined(CONFIG_BOOKE_WDT) || defined(CONFIG_40x)
/*
@@ -1981,6 +1983,7 @@ void kernel_bad_stack(struct pt_regs *regs)
regs->gpr[1], regs->nip);
die("Bad kernel stack pointer", regs, SIGABRT);
}
+NOKPROBE_SYMBOL(kernel_bad_stack);
void __init trap_init(void)
{
--
2.12.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() from kprobes
2017-04-27 8:36 ` [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() " Naveen N. Rao
@ 2017-04-27 10:19 ` Michael Ellerman
2017-04-27 12:35 ` Naveen N. Rao
0 siblings, 1 reply; 11+ messages in thread
From: Michael Ellerman @ 2017-04-27 10:19 UTC (permalink / raw)
To: Naveen N. Rao
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
> It is actually safe to probe system_call() in entry_64.S, but only till
> .Lsyscall_exit. To allow this, convert .Lsyscall_exit to a non-local
> symbol __system_call() and blacklist that symbol, rather than
> system_call().
I'm not sure I like this. The reason we made it a local symbol in the
first place is because it made backtraces look odd:
commit 4c3b21686111e0ac6018469dacbc5549f9915cf8
Author: Michael Ellerman <mpe@ellerman.id.au>
AuthorDate: Fri Dec 5 21:16:59 2014 +1100
powerpc/kernel: Make syscall_exit a local label
Currently when we back trace something that is in a syscall we see
something like this:
[c000000000000000] [c000000000000000] SyS_read+0x6c/0x110
[c000000000000000] [c000000000000000] syscall_exit+0x0/0x98
Although it's entirely correct, seeing syscall_exit at the bottom can be
confusing - we were exiting from a syscall and then called SyS_read() ?
If we instead change syscall_exit to be a local label we get something
more intuitive:
[c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110
[c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0
ie. we were handling a system call, and it was SyS_read().
I think you know that, although you didn't mention it in the change log,
because you've called the new symbol __system_call. But that is not a
great name either because that's not what it does.
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 380361c0bb6a..e030ce34dd66 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -176,7 +176,7 @@ system_call: /* label this so stack traces look sane */
> mtctr r12
> bctrl /* Call handler */
>
> -.Lsyscall_exit:
> +__system_call:
> std r3,RESULT(r1)
> CURRENT_THREAD_INFO(r12, r1)
Why can't we kprobe the std and the rotate to current thread info?
Is the real no-probe point just here, prior to the clearing of MSR_RI ?
ld r8,_MSR(r1)
#ifdef CONFIG_PPC_BOOK3S
/* No MSR:RI on BookE */
cheers
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() from kprobes
2017-04-27 10:19 ` Michael Ellerman
@ 2017-04-27 12:35 ` Naveen N. Rao
2017-05-04 6:03 ` Michael Ellerman
0 siblings, 1 reply; 11+ messages in thread
From: Naveen N. Rao @ 2017-04-27 12:35 UTC (permalink / raw)
To: Michael Ellerman
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
On 2017/04/27 08:19PM, Michael Ellerman wrote:
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
>
> > It is actually safe to probe system_call() in entry_64.S, but only till
> > .Lsyscall_exit. To allow this, convert .Lsyscall_exit to a non-local
> > symbol __system_call() and blacklist that symbol, rather than
> > system_call().
>
> I'm not sure I like this. The reason we made it a local symbol in the
> first place is because it made backtraces look odd:
>
> commit 4c3b21686111e0ac6018469dacbc5549f9915cf8
> Author: Michael Ellerman <mpe@ellerman.id.au>
> AuthorDate: Fri Dec 5 21:16:59 2014 +1100
>
> powerpc/kernel: Make syscall_exit a local label
>
> Currently when we back trace something that is in a syscall we see
> something like this:
>
> [c000000000000000] [c000000000000000] SyS_read+0x6c/0x110
> [c000000000000000] [c000000000000000] syscall_exit+0x0/0x98
>
> Although it's entirely correct, seeing syscall_exit at the bottom can be
> confusing - we were exiting from a syscall and then called SyS_read() ?
>
> If we instead change syscall_exit to be a local label we get something
> more intuitive:
>
> [c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110
> [c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0
>
> ie. we were handling a system call, and it was SyS_read().
>
>
> I think you know that, although you didn't mention it in the change log,
> because you've called the new symbol __system_call. But that is not a
> great name either because that's not what it does.
Yes, you're right. I used __system_call since I felt that it won't cause
confusion like syscall_exit did. I agree it's not a great name, but we
need _some_ label other than system_call if we want to allow probing at
this point.
Also, if I'm reading this right, there is no other place to probe if we
want to capture all system call entries.
So, I felt this would be good to have.
>
> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> > index 380361c0bb6a..e030ce34dd66 100644
> > --- a/arch/powerpc/kernel/entry_64.S
> > +++ b/arch/powerpc/kernel/entry_64.S
> > @@ -176,7 +176,7 @@ system_call: /* label this so stack traces look sane */
> > mtctr r12
> > bctrl /* Call handler */
> >
> > -.Lsyscall_exit:
> > +__system_call:
> > std r3,RESULT(r1)
> > CURRENT_THREAD_INFO(r12, r1)
>
> Why can't we kprobe the std and the rotate to current thread info?
>
> Is the real no-probe point just here, prior to the clearing of MSR_RI ?
>
> ld r8,_MSR(r1)
> #ifdef CONFIG_PPC_BOOK3S
> /* No MSR:RI on BookE */
We can probe at all those places, just not once MSR_RI is unset. So, the
no-probe point is just *after* the mtmsrd.
However, for kprobe blacklisting, the granularity is at a function level
(or ASM labels). As such, we will have to blacklist all of
syscall_exit/__system_call.
Regards,
Naveen
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 0/3] powerpc: build out kprobes blacklist
2017-04-27 8:36 [PATCH v2 0/3] powerpc: build out kprobes blacklist Naveen N. Rao
` (2 preceding siblings ...)
2017-04-27 8:36 ` [PATCH v2 3/3] powerpc/kprobes: blacklist functions invoked on a trap Naveen N. Rao
@ 2017-05-04 4:59 ` Naveen N. Rao
3 siblings, 0 replies; 11+ messages in thread
From: Naveen N. Rao @ 2017-05-04 4:59 UTC (permalink / raw)
To: Anton Blanchard
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev, Michael Ellerman
On 2017/04/27 02:06PM, Naveen N. Rao wrote:
> v2 changes:
> - Patches 3 and 4 from the previous series have been merged.
> - Updated to no longer blacklist functions involved with stolen time
> accounting.
>
> v1:
> https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg117514.html
> --
> This is the second in the series of patches to build out an appropriate
> kprobes blacklist. This series blacklists system_call() and functions
> involved when handling the trap itself. Not everything is covered, but
> this is the first set of functions that I have tested with. More
> patches to follow once I expand my tests.
>
> I have converted many labels into private -- these are labels that I
> felt are not necessary to read stack traces. If any of those are
> important to have, please let me know.
Hi Anton,
Can you please take a look at this series and let me know if you any
concerns, especially around some of the labels that I have made private?
Thanks,
Naveen
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() from kprobes
2017-04-27 12:35 ` Naveen N. Rao
@ 2017-05-04 6:03 ` Michael Ellerman
2017-05-04 7:28 ` Naveen N. Rao
2017-05-04 8:41 ` [PATCH v3 " Naveen N. Rao
0 siblings, 2 replies; 11+ messages in thread
From: Michael Ellerman @ 2017-05-04 6:03 UTC (permalink / raw)
To: Naveen N. Rao
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
> On 2017/04/27 08:19PM, Michael Ellerman wrote:
>> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
>>
>> > It is actually safe to probe system_call() in entry_64.S, but only till
>> > .Lsyscall_exit. To allow this, convert .Lsyscall_exit to a non-local
>> > symbol __system_call() and blacklist that symbol, rather than
>> > system_call().
>>
>> I'm not sure I like this. The reason we made it a local symbol in the
>> first place is because it made backtraces look odd:
>>
>> commit 4c3b21686111e0ac6018469dacbc5549f9915cf8
>> Author: Michael Ellerman <mpe@ellerman.id.au>
>> AuthorDate: Fri Dec 5 21:16:59 2014 +1100
>>
>> powerpc/kernel: Make syscall_exit a local label
>>
>> Currently when we back trace something that is in a syscall we see
>> something like this:
>>
>> [c000000000000000] [c000000000000000] SyS_read+0x6c/0x110
>> [c000000000000000] [c000000000000000] syscall_exit+0x0/0x98
>>
>> Although it's entirely correct, seeing syscall_exit at the bottom can be
>> confusing - we were exiting from a syscall and then called SyS_read() ?
>>
>> If we instead change syscall_exit to be a local label we get something
>> more intuitive:
>>
>> [c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110
>> [c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0
>>
>> ie. we were handling a system call, and it was SyS_read().
>>
>>
>> I think you know that, although you didn't mention it in the change log,
>> because you've called the new symbol __system_call. But that is not a
>> great name either because that's not what it does.
>
> Yes, you're right. I used __system_call since I felt that it won't cause
> confusion like syscall_exit did. I agree it's not a great name, but we
> need _some_ label other than system_call if we want to allow probing at
> this point.
>
> Also, if I'm reading this right, there is no other place to probe if we
> want to capture all system call entries.
>
> So, I felt this would be good to have.
>
>>
>> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> > index 380361c0bb6a..e030ce34dd66 100644
>> > --- a/arch/powerpc/kernel/entry_64.S
>> > +++ b/arch/powerpc/kernel/entry_64.S
>> > @@ -176,7 +176,7 @@ system_call: /* label this so stack traces look sane */
>> > mtctr r12
>> > bctrl /* Call handler */
>> >
>> > -.Lsyscall_exit:
>> > +__system_call:
>> > std r3,RESULT(r1)
>> > CURRENT_THREAD_INFO(r12, r1)
>>
>> Why can't we kprobe the std and the rotate to current thread info?
>>
>> Is the real no-probe point just here, prior to the clearing of MSR_RI ?
>>
>> ld r8,_MSR(r1)
>> #ifdef CONFIG_PPC_BOOK3S
>> /* No MSR:RI on BookE */
>
> We can probe at all those places, just not once MSR_RI is unset. So, the
> no-probe point is just *after* the mtmsrd.
>
> However, for kprobe blacklisting, the granularity is at a function level
> (or ASM labels). As such, we will have to blacklist all of
> syscall_exit/__system_call.
Would this work?
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 767ef6d68c9e..8d0fa4a2262a 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -207,6 +207,7 @@ system_call: /* label this so stack traces look sane */
mtmsrd r11,1
#endif /* CONFIG_PPC_BOOK3E */
+syscall_exit:
ld r9,TI_FLAGS(r12)
li r11,-MAX_ERRNO
andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
cheers
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() from kprobes
2017-05-04 6:03 ` Michael Ellerman
@ 2017-05-04 7:28 ` Naveen N. Rao
2017-05-04 9:52 ` Michael Ellerman
2017-05-04 8:41 ` [PATCH v3 " Naveen N. Rao
1 sibling, 1 reply; 11+ messages in thread
From: Naveen N. Rao @ 2017-05-04 7:28 UTC (permalink / raw)
To: Michael Ellerman
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
On 2017/05/04 04:03PM, Michael Ellerman wrote:
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
>
> > On 2017/04/27 08:19PM, Michael Ellerman wrote:
> >> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
> >>
> >> > It is actually safe to probe system_call() in entry_64.S, but only till
> >> > .Lsyscall_exit. To allow this, convert .Lsyscall_exit to a non-local
> >> > symbol __system_call() and blacklist that symbol, rather than
> >> > system_call().
> >>
> >> I'm not sure I like this. The reason we made it a local symbol in the
> >> first place is because it made backtraces look odd:
> >>
> >> commit 4c3b21686111e0ac6018469dacbc5549f9915cf8
> >> Author: Michael Ellerman <mpe@ellerman.id.au>
> >> AuthorDate: Fri Dec 5 21:16:59 2014 +1100
> >>
> >> powerpc/kernel: Make syscall_exit a local label
> >>
> >> Currently when we back trace something that is in a syscall we see
> >> something like this:
> >>
> >> [c000000000000000] [c000000000000000] SyS_read+0x6c/0x110
> >> [c000000000000000] [c000000000000000] syscall_exit+0x0/0x98
> >>
> >> Although it's entirely correct, seeing syscall_exit at the bottom can be
> >> confusing - we were exiting from a syscall and then called SyS_read() ?
> >>
> >> If we instead change syscall_exit to be a local label we get something
> >> more intuitive:
> >>
> >> [c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110
> >> [c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0
> >>
> >> ie. we were handling a system call, and it was SyS_read().
> >>
> >>
> >> I think you know that, although you didn't mention it in the change log,
> >> because you've called the new symbol __system_call. But that is not a
> >> great name either because that's not what it does.
> >
> > Yes, you're right. I used __system_call since I felt that it won't cause
> > confusion like syscall_exit did. I agree it's not a great name, but we
> > need _some_ label other than system_call if we want to allow probing at
> > this point.
> >
> > Also, if I'm reading this right, there is no other place to probe if we
> > want to capture all system call entries.
> >
> > So, I felt this would be good to have.
> >
> >>
> >> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> >> > index 380361c0bb6a..e030ce34dd66 100644
> >> > --- a/arch/powerpc/kernel/entry_64.S
> >> > +++ b/arch/powerpc/kernel/entry_64.S
> >> > @@ -176,7 +176,7 @@ system_call: /* label this so stack traces look sane */
> >> > mtctr r12
> >> > bctrl /* Call handler */
> >> >
> >> > -.Lsyscall_exit:
> >> > +__system_call:
> >> > std r3,RESULT(r1)
> >> > CURRENT_THREAD_INFO(r12, r1)
> >>
> >> Why can't we kprobe the std and the rotate to current thread info?
> >>
> >> Is the real no-probe point just here, prior to the clearing of MSR_RI ?
> >>
> >> ld r8,_MSR(r1)
> >> #ifdef CONFIG_PPC_BOOK3S
> >> /* No MSR:RI on BookE */
> >
> > We can probe at all those places, just not once MSR_RI is unset. So, the
> > no-probe point is just *after* the mtmsrd.
> >
> > However, for kprobe blacklisting, the granularity is at a function level
> > (or ASM labels). As such, we will have to blacklist all of
> > syscall_exit/__system_call.
>
> Would this work?
>
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 767ef6d68c9e..8d0fa4a2262a 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -207,6 +207,7 @@ system_call: /* label this so stack traces look sane */
> mtmsrd r11,1
> #endif /* CONFIG_PPC_BOOK3E */
>
> +syscall_exit:
> ld r9,TI_FLAGS(r12)
> li r11,-MAX_ERRNO
> andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
Ah, nice. I previously incorrectly assumed that syscall_exit was not
desirable throughout this function. Your earlier patch was only about
what label showed up while _inside_ a syscall. So, syscall_exit post
handling of a syscall is fine.
This patch looks fine to me. I will test with this change and get back.
Thanks,
Naveen
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 2/3] powerpc/kprobes: un-blacklist system_call() from kprobes
2017-05-04 6:03 ` Michael Ellerman
2017-05-04 7:28 ` Naveen N. Rao
@ 2017-05-04 8:41 ` Naveen N. Rao
1 sibling, 0 replies; 11+ messages in thread
From: Naveen N. Rao @ 2017-05-04 8:41 UTC (permalink / raw)
To: Michael Ellerman
Cc: Masami Hiramatsu, Ananth N Mavinakayanahalli, Anton Blanchard,
linuxppc-dev
It is actually safe to probe system_call() in entry_64.S, but only till
we unset MSR_RI. To allow this, add a new label system_call_exit after
the mtmsrd and blacklist that. Though the mtmsrd instruction itself is
now whitelisted, we won't be allowed to probe on it as we don't allow
probing on rfi and mtmsr instructions (checked for in arch_prepare_kprobe).
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
Michael,
I have named the new label system_call_exit so as to follow the
existing labels (system_call and system_call_common) and to not
conflict with the syscall_exit private label.
- Naveen
arch/powerpc/kernel/entry_64.S | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 380361c0bb6a..e255221b0ec0 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -204,6 +204,7 @@ system_call: /* label this so stack traces look sane */
mtmsrd r11,1
#endif /* CONFIG_PPC_BOOK3E */
+system_call_exit:
ld r9,TI_FLAGS(r12)
li r11,-MAX_ERRNO
andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
@@ -388,7 +389,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
b . /* prevent speculative execution */
#endif
_ASM_NOKPROBE_SYMBOL(system_call_common);
-_ASM_NOKPROBE_SYMBOL(system_call);
+_ASM_NOKPROBE_SYMBOL(system_call_exit);
/* Save non-volatile GPRs, if not already saved. */
_GLOBAL(save_nvgprs)
--
2.12.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() from kprobes
2017-05-04 7:28 ` Naveen N. Rao
@ 2017-05-04 9:52 ` Michael Ellerman
0 siblings, 0 replies; 11+ messages in thread
From: Michael Ellerman @ 2017-05-04 9:52 UTC (permalink / raw)
To: Naveen N. Rao
Cc: Anton Blanchard, Ananth N Mavinakayanahalli, Masami Hiramatsu,
linuxppc-dev
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
> On 2017/05/04 04:03PM, Michael Ellerman wrote:
>> Would this work?
>>
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 767ef6d68c9e..8d0fa4a2262a 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -207,6 +207,7 @@ system_call: /* label this so stack traces look sane */
>> mtmsrd r11,1
>> #endif /* CONFIG_PPC_BOOK3E */
>>
>> +syscall_exit:
>> ld r9,TI_FLAGS(r12)
>> li r11,-MAX_ERRNO
>> andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
>
> Ah, nice. I previously incorrectly assumed that syscall_exit was not
> desirable throughout this function. Your earlier patch was only about
> what label showed up while _inside_ a syscall.
Yep. When you're somewhere in a syscall the LR on the stack points to
the instruction following the bctrl that called the syscall handler, so
as long as the label preceeding that is system_call then the backtrace
should look good.
We could even just have a nop after the bctrl and then the label, but
that would be a bit gross.
> So, syscall_exit post handling of a syscall is fine.
>
> This patch looks fine to me. I will test with this change and get back.
Thanks.
cheers
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2017-05-04 9:52 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-27 8:36 [PATCH v2 0/3] powerpc: build out kprobes blacklist Naveen N. Rao
2017-04-27 8:36 ` [PATCH v2 1/3] powerpc/kprobes: cleanup system_call_common and blacklist it from kprobes Naveen N. Rao
2017-04-27 8:36 ` [PATCH v2 2/3] powerpc/kprobes: un-blacklist system_call() " Naveen N. Rao
2017-04-27 10:19 ` Michael Ellerman
2017-04-27 12:35 ` Naveen N. Rao
2017-05-04 6:03 ` Michael Ellerman
2017-05-04 7:28 ` Naveen N. Rao
2017-05-04 9:52 ` Michael Ellerman
2017-05-04 8:41 ` [PATCH v3 " Naveen N. Rao
2017-04-27 8:36 ` [PATCH v2 3/3] powerpc/kprobes: blacklist functions invoked on a trap Naveen N. Rao
2017-05-04 4:59 ` [PATCH v2 0/3] powerpc: build out kprobes blacklist Naveen N. Rao
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.