* [PATCH 1/2] powerpc: Add trap_nr to thread_struct @ 2012-08-22 8:22 Ananth N Mavinakayanahalli 2012-08-22 8:27 ` [PATCH v4 2/2] powerpc: Uprobes port to powerpc Ananth N Mavinakayanahalli 0 siblings, 1 reply; 15+ messages in thread From: Ananth N Mavinakayanahalli @ 2012-08-22 8:22 UTC (permalink / raw) To: ppcdev, lkml Cc: benh, Paul Mackerras, Anton Blanchard, peterz, oleg, Srikar Dronamraju, Ingo Molnar, michael From: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Add thread_struct.trap_nr and use it to store the last exception the thread experienced. In this patch, we populate the field at various places where we force_sig_info() to the process. This is also used in uprobes to determine if the probed instruction caused an exception. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> --- arch/powerpc/include/asm/processor.h | 1 + arch/powerpc/kernel/process.c | 2 ++ arch/powerpc/kernel/traps.c | 1 + arch/powerpc/mm/fault.c | 1 + 4 files changed, 5 insertions(+) Index: linux-26jul/arch/powerpc/include/asm/processor.h =================================================================== --- linux-26jul.orig/arch/powerpc/include/asm/processor.h +++ linux-26jul/arch/powerpc/include/asm/processor.h @@ -219,6 +219,7 @@ struct thread_struct { #endif /* CONFIG_HAVE_HW_BREAKPOINT */ #endif unsigned long dabr; /* Data address breakpoint register */ + unsigned long trap_nr; /* last trap # on this thread */ #ifdef CONFIG_ALTIVEC /* Complete AltiVec register set */ vector128 vr[32] __attribute__((aligned(16))); Index: linux-26jul/arch/powerpc/kernel/process.c =================================================================== --- linux-26jul.orig/arch/powerpc/kernel/process.c +++ linux-26jul/arch/powerpc/kernel/process.c @@ -258,6 +258,7 @@ void do_send_trap(struct pt_regs *regs, { siginfo_t info; + current->thread.trap_nr = signal_code; if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, error_code, 11, SIGSEGV) == NOTIFY_STOP) return; @@ -275,6 +276,7 @@ void do_dabr(struct pt_regs *regs, unsig { siginfo_t info; + current->thread.trap_nr = TRAP_HWBKPT; if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, error_code, 11, SIGSEGV) == NOTIFY_STOP) return; Index: linux-26jul/arch/powerpc/kernel/traps.c =================================================================== --- linux-26jul.orig/arch/powerpc/kernel/traps.c +++ linux-26jul/arch/powerpc/kernel/traps.c @@ -251,6 +251,7 @@ void _exception(int signr, struct pt_reg if (arch_irqs_disabled() && !arch_irq_disabled_regs(regs)) local_irq_enable(); + current->thread.trap_nr = code; memset(&info, 0, sizeof(info)); info.si_signo = signr; info.si_code = code; Index: linux-26jul/arch/powerpc/mm/fault.c =================================================================== --- linux-26jul.orig/arch/powerpc/mm/fault.c +++ linux-26jul/arch/powerpc/mm/fault.c @@ -133,6 +133,7 @@ static int do_sigbus(struct pt_regs *reg up_read(¤t->mm->mmap_sem); if (user_mode(regs)) { + current->thread.trap_nr = BUS_ADRERR; info.si_signo = SIGBUS; info.si_errno = 0; info.si_code = BUS_ADRERR; ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-22 8:22 [PATCH 1/2] powerpc: Add trap_nr to thread_struct Ananth N Mavinakayanahalli @ 2012-08-22 8:27 ` Ananth N Mavinakayanahalli 2012-08-22 15:55 ` Oleg Nesterov 2012-08-23 4:28 ` Michael Ellerman 0 siblings, 2 replies; 15+ messages in thread From: Ananth N Mavinakayanahalli @ 2012-08-22 8:27 UTC (permalink / raw) To: ppcdev, lkml Cc: benh, Paul Mackerras, Anton Blanchard, peterz, oleg, Srikar Dronamraju, Ingo Molnar, michael From: Ananth N Mavinakayanahalli <ananth@in.ibm.com> This is the port of uprobes to powerpc. Usage is similar to x86. [root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc Added new event: probe_libc:malloc (on 0xb4860) You can now use it in all perf tools, such as: perf record -e probe_libc:malloc -aR sleep 1 [root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20 [ perf record: Woken up 22 times to write data ] [ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ] [root@xxxx ~]# ./bin/perf report --stdio ... # Samples: 83K of event 'probe_libc:malloc' # Event count (approx.): 83484 # # Overhead Command Shared Object Symbol # ........ ............ ............. .......... # 69.05% tar libc-2.12.so [.] malloc 28.57% rm libc-2.12.so [.] malloc 1.32% avahi-daemon libc-2.12.so [.] malloc 0.58% bash libc-2.12.so [.] malloc 0.28% sshd libc-2.12.so [.] malloc 0.08% irqbalance libc-2.12.so [.] malloc 0.05% bzip2 libc-2.12.so [.] malloc 0.04% sleep libc-2.12.so [.] malloc 0.03% multipathd libc-2.12.so [.] malloc 0.01% sendmail libc-2.12.so [.] malloc 0.01% automount libc-2.12.so [.] malloc The trap_nr addition patch is a prereq. A few minor changes can be expected down the line based on the discussions currently happening on lkml around the single-stepping infrastructure code (http://marc.info/?l=linux-kernel&m=134545967710242&w=2). Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> --- Tested on POWER6; I don't see anything here that should stop it from working on a ppc32; since I don't have access to a ppc32 machine, it would be good if somoene could verify that part. V4: Enhance arch_analyze_uprobe_insn() to reject uprobes on trap variants. V3: Added abort_xol() logic for powerpc, using thread_struct.trap_nr to determine if the stepped instruction caused an exception. V2: a. arch_uprobe_analyze_insn() now gets unsigned long addr. b. Verified that mtmsr[d] and rfi[d] are handled correctly by emulate_step() (no changes to this patch). arch/powerpc/Kconfig | 3 arch/powerpc/include/asm/thread_info.h | 4 arch/powerpc/include/asm/uprobes.h | 58 ++++++++++ arch/powerpc/kernel/Makefile | 1 arch/powerpc/kernel/signal.c | 6 + arch/powerpc/kernel/uprobes.c | 180 +++++++++++++++++++++++++++++++++ 6 files changed, 251 insertions(+), 1 deletion(-) Index: linux-tip-16aug/arch/powerpc/include/asm/thread_info.h =================================================================== --- linux-tip-16aug.orig/arch/powerpc/include/asm/thread_info.h +++ linux-tip-16aug/arch/powerpc/include/asm/thread_info.h @@ -102,6 +102,7 @@ static inline struct thread_info *curren #define TIF_RESTOREALL 11 /* Restore all regs (implies NOERROR) */ #define TIF_NOERROR 12 /* Force successful syscall return */ #define TIF_NOTIFY_RESUME 13 /* callback before returning to user */ +#define TIF_UPROBE 14 /* breakpointed or single-stepping */ #define TIF_SYSCALL_TRACEPOINT 15 /* syscall tracepoint instrumentation */ /* as above, but as bit values */ @@ -118,12 +119,13 @@ static inline struct thread_info *curren #define _TIF_RESTOREALL (1<<TIF_RESTOREALL) #define _TIF_NOERROR (1<<TIF_NOERROR) #define _TIF_NOTIFY_RESUME (1<<TIF_NOTIFY_RESUME) +#define _TIF_UPROBE (1<<TIF_UPROBE) #define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT) #define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT) #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ - _TIF_NOTIFY_RESUME) + _TIF_NOTIFY_RESUME | _TIF_UPROBE) #define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR) /* Bits in local_flags */ Index: linux-tip-16aug/arch/powerpc/include/asm/uprobes.h =================================================================== --- /dev/null +++ linux-tip-16aug/arch/powerpc/include/asm/uprobes.h @@ -0,0 +1,58 @@ +#ifndef _ASM_UPROBES_H +#define _ASM_UPROBES_H +/* + * User-space Probes (UProbes) for powerpc + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2007-2012 + * + * Adapted from the x86 port by Ananth N Mavinakayanahalli <ananth@in.ibm.com> + */ + +#include <linux/notifier.h> + +typedef unsigned int uprobe_opcode_t; + +#define MAX_UINSN_BYTES 4 +#define UPROBE_XOL_SLOT_BYTES (MAX_UINSN_BYTES) + +#define UPROBE_SWBP_INSN 0x7fe00008 +#define UPROBE_SWBP_INSN_SIZE 4 /* swbp insn size in bytes */ + +#define IS_TW(instr) (((instr) & 0xfc0007fe) == 0x7c000008) +#define IS_TD(instr) (((instr) & 0xfc0007fe) == 0x7c000088) +#define IS_TDI(instr) (((instr) & 0xfc000000) == 0x08000000) +#define IS_TWI(instr) (((instr) & 0xfc000000) == 0x0c000000) + +#define is_trap(instr) (IS_TW(instr) || IS_TD(instr) || \ + IS_TWI(instr) || IS_TDI(instr)) + +struct arch_uprobe { + u8 insn[MAX_UINSN_BYTES]; +}; + +struct arch_uprobe_task { + unsigned long saved_trap_nr; +}; + +extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr); +extern unsigned long uprobe_get_swbp_addr(struct pt_regs *regs); +extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs); +extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs); +extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk); +extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data); +extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs); +#endif /* _ASM_UPROBES_H */ Index: linux-tip-16aug/arch/powerpc/kernel/Makefile =================================================================== --- linux-tip-16aug.orig/arch/powerpc/kernel/Makefile +++ linux-tip-16aug/arch/powerpc/kernel/Makefile @@ -96,6 +96,7 @@ obj-$(CONFIG_MODULES) += ppc_ksyms.o obj-$(CONFIG_BOOTX_TEXT) += btext.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_KPROBES) += kprobes.o +obj-$(CONFIG_UPROBES) += uprobes.o obj-$(CONFIG_PPC_UDBG_16550) += legacy_serial.o udbg_16550.o obj-$(CONFIG_STACKTRACE) += stacktrace.o obj-$(CONFIG_SWIOTLB) += dma-swiotlb.o Index: linux-tip-16aug/arch/powerpc/kernel/signal.c =================================================================== --- linux-tip-16aug.orig/arch/powerpc/kernel/signal.c +++ linux-tip-16aug/arch/powerpc/kernel/signal.c @@ -11,6 +11,7 @@ #include <linux/tracehook.h> #include <linux/signal.h> +#include <linux/uprobes.h> #include <linux/key.h> #include <asm/hw_breakpoint.h> #include <asm/uaccess.h> @@ -157,6 +158,11 @@ static int do_signal(struct pt_regs *reg void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) { + if (thread_info_flags & _TIF_UPROBE) { + clear_thread_flag(TIF_UPROBE); + uprobe_notify_resume(regs); + } + if (thread_info_flags & _TIF_SIGPENDING) do_signal(regs); Index: linux-tip-16aug/arch/powerpc/kernel/uprobes.c =================================================================== --- /dev/null +++ linux-tip-16aug/arch/powerpc/kernel/uprobes.c @@ -0,0 +1,180 @@ +/* + * User-space Probes (UProbes) for powerpc + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2007-2012 + * + * Adapted from the x86 port by Ananth N Mavinakayanahalli <ananth@in.ibm.com> + */ +#include <linux/kernel.h> +#include <linux/sched.h> +#include <linux/ptrace.h> +#include <linux/uprobes.h> +#include <linux/uaccess.h> +#include <linux/kdebug.h> + +#include <asm/sstep.h> + +#define UPROBE_TRAP_NR UINT_MAX + +/** + * arch_uprobe_analyze_insn + * @mm: the probed address space. + * @arch_uprobe: the probepoint information. + * @addr: vaddr to probe. + * Return 0 on success or a -ve number on error. + */ +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) +{ + unsigned int insn; + + if (addr & 0x03) + return -EINVAL; + + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); + if (is_trap(insn)) + return -ENOTSUPP; + return 0; +} + +/* + * arch_uprobe_pre_xol - prepare to execute out of line. + * @auprobe: the probepoint information. + * @regs: reflects the saved user state of current task. + */ +int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + struct arch_uprobe_task *autask = ¤t->utask->autask; + + autask->saved_trap_nr = current->thread.trap_nr; + current->thread.trap_nr = UPROBE_TRAP_NR; + regs->nip = current->utask->xol_vaddr; + return 0; +} + +/** + * uprobe_get_swbp_addr - compute address of swbp given post-swbp regs + * @regs: Reflects the saved state of the task after it has hit a breakpoint + * instruction. + * Return the address of the breakpoint instruction. + */ +unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) +{ + return instruction_pointer(regs); +} + +/* + * If xol insn itself traps and generates a signal (SIGILL/SIGSEGV/etc), + * then detect the case where a singlestepped instruction jumps back to its + * own address. It is assumed that anything like do_page_fault/do_trap/etc + * sets thread.trap_nr != -1. + * + * arch_uprobe_pre_xol/arch_uprobe_post_xol save/restore thread.trap_nr, + * arch_uprobe_xol_was_trapped() simply checks that ->trap_nr is not equal to + * UPROBE_TRAP_NR == -1 set by arch_uprobe_pre_xol(). + */ +bool arch_uprobe_xol_was_trapped(struct task_struct *t) +{ + if (t->thread.trap_nr != UPROBE_TRAP_NR) + return true; + + return false; +} + +/* + * Called after single-stepping. To avoid the SMP problems that can + * occur when we temporarily put back the original opcode to + * single-step, we single-stepped a copy of the instruction. + * + * This function prepares to resume execution after the single-step. + */ +int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + struct uprobe_task *utask = current->utask; + + WARN_ON_ONCE(current->thread.trap_nr != UPROBE_TRAP_NR); + + current->thread.trap_nr = utask->autask.saved_trap_nr; + + /* + * On powerpc, except for loads and stores, most instructions + * including ones that alter code flow (branches, calls, returns) + * are emulated in the kernel. We get here only if the emulation + * support doesn't exist and have to fix-up the next instruction + * to be executed. + */ + regs->nip = utask->vaddr + MAX_UINSN_BYTES; + return 0; +} + +/* callback routine for handling exceptions. */ +int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data) +{ + struct die_args *args = data; + struct pt_regs *regs = args->regs; + + /* We are only interested in userspace traps */ + if (regs && !user_mode(regs)) + return NOTIFY_DONE; + + switch (val) { + case DIE_BPT: + if (uprobe_pre_sstep_notifier(regs)) + return NOTIFY_STOP; + break; + case DIE_SSTEP: + if (uprobe_post_sstep_notifier(regs)) + return NOTIFY_STOP; + default: + break; + } + return NOTIFY_DONE; +} + +/* + * This function gets called when XOL instruction either gets trapped or + * the thread has a fatal signal, so reset the instruction pointer to its + * probed address. + */ +void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + struct uprobe_task *utask = current->utask; + + current->thread.trap_nr = utask->autask.saved_trap_nr; + instruction_pointer_set(regs, utask->vaddr); +} + +/* + * See if the instruction can be emulated. + * Returns true if instruction was emulated, false otherwise. + */ +bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs) +{ + int ret; + unsigned int insn; + + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); + + /* + * emulate_step() returns 1 if the insn was successfully emulated. + * For all other cases, we need to single-step in hardware. + */ + ret = emulate_step(regs, insn); + if (ret > 0) + return true; + + return false; +} Index: linux-tip-16aug/arch/powerpc/Kconfig =================================================================== --- linux-tip-16aug.orig/arch/powerpc/Kconfig +++ linux-tip-16aug/arch/powerpc/Kconfig @@ -239,6 +239,9 @@ config PPC_OF_PLATFORM_PCI config ARCH_SUPPORTS_DEBUG_PAGEALLOC def_bool y +config ARCH_SUPPORTS_UPROBES + def_bool y + config PPC_ADV_DEBUG_REGS bool depends on 40x || BOOKE ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-22 8:27 ` [PATCH v4 2/2] powerpc: Uprobes port to powerpc Ananth N Mavinakayanahalli @ 2012-08-22 15:55 ` Oleg Nesterov 2012-08-23 4:28 ` Michael Ellerman 1 sibling, 0 replies; 15+ messages in thread From: Oleg Nesterov @ 2012-08-22 15:55 UTC (permalink / raw) To: Ananth N Mavinakayanahalli Cc: ppcdev, lkml, benh, Paul Mackerras, Anton Blanchard, peterz, Srikar Dronamraju, Ingo Molnar, michael On 08/22, Ananth N Mavinakayanahalli wrote: > > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) > +{ > + unsigned int insn; > + > + if (addr & 0x03) > + return -EINVAL; > + > + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); > + if (is_trap(insn)) > + return -ENOTSUPP; This addresses the only concern I had, thanks. Once again, I am in no position to review the powerpc code, but since I made some noise before: I think the patch is fine. Oleg. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-22 8:27 ` [PATCH v4 2/2] powerpc: Uprobes port to powerpc Ananth N Mavinakayanahalli 2012-08-22 15:55 ` Oleg Nesterov @ 2012-08-23 4:28 ` Michael Ellerman 2012-08-23 5:32 ` Srikar Dronamraju 2012-08-23 5:58 ` Ananth N Mavinakayanahalli 1 sibling, 2 replies; 15+ messages in thread From: Michael Ellerman @ 2012-08-23 4:28 UTC (permalink / raw) To: ananth Cc: ppcdev, lkml, benh, Paul Mackerras, Anton Blanchard, peterz, oleg, Srikar Dronamraju, Ingo Molnar On Wed, 2012-08-22 at 13:57 +0530, Ananth N Mavinakayanahalli wrote: > From: Ananth N Mavinakayanahalli <ananth@in.ibm.com> > > This is the port of uprobes to powerpc. Usage is similar to x86. Hi Ananth, Excuse my ignorance of uprobes, some comments inline ... > [root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc > Added new event: > probe_libc:malloc (on 0xb4860) > > You can now use it in all perf tools, such as: > > perf record -e probe_libc:malloc -aR sleep 1 Is there a test suite for any of this? > Index: linux-tip-16aug/arch/powerpc/include/asm/uprobes.h > =================================================================== > --- /dev/null > +++ linux-tip-16aug/arch/powerpc/include/asm/uprobes.h > @@ -0,0 +1,58 @@ > +#ifndef _ASM_UPROBES_H > +#define _ASM_UPROBES_H > +/* > + * User-space Probes (UProbes) for powerpc > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. > + * > + * Copyright (C) IBM Corporation, 2007-2012 The lawyers say we shouldn't use (C). Is it really copyright IBM 2007-2012? Or is that because you copied another header? > +typedef unsigned int uprobe_opcode_t; I'd prefer u32. It would be nice if someone could consolidate this with kprobe_opcode_t. > +#define MAX_UINSN_BYTES 4 > +#define UPROBE_XOL_SLOT_BYTES (MAX_UINSN_BYTES) > + > +#define UPROBE_SWBP_INSN 0x7fe00008 This is just "trap" ? > +#define UPROBE_SWBP_INSN_SIZE 4 /* swbp insn size in bytes */ > + > +#define IS_TW(instr) (((instr) & 0xfc0007fe) == 0x7c000008) > +#define IS_TD(instr) (((instr) & 0xfc0007fe) == 0x7c000088) > +#define IS_TDI(instr) (((instr) & 0xfc000000) == 0x08000000) > +#define IS_TWI(instr) (((instr) & 0xfc000000) == 0x0c000000) > + > +#define is_trap(instr) (IS_TW(instr) || IS_TD(instr) || \ > + IS_TWI(instr) || IS_TDI(instr)) These seem to be duplicated in kprobes.h, can we consolidate them. > +struct arch_uprobe { > + u8 insn[MAX_UINSN_BYTES]; > +}; Why not uprobe_opcode_t insn ? > Index: linux-tip-16aug/arch/powerpc/kernel/signal.c > =================================================================== > --- linux-tip-16aug.orig/arch/powerpc/kernel/signal.c > +++ linux-tip-16aug/arch/powerpc/kernel/signal.c > @@ -11,6 +11,7 @@ > > #include <linux/tracehook.h> > #include <linux/signal.h> > +#include <linux/uprobes.h> > #include <linux/key.h> > #include <asm/hw_breakpoint.h> > #include <asm/uaccess.h> > @@ -157,6 +158,11 @@ static int do_signal(struct pt_regs *reg > > void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) > { > + if (thread_info_flags & _TIF_UPROBE) { > + clear_thread_flag(TIF_UPROBE); > + uprobe_notify_resume(regs); > + } Presumably this ordering is crucial, ie. uprobes before signals. > if (thread_info_flags & _TIF_SIGPENDING) > do_signal(regs); > > Index: linux-tip-16aug/arch/powerpc/kernel/uprobes.c > =================================================================== > --- /dev/null > +++ linux-tip-16aug/arch/powerpc/kernel/uprobes.c > @@ -0,0 +1,180 @@ > +/* > + * User-space Probes (UProbes) for powerpc > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. > + * > + * Copyright (C) IBM Corporation, 2007-2012 > + * > + * Adapted from the x86 port by Ananth N Mavinakayanahalli <ananth@in.ibm.com> > + */ > +#include <linux/kernel.h> > +#include <linux/sched.h> > +#include <linux/ptrace.h> > +#include <linux/uprobes.h> > +#include <linux/uaccess.h> > +#include <linux/kdebug.h> > + > +#include <asm/sstep.h> > + > +#define UPROBE_TRAP_NR UINT_MAX In the comments below you talk about -1 a few times, but you actually mean UINT_MAX. > +/** > + * arch_uprobe_analyze_insn Analyze what about the instruction? > + * @mm: the probed address space. > + * @arch_uprobe: the probepoint information. > + * @addr: vaddr to probe. > + * Return 0 on success or a -ve number on error. > + */ > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) > +{ > + unsigned int insn; > + > + if (addr & 0x03) > + return -EINVAL; > + > + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); We shouldn't need to use memcpy, we know it's a u32. > + if (is_trap(insn)) > + return -ENOTSUPP; A comment saying why we can't handle this would be nice. > + return 0; > +} I am probably missing something, but why do we need to execute out of line? > +/* > + * arch_uprobe_pre_xol - prepare to execute out of line. > + * @auprobe: the probepoint information. > + * @regs: reflects the saved user state of current task. > + */ > +int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > +{ > + struct arch_uprobe_task *autask = ¤t->utask->autask; > + > + autask->saved_trap_nr = current->thread.trap_nr; > + current->thread.trap_nr = UPROBE_TRAP_NR; > + regs->nip = current->utask->xol_vaddr; > + return 0; > +} > + > +/** > + * uprobe_get_swbp_addr - compute address of swbp given post-swbp regs > + * @regs: Reflects the saved state of the task after it has hit a breakpoint > + * instruction. > + * Return the address of the breakpoint instruction. > + */ > +unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) > +{ > + return instruction_pointer(regs); > +} This seems like it would be better in asm/uprobes.h as a static inline, but that's not your fault. > +/* > + * If xol insn itself traps and generates a signal (SIGILL/SIGSEGV/etc), > + * then detect the case where a singlestepped instruction jumps back to its > + * own address. It is assumed that anything like do_page_fault/do_trap/etc > + * sets thread.trap_nr != -1. > + * > + * arch_uprobe_pre_xol/arch_uprobe_post_xol save/restore thread.trap_nr, > + * arch_uprobe_xol_was_trapped() simply checks that ->trap_nr is not equal to > + * UPROBE_TRAP_NR == -1 set by arch_uprobe_pre_xol(). > + */ > +bool arch_uprobe_xol_was_trapped(struct task_struct *t) > +{ > + if (t->thread.trap_nr != UPROBE_TRAP_NR) > + return true; > + > + return false; > +} > + > +/* > + * Called after single-stepping. To avoid the SMP problems that can > + * occur when we temporarily put back the original opcode to > + * single-step, we single-stepped a copy of the instruction. > + * > + * This function prepares to resume execution after the single-step. > + */ > +int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > +{ > + struct uprobe_task *utask = current->utask; > + > + WARN_ON_ONCE(current->thread.trap_nr != UPROBE_TRAP_NR); > + > + current->thread.trap_nr = utask->autask.saved_trap_nr; > + > + /* > + * On powerpc, except for loads and stores, most instructions > + * including ones that alter code flow (branches, calls, returns) > + * are emulated in the kernel. We get here only if the emulation > + * support doesn't exist and have to fix-up the next instruction > + * to be executed. > + */ > + regs->nip = utask->vaddr + MAX_UINSN_BYTES; > + return 0; > +} > + > +/* callback routine for handling exceptions. */ > +int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data) > +{ > + struct die_args *args = data; > + struct pt_regs *regs = args->regs; > + > + /* We are only interested in userspace traps */ > + if (regs && !user_mode(regs)) > + return NOTIFY_DONE; Do we ever get here with a NULL regs? > + switch (val) { > + case DIE_BPT: > + if (uprobe_pre_sstep_notifier(regs)) > + return NOTIFY_STOP; > + break; > + case DIE_SSTEP: > + if (uprobe_post_sstep_notifier(regs)) > + return NOTIFY_STOP; > + default: > + break; > + } > + return NOTIFY_DONE; > +} > + > +/* > + * This function gets called when XOL instruction either gets trapped or > + * the thread has a fatal signal, so reset the instruction pointer to its > + * probed address. > + */ > +void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > +{ > + struct uprobe_task *utask = current->utask; > + > + current->thread.trap_nr = utask->autask.saved_trap_nr; > + instruction_pointer_set(regs, utask->vaddr); > +} > + > +/* > + * See if the instruction can be emulated. > + * Returns true if instruction was emulated, false otherwise. > + */ > +bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs) > +{ > + int ret; > + unsigned int insn; > + > + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); Why memcpy? > + > + /* > + * emulate_step() returns 1 if the insn was successfully emulated. > + * For all other cases, we need to single-step in hardware. > + */ > + ret = emulate_step(regs, insn); > + if (ret > 0) > + return true; This actually emulates the instruction, ie. the contents of regs are changed based on the instruction. That seems to differ vs x86, where arch_uprobe_skip_sstep() just checks the instruction and returns true/false. Is that because on x86 they are only returning true for nops? ie. there is no emulation to be done? It's a little surprising that can_skip_sstep() actually emulates the instruction, but again that's not your fault. cheers ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 4:28 ` Michael Ellerman @ 2012-08-23 5:32 ` Srikar Dronamraju 2012-08-23 10:06 ` Benjamin Herrenschmidt 2012-08-24 1:33 ` Michael Ellerman 2012-08-23 5:58 ` Ananth N Mavinakayanahalli 1 sibling, 2 replies; 15+ messages in thread From: Srikar Dronamraju @ 2012-08-23 5:32 UTC (permalink / raw) To: Michael Ellerman Cc: ananth, ppcdev, lkml, benh, Paul Mackerras, Anton Blanchard, peterz, oleg, Ingo Molnar > > These seem to be duplicated in kprobes.h, can we consolidate them. > > > +struct arch_uprobe { > > + u8 insn[MAX_UINSN_BYTES]; > > +}; > > Why not uprobe_opcode_t insn ? > insn is updated/accessed in the arch independent code. Size of uprobe_opcode_t could be different for different archs. uprobe_opcode_t represents the size of the smallest breakpoint instruction for an arch. Hence u8 works out the best. I know we could still use uprobe_opcode_t and achieve the same. In which case, we would have to interpret MAX_UINSN_BYTES differently. Do you see any advantages of using uprobe_opcode_t instead of u8 across archs? > > > void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) > > { > > + if (thread_info_flags & _TIF_UPROBE) { > > + clear_thread_flag(TIF_UPROBE); > > + uprobe_notify_resume(regs); > > + } > > Presumably this ordering is crucial, ie. uprobes before signals. > Yes, we would want uprobes to have a say before do_signal can take a look. > > > I am probably missing something, but why do we need to execute out of > line? > The other alternative to execute out of line is the current inline singlestepping. In inline singlestepping, we would have to guard other tasks from executing the same instruction. Executing out of line solves this problem. > > +/* > > + * arch_uprobe_pre_xol - prepare to execute out of line. > > + * @auprobe: the probepoint information. > > + * @regs: reflects the saved user state of current task. > > + */ > > +int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > > +{ > > + struct arch_uprobe_task *autask = ¤t->utask->autask; > > + > > + autask->saved_trap_nr = current->thread.trap_nr; > > + current->thread.trap_nr = UPROBE_TRAP_NR; > > + regs->nip = current->utask->xol_vaddr; > > + return 0; > > +} > > + > > +/** > > + * uprobe_get_swbp_addr - compute address of swbp given post-swbp regs > > + * @regs: Reflects the saved state of the task after it has hit a breakpoint > > + * instruction. > > + * Return the address of the breakpoint instruction. > > + */ > > +unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) > > +{ > > + return instruction_pointer(regs); > > +} > > This seems like it would be better in asm/uprobes.h as a static inline, > but that's not your fault. > We want archs to override uprobe_get_swbp_addr() if the default uprobe_get_swbp_addr() doesnt suite for a particular arch. Do you have better ways to achieve this. Initially we were using arch specific callbacks from which we moved to using weak functions based on reviews. > > + /* > > + * emulate_step() returns 1 if the insn was successfully emulated. > > + * For all other cases, we need to single-step in hardware. > > + */ > > + ret = emulate_step(regs, insn); > > + if (ret > 0) > > + return true; > > This actually emulates the instruction, ie. the contents of regs are > changed based on the instruction. > > That seems to differ vs x86, where arch_uprobe_skip_sstep() just checks > the instruction and returns true/false. Is that because on x86 they are > only returning true for nops? ie. there is no emulation to be done? > Yes, In x86, we currently support skip for nop instructions, Once we add code for skipping other instructions in x86, we could enhance them to skip them too. > It's a little surprising that can_skip_sstep() actually emulates the > instruction, but again that's not your fault. > Are you saying its doing more than what the name suggests or to the fact that can_skip_step should ideally be called just once? -- Thanks and Regards Srikar ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 5:32 ` Srikar Dronamraju @ 2012-08-23 10:06 ` Benjamin Herrenschmidt 2012-08-23 9:02 ` Oleg Nesterov 2012-08-23 16:17 ` Srikar Dronamraju 2012-08-24 1:33 ` Michael Ellerman 1 sibling, 2 replies; 15+ messages in thread From: Benjamin Herrenschmidt @ 2012-08-23 10:06 UTC (permalink / raw) To: Srikar Dronamraju Cc: Michael Ellerman, ananth, ppcdev, lkml, Paul Mackerras, Anton Blanchard, peterz, oleg, Ingo Molnar On Thu, 2012-08-23 at 11:02 +0530, Srikar Dronamraju wrote: > > > > insn is updated/accessed in the arch independent code. Size of > uprobe_opcode_t could be different for different archs. > uprobe_opcode_t > represents the size of the smallest breakpoint instruction for an > arch. > > Hence u8 works out the best. I know we could still use uprobe_opcode_t > and achieve the same. In which case, we would have to interpret > MAX_UINSN_BYTES differently. Do you see any advantages of using > uprobe_opcode_t instead of u8 across archs? But don't you actively rely on the fact that on powerpc, unlike x86, you -can- atomically replace an instruction with a single 32-bit store ? If you don't you should consider it, and that makes defining this as a u8 array non-sensical (as is using memcpy) Ben. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 10:06 ` Benjamin Herrenschmidt @ 2012-08-23 9:02 ` Oleg Nesterov 2012-08-23 16:02 ` Srikar Dronamraju 2012-08-23 16:17 ` Srikar Dronamraju 1 sibling, 1 reply; 15+ messages in thread From: Oleg Nesterov @ 2012-08-23 9:02 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Srikar Dronamraju, Michael Ellerman, ananth, ppcdev, lkml, Paul Mackerras, Anton Blanchard, peterz, Ingo Molnar On 08/23, Benjamin Herrenschmidt wrote: > > On Thu, 2012-08-23 at 11:02 +0530, Srikar Dronamraju wrote: > > > > > > > insn is updated/accessed in the arch independent code. Size of > > uprobe_opcode_t could be different for different archs. > > uprobe_opcode_t > > represents the size of the smallest breakpoint instruction for an > > arch. > > > > Hence u8 works out the best. I know we could still use uprobe_opcode_t > > and achieve the same. In which case, we would have to interpret > > MAX_UINSN_BYTES differently. Do you see any advantages of using > > uprobe_opcode_t instead of u8 across archs? > > But don't you actively rely on the fact that on powerpc, unlike x86, you > -can- atomically replace an instruction with a single 32-bit store ? I must have missed something... But powerpc does not replace an instruction, the arch independent code does this and it assumes that uprobe->arch.insn is u8[MAX_UINSN_BYTES]. Perhaps you meant that on powerpc it is "safe" to replace the insn even if this can race with some CPU executing this code? But uprobes has to replace the original page anyway, we should not write to ->vm_file. I agree that memcpy() in arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() looks a bit strange. May be powerpc can do struct arch_uprobe { union { u8 insn[MAX_UINSN_BYTES]; u32 ainsn; }; }; and use auprobe->ainsn directly, I dunno. But probably I misunderstood you. Oleg. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 9:02 ` Oleg Nesterov @ 2012-08-23 16:02 ` Srikar Dronamraju 0 siblings, 0 replies; 15+ messages in thread From: Srikar Dronamraju @ 2012-08-23 16:02 UTC (permalink / raw) To: Oleg Nesterov Cc: Benjamin Herrenschmidt, Michael Ellerman, ananth, ppcdev, lkml, Paul Mackerras, Anton Blanchard, peterz, Ingo Molnar * Oleg Nesterov <oleg@redhat.com> [2012-08-23 11:02:09]: > On 08/23, Benjamin Herrenschmidt wrote: > > > > On Thu, 2012-08-23 at 11:02 +0530, Srikar Dronamraju wrote: > > > > > > > > > > insn is updated/accessed in the arch independent code. Size of > > > uprobe_opcode_t could be different for different archs. > > > uprobe_opcode_t > > > represents the size of the smallest breakpoint instruction for an > > > arch. > > > > > > Hence u8 works out the best. I know we could still use uprobe_opcode_t > > > and achieve the same. In which case, we would have to interpret > > > MAX_UINSN_BYTES differently. Do you see any advantages of using > > > uprobe_opcode_t instead of u8 across archs? > > > > But don't you actively rely on the fact that on powerpc, unlike x86, you > > -can- atomically replace an instruction with a single 32-bit store ? > > I must have missed something... > > But powerpc does not replace an instruction, the arch independent code > does this and it assumes that uprobe->arch.insn is u8[MAX_UINSN_BYTES]. > > Perhaps you meant that on powerpc it is "safe" to replace the insn > even if this can race with some CPU executing this code? But uprobes > has to replace the original page anyway, we should not write to > ->vm_file. I think Ben is referring to the fact that because we use an array we endup using memcpy to copy the original instruction from the ->vm_file. > > I agree that memcpy() in arch_uprobe_analyze_insn() and > arch_uprobe_skip_sstep() looks a bit strange. May be powerpc can do > > struct arch_uprobe { > union { > u8 insn[MAX_UINSN_BYTES]; > u32 ainsn; > }; > }; > > and use auprobe->ainsn directly, I dunno. I think this should work. Ben would this suffice? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 10:06 ` Benjamin Herrenschmidt 2012-08-23 9:02 ` Oleg Nesterov @ 2012-08-23 16:17 ` Srikar Dronamraju 2012-08-23 21:57 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 15+ messages in thread From: Srikar Dronamraju @ 2012-08-23 16:17 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Michael Ellerman, ananth, ppcdev, lkml, Paul Mackerras, Anton Blanchard, peterz, oleg, Ingo Molnar * Benjamin Herrenschmidt <benh@kernel.crashing.org> [2012-08-23 20:06:18]: > On Thu, 2012-08-23 at 11:02 +0530, Srikar Dronamraju wrote: > > > > > > > insn is updated/accessed in the arch independent code. Size of > > uprobe_opcode_t could be different for different archs. > > uprobe_opcode_t > > represents the size of the smallest breakpoint instruction for an > > arch. > > > > Hence u8 works out the best. I know we could still use uprobe_opcode_t > > and achieve the same. In which case, we would have to interpret > > MAX_UINSN_BYTES differently. Do you see any advantages of using > > uprobe_opcode_t instead of u8 across archs? > > But don't you actively rely on the fact that on powerpc, unlike x86, you > -can- atomically replace an instruction with a single 32-bit store ? > We are not doing a replace here, we are only copying from the ->vm_file for the largest size instruction possible for that instruction. For powerpc, this is easy because of fixed size instructions. On other archs, at this point, we dont even know the length of the underlying instruction. Now there are 3 ways to handle this: 1. use arch independent copy_insn() (current.) (handles if the instruction spreads across multiple pages on non fixed instruction archs). 2. make the copy_insn() arch specific, that would mean every arch will have to do read_mapping_page etc. 3. have a arch specific hook in arch independent copy_insn code that either does a memcpy for non fixed instruction archs or does an assignment in archs like powerpc. I think you are suggesting option 3. But instead of adding another call that does the arch specific stuff, we are probably be better of doing a memcpy. Right? For all powerpc references to insn we could refer to it as u32 as suggested by Oleg. -- Thanks and Regards Srikar > If you don't you should consider it, and that makes defining this as a > u8 array non-sensical (as is using memcpy) > > Ben. > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 16:17 ` Srikar Dronamraju @ 2012-08-23 21:57 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 15+ messages in thread From: Benjamin Herrenschmidt @ 2012-08-23 21:57 UTC (permalink / raw) To: Srikar Dronamraju Cc: Michael Ellerman, ananth, ppcdev, lkml, Paul Mackerras, Anton Blanchard, peterz, oleg, Ingo Molnar On Thu, 2012-08-23 at 21:47 +0530, Srikar Dronamraju wrote: > * Benjamin Herrenschmidt <benh@kernel.crashing.org> [2012-08-23 20:06:18]: > > > On Thu, 2012-08-23 at 11:02 +0530, Srikar Dronamraju wrote: > > > > > > > > > > insn is updated/accessed in the arch independent code. Size of > > > uprobe_opcode_t could be different for different archs. > > > uprobe_opcode_t > > > represents the size of the smallest breakpoint instruction for an > > > arch. > > > > > > Hence u8 works out the best. I know we could still use uprobe_opcode_t > > > and achieve the same. In which case, we would have to interpret > > > MAX_UINSN_BYTES differently. Do you see any advantages of using > > > uprobe_opcode_t instead of u8 across archs? > > > > But don't you actively rely on the fact that on powerpc, unlike x86, you > > -can- atomically replace an instruction with a single 32-bit store ? > > > > We are not doing a replace here, we are only copying from the ->vm_file > for the largest size instruction possible for that instruction. For > powerpc, this is easy because of fixed size instructions. > > On other archs, at this point, we dont even know the length of the > underlying instruction. > > Now there are 3 ways to handle this: > 1. use arch independent copy_insn() (current.) (handles if the > instruction spreads across multiple pages on non fixed instruction > archs). > > 2. make the copy_insn() arch specific, that would mean every arch will > have to do read_mapping_page etc. > > 3. have a arch specific hook in arch independent copy_insn code that > either does a memcpy for non fixed instruction archs or does an > assignment in archs like powerpc. > > I think you are suggesting option 3. > But instead of adding another call that does the arch specific stuff, we > are probably be better of doing a memcpy. Right? > > For all powerpc references to insn we could refer to it as u32 as > suggested by Oleg. Ok, doens't matter much either way, it's just odd and inefficient. Cheers, Ben. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 5:32 ` Srikar Dronamraju 2012-08-23 10:06 ` Benjamin Herrenschmidt @ 2012-08-24 1:33 ` Michael Ellerman 1 sibling, 0 replies; 15+ messages in thread From: Michael Ellerman @ 2012-08-24 1:33 UTC (permalink / raw) To: Srikar Dronamraju Cc: ananth, ppcdev, lkml, benh, Paul Mackerras, Anton Blanchard, peterz, oleg, Ingo Molnar On Thu, 2012-08-23 at 11:02 +0530, Srikar Dronamraju wrote: > > > > These seem to be duplicated in kprobes.h, can we consolidate them. > > > > > +struct arch_uprobe { > > > + u8 insn[MAX_UINSN_BYTES]; > > > +}; > > > > Why not uprobe_opcode_t insn ? > > > > insn is updated/accessed in the arch independent code. Size of > uprobe_opcode_t could be different for different archs. uprobe_opcode_t > represents the size of the smallest breakpoint instruction for an arch. > > Hence u8 works out the best. I know we could still use uprobe_opcode_t > and achieve the same. In which case, we would have to interpret > MAX_UINSN_BYTES differently. Do you see any advantages of using > uprobe_opcode_t instead of u8 across archs? It's not a big deal, just a bit of a nit. It just feels a bit messy to have uprobe_opcode_t and also expect the arch code to use an array. It seems like if we need uprobe_opcode_t then the generic code should expect to use that everywhere. Or, we should get ride of uprobe_opcode_t and use u8 (or void *). > > > void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) > > > { > > > + if (thread_info_flags & _TIF_UPROBE) { > > > + clear_thread_flag(TIF_UPROBE); > > > + uprobe_notify_resume(regs); > > > + } > > > > Presumably this ordering is crucial, ie. uprobes before signals. > > > > Yes, we would want uprobes to have a say before do_signal can take a > look. A comment would be good IMHO, so in 10 years we don't forget. > > I am probably missing something, but why do we need to execute out of > > line? > > The other alternative to execute out of line is the current inline > singlestepping. In inline singlestepping, we would have to guard other > tasks from executing the same instruction. > > Executing out of line solves this problem. Yeah ignore that, I was forgetting that the original trap is caused by a patched instruction. > > > +/** > > > + * uprobe_get_swbp_addr - compute address of swbp given post-swbp regs > > > + * @regs: Reflects the saved state of the task after it has hit a breakpoint > > > + * instruction. > > > + * Return the address of the breakpoint instruction. > > > + */ > > > +unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) > > > +{ > > > + return instruction_pointer(regs); > > > +} > > > > This seems like it would be better in asm/uprobes.h as a static inline, > > but that's not your fault. > > > > We want archs to override uprobe_get_swbp_addr() if the default > uprobe_get_swbp_addr() doesnt suite for a particular arch. Do you have > better ways to achieve this. Initially we were using arch specific > callbacks from which we moved to using weak functions based on reviews. OK, having a default that is weak makes sense if there is one behaviour that is common across arches, but a few arches need to override it. A static inline in the arch header is simpler if the behaviour is actually different across all arches. The current default implements the x86 behaviour, where the instruction pointer has moved past the breakpoint instruction. What does ARM do? ie. is this behaviour actually the right default? > > > + /* > > > + * emulate_step() returns 1 if the insn was successfully emulated. > > > + * For all other cases, we need to single-step in hardware. > > > + */ > > > + ret = emulate_step(regs, insn); > > > + if (ret > 0) > > > + return true; > > > > This actually emulates the instruction, ie. the contents of regs are > > changed based on the instruction. > > > > That seems to differ vs x86, where arch_uprobe_skip_sstep() just checks > > the instruction and returns true/false. Is that because on x86 they are > > only returning true for nops? ie. there is no emulation to be done? > > > > Yes, In x86, we currently support skip for nop instructions, Once we add > code for skipping other instructions in x86, we could enhance them to > skip them too. (and emulating them right?) > > It's a little surprising that can_skip_sstep() actually emulates the > > instruction, but again that's not your fault. > > Are you saying its doing more than what the name suggests or to the fact > that can_skip_step should ideally be called just once? Yeah just that the name suggests it is just a "checker" function, but it actually emulates the instruction and changes the task state! cheers ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 4:28 ` Michael Ellerman 2012-08-23 5:32 ` Srikar Dronamraju @ 2012-08-23 5:58 ` Ananth N Mavinakayanahalli 2012-08-24 1:13 ` Michael Ellerman 1 sibling, 1 reply; 15+ messages in thread From: Ananth N Mavinakayanahalli @ 2012-08-23 5:58 UTC (permalink / raw) To: Michael Ellerman Cc: ppcdev, lkml, benh, Paul Mackerras, Anton Blanchard, peterz, oleg, Srikar Dronamraju, Ingo Molnar On Thu, Aug 23, 2012 at 02:28:20PM +1000, Michael Ellerman wrote: > On Wed, 2012-08-22 at 13:57 +0530, Ananth N Mavinakayanahalli wrote: > > From: Ananth N Mavinakayanahalli <ananth@in.ibm.com> > > > > This is the port of uprobes to powerpc. Usage is similar to x86. > > Hi Ananth, > > Excuse my ignorance of uprobes, some comments inline ... Thanks for the review Michael! > > [root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc > > Added new event: > > probe_libc:malloc (on 0xb4860) > > > > You can now use it in all perf tools, such as: > > > > perf record -e probe_libc:malloc -aR sleep 1 > > Is there a test suite for any of this? We don't have a formal testsuite yet, but the usual way of testing it is to run kernbench while registering/unregistering a bunch of probes periodically. ... > > + * You should have received a copy of the GNU General Public License > > + * along with this program; if not, write to the Free Software > > + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. > > + * > > + * Copyright (C) IBM Corporation, 2007-2012 > > The lawyers say we shouldn't use (C). Will remove that. > Is it really copyright IBM 2007-2012? Or is that because you copied > another header? The later. This is adapted from the x86 version. > > +typedef unsigned int uprobe_opcode_t; > > I'd prefer u32. OK. Will change. > It would be nice if someone could consolidate this with kprobe_opcode_t. Thats on the TODO after the uprobes code stabilizes further. I am wondering which file would be appropriate? We could either consolidate a bunch of these into asm/kdebug.h or asm/ptrace.h. Any preference/suggestion? > > +#define MAX_UINSN_BYTES 4 > > +#define UPROBE_XOL_SLOT_BYTES (MAX_UINSN_BYTES) > > + > > +#define UPROBE_SWBP_INSN 0x7fe00008 > > This is just "trap" ? Yes. But since its referred to in arch agnostic code too, we'd have to alias it thus. > > +#define UPROBE_SWBP_INSN_SIZE 4 /* swbp insn size in bytes */ > > + > > +#define IS_TW(instr) (((instr) & 0xfc0007fe) == 0x7c000008) > > +#define IS_TD(instr) (((instr) & 0xfc0007fe) == 0x7c000088) > > +#define IS_TDI(instr) (((instr) & 0xfc000000) == 0x08000000) > > +#define IS_TWI(instr) (((instr) & 0xfc000000) == 0x0c000000) > > + > > +#define is_trap(instr) (IS_TW(instr) || IS_TD(instr) || \ > > + IS_TWI(instr) || IS_TDI(instr)) > > These seem to be duplicated in kprobes.h, can we consolidate them. Yes, similar to the opcode_t types above. > > +struct arch_uprobe { > > + u8 insn[MAX_UINSN_BYTES]; > > +}; > > Why not uprobe_opcode_t insn ? I had a similar discussion with Srikar while doing the port, but he has reasons for this... ... > > void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) > > { > > + if (thread_info_flags & _TIF_UPROBE) { > > + clear_thread_flag(TIF_UPROBE); > > + uprobe_notify_resume(regs); > > + } > > Presumably this ordering is crucial, ie. uprobes before signals. Correct! ... > > +#define UPROBE_TRAP_NR UINT_MAX > > In the comments below you talk about -1 a few times, but you actually > mean UINT_MAX. Correct. I will fix those references. > > +/** > > + * arch_uprobe_analyze_insn > > Analyze what about the instruction? Depends on the architecture. On x86, we need to verify if the address is at an instruction boundary, and if the instruction can be probed at all. On powerpc, we have an easier time. We just validate the address is aligned at instruction boundary and flag if the instruction at the address is a trap variant. > > + * @mm: the probed address space. > > + * @arch_uprobe: the probepoint information. > > + * @addr: vaddr to probe. > > + * Return 0 on success or a -ve number on error. > > + */ > > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) > > +{ > > + unsigned int insn; > > + > > + if (addr & 0x03) > > + return -EINVAL; > > + > > + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); > > We shouldn't need to use memcpy, we know it's a u32. OK. Right now, its u8 insn[4], so I did this to be 'correct'. But I agree we can just do an assignment. > > + if (is_trap(insn)) > > + return -ENOTSUPP; > > A comment saying why we can't handle this would be nice. Will add. > > + return 0; > > +} > > > I am probably missing something, but why do we need to execute out of > line? I think Srikar answered that. ... > > +/* callback routine for handling exceptions. */ > > +int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data) > > +{ > > + struct die_args *args = data; > > + struct pt_regs *regs = args->regs; > > + > > + /* We are only interested in userspace traps */ > > + if (regs && !user_mode(regs)) > > + return NOTIFY_DONE; > > Do we ever get here with a NULL regs? I don't think so. Its just a paranoid check. Do you prefer it to be removed? ... > > + * See if the instruction can be emulated. > > + * Returns true if instruction was emulated, false otherwise. > > + */ > > +bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs) > > +{ > > + int ret; > > + unsigned int insn; > > + > > + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); > > Why memcpy? Same as above - u8 insn[4]. > > + > > + /* > > + * emulate_step() returns 1 if the insn was successfully emulated. > > + * For all other cases, we need to single-step in hardware. > > + */ > > + ret = emulate_step(regs, insn); > > + if (ret > 0) > > + return true; > > This actually emulates the instruction, ie. the contents of regs are > changed based on the instruction. > > That seems to differ vs x86, where arch_uprobe_skip_sstep() just checks > the instruction and returns true/false. Is that because on x86 they are > only returning true for nops? ie. there is no emulation to be done? > > It's a little surprising that can_skip_sstep() actually emulates the > instruction, but again that's not your fault. We initially had a uprobe_emulate_step() callback which over many review cycles has morphed into this. Ananth ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-23 5:58 ` Ananth N Mavinakayanahalli @ 2012-08-24 1:13 ` Michael Ellerman 2012-08-24 7:07 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 15+ messages in thread From: Michael Ellerman @ 2012-08-24 1:13 UTC (permalink / raw) To: ananth Cc: ppcdev, lkml, benh, Paul Mackerras, Anton Blanchard, peterz, oleg, Srikar Dronamraju, Ingo Molnar On Thu, 2012-08-23 at 11:28 +0530, Ananth N Mavinakayanahalli wrote: > On Thu, Aug 23, 2012 at 02:28:20PM +1000, Michael Ellerman wrote: > > On Wed, 2012-08-22 at 13:57 +0530, Ananth N Mavinakayanahalli wrote: > > > From: Ananth N Mavinakayanahalli <ananth@in.ibm.com> > > > > > > This is the port of uprobes to powerpc. Usage is similar to x86. > > > > Hi Ananth, > > > > Excuse my ignorance of uprobes, some comments inline ... > > Thanks for the review Michael! > > > > [root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc > > > Added new event: > > > probe_libc:malloc (on 0xb4860) > > > > > > You can now use it in all perf tools, such as: > > > > > > perf record -e probe_libc:malloc -aR sleep 1 > > > > Is there a test suite for any of this? > > We don't have a formal testsuite yet, but the usual way of testing it is > to run kernbench while registering/unregistering a bunch of probes > periodically. OK. Someone should put that on their TODO list, otherwise in a year or two it'll be broken and we won't notice :) > > It would be nice if someone could consolidate this with kprobe_opcode_t. > > Thats on the TODO after the uprobes code stabilizes further. > > I am wondering which file would be appropriate? We could either > consolidate a bunch of these into asm/kdebug.h or asm/ptrace.h. Any > preference/suggestion? Add a new one :) > > > +#define MAX_UINSN_BYTES 4 > > > +#define UPROBE_XOL_SLOT_BYTES (MAX_UINSN_BYTES) > > > + > > > +#define UPROBE_SWBP_INSN 0x7fe00008 > > > > This is just "trap" ? > > Yes. But since its referred to in arch agnostic code too, we'd have to alias > it thus. Yep I was just checking, I think it's probably worth a comment. > > > +#define UPROBE_SWBP_INSN_SIZE 4 /* swbp insn size in bytes */ > > > + > > > +#define IS_TW(instr) (((instr) & 0xfc0007fe) == 0x7c000008) > > > +#define IS_TD(instr) (((instr) & 0xfc0007fe) == 0x7c000088) > > > +#define IS_TDI(instr) (((instr) & 0xfc000000) == 0x08000000) > > > +#define IS_TWI(instr) (((instr) & 0xfc000000) == 0x0c000000) > > > + > > > +#define is_trap(instr) (IS_TW(instr) || IS_TD(instr) || \ > > > + IS_TWI(instr) || IS_TDI(instr)) > > > > These seem to be duplicated in kprobes.h, can we consolidate them. > > Yes, similar to the opcode_t types above. Hmm, OK. Any reason you can't include kprobes.h to get those? > > > +struct arch_uprobe { > > > + u8 insn[MAX_UINSN_BYTES]; > > > +}; > > > > Why not uprobe_opcode_t insn ? > > I had a similar discussion with Srikar while doing the port, but he has > reasons for this... OK, will argue with him :D > > > +/** > > > + * arch_uprobe_analyze_insn > > > > Analyze what about the instruction? > + /* > Depends on the architecture. On x86, we need to verify if the address is > at an instruction boundary, and if the instruction can be probed at all. > On powerpc, we have an easier time. We just validate the address is > aligned at instruction boundary and flag if the instruction at the > address is a trap variant. + */ :) > > > + * @mm: the probed address space. > > > + * @arch_uprobe: the probepoint information. > > > + * @addr: vaddr to probe. > > > + * Return 0 on success or a -ve number on error. > > > + */ > > > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) > > > +{ > > > + unsigned int insn; > > > + > > > + if (addr & 0x03) > > > + return -EINVAL; > > > + > > > + memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES); > > > > We shouldn't need to use memcpy, we know it's a u32. > > OK. Right now, its u8 insn[4], so I did this to be 'correct'. But I > agree we can just do an assignment. Yeah, at least in the arch code I think we should just use u32 and assign directly. > > > +/* callback routine for handling exceptions. */ > > > +int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data) > > > +{ > > > + struct die_args *args = data; > > > + struct pt_regs *regs = args->regs; > > > + > > > + /* We are only interested in userspace traps */ > > > + if (regs && !user_mode(regs)) > > > + return NOTIFY_DONE; > > > > Do we ever get here with a NULL regs? > > I don't think so. Its just a paranoid check. Do you prefer it to be > removed? Yeah. A NULL regs here is a kernel bug, so I think it's actually preferable to crash than silently return. cheers ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-24 1:13 ` Michael Ellerman @ 2012-08-24 7:07 ` Benjamin Herrenschmidt 2012-08-24 7:37 ` Ananth N Mavinakayanahalli 0 siblings, 1 reply; 15+ messages in thread From: Benjamin Herrenschmidt @ 2012-08-24 7:07 UTC (permalink / raw) To: Michael Ellerman Cc: ananth, ppcdev, lkml, Paul Mackerras, Anton Blanchard, peterz, oleg, Srikar Dronamraju, Ingo Molnar On Fri, 2012-08-24 at 11:13 +1000, Michael Ellerman wrote: > > Yeah. A NULL regs here is a kernel bug, so I think it's actually > preferable to crash than silently return. Or best, if you think there's a remote chance that the bug might hit: if (WARN(!regs)) return Cheers, Ben. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/2] powerpc: Uprobes port to powerpc 2012-08-24 7:07 ` Benjamin Herrenschmidt @ 2012-08-24 7:37 ` Ananth N Mavinakayanahalli 0 siblings, 0 replies; 15+ messages in thread From: Ananth N Mavinakayanahalli @ 2012-08-24 7:37 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Michael Ellerman, ppcdev, lkml, Paul Mackerras, Anton Blanchard, peterz, oleg, Srikar Dronamraju, Ingo Molnar On Fri, Aug 24, 2012 at 05:07:31PM +1000, Benjamin Herrenschmidt wrote: > On Fri, 2012-08-24 at 11:13 +1000, Michael Ellerman wrote: > > > > Yeah. A NULL regs here is a kernel bug, so I think it's actually > > preferable to crash than silently return. > > Or best, if you think there's a remote chance that the bug might hit: > > if (WARN(!regs)) > return Incorporated this and other suggestions from Michael in v5 I just posted. Ananth ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-08-24 7:37 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-08-22 8:22 [PATCH 1/2] powerpc: Add trap_nr to thread_struct Ananth N Mavinakayanahalli 2012-08-22 8:27 ` [PATCH v4 2/2] powerpc: Uprobes port to powerpc Ananth N Mavinakayanahalli 2012-08-22 15:55 ` Oleg Nesterov 2012-08-23 4:28 ` Michael Ellerman 2012-08-23 5:32 ` Srikar Dronamraju 2012-08-23 10:06 ` Benjamin Herrenschmidt 2012-08-23 9:02 ` Oleg Nesterov 2012-08-23 16:02 ` Srikar Dronamraju 2012-08-23 16:17 ` Srikar Dronamraju 2012-08-23 21:57 ` Benjamin Herrenschmidt 2012-08-24 1:33 ` Michael Ellerman 2012-08-23 5:58 ` Ananth N Mavinakayanahalli 2012-08-24 1:13 ` Michael Ellerman 2012-08-24 7:07 ` Benjamin Herrenschmidt 2012-08-24 7:37 ` Ananth N Mavinakayanahalli
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).