Subject: Re: [PATCH 5/5][RFC] kprobes: Use ftrace hooks when probing ftrace nops
From: Steven Rostedt
To: Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker,
	Arnaldo Carvalho de Melo, Jason Baron, yrl.pp-manager.tt@hitachi.com
In-Reply-To: <4E4387A7.7040200@hitachi.com>
References: <20110810162222.017387055@goodmis.org>
	<20110810163039.000615163@goodmis.org>
	<4E4387A7.7040200@hitachi.com>
Date: Thu, 11 Aug 2011 09:22:36 -0400
Message-ID: <1313068956.18583.309.camel@gandalf.stny.rr.com>

On Thu, 2011-08-11 at 16:41 +0900, Masami Hiramatsu wrote:
> Hi Steven,
> 
> As I suggested in another reply, I'm now attracted by the idea
> of using ftrace in kprobe-tracer, because of simplicity.
> Anyway, here is my review.

It may be a little simpler, but it defeats the purpose of what I'm
trying to accomplish. I do want the ability to expand this to do other
things than just trace_kprobe.

> 
> (2011/08/11 1:22), Steven Rostedt wrote:
> > From: Steven Rostedt
> >
> > Currently, if a probe is set at an ftrace nop, the probe will fail.
> >
> > When -mfentry is used by ftrace (in the near future), the nop will be
> > at the beginning of the function. This will prevent kprobes from hooking
> > to the most common place that kprobes are attached.
> >
> > Now that ftrace supports individual users as well as pt_regs passing,
> > kprobes can use the ftrace infrastructure when a probe is placed on
> > an ftrace nop.
> 
> My one general concern is the timing of enabling an ftrace-based probe.
> A breakpoint-based kprobe (including an optimized kprobe) is enabled
> right after registration. Users might expect that.
> And AFAIK, dynamic ftrace's handler will be enabled (activated)
> after a while, because it has to wait for an NMI, doesn't it?

There's no NMI needed. Not sure where you got that idea. When
register_ftrace_function() is called, it will start tracing
immediately. Note, stop_machine() is called, but we could fix that in
the future too.

> 
> And theoretically, this ftrace-based probe can't support jprobes,
> because it changes the IP address. Nowadays, this may be a minor
> problem (because someone who wants to trace function args can
> use kprobe-tracer), but it still exists. I want to fix that too. :)

We don't need to worry about that until -mfentry exists. But once that
does, then, yes, I want the ftrace version working for jprobes too.
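Just to make the model concrete, here's roughly what a single ftrace
user looks like with the ftrace_ops callback signature and the
SAVE_REGS flag proposed earlier in this series (untested sketch;
"do_fork" is just an arbitrary example target, not something this
patch hard codes):

#include <linux/ftrace.h>
#include <linux/string.h>

static void my_callback(unsigned long ip, unsigned long parent_ip,
			struct ftrace_ops *op, struct pt_regs *regs)
{
	/* Called at function entry for every function in the filter */
}

static struct ftrace_ops my_ops = {
	.func	= my_callback,
	.flags	= FTRACE_OPS_FL_SAVE_REGS,
};

static int __init my_probe_init(void)
{
	/* Limit the ops to one function, then turn it on */
	ftrace_set_filter(&my_ops, "do_fork", strlen("do_fork"), 1);

	/* Callbacks are live as soon as this returns */
	return register_ftrace_function(&my_ops);
}

Nothing waits on an NMI. By the time register_ftrace_function()
returns, the nop has been patched to a call and the callback is being
hit.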
> > 
> > Signed-off-by: Steven Rostedt
> > ---
> >  include/linux/kprobes.h |    6 ++
> >  kernel/kprobes.c        |  163 ++++++++++++++++++++++++++++++++++++++++++++---
> >  2 files changed, 160 insertions(+), 9 deletions(-)
> > 
> > diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> > index dd7c12e..5742071 100644
> > --- a/include/linux/kprobes.h
> > +++ b/include/linux/kprobes.h
> > @@ -37,6 +37,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> > 
> >  #ifdef CONFIG_KPROBES
> >  #include
> > @@ -112,6 +113,11 @@ struct kprobe {
> >  	/* copy of the original instruction */
> >  	struct arch_specific_insn ainsn;
> > 
> > +#ifdef CONFIG_FUNCTION_TRACER
> > +	/* If it is possible to use ftrace to probe */
> > +	struct ftrace_ops fops;
> > +#endif
> > +
> >  	/*
> >  	 * Indicates various status flags.
> >  	 * Protected by kprobe_mutex after this kprobe is registered.
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index e6c25eb..2160768 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -102,6 +102,31 @@ static struct kprobe_blackpoint kprobe_blacklist[] = {
> >  	{NULL}    /* Terminator */
> >  };
> > 
> > +#ifdef CONFIG_FUNCTION_TRACER
> > +/* kprobe stub function for use when probe is not connected to ftrace */
> > +static void
> > +kprobe_ftrace_stub(unsigned long ip, unsigned long pip,
> > +		   struct ftrace_ops *op, struct pt_regs *pt_regs)
> > +{
> > +}
> > +
> > +static bool ftrace_kprobe(struct kprobe *p)
> > +{
> > +	return p->fops.func && p->fops.func != kprobe_ftrace_stub;
> > +}
> > +
> > +static void init_non_ftrace_kprobe(struct kprobe *p)
> > +{
> > +	p->fops.func = kprobe_ftrace_stub;
> > +}
> > +
> > +#else
> > +static bool ftrace_kprobe(struct kprobe *p)
> > +{
> > +	return false;
> > +}
> > +#endif
> > +
> >  #ifdef __ARCH_WANT_KPROBES_INSN_SLOT
> >  /*
> >   * kprobe->ainsn.insn points to the copy of the instruction to be
> > @@ -396,6 +421,9 @@ static __kprobes void free_aggr_kprobe(struct kprobe *p)
> >  {
> >  	struct optimized_kprobe *op;
> > 
> > +	if (ftrace_kprobe(p))
> > +		return;
> > +
> >  	op = container_of(p, struct optimized_kprobe, kp);
> >  	arch_remove_optimized_kprobe(op);
> >  	arch_remove_kprobe(p);
> > @@ -759,6 +787,9 @@ static __kprobes void try_to_optimize_kprobe(struct kprobe *p)
> >  	struct kprobe *ap;
> >  	struct optimized_kprobe *op;
> > 
> > +	if (ftrace_kprobe(p))
> > +		return;
> > +
> >  	ap = alloc_aggr_kprobe(p);
> >  	if (!ap)
> >  		return;
> > @@ -849,6 +880,10 @@ static void __kprobes __arm_kprobe(struct kprobe *p)
> >  {
> >  	struct kprobe *_p;
> > 
> > +	/* Only arm non-ftrace probes */
> > +	if (ftrace_kprobe(p))
> > +		return;
> > +
> >  	/* Check collision with other optimized kprobes */
> >  	_p = get_optimized_kprobe((unsigned long)p->addr);
> >  	if (unlikely(_p))
> > @@ -864,6 +899,10 @@ static void __kprobes __disarm_kprobe(struct kprobe *p, bool reopt)
> >  {
> >  	struct kprobe *_p;
> > 
> > +	/* Only disarm non-ftrace probes */
> > +	if (ftrace_kprobe(p))
> > +		return;
> > +
> >  	unoptimize_kprobe(p, false);	/* Try to unoptimize */
> > 
> >  	if (!kprobe_queued(p)) {
> > @@ -878,13 +917,26 @@ static void __kprobes __disarm_kprobe(struct kprobe *p, bool reopt)
> > 
> >  #else /* !CONFIG_OPTPROBES */
> > 
> > +static void __kprobes __arm_kprobe(struct kprobe *p)
> > +{
> > +	/* Only arm non-ftrace probes */
> > +	if (!ftrace_kprobe(p))
> > +		arch_arm_kprobe(p);
> > +}
> > +
> > +/* Remove the breakpoint of a probe. Must be called with text_mutex locked */
> > +static void __kprobes __disarm_kprobe(struct kprobe *p, bool reopt)
> > +{
> > +	/* Only disarm non-ftrace probes */
> > +	if (!ftrace_kprobe(p))
> > +		arch_disarm_kprobe(p);
> > +}
> 
> If it ignores disabling/enabling, kprobe_ftrace_callback must
> check kprobe_disabled(p) and skip it.

No, the callers of __(dis)arm_kprobe() also call (dis)arm_kprobe()
later, which calls (un)register_ftrace_function(). The reason we can't
call the ftrace register functions here is that they call
stop_machine(), and then we get into a deadlock with the text_mutex.
The places that call __(dis)arm_kprobe() later call (dis)arm_kprobe()
after releasing the text_mutex.

> 
> > +
> >  #define optimize_kprobe(p)			do {} while (0)
> >  #define unoptimize_kprobe(p, f)		do {} while (0)
> >  #define kill_optimized_kprobe(p)		do {} while (0)
> >  #define prepare_optimized_kprobe(p)		do {} while (0)
> >  #define try_to_optimize_kprobe(p)		do {} while (0)
> > -#define __arm_kprobe(p)			arch_arm_kprobe(p)
> > -#define __disarm_kprobe(p, o)			arch_disarm_kprobe(p)
> >  #define kprobe_disarmed(p)			kprobe_disabled(p)
> >  #define wait_for_kprobe_optimizer()		do {} while (0)
> > 
> > @@ -897,7 +949,9 @@ static void reuse_unused_kprobe(struct kprobe *ap)
> > 
> >  static __kprobes void free_aggr_kprobe(struct kprobe *p)
> >  {
> > -	arch_remove_kprobe(p);
> > +
> > +	if (!ftrace_kprobe(p))
> > +		arch_remove_kprobe(p);
> >  	kfree(p);
> >  }
> > 
> > @@ -910,6 +964,12 @@ static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
> >  /* Arm a kprobe with text_mutex */
> >  static void __kprobes arm_kprobe(struct kprobe *kp)
> >  {
> > +	/* ftrace probes can skip arch calls */
> > +	if (ftrace_kprobe(kp)) {
> > +		register_ftrace_function(&kp->fops);
> > +		return;
> > +	}
> > +
> >  	/*
> >  	 * Here, since __arm_kprobe() doesn't use stop_machine(),
> >  	 * this doesn't cause deadlock on text_mutex. So, we don't
> > @@ -924,6 +984,12 @@ static void __kprobes arm_kprobe(struct kprobe *kp)
> >  static void __kprobes disarm_kprobe(struct kprobe *kp)
> >  {
> >  	/* Ditto */
> > +
> > +	if (ftrace_kprobe(kp)) {
> > +		unregister_ftrace_function(&kp->fops);
> > +		return;
> > +	}
> > +
> >  	mutex_lock(&text_mutex);
> >  	__disarm_kprobe(kp, true);
> >  	mutex_unlock(&text_mutex);
> > @@ -1313,6 +1379,56 @@ static inline int check_kprobe_rereg(struct kprobe *p)
> >  	return ret;
> >  }
> > 
> > +#ifdef CONFIG_FUNCTION_TRACER
> > +static notrace void
> > +kprobe_ftrace_callback(unsigned long ip, unsigned long parent_ip,
> > +		       struct ftrace_ops *op, struct pt_regs *pt_regs)
> > +{
> > +	struct kprobe *p = container_of(op, struct kprobe, fops);
> > +
> 
> Here, we need to set up kprobe_ctlblk and some of pt_regs members,
> ip, cs and orig_ax as optimized_callback()@arch/x86/kernel/kprobes.c
> does.

I'm curious as to what this is used for. It doesn't seem to be needed
for the generic kprobes. Because we know the probe was on a nop,
there's no need to simulate the operation. IOW, there's no need for
singlestep() or other gdb-like operations.

> 
> > +	/* A nop has been trapped, just run both handlers back to back */
> > +	if (p->pre_handler)
> > +		p->pre_handler(p, pt_regs);
> 
> And increment regs->ip here for NOP.

Does the post_handler() expect the ip to be after the call? That is, is
a post_handler() the same as a pre_handler() on rip+next_ins?

> 
> > +	if (p->post_handler)
> > +		p->post_handler(p, pt_regs, 0);
> > +}
> 
> Anyway, the above operations strongly depend on the arch, so
> kprobe_ftrace_callback should be moved to arch/*/kprobes.c.
> 
> And I think most of the code can be shared with optimized code.

Not sure why, except if regs->ip needs to be incremented, and that can
go in a per-arch header file. Remember, the ftrace version has no need
for single stepping like the optimized version does. ftrace replaces a
nop, whereas other kprobes replace actual instructions. This is what
makes the ftrace version so much less complex.

> 
> > +
> > +static int use_ftrace_hook(struct kprobe *p)
> > +{
> > +	char str[KSYM_SYMBOL_LEN];
> > +
> > +	/* see if this is an ftrace function */
> > +	if (!ftrace_text_reserved(p->addr, p->addr)) {
> > +		/* make sure fops->func is nop */
> > +		init_non_ftrace_kprobe(p);
> > +		return 0;
> > +	}
> > +
> > +	/* If arch does not support pt_regs passing, bail */
> > +	if (!ARCH_SUPPORTS_FTRACE_SAVE_REGS)
> > +		return -EINVAL;
> 
> Hmm, I think this should be checked at build time...

This is actually keeping the same logic that exists now. See below: we
removed the check of ftrace_text_reserved() from the conditions that
make register_kprobe() return -EINVAL. Now if the arch supports
FTRACE_SAVE_REGS, it can handle the ftrace_text_reserved() case; if
not, then it goes back to its original behavior. The only way to make
this a build-time decision is with #ifdef ugliness, and I didn't want
to add that.

> 
> > +
> > +	/* Use ftrace hook instead */
> > +
> > +	memset(&p->fops, 0, sizeof(p->fops));
> > +
> > +	/* Find the function this is connected with this addr */
> > +	kallsyms_lookup((unsigned long)p->addr, NULL, NULL, NULL, str);
> > +
> > +	p->fops.flags = FTRACE_OPS_FL_SAVE_REGS;
> > +	p->fops.func = kprobe_ftrace_callback;
> > +
> > +	ftrace_set_filter(&p->fops, str, strlen(str), 1);
> 
> Hmm, IMHO, ftrace_set_filter should not be called here, because
> there can be other kprobes already registered on the same
> address. In that case, it is natural that we use an aggr_kprobe
> for handling several kprobes on the same address. Or, the kprobe
> hash table will have several different probes on the same address.

The ftrace_set_filter() only updates p->fops with what it will trace.
It's the register_ftrace_function() that enables the ftrace tracing,
and that is done with the arm_kprobe() call. If that's where the
aggr_kprobe enables its call, then that will be the only probe that is
called. Now I need to re-examine how aggr_kprobes work. Does the
aggr_kprobe have its own handler to call multiple probe->handlers?
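Going back to your regs->ip point above: if the arch really does need
the handlers to see an adjusted ip, I'd expect the arch-specific
version of the callback to end up looking something like this
(x86-flavored sketch, completely untested, using MCOUNT_INSN_SIZE as
the width of the probed nop):

static notrace void
kprobe_ftrace_callback(unsigned long ip, unsigned long parent_ip,
		       struct ftrace_ops *op, struct pt_regs *regs)
{
	struct kprobe *p = container_of(op, struct kprobe, fops);

	/* Let the pre handler see the address of the probed nop */
	regs->ip = ip;
	if (p->pre_handler)
		p->pre_handler(p, regs);

	/* Run the post handler as if the nop had already executed */
	regs->ip = ip + MCOUNT_INSN_SIZE;
	if (p->post_handler)
		p->post_handler(p, regs, 0);
}

That still avoids any single stepping, which is the part I really want
to keep out of the ftrace path.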
> 
> > +
> > +	return 0;
> > +}
> > +#else
> > +static int use_ftrace_hook(struct kprobe *p)
> > +{
> > +	return 0;
> > +}
> > +#endif
> > +
> >  int __kprobes register_kprobe(struct kprobe *p)
> >  {
> >  	int ret = 0;
> > @@ -1329,11 +1445,14 @@ int __kprobes register_kprobe(struct kprobe *p)
> >  	if (ret)
> >  		return ret;
> > 
> > +	ret = use_ftrace_hook(p);
> > +	if (ret)
> > +		return ret;
> > +
> >  	jump_label_lock();
> >  	preempt_disable();
> >  	if (!kernel_text_address((unsigned long) p->addr) ||
> >  	    in_kprobes_functions((unsigned long) p->addr) ||
> > -	    ftrace_text_reserved(p->addr, p->addr) ||
> >  	    jump_label_text_reserved(p->addr, p->addr))
> >  		goto fail_with_jump_label;
> > 
> > @@ -1384,15 +1503,17 @@ int __kprobes register_kprobe(struct kprobe *p)
> >  		goto out;
> >  	}
> > 
> > -	ret = arch_prepare_kprobe(p);
> > -	if (ret)
> > -		goto out;
> > +	if (!ftrace_kprobe(p)) {
> > +		ret = arch_prepare_kprobe(p);
> > +		if (ret)
> > +			goto out;
> > +	}
> > 
> >  	INIT_HLIST_NODE(&p->hlist);
> >  	hlist_add_head_rcu(&p->hlist,
> >  		       &kprobe_table[hash_ptr(p->addr, KPROBE_HASH_BITS)]);
> > 
> > -	if (!kprobes_all_disarmed && !kprobe_disabled(p))
> > +	if (!ftrace_kprobe(p) && !kprobes_all_disarmed && !kprobe_disabled(p))
> >  		__arm_kprobe(p);
> > 
> >  	/* Try to optimize kprobe */
> > @@ -1400,6 +1521,12 @@ int __kprobes register_kprobe(struct kprobe *p)
> > 
> >  out:
> >  	mutex_unlock(&text_mutex);
> > +
> > +	/* ftrace kprobes must be set outside of text_mutex */
> > +	if (!ret && ftrace_kprobe(p) &&
> > +	    !kprobes_all_disarmed && !kprobe_disabled(p))
> > +		arm_kprobe(p);
> > +
> >  	put_online_cpus();
> >  	jump_label_unlock();
> >  	mutex_unlock(&kprobe_mutex);
> 
> After this, we must handle some failures which can happen when probing
> on a module.

What problems does ftrace add that aren't already a problem with normal
probes?

> > 
> > @@ -2134,6 +2261,14 @@ static void __kprobes arm_all_kprobes(void)
> >  	}
> >  	mutex_unlock(&text_mutex);
> > 
> > +	/* ftrace kprobes are enabled outside of text_mutex */
> > +	for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
> > +		head = &kprobe_table[i];
> > +		hlist_for_each_entry_rcu(p, node, head, hlist)
> > +			if (ftrace_kprobe(p) && !kprobe_disabled(p))
> > +				arm_kprobe(p);
> > +	}
> > +
> >  	kprobes_all_disarmed = false;
> >  	printk(KERN_INFO "Kprobes globally enabled\n");
> > 
> > @@ -2164,11 +2299,21 @@ static void __kprobes disarm_all_kprobes(void)
> >  	for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
> >  		head = &kprobe_table[i];
> >  		hlist_for_each_entry_rcu(p, node, head, hlist) {
> > -			if (!arch_trampoline_kprobe(p) && !kprobe_disabled(p))
> > +			if (!ftrace_kprobe(p) &&
> > +			    !arch_trampoline_kprobe(p) && !kprobe_disabled(p))
> >  				__disarm_kprobe(p, false);
> >  		}
> >  	}
> >  	mutex_unlock(&text_mutex);
> > +
> > +	/* ftrace kprobes are disabled outside of text_mutex */
> > +	for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
> > +		head = &kprobe_table[i];
> > +		hlist_for_each_entry_rcu(p, node, head, hlist) {
> > +			if (ftrace_kprobe(p) && !kprobe_disabled(p))
> > +				disarm_kprobe(p);
> > +		}
> > +	}
> >  	mutex_unlock(&kprobe_mutex);
> > 
> >  	/* Wait for disarming all kprobes by optimizer */
> 
> Thank you,
> 

Thanks for taking the time for your review!

-- Steve