From: Stafford Horne <shorne@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Kees Cook <keescook@chromium.org>, Ingo Molnar <mingo@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Alexei Starovoitov <ast@kernel.org>,
Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>,
"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
"H . Peter Anvin" <hpa@zytor.com>,
Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>,
"David S . Miller" <davem@davemloft.net>,
Ian McDonald <ian.mcdonald@jandi.co.nz>,
Vlad Yasevich <vyasevich@gmail.com>,
Stephen Hemminger <stephen@networkplumber.org>
Subject: Re: [RFC PATCH -tip 0/5] kprobes: Abolish jprobe APIs
Date: Sat, 7 Oct 2017 14:24:53 +0900
Message-ID: <20171007052453.GF2630@lianli.shorne-pla.net>
In-Reply-To: <20171006113430.2c31561b@gandalf.local.home>
Hello,

Nice read. See some comments below.

On Fri, Oct 06, 2017 at 11:34:30AM -0400, Steven Rostedt wrote:
> On Fri, 6 Oct 2017 13:49:59 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> > Steve, could you write a documentation how to use ftrace callback?
> > I think I should update the Documentation/kprobes.txt so that jprobe
> > user can easily migrate on that.
>
> I decided to do this now. Here's a first draft. What do you think?
>
> -- Steve
>
> Using ftrace to hook to functions
> =================================
>
> Copyright 2017 VMware Inc.
> Author: Steven Rostedt <srostedt@goodmis.org>
> License: The GNU Free Documentation License, Version 1.2
> (dual licensed under the GPL v2)
>
> Written for: 4.14
>
> Introduction
> ------------
>
> The ftrace infrastructure was originally created to attach hooks to the
> beginning of functions in order to record and trace the flow of the kernel.
> But hooks to the start of a function can have other use cases, such as
> live kernel patching or security monitoring. This document describes
> how to use ftrace to implement your own function hooks.
>
>
> The ftrace context
> ==================
>
> WARNING: The ability to add a callback to almost any function within the
> kernel comes with risks. A callback can be called from any context
> (normal, softirq, irq, and NMI). Callbacks can also be called just before
> going to idle, during CPU bring up and takedown, or going to user space.
> This requires extra care to what can be done inside a callback. A callback
> can be called outside the protective scope of RCU.
>
> The ftrace infrastructure has some protections against recursion and RCU,
> but one must still be very careful how they use the callbacks.
>
>
> The ftrace_ops structure
> ========================
>
> To register a function callback, an ftrace_ops is required. This structure
> tells ftrace which function to call as the callback, as well as which
> protections the callback handles itself so that ftrace does not need to
> provide them.
>
> There are only a few fields that need to be set when registering
> an ftrace_ops with ftrace. The rest should be NULL.
>
> struct ftrace_ops ops = {
> 	.func = my_callback_func,
> 	.flags = MY_FTRACE_FLAGS,
> 	.private = any_private_data_structure,
> };
>
> Both .flags and .private are optional. Only .func is required.
>
> To enable tracing call:
>
> register_ftrace_function(&ops);

Maybe it would help to have a small section on 'The register function'
below to answer the following: is it possible to make changes to the filter
after calling register_ftrace_function(), or do you need to call
register_ftrace_function() again?

> To disable tracing call:
>
> unregister_ftrace_function(&ops);
>
>
> The callback function
> =====================
>
> The prototype of the callback function is as follows (as of v4.14):
>
> void callback_func(unsigned long ip, unsigned long parent_ip,
> struct ftrace_ops *op, struct pt_regs *regs);
>
> @ip - This is the instruction pointer of the function that is being traced.
> (where the fentry or mcount is within the function)
>
> @parent_ip - This is the instruction pointer of the function that called
> the function being traced (where the call of the function occurred).
>
> @op - This is a pointer to ftrace_ops that was used to register the callback.
> This can be used to pass data to the callback via the private pointer.
>
> @regs - If the FTRACE_OPS_FL_SAVE_REGS or FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED
> flags are set in the ftrace_ops structure, then this will be pointing
> to the pt_regs structure like it would be if a breakpoint was placed
> at the start of the function where ftrace was tracing. Otherwise it
> contains either garbage or NULL.
>
>
> The ftrace FLAGS
> ================
>
> The ftrace_ops flags are all defined and documented in include/linux/ftrace.h.
> Some of the flags are used for internal infrastructure of ftrace, but the
> ones that users should be aware of are the following:
>
> (All of these are prefixed with FTRACE_OPS_FL_)
>
> PER_CPU - When set, the callback can be enabled or disabled per cpu with the
> following functions:
>
> void ftrace_function_local_enable(struct ftrace_ops *ops);
> void ftrace_function_local_disable(struct ftrace_ops *ops);
>
> These two functions must be called with preemption disabled.
>
> SAVE_REGS - If the callback requires reading or modifying the pt_regs
> passed to the callback, then it must set this flag. Registering
> a ftrace_ops with this flag set on an architecture that does not
> support passing of pt_regs to the callback, will fail.
>
> SAVE_REGS_IF_SUPPORTED - Similar to SAVE_REGS but the registering of a
> ftrace_ops on an architecture that does not support passing of regs
> will not fail with this flag set. But the callback must check if
> regs is NULL or not to determine if the architecture supports it.
>
> RECURSION_SAFE - By default, a wrapper is added around the callback to
> make sure that recursion of the function does not occur. That is
> if a function within the callback itself is also traced, ftrace
> will prevent the callback from being called again. But this wrapper
> adds some overhead, and if the callback is safe from recursion,
> it can set this flag to disable the ftrace protection.
>
> IPMODIFY - Requires SAVE_REGS set. If the callback is to "hijack" the
> traced function (have another function called instead of the traced
> function), it requires setting this flag. This is what live kernel
> patching uses. Without this flag the pt_regs->ip cannot be modified.
> Note, only one ftrace_ops with IPMODIFY set may be registered to
> any given function at a time.
>
> RCU - If this is set, then the callback will only be called by functions
> where RCU is "watching". This is required if the callback function
> performs any rcu_read_lock() operation.
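
The RECURSION_SAFE description above might be easier to follow next to a
toy model. The following is only a userspace sketch of the idea, assuming a
single thread and a single flag; every model_* name is made up for
illustration and nothing here is the kernel's actual wrapper, which has to
cope with several contexts (normal, softirq, irq, NMI).

```c
#include <stdbool.h>

/* Hypothetical userspace model of a recursion guard; not kernel code. */
typedef void (*model_callback_t)(unsigned long ip);

static bool model_in_callback;	/* stand-in for the per-context state */
static int model_calls;		/* how many times the callback really ran */

/* Wrapper: run the callback unless we are already inside it. */
static void model_guarded_call(model_callback_t cb, unsigned long ip)
{
	if (model_in_callback)
		return;		/* nested invocation is dropped */
	model_in_callback = true;
	cb(ip);
	model_in_callback = false;
}

static void model_traced_helper(void);

/* A callback that itself calls a function which is also traced. */
static void model_callback(unsigned long ip)
{
	(void)ip;
	model_calls++;
	model_traced_helper();
}

/* Entry hook of the traced helper: tries to invoke the callback again. */
static void model_traced_helper(void)
{
	model_guarded_call(model_callback, 0x2000);
}
```

Calling model_guarded_call(model_callback, 0x1000) once runs the callback
exactly once; the nested attempt made from model_traced_helper() is dropped
by the guard. A callback that never calls a traced function would not need
the wrapper, which is the case the flag is meant for as I read it.
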
>
>
> Filtering what functions to trace
> =================================
>
> If a callback is only to be called from specific functions, a filter must be
> set up. The filters are added by name, or ip if it is known.
>
> int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
> int len, int reset);
>
> @ops - the ops to set the filter with
> @buf - the string that holds the function filter text.
> @len - the length of the string.
> @reset - non zero to reset all filters before applying this filter.
>
> Filters denote which functions should be enabled when tracing is enabled.
> If @buf is NULL and reset is set, all functions will be enabled for tracing.
>
>
> The @buf can also be a glob expression to enable all functions that
> match a specific pattern.
>
> To just trace the schedule function:
>
> ret = ftrace_set_filter(&ops, "schedule", strlen("schedule"), 0);
>
> To add more functions, call the ftrace_set_filter() more than once with the
> @reset parameter set to zero. To remove the current filter and replace it
> with new functions to trace, have @reset be non zero.
>
> Sometimes more than one function has the same name. To trace just a specific
> function in this case, ftrace_set_filter_ip() can be used.
>
> ret = ftrace_set_filter_ip(&ops, ip, 0, 0);
>
> Note that the ip must be the address where the call to fentry or mcount is
> located in the function.
>
> If a glob is used to set the filter, to remove unwanted matches the
> ftrace_set_notrace() can also be used.
>
> int ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
> int len, int reset);
>
> This takes the same parameters as ftrace_set_filter() but will add the
> functions it finds to not be traced. This doesn't remove them from the
> filter itself, but keeps them from being traced. If @reset is set,
> the filter is cleaded but the functions that match @buf will still not
'cleared'?
> be traced (the callback will not be called on those functions).

This is a bit confusing. I guess it means 'the existing filter is cleared
and the filter *will match all* functions excluding those that match @buf'.
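
To check my reading of how the filter and notrace lists combine, here is a
small userspace model. It is only a sketch under my own assumptions: an
empty filter list means "all functions", a function is traced when it is in
the filter and not in notrace, and POSIX fnmatch() stands in for the
kernel's glob matching. The struct and function names are hypothetical and
are not ftrace APIs.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's data structures. */
struct model_ops {
	const char *filter[8];	/* globs to trace; none means all functions */
	size_t nr_filter;
	const char *notrace[8];	/* globs that must never be traced */
	size_t nr_notrace;
};

/* Return true if func matches at least one of the n globs. */
static bool model_matches_any(const char *const *globs, size_t n,
			      const char *func)
{
	for (size_t i = 0; i < n; i++)
		if (fnmatch(globs[i], func, 0) == 0)
			return true;
	return false;
}

/* Assumed rule: (filter empty or in filter) and not in notrace. */
static bool model_would_trace(const struct model_ops *ops, const char *func)
{
	bool in_filter = ops->nr_filter == 0 ||
			 model_matches_any(ops->filter, ops->nr_filter, func);
	bool in_notrace = model_matches_any(ops->notrace, ops->nr_notrace,
					    func);

	return in_filter && !in_notrace;
}
```

With filter = { "sched*" } and notrace = { "sched_clock" }, the model traces
"schedule" but not "sched_clock". With an empty filter and notrace =
{ "vfs_*" }, everything except the vfs_* functions is traced, which is the
'will match all' case I describe above.
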
-Stafford