This is a series of changes I plan on pushing for this release and to stable. Feel free to ack (or nack, but I hope you don't ;-). I pushed this up into my git repo as well, but it may change depending on comments. -- Steve git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git ftrace/urgent Head SHA1: dd44790df744e71674558ae7ceb9dde3ed6f419c Petr Mladek (1): ftrace/x86: One more missing sync after fixup of function modification failure Steven Rostedt (Red Hat) (4): ftrace/x86: Run a sync after fixup on failure tracing: Do not add event files for modules that fail tracepoints tracepoint: Do not waste memory on mods with no tracepoints tracepoint: Warn and notify if tracepoints are not loaded due to module taint ---- arch/x86/kernel/ftrace.c | 7 ++++--- include/linux/tracepoint.h | 6 ++++++ kernel/trace/trace_events.c | 4 ++++ kernel/tracepoint.c | 17 ++++++++++++++++- 4 files changed, 30 insertions(+), 4 deletions(-)
[-- Attachment #1: 0001-ftrace-x86-Run-a-sync-after-fixup-on-failure.patch --] [-- Type: text/plain, Size: 2124 bytes --] [Request for Ack] From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> If a failure occurs while enabling a trace, it bails out and will remove the tracepoints to be back to what the code originally was. But the fix up had some bugs in it. By injecting a failure in the code, the fix up ran to completion, but shortly afterward the system rebooted. There was two bugs here. The first was that there was no final sync run across the CPUs after the fix up was done, and before the ftrace int3 handler flag was reset. That means that other CPUs could still see the breakpoint and trigger on it long after the flag was cleared, and the int3 handler would think it was a spurious interrupt. Worse yet, the int3 handler could hit other breakpoints because the ftrace int3 handler flag would have prevented the int3 handler from going further. The second bug was that the removal of the breakpoints required the "within()" logic updates instead of accessing the ip address directly. Link: http://lkml.kernel.org/r/1392650573-3390-1-git-send-email-pmladek@suse.cz Reported-by: Petr Mladek <pmladek@suse.cz> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: stable@vger.kernel.org # 3.5+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- arch/x86/kernel/ftrace.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index e625319..6b566c8 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -455,7 +455,7 @@ static int remove_breakpoint(struct dyn_ftrace *rec) } update: - return probe_kernel_write((void *)ip, &nop[0], 1); + return ftrace_write(ip, nop, 1); } static int add_update_code(unsigned long ip, unsigned const char *new) @@ -634,6 +634,7 @@ void ftrace_replace_code(int enable) rec = ftrace_rec_iter_record(iter); remove_breakpoint(rec); } + run_sync(); } static int @@ -664,7 +665,7 @@ ftrace_modify_code(unsigned long ip, unsigned const char *old_code, return ret; fail_update: - probe_kernel_write((void *)ip, &old_code[0], 1); + ftrace_write(ip, old_code, 1); goto out; } -- 1.8.5.3
[-- Attachment #1: 0002-ftrace-x86-One-more-missing-sync-after-fixup-of-func.patch --] [-- Type: text/plain, Size: 1059 bytes --] [Request for Ack] From: Petr Mladek <pmladek@suse.cz> If a failure occurs while modifying ftrace function, it bails out and will remove the tracepoints to be back to what the code originally was. There is missing the final sync run across the CPUs after the fix up is done and before the ftrace int3 handler flag is reset. Link: http://lkml.kernel.org/r/1393258342-29978-2-git-send-email-pmladek@suse.cz Fixes: 8a4d0a687a5 "ftrace: Use breakpoint method to update ftrace caller" Cc: stable@vger.kernel.org # 3.5+ Signed-off-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- arch/x86/kernel/ftrace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index 6b566c8..69885e2 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -660,8 +660,8 @@ ftrace_modify_code(unsigned long ip, unsigned const char *old_code, ret = -EPERM; goto out; } - run_sync(); out: + run_sync(); return ret; fail_update: -- 1.8.5.3
[-- Attachment #1: 0003-tracing-Do-not-add-event-files-for-modules-that-fail.patch --] [-- Type: text/plain, Size: 2853 bytes --] [Request for Ack] From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> If a module fails to add its tracepoints due to module tainting, do not create the module event infrastructure in the debugfs directory. As the events will not work and worse yet, they will silently fail, making the user wonder why the events they enable do not display anything. Having a warning on module load and the events not visible to the users will make the cause of the problem much clearer. Fixes: 6d723736e472 "tracing/events: add support for modules to TRACE_EVENT" Cc: stable@vger.kernel.org # 2.6.31+ Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- include/linux/tracepoint.h | 6 ++++++ kernel/trace/trace_events.c | 4 ++++ kernel/tracepoint.c | 7 ++++++- 3 files changed, 16 insertions(+), 1 deletion(-) diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index accc497..7159a0a 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -60,6 +60,12 @@ struct tp_module { unsigned int num_tracepoints; struct tracepoint * const *tracepoints_ptrs; }; +bool trace_module_has_bad_taint(struct module *mod); +#else +static inline bool trace_module_has_bad_taint(struct module *mod) +{ + return false; +} #endif /* CONFIG_MODULES */ struct tracepoint_iter { diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index e71ffd4..b2fee73 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -1777,6 +1777,10 @@ static void trace_module_add_events(struct module *mod) { struct ftrace_event_call **call, **start, **end; + /* Don't add infrastructure for mods without tracepoints */ + if (trace_module_has_bad_taint(mod)) + return; + start = mod->trace_events; end = mod->trace_events + mod->num_trace_events; diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 29f2654..031cc56 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -631,6 +631,11 @@ void tracepoint_iter_reset(struct tracepoint_iter *iter) EXPORT_SYMBOL_GPL(tracepoint_iter_reset); #ifdef CONFIG_MODULES +bool trace_module_has_bad_taint(struct module *mod) +{ + return mod->taints & ~((1 << TAINT_OOT_MODULE) | (1 << TAINT_CRAP)); +} + static int tracepoint_module_coming(struct module *mod) { struct tp_module *tp_mod, *iter; @@ -641,7 +646,7 @@ static int tracepoint_module_coming(struct module *mod) * module headers (for forced load), to make sure we don't cause a crash. * Staging and out-of-tree GPL modules are fine. */ - if (mod->taints & ~((1 << TAINT_OOT_MODULE) | (1 << TAINT_CRAP))) + if (trace_module_has_bad_taint(mod)) return 0; mutex_lock(&tracepoints_mutex); tp_mod = kmalloc(sizeof(struct tp_module), GFP_KERNEL); -- 1.8.5.3
[-- Attachment #1: 0004-tracepoint-Do-not-waste-memory-on-mods-with-no-trace.patch --] [-- Type: text/plain, Size: 1274 bytes --] [ Note, this is marked stable as the next patch requires it ] From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> No reason to allocate tp_module structures for modules that have no tracepoints. This just wastes memory. Fixes: b75ef8b44b1c "Tracepoint: Dissociate from module mutex" Cc: stable@vger.kernel.org # 3.2+ Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- kernel/tracepoint.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 031cc56..63630ae 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -641,6 +641,9 @@ static int tracepoint_module_coming(struct module *mod) struct tp_module *tp_mod, *iter; int ret = 0; + if (!mod->num_tracepoints) + return 0; + /* * We skip modules that taint the kernel, especially those with different * module headers (for forced load), to make sure we don't cause a crash. @@ -684,6 +687,9 @@ static int tracepoint_module_going(struct module *mod) { struct tp_module *pos; + if (!mod->num_tracepoints) + return 0; + mutex_lock(&tracepoints_mutex); tracepoint_update_probe_range(mod->tracepoints_ptrs, mod->tracepoints_ptrs + mod->num_tracepoints); -- 1.8.5.3
[-- Attachment #1: 0005-tracepoint-Warn-and-notify-if-tracepoints-are-not-lo.patch --] [-- Type: text/plain, Size: 1211 bytes --] [ Request for Ack ] From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> If a module is loaded that is tainted with anything but OOT or CRAP, then it will not create the tracepoint infrastructure for the module. The user needs to be warned when this happens instead of exiting silently. Fixes: 97e1c18e8d17 "tracing: Kernel Tracepoints" Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- kernel/tracepoint.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 63630ae..a65af20 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -649,8 +649,12 @@ static int tracepoint_module_coming(struct module *mod) * module headers (for forced load), to make sure we don't cause a crash. * Staging and out-of-tree GPL modules are fine. */ - if (trace_module_has_bad_taint(mod)) + if (trace_module_has_bad_taint(mod)) { + pr_err("Module '%s' is tainted, disabling tracepoints\n", + mod->name); return 0; + } + mutex_lock(&tracepoints_mutex); tp_mod = kmalloc(sizeof(struct tp_module), GFP_KERNEL); if (!tp_mod) { -- 1.8.5.3
On Thu 27-02-14 10:46:17, Steven Rostedt wrote: > [Request for Ack] > > From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> > > If a failure occurs while enabling a trace, it bails out and will remove > the tracepoints to be back to what the code originally was. But the fix > up had some bugs in it. By injecting a failure in the code, the fix up > ran to completion, but shortly afterward the system rebooted. > > There was two bugs here. > > The first was that there was no final sync run across the CPUs after the > fix up was done, and before the ftrace int3 handler flag was reset. That > means that other CPUs could still see the breakpoint and trigger on it > long after the flag was cleared, and the int3 handler would think it was > a spurious interrupt. Worse yet, the int3 handler could hit other breakpoints > because the ftrace int3 handler flag would have prevented the int3 handler > from going further. > > The second bug was that the removal of the breakpoints required the > "within()" logic updates instead of accessing the ip address directly. > > Link: http://lkml.kernel.org/r/1392650573-3390-1-git-send-email-pmladek@suse.cz > > Reported-by: Petr Mladek <pmladek@suse.cz> Tested-by: Petr Mladek <pmladek@suse.cz> > Cc: "H. Peter Anvin" <hpa@zytor.com> > Cc: stable@vger.kernel.org # 3.5+ > Signed-off-by: Steven Rostedt <rostedt@goodmis.org> > --- > arch/x86/kernel/ftrace.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c > index e625319..6b566c8 100644 > --- a/arch/x86/kernel/ftrace.c > +++ b/arch/x86/kernel/ftrace.c > @@ -455,7 +455,7 @@ static int remove_breakpoint(struct dyn_ftrace *rec) > } > > update: > - return probe_kernel_write((void *)ip, &nop[0], 1); > + return ftrace_write(ip, nop, 1); > } > > static int add_update_code(unsigned long ip, unsigned const char *new) > @@ -634,6 +634,7 @@ void ftrace_replace_code(int enable) > rec = ftrace_rec_iter_record(iter); > remove_breakpoint(rec); > } > + run_sync(); > } > > static int > @@ -664,7 +665,7 @@ ftrace_modify_code(unsigned long ip, unsigned const char *old_code, > return ret; > > fail_update: > - probe_kernel_write((void *)ip, &old_code[0], 1); > + ftrace_write(ip, old_code, 1); > goto out; > } > > -- > 1.8.5.3 > >
On Thu, 27 Feb 2014 16:58:12 +0100
> > Reported-by: Petr Mladek <pmladek@suse.cz>
>
> Tested-by: Petr Mladek <pmladek@suse.cz>
Thanks, I'll update my commit log.
-- Steve
H. Peter, I forgot to Cc you on this one. As you are the one I'm
looking for an Ack from.
-- Steve
On Thu, 27 Feb 2014 10:46:18 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> [Request for Ack]
>
> From: Petr Mladek <pmladek@suse.cz>
>
> If a failure occurs while modifying ftrace function, it bails out and will
> remove the tracepoints to be back to what the code originally was.
>
> There is missing the final sync run across the CPUs after the fix up is done
> and before the ftrace int3 handler flag is reset.
>
> Link: http://lkml.kernel.org/r/1393258342-29978-2-git-send-email-pmladek@suse.cz
>
> Fixes: 8a4d0a687a5 "ftrace: Use breakpoint method to update ftrace caller"
> Cc: stable@vger.kernel.org # 3.5+
> Signed-off-by: Petr Mladek <pmladek@suse.cz>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
> arch/x86/kernel/ftrace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 6b566c8..69885e2 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -660,8 +660,8 @@ ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
> ret = -EPERM;
> goto out;
> }
> - run_sync();
> out:
> + run_sync();
> return ret;
>
> fail_update:
----- Original Message ----- > From: "Steven Rostedt" <rostedt@goodmis.org> > To: linux-kernel@vger.kernel.org > Cc: "Ingo Molnar" <mingo@kernel.org>, "Andrew Morton" <akpm@linux-foundation.org>, "Peter Zijlstra" > <peterz@infradead.org>, "Frederic Weisbecker" <fweisbec@gmail.com>, "Mathieu Desnoyers" > <mathieu.desnoyers@efficios.com>, stable@vger.kernel.org, "Rusty Russell" <rusty@rustcorp.com.au> > Sent: Thursday, February 27, 2014 10:46:19 AM > Subject: [RFA][PATCH 3/5] tracing: Do not add event files for modules that fail tracepoints > > [Request for Ack] > > From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> > > If a module fails to add its tracepoints due to module tainting, do not > create the module event infrastructure in the debugfs directory. As the > events > will not work and worse yet, they will silently fail, making the user wonder > why the events they enable do not display anything. > > Having a warning on module load and the events not visible to the users > will make the cause of the problem much clearer. Looks good! Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > Fixes: 6d723736e472 "tracing/events: add support for modules to TRACE_EVENT" > Cc: stable@vger.kernel.org # 2.6.31+ > Cc: Rusty Russell <rusty@rustcorp.com.au> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org> > --- > include/linux/tracepoint.h | 6 ++++++ > kernel/trace/trace_events.c | 4 ++++ > kernel/tracepoint.c | 7 ++++++- > 3 files changed, 16 insertions(+), 1 deletion(-) > > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h > index accc497..7159a0a 100644 > --- a/include/linux/tracepoint.h > +++ b/include/linux/tracepoint.h > @@ -60,6 +60,12 @@ struct tp_module { > unsigned int num_tracepoints; > struct tracepoint * const *tracepoints_ptrs; > }; > +bool trace_module_has_bad_taint(struct module *mod); > +#else > +static inline bool trace_module_has_bad_taint(struct module *mod) > +{ > + return false; > +} > #endif /* CONFIG_MODULES */ > > struct tracepoint_iter { > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c > index e71ffd4..b2fee73 100644 > --- a/kernel/trace/trace_events.c > +++ b/kernel/trace/trace_events.c > @@ -1777,6 +1777,10 @@ static void trace_module_add_events(struct module > *mod) > { > struct ftrace_event_call **call, **start, **end; > > + /* Don't add infrastructure for mods without tracepoints */ > + if (trace_module_has_bad_taint(mod)) > + return; > + > start = mod->trace_events; > end = mod->trace_events + mod->num_trace_events; > > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c > index 29f2654..031cc56 100644 > --- a/kernel/tracepoint.c > +++ b/kernel/tracepoint.c > @@ -631,6 +631,11 @@ void tracepoint_iter_reset(struct tracepoint_iter *iter) > EXPORT_SYMBOL_GPL(tracepoint_iter_reset); > > #ifdef CONFIG_MODULES > +bool trace_module_has_bad_taint(struct module *mod) > +{ > + return mod->taints & ~((1 << TAINT_OOT_MODULE) | (1 << TAINT_CRAP)); > +} > + > static int tracepoint_module_coming(struct module *mod) > { > struct tp_module *tp_mod, *iter; > @@ -641,7 +646,7 @@ static int tracepoint_module_coming(struct module *mod) > * module headers (for forced load), to make sure we don't cause a crash. > * Staging and out-of-tree GPL modules are fine. > */ > - if (mod->taints & ~((1 << TAINT_OOT_MODULE) | (1 << TAINT_CRAP))) > + if (trace_module_has_bad_taint(mod)) > return 0; > mutex_lock(&tracepoints_mutex); > tp_mod = kmalloc(sizeof(struct tp_module), GFP_KERNEL); > -- > 1.8.5.3 > > > -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
----- Original Message ----- > From: "Steven Rostedt" <rostedt@goodmis.org> > To: linux-kernel@vger.kernel.org > Cc: "Ingo Molnar" <mingo@kernel.org>, "Andrew Morton" <akpm@linux-foundation.org>, "Peter Zijlstra" > <peterz@infradead.org>, "Frederic Weisbecker" <fweisbec@gmail.com>, "Mathieu Desnoyers" > <mathieu.desnoyers@efficios.com>, stable@vger.kernel.org, "Mathieu Desnoyers" <mathieu.desnoyers@polymtl.ca> > Sent: Thursday, February 27, 2014 10:46:21 AM > Subject: [RFA][PATCH 5/5] tracepoint: Warn and notify if tracepoints are not loaded due to module taint > > [ Request for Ack ] > > From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> > > If a module is loaded that is tainted with anything but OOT or CRAP, then > it will not create the tracepoint infrastructure for the module. The user > needs > to be warned when this happens instead of exiting silently. > > Fixes: 97e1c18e8d17 "tracing: Kernel Tracepoints" > Cc: stable@vger.kernel.org > Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org> > --- > kernel/tracepoint.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c > index 63630ae..a65af20 100644 > --- a/kernel/tracepoint.c > +++ b/kernel/tracepoint.c > @@ -649,8 +649,12 @@ static int tracepoint_module_coming(struct module *mod) > * module headers (for forced load), to make sure we don't cause a crash. > * Staging and out-of-tree GPL modules are fine. > */ > - if (trace_module_has_bad_taint(mod)) > + if (trace_module_has_bad_taint(mod)) { > + pr_err("Module '%s' is tainted, disabling tracepoints\n", Hrm, I wonder if this message could confuse users into thinking that because of this error, tracepoints are disabled across the entire kernel (rather than within this module). I'd recommend the following message instead: pr_err("Module '%s' is tainted, ignoring its tracepoints\n", Thoughts ? Thanks, Mathieu > + mod->name); > return 0; > + } > + > mutex_lock(&tracepoints_mutex); > tp_mod = kmalloc(sizeof(struct tp_module), GFP_KERNEL); > if (!tp_mod) { > -- > 1.8.5.3 > > > -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
On Thu, Feb 27, 2014 at 10:46:18AM -0500, Steven Rostedt wrote: > [Request for Ack] > > From: Petr Mladek <pmladek@suse.cz> > > If a failure occurs while modifying ftrace function, it bails out and will > remove the tracepoints to be back to what the code originally was. > > There is missing the final sync run across the CPUs after the fix up is done > and before the ftrace int3 handler flag is reset. So IIUC the risk is that other CPUs may spuriously ignore non-ftrace traps if we don't sync the other cores after reverting the int3 before decrementing the modifying_ftrace_code counter? > > Link: http://lkml.kernel.org/r/1393258342-29978-2-git-send-email-pmladek@suse.cz > > Fixes: 8a4d0a687a5 "ftrace: Use breakpoint method to update ftrace caller" > Cc: stable@vger.kernel.org # 3.5+ > Signed-off-by: Petr Mladek <pmladek@suse.cz> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org> > --- > arch/x86/kernel/ftrace.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c > index 6b566c8..69885e2 100644 > --- a/arch/x86/kernel/ftrace.c > +++ b/arch/x86/kernel/ftrace.c > @@ -660,8 +660,8 @@ ftrace_modify_code(unsigned long ip, unsigned const char *old_code, > ret = -EPERM; > goto out; > } > - run_sync(); > out: > + run_sync(); > return ret; > > fail_update: This could be further optimized by rather calling run_sync() in the end of the fail_update block (after the probe_kernel_write revert) otherwise even failure on setting the break will result in run_sync(), which doesn't appear to be needed. But that's really just nitpicking as it's a rare failure codepath and shouldn't hurt. In any case, the fix looks correct. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> > -- > 1.8.5.3 > >
On Thu, 27 Feb 2014 17:37:32 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote: > On Thu, Feb 27, 2014 at 10:46:18AM -0500, Steven Rostedt wrote: > > [Request for Ack] > > > > From: Petr Mladek <pmladek@suse.cz> > > > > If a failure occurs while modifying ftrace function, it bails out and will > > remove the tracepoints to be back to what the code originally was. > > > > There is missing the final sync run across the CPUs after the fix up is done > > and before the ftrace int3 handler flag is reset. > > So IIUC the risk is that other CPUs may spuriously ignore non-ftrace traps if we don't sync the > other cores after reverting the int3 before decrementing the modifying_ftrace_code counter? Actually, the bug is that they will not ignore the ftrace traps after we decrement modifying_ftrace_code counter. Here's the race: CPU0 CPU1 ---- ---- remove_breakpoint(); modifying_ftrace_code = 0; [still sees breakpoint] <takes trap> [sees modifying_ftrace_code as zero] [no breakpoint handler] [goto failed case] [trap exception - kernel breakpoint, no handler] BUG() Even if we had a smp_wmb() after removing the breakpoint and clearing the modifying_ftrace_code, we still need the smp_rmb() on the other CPUS. The run_sync() does a IPI on all CPUs doing the smp_rmb(). > > > > > Link: http://lkml.kernel.org/r/1393258342-29978-2-git-send-email-pmladek@suse.cz > > > > Fixes: 8a4d0a687a5 "ftrace: Use breakpoint method to update ftrace caller" > > Cc: stable@vger.kernel.org # 3.5+ > > Signed-off-by: Petr Mladek <pmladek@suse.cz> > > Signed-off-by: Steven Rostedt <rostedt@goodmis.org> > > --- > > arch/x86/kernel/ftrace.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c > > index 6b566c8..69885e2 100644 > > --- a/arch/x86/kernel/ftrace.c > > +++ b/arch/x86/kernel/ftrace.c > > @@ -660,8 +660,8 @@ ftrace_modify_code(unsigned long ip, unsigned const char *old_code, > > ret = -EPERM; > > goto out; > > } > > - run_sync(); > > out: > > + run_sync(); > > return ret; > > > > fail_update: > > This could be further optimized by rather calling run_sync() in the end of the > fail_update block (after the probe_kernel_write revert) otherwise even failure on > setting the break will result in run_sync(), which doesn't appear to be needed. But > that's really just nitpicking as it's a rare failure codepath and shouldn't hurt. No, the run_sync() must be done after removing the breakpoint. Again, we don't want one of these breakpoints to be called on another CPU and then see modifying_ftrace_code as zero. That is bad. The final run_sync() is required. I think I'll update the change log to include my race flow graph from above. -- Steve > > In any case, the fix looks correct. > > Acked-by: Frederic Weisbecker <fweisbec@gmail.com> > > > -- > > 1.8.5.3 > > > >
On Thu, 27 Feb 2014 16:33:50 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> > - if (trace_module_has_bad_taint(mod))
> > + if (trace_module_has_bad_taint(mod)) {
> > + pr_err("Module '%s' is tainted, disabling tracepoints\n",
>
> Hrm, I wonder if this message could confuse users into thinking that because
> of this error, tracepoints are disabled across the entire kernel (rather than
> within this module).
>
> I'd recommend the following message instead:
>
> pr_err("Module '%s' is tainted, ignoring its tracepoints\n",
>
Sure. Will update.
-- Steve
On Thu, 27 Feb 2014 16:33:50 +0000 (UTC) Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > I'd recommend the following message instead: > > pr_err("Module '%s' is tainted, ignoring its tracepoints\n", > Better? -- Steve >From 08ea384dc937d75a2a1444a06c3e4553bd118fc1 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> Date: Mon, 24 Feb 2014 11:06:04 -0500 Subject: [PATCH] tracepoint: Warn and notify if tracepoints are not loaded due to module taint If a module is loaded that is tainted with anything but OOT or CRAP, then it will not create the tracepoint infrastructure for the module. The user needs to be warned when this happens instead of exiting silently. Fixes: 97e1c18e8d17 "tracing: Kernel Tracepoints" Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- kernel/tracepoint.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 63630ae..1d33831 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -649,8 +649,12 @@ static int tracepoint_module_coming(struct module *mod) * module headers (for forced load), to make sure we don't cause a crash. * Staging and out-of-tree GPL modules are fine. */ - if (trace_module_has_bad_taint(mod)) + if (trace_module_has_bad_taint(mod)) { + pr_err("Module '%s' is tainted, ignoring its tracepoints\n", + mod->name); return 0; + } + mutex_lock(&tracepoints_mutex); tp_mod = kmalloc(sizeof(struct tp_module), GFP_KERNEL); if (!tp_mod) { -- 1.8.1.4
----- Original Message ----- > From: "Steven Rostedt" <rostedt@goodmis.org> > To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com> > Cc: linux-kernel@vger.kernel.org, "Ingo Molnar" <mingo@kernel.org>, "Andrew Morton" <akpm@linux-foundation.org>, > "Peter Zijlstra" <peterz@infradead.org>, "Frederic Weisbecker" <fweisbec@gmail.com>, stable@vger.kernel.org, > "Mathieu Desnoyers" <mathieu.desnoyers@polymtl.ca> > Sent: Thursday, February 27, 2014 12:09:25 PM > Subject: Re: [RFA][PATCH 5/5] tracepoint: Warn and notify if tracepoints are not loaded due to module taint > > On Thu, 27 Feb 2014 16:33:50 +0000 (UTC) > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > > > > I'd recommend the following message instead: > > > > pr_err("Module '%s' is tainted, ignoring its tracepoints\n", > > > > Better? Yep, thanks! Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > -- Steve > > From 08ea384dc937d75a2a1444a06c3e4553bd118fc1 Mon Sep 17 00:00:00 2001 > From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> > Date: Mon, 24 Feb 2014 11:06:04 -0500 > Subject: [PATCH] tracepoint: Warn and notify if tracepoints are not loaded > due > to module taint > > If a module is loaded that is tainted with anything but OOT or CRAP, then > it will not create the tracepoint infrastructure for the module. The user > needs > to be warned when this happens instead of exiting silently. > > Fixes: 97e1c18e8d17 "tracing: Kernel Tracepoints" > Cc: stable@vger.kernel.org > Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org> > --- > kernel/tracepoint.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c > index 63630ae..1d33831 100644 > --- a/kernel/tracepoint.c > +++ b/kernel/tracepoint.c > @@ -649,8 +649,12 @@ static int tracepoint_module_coming(struct module *mod) > * module headers (for forced load), to make sure we don't cause a crash. > * Staging and out-of-tree GPL modules are fine. > */ > - if (trace_module_has_bad_taint(mod)) > + if (trace_module_has_bad_taint(mod)) { > + pr_err("Module '%s' is tainted, ignoring its tracepoints\n", > + mod->name); > return 0; > + } > + > mutex_lock(&tracepoints_mutex); > tp_mod = kmalloc(sizeof(struct tp_module), GFP_KERNEL); > if (!tp_mod) { > -- > 1.8.1.4 > > -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
On Thu, Feb 27, 2014 at 12:00:14PM -0500, Steven Rostedt wrote: > On Thu, 27 Feb 2014 17:37:32 +0100 > Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > On Thu, Feb 27, 2014 at 10:46:18AM -0500, Steven Rostedt wrote: > > > [Request for Ack] > > > > > > From: Petr Mladek <pmladek@suse.cz> > > > > > > If a failure occurs while modifying ftrace function, it bails out and will > > > remove the tracepoints to be back to what the code originally was. > > > > > > There is missing the final sync run across the CPUs after the fix up is done > > > and before the ftrace int3 handler flag is reset. > > > > So IIUC the risk is that other CPUs may spuriously ignore non-ftrace traps if we don't sync the > > other cores after reverting the int3 before decrementing the modifying_ftrace_code counter? > > Actually, the bug is that they will not ignore the ftrace traps after > we decrement modifying_ftrace_code counter. Here's the race: > > CPU0 CPU1 > ---- ---- > remove_breakpoint(); > modifying_ftrace_code = 0; > > [still sees breakpoint] > <takes trap> > [sees modifying_ftrace_code as zero] > [no breakpoint handler] > [goto failed case] > [trap exception - kernel breakpoint, no > handler] > BUG() > > > Even if we had a smp_wmb() after removing the breakpoint and clearing > the modifying_ftrace_code, we still need the smp_rmb() on the other > CPUS. The run_sync() does a IPI on all CPUs doing the smp_rmb(). Ah ok. My understanding was indeed that it doesn't ignore the ftrace trap, but I thought the consequence was that we return immediately from the trap handler. > > > > > > > > > Link: http://lkml.kernel.org/r/1393258342-29978-2-git-send-email-pmladek@suse.cz > > > > > > Fixes: 8a4d0a687a5 "ftrace: Use breakpoint method to update ftrace caller" > > > Cc: stable@vger.kernel.org # 3.5+ > > > Signed-off-by: Petr Mladek <pmladek@suse.cz> > > > Signed-off-by: Steven Rostedt <rostedt@goodmis.org> > > > --- > > > arch/x86/kernel/ftrace.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c > > > index 6b566c8..69885e2 100644 > > > --- a/arch/x86/kernel/ftrace.c > > > +++ b/arch/x86/kernel/ftrace.c > > > @@ -660,8 +660,8 @@ ftrace_modify_code(unsigned long ip, unsigned const char *old_code, > > > ret = -EPERM; > > > goto out; > > > } > > > - run_sync(); > > > out: > > > + run_sync(); > > > return ret; > > > > > > fail_update: > > > > This could be further optimized by rather calling run_sync() in the end of the > > fail_update block (after the probe_kernel_write revert) otherwise even failure on > > setting the break will result in run_sync(), which doesn't appear to be needed. But > > that's really just nitpicking as it's a rare failure codepath and shouldn't hurt. > > No, the run_sync() must be done after removing the breakpoint. Again, > we don't want one of these breakpoints to be called on another CPU and > then see modifying_ftrace_code as zero. That is bad. The final > run_sync() is required. Ok but what I meant is to do this instead: fail_update: probe_kernel_write((void *)ip, &old_code[0], 1); + run_sync() goto out; Because with the current patch we also call run_sync() on add_break() failure. > > I think I'll update the change log to include my race flow graph from > above. > > -- Steve > > > > > > In any case, the fix looks correct. > > > > Acked-by: Frederic Weisbecker <fweisbec@gmail.com> > > > > > -- > > > 1.8.5.3 > > > > > > >
On Thu, 27 Feb 2014 18:19:37 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote: > On Thu, Feb 27, 2014 at 12:00:14PM -0500, Steven Rostedt wrote: > > On Thu, 27 Feb 2014 17:37:32 +0100 > > Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > > > On Thu, Feb 27, 2014 at 10:46:18AM -0500, Steven Rostedt wrote: > > > > [Request for Ack] > > > > > > > > From: Petr Mladek <pmladek@suse.cz> > > > > > > > > If a failure occurs while modifying ftrace function, it bails out and will > > > > remove the tracepoints to be back to what the code originally was. > > > > > > > > There is missing the final sync run across the CPUs after the fix up is done > > > > and before the ftrace int3 handler flag is reset. > > > > > > So IIUC the risk is that other CPUs may spuriously ignore non-ftrace traps if we don't sync the > > > other cores after reverting the int3 before decrementing the modifying_ftrace_code counter? > > > > Actually, the bug is that they will not ignore the ftrace traps after > > we decrement modifying_ftrace_code counter. Here's the race: > > > > CPU0 CPU1 > > ---- ---- > > remove_breakpoint(); > > modifying_ftrace_code = 0; > > > > [still sees breakpoint] > > <takes trap> > > [sees modifying_ftrace_code as zero] > > [no breakpoint handler] > > [goto failed case] > > [trap exception - kernel breakpoint, no > > handler] > > BUG() > > > > > > Even if we had a smp_wmb() after removing the breakpoint and clearing > > the modifying_ftrace_code, we still need the smp_rmb() on the other > > CPUS. The run_sync() does a IPI on all CPUs doing the smp_rmb(). > > Ah ok. My understanding was indeed that it doesn't ignore the ftrace trap, > but I thought the consequence was that we return immediately from the trap > handler. I'll add my above cpu race diagram (is that what we call it?). That should make this change more understandable. > Ok but what I meant is to do this instead: > > fail_update: > probe_kernel_write((void *)ip, &old_code[0], 1); > + run_sync() > goto out; > > Because with the current patch we also call run_sync() on add_break() failure. Ah ok (my turn to understand). Yeah, if the add_break() fails, then we don't need to do the run_sync(). But this is just for now, to prevent the add_update_code() error from crashing. I have more patches that clean this up further. But they are for 3.15. -- Steve
On Thu, Feb 27, 2014 at 12:35:53PM -0500, Steven Rostedt wrote: > On Thu, 27 Feb 2014 18:19:37 +0100 > Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > On Thu, Feb 27, 2014 at 12:00:14PM -0500, Steven Rostedt wrote: > > > On Thu, 27 Feb 2014 17:37:32 +0100 > > > Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > > > > > On Thu, Feb 27, 2014 at 10:46:18AM -0500, Steven Rostedt wrote: > > > > > [Request for Ack] > > > > > > > > > > From: Petr Mladek <pmladek@suse.cz> > > > > > > > > > > If a failure occurs while modifying ftrace function, it bails out and will > > > > > remove the tracepoints to be back to what the code originally was. > > > > > > > > > > There is missing the final sync run across the CPUs after the fix up is done > > > > > and before the ftrace int3 handler flag is reset. > > > > > > > > So IIUC the risk is that other CPUs may spuriously ignore non-ftrace traps if we don't sync the > > > > other cores after reverting the int3 before decrementing the modifying_ftrace_code counter? > > > > > > Actually, the bug is that they will not ignore the ftrace traps after > > > we decrement modifying_ftrace_code counter. Here's the race: > > > > > > CPU0 CPU1 > > > ---- ---- > > > remove_breakpoint(); > > > modifying_ftrace_code = 0; > > > > > > [still sees breakpoint] > > > <takes trap> > > > [sees modifying_ftrace_code as zero] > > > [no breakpoint handler] > > > [goto failed case] > > > [trap exception - kernel breakpoint, no > > > handler] > > > BUG() > > > > > > > > > Even if we had a smp_wmb() after removing the breakpoint and clearing > > > the modifying_ftrace_code, we still need the smp_rmb() on the other > > > CPUS. The run_sync() does a IPI on all CPUs doing the smp_rmb(). > > > > Ah ok. My understanding was indeed that it doesn't ignore the ftrace trap, > > but I thought the consequence was that we return immediately from the trap > > handler. > > I'll add my above cpu race diagram (is that what we call it?). That > should make this change more understandable. Yeah sounds like a good idea! > > > > Ok but what I meant is to do this instead: > > > > fail_update: > > probe_kernel_write((void *)ip, &old_code[0], 1); > > + run_sync() > > goto out; > > > > Because with the current patch we also call run_sync() on add_break() failure. > > Ah ok (my turn to understand). Yeah, if the add_break() fails, then we > don't need to do the run_sync(). > > But this is just for now, to prevent the add_update_code() error from > crashing. I have more patches that clean this up further. But they are > for 3.15. Yeah sure. That was really just nitpicking. It doesn't hurt in a rare failure path and the fix is there. Thanks.
On 02/27/2014 07:46 AM, Steven Rostedt wrote:
> The second bug was that the removal of the breakpoints required the
> "within()" logic updates instead of accessing the ip address directly.
Please update this bit to clarify, as we discussed on IRC.
Otherwise:
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
On Mon, 03 Mar 2014 14:12:22 -0800 "H. Peter Anvin" <hpa@zytor.com> wrote: > On 02/27/2014 07:46 AM, Steven Rostedt wrote: > > The second bug was that the removal of the breakpoints required the > > "within()" logic updates instead of accessing the ip address directly. > > Please update this bit to clarify, as we discussed on IRC. How's this sound: The second bug was that the removal of the breakpoints required the "within()" logic updates instead of accessing the ip address directly. As kernel text is mapped read-only when CONFIG_DEBUG_RODATA is set, and the removal of the breakpoint is a modification of kernel text. > > Otherwise: > > Acked-by: H. Peter Anvin <hpa@linux.intel.com> Thanks! -- Steve
On 03/03/2014 02:18 PM, Steven Rostedt wrote:
>
> How's this sound:
>
> The second bug was that the removal of the breakpoints required the
> "within()" logic updates instead of accessing the ip address directly.
> As kernel text is mapped read-only when CONFIG_DEBUG_RODATA is set, and
> the removal of the breakpoint is a modification of kernel text.
>
I'd add:
"ftrace_write() includes this, but probe_kernel_write() does not."
-hpa
On Mon, 03 Mar 2014 14:27:09 -0800
"H. Peter Anvin" <hpa@zytor.com> wrote:
> On 03/03/2014 02:18 PM, Steven Rostedt wrote:
> >
> > How's this sound:
> >
> > The second bug was that the removal of the breakpoints required the
> > "within()" logic updates instead of accessing the ip address directly.
> > As kernel text is mapped read-only when CONFIG_DEBUG_RODATA is set, and
> > the removal of the breakpoint is a modification of kernel text.
> >
>
> I'd add:
>
> "ftrace_write() includes this, but probe_kernel_write() does not."
>
OK, thanks,
-- Steve