Date: Thu, 6 Apr 2017 11:05:53 -0700
From: "Paul E. McKenney"
To: Steven Rostedt
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Andrew Morton
Subject: Re: [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170406164237.874767449@goodmis.org> <20170406164432.069467885@goodmis.org>
In-Reply-To: <20170406164432.069467885@goodmis.org>
Message-Id: <20170406180553.GG1600@linux.vnet.ibm.com>

On Thu, Apr 06, 2017 at 12:42:38PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)"
> 
> The function tracer needs to be more careful than other subsystems when it
> comes to freeing data, especially if that data is actually executable code.
> When a single function is traced, a trampoline can be dynamically allocated
> which is called to jump to the function trace callback. When the callback is
> no longer needed, the dynamically allocated trampoline needs to be freed.
> This is where the issues arise: the dynamically allocated trampoline must
> not be used again. As function tracing can trace all subsystems, including
> subsystems that are used to serialize aspects of freeing (namely RCU), it
> must take extra care when doing the freeing.
> 
> Before synchronize_rcu_tasks() was around, there was no way for the function
> tracer to know that nothing was using the dynamically allocated trampoline
> when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely
> preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(),
> it will wait until all tasks have either voluntarily scheduled (not on the
> trampoline) or gone into userspace (not on the trampoline). Then it is safe
> to free the trampoline even with CONFIG_PREEMPT set.
> 
> Cc: "Paul E. McKenney"

One question below.  Other than that:

Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
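To summarize the rule the changelog above argues for, in code form: the sketch
below is illustration only and not code from the patch. release_trampoline()
and free_trampoline() are made-up names used for the example, while
synchronize_rcu_tasks() and IS_ENABLED() are the real kernel interfaces the
patch relies on.

/* Sketch only: free executable trampoline memory once no task can be on it. */
#include <linux/kernel.h>
#include <linux/rcupdate.h>

extern void free_trampoline(void *tramp);       /* hypothetical helper */

static void release_trampoline(void *tramp)
{
        /*
         * A task preempted while executing on the trampoline holds no lock
         * and is in no RCU read-side critical section, so only the tasks
         * flavor of RCU can say when it has moved off: synchronize_rcu_tasks()
         * returns only after every task has either voluntarily scheduled or
         * entered userspace.
         */
        if (IS_ENABLED(CONFIG_PREEMPT))
                synchronize_rcu_tasks();

        free_trampoline(tramp);                 /* now safe to free */
}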

> Signed-off-by: Steven Rostedt (VMware)
> ---
>  kernel/trace/Kconfig  |  3 ++-
>  kernel/trace/ftrace.c | 42 ++++++++++++++++++------------------------
>  2 files changed, 20 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index d4a06e714645..67b463b4f169 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -134,7 +134,8 @@ config FUNCTION_TRACER
>  	select KALLSYMS
>  	select GENERIC_TRACER
>  	select CONTEXT_SWITCH_TRACER
> -	select GLOB
> +	select GLOB

Does GLOB really want to be selected in production environments?  I could
understand "select GLOB if WE_ARE_TESTING" or some such.

> +	select TASKS_RCU if PREEMPT
>  	help
>  	  Enable the kernel to trace every kernel function. This is done
>  	  by using a compiler feature to insert a small, 5-byte No-Operation
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 8efd9fe7aec0..34f63e78d661 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2808,18 +2808,28 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
>  	 * callers are done before leaving this function.
>  	 * The same goes for freeing the per_cpu data of the per_cpu
>  	 * ops.
> -	 *
> -	 * Again, normal synchronize_sched() is not good enough.
> -	 * We need to do a hard force of sched synchronization.
> -	 * This is because we use preempt_disable() to do RCU, but
> -	 * the function tracers can be called where RCU is not watching
> -	 * (like before user_exit()). We can not rely on the RCU
> -	 * infrastructure to do the synchronization, thus we must do it
> -	 * ourselves.
>  	 */
>  	if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
> +		/*
> +		 * We need to do a hard force of sched synchronization.
> +		 * This is because we use preempt_disable() to do RCU, but
> +		 * the function tracers can be called where RCU is not watching
> +		 * (like before user_exit()). We can not rely on the RCU
> +		 * infrastructure to do the synchronization, thus we must do it
> +		 * ourselves.
> +		 */
>  		schedule_on_each_cpu(ftrace_sync);
> 
> +		/*
> +		 * When the kernel is preemptive, tasks can be preempted
> +		 * while on an ftrace trampoline. Just scheduling a task on
> +		 * a CPU is not good enough to flush them. Calling
> +		 * synchronize_rcu_tasks() will wait for those tasks to
> +		 * execute and either schedule voluntarily or enter user space.
> +		 */
> +		if (IS_ENABLED(CONFIG_PREEMPT))
> +			synchronize_rcu_tasks();
> +
>  		arch_ftrace_trampoline_free(ops);
> 
>  		if (ops->flags & FTRACE_OPS_FL_PER_CPU)
> @@ -5366,22 +5376,6 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
> 
>  static void ftrace_update_trampoline(struct ftrace_ops *ops)
>  {
> -
> -/*
> - * Currently there's no safe way to free a trampoline when the kernel
> - * is configured with PREEMPT. That is because a task could be preempted
> - * when it jumped to the trampoline, it may be preempted for a long time
> - * depending on the system load, and currently there's no way to know
> - * when it will be off the trampoline. If the trampoline is freed
> - * too early, when the task runs again, it will be executing on freed
> - * memory and crash.
> - */
> -#ifdef CONFIG_PREEMPT
> -	/* Currently, only non dynamic ops can have a trampoline */
> -	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
> -		return;
> -#endif
> -
>  	arch_ftrace_update_trampoline(ops);
>  }
> 
> -- 
> 2.10.2
> 
> 
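For reference, with the first ftrace.c hunk above applied, the cleanup path in
ftrace_shutdown() reads roughly as follows. This is just the quoted diff
consolidated for readability, not a verbatim copy of kernel/trace/ftrace.c; the
rest of the function and the per-CPU cleanup body are elided.

	if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
		/*
		 * Force every CPU through the scheduler: function tracers can
		 * run where RCU is not watching (e.g. before user_exit()), so
		 * the normal RCU grace periods are not sufficient on their own.
		 */
		schedule_on_each_cpu(ftrace_sync);

		/*
		 * Under CONFIG_PREEMPT a task may still be preempted while on
		 * a trampoline; wait until every task has voluntarily
		 * scheduled or entered userspace before freeing it.
		 */
		if (IS_ENABLED(CONFIG_PREEMPT))
			synchronize_rcu_tasks();

		arch_ftrace_trampoline_free(ops);

		if (ops->flags & FTRACE_OPS_FL_PER_CPU) {
			/* per-CPU ops cleanup follows in the full function (elided) */
		}
	}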