From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 12 Apr 2017 09:26:10 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: There is a Tasks RCU stall warning
Reply-To: paulmck@linux.vnet.ibm.com
Message-Id: <20170412162610.GI3956@linux.vnet.ibm.com>
In-Reply-To: <20170412115304.3077dbc8@gandalf.local.home>
References: <20170411181530.27dc21cc@gandalf.local.home>
 <20170411230154.GA3956@linux.vnet.ibm.com>
 <20170411230445.GA25951@linux.vnet.ibm.com>
 <20170411231138.GB25951@linux.vnet.ibm.com>
 <20170412032307.GA27011@linux.vnet.ibm.com>
 <20170412091821.4ad74bb0@gandalf.local.home>
 <20170412141936.GF3956@linux.vnet.ibm.com>
 <20170412104255.26bb17d4@gandalf.local.home>
 <20170412151817.GG3956@linux.vnet.ibm.com>
 <20170412115304.3077dbc8@gandalf.local.home>

On Wed, Apr 12, 2017 at 11:53:04AM -0400, Steven Rostedt wrote:
> On Wed, 12 Apr 2017 08:18:17 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > Well the trampolines pretty much can, but they are removed before
> > > calling synchronize_rcu_tasks(), and nothing can enter the trampoline
> > > when that is called.
> > 
> > Color me confused...
> > 
> > So you can have an arbitrary function call within a trampoline?
> 
> Sorta.
> 
> When you do register_ftrace_function(ops), where ops has ops->func that
> points to a function you want to have called when a function is traced,
> the following happens (if there's no other ops registered). Let's use
> an example where ops is filtered on just the schedule() function call:
> 
>    <schedule>:
>       call trampoline ---+
>       [..]               |
>                          +--> <trampoline>:
>                                   push regs
>                                   call ops->func
>                                   pop regs
>                                   ret
> 
> But that ops->func() must be very limited in what it can do. Although,
> it may actually call rcu_read_lock()! But if that's the case, it must
> either check if RCU is watching (which perf does), or enable RCU via
> rcu_irq_enter() with a check on rcu_irq_enter_disabled(), which my
> stack tracer does.
> 
> Now this can be called even from NMI context! Thus what ops->func does
> must be aware of that. The stack tracer func has an:
> 
> 	if (in_nmi())
> 		return;
> 
> because it needs to grab spin locks.
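
Just to check that I understand the rules, below is the sort of callback
I believe would be within them.  This is only a sketch of my own, not
anything in the tree; my_trace_func() and my_ops are made-up names, and
the filter on schedule() just mirrors your example above.

	#include <linux/ftrace.h>
	#include <linux/hardirq.h>

	/* Sketch only: bail in NMI context, never voluntarily block. */
	static void my_trace_func(unsigned long ip, unsigned long parent_ip,
				  struct ftrace_ops *op, struct pt_regs *regs)
	{
		if (in_nmi())	/* like your stack tracer: no spinlocks from NMI */
			return;

		/*
		 * Short, non-blocking work only: no schedule(), no
		 * cond_resched(), no sleeping locks.  Being preempted here
		 * is fine (if preemption was enabled when the trampoline
		 * was hit), but voluntarily scheduling is not.
		 */
	}

	static struct ftrace_ops my_ops = {
		.func	= my_trace_func,
		.flags	= FTRACE_OPS_FL_RCU,	/* only call when RCU is watching */
	};

	/* Registration, mirroring the schedule() example:
	 *	ftrace_set_filter(&my_ops, "schedule", strlen("schedule"), 0);
	 *	register_ftrace_function(&my_ops);
	 */

If that matches your mental model, good, I am on the same page.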

But preemption is enabled within the trampoline?  If so, then if
CONFIG_RCU_BOOST is set, rcu_read_unlock() can call rt_mutex_unlock().
Which looks OK to me, but I thought I should mention it.

> But one thing an op->func() is never allowed to do is to call
> schedule() directly, or even a cond_resched(). It may be preempted if
> preemption was enabled when the trampoline was hit, but it must not
> assume that it can do a voluntary schedule. That would break rcu_tasks
> as well if it did.

OK, so it can call functions, but it is not permitted to call functions
that voluntarily block.  That should work.  (Fingers firmly crossed.)

> > If not, agreed, no problem.  Otherwise, it seems like we have a big
> > problem remaining.  Unless the functions called from a trampoline are
> > guaranteed never to do a context switch.
> 
> Well, they can be preempted, but they should never do a voluntary
> schedule. If they did, that would be bad.

OK, feeling better now.  ;-)

> > So what exactly is the trampoline code allowed to do?  ;-)
> 
> Well, it must be able to work in an NMI context, or bail otherwise. And
> it should never schedule on its own.

Good.

> > My problem is that I have no idea what can and cannot be included in
> > trampoline code.  In the absence of that information, my RCU-honed
> > reflexes jump immediately to the worst case that I can think of.  ;-)
> 
> Let's just say that it can't voluntarily sleep. Would that be good
> enough? If someday in the future I decide to let it do so, I would add
> a flag and force that ops not to be able to use a dynamic trampoline.

That would work.  Again, feeling much better now.  ;-)

> Currently, without the synchronize_rcu_tasks(), when a dynamic ops is
> registered, the functions will point to a non-dynamic trampoline. That
> is, one that is never freed. It simply does:
> 
> 	preempt_disable_notrace();
> 
> 	do_for_each_ftrace_op(op, ftrace_ops_list) {
> 		/*
> 		 * Check the following for each ops before calling their func:
> 		 *  if RCU flag is set, then rcu_is_watching() must be true
> 		 *  if PER_CPU is set, then ftrace_function_local_disable()
> 		 *                     must be false
> 		 *  Otherwise test if the ip matches the ops filter
> 		 *
> 		 * If any of the above fails then the op->func() is not executed.
> 		 */
> 		if ((!(op->flags & FTRACE_OPS_FL_RCU) || rcu_is_watching()) &&
> 		    (!(op->flags & FTRACE_OPS_FL_PER_CPU) ||
> 		     !ftrace_function_local_disabled(op)) &&
> 		    ftrace_ops_test(op, ip, regs)) {
> 
> 			if (FTRACE_WARN_ON(!op->func)) {
> 				pr_warn("op=%p %pS\n", op, op);
> 				goto out;
> 			}
> 			op->func(ip, parent_ip, op, regs);
> 		}
> 	} while_for_each_ftrace_op(op);
> out:
> 	preempt_enable_notrace();

Makes sense!

							Thanx, Paul
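
P.S.  And just to be sure I also have the trampoline lifetime straight,
the teardown ordering I am assuming for a dynamic ops is roughly the
following.  Again only my own sketch: free_trampoline() is a stand-in
name for however ftrace actually releases the trampoline text, not a
real API.

	#include <linux/ftrace.h>
	#include <linux/rcupdate.h>

	void free_trampoline(struct ftrace_ops *ops);	/* stand-in, hypothetical */

	static void tear_down_dynamic_ops(struct ftrace_ops *ops)
	{
		/* 1. Unhook the ops: no new calls can enter its trampoline. */
		unregister_ftrace_function(ops);

		/*
		 * 2. Wait for every task that might still be executing in
		 *    (or preempted within) the trampoline to pass through a
		 *    voluntary context switch or return to user space.
		 */
		synchronize_rcu_tasks();

		/* 3. Only now is the trampoline text safe to free. */
		free_trampoline(ops);
	}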