Date: Mon, 15 May 2017 13:12:53 -0700
From: "Paul E. McKenney"
To: Steven Rostedt
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Use case for TASKS_RCU
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170515182354.GA25440@linux.vnet.ibm.com> <20170515144810.563a4d9b@gandalf.local.home>
In-Reply-To: <20170515144810.563a4d9b@gandalf.local.home>
Message-Id: <20170515201253.GW3956@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 15, 2017 at 02:48:10PM -0400, Steven Rostedt wrote:
> On Mon, 15 May 2017 11:23:54 -0700
> "Paul E. McKenney" wrote:
>
> > Hello!
> >
> > The question of the use case for TASKS_RCU came up, and here is my
> > understanding. Steve will not be shy about correcting any misconceptions
> > I might have. ;-)
> >
> > The use case is to support freeing of trampolines used in tracing/probing
> > in CONFIG_PREEMPT=y kernels. It is necessary to wait until any task
> > executing in the trampoline in question has left it, taking into account
> > that the trampoline's code might be interrupted and preempted. However,
> > the code in the trampolines is guaranteed never to context switch.
>
> nit, "never to voluntarily context switch" as it can still be
> preempted. It should never call schedule nor a mutex. And really it
> shouldn't even call any spinlocks. Although, trace_stack does, but it
> does so after checking if in_nmi(), which it bails if that is true.

Good catch, thank you! And thank you for the checking on the rest.

Ingo, thoughts?

							Thanx, Paul

> > Note that in CONFIG_PREEMPT=n kernels, synchronize_sched() suffices.
> > It is therefore tempting to think in terms of disabling preemption across
> > the trampolines, but there is apparently not enough room to accommodate
> > the needed preempt_disable() and preempt_enable() in the code invoking
> > the trampoline, and putting the preempt_disable() and preempt_enable()
> > in the trampoline itself fails because of the possibility of preemption
> > just before the preempt_disable() and just after the preempt_enable().
> > Similar reasoning rules out use of rcu_read_lock() and rcu_read_unlock().
>
> Correct, as the jump to the trampoline may be preempted. And preemption
> happens just before the first instruction on the trampoline is being
> executed.
>
> >
> > Another possibility would be to place the trampolines in a known region
> > of memory, and check for the task's PC being in that region. This fails
> > because trampolines can be interrupted, and I vaguely recall something
> > about them calling function as well. Stack tracing could be added,
> > but stack tracing is not as reliable as it would need to be.
>
> Correct.
>
> >
> > The solution chosen relies on the fact that code in trampolines
> > (and code invoked from trampolines) is not permitted to do voluntary
> > context switches. Thus, if a trampoline is removed, and a given task
> > later does a voluntary context switch (or has been seen in usermode),
> > that task will never again reference that trampoline. Once all tasks
> > are accounted for, the trampoline may safely be removed.
>
> Correct.
>
> >
> > TASKS_RCU implements a flavor of RCU that does exactly this. It has
> > only a single use at the moment, but avoiding memory leaks on
> > production machines being instrumented seems to me to be quite valuable.
>
> Optimized kprobes can also benefit from this, as it currently is
> disabled on CONFIG_PREEMPT due to exactly the same issue. I'll poke
> Masami about this again. I should be seeing him in a couple of weeks at
> the Open Source Summit in Tokyo.
>
> >
> > So, Steve, please correct any misconceptions!
>
> Nope, all looks good.
>
> -- Steve
>
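
For illustration only: the quoted description (a task stops mattering for a removed trampoline once it has done a voluntary context switch or has been seen in usermode) can be modeled in a few lines of user-space C. This is a toy sketch, not the kernel's TASKS_RCU implementation. The struct task, gp_start(), gp_scan(), task_preempt() and task_voluntary_switch() names are all hypothetical, and the nvcsw/nivcsw counters are assumed stand-ins for per-task voluntary and involuntary context-switch counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task state for this sketch (not the kernel's task_struct). */
struct task {
	unsigned long nvcsw;   /* voluntary context switches so far */
	unsigned long nivcsw;  /* involuntary context switches (preemptions) */
	bool in_usermode;      /* task is currently observed in usermode */
	unsigned long snap;    /* nvcsw snapshot taken when the wait began */
	bool holdout;          /* might still be running in an old trampoline */
};

/* A preemption: counted, but it proves nothing about the trampoline. */
void task_preempt(struct task *t)
{
	t->nivcsw++;
}

/* A voluntary context switch: cannot happen from inside a trampoline. */
void task_voluntary_switch(struct task *t)
{
	t->nvcsw++;
}

/*
 * Begin waiting, after the trampoline has been unhooked.  Every task that
 * is not already in usermode is treated as a potential holdout.
 */
void gp_start(struct task *tasks, size_t n)
{
	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		tasks[i].snap = tasks[i].nvcsw;
		tasks[i].holdout = !tasks[i].in_usermode;
	}
}

/*
 * One scan over the holdouts.  A task is cleared once it has voluntarily
 * switched since the snapshot or is seen in usermode.  Returns the number
 * of tasks still holding out; the trampoline may be freed only at zero.
 */
size_t gp_scan(struct task *tasks, size_t n)
{
	size_t remaining = 0;

	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct task *t = &tasks[i];

		if (!t->holdout)
			continue;
		if (t->nvcsw != t->snap || t->in_usermode)
			t->holdout = false;
		else
			remaining++;
	}
	return remaining;
}
```

In this model a caller would unhook the trampoline, call gp_start(), and then repeat gp_scan() until it returns zero before freeing the trampoline's memory. Note that task_preempt() never clears a holdout, which mirrors the point made in the thread that a preempted task may still be inside the trampoline.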