Date: Tue, 23 May 2017 14:10:09 -0700
From: "Paul E. McKenney"
Reply-To: paulmck@linux.vnet.ibm.com
To: Steven Rostedt
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, Peter Zijlstra, Thomas Gleixner
Subject: Re: Use case for TASKS_RCU
Message-Id: <20170523211009.GX3956@linux.vnet.ibm.com>
In-Reply-To: <20170523163853.70773f4a@vmware.local.home>
References: <20170516122354.GB3956@linux.vnet.ibm.com>
	<20170519062331.52dhungzvcsdxdgo@gmail.com>
	<20170519133550.GD3956@linux.vnet.ibm.com>
	<20170519100421.27298063@gandalf.local.home>
	<20170519102331.0d5a8536@gandalf.local.home>
	<20170519190609.GE3956@linux.vnet.ibm.com>
	<20170523000036.GA13506@linux.vnet.ibm.com>
	<20170523153939.7122e892@vmware.local.home>
	<20170523200035.GW3956@linux.vnet.ibm.com>
	<20170523163853.70773f4a@vmware.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 23, 2017 at 04:38:53PM -0400, Steven Rostedt wrote:
> On Tue, 23 May 2017 13:00:35 -0700
> "Paul E. McKenney" wrote:
>
> > > > Unfortunately, it does not work, as I should have known ahead of
> > > > time from the dyntick-idle experience.  Not all context switches
> > > > go through context_switch().  :-/
> > >
> > > Wait. What context switch doesn't go through a context switch? Or do
> > > you mean a user/kernel context switch?
> >
> > I mean that putting printk() before and after the call to
> > context_switch() can show tasks switching out twice without switching
> > in and vice versa.  No sign of lost printk()s, and I also confirmed
> > this behavior using a flag in task_struct.
>
> I hope you meant trace_printk()s' as printk is a huge overhead and can
> cause side effects.

Not so much during boot.  But actually, I meant to ask you about that...

From what I can see from the ftrace documentation, booting with something
like this:

	ftrace=function ftrace_filter=tasks_rcu_qs,tasks_rcu_qs_enter,tasks_rcu_qs_exit

Should enable ftrace, but only on the three functions called out.  But when
I try this, I get the following in dmesg:

	[    1.506171] ftrace bootup tracer 'function' not registered

And I don't get anything from ftrace_dump() later on.  What am I doing
wrong here?  (Event tracing has worked for me in the past from the boot
line, but I was lazy so just fell back to printk().  And I didn't think
of trace_printk().)

> > One way that this can happen on some architectures is via the "helper"
> > mechanism, where the task sleeps normally, but where a later interrupt
> > or exception takes on its context "behind the scenes" in the arch
> > code.
> > This is what messed up my attempt to use a simple
> > interrupt-nesting counter for RCU dynticks some years back.  What I
> > counted on there was that the idle loop would never do that sort of
> > thing, so I could zero the count when entering idle from process
> > context.
> >
> > But I have not yet found a similar trick for counting voluntary
> > context switches.
> >
> > I also tried making context_switch() look like a momentary quiescent
> > state, but of course that means that tasks that block forever also
> > block the grace period forever.  At which point, I need to scan the
> > task list to find them.  And that pretty much brings me back to the
> > current RCU-tasks implementation.  :-/
>
> Nothing should block in a preempted state forever, and if it does, that
> means we want to wait forever. Because it could be preempted on the
> trampoline.

Blocking in a preempted state is not the problem here.  Given that the
obvious hooks don't seem to be catching all of the switch-to and
switch-from events, blocking forever in a not-preempted state is the
problem.  I either need some way to see all of the switch-from and
switch-to events (and the ways I can see to do this have patch-size and
maintainability issues), or I need to go back to scanning the task list.

And of course, all of the approaches that update state upon context
switch are slowing down a fastpath for the benefit of a slowpath, which
is not necessarily all that good of a thing.

							Thanx, Paul
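
The trade-off between a context-switch hook and a task-list scan can be illustrated with a toy user-space model. This is only a sketch and not kernel code: the `Task` class, its `switches` and `blocked` fields, and the three helper functions are hypothetical names invented for illustration, and the model assumes that a voluntarily blocked task counts as quiescent.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    switches: int = 0      # voluntary context switches seen by the hook
    blocked: bool = False  # True while the task sleeps voluntarily


def snapshot(tasks):
    """Record each task's switch count at the start of a grace period."""
    return {t.name: t.switches for t in tasks}


def holdouts_hook_only(tasks, snap):
    """Tasks that have not passed through the hook since the snapshot."""
    return [t.name for t in tasks if t.switches == snap[t.name]]


def holdouts_with_scan(tasks, snap):
    """As above, but a task-list scan also treats blocked tasks as quiescent."""
    return [t.name for t in tasks
            if t.switches == snap[t.name] and not t.blocked]


# Task "b" blocked before the grace period began and never runs again.
tasks = [Task("a"), Task("b", blocked=True)]
snap = snapshot(tasks)
tasks[0].switches += 1  # "a" switches voluntarily once

print(holdouts_hook_only(tasks, snap))  # "b" holds up the grace period
print(holdouts_with_scan(tasks, snap))  # the scan finds no holdouts
```

With the hook alone, the sleeping task never reports in, so the grace period cannot end. Only the scan can notice that the task is blocked, which is the step that leads back to scanning the task list.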