From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758241Ab1CORyZ (ORCPT ); Tue, 15 Mar 2011 13:54:25 -0400 Received: from mail-qw0-f46.google.com ([209.85.216.46]:65205 "EHLO mail-qw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758215Ab1CORyY (ORCPT ); Tue, 15 Mar 2011 13:54:24 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:content-transfer-encoding :in-reply-to:user-agent; b=EwdeH9JIV5rwqZaDjfMd3/gjPB66qJvo8BMcDjvjmkVvk6gman9wY5rBczWHmVS0W8 MCFmQigV84+kVRpoKTPhg6DoR82SR2wvHLl2TI07HgauiwCm6vgqcVSyH2AS7/9nGtSm Nkkc7cVlk8qB8jYgzMUCTptXWbFjJpDmKpF5U= Date: Tue, 15 Mar 2011 18:54:19 +0100 From: Frederic Weisbecker To: Lin Ming Cc: LKML , Ingo Molnar , Peter Zijlstra , Arnaldo Carvalho de Melo , Paul Mackerras , Stephane Eranian , Steven Rostedt , Masami Hiramatsu , Thomas Gleixner , Hitoshi Mitake Subject: Re: [RFC PATCH 1/4] perf: Starter and stopper events Message-ID: <20110315175415.GA6605@nowhere> References: <1300130283-10466-1-git-send-email-fweisbec@gmail.com> <1300130283-10466-2-git-send-email-fweisbec@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Mar 15, 2011 at 10:36:18PM +0800, Lin Ming wrote: > On Tue, Mar 15, 2011 at 3:18 AM, Frederic Weisbecker wrote: > > > > > +static void perf_event_pause_resume(struct perf_event *event, int nmi, > > +                                   struct perf_sample_data *data, > > +                                   struct pt_regs *regs) > > +{ > > +       struct perf_event *iter; > > +       unsigned long flags; > > + > > +       /* > > +        * Ensure the targets can't be sched in/out concurrently. > > +        * Disabling irqs is sufficient for that because starters/stoppers > > +        * are on the same cpu/task. > > +        */ > > +       local_irq_save(flags); > > Could you explain this more detail? Yeah, I should have detailed that more. So, I put a constraint in starters and stoppers: those must be attached to the same task and cpu than the target. That allows us to do this pause/resume lockless if we can ensure that: - target sched in/out can't interrupt perf_event_pause_resume() - perf_event_pause_resume() can interrupt the target in the middle of event_sched_in() So that both are strictly serialized. We need to ensure that the target event can not be concurrently scheduled in (->add()) or scheduled out (->del() ), so that when we check PERF_EVENT_STATE_ACTIVE, we know that the event is currently running and is not going to move while we do our checks and we call start() and stop(). So the rationale is that the target can not be in the middle of event_sched_in() or event_sched_out() when the starter/stopper trigger. We have no guarantee of that currently, especially because of events that trigger in NMIs, but also for other corner cases may be, so I'll need to think about it later. Why not by using pmu_disable_all() on the starter/stopper when the target is about to schedule in/out, until we know the event->state really reflects the hardware and logical states. Now event_sched_in() and event_sched_out() can still be called from an IPI to enable/disable an event. Hence the interrupts disabled to prevent from that. > > + > > + > > +       /* Prevent the targets from being removed under us. */ > > +       rcu_read_lock(); > > + > > +       list_for_each_entry_rcu(iter, &event->starter_list, starter_entry) { > > +               /* > > +                * There is a small race window here, between the state > > +                * gets set to active and the event is actually ->add(). > > +                * We need to find a way to ensure the starter/stopper > > +                * can't trigger in between. > > Can we set the state to active after the event is actually ->add()? If we do so, the event may trigger before its state gets updated and that state can be checked from event fast paths. As is the case from hrtimer for example. Or perf_event_update_userpage(). Not sure how we deal with that with del() called after we update the state.