Date: Tue, 1 Dec 2015 23:20:28 +0100
From: Frederic Weisbecker
To: Peter Zijlstra
Cc: LKML, Chris Metcalf, Thomas Gleixner, Luiz Capitulino, Christoph Lameter,
	Ingo Molnar, Viresh Kumar, Rik van Riel
Subject: Re: [PATCH 2/7] nohz: New tick dependency mask
Message-ID: <20151201222025.GA31973@lerouge>
References: <1447424529-13671-1-git-send-email-fweisbec@gmail.com>
	<1447424529-13671-3-git-send-email-fweisbec@gmail.com>
	<20151201204109.GN17308@twins.programming.kicks-ass.net>
In-Reply-To: <20151201204109.GN17308@twins.programming.kicks-ass.net>

On Tue, Dec 01, 2015 at 09:41:09PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 13, 2015 at 03:22:04PM +0100, Frederic Weisbecker wrote:
> > The tick dependency is evaluated on every IRQ. This is a batch of checks
> > which determine whether it is safe to stop the tick or not. These checks
> > are often split into many details: posix cpu timers, scheduler, sched clock,
> > perf events. Each of these is made of smaller details: the posix cpu timer
> > involves checking process wide timers, then thread wide timers. Perf
> > involves checking freq events, then more per cpu details.
> >
> > Checking these details asynchronously every time we update the full
> > dynticks state brings avoidable overhead and a messy layout.
> >
> > Let's instead introduce tick dependency masks: one for system wide
> > dependencies (unstable sched clock), one for CPU wide dependencies (sched,
> > perf), and task/signal level dependencies. The subsystems are responsible
> > for setting and clearing their dependency through a set of APIs that will
> > take care of concurrent dependency mask modifications and kick targets
> > to restart the relevant CPU tick whenever needed.
>
> Maybe better explain why we need the per task and per signal thingy?

I'll detail that some more in the changelog. The only user of the per task /
per signal tick dependency is the posix cpu timer. I first proposed a global
tick dependency, set as soon as any posix cpu timer is armed. It simplified
everything, but some reviewers complained (eg: some users might want to run
posix timers on housekeepers without bothering full dynticks CPUs).

I could remove the per signal dependency by dispatching it through all threads
in the group each time there is an update, but that's the best I can think of.

>
> > +static void trace_tick_dependency(unsigned long dep)
> > +{
> > +	if (dep & TICK_POSIX_TIMER_MASK) {
> > +		trace_tick_stop(0, "posix timers running\n");
> > +		return;
> > +	}
> > +
> > +	if (dep & TICK_PERF_EVENTS_MASK) {
> > +		trace_tick_stop(0, "perf events running\n");
> > +		return;
> > +	}
> > +
> > +	if (dep & TICK_SCHED_MASK) {
> > +		trace_tick_stop(0, "more than 1 task in runqueue\n");
> > +		return;
> > +	}
> > +
> > +	if (dep & TICK_CLOCK_UNSTABLE_MASK)
> > +		trace_tick_stop(0, "unstable sched clock\n");
> > +}
>
> I would suggest ditching the strings and using the

You mean using a code value instead?
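Something like this, perhaps? Just a sketch, assuming the trace_tick_stop()
tracepoint were changed to take the raw dependency value instead of a string
(that signature change is hypothetical here, not part of the quoted patch):

	/*
	 * Sketch only: assumes a hypothetical trace_tick_stop(int, unsigned long)
	 * that takes the dependency value rather than a string.
	 */
	static void trace_tick_dependency(unsigned long dep)
	{
		/* Report the raw mask; the trace consumer decodes which bits blocked the stop. */
		if (dep)
			trace_tick_stop(0, dep);
	}

That would shrink the function to a single tracepoint call and leave the
decoding to whoever reads the trace.
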
>
> > +static void kick_all_work_fn(struct work_struct *work)
> > +{
> > +	tick_nohz_full_kick_all();
> > +}
> > +static DECLARE_WORK(kick_all_work, kick_all_work_fn);
> > +
> > +void __tick_nohz_set_dep_delayed(enum tick_dependency_bit bit, unsigned long *dep)
> > +{
> > +	unsigned long prev;
> > +
> > +	prev = fetch_or(dep, BIT_MASK(bit));
> > +	if (!prev) {
> > +		/*
> > +		 * We need the IPIs to be sent from sane process context.
>
> Why?

Because the posix timers code is all called with interrupts disabled, and we
can't send IPIs then.

>
> > +		 * The posix cpu timers are always set with irqs disabled.
> > +		 */
> > +		schedule_work(&kick_all_work);
> > +	}
> > +}
> > +
> > +/*
> > + * Set a global tick dependency. Lets do the wide IPI kick asynchronously
> > + * for callers with irqs disabled.
>
> This seems to suggest you can call this with IRQs disabled

Ah right, that's a misleading comment. We need to use the _delayed() version
when interrupts are disabled. Thanks.

>
> > + */
> > +void tick_nohz_set_dep(enum tick_dependency_bit bit)
> > +{
> > +	unsigned long prev;
> > +
> > +	prev = fetch_or(&tick_dependency, BIT_MASK(bit));
> > +	if (!prev)
> > +		tick_nohz_full_kick_all();
>
> But that function seems to be implemented using smp_call_function_many(),
> which cannot be called with IRQs disabled.
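
For reference, here is a condensed sketch of the pattern in the two functions
quoted above. It is an illustration only, not the actual patch: the combined
helper and its bool parameter are hypothetical, the patch keeps the two paths
in separate functions.

	/*
	 * Sketch only: the "kick on the 0 -> non-0 transition" pattern from the
	 * quoted code, with the two kick paths made explicit.
	 */
	static void set_dep_and_kick(unsigned long *dep, enum tick_dependency_bit bit,
				     bool irqs_are_disabled)
	{
		/* fetch_or() atomically sets the bit and returns the previous mask. */
		unsigned long prev = fetch_or(dep, BIT_MASK(bit));

		/* Later setters see a non-zero mask and skip the expensive kick. */
		if (!prev) {
			if (irqs_are_disabled)
				/* Can't IPI here: defer the kick to process context. */
				schedule_work(&kick_all_work);
			else
				/* IPI all full dynticks CPUs to reevaluate the tick. */
				tick_nohz_full_kick_all();
		}
	}

The direct path sends IPIs (per Peter's point, via smp_call_function_many()),
so it can only run with irqs enabled; irq-disabled callers such as the posix
cpu timers have to take the deferred path.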