On Wed, May 14, 2014 at 01:30:39AM +0200, Frederic Weisbecker wrote:
> On Fri, May 09, 2014 at 02:14:10PM +0530, Viresh Kumar wrote:
> > On 23 April 2014 16:42, Viresh Kumar wrote:
> > > On 15 April 2014 15:00, Frederic Weisbecker wrote:
> > >> Ok, I'm a bit busy with a conference right now but I'm going to summarize that
> > >> soonish.
> >
> > Hi Frederic,
> >
> > Please see if you can find some time to close this, that would be very
> > helpful :)
> >
> > Thanks
>
> I'm generally worried about the accounting in update_curr() that periodically
> updates stats. I have no idea which of these stats could be read by other CPUs:
> vruntime, load bandwidth, etc...

update_curr() principally updates sum_exec_runtime and vruntime.

Now vruntime is only interesting for other CPUs when moving tasks across
CPUs, so see below on load-balancing.

sum_exec_runtime is used for a number of task stats, but when nobody
looks at those it shouldn't matter.

So rather than constantly force-updating them for no purpose, update them
on demand. So when someone reads those cputime stats, prod the task/cpu.
I think you can do a remote update_curr() just fine.

And I suppose you also need to do something with task_tick_numa(), which
relies on the tick regardless of nr_running. And that's very much not
something you can do remotely.

As it stands I think NUMA balancing and nohz_full are incompatible.

> Also without tick:
>
> * we don't poll anymore on trigger_load_balance()
>
> * __update_cpu_load() requires fixed rate periodic polling. Alex Shi had
>   patches for that but I'm not sure if that's going to be merged
>
> * rq->rt_avg accounting?

So I think typically we don't want load-balancing to happen when we're
on a nohz_full cpu and there's only the one task running.

So what you can do is extend the existing nohz balancing (which
currently only deals with CPU_IDLE) to also remote balance CPU_NOT_IDLE
when nr_running == 1.
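
To make the on-demand accounting idea for sum_exec_runtime a bit more
concrete, here is a rough userspace sketch. It is not kernel code: the
toy_rq structure and the toy_*() helpers are hypothetical names made up
for illustration, the model is single-threaded, and a real remote
update_curr() would presumably have to take the remote runqueue lock.
It only shows that a reader can fold in the elapsed time itself when the
tick is not doing it.

```c
#include <assert.h>
#include <stdint.h>

/* Toy runqueue; hypothetical, only loosely modelled on the scheduler's rq. */
struct toy_rq {
	uint64_t exec_start_ns;    /* time the stats were last brought up to date */
	uint64_t sum_exec_runtime; /* accumulated runtime of the running task */
	int tick_stopped;          /* non-zero when the periodic tick is off */
};

/* Fold the time elapsed since the last update into the stats. */
void toy_update_curr(struct toy_rq *rq, uint64_t now_ns)
{
	assert(now_ns >= rq->exec_start_ns); /* time must not go backwards */
	rq->sum_exec_runtime += now_ns - rq->exec_start_ns;
	rq->exec_start_ns = now_ns;
}

/* Periodic tick path: does nothing once the tick has been stopped. */
void toy_tick(struct toy_rq *rq, uint64_t now_ns)
{
	if (!rq->tick_stopped)
		toy_update_curr(rq, now_ns);
}

/*
 * Reader path: bring the stats up to date on demand, then report them.
 * In this toy the reader touches the rq directly; doing this from another
 * CPU would need proper locking, which is left out here.
 */
uint64_t toy_read_runtime(struct toy_rq *rq, uint64_t now_ns)
{
	toy_update_curr(rq, now_ns);
	return rq->sum_exec_runtime;
}
```

With tick_stopped set, calls to toy_tick() leave sum_exec_runtime stale,
and the first toy_read_runtime() catches it up. That is the whole point:
the cost of the update moves from the nohz_full CPU's tick to whoever
actually wants the number.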