All of lore.kernel.org
 help / color / mirror / Atom feed
* [Query]: tick-sched: why don't we stop tick when we are running idle task?
@ 2014-04-09 10:33 Viresh Kumar
  2014-04-09 10:49 ` Viresh Kumar
  0 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2014-04-09 10:33 UTC (permalink / raw)
  To: Frédéric Weisbecker
  Cc: Thomas Gleixner, Linux Kernel Mailing List, Lists linaro-kernel

Hi Frederic,

File: kernel/time/tick-sched.c
Function: tick_nohz_full_stop_tick()

We are doing this:

if (!tick_nohz_full_cpu(cpu) || is_idle_task(current))
        return;

Which means: if a FULL_NO_HZ cpu is running idle task currently,
don't stop its tick..

I couldn't understand why. Can you please help here?

--
viresh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-09 10:33 [Query]: tick-sched: why don't we stop tick when we are running idle task? Viresh Kumar
@ 2014-04-09 10:49 ` Viresh Kumar
  2014-04-10 14:39   ` Frederic Weisbecker
  0 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2014-04-09 10:49 UTC (permalink / raw)
  To: Frédéric Weisbecker
  Cc: Thomas Gleixner, Linux Kernel Mailing List, Lists linaro-kernel

On 9 April 2014 16:03, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Hi Frederic,
>
> File: kernel/time/tick-sched.c
> Function: tick_nohz_full_stop_tick()
>
> We are doing this:
>
> if (!tick_nohz_full_cpu(cpu) || is_idle_task(current))
>         return;
>
> Which means: if a FULL_NO_HZ cpu is running idle task currently,
> don't stop its tick..
>
> I couldn't understand why. Can you please help here?

Is it because of this code ? :

void tick_nohz_irq_exit(void)
{
        struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);

        if (ts->inidle)
                __tick_nohz_idle_enter(ts);
        else
                tick_nohz_full_stop_tick(ts);
}

i.e. tick_nohz_full_stop_tick() would never be called while we have
'inidle' set or we are running idle task?

In this case as well, we don't really need that check to be present.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-09 10:49 ` Viresh Kumar
@ 2014-04-10 14:39   ` Frederic Weisbecker
  2014-04-11 10:04     ` Viresh Kumar
  0 siblings, 1 reply; 22+ messages in thread
From: Frederic Weisbecker @ 2014-04-10 14:39 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Thomas Gleixner, Linux Kernel Mailing List, Lists linaro-kernel

On Wed, Apr 09, 2014 at 04:19:44PM +0530, Viresh Kumar wrote:
> On 9 April 2014 16:03, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > Hi Frederic,
> >
> > File: kernel/time/tick-sched.c
> > Function: tick_nohz_full_stop_tick()
> >
> > We are doing this:
> >
> > if (!tick_nohz_full_cpu(cpu) || is_idle_task(current))
> >         return;
> >
> > Which means: if a FULL_NO_HZ cpu is running idle task currently,
> > don't stop its tick..
> >
> > I couldn't understand why. Can you please help here?
> 
> Is it because of this code ? :
> 
> void tick_nohz_irq_exit(void)
> {
>         struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
> 
>         if (ts->inidle)
>                 __tick_nohz_idle_enter(ts);
>         else
>                 tick_nohz_full_stop_tick(ts);
> }
> 
> i.e. tick_nohz_full_stop_tick() would never be called while we have
> 'inidle' set or we are running idle task?
> 
> In this case as well, we don't really need that check to be present.

So I did that to avoid full dynticks to mess up with idle dynticks which
has special requirement, accounting and stuff that full dynticks doesn't
have.

For example in tick_nohz_irq_exit(), if we are in dyntick idle (ts->inidle && ts->tick_stopped)
and we call the above tick_nohz_full_stop_tick() instead of __tick_nohz_idle_enter(), we'll
bug the idle time accounting.

So I kept the tick_nohz_full_stop_tick() function away when the task is idle. But I've
been lazy by using is_idle_task(current) instead of ts->inidle.

Note that this also match __tick_nohz_full_check() that also doesn't restart the tick
when we are idle so to avoid messing up with dynticks idle. And it does that also because
it knows that full dyticks doesn't happen when we idle.

But yeah this is all suboptimal. We should rely on ts->inidle instead.
See below (untested):

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9f8af69..1e2d6b7 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -202,13 +202,16 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now);
 void __tick_nohz_full_check(void)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	unsigned long flags;
 
+	local_irq_save(flags);
 	if (tick_nohz_full_cpu(smp_processor_id())) {
-		if (ts->tick_stopped && !is_idle_task(current)) {
+		if (ts->tick_stopped && !ts->inidle)) {
 			if (!can_stop_full_tick())
 				tick_nohz_restart_sched_tick(ts, ktime_get());
 		}
 	}
+	local_irq_restore(flags);
 }
 
 static void nohz_full_kick_work_func(struct irq_work *work)
@@ -681,7 +684,9 @@ static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 #ifdef CONFIG_NO_HZ_FULL
 	int cpu = smp_processor_id();
 
-	if (!tick_nohz_full_cpu(cpu) || is_idle_task(current))
+	WARN_ON_ONCE(ts->inidle);
+
+	if (!tick_nohz_full_cpu(cpu))
 		return;
 
 	if (!ts->tick_stopped && ts->nohz_mode == NOHZ_MODE_INACTIVE)



---

There is a risk that ts->idle_jiffies snaphots get missed if tick_nohz_idle_enter()
is called when full dynticks is already on. But this was already possible if full
dynticks was already armed from a previous task. So nothing new.

If you like it I'll push it to Ingo.

Thanks.

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-10 14:39   ` Frederic Weisbecker
@ 2014-04-11 10:04     ` Viresh Kumar
  2014-04-11 14:53       ` Frederic Weisbecker
  0 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2014-04-11 10:04 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, Linux Kernel Mailing List, Lists linaro-kernel

On 10 April 2014 20:09, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 9f8af69..1e2d6b7 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -202,13 +202,16 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now);
>  void __tick_nohz_full_check(void)
>  {
>         struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
> +       unsigned long flags;
>
> +       local_irq_save(flags);

As we need to disable interrupts to read this variable, would it be
better to just remove this completely and use is_idle_task(current)
instead?

>         if (tick_nohz_full_cpu(smp_processor_id())) {
> -               if (ts->tick_stopped && !is_idle_task(current)) {
> +               if (ts->tick_stopped && !ts->inidle)) {
>                         if (!can_stop_full_tick())
>                                 tick_nohz_restart_sched_tick(ts, ktime_get());
>                 }
>         }
> +       local_irq_restore(flags);
>  }

> If you like it I'll push it to Ingo.

Yes please. And thanks for the explanations. It was pretty useful.

I am looking to offload 1 second tick to timekeeping CPUs and so
going through these frameworks. I don't have a working solution yet
(even partially :)). Would send a RFC to you as soon as I get anything
working.

--
viresh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-11 10:04     ` Viresh Kumar
@ 2014-04-11 14:53       ` Frederic Weisbecker
  2014-04-11 15:18         ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Frederic Weisbecker @ 2014-04-11 14:53 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Thomas Gleixner, Linux Kernel Mailing List, Lists linaro-kernel,
	Peter Zijlstra

On Fri, Apr 11, 2014 at 03:34:23PM +0530, Viresh Kumar wrote:
> On 10 April 2014 20:09, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 9f8af69..1e2d6b7 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -202,13 +202,16 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now);
> >  void __tick_nohz_full_check(void)
> >  {
> >         struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
> > +       unsigned long flags;
> >
> > +       local_irq_save(flags);
> 
> As we need to disable interrupts to read this variable, would it be
> better to just remove this completely and use is_idle_task(current)
> instead?

I don't get what you mean. The goal was get read of the hammer is_idle_task()
check, wasn't it?

Also irqs are disabled but this is fundamentaly not required as this can only be
called by IPIs which always have irqs disabled.

Hmm I should add a WARN_ON_(!irqs_disabled()) though just in case.

> 
> >         if (tick_nohz_full_cpu(smp_processor_id())) {
> > -               if (ts->tick_stopped && !is_idle_task(current)) {
> > +               if (ts->tick_stopped && !ts->inidle)) {
> >                         if (!can_stop_full_tick())
> >                                 tick_nohz_restart_sched_tick(ts, ktime_get());
> >                 }
> >         }
> > +       local_irq_restore(flags);
> >  }
> 
> > If you like it I'll push it to Ingo.
> 
> Yes please. And thanks for the explanations. It was pretty useful.
> 
> I am looking to offload 1 second tick to timekeeping CPUs and so
> going through these frameworks. I don't have a working solution yet
> (even partially :)). Would send a RFC to you as soon as I get anything
> working.

I see. The only solution I can think of right now is to have the timekeeper call
sched_class(current[$CPU])::scheduler_tick() on behalf of all full dynticks CPUs.

This sounds costly but can be done once per sec for each CPUs. Not sure if Peterz will
like it but sending mockup RFC patches will tell us more about his opinion :)

Otherwise (and ideally) we need to make the scheduler code able to handle long periods without
calling scheduler_tick(). But this is a lot more plumbing. And the scheduler has gazillions
accounting stuffs to handle. Sounds like a big nightmare to take that direction.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-11 14:53       ` Frederic Weisbecker
@ 2014-04-11 15:18         ` Peter Zijlstra
  2014-04-11 16:38           ` Viresh Kumar
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2014-04-11 15:18 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Viresh Kumar, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On Fri, Apr 11, 2014 at 04:53:35PM +0200, Frederic Weisbecker wrote:
> On Fri, Apr 11, 2014 at 03:34:23PM +0530, Viresh Kumar wrote:
> > On 10 April 2014 20:09, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index 9f8af69..1e2d6b7 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -202,13 +202,16 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now);
> > >  void __tick_nohz_full_check(void)
> > >  {
> > >         struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
> > > +       unsigned long flags;
> > >
> > > +       local_irq_save(flags);
> > 
> > As we need to disable interrupts to read this variable, would it be
> > better to just remove this completely and use is_idle_task(current)
> > instead?
> 
> I don't get what you mean. The goal was get read of the hammer is_idle_task()
> check, wasn't it?
> 
> Also irqs are disabled but this is fundamentaly not required as this can only be
> called by IPIs which always have irqs disabled.
> 
> Hmm I should add a WARN_ON_(!irqs_disabled()) though just in case.
> 
> > 
> > >         if (tick_nohz_full_cpu(smp_processor_id())) {
> > > -               if (ts->tick_stopped && !is_idle_task(current)) {
> > > +               if (ts->tick_stopped && !ts->inidle)) {
> > >                         if (!can_stop_full_tick())
> > >                                 tick_nohz_restart_sched_tick(ts, ktime_get());
> > >                 }
> > >         }
> > > +       local_irq_restore(flags);
> > >  }
> > 
> > > If you like it I'll push it to Ingo.
> > 
> > Yes please. And thanks for the explanations. It was pretty useful.
> > 
> > I am looking to offload 1 second tick to timekeeping CPUs and so
> > going through these frameworks. I don't have a working solution yet
> > (even partially :)). Would send a RFC to you as soon as I get anything
> > working.
> 
> I see. The only solution I can think of right now is to have the timekeeper call
> sched_class(current[$CPU])::scheduler_tick() on behalf of all full dynticks CPUs.
> 
> This sounds costly but can be done once per sec for each CPUs. Not sure if Peterz will
> like it but sending mockup RFC patches will tell us more about his opinion :)

I think there's assumptions that tick runs on the local cpu; also what
are you going to do when running it on all remote cpus takes longer than
the tick?

> Otherwise (and ideally) we need to make the scheduler code able to handle long periods without
> calling scheduler_tick(). But this is a lot more plumbing. And the scheduler has gazillions
> accounting stuffs to handle. Sounds like a big nightmare to take that direction.

So i'm not at all sure what you guys are talking about, but it seems to
me you should all put down the bong and have a detox round instead.

This all sounds like a cure worse than the problem.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-11 15:18         ` Peter Zijlstra
@ 2014-04-11 16:38           ` Viresh Kumar
  2014-04-14  9:48             ` Preeti Murthy
  2014-04-14 11:02             ` Peter Zijlstra
  0 siblings, 2 replies; 22+ messages in thread
From: Viresh Kumar @ 2014-04-11 16:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On 11 April 2014 20:48, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Apr 11, 2014 at 04:53:35PM +0200, Frederic Weisbecker wrote:

> I think there's assumptions that tick runs on the local cpu;

Yes, many function behave that way, i.e. with smp_processor_id() as
CPU.

> also what
> are you going to do when running it on all remote cpus takes longer than
> the tick?
>
>> Otherwise (and ideally) we need to make the scheduler code able to handle long periods without
>> calling scheduler_tick(). But this is a lot more plumbing. And the scheduler has gazillions
>> accounting stuffs to handle. Sounds like a big nightmare to take that direction.
>
> So i'm not at all sure what you guys are talking about, but it seems to
> me you should all put down the bong and have a detox round instead.
>
> This all sounds like a cure worse than the problem.

So, what I was working on isn't ready yet but I would like to show what lines
I have been trying on. In case that is completely incorrect and I should stop
making that work :)

Please share your feedback about this (Yes there are several parts broken
currently, specially the assumption that tick runs on local CPU):

NOTE: Its rebased over some cleanups I did which aren't sent to LKML yet.

---------x-------------------------x---------------
    tick-sched: offload NO_HZ_FULL computation to timekeeping CPUs

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 include/linux/tick.h     |  2 ++
 kernel/time/tick-sched.c | 90
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 2 files changed, 82 insertions(+), 10 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 97bbb64..f8efa9f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -62,6 +62,7 @@ enum tick_nohz_mode {
  */
 struct tick_sched {
        struct hrtimer                  sched_timer;
+       struct cpumask                  timekeeping_pending;
        enum tick_nohz_mode             nohz_mode;
        unsigned long                   check_clocks;
        unsigned long                   idle_jiffies;
@@ -77,6 +78,7 @@ struct tick_sched {
        ktime_t                         iowait_sleeptime;
        ktime_t                         sleep_length;
        ktime_t                         idle_expires;
+       unsigned int                    cpu;
        int                             inidle;
        int                             idle_active;
        int                             tick_stopped;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 2d0b154..7560bd0 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -532,13 +532,56 @@ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
 }
 EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);

+static int tick_nohz_full_get_offload_cpu(unsigned int cpu)
+{
+       const struct cpumask *nodemask = cpumask_of_node(cpu_to_node(cpu));
+       unsigned int offload_cpu;
+       cpumask_t cpumask;
+
+       /* Don't pick any NO_HZ_FULL cpu */
+       cpumask_andnot(&cpumask, cpu_online_mask, tick_nohz_full_mask);
+       cpumask_clear_cpu(cpu, &cpumask);
+
+       /* Try for same node. */
+       offload_cpu = cpumask_first_and(nodemask, &cpumask);
+       if (offload_cpu < nr_cpu_ids)
+               return offload_cpu;
+
+       /* Any online will do */
+       return cpumask_any(&cpumask);
+}
+
+static void tick_nohz_full_offload_timer(void *info)
+{
+       struct tick_sched *ts = info;
+       hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
+}
+
+static void tick_nohz_full_offload_tick(struct tick_sched *ts, ktime_t expires,
+                                       unsigned int cpu)
+{
+       unsigned int offload_cpu = tick_nohz_full_get_offload_cpu(cpu);
+
+       /* Offload accounting to timekeeper */
+       hrtimer_cancel(&ts->sched_timer);
+       hrtimer_set_expires(&ts->sched_timer, expires);
+
+       /*
+        * This is triggering a WARN_ON() as below routine doesn't support calls
+        * while interrupts are disabled. Need to think of some other way to get
+        * this fixed.
+        */
+       smp_call_function_single(offload_cpu, tick_nohz_full_offload_timer, ts,
+                                true);
+}
+
 static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
                                         ktime_t now, int cpu)
 {
        unsigned long seq, last_jiffies, next_jiffies, delta_jiffies;
        ktime_t last_update, expires, ret = { .tv64 = 0 };
        unsigned long rcu_delta_jiffies;
-       struct clock_event_device *dev = tick_get_cpu_device()->evtdev;
+       struct clock_event_device *dev = tick_get_device(cpu)->evtdev;
        u64 time_delta;

        time_delta = timekeeping_max_deferment();
@@ -661,8 +704,12 @@ static ktime_t tick_nohz_stop_sched_tick(struct
tick_sched *ts,
                }

                if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
-                       hrtimer_start(&ts->sched_timer, expires,
-                                     HRTIMER_MODE_ABS_PINNED);
+                       if (ts->inidle)
+                               hrtimer_start(&ts->sched_timer, expires,
+                                             HRTIMER_MODE_ABS_PINNED);
+                       else
+                               tick_nohz_full_offload_tick(ts, expires, cpu);
+
                        /* Check, if the timer was already in the past */
                        if (hrtimer_active(&ts->sched_timer))
                                goto out;
@@ -687,9 +734,7 @@ out:
 static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 {
 #ifdef CONFIG_NO_HZ_FULL
-       int cpu = smp_processor_id();
-
-       if (!tick_nohz_full_cpu(cpu) || is_idle_task(current))
+       if (!tick_nohz_full_cpu(ts->cpu) || is_idle_task(current))
                return;

        if (!ts->tick_stopped && ts->nohz_mode == NOHZ_MODE_INACTIVE)
@@ -698,7 +743,7 @@ static void tick_nohz_full_stop_tick(struct tick_sched *ts)
        if (!can_stop_full_tick())
                return;

-       tick_nohz_stop_sched_tick(ts, ktime_get(), cpu);
+       tick_nohz_stop_sched_tick(ts, ktime_get(), ts->cpu);
 #endif
 }

@@ -824,11 +869,21 @@ EXPORT_SYMBOL_GPL(tick_nohz_idle_enter);
 void tick_nohz_irq_exit(void)
 {
        struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+       unsigned int cpu;

-       if (ts->inidle)
+       if (ts->inidle) {
                __tick_nohz_idle_enter(ts);
-       else
+       } else {
                tick_nohz_full_stop_tick(ts);
+
+               while (!cpumask_empty(&ts->timekeeping_pending)) {
+                       cpu = cpumask_first(&ts->timekeeping_pending);
+                       cpumask_clear_cpu(cpu, &ts->timekeeping_pending);
+
+                       /* Try to stop tick of NO_HZ_FULL cpu */
+                       tick_nohz_full_stop_tick(tick_get_tick_sched(cpu));
+               }
+       }
 }

 /**
@@ -1090,13 +1145,23 @@ static enum hrtimer_restart
tick_sched_timer(struct hrtimer *timer)
 {
        struct tick_sched *ts =
                container_of(timer, struct tick_sched, sched_timer);
+       struct tick_sched *this_ts = &__get_cpu_var(tick_cpu_sched);
        ktime_t now = ktime_get();

-       tick_sched_do_timer(now);
+       /* Running as timekeeper ? */
+       if (likely(ts == this_ts))
+               tick_sched_do_timer(now);
+       else
+               cpumask_set_cpu(ts->cpu, &this_ts->timekeeping_pending);
+
        tick_sched_handle(ts);

        hrtimer_forward(timer, now, tick_period);

+       /*
+        * Yes, we are scheduling next tick also while timekeeping on this_cpu.
+        * We will handle that from irq_exit().
+        */
        return HRTIMER_RESTART;
 }

@@ -1149,6 +1214,11 @@ void tick_setup_sched_timer(void)
                tick_nohz_active = 1;
        }
 #endif
+
+#ifdef CONFIG_NO_HZ_FULL
+       ts->cpu = smp_processor_id();
+       cpumask_clear(&ts->timekeeping_pending);
+#endif
 }
 #endif /* HIGH_RES_TIMERS */

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-11 16:38           ` Viresh Kumar
@ 2014-04-14  9:48             ` Preeti Murthy
  2014-04-14  9:51               ` Viresh Kumar
  2014-04-14 11:02             ` Peter Zijlstra
  1 sibling, 1 reply; 22+ messages in thread
From: Preeti Murthy @ 2014-04-14  9:48 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Linux Kernel Mailing List, Lists linaro-kernel, Preeti U Murthy

Hi Viresh,

I am not too sure about the complexity or the worthiness of this patch but
just wanted to add that care must be taken to migrate the tick_sched_timer
of all the remote CPUs off a hotplugged out CPU if the latter was keeping
their time thus far. In the normal scenario I am guessing the tick_sched_timer
dies along with the hotplugged out CPU since there is no need for it any more.

Regards
Preeti U Murthy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-14  9:48             ` Preeti Murthy
@ 2014-04-14  9:51               ` Viresh Kumar
  0 siblings, 0 replies; 22+ messages in thread
From: Viresh Kumar @ 2014-04-14  9:51 UTC (permalink / raw)
  To: Preeti Murthy
  Cc: Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Linux Kernel Mailing List, Lists linaro-kernel, Preeti U Murthy

On 14 April 2014 15:18, Preeti Murthy <preeti.lkml@gmail.com> wrote:
> I am not too sure about the complexity or the worthiness of this patch but
> just wanted to add that care must be taken to migrate the tick_sched_timer
> of all the remote CPUs off a hotplugged out CPU if the latter was keeping
> their time thus far. In the normal scenario I am guessing the tick_sched_timer
> dies along with the hotplugged out CPU since there is no need for it any more.

Agreed. Lets see if there is anybody in favor of this work as it is
very important
for some real time use cases we have. Like running data plane threads on
isolated CPUs.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-11 16:38           ` Viresh Kumar
  2014-04-14  9:48             ` Preeti Murthy
@ 2014-04-14 11:02             ` Peter Zijlstra
  2014-04-14 11:42               ` Viresh Kumar
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2014-04-14 11:02 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Frederic Weisbecker, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On Fri, Apr 11, 2014 at 10:08:30PM +0530, Viresh Kumar wrote:
> On 11 April 2014 20:48, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, Apr 11, 2014 at 04:53:35PM +0200, Frederic Weisbecker wrote:
> 
> > I think there's assumptions that tick runs on the local cpu;
> 
> Yes, many function behave that way, i.e. with smp_processor_id() as
> CPU.
> 
> > also what
> > are you going to do when running it on all remote cpus takes longer than
> > the tick?
> >
> >> Otherwise (and ideally) we need to make the scheduler code able to handle long periods without
> >> calling scheduler_tick(). But this is a lot more plumbing. And the scheduler has gazillions
> >> accounting stuffs to handle. Sounds like a big nightmare to take that direction.
> >
> > So i'm not at all sure what you guys are talking about, but it seems to
> > me you should all put down the bong and have a detox round instead.
> >
> > This all sounds like a cure worse than the problem.
> 
> So, what I was working on isn't ready yet but I would like to show what lines
> I have been trying on. In case that is completely incorrect and I should stop
> making that work :)
> 
> Please share your feedback about this (Yes there are several parts broken
> currently, specially the assumption that tick runs on local CPU):

I'm still not sure _what_ you're trying to solve here. What are you
doing and why?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-14 11:02             ` Peter Zijlstra
@ 2014-04-14 11:42               ` Viresh Kumar
  2014-04-14 11:47                 ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2014-04-14 11:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On 14 April 2014 16:32, Peter Zijlstra <peterz@infradead.org> wrote:
> I'm still not sure _what_ you're trying to solve here. What are you
> doing and why?

Hi Peter,

We are working building ARM Networking machines. Networking Data
plane is handled completely at user space. At run time we may fix
any number of CPUs for data plane activities. There will be a single
user space thread per CPU for these data plane packet processing.
Due to timing constraints these cores can't allow any interruption
from kernel. These include interruption from:

- other tasks: Fixed with cpusets
- timers/hrtimers: Implemented cpuset.quiesce as you suggested:
Waiting for reviews
- workqueues: Probably would be fixed by Frederic's work.
- Tick: Even with NO_HZ_FULL we get a tick every second. This is
what I am trying to address here. Frederic earlier suggested to
offload this accounting to other CPUs and so was my initial proposal.

Please let me know what's the right way to get this fixed and I will
try it that way.

Thanks for your inputs.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-14 11:42               ` Viresh Kumar
@ 2014-04-14 11:47                 ` Peter Zijlstra
  2014-04-14 11:52                   ` Viresh Kumar
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2014-04-14 11:47 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Frederic Weisbecker, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On Mon, Apr 14, 2014 at 05:12:08PM +0530, Viresh Kumar wrote:
> On 14 April 2014 16:32, Peter Zijlstra <peterz@infradead.org> wrote:
> > I'm still not sure _what_ you're trying to solve here. What are you
> > doing and why?
> 
> Hi Peter,
> 
> We are working building ARM Networking machines. Networking Data
> plane is handled completely at user space. At run time we may fix
> any number of CPUs for data plane activities. There will be a single
> user space thread per CPU for these data plane packet processing.
> Due to timing constraints these cores can't allow any interruption
> from kernel. These include interruption from:
> 
> - other tasks: Fixed with cpusets
> - timers/hrtimers: Implemented cpuset.quiesce as you suggested:
> Waiting for reviews
> - workqueues: Probably would be fixed by Frederic's work.

Ok.

> - Tick: Even with NO_HZ_FULL we get a tick every second. This is
> what I am trying to address here. Frederic earlier suggested to
> offload this accounting to other CPUs and so was my initial proposal.

What causes this tick? I was under the impression that once there's a
single task (not doing any syscalls) and the above issues are sorted, no
more tick would happen.




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-14 11:47                 ` Peter Zijlstra
@ 2014-04-14 11:52                   ` Viresh Kumar
  2014-04-14 12:06                     ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2014-04-14 11:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On 14 April 2014 17:17, Peter Zijlstra <peterz@infradead.org> wrote:
> What causes this tick? I was under the impression that once there's a
> single task (not doing any syscalls) and the above issues are sorted, no
> more tick would happen.

This is what Frederic told me earlier:

https://lkml.org/lkml/2014/2/13/238

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-14 11:52                   ` Viresh Kumar
@ 2014-04-14 12:06                     ` Peter Zijlstra
  2014-04-15  6:04                       ` Viresh Kumar
  2014-04-15  9:30                       ` Frederic Weisbecker
  0 siblings, 2 replies; 22+ messages in thread
From: Peter Zijlstra @ 2014-04-14 12:06 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Frederic Weisbecker, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On Mon, Apr 14, 2014 at 05:22:30PM +0530, Viresh Kumar wrote:
> On 14 April 2014 17:17, Peter Zijlstra <peterz@infradead.org> wrote:
> > What causes this tick? I was under the impression that once there's a
> > single task (not doing any syscalls) and the above issues are sorted, no
> > more tick would happen.
> 
> This is what Frederic told me earlier:
> 
> https://lkml.org/lkml/2014/2/13/238

That's a bit of a non-answer. I'm fairly sure its not a gazillion
issues, since the actual scheduler tick doesn't actually do that much.

So start by enumerating what is actually required.

The 2), which I suppose you're now trying to implement is I think
entirely the wrong way. The tick really assumes it runs local, moving it
to another CPU is insane.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-14 12:06                     ` Peter Zijlstra
@ 2014-04-15  6:04                       ` Viresh Kumar
  2014-04-15  9:30                       ` Frederic Weisbecker
  1 sibling, 0 replies; 22+ messages in thread
From: Viresh Kumar @ 2014-04-15  6:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On 14 April 2014 17:36, Peter Zijlstra <peterz@infradead.org> wrote:
> That's a bit of a non-answer. I'm fairly sure its not a gazillion
> issues, since the actual scheduler tick doesn't actually do that much.
>
> So start by enumerating what is actually required.
>
> The 2), which I suppose you're now trying to implement is I think
> entirely the wrong way. The tick really assumes it runs local, moving it
> to another CPU is insane.

Yeah, I was trying this one :(

I still don't have enough knowledge of scheduler and so can't exactly
tell what all requires tick to fire a 1 second.

@Frederic: Can you please help :) ?

--
viresh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-14 12:06                     ` Peter Zijlstra
  2014-04-15  6:04                       ` Viresh Kumar
@ 2014-04-15  9:30                       ` Frederic Weisbecker
  2014-04-15 10:52                         ` Peter Zijlstra
  2014-04-23 11:12                         ` Viresh Kumar
  1 sibling, 2 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2014-04-15  9:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Viresh Kumar, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On Mon, Apr 14, 2014 at 02:06:00PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 14, 2014 at 05:22:30PM +0530, Viresh Kumar wrote:
> > On 14 April 2014 17:17, Peter Zijlstra <peterz@infradead.org> wrote:
> > > What causes this tick? I was under the impression that once there's a
> > > single task (not doing any syscalls) and the above issues are sorted, no
> > > more tick would happen.
> > 
> > This is what Frederic told me earlier:
> > 
> > https://lkml.org/lkml/2014/2/13/238
> 
> That's a bit of a non-answer. I'm fairly sure its not a gazillion
> issues, since the actual scheduler tick doesn't actually do that much.
> 
> So start by enumerating what is actually required.

Ok, I'm a bit buzy with a conference right now but I'm going to summarize that
soonish.

> 
> The 2), which I suppose you're now trying to implement is I think
> entirely the wrong way. The tick really assumes it runs local, moving it
> to another CPU is insane.

There is probably a few things that assume local calls but last time
I checked I had the impression that it was fairly possible to call sched_class::task_tick()
remotely. rq is locked, no reference to "current", use rq accessors...

OTOH scheduler_tick() itself definetly requires local calls.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-15  9:30                       ` Frederic Weisbecker
@ 2014-04-15 10:52                         ` Peter Zijlstra
  2014-04-15 10:53                           ` Peter Zijlstra
  2014-04-23 11:12                         ` Viresh Kumar
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2014-04-15 10:52 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Viresh Kumar, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On Tue, Apr 15, 2014 at 11:30:04AM +0200, Frederic Weisbecker wrote:
> There is probably a few things that assume local calls but last time
> I checked I had the impression that it was fairly possible to call sched_class::task_tick()
> remotely. rq is locked, no reference to "current", use rq accessors...
> 
> OTOH scheduler_tick() itself definetly requires local calls.

possible isn't the problem, its completely insane to do that.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-15 10:52                         ` Peter Zijlstra
@ 2014-04-15 10:53                           ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2014-04-15 10:53 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Viresh Kumar, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On Tue, Apr 15, 2014 at 12:52:26PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 15, 2014 at 11:30:04AM +0200, Frederic Weisbecker wrote:
> > There is probably a few things that assume local calls but last time
> > I checked I had the impression that it was fairly possible to call sched_class::task_tick()
> > remotely. rq is locked, no reference to "current", use rq accessors...
> > 
> > OTOH scheduler_tick() itself definetly requires local calls.
> 
> possible isn't the problem, its completely insane to do that.

What's more, I'm still waiting to hear why we're wanting to do any of
this.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-15  9:30                       ` Frederic Weisbecker
  2014-04-15 10:52                         ` Peter Zijlstra
@ 2014-04-23 11:12                         ` Viresh Kumar
  2014-05-09  8:44                           ` Viresh Kumar
  1 sibling, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2014-04-23 11:12 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On 15 April 2014 15:00, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Ok, I'm a bit buzy with a conference right now but I'm going to summarize that
> soonish.

Are you back now ?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-04-23 11:12                         ` Viresh Kumar
@ 2014-05-09  8:44                           ` Viresh Kumar
  2014-05-13 23:30                             ` Frederic Weisbecker
  0 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2014-05-09  8:44 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On 23 April 2014 16:42, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 15 April 2014 15:00, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>> Ok, I'm a bit buzy with a conference right now but I'm going to summarize that
>> soonish.

Hi Frederic,

Please see if you can find some time to close this, that would be very
helpful :)

Thanks

--
viresh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-05-09  8:44                           ` Viresh Kumar
@ 2014-05-13 23:30                             ` Frederic Weisbecker
  2014-05-22  8:44                               ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 23:30 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Peter Zijlstra, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

On Fri, May 09, 2014 at 02:14:10PM +0530, Viresh Kumar wrote:
> On 23 April 2014 16:42, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 15 April 2014 15:00, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >> Ok, I'm a bit buzy with a conference right now but I'm going to summarize that
> >> soonish.
> 
> Hi Frederic,
> 
> Please see if you can find some time to close this, that would be very
> helpful :)
> 
> Thanks

I'm generally worried about the accounting in update_curr() that periodically
updates stats. I have no idea which of these stats could be read by other CPUs:
vruntime, load bandwitdth, etc...

Also without tick:

* we don't poll anymore on trigger_load_balance()

* __update_cpu_load() requires fixed rate periodic polling. Alex Shi had
patches for that but I'm not sure if that's going to be merged

* rq->rt_avg accounting?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
  2014-05-13 23:30                             ` Frederic Weisbecker
@ 2014-05-22  8:44                               ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2014-05-22  8:44 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Viresh Kumar, Thomas Gleixner, Linux Kernel Mailing List,
	Lists linaro-kernel

[-- Attachment #1: Type: text/plain, Size: 2040 bytes --]

On Wed, May 14, 2014 at 01:30:39AM +0200, Frederic Weisbecker wrote:
> On Fri, May 09, 2014 at 02:14:10PM +0530, Viresh Kumar wrote:
> > On 23 April 2014 16:42, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > On 15 April 2014 15:00, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > >> Ok, I'm a bit buzy with a conference right now but I'm going to summarize that
> > >> soonish.
> > 
> > Hi Frederic,
> > 
> > Please see if you can find some time to close this, that would be very
> > helpful :)
> > 
> > Thanks
> 
> I'm generally worried about the accounting in update_curr() that periodically
> updates stats. I have no idea which of these stats could be read by other CPUs:
> vruntime, load bandwitdth, etc...

update_curr() principally does the sum_exec_runtime and vruntime. Now
vruntime is only interesting for other cpus when moving tasks across
CPUs, so see below on load-balancing.

sum_exec_runtime is used for a number of task stats, but when nobody
looks at those it shouldn't matter.

So rather than constantly force update them for no purpose, update them
on-demand. So when someone reads those cputime stats, prod the task/cpu.
I think you can do a remote update_curr() just fine.

And I suppose you also need to do something with task_tick_numa(), that
relies on the tick regardless of nr_running. And that's very much not
something you can do remotely. As it stands I think the numa balancing
and nohz_full are incompatible.

> Also without tick:
> 
> * we don't poll anymore on trigger_load_balance()
> 
> * __update_cpu_load() requires fixed rate periodic polling. Alex Shi had
> patches for that but I'm not sure if that's going to be merged
> 
> * rq->rt_avg accounting?

So I think typically we don't want load-balancing to happen when we're
on a nohz_full cpu and there's only the one task running.

So what you can do is extend the existing nohz balancing (which
currently only deals with CPU_IDLE) to also remote balance CPU_NOT_IDLE
when nr_running == 1.


[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2014-05-22  8:44 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-09 10:33 [Query]: tick-sched: why don't we stop tick when we are running idle task? Viresh Kumar
2014-04-09 10:49 ` Viresh Kumar
2014-04-10 14:39   ` Frederic Weisbecker
2014-04-11 10:04     ` Viresh Kumar
2014-04-11 14:53       ` Frederic Weisbecker
2014-04-11 15:18         ` Peter Zijlstra
2014-04-11 16:38           ` Viresh Kumar
2014-04-14  9:48             ` Preeti Murthy
2014-04-14  9:51               ` Viresh Kumar
2014-04-14 11:02             ` Peter Zijlstra
2014-04-14 11:42               ` Viresh Kumar
2014-04-14 11:47                 ` Peter Zijlstra
2014-04-14 11:52                   ` Viresh Kumar
2014-04-14 12:06                     ` Peter Zijlstra
2014-04-15  6:04                       ` Viresh Kumar
2014-04-15  9:30                       ` Frederic Weisbecker
2014-04-15 10:52                         ` Peter Zijlstra
2014-04-15 10:53                           ` Peter Zijlstra
2014-04-23 11:12                         ` Viresh Kumar
2014-05-09  8:44                           ` Viresh Kumar
2014-05-13 23:30                             ` Frederic Weisbecker
2014-05-22  8:44                               ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.