linux-kernel.vger.kernel.org archive mirror
From: Vincent Guittot <vincent.guittot@linaro.org>
To: Juri Lelli <juri.lelli@arm.com>
Cc: Morten Rasmussen <Morten.Rasmussen@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Dietmar Eggemann <Dietmar.Eggemann@arm.com>,
	Yuyang Du <yuyang.du@intel.com>,
	Michael Turquette <mturquette@baylibre.com>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	Sai Charan Gurrappadi <sgurrappadi@nvidia.com>,
	"pang.xunlei@zte.com.cn" <pang.xunlei@zte.com.cn>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Subject: Re: [RFCv5 PATCH 41/46] sched/fair: add triggers for OPP change requests
Date: Tue, 11 Aug 2015 13:41:11 +0200
Message-ID: <CAKfTPtDWTon2m=tYU+taZYxgJV2OhPQXX_+Ku+ty=EjEpAzDsA@mail.gmail.com>
In-Reply-To: <55C9BB78.1090508@arm.com>

On 11 August 2015 at 11:08, Juri Lelli <juri.lelli@arm.com> wrote:
> On 10/08/15 16:07, Vincent Guittot wrote:
>> On 10 August 2015 at 15:43, Juri Lelli <juri.lelli@arm.com> wrote:
>>>
>>> Hi Vincent,
>>>
>>> On 04/08/15 14:41, Vincent Guittot wrote:
>>>> Hi Juri,
>>>>
>>>> On 7 July 2015 at 20:24, Morten Rasmussen <morten.rasmussen@arm.com> wrote:
>>>>> From: Juri Lelli <juri.lelli@arm.com>
>>>>>
>>>>> Each time a task is {en,de}queued we might need to adapt the current
>>>>> frequency to the new usage. Add triggers on {en,de}queue_task_fair() for
>>>>> this purpose.  Only trigger a freq request if we are effectively waking up
>>>>> or going to sleep.  Filter out load balancing related calls to reduce the
>>>>> number of triggers.
>>>>>
>>>>> cc: Ingo Molnar <mingo@redhat.com>
>>>>> cc: Peter Zijlstra <peterz@infradead.org>
>>>>>
>>>>> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
>>>>> ---
>>>>>  kernel/sched/fair.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>>>>>  1 file changed, 40 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index f74e9d2..b8627c6 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>> @@ -4281,7 +4281,10 @@ static inline void hrtick_update(struct rq *rq)
>>>>>  }
>>>>>  #endif
>>>>>
>>>>> +static unsigned int capacity_margin = 1280; /* ~20% margin */
>>>>> +
>>>>>  static bool cpu_overutilized(int cpu);
>>>>> +static unsigned long get_cpu_usage(int cpu);
>>>>>  struct static_key __sched_energy_freq __read_mostly = STATIC_KEY_INIT_FALSE;
>>>>>
>>>>>  /*
>>>>> @@ -4332,6 +4335,26 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>>>>>                 if (!task_new && !rq->rd->overutilized &&
>>>>>                     cpu_overutilized(rq->cpu))
>>>>>                         rq->rd->overutilized = true;
>>>>> +               /*
>>>>> +                * We want to trigger a freq switch request only for tasks that
>>>>> +                * are waking up; this is because we get here also during
>>>>> +                * load balancing, but in these cases it seems wise to trigger
>>>>> +                * a single request after load balancing is done.
>>>>> +                *
>>>>> +                * XXX: how about fork()? Do we need a special flag/something
>>>>> +                *      to tell if we are here after a fork() (wakeup_task_new)?
>>>>> +                *
>>>>> +                * Also, we add a margin (same ~20% used for the tipping point)
>>>>> +                * to our request to provide some head room if p's utilization
>>>>> +                * further increases.
>>>>> +                */
>>>>> +               if (sched_energy_freq() && !task_new) {
>>>>> +                       unsigned long req_cap = get_cpu_usage(cpu_of(rq));
>>>>> +
>>>>> +                       req_cap = req_cap * capacity_margin
>>>>> +                                       >> SCHED_CAPACITY_SHIFT;
>>>>> +                       cpufreq_sched_set_cap(cpu_of(rq), req_cap);
>>>>> +               }
>>>>>         }
>>>>>         hrtick_update(rq);
>>>>>  }
>>>>> @@ -4393,6 +4416,23 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>>>>>         if (!se) {
>>>>>                 sub_nr_running(rq, 1);
>>>>>                 update_rq_runnable_avg(rq, 1);
>>>>> +               /*
>>>>> +                * We want to trigger a freq switch request only for tasks that
>>>>> +                * are going to sleep; this is because we get here also during
>>>>> +                * load balancing, but in these cases it seems wise to trigger
>>>>> +                * a single request after load balancing is done.
>>>>> +                *
>>>>> +                * Also, we add a margin (same ~20% used for the tipping point)
>>>>> +                * to our request to provide some head room if p's utilization
>>>>> +                * further increases.
>>>>> +                */
>>>>> +               if (sched_energy_freq() && task_sleep) {
>>>>> +                       unsigned long req_cap = get_cpu_usage(cpu_of(rq));
>>>>> +
>>>>> +                       req_cap = req_cap * capacity_margin
>>>>> +                                       >> SCHED_CAPACITY_SHIFT;
>>>>> +                       cpufreq_sched_set_cap(cpu_of(rq), req_cap);
>>>>
>>>> Could you clarify why you want to trigger a freq switch for tasks that
>>>> are going to sleep?
>>>> The cpu_usage should not change that much, as the se_utilization of
>>>> the entity moves from utilization_load_avg to utilization_blocked_avg
>>>> of the rq, and the usage and the freq are updated periodically.
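
For reference, the margin arithmetic in the quoted hunks can be worked through in isolation. This is only a sketch: request_with_margin() is a hypothetical helper name, not something from the patch, and it assumes SCHED_CAPACITY_SHIFT is 10 (a capacity scale of 1024). With capacity_margin = 1280 the request is scaled by 1280/1024 = 1.25, so the headroom added on top of the usage is 25%. The "~20%" in the comment presumably refers to the tipping point, where usage reaches 1024/1280 = 80% of capacity.

```c
#include <assert.h>

#define SCHED_CAPACITY_SHIFT	10	/* assumed: capacity scale is 1 << 10 == 1024 */

static const unsigned int capacity_margin = 1280;	/* value from the quoted patch */

/*
 * Hypothetical helper: scale a usage value by capacity_margin / 1024,
 * mirroring the multiply-and-shift done in the quoted hunks.
 */
unsigned long request_with_margin(unsigned long usage)
{
	/* Guard against overflow of the multiplication in this sketch. */
	assert(usage <= (~0UL) / capacity_margin);
	return (usage * capacity_margin) >> SCHED_CAPACITY_SHIFT;
}
```

For example, a usage of 512 yields a request of 640, and a usage of 800 yields 1000. A usage of 1024 yields 1280, which is above the capacity scale; this sketch does not clamp the result.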
>>>
>>> I think we still need to cover multiple back-to-back dequeues. Suppose
>>> that you have, let's say, 3 tasks that get enqueued at the same time.
>>> After some time the first one goes to sleep and its utilization, as you
>>> say, gets moved to utilization_blocked_avg. So, nothing changes, and
>>> the trigger is superfluous (even if no freq change I guess will be
>>> issued as we are already servicing enough capacity). However, after a
>>> while, the second task goes to sleep. Now we still use get_cpu_usage()
>>> and the first task contribution in utilization_blocked_avg should have
>>> been decayed by this time. Same thing may then happen for the third task
>>> as well. So, if we don't check if we need to scale down in
>>> dequeue_task_fair, it seems to me that we might miss some opportunities,
>>> as blocked contribution of other tasks could have been successively
>>> decayed.
>>>
>>> What do you think?
>>
>> The tick is used to monitor such variation of the usage (in both
>> directions: decay of the usage of sleeping tasks and increase of the
>> usage of running tasks). So in your example, if the duration between
>> the sleep of the 2 tasks is significant enough, the tick will handle
>> this variation.
>>
>
> The tick is used to decide if we need to scale up (to max OPP for the
> time being), but we don't scale down. It makes more logical sense to

Why don't you want to check if you need to scale down?

> scale down at task deactivation, or wakeup after a long time, IMHO.

But neither waking up nor going to sleep has any impact on the usage of
a cpu. The only events that impact the cpu usage are:
- task migration,
- a new task,
- elapsed time, which can be monitored by periodically checking the usage,
- and, for nohz systems, a cpu entering or leaving an idle state.

Wake-up and sleep events don't give any useful
information, and using them to trigger the monitoring of the usage
variation doesn't give you a predictable/periodic update of it, whereas
the tick will.
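
To make the two views in this exchange concrete, here is a toy model, not kernel code. All names (struct rq_model, model_usage(), model_task_sleep(), model_advance_ms()) are hypothetical. It assumes usage is the sum of a runnable and a blocked contribution, that blocked contributions halve every 32 ms, and, as a simplification, that the runnable contribution stays constant over time.

```c
#include <assert.h>

/* Hypothetical toy model of one runqueue's utilization signals. */
struct rq_model {
	double runnable_util;	/* stands in for utilization_load_avg */
	double blocked_util;	/* stands in for utilization_blocked_avg */
};

/* Assumed per-millisecond decay factor, chosen so that y^32 is about 0.5. */
static const double decay_y = 0.97857206;

/* Usage as discussed in the thread: runnable plus blocked contributions. */
double model_usage(const struct rq_model *rq)
{
	return rq->runnable_util + rq->blocked_util;
}

/* A task going to sleep only moves its contribution between the two sums. */
void model_task_sleep(struct rq_model *rq, double task_util)
{
	assert(task_util >= 0.0 && task_util <= rq->runnable_util);
	rq->runnable_util -= task_util;
	rq->blocked_util += task_util;
}

/* Elapsed time decays the blocked sum; runnable is held constant here. */
void model_advance_ms(struct rq_model *rq, unsigned int ms)
{
	while (ms--)
		rq->blocked_util *= decay_y;
}
```

Starting with three runnable tasks of utilization 200 each (usage 600), putting the first one to sleep leaves the usage at 600. After 32 ms its blocked contribution has decayed to roughly 100, so the usage is roughly 500, and putting the second task to sleep at that point again changes nothing by itself. In this model the usage changes only as time passes, which is something a periodic check observes whether or not a dequeue happens at that moment.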

Regards,
Vincent

>
> Best,
>
> - Juri
>
>> Regards,
>> Vincent
>>>
>>> Thanks,
>>>
>>> - Juri
>>>
>>>> It should be the same for the wake up of a task in enqueue_task_fair
>>>> above, even if it's less obvious for this latter use case because the
>>>> cpu might wake up from a long idle phase during which its
>>>> utilization_blocked_avg has not been updated. Nevertheless, triggering
>>>> the freq switch at wake up of the cpu, once its usage has been updated,
>>>> should do the job.
>>>>
>>>> So the tick, migration of tasks, new tasks, and the cpu entering/leaving
>>>> an idle state should be enough to trigger a freq switch.
>>>>
>>>> Regards,
>>>> Vincent
>>>>
>>>>
>>>>> +               }
>>>>>         }
>>>>>         hrtick_update(rq);
>>>>>  }
>>>>> @@ -4959,8 +4999,6 @@ static int find_new_capacity(struct energy_env *eenv,
>>>>>         return idx;
>>>>>  }
>>>>>
>>>>> -static unsigned int capacity_margin = 1280; /* ~20% margin */
>>>>> -
>>>>>  static bool cpu_overutilized(int cpu)
>>>>>  {
>>>>>         return (capacity_of(cpu) * 1024) <
>>>>> --
>>>>> 1.9.1
>>>>>
>>>>
>>>
>>
>

