* [RFC PATCH 0/3] Aggregate task utilization only on root cfs_rq
  From: Dietmar Eggemann @ 2016-06-01 19:39 UTC
  To: Peter Zijlstra, linux-kernel
  Cc: Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

This is an alternative approach to '[RFC PATCH v2] sched: reflect
sched_entity movement into task_group's utilization', which requires
'[RFC PATCH] sched: fix hierarchical order in rq->leaf_cfs_rq_list'.

The patch-set doesn't currently handle the impact of cfs throttling on
utilization.

[patch 1] sched/fair: Aggregate task utilization only on root cfs_rq

  Prevent the maintenance of the utilization signal for a se/cfs_rq
  representing a task group other than the root task group.

[patch 2] sched/fair: Sync se with root cfs_rq

  Make sure that the se (of a task) is in sync with the root cfs_rq.

[patch 3] sched/fair: Change @running of __update_load_avg() to @update_util

  Pass in the information whether utilization (besides load) has to be
  maintained for the container element of @sa.

Dietmar Eggemann (3):
  sched/fair: Aggregate task utilization only on root cfs_rq
  sched/fair: Sync se with root cfs_rq
  sched/fair: Change @running of __update_load_avg() to @update_util

 kernel/sched/fair.c | 81 ++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 61 insertions(+), 20 deletions(-)
* [RFC PATCH 1/3] sched/fair: Aggregate task utilization only on root cfs_rq
  From: Dietmar Eggemann @ 2016-06-01 19:39 UTC
  To: Peter Zijlstra, linux-kernel
  Cc: Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

cpu utilization (cpu_util()) is defined as the cpu (original) capacity
capped cfs_rq->avg->util_avg signal of the root cfs_rq.

With the current pelt version, the utilization of a task [en|de]queued
on/from a cfs_rq, representing a task group other than the root task group
on a cpu, is not immediately propagated down to the root cfs_rq.

This makes decisions based on cpu_util() for scheduling or cpu frequency
settings less accurate in case tasks are running in task groups.

This patch aggregates the task utilization only on the root cfs_rq,
essentially avoiding maintaining utilization for a se/cfs_rq representing
task groups other than the root task group (!entity_is_task(se) and
&rq_of(cfs_rq)->cfs != cfs_rq).

The additional if/else condition to set @update_util in
__update_load_avg() is replaced in 'sched/fair: Change @running of
__update_load_avg() to @update_util' by providing the information whether
utilization has to be maintained via an argument to this function.

The additional requirements for the alignment of the last_update_time of a
se and the root cfs_rq are handled by the patch 'sched/fair: Sync se with
root cfs_rq'.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/fair.c | 48 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 36 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 218f8e83db73..212becd3708f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2705,6 +2705,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	u32 contrib;
 	unsigned int delta_w, scaled_delta_w, decayed = 0;
 	unsigned long scale_freq, scale_cpu;
+	int update_util = 0;
 
 	delta = now - sa->last_update_time;
 	/*
@@ -2725,6 +2726,12 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 		return 0;
 	sa->last_update_time = now;
 
+	if (cfs_rq) {
+		if (&rq_of(cfs_rq)->cfs == cfs_rq)
+			update_util = 1;
+	} else if (entity_is_task(container_of(sa, struct sched_entity, avg)))
+		update_util = 1;
+
 	scale_freq = arch_scale_freq_capacity(NULL, cpu);
 	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
 
@@ -2750,7 +2757,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 					weight * scaled_delta_w;
 			}
 		}
-		if (running)
+		if (update_util && running)
 			sa->util_sum += scaled_delta_w * scale_cpu;
 
 		delta -= delta_w;
@@ -2774,7 +2781,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 			if (cfs_rq)
 				cfs_rq->runnable_load_sum += weight * contrib;
 		}
-		if (running)
+		if (update_util && running)
 			sa->util_sum += contrib * scale_cpu;
 	}
 
@@ -2785,7 +2792,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 		if (cfs_rq)
 			cfs_rq->runnable_load_sum += weight * scaled_delta;
 	}
-	if (running)
+	if (update_util && running)
 		sa->util_sum += scaled_delta * scale_cpu;
 
 	sa->period_contrib += delta;
@@ -2796,7 +2803,8 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 			cfs_rq->runnable_load_avg =
 				div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
 		}
-		sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
+		if (update_util)
+			sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
 	}
 
 	return decayed;
@@ -2918,7 +2926,8 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
 		removed_load = 1;
 	}
 
-	if (atomic_long_read(&cfs_rq->removed_util_avg)) {
+	if ((&rq_of(cfs_rq)->cfs == cfs_rq) &&
+	    atomic_long_read(&cfs_rq->removed_util_avg)) {
 		long r = atomic_long_xchg(&cfs_rq->removed_util_avg, 0);
 		sa->util_avg = max_t(long, sa->util_avg - r, 0);
 		sa->util_sum = max_t(s32, sa->util_sum - r * LOAD_AVG_MAX, 0);
@@ -2982,8 +2991,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 		se->avg.last_update_time = cfs_rq->avg.last_update_time;
 	cfs_rq->avg.load_avg += se->avg.load_avg;
 	cfs_rq->avg.load_sum += se->avg.load_sum;
-	cfs_rq->avg.util_avg += se->avg.util_avg;
-	cfs_rq->avg.util_sum += se->avg.util_sum;
+
+	if (!entity_is_task(se))
+		return;
+
+	rq_of(cfs_rq)->cfs.avg.util_avg += se->avg.util_avg;
+	rq_of(cfs_rq)->cfs.avg.util_sum += se->avg.util_sum;
 
 	cfs_rq_util_change(cfs_rq);
 }
@@ -2996,8 +3009,14 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 
 	cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg - se->avg.load_avg, 0);
 	cfs_rq->avg.load_sum = max_t(s64, cfs_rq->avg.load_sum - se->avg.load_sum, 0);
-	cfs_rq->avg.util_avg = max_t(long, cfs_rq->avg.util_avg - se->avg.util_avg, 0);
-	cfs_rq->avg.util_sum = max_t(s32, cfs_rq->avg.util_sum - se->avg.util_sum, 0);
+
+	if (!entity_is_task(se))
+		return;
+
+	rq_of(cfs_rq)->cfs.avg.util_avg =
+		max_t(long, rq_of(cfs_rq)->cfs.avg.util_avg - se->avg.util_avg, 0);
+	rq_of(cfs_rq)->cfs.avg.util_sum =
+		max_t(s32, rq_of(cfs_rq)->cfs.avg.util_sum - se->avg.util_sum, 0);
 
 	cfs_rq_util_change(cfs_rq);
 }
@@ -3082,7 +3101,11 @@ void remove_entity_load_avg(struct sched_entity *se)
 	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
 	atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
-	atomic_long_add(se->avg.util_avg, &cfs_rq->removed_util_avg);
+
+	if (!entity_is_task(se))
+		return;
+
+	atomic_long_add(se->avg.util_avg, &rq_of(cfs_rq)->cfs.removed_util_avg);
 }
 
 static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq)
@@ -8460,7 +8483,9 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 #endif
 #ifdef CONFIG_SMP
 	atomic_long_set(&cfs_rq->removed_load_avg, 0);
-	atomic_long_set(&cfs_rq->removed_util_avg, 0);
+
+	if (&rq_of(cfs_rq)->cfs == cfs_rq)
+		atomic_long_set(&cfs_rq->removed_util_avg, 0);
 #endif
 }
 
@@ -8525,7 +8550,6 @@ int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
 		init_cfs_rq(cfs_rq);
 		init_tg_cfs_entry(tg, cfs_rq, se, i, parent->se[i]);
 		init_entity_runnable_average(se);
-		post_init_entity_util_avg(se);
 	}
 
 	return 1;
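For reference, the capacity-capped root-cfs_rq signal that the changelog
above calls cpu_util() boils down to something like the following helper
(a simplified sketch of that era's kernel/sched/fair.c; the exact baseline
of this series may differ slightly):

	/* cpu utilization: root cfs_rq util_avg, capped by original capacity */
	static unsigned long cpu_util(int cpu)
	{
		unsigned long util = cpu_rq(cpu)->cfs.avg.util_avg;
		unsigned long capacity = capacity_orig_of(cpu);

		/* util_avg can transiently exceed capacity; clamp it */
		return (util >= capacity) ? capacity : util;
	}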
* Re: [RFC PATCH 1/3] sched/fair: Aggregate task utilization only on root cfs_rq
  From: Juri Lelli @ 2016-06-02 9:23 UTC
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

Hi,

minor comment below.

On 01/06/16 20:39, Dietmar Eggemann wrote:

[...]

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 218f8e83db73..212becd3708f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2705,6 +2705,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>  	u32 contrib;
>  	unsigned int delta_w, scaled_delta_w, decayed = 0;
>  	unsigned long scale_freq, scale_cpu;
> +	int update_util = 0;
>
>  	delta = now - sa->last_update_time;
>  	/*
> @@ -2725,6 +2726,12 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>  		return 0;
>  	sa->last_update_time = now;
>
> +	if (cfs_rq) {
> +		if (&rq_of(cfs_rq)->cfs == cfs_rq)

Maybe we can wrap this sort of checks in a static inline improving
readability?

Best,

- Juri
* Re: [RFC PATCH 1/3] sched/fair: Aggregate task utilization only on root cfs_rq
  From: Dietmar Eggemann @ 2016-06-02 15:53 UTC
  To: Juri Lelli
  Cc: Peter Zijlstra, linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On 02/06/16 10:23, Juri Lelli wrote:

[...]

>> +	if (cfs_rq) {
>> +		if (&rq_of(cfs_rq)->cfs == cfs_rq)
>
> Maybe we can wrap this sort of checks in a static inline improving
> readability?

Something like this?

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 218f8e83db73..01b0fa689ef9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -251,6 +251,18 @@ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
 	return cfs_rq->rq;
 }
 
+/* cfs_rq "owned" by the root task group */
+static inline struct cfs_rq *root_rq_of(struct cfs_rq *cfs_rq)
+{
+	return &rq_of(cfs_rq)->cfs;
+}
+
+/* Is this cfs_rq "owned" by the root task group ? */
+static inline bool rq_is_root(struct cfs_rq *cfs_rq)
+{
+	return root_rq_of(cfs_rq) == cfs_rq;
+}
+
 /* An entity is a task if it doesn't "own" a runqueue */
 #define entity_is_task(se)	(!se->my_q)
 
@@ -376,6 +388,16 @@ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
 	return container_of(cfs_rq, struct rq, cfs);
 }
 
+static inline struct cfs_rq *root_rq_of(struct cfs_rq *cfs_rq)
+{
+	return cfs_rq;
+}
+
+static inline bool rq_is_root(struct cfs_rq *cfs_rq)
+{
+	return true;
+}
+
 #define entity_is_task(se)	1
* Re: [RFC PATCH 1/3] sched/fair: Aggregate task utilization only on root cfs_rq
  From: Juri Lelli @ 2016-06-02 16:11 UTC
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On 02/06/16 16:53, Dietmar Eggemann wrote:

[...]

> Something like this?
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 218f8e83db73..01b0fa689ef9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -251,6 +251,18 @@ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
>  	return cfs_rq->rq;
>  }
>
> +/* cfs_rq "owned" by the root task group */
> +static inline struct cfs_rq *root_rq_of(struct cfs_rq *cfs_rq)
> +{
> +	return &rq_of(cfs_rq)->cfs;
> +}
> +
> +/* Is this cfs_rq "owned" by the root task group ? */
> +static inline bool rq_is_root(struct cfs_rq *cfs_rq)
> +{
> +	return root_rq_of(cfs_rq) == cfs_rq;
> +}

[...]

Looks good to me.

Thanks,

- Juri
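With helpers like the ones discussed above, the @update_util selection in
__update_load_avg() from patch 1 could presumably be condensed into a
single expression; a sketch only, assuming the rq_is_root()/root_rq_of()
helpers proposed in this sub-thread:

	/* Utilization is only maintained for task se's and the root cfs_rq. */
	if (cfs_rq ? rq_is_root(cfs_rq) :
	    entity_is_task(container_of(sa, struct sched_entity, avg)))
		update_util = 1;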
* [RFC PATCH 2/3] sched/fair: Sync se with root cfs_rq
  From: Dietmar Eggemann @ 2016-06-01 19:39 UTC
  To: Peter Zijlstra, linux-kernel
  Cc: Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

Since task utilization is accrued only on the root cfs_rq, there are a
couple of places where the se has to be synced with the root cfs_rq:

(1) The root cfs_rq has to be updated in attach_entity_load_avg() for
    an se representing a task in a tg other than the root tg before
    the se utilization can be added to it.

(2) The last_update_time value of the root cfs_rq can be higher
    than the one of the cfs_rq the se is enqueued in. Call
    __update_load_avg() on the se with the last_update_time value of
    the root cfs_rq before removing se's utilization from the root
    cfs_rq in [remove|detach]_entity_load_avg().

In case the difference between the last_update_time value of the cfs_rq
and the root cfs_rq is smaller than 1024ns, the additional calls to
__update_load_avg() will bail early.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/fair.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 212becd3708f..3ae8e79fb687 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2970,6 +2970,8 @@ static inline void update_load_avg(struct sched_entity *se, int update_tg)
 
 static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
+	struct cfs_rq* root_cfs_rq;
+
 	if (!sched_feat(ATTACH_AGE_LOAD))
 		goto skip_aging;
 
@@ -2995,8 +2997,16 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 	if (!entity_is_task(se))
 		return;
 
-	rq_of(cfs_rq)->cfs.avg.util_avg += se->avg.util_avg;
-	rq_of(cfs_rq)->cfs.avg.util_sum += se->avg.util_sum;
+	root_cfs_rq = &rq_of(cfs_rq)->cfs;
+
+	if (parent_entity(se))
+		__update_load_avg(cfs_rq_clock_task(root_cfs_rq),
+				  cpu_of(rq_of(root_cfs_rq)), &root_cfs_rq->avg,
+				  scale_load_down(root_cfs_rq->load.weight),
+				  upd_util_cfs_rq(root_cfs_rq), root_cfs_rq);
+
+	root_cfs_rq->avg.util_avg += se->avg.util_avg;
+	root_cfs_rq->avg.util_sum += se->avg.util_sum;
 
 	cfs_rq_util_change(cfs_rq);
 }
@@ -3013,6 +3023,10 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 	if (!entity_is_task(se))
 		return;
 
+	__update_load_avg(rq_of(cfs_rq)->cfs.avg.last_update_time, cpu_of(rq_of(cfs_rq)),
+			  &se->avg, se->on_rq * scale_load_down(se->load.weight),
+			  cfs_rq->curr == se, NULL);
+
 	rq_of(cfs_rq)->cfs.avg.util_avg =
 		max_t(long, rq_of(cfs_rq)->cfs.avg.util_avg - se->avg.util_avg, 0);
 	rq_of(cfs_rq)->cfs.avg.util_sum =
@@ -3105,6 +3119,9 @@ void remove_entity_load_avg(struct sched_entity *se)
 	if (!entity_is_task(se))
 		return;
 
+	last_update_time = cfs_rq_last_update_time(&rq_of(cfs_rq)->cfs);
+
+	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
 	atomic_long_add(se->avg.util_avg, &rq_of(cfs_rq)->cfs.removed_util_avg);
 }
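The 'bail early' behaviour relied upon above comes from the existing head
of __update_load_avg(), which drops updates shorter than 1024ns; roughly
the following (a sketch of the pre-existing code, not part of this series):

	delta = now - sa->last_update_time;
	...
	/* PELT counts time in ~1us units: anything below 1024ns is ignored */
	delta >>= 10;
	if (!delta)
		return 0;
	sa->last_update_time = now;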
* Re: [RFC PATCH 2/3] sched/fair: Sync se with root cfs_rq
  From: Leo Yan @ 2016-06-06 2:59 UTC
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On Wed, Jun 01, 2016 at 08:39:21PM +0100, Dietmar Eggemann wrote:

[...]

> @@ -2995,8 +2997,16 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
>  	if (!entity_is_task(se))
>  		return;
>
> -	rq_of(cfs_rq)->cfs.avg.util_avg += se->avg.util_avg;
> -	rq_of(cfs_rq)->cfs.avg.util_sum += se->avg.util_sum;
> +	root_cfs_rq = &rq_of(cfs_rq)->cfs;
> +
> +	if (parent_entity(se))
> +		__update_load_avg(cfs_rq_clock_task(root_cfs_rq),
> +				  cpu_of(rq_of(root_cfs_rq)), &root_cfs_rq->avg,
> +				  scale_load_down(root_cfs_rq->load.weight),
> +				  upd_util_cfs_rq(root_cfs_rq), root_cfs_rq);

Maybe add the below macro in this patch, for readability:

  #define upd_util_cfs_rq(cfs_rq)	XXX

[...]
* Re: [RFC PATCH 2/3] sched/fair: Sync se with root cfs_rq
  From: Dietmar Eggemann @ 2016-06-06 8:45 UTC
  To: Leo Yan
  Cc: Peter Zijlstra, linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On 06/06/16 03:59, Leo Yan wrote:

[...]

>> +	root_cfs_rq = &rq_of(cfs_rq)->cfs;
>> +
>> +	if (parent_entity(se))
>> +		__update_load_avg(cfs_rq_clock_task(root_cfs_rq),
>> +				  cpu_of(rq_of(root_cfs_rq)), &root_cfs_rq->avg,
>> +				  scale_load_down(root_cfs_rq->load.weight),
>> +				  upd_util_cfs_rq(root_cfs_rq), root_cfs_rq);
>
> Maybe add the below macro in this patch, for readability:
>
>   #define upd_util_cfs_rq(cfs_rq)	XXX

Sorry, this doesn't belong here since the macro is only introduced in 3/3.
Should be 'root_cfs_rq->curr != NULL' here instead.

[...]
* Re: [RFC PATCH 2/3] sched/fair: Sync se with root cfs_rq
  From: Vincent Guittot @ 2016-06-06 12:11 UTC
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, linux-kernel, Ben Segall, Morten Rasmussen, Yuyang Du

Hi Dietmar,

On 1 June 2016 at 21:39, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:

[...]

> @@ -2995,8 +2997,16 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
>  	if (!entity_is_task(se))
>  		return;
>
> -	rq_of(cfs_rq)->cfs.avg.util_avg += se->avg.util_avg;
> -	rq_of(cfs_rq)->cfs.avg.util_sum += se->avg.util_sum;
> +	root_cfs_rq = &rq_of(cfs_rq)->cfs;
> +
> +	if (parent_entity(se))
> +		__update_load_avg(cfs_rq_clock_task(root_cfs_rq),
> +				  cpu_of(rq_of(root_cfs_rq)), &root_cfs_rq->avg,
> +				  scale_load_down(root_cfs_rq->load.weight),
> +				  upd_util_cfs_rq(root_cfs_rq), root_cfs_rq);
> +
> +	root_cfs_rq->avg.util_avg += se->avg.util_avg;
> +	root_cfs_rq->avg.util_sum += se->avg.util_sum;

The main issue with flat utilization is that we can't keep the sched_avg
of a sched_entity synced (from a last_update_time point of view) with both
the cfs_rq to which its load is attached and the root cfs_rq to which its
utilization is attached.

With this additional sync to the root cfs_rq in attach/detach_entity_load_avg()
and in remove_entity_load_avg(), the load of a sched_entity is no longer
synced to the time stamp of the cfs_rq onto which it is attached. This can
generate several wrong updates of the latter's load.

As an example, let's take a task TA that sleeps and move it to a task group
TGB which has not run recently, so TGB.avg.last_update_time lags far behind
root cfs_rq.avg.last_update_time (a decay of 20ms removes ~35% of the load).

When we attach TA to TGB, TA is synced with TGB for attaching it and then
decayed to be synced with the root cfs_rq. If TA is then moved to another
task group, we try to sync TA to TGB, but TA is in the future, so
TA.avg.last_update_time is set to TGB's. Then TA's load is removed from TGB,
but TA's load has been decayed, so only a part of it will effectively be
subtracted. Then TA is synced with the root cfs_rq, which means it is decayed
one more time for the same time slot (because TA.avg.last_update_time has been
reset to TGB.avg.last_update_time), so we will subtract less utilization than
we should from the root cfs_rq.

I think that a similar behavior can apply to the removed load.

[...]
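The '20ms removes ~35% of the load' figure quoted above is consistent with
the PELT decay series, assuming the usual ~1ms (1024us) period and a
half-life of 32 periods:

	y^32 = 1/2         =>  y = 2^(-1/32) per period
	y^20 = 2^(-20/32)  ~=  0.65

so after roughly 20ms without an update, about 35% of a stale se's load and
utilization has already decayed away by the time it is synced forward.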
* [RFC PATCH 3/3] sched/fair: Change @running of __update_load_avg() to @update_util
  From: Dietmar Eggemann @ 2016-06-01 19:39 UTC
  To: Peter Zijlstra, linux-kernel
  Cc: Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

The information whether a se/cfs_rq should get its load and
utilization (se representing a task and root cfs_rq) or only its load
(se representing a task group and cfs_rq owned by this se) updated can
be passed into __update_load_avg() to avoid the additional if/else
condition to set update_util.

@running is changed to @update_util which now carries the information if
the utilization of the se/cfs_rq should be updated and if the se/cfs_rq
is running or not.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/fair.c | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3ae8e79fb687..a1c13975cf56 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2669,6 +2669,10 @@ static u32 __compute_runnable_contrib(u64 n)
 
 #define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
 
+#define upd_util_se(se, rng) ((entity_is_task(se) << 1) | (rng))
+#define upd_util_cfs_rq(cfs_rq) \
+	(((&rq_of(cfs_rq)->cfs == cfs_rq) << 1) | !!cfs_rq->curr)
+
 /*
  * We can represent the historical contribution to runnable average as the
  * coefficients of a geometric series.  To do this we sub-divide our runnable
@@ -2699,13 +2703,12 @@ static u32 __compute_runnable_contrib(u64 n)
  */
 static __always_inline int
 __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
-		  unsigned long weight, int running, struct cfs_rq *cfs_rq)
+		  unsigned long weight, int update_util, struct cfs_rq *cfs_rq)
 {
 	u64 delta, scaled_delta, periods;
 	u32 contrib;
 	unsigned int delta_w, scaled_delta_w, decayed = 0;
 	unsigned long scale_freq, scale_cpu;
-	int update_util = 0;
 
 	delta = now - sa->last_update_time;
 	/*
@@ -2726,12 +2729,6 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 		return 0;
 	sa->last_update_time = now;
 
-	if (cfs_rq) {
-		if (&rq_of(cfs_rq)->cfs == cfs_rq)
-			update_util = 1;
-	} else if (entity_is_task(container_of(sa, struct sched_entity, avg)))
-		update_util = 1;
-
 	scale_freq = arch_scale_freq_capacity(NULL, cpu);
 	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
 
@@ -2757,7 +2754,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 					weight * scaled_delta_w;
 			}
 		}
-		if (update_util && running)
+		if (update_util == 0x3)
 			sa->util_sum += scaled_delta_w * scale_cpu;
 
 		delta -= delta_w;
@@ -2781,7 +2778,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 			if (cfs_rq)
 				cfs_rq->runnable_load_sum += weight * contrib;
 		}
-		if (update_util && running)
+		if (update_util == 0x3)
 			sa->util_sum += contrib * scale_cpu;
 	}
 
@@ -2792,7 +2789,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 		if (cfs_rq)
 			cfs_rq->runnable_load_sum += weight * scaled_delta;
 	}
-	if (update_util && running)
+	if (update_util == 0x3)
 		sa->util_sum += scaled_delta * scale_cpu;
 
 	sa->period_contrib += delta;
@@ -2803,7 +2800,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 			cfs_rq->runnable_load_avg =
 				div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
 		}
-		if (update_util)
+		if (update_util & 0x2)
 			sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
 	}
 
@@ -2873,7 +2870,7 @@ void set_task_rq_fair(struct sched_entity *se,
 		n_last_update_time = next->avg.last_update_time;
 #endif
 		__update_load_avg(p_last_update_time, cpu_of(rq_of(prev)),
-				  &se->avg, 0, 0, NULL);
+				  &se->avg, 0, upd_util_se(se, 0), NULL);
 		se->avg.last_update_time = n_last_update_time;
 	}
 }
@@ -2935,7 +2932,8 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
 	}
 
 	decayed = __update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
-		scale_load_down(cfs_rq->load.weight), cfs_rq->curr != NULL, cfs_rq);
+		scale_load_down(cfs_rq->load.weight), upd_util_cfs_rq(cfs_rq),
+		cfs_rq);
 
 #ifndef CONFIG_64BIT
 	smp_wmb();
@@ -2962,7 +2960,7 @@ static inline void update_load_avg(struct sched_entity *se, int update_tg)
 	 */
 	__update_load_avg(now, cpu, &se->avg,
 			  se->on_rq * scale_load_down(se->load.weight),
-			  cfs_rq->curr == se, NULL);
+			  upd_util_se(se, cfs_rq->curr == se), NULL);
 
 	if (update_cfs_rq_load_avg(now, cfs_rq, true) && update_tg)
 		update_tg_load_avg(cfs_rq, 0);
@@ -2981,7 +2979,7 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 	 */
 	if (se->avg.last_update_time) {
 		__update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
-				  &se->avg, 0, 0, NULL);
+				  &se->avg, 0, upd_util_se(se, 0), NULL);
 
 		/*
 		 * XXX: we could have just aged the entire load away if we've been
@@ -3015,7 +3013,7 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 {
 	__update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
 			  &se->avg, se->on_rq * scale_load_down(se->load.weight),
-			  cfs_rq->curr == se, NULL);
+			  upd_util_se(se, cfs_rq->curr == se), NULL);
 
 	cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg - se->avg.load_avg, 0);
 	cfs_rq->avg.load_sum = max_t(s64, cfs_rq->avg.load_sum - se->avg.load_sum, 0);
@@ -3025,7 +3023,7 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 
 	__update_load_avg(rq_of(cfs_rq)->cfs.avg.last_update_time, cpu_of(rq_of(cfs_rq)),
 			  &se->avg, se->on_rq * scale_load_down(se->load.weight),
-			  cfs_rq->curr == se, NULL);
+			  upd_util_se(se, cfs_rq->curr == se), NULL);
 
 	rq_of(cfs_rq)->cfs.avg.util_avg =
 		max_t(long, rq_of(cfs_rq)->cfs.avg.util_avg - se->avg.util_avg, 0);
@@ -3047,7 +3045,7 @@ enqueue_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	if (!migrated) {
 		__update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
 			se->on_rq * scale_load_down(se->load.weight),
-			cfs_rq->curr == se, NULL);
+			upd_util_se(se, cfs_rq->curr == se), NULL);
 	}
 
 	decayed = update_cfs_rq_load_avg(now, cfs_rq, !migrated);
@@ -3113,7 +3111,8 @@ void remove_entity_load_avg(struct sched_entity *se)
 
 	last_update_time = cfs_rq_last_update_time(cfs_rq);
 
-	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
+	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0,
+			  upd_util_se(se, 0), NULL);
 	atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
 
 	if (!entity_is_task(se))
@@ -3121,7 +3120,8 @@ void remove_entity_load_avg(struct sched_entity *se)
 
 	last_update_time = cfs_rq_last_update_time(&rq_of(cfs_rq)->cfs);
 
-	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
+	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0,
+			  upd_util_se(se, 0), NULL);
 	atomic_long_add(se->avg.util_avg, &rq_of(cfs_rq)->cfs.removed_util_avg);
 }
* Re: [RFC PATCH 3/3] sched/fair: Change @running of __update_load_avg() to @update_util
  From: Peter Zijlstra @ 2016-06-01 20:11 UTC
  To: Dietmar Eggemann
  Cc: linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On Wed, Jun 01, 2016 at 08:39:22PM +0100, Dietmar Eggemann wrote:

[...]

> +#define upd_util_se(se, rng) ((entity_is_task(se) << 1) | (rng))

Just saying that on first reading that went: Random Number Generator, uh
what?!

So maybe pick better names?
* Re: [RFC PATCH 3/3] sched/fair: Change @running of __update_load_avg() to @update_util
  From: Dietmar Eggemann @ 2016-06-02 15:59 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On 01/06/16 21:11, Peter Zijlstra wrote:

[...]

>> +#define upd_util_se(se, rng) ((entity_is_task(se) << 1) | (rng))
>
> Just saying that on first reading that went: Random Number Generator, uh
> what?!
>
> So maybe pick better names?

Yeah, can do. What about?

#define update_util_se(se, running) ((entity_is_task(se) << 1) | (running))
#define update_util_rq(cfs_rq) ((rq_is_root(cfs_rq) << 1) | !!(cfs_rq)->curr)
* Re: [RFC PATCH 3/3] sched/fair: Change @running of __update_load_avg() to @update_util
  From: Juri Lelli @ 2016-06-02 9:25 UTC
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

Hi,

another minor comment below. :-)

On 01/06/16 20:39, Dietmar Eggemann wrote:

[...]

> +#define upd_util_se(se, rng) ((entity_is_task(se) << 1) | (rng))
> +#define upd_util_cfs_rq(cfs_rq) \
> +	(((&rq_of(cfs_rq)->cfs == cfs_rq) << 1) | !!cfs_rq->curr)

[...]

> @@ -2757,7 +2754,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>  					weight * scaled_delta_w;
>  			}
>  		}
> -		if (update_util && running)
> +		if (update_util == 0x3)

How about a define for these masks?

Best,

- Juri
* Re: [RFC PATCH 3/3] sched/fair: Change @running of __update_load_avg() to @update_util
  From: Dietmar Eggemann @ 2016-06-02 17:27 UTC
  To: Juri Lelli
  Cc: Peter Zijlstra, linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On 02/06/16 10:25, Juri Lelli wrote:

[...]

>> -		if (update_util && running)
>> +		if (update_util == 0x3)
>
> How about a define for these masks?

Something like this?

+#define UTIL_RUNNING	1
+#define UTIL_UPDATE	2
+
 /*
  * We can represent the historical contribution to runnable average as the
  * coefficients of a geometric series.  To do this we sub-divide our runnable
@@ -2724,7 +2727,7 @@ static u32 __compute_runnable_contrib(u64 n)
  */
 static __always_inline int
 __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
-		  unsigned long weight, int update_util, struct cfs_rq *cfs_rq)
+		  unsigned long weight, int util_flags, struct cfs_rq *cfs_rq)
 {
 	u64 delta, scaled_delta, periods;
 	u32 contrib;
@@ -2775,7 +2778,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 					weight * scaled_delta_w;
 			}
 		}
-		if (update_util == 0x3)
+		if (util_flags == (UTIL_UPDATE | UTIL_RUNNING))
 			sa->util_sum += scaled_delta_w * scale_cpu;

...
* Re: [RFC PATCH 3/3] sched/fair: Change @running of __update_load_avg() to @update_util
  From: Juri Lelli @ 2016-06-03 10:56 UTC
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On 02/06/16 18:27, Dietmar Eggemann wrote:

[...]

> Something like this?
>
> +#define UTIL_RUNNING	1
> +#define UTIL_UPDATE	2

Make these 0x01 and 0x02, I'd say.

[...]

> -		if (update_util == 0x3)
> +		if (util_flags == (UTIL_UPDATE | UTIL_RUNNING))

Looks more readable to me. :-)

Best,

- Juri
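Putting the pieces of this sub-thread together, the proposed two-bit
@update_util encoding would presumably decode as follows (an illustrative
sketch using the flag and helper names suggested above, not anything that
was actually merged):

	#define UTIL_RUNNING	0x01	/* se/cfs_rq is currently running */
	#define UTIL_UPDATE	0x02	/* this sched_avg maintains utilization */

	/* a task se or the root cfs_rq sets UTIL_UPDATE; bit 0 carries "running" */
	#define update_util_se(se, running) \
		((entity_is_task(se) << 1) | (running))
	#define update_util_rq(cfs_rq) \
		((rq_is_root(cfs_rq) << 1) | !!(cfs_rq)->curr)

	/*
	 * value  meaning
	 * 0x00   group se / group cfs_rq, not running  -> load only
	 * 0x01   group se / group cfs_rq, running      -> load only
	 * 0x02   task se / root cfs_rq, not running    -> decay util_avg
	 * 0x03   task se / root cfs_rq, running        -> accrue util_sum too
	 */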
* Re: [RFC PATCH 0/3] Aggregate task utilization only on root cfs_rq
  From: Peter Zijlstra @ 2016-06-01 20:10 UTC
  To: Dietmar Eggemann
  Cc: linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On Wed, Jun 01, 2016 at 08:39:19PM +0100, Dietmar Eggemann wrote:
> This is an alternative approach to '[RFC PATCH v2] sched: reflect
> sched_entity movement into task_group's utilization' which requires
> '[RFC PATCH] sched: fix hierarchical order in rq->leaf_cfs_rq_list'

Awesome, a 3rd alternative :-)
* Re: [RFC PATCH 0/3] Aggregate task utilization only on root cfs_rq
  From: Dietmar Eggemann @ 2016-06-02 15:40 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, Vincent Guittot, Ben Segall, Morten Rasmussen, Yuyang Du

On 01/06/16 21:10, Peter Zijlstra wrote:
> On Wed, Jun 01, 2016 at 08:39:19PM +0100, Dietmar Eggemann wrote:
>> This is an alternative approach to '[RFC PATCH v2] sched: reflect
>> sched_entity movement into task_group's utilization' which requires
>> '[RFC PATCH] sched: fix hierarchical order in rq->leaf_cfs_rq_list'
>
> Awesome, a 3rd alternative :-)

Yeah, sorry, but at least it runs on tip/sched/core w/o any other PELT
related changes. Moreover, it's fewer code changes compared to the
reflect-thing. And Yuyang's implementation depended on the optimization of
_update_load_avg() (1us->1ms) which IMHO turned out to be not working.