linux-kernel.vger.kernel.org archive mirror
* [RFD PATCH 0/5] SCHED_DEADLINE freq/cpu invariance and OPP selection
@ 2017-03-24 14:08 Juri Lelli
  2017-03-24 14:08 ` [RFD PATCH 1/5] sched/cpufreq_schedutil: make use of DEADLINE utilization signal Juri Lelli
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Juri Lelli @ 2017-03-24 14:08 UTC (permalink / raw)
  To: peterz, mingo, rjw, viresh.kumar
  Cc: linux-kernel, linux-pm, tglx, vincent.guittot, rostedt,
	luca.abeni, claudio, tommaso.cucinotta, bristot, mathieu.poirier,
	tkjos, joelaf, andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, juri.lelli

Hi,

This is a very exploratory set implementing frequency/cpu invariance and OPP
selection for SCHED_DEADLINE. The set has been lightly tested on a Juno
platform. While the actual implementation is very premature, I'm posting this
early to facilitate discussion at the OSPM summit [1].

Results of the testing, highlighting why these features are useful, are
available here:

 - without the set
   https://gist.github.com/a6e3ee99cec32e00cc537b53cd3d54d2

 - with the set
   https://gist.github.com/1f7d485fc3ce9234fe627dcb53b2935c

The set is based on tip/sched/core as of today (bc4278987e38) plus a couple of
schedutil fixes coming from linux-pm/linux-next and Luca's "CPU reclaiming for
SCHED_DEADLINE" v5 [2].

Patches high level description:

 o [01-02]/05 add the necessary links to start accounting the DEADLINE
              contribution to OPP selection
 o 03/05      is a hack to make it possible (on ARM) to change frequency for
              DEADLINE tasks (which would otherwise possibly delay the
              SCHED_FIFO worker kthread); suggestions on how to do this
              properly are very welcome
 o 04/05      is a schedutil change that copes with the fact that DEADLINE
              doesn't require periodic OPP selection triggering points
 o 05/05      implements frequency/cpu invariance for tasks' reservation
              parameters*, which basically means that we implement GRUB-PA [3]

Please have a look. Feedback on how we want to shape this is the sole purpose
of this posting.

In case you would like to test this out:

 git://linux-arm.org/linux-jl.git upstream/deadline/freq-rfd

Best,

- Juri

[1] http://retis.sssup.it/ospm-summit/index.html
[2] https://marc.info/?l=linux-kernel&m=149029880524038
[3] C. Scordino, G. Lipari, A Resource Reservation Algorithm for Power-Aware
    Scheduling of Periodic and Aperiodic Real-Time Tasks, IEEE Transactions
    on Computers, December 2006.

* Notice that this currently breaks !CONFIG_SMP, as arch_scale_{freq,cpu}
_capacity and SCHED_CAPACITY_SCALE are only defined on CONFIG_SMP. Fixing
this particular issue is straightforward, but we should probably look into
making frequency scaling (and PELT averages) available on !CONFIG_SMP as well
(so that schedutil can work on such configurations, for example). Since this
is only an RFD and since a proper rework might be non-trivial, I decided to
leave it out of scope for the time being.

Juri Lelli (5):
  sched/cpufreq_schedutil: make use of DEADLINE utilization signal
  sched/deadline: move cpu frequency selection triggering points
  sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  sched/cpufreq_schedutil: always consider all CPUs when deciding next
    freq
  sched/deadline: make bandwidth enforcement scale-invariant

 include/linux/sched.h            |  1 +
 include/linux/sched/cpufreq.h    |  2 --
 include/uapi/linux/sched.h       |  1 +
 kernel/sched/core.c              | 19 +++++++++++++++++--
 kernel/sched/cpufreq_schedutil.c | 37 ++++++++++++++++++++++---------------
 kernel/sched/deadline.c          | 40 +++++++++++++++++++++++++++++++++-------
 kernel/sched/fair.c              |  2 --
 kernel/sched/sched.h             | 10 +++++++++-
 8 files changed, 83 insertions(+), 29 deletions(-)

-- 
2.10.0

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFD PATCH 1/5] sched/cpufreq_schedutil: make use of DEADLINE utilization signal
  2017-03-24 14:08 [RFD PATCH 0/5] SCHED_DEADLINE freq/cpu invariance and OPP selection Juri Lelli
@ 2017-03-24 14:08 ` Juri Lelli
  2017-03-24 14:08 ` [RFD PATCH 2/5] sched/deadline: move cpu frequency selection triggering points Juri Lelli
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Juri Lelli @ 2017-03-24 14:08 UTC (permalink / raw)
  To: peterz, mingo, rjw, viresh.kumar
  Cc: linux-kernel, linux-pm, tglx, vincent.guittot, rostedt,
	luca.abeni, claudio, tommaso.cucinotta, bristot, mathieu.poirier,
	tkjos, joelaf, andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, juri.lelli, Ingo Molnar, Rafael J . Wysocki

SCHED_DEADLINE tracks the active utilization signal with a per-rq variable
named running_bw.

Make use of that to drive cpu frequency selection: add up the FAIR and
DEADLINE contributions to get the required CPU capacity to handle both
requirements.

Co-authored-by: Claudio Scordino <claudio@evidence.eu.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
---
 include/linux/sched/cpufreq.h    |  2 --
 kernel/sched/cpufreq_schedutil.c | 13 ++++++-------
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched/cpufreq.h b/include/linux/sched/cpufreq.h
index d2be2ccbb372..39640bb3a8ee 100644
--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h
@@ -11,8 +11,6 @@
 #define SCHED_CPUFREQ_DL	(1U << 1)
 #define SCHED_CPUFREQ_IOWAIT	(1U << 2)
 
-#define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
-
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
        void (*func)(struct update_util_data *data, u64 time, unsigned int flags);
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index f5ffe241812e..05f5625ea005 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -154,12 +154,11 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
 static void sugov_get_util(unsigned long *util, unsigned long *max)
 {
 	struct rq *rq = this_rq();
-	unsigned long cfs_max;
+	unsigned long dl_util = (rq->dl.running_bw * SCHED_CAPACITY_SCALE) >> 20;
 
-	cfs_max = arch_scale_cpu_capacity(NULL, smp_processor_id());
+	*max = arch_scale_cpu_capacity(NULL, smp_processor_id());
 
-	*util = min(rq->cfs.avg.util_avg, cfs_max);
-	*max = cfs_max;
+	*util = min(rq->cfs.avg.util_avg + dl_util, *max);
 }
 
 static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
@@ -207,7 +206,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 	if (!sugov_should_update_freq(sg_policy, time))
 		return;
 
-	if (flags & SCHED_CPUFREQ_RT_DL) {
+	if (flags & SCHED_CPUFREQ_RT) {
 		next_f = policy->cpuinfo.max_freq;
 	} else {
 		sugov_get_util(&util, &max);
@@ -242,7 +241,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
 			j_sg_cpu->iowait_boost = 0;
 			continue;
 		}
-		if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
+		if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
 			return policy->cpuinfo.max_freq;
 
 		j_util = j_sg_cpu->util;
@@ -278,7 +277,7 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 	sg_cpu->last_update = time;
 
 	if (sugov_should_update_freq(sg_policy, time)) {
-		if (flags & SCHED_CPUFREQ_RT_DL)
+		if (flags & SCHED_CPUFREQ_RT)
 			next_f = sg_policy->policy->cpuinfo.max_freq;
 		else
 			next_f = sugov_next_freq_shared(sg_cpu);
-- 
2.10.0


* [RFD PATCH 2/5] sched/deadline: move cpu frequency selection triggering points
  2017-03-24 14:08 [RFD PATCH 0/5] SCHED_DEADLINE freq/cpu invariance and OPP selection Juri Lelli
  2017-03-24 14:08 ` [RFD PATCH 1/5] sched/cpufreq_schedutil: make use of DEADLINE utilization signal Juri Lelli
@ 2017-03-24 14:08 ` Juri Lelli
  2017-03-24 14:08 ` [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE Juri Lelli
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Juri Lelli @ 2017-03-24 14:08 UTC (permalink / raw)
  To: peterz, mingo, rjw, viresh.kumar
  Cc: linux-kernel, linux-pm, tglx, vincent.guittot, rostedt,
	luca.abeni, claudio, tommaso.cucinotta, bristot, mathieu.poirier,
	tkjos, joelaf, andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, juri.lelli, Ingo Molnar, Rafael J . Wysocki

Since SCHED_DEADLINE doesn't track a utilization signal (but reserves a
fraction of CPU bandwidth for tasks admitted to the system), there is no
point in evaluating frequency changes at each tick event.

Move the frequency selection triggering points to where running_bw changes.

Co-authored-by: Claudio Scordino <claudio@evidence.eu.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
---
 kernel/sched/deadline.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 55471016d73c..5c1a205e830f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -52,6 +52,8 @@ void add_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
 	dl_rq->running_bw += dl_bw;
 	SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
 	SCHED_WARN_ON(dl_rq->running_bw > dl_rq->this_bw);
+	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
+	cpufreq_update_this_cpu(rq_of_dl_rq(dl_rq), SCHED_CPUFREQ_DL);
 }
 
 static inline
@@ -64,6 +66,8 @@ void sub_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
 	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
 	if (dl_rq->running_bw > old)
 		dl_rq->running_bw = 0;
+	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
+	cpufreq_update_this_cpu(rq_of_dl_rq(dl_rq), SCHED_CPUFREQ_DL);
 }
 
 static inline
@@ -953,9 +957,6 @@ static void update_curr_dl(struct rq *rq)
 		return;
 	}
 
-	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
-	cpufreq_update_this_cpu(rq, SCHED_CPUFREQ_DL);
-
 	schedstat_set(curr->se.statistics.exec_max,
 		      max(curr->se.statistics.exec_max, delta_exec));
 
-- 
2.10.0


* [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-24 14:08 [RFD PATCH 0/5] SCHED_DEADLINE freq/cpu invariance and OPP selection Juri Lelli
  2017-03-24 14:08 ` [RFD PATCH 1/5] sched/cpufreq_schedutil: make use of DEADLINE utilization signal Juri Lelli
  2017-03-24 14:08 ` [RFD PATCH 2/5] sched/deadline: move cpu frequency selection triggering points Juri Lelli
@ 2017-03-24 14:08 ` Juri Lelli
  2017-03-27 16:50   ` Peter Zijlstra
  2017-03-24 14:08 ` [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq Juri Lelli
  2017-03-24 14:09 ` [RFD PATCH 5/5] sched/deadline: make bandwidth enforcement scale-invariant Juri Lelli
  4 siblings, 1 reply; 25+ messages in thread
From: Juri Lelli @ 2017-03-24 14:08 UTC (permalink / raw)
  To: peterz, mingo, rjw, viresh.kumar
  Cc: linux-kernel, linux-pm, tglx, vincent.guittot, rostedt,
	luca.abeni, claudio, tommaso.cucinotta, bristot, mathieu.poirier,
	tkjos, joelaf, andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, juri.lelli, Ingo Molnar, Rafael J . Wysocki

The worker kthread needs to be able to change frequency on behalf of all
other threads.

Make it special, just under the STOP class.

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Claudio Scordino <claudio@evidence.eu.com>
---
 include/linux/sched.h            |  1 +
 include/uapi/linux/sched.h       |  1 +
 kernel/sched/core.c              | 19 +++++++++++++++++--
 kernel/sched/cpufreq_schedutil.c | 15 ++++++++++++---
 kernel/sched/deadline.c          |  6 ++++++
 kernel/sched/sched.h             |  8 +++++++-
 6 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 952cac87e433..6f508980f320 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1351,6 +1351,7 @@ extern int idle_cpu(int cpu);
 extern int sched_setscheduler(struct task_struct *, int, const struct sched_param *);
 extern int sched_setscheduler_nocheck(struct task_struct *, int, const struct sched_param *);
 extern int sched_setattr(struct task_struct *, const struct sched_attr *);
+extern int sched_setattr_nocheck(struct task_struct *, const struct sched_attr *);
 extern struct task_struct *idle_task(int cpu);
 
 /**
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index e2a6c7b3510b..72723859ef74 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -48,5 +48,6 @@
  */
 #define SCHED_FLAG_RESET_ON_FORK	0x01
 #define SCHED_FLAG_RECLAIM		0x02
+#define SCHED_FLAG_SPECIAL		0x04
 
 #endif /* _UAPI_LINUX_SCHED_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 378d402ee7a6..9b211c77cb54 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2495,6 +2495,9 @@ static int dl_overflow(struct task_struct *p, int policy,
 	u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
 	int cpus, err = -1;
 
+	if (attr->sched_flags & SCHED_FLAG_SPECIAL)
+		return 0;
+
 	/* !deadline task may carry old deadline bandwidth */
 	if (new_bw == p->dl.dl_bw && task_has_dl_policy(p))
 		return 0;
@@ -4052,6 +4055,10 @@ __getparam_dl(struct task_struct *p, struct sched_attr *attr)
 static bool
 __checkparam_dl(const struct sched_attr *attr)
 {
+	/* special dl tasks don't actually use any parameter */
+	if (attr->sched_flags & SCHED_FLAG_SPECIAL)
+		return true;
+
 	/* deadline != 0 */
 	if (attr->sched_deadline == 0)
 		return false;
@@ -4138,7 +4145,9 @@ static int __sched_setscheduler(struct task_struct *p,
 	}
 
 	if (attr->sched_flags &
-		~(SCHED_FLAG_RESET_ON_FORK | SCHED_FLAG_RECLAIM))
+		~(SCHED_FLAG_RESET_ON_FORK |
+		  SCHED_FLAG_RECLAIM |
+		  SCHED_FLAG_SPECIAL))
 		return -EINVAL;
 
 	/*
@@ -4260,7 +4269,8 @@ static int __sched_setscheduler(struct task_struct *p,
 		}
 #endif
 #ifdef CONFIG_SMP
-		if (dl_bandwidth_enabled() && dl_policy(policy)) {
+		if (dl_bandwidth_enabled() && dl_policy(policy) &&
+				!(attr->sched_flags & SCHED_FLAG_SPECIAL)) {
 			cpumask_t *span = rq->rd->span;
 
 			/*
@@ -4390,6 +4400,11 @@ int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
 }
 EXPORT_SYMBOL_GPL(sched_setattr);
 
+int sched_setattr_nocheck(struct task_struct *p, const struct sched_attr *attr)
+{
+	return __sched_setscheduler(p, attr, false, true);
+}
+
 /**
  * sched_setscheduler_nocheck - change the scheduling policy and/or RT priority of a thread from kernelspace.
  * @p: the task in question.
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 05f5625ea005..da67a1cf91e7 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -394,7 +394,16 @@ static void sugov_policy_free(struct sugov_policy *sg_policy)
 static int sugov_kthread_create(struct sugov_policy *sg_policy)
 {
 	struct task_struct *thread;
-	struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO / 2 };
+	struct sched_attr attr = {
+		.size = sizeof(struct sched_attr),
+		.sched_policy = SCHED_DEADLINE,
+		.sched_flags = SCHED_FLAG_SPECIAL,
+		.sched_nice = 0,
+		.sched_priority = 0,
+		.sched_runtime = 0,
+		.sched_deadline = 0,
+		.sched_period = 0,
+	};
 	struct cpufreq_policy *policy = sg_policy->policy;
 	int ret;
 
@@ -412,10 +421,10 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
 		return PTR_ERR(thread);
 	}
 
-	ret = sched_setscheduler_nocheck(thread, SCHED_FIFO, &param);
+	ret = sched_setattr_nocheck(thread, &attr);
 	if (ret) {
 		kthread_stop(thread);
-		pr_warn("%s: failed to set SCHED_FIFO\n", __func__);
+		pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
 		return ret;
 	}
 
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 5c1a205e830f..853de524c6c6 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -131,6 +131,9 @@ static void task_non_contending(struct task_struct *p)
 	if (dl_se->dl_runtime == 0)
 		return;
 
+	if (dl_entity_is_special(dl_se))
+		return;
+
 	WARN_ON(hrtimer_active(&dl_se->inactive_timer));
 	WARN_ON(dl_se->dl_non_contending);
 
@@ -968,6 +971,9 @@ static void update_curr_dl(struct rq *rq)
 
 	sched_rt_avg_update(rq, delta_exec);
 
+	if (unlikely(dl_entity_is_special(dl_se)))
+		return;
+
 	if (unlikely(dl_se->flags & SCHED_FLAG_RECLAIM))
 		delta_exec = grub_reclaim(delta_exec, rq, curr->dl.dl_bw);
 	dl_se->runtime -= delta_exec;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 93c24528ceb6..7b5e81120813 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -155,13 +155,19 @@ static inline int task_has_dl_policy(struct task_struct *p)
 	return dl_policy(p->policy);
 }
 
+static inline int dl_entity_is_special(struct sched_dl_entity *dl_se)
+{
+	return dl_se->flags & SCHED_FLAG_SPECIAL;
+}
+
 /*
  * Tells if entity @a should preempt entity @b.
  */
 static inline bool
 dl_entity_preempt(struct sched_dl_entity *a, struct sched_dl_entity *b)
 {
-	return dl_time_before(a->deadline, b->deadline);
+	return dl_entity_is_special(a) ||
+	       dl_time_before(a->deadline, b->deadline);
 }
 
 /*
-- 
2.10.0


* [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq
  2017-03-24 14:08 [RFD PATCH 0/5] SCHED_DEADLINE freq/cpu invariance and OPP selection Juri Lelli
                   ` (2 preceding siblings ...)
  2017-03-24 14:08 ` [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE Juri Lelli
@ 2017-03-24 14:08 ` Juri Lelli
  2017-03-29 22:41   ` Rafael J. Wysocki
  2017-03-24 14:09 ` [RFD PATCH 5/5] sched/deadline: make bandwidth enforcement scale-invariant Juri Lelli
  4 siblings, 1 reply; 25+ messages in thread
From: Juri Lelli @ 2017-03-24 14:08 UTC (permalink / raw)
  To: peterz, mingo, rjw, viresh.kumar
  Cc: linux-kernel, linux-pm, tglx, vincent.guittot, rostedt,
	luca.abeni, claudio, tommaso.cucinotta, bristot, mathieu.poirier,
	tkjos, joelaf, andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, juri.lelli, Ingo Molnar, Rafael J . Wysocki

No assumption can be made about the rate at which frequency updates get
triggered, as there are scheduling policies (like SCHED_DEADLINE) which
don't trigger them that frequently.

Remove such assumptions from the code.

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Claudio Scordino <claudio@evidence.eu.com>
---
 kernel/sched/cpufreq_schedutil.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index da67a1cf91e7..40f30373b709 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -233,14 +233,13 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
 		 * If the CPU utilization was last updated before the previous
 		 * frequency update and the time elapsed between the last update
 		 * of the CPU utilization and the last frequency update is long
-		 * enough, don't take the CPU into account as it probably is
-		 * idle now (and clear iowait_boost for it).
+		 * enough, reset iowait_boost, as it probably is not boosted
+		 * anymore now.
 		 */
 		delta_ns = last_freq_update_time - j_sg_cpu->last_update;
-		if (delta_ns > TICK_NSEC) {
+		if (delta_ns > TICK_NSEC)
 			j_sg_cpu->iowait_boost = 0;
-			continue;
-		}
+
 		if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
 			return policy->cpuinfo.max_freq;
 
-- 
2.10.0


* [RFD PATCH 5/5] sched/deadline: make bandwidth enforcement scale-invariant
  2017-03-24 14:08 [RFD PATCH 0/5] SCHED_DEADLINE freq/cpu invariance and OPP selection Juri Lelli
                   ` (3 preceding siblings ...)
  2017-03-24 14:08 ` [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq Juri Lelli
@ 2017-03-24 14:09 ` Juri Lelli
  4 siblings, 0 replies; 25+ messages in thread
From: Juri Lelli @ 2017-03-24 14:09 UTC (permalink / raw)
  To: peterz, mingo, rjw, viresh.kumar
  Cc: linux-kernel, linux-pm, tglx, vincent.guittot, rostedt,
	luca.abeni, claudio, tommaso.cucinotta, bristot, mathieu.poirier,
	tkjos, joelaf, andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, juri.lelli, Ingo Molnar, Rafael J . Wysocki

Apply a frequency and cpu scale-invariance correction factor to bandwidth
enforcement (similar to what we already do for fair utilization tracking).

Each delta_exec gets scaled considering the current frequency and the
maximum cpu capacity; this means that the reservation runtime parameter
(which needs to be specified by profiling the task's execution at max
frequency on the biggest capacity core) gets scaled accordingly.

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Claudio Scordino <claudio@evidence.eu.com>
---
 kernel/sched/deadline.c | 27 +++++++++++++++++++++++----
 kernel/sched/fair.c     |  2 --
 kernel/sched/sched.h    |  2 ++
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 853de524c6c6..7141d6f51ee0 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -940,7 +940,9 @@ static void update_curr_dl(struct rq *rq)
 {
 	struct task_struct *curr = rq->curr;
 	struct sched_dl_entity *dl_se = &curr->dl;
-	u64 delta_exec;
+	u64 delta_exec, scaled_delta_exec;
+	unsigned long scale_freq, scale_cpu;
+	int cpu = cpu_of(rq);
 
 	if (!dl_task(curr) || !on_dl_rq(dl_se))
 		return;
@@ -974,9 +976,26 @@ static void update_curr_dl(struct rq *rq)
 	if (unlikely(dl_entity_is_special(dl_se)))
 		return;
 
-	if (unlikely(dl_se->flags & SCHED_FLAG_RECLAIM))
-		delta_exec = grub_reclaim(delta_exec, rq, curr->dl.dl_bw);
-	dl_se->runtime -= delta_exec;
+	/*
+	 * XXX When clock frequency is controlled by the scheduler (via
+	 * schedutil governor) we implement GRUB-PA: the spare reclaimed
+	 * bandwidth is used to clock down frequency.
+	 *
+	 * However, what below seems to assume scheduler to always be in
+	 * control of clock frequency; when running at a fixed frequency
+	 * (e.g., performance or userspace governor), shouldn't we instead
+	 * use the grub_reclaim mechanism below?
+	 *
+	 * if (unlikely(dl_se->flags & SCHED_FLAG_RECLAIM))
+	 *	delta_exec = grub_reclaim(delta_exec, rq, curr->dl.dl_bw);
+	 * dl_se->runtime -= delta_exec;
+	 */
+	scale_freq = arch_scale_freq_capacity(NULL, cpu);
+	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+
+	scaled_delta_exec = cap_scale(delta_exec, scale_freq);
+	scaled_delta_exec = cap_scale(scaled_delta_exec, scale_cpu);
+	dl_se->runtime -= scaled_delta_exec;
 
 throttle:
 	if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2805bd7c8994..37f12d0a3bc4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2818,8 +2818,6 @@ static u32 __compute_runnable_contrib(u64 n)
 	return contrib + runnable_avg_yN_sum[n];
 }
 
-#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
-
 /*
  * We can represent the historical contribution to runnable average as the
  * coefficients of a geometric series.  To do this we sub-divide our runnable
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 7b5e81120813..81bd048ed181 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -155,6 +155,8 @@ static inline int task_has_dl_policy(struct task_struct *p)
 	return dl_policy(p->policy);
 }
 
+#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
+
 static inline int dl_entity_is_special(struct sched_dl_entity *dl_se)
 {
 	return dl_se->flags & SCHED_FLAG_SPECIAL;
-- 
2.10.0


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-24 14:08 ` [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE Juri Lelli
@ 2017-03-27 16:50   ` Peter Zijlstra
  2017-03-27 17:01     ` Juri Lelli
                       ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-27 16:50 UTC (permalink / raw)
  To: Juri Lelli
  Cc: mingo, rjw, viresh.kumar, linux-kernel, linux-pm, tglx,
	vincent.guittot, rostedt, luca.abeni, claudio, tommaso.cucinotta,
	bristot, mathieu.poirier, tkjos, joelaf, andresoportus,
	morten.rasmussen, dietmar.eggemann, patrick.bellasi, Ingo Molnar,
	Rafael J . Wysocki, Mark Brown

On Fri, Mar 24, 2017 at 02:08:58PM +0000, Juri Lelli wrote:
> Worker kthread needs to be able to change frequency for all other
> threads.
> 
> Make it special, just under STOP class.

*yuck* ;-)

So imagine our I2C/SPI bus is 'busy' and its mutex taken, then this
'special' task will need to boost it. Now add BWI to your thinking and
shudder.


On IRC broonie mentioned that:

 - most PMIC operations are fire and forget (no need to wait for a
   response).
 - PMIC 'packets' are 'small'.
 - SPI has the possibility to push stuff on the queue.

Taken together this seems to suggest we can rework cpufreq drivers to
function in-context, either directly push the packet on the bus if
available, or queue it and let whoever owns it sort it without blocking.

It might be possible to rework/augment I2C to also support pushing stuff
on a queue.


So if we can make all that work, we can do away with this horrible
horrible kthread. Which is, IMO, a much better solution.

Thoughts?


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-27 16:50   ` Peter Zijlstra
@ 2017-03-27 17:01     ` Juri Lelli
  2017-03-27 17:05       ` Rafael J. Wysocki
  2017-03-27 18:05     ` Mark Brown
  2017-03-28  9:29     ` Vincent Guittot
  2 siblings, 1 reply; 25+ messages in thread
From: Juri Lelli @ 2017-03-27 17:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, rjw, viresh.kumar, linux-kernel, linux-pm, tglx,
	vincent.guittot, rostedt, luca.abeni, claudio, tommaso.cucinotta,
	bristot, mathieu.poirier, tkjos, joelaf, andresoportus,
	morten.rasmussen, dietmar.eggemann, patrick.bellasi, Ingo Molnar,
	Rafael J . Wysocki, Mark Brown

On 27/03/17 18:50, Peter Zijlstra wrote:
> On Fri, Mar 24, 2017 at 02:08:58PM +0000, Juri Lelli wrote:
> > Worker kthread needs to be able to change frequency for all other
> > threads.
> > 
> > Make it special, just under STOP class.
> 
> *yuck* ;-)
> 

Eh, I know. :/

> So imagine our I2C/SPI bus is 'busy' and its mutex taken, then this
> 'special' task will need to boost it. Now add BWI to your thinking and
> shudder.
> 

Currently that kthread is FIFO already, so boosting still applies. Not as
bad as in the BWI case though. More thinking required.

> 
> On IRC broonie mentioned that:
> 
>  - most PMIC operations are fire and forget (no need to wait for a
>    response).
>  - PMIC 'packets' are 'small'.
>  - SPI has the possibility to push stuff on the queue.
> 
> Taken together this seems to suggest we can rework cpufreq drivers to
> function in-context, either directly push the packet on the bus if
> available, or queue it and let whoever owns it sort it without blocking.
> 
> It might be possible to rework/augment I2C to also support pushing stuff
> on a queue.
> 
> 
> So if we can make all that work, we can do away with this horrible
> horrible kthread. Which is, IMO, a much better solution.
> 
> Thoughts?

Right. This is more a schedutil (cpufreq) problem though, IMHO. Even if
I agree that what you are proposing is way more clean (and here I
actually assume it's feasible at all), I fear it will take quite some
time to get reworked.

Do we want to wait until that moment to get DEADLINE contribution
accounted for? :(

Thanks,

- Juri


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-27 17:01     ` Juri Lelli
@ 2017-03-27 17:05       ` Rafael J. Wysocki
  2017-03-27 17:13         ` Juri Lelli
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2017-03-27 17:05 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Peter Zijlstra, mingo, viresh.kumar, linux-kernel, linux-pm,
	tglx, vincent.guittot, rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, tkjos, joelaf,
	andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, Ingo Molnar, Rafael J . Wysocki, Mark Brown

On Monday, March 27, 2017 06:01:34 PM Juri Lelli wrote:
> On 27/03/17 18:50, Peter Zijlstra wrote:
> > On Fri, Mar 24, 2017 at 02:08:58PM +0000, Juri Lelli wrote:
> > > Worker kthread needs to be able to change frequency for all other
> > > threads.
> > > 
> > > Make it special, just under STOP class.
> > 
> > *yuck* ;-)
> > 
> 
> Eh, I know. :/
> 
> > So imagine our I2C/SPI bus is 'busy' and its mutex taken, then this
> > 'special' task will need to boost it. Now add BWI to your thinking and
> > shudder.
> > 
> 
> Currently that kthread is FIFO already, so boosting still applies. Not as
> bad as in the BWI case though. More thinking required.
> 
> > 
> > On IRC broonie mentioned that:
> > 
> >  - most PMIC operations are fire and forget (no need to wait for a
> >    response).
> >  - PMIC 'packets' are 'small'.
> >  - SPI has the possibility to push stuff on the queue.
> > 
> > Taken together this seems to suggest we can rework cpufreq drivers to
> > function in-context, either directly push the packet on the bus if
> > available, or queue it and let whoever owns it sort it without blocking.
> > 
> > It might be possible to rework/augment I2C to also support pushing stuff
> > on a queue.
> > 
> > 
> > So if we can make all that work, we can do away with this horrible
> > horrible kthread. Which is, IMO, a much better solution.
> > 
> > Thoughts?
> 
> Right. This is more a schedutil (cpufreq) problem though, IMHO. Even if
> I agree that what you are proposing is way more clean (and here I
> actually assume it's feasible at all), I fear it will take quite some
> time to get reworked.

Why do you think so?

Thanks,
Rafael


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-27 17:05       ` Rafael J. Wysocki
@ 2017-03-27 17:13         ` Juri Lelli
  2017-03-27 17:37           ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Juri Lelli @ 2017-03-27 17:13 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Zijlstra, mingo, viresh.kumar, linux-kernel, linux-pm,
	tglx, vincent.guittot, rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, tkjos, joelaf,
	andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, Ingo Molnar, Rafael J . Wysocki, Mark Brown

On 27/03/17 19:05, Rafael J. Wysocki wrote:
> On Monday, March 27, 2017 06:01:34 PM Juri Lelli wrote:
> > On 27/03/17 18:50, Peter Zijlstra wrote:
> > > On Fri, Mar 24, 2017 at 02:08:58PM +0000, Juri Lelli wrote:
> > > > Worker kthread needs to be able to change frequency for all other
> > > > threads.
> > > > 
> > > > Make it special, just under STOP class.
> > > 
> > > *yuck* ;-)
> > > 
> > 
> > Eh, I know. :/
> > 
> > > So imagine our I2C/SPI bus is 'busy' and its mutex taken, then this
> > > 'special' task will need to boost it. Now add BWI to your thinking and
> > > shudder.
> > > 
> > 
> > Currently that kthread is FIFO already, so boosting still applies. Not as
> > bad as in the BWI case though. More thinking required.
> > 
> > > 
> > > On IRC broonie mentioned that:
> > > 
> > >  - most PMIC operations are fire and forget (no need to wait for a
> > >    response).
> > >  - PMIC 'packets' are 'small'.
> > >  - SPI has the possibility to push stuff on the queue.
> > > 
> > > Taken together this seems to suggest we can rework cpufreq drivers to
> > > function in-context, either directly push the packet on the bus if
> > > available, or queue it and let whoever owns it sort it without blocking.
> > > 
> > > It might be possible to rework/augment I2C to also support pushing stuff
> > > on a queue.
> > > 
> > > 
> > > So if we can make all that work, we can do away with this horrible
> > > horrible kthread. Which is, IMO, a much better solution.
> > > 
> > > Thoughts?
> > 
> > Right. This is more a schedutil (cpufreq) problem though, IMHO. Even if
> > I agree that what you are proposing is way more clean (and here I
> > actually assume it's feasible at all), I fear it will take quite some
> > time to get reworked.
> 
> Why do you think so?
> 

It simply seemed a major rework to me. :)


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-27 17:13         ` Juri Lelli
@ 2017-03-27 17:37           ` Rafael J. Wysocki
  2017-03-27 18:09             ` Mark Brown
  2017-03-30 15:50             ` Vikram Mulukutla
  0 siblings, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2017-03-27 17:37 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Peter Zijlstra, mingo, viresh.kumar, linux-kernel, linux-pm,
	tglx, vincent.guittot, rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, tkjos, joelaf,
	andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, Ingo Molnar, Rafael J . Wysocki, Mark Brown

On Monday, March 27, 2017 06:13:36 PM Juri Lelli wrote:
> On 27/03/17 19:05, Rafael J. Wysocki wrote:
> > On Monday, March 27, 2017 06:01:34 PM Juri Lelli wrote:
> > > On 27/03/17 18:50, Peter Zijlstra wrote:
> > > > On Fri, Mar 24, 2017 at 02:08:58PM +0000, Juri Lelli wrote:
> > > > > Worker kthread needs to be able to change frequency for all other
> > > > > threads.
> > > > > 
> > > > > Make it special, just under STOP class.
> > > > 
> > > > *yuck* ;-)
> > > > 
> > > 
> > > Eh, I know. :/
> > > 
> > > > So imagine our I2C/SPI bus is 'busy' and its mutex taken, then this
> > > > 'special' task will need to boost it. Now add BWI to your thinking and
> > > > shudder.
> > > > 
> > > 
> > > Currently that kthread is FIFO already, so boosting still applies. Not as
> > > bad as in the BWI case though. More thinking required.
> > > 
> > > > 
> > > > On IRC broonie mentioned that:
> > > > 
> > > >  - most PMIC operations are fire and forget (no need to wait for a
> > > >    response).
> > > >  - PMIC 'packets' are 'small'.
> > > >  - SPI has the possibility to push stuff on the queue.
> > > > 
> > > > Taken together this seems to suggest we can rework cpufreq drivers to
> > > > function in-context, either directly push the packet on the bus if
> > > > available, or queue it and let whoever owns it sort it without blocking.
> > > > 
> > > > It might be possible to rework/augment I2C to also support pushing stuff
> > > > on a queue.
> > > > 
> > > > 
> > > > So if we can make all that work, we can do away with this horrible
> > > > horrible kthread. Which is, IMO, a much better solution.
> > > > 
> > > > Thoughts?
> > > 
> > > Right. This is more a schedutil (cpufreq) problem though, IMHO. Even if
> > > I agree that what you are proposing is way more clean (and here I
> > > actually assume it's feasible at all), I fear it will take quite some
> > > time to get reworked.
> > 
> > Why do you think so?
> > 
> 
> It simply seemed a major rework to me. :)

OK

So there are two pieces here.

One is that if we want *all* drivers to work with schedutil, we need to keep
the kthread for the ones that will never be reworked (because nobody cares
etc).  But then perhaps the kthread implementation may be left alone (because
nobody cares etc).

The second one is that there are drivers operating in-context that work with
schedutil already, so I don't see major obstacles to making more drivers work
that way.  That would be only a matter of reworking the drivers in question.

Thanks,
Rafael


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-27 16:50   ` Peter Zijlstra
  2017-03-27 17:01     ` Juri Lelli
@ 2017-03-27 18:05     ` Mark Brown
  2017-03-28  9:29     ` Vincent Guittot
  2 siblings, 0 replies; 25+ messages in thread
From: Mark Brown @ 2017-03-27 18:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juri Lelli, mingo, rjw, viresh.kumar, linux-kernel, linux-pm,
	tglx, vincent.guittot, rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, tkjos, joelaf,
	andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, Ingo Molnar, Rafael J . Wysocki

On Mon, Mar 27, 2017 at 06:50:11PM +0200, Peter Zijlstra wrote:

> Taken together this seems to suggest we can rework cpufreq drivers to
> function in-context, either directly push the packet on the bus if
> available, or queue it and let whoever owns it sort it without blocking.

Note that this isn't really cpufreq's problem per se, cpufreq is going
to work out what voltage it wants then tell the regulator API to do that
which will in turn go communicate with the PMIC somehow, typically with
regmap over I2C or SPI.  All those layers use mutexes by default.

> So if we can make all that work, we can do away with this horrible
> horrible kthread. Which is, IMO, a much better solution.

> Thoughts?

I think it's doable, but a lot of work, especially in the regulator code.


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-27 17:37           ` Rafael J. Wysocki
@ 2017-03-27 18:09             ` Mark Brown
  2017-03-30 15:50             ` Vikram Mulukutla
  1 sibling, 0 replies; 25+ messages in thread
From: Mark Brown @ 2017-03-27 18:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Juri Lelli, Peter Zijlstra, mingo, viresh.kumar, linux-kernel,
	linux-pm, tglx, vincent.guittot, rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, tkjos, joelaf,
	andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, Ingo Molnar, Rafael J . Wysocki

On Mon, Mar 27, 2017 at 07:37:39PM +0200, Rafael J. Wysocki wrote:

> One is that if we want *all* drivers to work with schedutil, we need to keep
> the kthread for the ones that will never be reworked (because nobody cares
> etc).  But then perhaps the kthread implementation may be left alone (because
> nobody cares etc).

A very large proportion of the affected code (at least once you get into
the realm of talking to the PMICs) is shared in core code in various
subsystems rather than in individual drivers, so the problem might be
tractable.


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-27 16:50   ` Peter Zijlstra
  2017-03-27 17:01     ` Juri Lelli
  2017-03-27 18:05     ` Mark Brown
@ 2017-03-28  9:29     ` Vincent Guittot
  2017-03-28 10:20       ` Peter Zijlstra
  2 siblings, 1 reply; 25+ messages in thread
From: Vincent Guittot @ 2017-03-28  9:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juri Lelli, mingo, rjw, viresh kumar, linux-kernel, linux-pm,
	Thomas Gleixner, Steven Rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, Mathieu Poirier, Todd Kjos,
	Joel Fernandes, Andres Oportus, Morten Rasmussen,
	Dietmar Eggemann, Patrick Bellasi, Ingo Molnar,
	Rafael J . Wysocki, Mark Brown

On 27 March 2017 at 18:50, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Mar 24, 2017 at 02:08:58PM +0000, Juri Lelli wrote:
>> Worker kthread needs to be able to change frequency for all other
>> threads.
>>
>> Make it special, just under STOP class.
>
> *yuck* ;-)
>
> So imagine our I2C/SPI bus is 'busy' and its mutex taken, then this
> 'special' task will need to boost it. Now add BWI to your thinking and
> shudder.
>
>
> On IRC broonie mentioned that:
>
>  - most PMIC operations are fire and forget (no need to wait for a
>    response).
>  - PMIC 'packets' are 'small'.
>  - SPI has the possibility to push stuff on the queue.
>
> Taken together this seems to suggest we can rework cpufreq drivers to
> function in-context, either directly push the packet on the bus if
> available, or queue it and let whoever owns it sort it without blocking.
>
> It might be possible to rework/augment I2C to also support pushing stuff
> on a queue.

But sending the new voltage value to the PMIC is only part of the
sequence. When cpufreq sets a new OPP, it does:

 - set the new voltage
 - wait for the voltage to settle down
 - set the new clock frequency

You may even have to switch to an intermediate clock source.

When such a sequence is managed by the kernel, we can't easily get rid
of the kthread.

>
>
> So if we can make all that work, we can do away with this horrible
> horrible kthread. Which is, IMO, a much better solution.
>
> Thoughts?


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-28  9:29     ` Vincent Guittot
@ 2017-03-28 10:20       ` Peter Zijlstra
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-28 10:20 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Juri Lelli, mingo, rjw, viresh kumar, linux-kernel, linux-pm,
	Thomas Gleixner, Steven Rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, Mathieu Poirier, Todd Kjos,
	Joel Fernandes, Andres Oportus, Morten Rasmussen,
	Dietmar Eggemann, Patrick Bellasi, Ingo Molnar,
	Rafael J . Wysocki, Mark Brown

On Tue, Mar 28, 2017 at 11:29:31AM +0200, Vincent Guittot wrote:
> On 27 March 2017 at 18:50, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, Mar 24, 2017 at 02:08:58PM +0000, Juri Lelli wrote:
> >> Worker kthread needs to be able to change frequency for all other
> >> threads.
> >>
> >> Make it special, just under STOP class.
> >
> > *yuck* ;-)
> >
> > So imagine our I2C/SPI bus is 'busy' and its mutex taken, then this
> > 'special' task will need to boost it. Now add BWI to your thinking and
> > shudder.
> >
> >
> > On IRC broonie mentioned that:
> >
> >  - most PMIC operations are fire and forget (no need to wait for a
> >    response).
> >  - PMIC 'packets' are 'small'.
> >  - SPI has the possibility to push stuff on the queue.
> >
> > Taken together this seems to suggest we can rework cpufreq drivers to
> > function in-context, either directly push the packet on the bus if
> > available, or queue it and let whoever owns it sort it without blocking.
> >
> > It might be possible to rework/augment I2C to also support pushing stuff
> > on a queue.
> 
> But sending the new voltage value to the PMIC is only part of the
> sequence. When cpufreq sets a new OPP, it does:
> 
>  - set the new voltage
>  - wait for the voltage to settle down
>  - set the new clock frequency
> 
> You may even have to switch to an intermediate clock source.
> 
> When such a sequence is managed by the kernel, we can't easily get rid
> of the kthread.

That stinks :-(

The whole blocking and bandwidth inheritance stuff gives me nightmares.
Makes all this stuff almost impossible to analyse.

Esp. if the bus (SPI/I2C) is shared with some other 'expensive' device
like a DSP or MMC flash crud.


* Re: [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq
  2017-03-24 14:08 ` [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq Juri Lelli
@ 2017-03-29 22:41   ` Rafael J. Wysocki
  2017-03-30  8:58     ` Juri Lelli
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2017-03-29 22:41 UTC (permalink / raw)
  To: Juri Lelli
  Cc: peterz, mingo, viresh.kumar, linux-kernel, linux-pm, tglx,
	vincent.guittot, rostedt, luca.abeni, claudio, tommaso.cucinotta,
	bristot, mathieu.poirier, tkjos, joelaf, andresoportus,
	morten.rasmussen, dietmar.eggemann, patrick.bellasi, Ingo Molnar

On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
> No assumption can be made upon the rate at which frequency updates get
> triggered, as there are scheduling policies (like SCHED_DEADLINE) which
> don't trigger them so frequently.
> 
> Remove such assumption from the code.

But the util/max values for idle CPUs may be stale, no?

Thanks,
Rafael


* Re: [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq
  2017-03-29 22:41   ` Rafael J. Wysocki
@ 2017-03-30  8:58     ` Juri Lelli
  2017-03-30 13:21       ` Vincent Guittot
  2017-03-30 20:13       ` Rafael J. Wysocki
  0 siblings, 2 replies; 25+ messages in thread
From: Juri Lelli @ 2017-03-30  8:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: peterz, mingo, viresh.kumar, linux-kernel, linux-pm, tglx,
	vincent.guittot, rostedt, luca.abeni, claudio, tommaso.cucinotta,
	bristot, mathieu.poirier, tkjos, joelaf, andresoportus,
	morten.rasmussen, dietmar.eggemann, patrick.bellasi, Ingo Molnar

Hi,

On 30/03/17 00:41, Rafael J. Wysocki wrote:
> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
> > No assumption can be made upon the rate at which frequency updates get
> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
> > don't trigger them so frequently.
> > 
> > Remove such assumption from the code.
> 
> But the util/max values for idle CPUs may be stale, no?
> 

Right, that might be a problem. A proper solution, I think, would be to
remotely update such values for idle CPUs, and I believe Vincent is
working on a patch for that.

In the meantime, a couple of workarounds (slightly changing the current
behaviour) come to mind:

 - consider TICK_NSEC (and continue) only when SCHED_CPUFREQ_DL is not set
 - remove the CFS contribution (without triggering a freq update) when a
   CPU enters idle; this might not work well, though, as we probably want
   to keep the blocked util contribution around for a bit

Which do you think is the way to go?

Thanks,

- Juri


* Re: [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq
  2017-03-30  8:58     ` Juri Lelli
@ 2017-03-30 13:21       ` Vincent Guittot
  2017-03-30 20:13       ` Rafael J. Wysocki
  1 sibling, 0 replies; 25+ messages in thread
From: Vincent Guittot @ 2017-03-30 13:21 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Rafael J. Wysocki, Peter Zijlstra, mingo, viresh kumar,
	linux-kernel, linux-pm, Thomas Gleixner, Steven Rostedt,
	Luca Abeni, claudio, Tommaso Cucinotta, bristot, Mathieu Poirier,
	Todd Kjos, Joel Fernandes, Andres Oportus, Morten Rasmussen,
	Dietmar Eggemann, Patrick Bellasi, Ingo Molnar

On 30 March 2017 at 10:58, Juri Lelli <juri.lelli@arm.com> wrote:
> Hi,
>
> On 30/03/17 00:41, Rafael J. Wysocki wrote:
>> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
>> > No assumption can be made upon the rate at which frequency updates get
>> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
>> > don't trigger them so frequently.
>> >
>> > Remove such assumption from the code.
>>
>> But the util/max values for idle CPUs may be stale, no?
>>
>
> Right, that might be a problem. A proper solution I think would be to
> remotely update such values for idle CPUs, and I believe Vincent is
> working on a patch for that.

Yes. I'm working on a patch that will regularly update the blocked
load/utilization of idle CPUs. This update will be done at a slow pace,
to make sure that utilization and load are decayed regularly.

>
> As mid-term workarounds, changing a bit the current one, come to my
> mind:
>
>  - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
>  - remove CFS contribution (without triggering a freq update) when a CPU
>    enters IDLE; this might not work well, though, as we probably want
>    to keep in blocked util contribution for a bit
>
> What you think is the way to go?
>
> Thanks,
>
> - Juri


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-27 17:37           ` Rafael J. Wysocki
  2017-03-27 18:09             ` Mark Brown
@ 2017-03-30 15:50             ` Vikram Mulukutla
  2017-03-30 20:22               ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Vikram Mulukutla @ 2017-03-30 15:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Juri Lelli, Peter Zijlstra, mingo, viresh.kumar, linux-kernel,
	linux-pm, tglx, vincent.guittot, rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, tkjos, joelaf,
	andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, Ingo Molnar, Rafael J . Wysocki, Mark Brown,
	linux-kernel-owner



> OK
> 
> So there are two pieces here.
> 
> One is that if we want *all* drivers to work with schedutil, we need to 
> keep
> the kthread for the ones that will never be reworked (because nobody 
> cares
> etc).  But then perhaps the kthread implementation may be left alone 
> (because
> nobody cares etc).
> 
> The second one is that there are drivers operating in-context that work 
> with
> schedutil already, so I don't see major obstacles to making more 
> drivers work
> that way.  That would be only a matter of reworking the drivers in 
> question.
> 
> Thanks,
> Rafael

There are some MSM platforms that do need a kthread and would love to
use schedutil. This is all mainly due to the point that Vincent raised:
having to actually wait for voltage transitions before clock switches.
I can't speak about the future, but that's the situation right now.
Leaving the kthread alone for now would be appreciated!

Thanks,
Vikram


* Re: [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq
  2017-03-30  8:58     ` Juri Lelli
  2017-03-30 13:21       ` Vincent Guittot
@ 2017-03-30 20:13       ` Rafael J. Wysocki
  2017-03-31  7:31         ` Juri Lelli
  1 sibling, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2017-03-30 20:13 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Rafael J. Wysocki, Peter Zijlstra, Ingo Molnar, Viresh Kumar,
	Linux Kernel Mailing List, Linux PM, Thomas Gleixner,
	Vincent Guittot, Steven Rostedt, Luca Abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, Todd Kjos,
	Joel Fernandes, Andres Oportus, Morten Rasmussen,
	Dietmar Eggemann, Patrick Bellasi, Ingo Molnar

On Thu, Mar 30, 2017 at 10:58 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> Hi,

Hi,

> On 30/03/17 00:41, Rafael J. Wysocki wrote:
>> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
>> > No assumption can be made upon the rate at which frequency updates get
>> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
>> > don't trigger them so frequently.
>> >
>> > Remove such assumption from the code.
>>
>> But the util/max values for idle CPUs may be stale, no?
>>
>
> Right, that might be a problem. A proper solution I think would be to
> remotely update such values for idle CPUs, and I believe Vincent is
> working on a patch for that.
>
> In the meantime, a couple of workarounds (slightly changing the current
> behaviour) come to mind:
>
>  - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
>  - remove CFS contribution (without triggering a freq update) when a CPU
>    enters IDLE; this might not work well, though, as we probably want
>    to keep in blocked util contribution for a bit
>
> Which do you think is the way to go?

Well, do we want SCHED_DEADLINE util contribution to be there even for
idle CPUs?

Thanks,
Rafael


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-30 15:50             ` Vikram Mulukutla
@ 2017-03-30 20:22               ` Rafael J. Wysocki
  2017-03-31  7:26                 ` Juri Lelli
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2017-03-30 20:22 UTC (permalink / raw)
  To: Vikram Mulukutla, Mark Brown
  Cc: Juri Lelli, Peter Zijlstra, mingo, viresh.kumar, linux-kernel,
	linux-pm, tglx, vincent.guittot, rostedt, luca.abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, tkjos, joelaf,
	andresoportus, morten.rasmussen, dietmar.eggemann,
	patrick.bellasi, Ingo Molnar, linux-kernel-owner

On Thursday, March 30, 2017 08:50:11 AM Vikram Mulukutla wrote:
> 
> > OK
> > 
> > So there are two pieces here.
> > 
> > One is that if we want *all* drivers to work with schedutil, we need to 
> > keep
> > the kthread for the ones that will never be reworked (because nobody 
> > cares
> > etc).  But then perhaps the kthread implementation may be left alone 
> > (because
> > nobody cares etc).
> > 
> > The second one is that there are drivers operating in-context that work 
> > with
> > schedutil already, so I don't see major obstacles to making more 
> > drivers work
> > that way.  That would be only a matter of reworking the drivers in 
> > question.
> > 
> > Thanks,
> > Rafael
> 
> There are some MSM platforms that do need a kthread and would love to 
> use
> schedutil. This is all mainly due to the point that Vincent raised; 
> having
> to actually wait for voltage transitions before clock switches. I can't
> speak about the future, but that's the situation right now. Leaving the
> kthread alone for now would be appreciated!

I was not arguing for removing the kthread (quite opposite rather).

My point was that *if* it is viable to rework drivers to operate in-context,
that would be the way to go IMO, instead of messing with the kthread thing.

Thanks,
Rafael


* Re: [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
  2017-03-30 20:22               ` Rafael J. Wysocki
@ 2017-03-31  7:26                 ` Juri Lelli
  0 siblings, 0 replies; 25+ messages in thread
From: Juri Lelli @ 2017-03-31  7:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Vikram Mulukutla, Mark Brown, Peter Zijlstra, mingo,
	viresh.kumar, linux-kernel, linux-pm, tglx, vincent.guittot,
	rostedt, luca.abeni, claudio, tommaso.cucinotta, bristot,
	mathieu.poirier, tkjos, joelaf, andresoportus, morten.rasmussen,
	dietmar.eggemann, patrick.bellasi, Ingo Molnar,
	linux-kernel-owner

On 30/03/17 22:22, Rafael J. Wysocki wrote:
> On Thursday, March 30, 2017 08:50:11 AM Vikram Mulukutla wrote:
> > 
> > > OK
> > > 
> > > So there are two pieces here.
> > > 
> > > One is that if we want *all* drivers to work with schedutil, we need to 
> > > keep
> > > the kthread for the ones that will never be reworked (because nobody 
> > > cares
> > > etc).  But then perhaps the kthread implementation may be left alone 
> > > (because
> > > nobody cares etc).
> > > 
> > > The second one is that there are drivers operating in-context that work 
> > > with
> > > schedutil already, so I don't see major obstacles to making more 
> > > drivers work
> > > that way.  That would be only a matter of reworking the drivers in 
> > > question.
> > > 
> > > Thanks,
> > > Rafael
> > 
> > There are some MSM platforms that do need a kthread and would love to 
> > use
> > schedutil. This is all mainly due to the point that Vincent raised; 
> > having
> > to actually wait for voltage transitions before clock switches. I can't
> > speak about the future, but that's the situation right now. Leaving the
> > kthread alone for now would be appreciated!
> 
> I was not arguing for removing the kthread (quite opposite rather).
> 
> My point was that *if* it is viable to rework drivers to operate in-context,
> that would be the way to go IMO, instead of messing with the kthread thing.
> 

Right, I agree. The problem is that in principle we might still want to
use DEADLINE on such platforms (MSM being a perfect example), so IMHO
we should still try to find a solution for the kthread anyway.


* Re: [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq
  2017-03-30 20:13       ` Rafael J. Wysocki
@ 2017-03-31  7:31         ` Juri Lelli
  2017-03-31  9:03           ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Juri Lelli @ 2017-03-31  7:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Peter Zijlstra, Ingo Molnar, Viresh Kumar,
	Linux Kernel Mailing List, Linux PM, Thomas Gleixner,
	Vincent Guittot, Steven Rostedt, Luca Abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, Todd Kjos,
	Joel Fernandes, Andres Oportus, Morten Rasmussen,
	Dietmar Eggemann, Patrick Bellasi, Ingo Molnar

On 30/03/17 22:13, Rafael J. Wysocki wrote:
> On Thu, Mar 30, 2017 at 10:58 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> > Hi,
> 
> Hi,
> 
> > On 30/03/17 00:41, Rafael J. Wysocki wrote:
> >> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
> >> > No assumption can be made upon the rate at which frequency updates get
> >> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
> >> > don't trigger them so frequently.
> >> >
> >> > Remove such assumption from the code.
> >>
> >> But the util/max values for idle CPUs may be stale, no?
> >>
> >
> > Right, that might be a problem. A proper solution I think would be to
> > remotely update such values for idle CPUs, and I believe Vincent is
> > working on a patch for that.
> >
> > In the meantime, a couple of workarounds (slightly changing the current
> > behaviour) come to mind:
> >
> >  - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
> >  - remove CFS contribution (without triggering a freq update) when a CPU
> >    enters IDLE; this might not work well, though, as we probably want
> >    to keep in blocked util contribution for a bit
> >
> > Which do you think is the way to go?
> 
> Well, do we want SCHED_DEADLINE util contribution to be there even for
> idle CPUs?
> 

The DEADLINE util contribution is removed by the reclaiming mechanism,
even if the CPU is idle, when we know (applying the GRUB algorithm
rules [1]) that it can't be used by a task anymore (roughly speaking).
So we shouldn't have this problem in the DEADLINE case.

[1] https://marc.info/?l=linux-kernel&m=149029880524038
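For reference, the core GRUB depletion rule can be sketched as follows
(this is the idea from the GRUB papers, not necessarily the exact form
used in the patches):

```latex
% While a SCHED_DEADLINE task runs, its remaining runtime q is depleted
% proportionally to the active utilization U_act rather than to
% wall-clock time alone:
\[
  \frac{dq}{dt} = -\,U_{\mathrm{act}}
\]
% so once a task's contribution leaves U_act (its 0-lag time passes),
% the freed bandwidth stops being accounted, even if the CPU is idle.
```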


* Re: [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq
  2017-03-31  7:31         ` Juri Lelli
@ 2017-03-31  9:03           ` Rafael J. Wysocki
  2017-03-31  9:16             ` Juri Lelli
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2017-03-31  9:03 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Peter Zijlstra,
	Ingo Molnar, Viresh Kumar, Linux Kernel Mailing List, Linux PM,
	Thomas Gleixner, Vincent Guittot, Steven Rostedt, Luca Abeni,
	claudio, tommaso.cucinotta, bristot, mathieu.poirier, Todd Kjos,
	Joel Fernandes, Andres Oportus, Morten Rasmussen,
	Dietmar Eggemann, Patrick Bellasi, Ingo Molnar

On Fri, Mar 31, 2017 at 9:31 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> On 30/03/17 22:13, Rafael J. Wysocki wrote:
>> On Thu, Mar 30, 2017 at 10:58 AM, Juri Lelli <juri.lelli@arm.com> wrote:
>> > Hi,
>>
>> Hi,
>>
>> > On 30/03/17 00:41, Rafael J. Wysocki wrote:
>> >> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
>> >> > No assumption can be made upon the rate at which frequency updates get
>> >> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
>> >> > don't trigger them so frequently.
>> >> >
>> >> > Remove such assumption from the code.
>> >>
>> >> But the util/max values for idle CPUs may be stale, no?
>> >>
>> >
>> > Right, that might be a problem. A proper solution I think would be to
>> > remotely update such values for idle CPUs, and I believe Vincent is
>> > working on a patch for that.
>> >
> >> > In the meantime, a couple of workarounds (slightly changing the current
> >> > behaviour) come to mind:
>> >
>> >  - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
>> >  - remove CFS contribution (without triggering a freq update) when a CPU
>> >    enters IDLE; this might not work well, though, as we probably want
>> >    to keep in blocked util contribution for a bit
>> >
> >> > Which do you think is the way to go?
>>
>> Well, do we want SCHED_DEADLINE util contribution to be there even for
>> idle CPUs?
>>
>
> DEADLINE util contribution is removed, even if the CPU is idle, by the
> reclaiming mechanism when we know (applying GRUB algorithm rules [1])
> that it can't be used anymore by a task (roughly speaking). So, we
> shouldn't have this problem in the DEADLINE case.
>
> [1] https://marc.info/?l=linux-kernel&m=149029880524038

OK

Why don't you store the contributions from DL and CFS separately, then
(say, as util_dl, util_cfs, respectively) and only discard the CFS one
if delta_ns > TICK_NSEC?


* Re: [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq
  2017-03-31  9:03           ` Rafael J. Wysocki
@ 2017-03-31  9:16             ` Juri Lelli
  0 siblings, 0 replies; 25+ messages in thread
From: Juri Lelli @ 2017-03-31  9:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Peter Zijlstra, Ingo Molnar, Viresh Kumar,
	Linux Kernel Mailing List, Linux PM, Thomas Gleixner,
	Vincent Guittot, Steven Rostedt, Luca Abeni, claudio,
	tommaso.cucinotta, bristot, mathieu.poirier, Todd Kjos,
	Joel Fernandes, Andres Oportus, Morten Rasmussen,
	Dietmar Eggemann, Patrick Bellasi, Ingo Molnar

On 31/03/17 11:03, Rafael J. Wysocki wrote:
> On Fri, Mar 31, 2017 at 9:31 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> > On 30/03/17 22:13, Rafael J. Wysocki wrote:
> >> On Thu, Mar 30, 2017 at 10:58 AM, Juri Lelli <juri.lelli@arm.com> wrote:
> >> > Hi,
> >>
> >> Hi,
> >>
> >> > On 30/03/17 00:41, Rafael J. Wysocki wrote:
> >> >> On Friday, March 24, 2017 02:08:59 PM Juri Lelli wrote:
> >> >> > No assumption can be made upon the rate at which frequency updates get
> >> >> > triggered, as there are scheduling policies (like SCHED_DEADLINE) which
> >> >> > don't trigger them so frequently.
> >> >> >
> >> >> > Remove such assumption from the code.
> >> >>
> >> >> But the util/max values for idle CPUs may be stale, no?
> >> >>
> >> >
> >> > Right, that might be a problem. A proper solution I think would be to
> >> > remotely update such values for idle CPUs, and I believe Vincent is
> >> > working on a patch for that.
> >> >
> >> > As mid-term workarounds, slightly modifying the current one, the
> >> > following come to my mind:
> >> >
> >> >  - consider TICK_NSEC (continue) only when SCHED_CPUFREQ_DL is not set
> >> >  - remove the CFS contribution (without triggering a freq update) when
> >> >    a CPU enters IDLE; this might not work well, though, as we probably
> >> >    want to keep the blocked util contribution in for a bit
> >> >
> >> > What do you think is the way to go?
> >>
> >> Well, do we want SCHED_DEADLINE util contribution to be there even for
> >> idle CPUs?
> >>
> >
> > DEADLINE util contribution is removed, even if the CPU is idle, by the
> > reclaiming mechanism when we know (applying GRUB algorithm rules [1])
> > that it can't be used anymore by a task (roughly speaking). So, we
> > shouldn't have this problem in the DEADLINE case.
> >
> > [1] https://marc.info/?l=linux-kernel&m=149029880524038
> 
> OK
> 
> Why don't you store the contributions from DL and CFS separately, then
> (say, as util_dl, util_cfs, respectively) and only discard the CFS one
> if delta_ns > TICK_NSEC?

Sure, this should work as well. I'll try this approach for the next version.

Thanks,

- Juri
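
The split-signal approach agreed on above could be sketched roughly as
follows. This is only an illustrative model, not the actual schedutil
code: the struct, the field names (util_dl, util_cfs) and the helper
name are assumptions taken from Rafael's naming suggestion, and
TICK_NSEC is given a placeholder 4 ms value. The key point it
demonstrates is that only the CFS contribution is discarded when the
per-CPU data is older than one tick, while the DL contribution is kept
regardless of age (GRUB reclaiming removes it independently when it can
no longer be used by a task).

```c
#include <assert.h>

/* Illustrative 4 ms tick; the real value depends on CONFIG_HZ. */
#define TICK_NSEC 4000000ULL

/* Hypothetical per-CPU state with DL and CFS stored separately. */
struct sugov_cpu_sketch {
	unsigned long util_dl;          /* DEADLINE contribution */
	unsigned long util_cfs;         /* CFS contribution */
	unsigned long max;              /* CPU capacity cap */
	unsigned long long last_update; /* ns timestamp of last update */
};

/* Aggregate utilization of one CPU at time 'now' (ns). */
static unsigned long sugov_aggregate_util(const struct sugov_cpu_sketch *sg,
					  unsigned long long now)
{
	unsigned long long delta_ns = now - sg->last_update;
	unsigned long util = sg->util_dl;

	/*
	 * Only the CFS signal can go stale on an idle CPU, so only it is
	 * discarded past one tick; the DL part stays until reclaimed.
	 */
	if (delta_ns <= TICK_NSEC)
		util += sg->util_cfs;

	return util < sg->max ? util : sg->max;
}
```

With this split, the "delta_ns > TICK_NSEC: continue" check in
sugov_next_freq_shared() would no longer skip an idle CPU entirely; it
would only drop the potentially stale CFS part of that CPU's request.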


end of thread, other threads:[~2017-03-31  9:16 UTC | newest]

Thread overview: 25+ messages
2017-03-24 14:08 [RFD PATCH 0/5] SCHED_DEADLINE freq/cpu invariance and OPP selection Juri Lelli
2017-03-24 14:08 ` [RFD PATCH 1/5] sched/cpufreq_schedutil: make use of DEADLINE utilization signal Juri Lelli
2017-03-24 14:08 ` [RFD PATCH 2/5] sched/deadline: move cpu frequency selection triggering points Juri Lelli
2017-03-24 14:08 ` [RFD PATCH 3/5] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE Juri Lelli
2017-03-27 16:50   ` Peter Zijlstra
2017-03-27 17:01     ` Juri Lelli
2017-03-27 17:05       ` Rafael J. Wysocki
2017-03-27 17:13         ` Juri Lelli
2017-03-27 17:37           ` Rafael J. Wysocki
2017-03-27 18:09             ` Mark Brown
2017-03-30 15:50             ` Vikram Mulukutla
2017-03-30 20:22               ` Rafael J. Wysocki
2017-03-31  7:26                 ` Juri Lelli
2017-03-27 18:05     ` Mark Brown
2017-03-28  9:29     ` Vincent Guittot
2017-03-28 10:20       ` Peter Zijlstra
2017-03-24 14:08 ` [RFD PATCH 4/5] sched/cpufreq_schedutil: always consider all CPUs when deciding next freq Juri Lelli
2017-03-29 22:41   ` Rafael J. Wysocki
2017-03-30  8:58     ` Juri Lelli
2017-03-30 13:21       ` Vincent Guittot
2017-03-30 20:13       ` Rafael J. Wysocki
2017-03-31  7:31         ` Juri Lelli
2017-03-31  9:03           ` Rafael J. Wysocki
2017-03-31  9:16             ` Juri Lelli
2017-03-24 14:09 ` [RFD PATCH 5/5] sched/deadline: make bandwidth enforcement scale-invariant Juri Lelli
