LKML Archive on lore.kernel.org
From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Paul Turner <pjt@google.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Juri Lelli <juri.lelli@redhat.com>, Todd Kjos <tkjos@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Steve Muckle <smuckle@google.com>,
	Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH v3 13/14] sched/core: uclamp: update CPU's refcount on TG's clamp changes
Date: Mon,  6 Aug 2018 17:39:45 +0100
Message-ID: <20180806163946.28380-14-patrick.bellasi@arm.com>
In-Reply-To: <20180806163946.28380-1-patrick.bellasi@arm.com>

When a task group refcounts a new clamp group, we need to ensure that
the new clamp values are immediately enforced on all its tasks which are
currently RUNNABLE, so that they are boosted and/or clamped as requested
as soon as possible.

Let's ensure that, whenever a new clamp group is refcounted by a task
group, all its RUNNABLE tasks are correctly accounted on their
respective CPUs. We do that by slightly refactoring uclamp_group_get()
to take an additional struct cgroup_subsys_state pointer which, when
provided, is used to walk the list of tasks in the corresponding TG
and update the RUNNABLE ones.

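This is a "brute force" solution which allows us to reuse the same
refcount update code already used by the per-task API. That's also the
only way to ensure prompt enforcement of new clamp constraints on
RUNNABLE tasks as soon as a task group attribute is tweaked.

The walk can be skipped by enabling the new UCLAMP_LAZY_UPDATE sched
feature (disabled by default), in which case tasks are accounted into
the right clamp group the next time they are requeued.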

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org

---
Changes in v3:
 - rebased on tip/sched/core
 - fixed some typos

Changes in v2:
 - rebased on v4.18-rc4
 - this code has been split from a previous patch to simplify the review
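
Illustration (not part of the patch): a minimal userspace sketch of the
eager vs. lazy accounting described in the changelog, assuming that
uclamp_task_update_active() only moves the CPU refcount of tasks that
are currently RUNNABLE. Every model_* name, the MODEL_* constants and
cpu_refcount[][] are hypothetical stand-ins made up for this sketch;
they are not kernel symbols, and locking is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_GROUPS 4	/* hypothetical number of clamp groups */
#define MODEL_CPUS   2	/* hypothetical number of CPUs */

/* Hypothetical stand-ins: none of these types exist in the kernel. */
struct model_task {
	bool runnable;	/* true while enqueued on a CPU */
	int cpu;	/* CPU the task is enqueued on */
	int group_id;	/* clamp group the task is accounted in */
};

struct model_tg {
	int group_id;			/* clamp group requested for the TG */
	struct model_task *tasks;	/* tasks attached to the TG */
	size_t nr_tasks;
};

/* Per-CPU count of RUNNABLE tasks for each clamp group. */
static int cpu_refcount[MODEL_CPUS][MODEL_GROUPS];

/* Models sched_feat(UCLAMP_LAZY_UPDATE); false matches the patch default. */
static bool model_lazy_update;

/* Move a RUNNABLE task's CPU refcount to group_id; skip sleeping tasks. */
static void model_task_update_active(struct model_task *p, int group_id)
{
	assert(group_id >= 0 && group_id < MODEL_GROUPS);
	if (!p->runnable)
		return;
	cpu_refcount[p->cpu][p->group_id]--;
	p->group_id = group_id;
	cpu_refcount[p->cpu][p->group_id]++;
}

/* Enqueue a task: it picks up the group its TG currently requests. */
static void model_enqueue(struct model_tg *tg, struct model_task *p, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_CPUS);
	p->cpu = cpu;
	p->group_id = tg->group_id;
	p->runnable = true;
	cpu_refcount[cpu][p->group_id]++;
}

/* Dequeue a task: drop the refcount it holds on its CPU. */
static void model_dequeue(struct model_task *p)
{
	if (!p->runnable)
		return;
	cpu_refcount[p->cpu][p->group_id]--;
	p->runnable = false;
}

/* Change the TG's clamp group; walk its tasks unless in lazy mode. */
static void model_tg_set_group(struct model_tg *tg, int group_id)
{
	size_t i;

	assert(group_id >= 0 && group_id < MODEL_GROUPS);
	tg->group_id = group_id;
	if (model_lazy_update)
		return;
	for (i = 0; i < tg->nr_tasks; i++)
		model_task_update_active(&tg->tasks[i], group_id);
}
```

With model_lazy_update left false, model_tg_set_group() moves the
refcount of every enqueued task right away. With it set to true, the
per-CPU counters keep the old group until the task goes through
model_dequeue() and model_enqueue() again.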
---
 kernel/sched/core.c     | 44 ++++++++++++++++++++++++++++++++++-------
 kernel/sched/features.h |  5 +++++
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 48458fea2d5e..6db307803047 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1255,9 +1255,30 @@ static inline void uclamp_group_put(int clamp_id, int group_id)
 	raw_spin_unlock_irqrestore(&uc_map[group_id].se_lock, flags);
 }
 
+static inline void uclamp_group_get_tg(struct cgroup_subsys_state *css,
+				       int clamp_id, unsigned int group_id)
+{
+	struct css_task_iter it;
+	struct task_struct *p;
+
+	/*
+	 * In lazy update mode, tasks will be accounted into the right clamp
+	 * group the next time they are requeued.
+	 */
+	if (unlikely(sched_feat(UCLAMP_LAZY_UPDATE)))
+		return;
+
+	/* Update clamp groups for RUNNABLE tasks in this TG */
+	css_task_iter_start(css, 0, &it);
+	while ((p = css_task_iter_next(&it)))
+		uclamp_task_update_active(p, clamp_id, group_id);
+	css_task_iter_end(&it);
+}
+
 /**
  * uclamp_group_get: increase the reference count for a clamp group
  * @p: the task which clamp value must be tracked
+ * @css: the task group whose clamp value must be tracked
  * @clamp_id: the clamp index affected by the task
  * @next_group_id: the clamp group to refcount
  * @uc_se: the utilization clamp data for the task
@@ -1269,6 +1290,7 @@ static inline void uclamp_group_put(int clamp_id, int group_id)
  * the task to reference count the clamp value on CPUs while enqueued.
  */
 static inline void uclamp_group_get(struct task_struct *p,
+				    struct cgroup_subsys_state *css,
 				    int clamp_id, int next_group_id,
 				    struct uclamp_se *uc_se,
 				    unsigned int clamp_value)
@@ -1288,6 +1310,10 @@ static inline void uclamp_group_get(struct task_struct *p,
 	uc_map[next_group_id].se_count += 1;
 	raw_spin_unlock_irqrestore(&uc_map[next_group_id].se_lock, flags);
 
+	/* Newly created TGs don't have tasks assigned */
+	if (css)
+		uclamp_group_get_tg(css, clamp_id, next_group_id);
+
 	/* Update CPU's clamp group refcounts of RUNNABLE task */
 	if (p)
 		uclamp_task_update_active(p, clamp_id, next_group_id);
@@ -1344,12 +1370,12 @@ int sched_uclamp_handler(struct ctl_table *table, int write,
 	/* Update each required clamp group */
 	if (old_min != sysctl_sched_uclamp_util_min) {
 		uc_se = &uclamp_default[UCLAMP_MIN];
-		uclamp_group_get(NULL, UCLAMP_MIN, group_id[UCLAMP_MIN],
+		uclamp_group_get(NULL, NULL, UCLAMP_MIN, group_id[UCLAMP_MIN],
 				 uc_se, sysctl_sched_uclamp_util_min);
 	}
 	if (old_max != sysctl_sched_uclamp_util_max) {
 		uc_se = &uclamp_default[UCLAMP_MAX];
-		uclamp_group_get(NULL, UCLAMP_MAX, group_id[UCLAMP_MAX],
+		uclamp_group_get(NULL, NULL, UCLAMP_MAX, group_id[UCLAMP_MAX],
 				 uc_se, sysctl_sched_uclamp_util_max);
 	}
 
@@ -1441,7 +1467,7 @@ static inline int alloc_uclamp_sched_group(struct task_group *tg,
 			return 0;
 		}
 #endif
-		uclamp_group_get(NULL, clamp_id, group_id, uc_se,
+		uclamp_group_get(NULL, NULL, clamp_id, group_id, uc_se,
 				 parent->uclamp[clamp_id].value);
 	}
 
@@ -1532,12 +1558,12 @@ static inline int __setscheduler_uclamp(struct task_struct *p,
 	/* Update each required clamp group */
 	if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
 		uc_se = &p->uclamp[UCLAMP_MIN];
-		uclamp_group_get(p, UCLAMP_MIN, group_id[UCLAMP_MIN],
+		uclamp_group_get(p, NULL, UCLAMP_MIN, group_id[UCLAMP_MIN],
 				 uc_se, attr->sched_util_min);
 	}
 	if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
 		uc_se = &p->uclamp[UCLAMP_MAX];
-		uclamp_group_get(p, UCLAMP_MAX, group_id[UCLAMP_MAX],
+		uclamp_group_get(p, NULL, UCLAMP_MAX, group_id[UCLAMP_MAX],
 				 uc_se, attr->sched_util_max);
 	}
 
@@ -7468,6 +7494,10 @@ static void cpu_util_update_hier(struct cgroup_subsys_state *css,
 
 		uc_se->effective.value = value;
 		uc_se->effective.group_id = group_id;
+
+		/* Immediately update descendants' active tasks */
+		if (css != top_css)
+			uclamp_group_get_tg(css, clamp_id, group_id);
 	}
 }
 
@@ -7508,7 +7538,7 @@ static int cpu_util_min_write_u64(struct cgroup_subsys_state *css,
 
 	/* Update TG's reference count */
 	uc_se = &tg->uclamp[UCLAMP_MIN];
-	uclamp_group_get(NULL, UCLAMP_MIN, group_id, uc_se, min_value);
+	uclamp_group_get(NULL, css, UCLAMP_MIN, group_id, uc_se, min_value);
 
 out:
 	rcu_read_unlock();
@@ -7554,7 +7584,7 @@ static int cpu_util_max_write_u64(struct cgroup_subsys_state *css,
 
 	/* Update TG's reference count */
 	uc_se = &tg->uclamp[UCLAMP_MAX];
-	uclamp_group_get(NULL, UCLAMP_MAX, group_id, uc_se, max_value);
+	uclamp_group_get(NULL, css, UCLAMP_MAX, group_id, uc_se, max_value);
 
 out:
 	rcu_read_unlock();
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index a3ca449e36c1..ced86cfd8fcd 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -91,6 +91,11 @@ SCHED_FEAT(WA_BIAS, true)
  */
 SCHED_FEAT(UTIL_EST, true)
 
+/*
+ * Utilization clamping lazy update.
+ */
+SCHED_FEAT(UCLAMP_LAZY_UPDATE, false)
+
 /*
  * Per class CPU's utilization clamping.
  */
-- 
2.18.0


