From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>
Subject: [RFC v3 3/5] sched/core: sync capacity_{min,max} between slow and fast paths
Date: Tue, 28 Feb 2017 14:38:40 +0000	[thread overview]
Message-ID: <1488292722-19410-4-git-send-email-patrick.bellasi@arm.com> (raw)
In-Reply-To: <1488292722-19410-1-git-send-email-patrick.bellasi@arm.com>

At enqueue/dequeue time a task needs to be placed in the CPU's rb_tree
according to the current capacity_{min,max} values of the cgroup it
belongs to. Thus, we need to guarantee that these values cannot change
while the task is inside these critical sections.

For this purpose, this patch uses the same locking schema already used
by __set_cpus_allowed_ptr(). We might uselessly lock the (previous) RQ
of a !RUNNABLE task, but that's the price to pay to safely serialize
capacity_{min,max} updates with enqueues, dequeues and migrations.
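For illustration only, the locking schema referred to above can be sketched in user space (all types and names here are hypothetical stand-ins, not the kernel's): take the task's pi_lock, then the lock of the RQ the task is currently on, and retry if the task migrated in between; holding both serializes clamp updates against enqueues, dequeues and migrations.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space sketch of the task_rq_lock() pattern. */
struct rq { pthread_mutex_t lock; };
struct task { pthread_mutex_t pi_lock; struct rq *rq; };

static struct rq *task_rq_lock_sketch(struct task *p)
{
	for (;;) {
		pthread_mutex_lock(&p->pi_lock);
		struct rq *rq = p->rq;
		pthread_mutex_lock(&rq->lock);
		if (rq == p->rq)	/* task still on the same rq? */
			return rq;	/* both locks held on return */
		/* The task migrated under us: drop both locks and retry. */
		pthread_mutex_unlock(&rq->lock);
		pthread_mutex_unlock(&p->pi_lock);
	}
}
```

With both locks held, no concurrent path can enqueue, dequeue or migrate the task, which is exactly the guarantee the clamp update relies on.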

This patch adds the synchronization calls required to guarantee that
each RUNNABLE task is always in the correct relative position within
the rb_tree. Specifically, when a group's capacity_{min,max} value is
updated, each task in that group is re-positioned within the rb_tree,
if it is currently RUNNABLE and its relative position has changed.
This operation is mutually exclusive with the task being {en,de}queued
or migrated, via task_rq_lock().
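The "relative position has changed" test boils down to comparing the group's new clamp value against the values of the node's in-order predecessor and successor. A minimal stand-alone sketch of that check (function name hypothetical, outside the kernel):

```c
#include <assert.h>
#include <stdbool.h>

/* A node keeps its position in the capacity-ordered rb_tree as long as
 * its new clamp value still falls between the clamp values of its
 * in-order predecessor (prev_cap) and successor (next_cap); otherwise
 * it must be removed and re-inserted. */
static bool needs_reposition(unsigned int prev_cap, unsigned int next_cap,
			     unsigned int new_cap)
{
	return !(prev_cap <= new_cap && next_cap >= new_cap);
}
```

Note the defaults matter: a node with no predecessor behaves as if prev_cap were 0, and one with no successor as if next_cap were SCHED_CAPACITY_SCALE, so boundary nodes never reposition spuriously.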

It's worth noticing that moving a task from one cgroup to another,
perhaps with different capacity_{min,max} values, is already covered by
the current locking schema. Indeed, this operation requires a dequeue
from the original cgroup's RQ followed by an enqueue in the new one.
The same argument holds for task migrations; thus, task migrations
between CPUs and cgroups are ultimately managed like task
wakeups/sleeps.
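To make the dequeue+enqueue argument concrete, here is a toy user-space model (hypothetical names; a sorted array stands in for the per-CPU rb_tree): a cgroup move is nothing more than a removal under the old clamp value followed by an ordered insertion under the new one, so the ordering invariant is preserved by construction.

```c
#include <assert.h>

#define MAX_TASKS 8

static unsigned int caps[MAX_TASKS];
static int nr_caps;

/* Insert a task's clamp value keeping the array sorted, standing in
 * for an ordered rb_tree insertion. */
static void enqueue_cap(unsigned int cap)
{
	int i = nr_caps++;
	while (i > 0 && caps[i - 1] > cap) {
		caps[i] = caps[i - 1];
		i--;
	}
	caps[i] = cap;
}

/* Remove one occurrence of the value, standing in for an rb_tree
 * removal. */
static void dequeue_cap(unsigned int cap)
{
	int i;
	for (i = 0; i < nr_caps; i++)
		if (caps[i] == cap)
			break;
	for (; i < nr_caps - 1; i++)
		caps[i] = caps[i + 1];
	nr_caps--;
}

/* A cgroup move is just the two operations back to back; no extra
 * synchronization is needed beyond what already covers them. */
static void move_cap(unsigned int old_cap, unsigned int new_cap)
{
	dequeue_cap(old_cap);
	enqueue_cap(new_cap);
}
```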

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org
---
 kernel/sched/core.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8f509be..d620bc4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -846,9 +846,68 @@ cap_clamp_remove_capacity(struct rq *rq, struct task_struct *p,
 	RB_CLEAR_NODE(node);
 }
 
+static void
+cap_clamp_update_capacity(struct task_struct *p, unsigned int cap_idx)
+{
+	struct task_group *tg = task_group(p);
+	unsigned int next_cap = SCHED_CAPACITY_SCALE;
+	unsigned int prev_cap = 0;
+	struct task_struct *entry;
+	struct rb_node *node;
+	struct rq_flags rf;
+	struct rq *rq;
+
+	/*
+	 * Lock the CPU's RBTree where the task is (possibly) queued.
+	 *
+	 * We might uselessly lock the (previous) RQ of a !RUNNABLE task, but
+	 * that's the price to pay to safely serialize capacity_{min,max}
+	 * updates with enqueues, dequeues and migration operations, which is
+	 * the same locking schema already in use by __set_cpus_allowed_ptr().
+	 */
+	rq = task_rq_lock(p, &rf);
+
+	/*
+	 * If the task does not have a node in the rbtree, it is not yet
+	 * RUNNABLE or it is going to be enqueued with the proper value.
+	 * The setting of the cap_clamp_node is serialized by task_rq_lock().
+	 */
+	if (RB_EMPTY_NODE(&p->cap_clamp_node[cap_idx]))
+		goto done;
+
+	/* Check current position in the capacity rbtree */
+	node = rb_next(&p->cap_clamp_node[cap_idx]);
+	if (node) {
+		entry = rb_entry(node, struct task_struct,
+				 cap_clamp_node[cap_idx]);
+		next_cap = task_group(entry)->cap_clamp[cap_idx];
+	}
+	node = rb_prev(&p->cap_clamp_node[cap_idx]);
+	if (node) {
+		entry = rb_entry(node, struct task_struct,
+				 cap_clamp_node[cap_idx]);
+		prev_cap = task_group(entry)->cap_clamp[cap_idx];
+	}
+
+	/* If relative position has not changed: nothing to do */
+	if (prev_cap <= tg->cap_clamp[cap_idx] &&
+	    next_cap >= tg->cap_clamp[cap_idx])
+		goto done;
+
+	/* Reposition this node within the rbtree */
+	cap_clamp_remove_capacity(rq, p, cap_idx);
+	cap_clamp_insert_capacity(rq, p, cap_idx);
+
+done:
+	task_rq_unlock(rq, p, &rf);
+}
+
 static inline void
 cap_clamp_enqueue_task(struct rq *rq, struct task_struct *p, int flags)
 {
+	lockdep_assert_held(&p->pi_lock);
+	lockdep_assert_held(&rq->lock);
+
 	/* Track task's min/max capacities */
 	cap_clamp_insert_capacity(rq, p, CAP_CLAMP_MIN);
 	cap_clamp_insert_capacity(rq, p, CAP_CLAMP_MAX);
@@ -857,6 +916,9 @@ cap_clamp_enqueue_task(struct rq *rq, struct task_struct *p, int flags)
 static inline void
 cap_clamp_dequeue_task(struct rq *rq, struct task_struct *p, int flags)
 {
+	lockdep_assert_held(&p->pi_lock);
+	lockdep_assert_held(&rq->lock);
+
 	/* Track task's min/max capacities */
 	cap_clamp_remove_capacity(rq, p, CAP_CLAMP_MIN);
 	cap_clamp_remove_capacity(rq, p, CAP_CLAMP_MAX);
@@ -7046,8 +7108,10 @@ static int cpu_capacity_min_write_u64(struct cgroup_subsys_state *css,
 				      struct cftype *cftype, u64 value)
 {
 	struct cgroup_subsys_state *pos;
+	struct css_task_iter it;
 	unsigned int min_value;
 	struct task_group *tg;
+	struct task_struct *p;
 	int ret = -EINVAL;
 
 	min_value = min_t(unsigned int, value, SCHED_CAPACITY_SCALE);
@@ -7078,6 +7142,12 @@ static int cpu_capacity_min_write_u64(struct cgroup_subsys_state *css,
 
 	tg->cap_clamp[CAP_CLAMP_MIN] = min_value;
 
+	/* Update the capacity_min of RUNNABLE tasks */
+	css_task_iter_start(css, &it);
+	while ((p = css_task_iter_next(&it)))
+		cap_clamp_update_capacity(p, CAP_CLAMP_MIN);
+	css_task_iter_end(&it);
+
 done:
 	ret = 0;
 out:
@@ -7091,8 +7161,10 @@ static int cpu_capacity_max_write_u64(struct cgroup_subsys_state *css,
 				      struct cftype *cftype, u64 value)
 {
 	struct cgroup_subsys_state *pos;
+	struct css_task_iter it;
 	unsigned int max_value;
 	struct task_group *tg;
+	struct task_struct *p;
 	int ret = -EINVAL;
 
 	max_value = min_t(unsigned int, value, SCHED_CAPACITY_SCALE);
@@ -7123,6 +7195,12 @@ static int cpu_capacity_max_write_u64(struct cgroup_subsys_state *css,
 
 	tg->cap_clamp[CAP_CLAMP_MAX] = max_value;
 
+	/* Update the capacity_max of RUNNABLE tasks */
+	css_task_iter_start(css, &it);
+	while ((p = css_task_iter_next(&it)))
+		cap_clamp_update_capacity(p, CAP_CLAMP_MAX);
+	css_task_iter_end(&it);
+
 done:
 	ret = 0;
 out:
-- 
2.7.4
