All of lore.kernel.org
 help / color / mirror / Atom feed
From: Morten Rasmussen <morten.rasmussen@arm.com>
To: peterz@infradead.org, mingo@redhat.com
Cc: vincent.guittot@linaro.org,
	Dietmar Eggemann <Dietmar.Eggemann@arm.com>,
	yuyang.du@intel.com, preeti@linux.vnet.ibm.com,
	mturquette@linaro.org, rjw@rjwysocki.net,
	Juri Lelli <Juri.Lelli@arm.com>,
	sgurrappadi@nvidia.com, pang.xunlei@zte.com.cn,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	morten.rasmussen@arm.com
Subject: [RFCv4 PATCH 22/34] sched: Calculate energy consumption of sched_group
Date: Tue, 12 May 2015 20:38:57 +0100	[thread overview]
Message-ID: <1431459549-18343-23-git-send-email-morten.rasmussen@arm.com> (raw)
In-Reply-To: <1431459549-18343-1-git-send-email-morten.rasmussen@arm.com>

For energy-aware load-balancing decisions it is necessary to know the
energy consumption estimates of groups of cpus. This patch introduces a
basic function, sched_group_energy(), which estimates the energy
consumption of the cpus in the group and any resources shared by the
members of the group.

NOTE: The function has five levels of identation and breaks the 80
character limit. Refactoring is necessary.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/fair.c | 146 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 146 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 70f2700..2677ca6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4831,6 +4831,152 @@ static inline bool energy_aware(void)
 	return sched_feat(ENERGY_AWARE);
 }
 
+/*
+ * cpu_norm_usage() returns the cpu usage relative to a specific capacity,
+ * i.e. it's busy ratio, in the range [0..SCHED_LOAD_SCALE] which is useful for
+ * energy calculations. Using the scale-invariant usage returned by
+ * get_cpu_usage() and approximating scale-invariant usage by:
+ *
+ *   usage ~ (curr_freq/max_freq)*1024 * capacity_orig/1024 * running_time/time
+ *
+ * the normalized usage can be found using the specific capacity.
+ *
+ *   capacity = capacity_orig * curr_freq/max_freq
+ *
+ *   norm_usage = running_time/time ~ usage/capacity
+ */
+static unsigned long cpu_norm_usage(int cpu, unsigned long capacity)
+{
+	int usage = __get_cpu_usage(cpu);
+
+	if (usage >= capacity)
+		return SCHED_CAPACITY_SCALE;
+
+	return (usage << SCHED_CAPACITY_SHIFT)/capacity;
+}
+
+static unsigned long group_max_usage(struct sched_group *sg)
+{
+	int i;
+	unsigned long max_usage = 0;
+
+	for_each_cpu(i, sched_group_cpus(sg))
+		max_usage = max(max_usage, get_cpu_usage(i));
+
+	return max_usage;
+}
+
+/*
+ * group_norm_usage() returns the approximated group usage relative to it's
+ * current capacity (busy ratio) in the range [0..SCHED_LOAD_SCALE] for use in
+ * energy calculations. Since task executions may or may not overlap in time in
+ * the group the true normalized usage is between max(cpu_norm_usage(i)) and
+ * sum(cpu_norm_usage(i)) when iterating over all cpus in the group, i. The
+ * latter is used as the estimate as it leads to a more pessimistic energy
+ * estimate (more busy).
+ */
+static unsigned long group_norm_usage(struct sched_group *sg, int cap_idx)
+{
+	int i;
+	unsigned long usage_sum = 0;
+	unsigned long capacity = sg->sge->cap_states[cap_idx].cap;
+
+	for_each_cpu(i, sched_group_cpus(sg))
+		usage_sum += cpu_norm_usage(i, capacity);
+
+	if (usage_sum > SCHED_CAPACITY_SCALE)
+		return SCHED_CAPACITY_SCALE;
+	return usage_sum;
+}
+
+static int find_new_capacity(struct sched_group *sg,
+		struct sched_group_energy *sge)
+{
+	int idx;
+	unsigned long util = group_max_usage(sg);
+
+	for (idx = 0; idx < sge->nr_cap_states; idx++) {
+		if (sge->cap_states[idx].cap >= util)
+			return idx;
+	}
+
+	return idx;
+}
+
+/*
+ * sched_group_energy(): Returns absolute energy consumption of cpus belonging
+ * to the sched_group including shared resources shared only by members of the
+ * group. Iterates over all cpus in the hierarchy below the sched_group starting
+ * from the bottom working it's way up before going to the next cpu until all
+ * cpus are covered at all levels. The current implementation is likely to
+ * gather the same usage statistics multiple times. This can probably be done in
+ * a faster but more complex way.
+ */
+static unsigned int sched_group_energy(struct sched_group *sg_top)
+{
+	struct sched_domain *sd;
+	int cpu, total_energy = 0;
+	struct cpumask visit_cpus;
+	struct sched_group *sg;
+
+	WARN_ON(!sg_top->sge);
+
+	cpumask_copy(&visit_cpus, sched_group_cpus(sg_top));
+
+	while (!cpumask_empty(&visit_cpus)) {
+		struct sched_group *sg_shared_cap = NULL;
+
+		cpu = cpumask_first(&visit_cpus);
+
+		/*
+		 * Is the group utilization affected by cpus outside this
+		 * sched_group?
+		 */
+		sd = highest_flag_domain(cpu, SD_SHARE_CAP_STATES);
+		if (sd && sd->parent)
+			sg_shared_cap = sd->parent->groups;
+
+		for_each_domain(cpu, sd) {
+			sg = sd->groups;
+
+			/* Has this sched_domain already been visited? */
+			if (sd->child && group_first_cpu(sg) != cpu)
+				break;
+
+			do {
+				struct sched_group *sg_cap_util;
+				unsigned long group_util;
+				int sg_busy_energy, sg_idle_energy, cap_idx;
+
+				if (sg_shared_cap && sg_shared_cap->group_weight >= sg->group_weight)
+					sg_cap_util = sg_shared_cap;
+				else
+					sg_cap_util = sg;
+
+				cap_idx = find_new_capacity(sg_cap_util, sg->sge);
+				group_util = group_norm_usage(sg, cap_idx);
+				sg_busy_energy = (group_util * sg->sge->cap_states[cap_idx].power)
+										>> SCHED_CAPACITY_SHIFT;
+				sg_idle_energy = ((SCHED_LOAD_SCALE-group_util) * sg->sge->idle_states[0].power)
+										>> SCHED_CAPACITY_SHIFT;
+
+				total_energy += sg_busy_energy + sg_idle_energy;
+
+				if (!sd->child)
+					cpumask_xor(&visit_cpus, &visit_cpus, sched_group_cpus(sg));
+
+				if (cpumask_equal(sched_group_cpus(sg), sched_group_cpus(sg_top)))
+					goto next_cpu;
+
+			} while (sg = sg->next, sg != sd->groups);
+		}
+next_cpu:
+		continue;
+	}
+
+	return total_energy;
+}
+
 static int wake_wide(struct task_struct *p)
 {
 	int factor = this_cpu_read(sd_llc_size);
-- 
1.9.1


  parent reply	other threads:[~2015-05-12 19:38 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-12 19:38 [RFCv4 PATCH 00/34] sched: Energy cost model for energy-aware scheduling Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 01/34] arm: Frequency invariant scheduler load-tracking support Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 02/34] sched: Make load tracking frequency scale-invariant Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 03/34] arm: vexpress: Add CPU clock-frequencies to TC2 device-tree Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 04/34] sched: Convert arch_scale_cpu_capacity() from weak function to #define Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 05/34] arm: Update arch_scale_cpu_capacity() to reflect change to define Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 06/34] sched: Make usage tracking cpu scale-invariant Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 07/34] arm: Cpu invariant scheduler load-tracking support Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 08/34] sched: Get rid of scaling usage by cpu_capacity_orig Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 09/34] sched: Track blocked utilization contributions Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 10/34] sched: Include blocked utilization in usage tracking Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 11/34] sched: Remove blocked load and utilization contributions of dying tasks Morten Rasmussen
2015-05-13  0:33   ` Sai Gurrappadi
2015-05-13  0:33     ` Sai Gurrappadi
2015-05-13 13:49     ` Morten Rasmussen
2015-05-19 14:22       ` Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 12/34] sched: Initialize CFS task load and usage before placing task on rq Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 13/34] sched: Documentation for scheduler energy cost model Morten Rasmussen
2015-05-20  4:04   ` Kamalesh Babulal
2015-05-20  9:27     ` Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 14/34] sched: Make energy awareness a sched feature Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 15/34] sched: Introduce energy data structures Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 16/34] sched: Allocate and initialize " Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 17/34] sched: Introduce SD_SHARE_CAP_STATES sched_domain flag Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 18/34] arm: topology: Define TC2 energy and provide it to the scheduler Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 19/34] sched: Compute cpu capacity available at current frequency Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 20/34] sched: Relocated get_cpu_usage() and change return type Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 21/34] sched: Highest energy aware balancing sched_domain level pointer Morten Rasmussen
2015-05-12 19:38 ` Morten Rasmussen [this message]
2015-05-21  7:57   ` [RFCv4 PATCH 22/34] sched: Calculate energy consumption of sched_group Kamalesh Babulal
2015-05-22 15:38     ` Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 23/34] sched: Extend sched_group_energy to test load-balancing decisions Morten Rasmussen
2015-05-12 19:38 ` [RFCv4 PATCH 24/34] sched: Estimate energy impact of scheduling decisions Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 25/34] sched: Add over-utilization/tipping point indicator Morten Rasmussen
2015-05-22 19:48   ` [PATCH] sched: Fix compiler errors for NO_SMP machines Abel Vesa
2015-05-23 14:52     ` Ingo Molnar
2015-05-23 19:22       ` Abel Vesa
2015-06-30  9:35   ` [RFCv4 PATCH 25/34] sched: Add over-utilization/tipping point indicator pang.xunlei
2015-06-30  9:35     ` pang.xunlei
2015-05-12 19:39 ` [RFCv4 PATCH 26/34] sched: Store system-wide maximum cpu capacity in root domain Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 27/34] sched, cpuidle: Track cpuidle state index in the scheduler Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 28/34] sched: Count number of shallower idle-states in struct sched_group_energy Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 29/34] sched: Determine the current sched_group idle-state Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 30/34] sched: Add cpu capacity awareness to wakeup balancing Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 31/34] sched: Energy-aware wake-up task placement Morten Rasmussen
2015-05-14 14:03   ` Dietmar Eggemann
     [not found]   ` <OF168B7415.9556008C-ON48257E45.003388D7-48257E45.00349D8D@zte.com.cn>
2015-05-14 15:10     ` Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 32/34] sched: Consider a not over-utilized energy-aware system as balanced Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 33/34] sched: Enable idle balance to pull single task towards cpu with higher capacity Morten Rasmussen
2015-05-12 19:39 ` [RFCv4 PATCH 34/34] sched: Disable energy-unfriendly nohz kicks Morten Rasmussen
2015-05-12 22:07 ` [RFCv4 PATCH 00/34] sched: Energy cost model for energy-aware scheduling Sai Gurrappadi
2015-05-12 22:07   ` Sai Gurrappadi
2015-05-13 13:47   ` Morten Rasmussen
2015-06-28 20:26     ` Abel Vesa
2015-06-29  9:06       ` pang.xunlei
2015-06-29 10:19         ` Dietmar Eggemann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1431459549-18343-23-git-send-email-morten.rasmussen@arm.com \
    --to=morten.rasmussen@arm.com \
    --cc=Dietmar.Eggemann@arm.com \
    --cc=Juri.Lelli@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mturquette@linaro.org \
    --cc=pang.xunlei@zte.com.cn \
    --cc=peterz@infradead.org \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=rjw@rjwysocki.net \
    --cc=sgurrappadi@nvidia.com \
    --cc=vincent.guittot@linaro.org \
    --cc=yuyang.du@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.