All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets
@ 2015-03-03 23:00 riel
  2015-03-03 23:00 ` [PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler riel
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: riel @ 2015-03-03 23:00 UTC (permalink / raw)
  To: tj; +Cc: linux-kernel

Ensure that cpus specified with the isolcpus= boot commandline
option stay outside of the load balancing in the kernel scheduler.

Operations like load balancing can introduce unwanted latencies,
which is exactly what the isolcpus= commandline is there to prevent.

Previously, simply creating a new cpuset, without even touching the
cpuset.cpus field inside the new cpuset, would undo the effects of
isolcpus=, by creating a scheduler domain spanning the whole system,
and setting up load balancing inside that domain. The cpuset root
cpuset.cpus file is read-only, so there was not even a way to undo
that effect.

This does not impact the majority of cpusets users, since isolcpus=
is a fairly specialized feature used for realtime purposes.

This version fixes the UP compilation issue, in the same way done
for the other cpumasks.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler
  2015-03-03 23:00 [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
@ 2015-03-03 23:00 ` riel
  2015-03-03 23:00 ` [PATCH 2/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: riel @ 2015-03-03 23:00 UTC (permalink / raw)
  To: tj; +Cc: linux-kernel

From: Rik van Riel <riel@redhat.com>

Needed by the next patch. Also makes cpu_isolated_map present
when compiled without SMP and/or with CONFIG_NR_CPUS=1, like
the other cpu masks.

At some point we may want to clean things up so cpumasks do
not exist in UP kernels. Maybe something for the CONFIG_TINY
crowd.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: cgroups@vger.kernel.org
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/sched.h | 2 ++
 kernel/sched/core.c   | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6d77432e14ff..ca365d79480c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -329,6 +329,8 @@ extern asmlinkage void schedule_tail(struct task_struct *prev);
 extern void init_idle(struct task_struct *idle, int cpu);
 extern void init_idle_bootup_task(struct task_struct *idle);
 
+extern cpumask_var_t cpu_isolated_map;
+
 extern int runqueue_is_locked(int cpu);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..b578bb23410b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -306,6 +306,9 @@ __read_mostly int scheduler_running;
  */
 int sysctl_sched_rt_runtime = 950000;
 
+/* cpus with isolated domains */
+cpumask_var_t cpu_isolated_map;
+
 /*
  * this_rq_lock - lock this runqueue and disable interrupts.
  */
@@ -5811,9 +5814,6 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
 	update_top_cache_domain(cpu);
 }
 
-/* cpus with isolated domains */
-static cpumask_var_t cpu_isolated_map;
-
 /* Setup the mask of cpus configured for isolated domains */
 static int __init isolated_cpu_setup(char *str)
 {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets
  2015-03-03 23:00 [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
  2015-03-03 23:00 ` [PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler riel
@ 2015-03-03 23:00 ` riel
  2015-03-03 23:00 ` [PATCH 3/4] cpusets,isolcpus: add file to show isolated cpus in cpuset riel
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: riel @ 2015-03-03 23:00 UTC (permalink / raw)
  To: tj; +Cc: linux-kernel

From: Rik van Riel <riel@redhat.com>

Ensure that cpus specified with the isolcpus= boot commandline
option stay outside of the load balancing in the kernel scheduler.

Operations like load balancing can introduce unwanted latencies,
which is exactly what the isolcpus= commandline is there to prevent.

Previously, simply creating a new cpuset, without even touching the
cpuset.cpus field inside the new cpuset, would undo the effects of
isolcpus=, by creating a scheduler domain spanning the whole system,
and setting up load balancing inside that domain. The cpuset root
cpuset.cpus file is read-only, so there was not even a way to undo
that effect.

This does not impact the majority of cpusets users, since isolcpus=
is a fairly specialized feature used for realtime purposes.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: cgroups@vger.kernel.org
Signed-off-by: Rik van Riel <riel@redhat.com>
Tested-by: David Rientjes <rientjes@google.com>
---
 kernel/cpuset.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 1d1fe9361d29..b544e5229d99 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -625,6 +625,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	int csn;		/* how many cpuset ptrs in csa so far */
 	int i, j, k;		/* indices for partition finding loops */
 	cpumask_var_t *doms;	/* resulting partition; i.e. sched domains */
+	cpumask_var_t non_isolated_cpus;  /* load balanced CPUs */
 	struct sched_domain_attr *dattr;  /* attributes for custom domains */
 	int ndoms = 0;		/* number of sched domains in result */
 	int nslot;		/* next empty doms[] struct cpumask slot */
@@ -634,6 +635,10 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	dattr = NULL;
 	csa = NULL;
 
+	if (!alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL))
+		goto done;
+	cpumask_andnot(non_isolated_cpus, cpu_possible_mask, cpu_isolated_map);
+
 	/* Special case for the 99% of systems with one, full, sched domain */
 	if (is_sched_load_balance(&top_cpuset)) {
 		ndoms = 1;
@@ -646,7 +651,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
 			*dattr = SD_ATTR_INIT;
 			update_domain_attr_tree(dattr, &top_cpuset);
 		}
-		cpumask_copy(doms[0], top_cpuset.effective_cpus);
+		cpumask_and(doms[0], top_cpuset.effective_cpus,
+				     non_isolated_cpus);
 
 		goto done;
 	}
@@ -669,7 +675,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
 		 * the corresponding sched domain.
 		 */
 		if (!cpumask_empty(cp->cpus_allowed) &&
-		    !is_sched_load_balance(cp))
+		    !(is_sched_load_balance(cp) &&
+		      cpumask_intersects(cp->cpus_allowed, non_isolated_cpus)))
 			continue;
 
 		if (is_sched_load_balance(cp))
@@ -751,6 +758,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
 
 			if (apn == b->pn) {
 				cpumask_or(dp, dp, b->effective_cpus);
+				cpumask_and(dp, dp, non_isolated_cpus);
 				if (dattr)
 					update_domain_attr_tree(dattr + nslot, b);
 
@@ -763,6 +771,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	BUG_ON(nslot != ndoms);
 
 done:
+	free_cpumask_var(non_isolated_cpus);
 	kfree(csa);
 
 	/*
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/4] cpusets,isolcpus: add file to show isolated cpus in cpuset
  2015-03-03 23:00 [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
  2015-03-03 23:00 ` [PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler riel
  2015-03-03 23:00 ` [PATCH 2/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
@ 2015-03-03 23:00 ` riel
  2015-03-03 23:00 ` [PATCH 4/4] cpuset,isolcpus: document relationship between cpusets & isolcpus riel
  2015-03-09  3:37 ` [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets Tejun Heo
  4 siblings, 0 replies; 7+ messages in thread
From: riel @ 2015-03-03 23:00 UTC (permalink / raw)
  To: tj; +Cc: linux-kernel

From: Rik van Riel <riel@redhat.com>

The previous patch makes it so the code skips over isolcpus when
building scheduler load balancing domains. This makes it hard to
see for a user which of the CPUs in a cpuset are participating in
load balancing, and which ones are isolated cpus.

Add a cpuset.isolcpus file with info on which cpus in a cpuset are
isolated CPUs.

This file is read-only for now. In the future we could extend things
so isolcpus can be changed at run time, for the root (system wide)
cpuset only.

Acked-by: David Rientjes <rientjes@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: cgroups@vger.kernel.org
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/cpuset.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index b544e5229d99..5462e1ca90bd 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1563,6 +1563,7 @@ typedef enum {
 	FILE_MEMORY_PRESSURE,
 	FILE_SPREAD_PAGE,
 	FILE_SPREAD_SLAB,
+	FILE_ISOLCPUS,
 } cpuset_filetype_t;
 
 static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
@@ -1704,6 +1705,16 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
 	return retval ?: nbytes;
 }
 
+/* protected by the lock in cpuset_common_seq_show */
+static cpumask_var_t print_isolated_cpus;
+
+static void cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
+{
+	cpumask_and(print_isolated_cpus, cs->cpus_allowed, cpu_isolated_map);
+
+	seq_printf(sf, "%*pbl\n", cpumask_pr_args(print_isolated_cpus));
+}
+
 /*
  * These ascii lists should be read in a single call, by using a user
  * buffer large enough to hold the entire map.  If read in smaller
@@ -1733,6 +1744,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
 	case FILE_EFFECTIVE_MEMLIST:
 		seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
 		break;
+	case FILE_ISOLCPUS:
+		cpuset_seq_print_isolcpus(sf, cs);
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -1893,6 +1907,12 @@ static struct cftype files[] = {
 		.private = FILE_MEMORY_PRESSURE_ENABLED,
 	},
 
+	{
+		.name = "isolcpus",
+		.seq_show = cpuset_common_seq_show,
+		.private = FILE_ISOLCPUS,
+	},
+
 	{ }	/* terminate */
 };
 
@@ -2070,6 +2090,8 @@ int __init cpuset_init(void)
 		BUG();
 	if (!alloc_cpumask_var(&top_cpuset.effective_cpus, GFP_KERNEL))
 		BUG();
+	if (!alloc_cpumask_var(&print_isolated_cpus, GFP_KERNEL))
+		BUG();
 
 	cpumask_setall(top_cpuset.cpus_allowed);
 	nodes_setall(top_cpuset.mems_allowed);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/4] cpuset,isolcpus: document relationship between cpusets & isolcpus
  2015-03-03 23:00 [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
                   ` (2 preceding siblings ...)
  2015-03-03 23:00 ` [PATCH 3/4] cpusets,isolcpus: add file to show isolated cpus in cpuset riel
@ 2015-03-03 23:00 ` riel
  2015-03-09  3:37 ` [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets Tejun Heo
  4 siblings, 0 replies; 7+ messages in thread
From: riel @ 2015-03-03 23:00 UTC (permalink / raw)
  To: tj; +Cc: linux-kernel

From: Rik van Riel <riel@redhat.com>

Document the subtly changed relationship between cpusets and isolcpus.
Turns out the old documentation did not match the code...

Signed-off-by: Rik van Riel <riel@redhat.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
---
 Documentation/cgroups/cpusets.txt | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt
index f2235a162529..fdf7dff3f607 100644
--- a/Documentation/cgroups/cpusets.txt
+++ b/Documentation/cgroups/cpusets.txt
@@ -392,8 +392,10 @@ Put simply, it costs less to balance between two smaller sched domains
 than one big one, but doing so means that overloads in one of the
 two domains won't be load balanced to the other one.
 
-By default, there is one sched domain covering all CPUs, except those
-marked isolated using the kernel boot time "isolcpus=" argument.
+By default, there is one sched domain covering all CPUs, including those
+marked isolated using the kernel boot time "isolcpus=" argument. However,
+the isolated CPUs will not participate in load balancing, and will not
+have tasks running on them unless explicitly assigned.
 
 This default load balancing across all CPUs is not well suited for
 the following two situations:
@@ -465,6 +467,10 @@ such partially load balanced cpusets, as they may be artificially
 constrained to some subset of the CPUs allowed to them, for lack of
 load balancing to the other CPUs.
 
+CPUs in "cpuset.isolcpus" were excluded from load balancing by the
+isolcpus= kernel boot option, and will never be load balanced regardless
+of the value of "cpuset.sched_load_balance" in any cpuset.
+
 1.7.1 sched_load_balance implementation details.
 ------------------------------------------------
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets
  2015-03-03 23:00 [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
                   ` (3 preceding siblings ...)
  2015-03-03 23:00 ` [PATCH 4/4] cpuset,isolcpus: document relationship between cpusets & isolcpus riel
@ 2015-03-09  3:37 ` Tejun Heo
  4 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2015-03-09  3:37 UTC (permalink / raw)
  To: riel; +Cc: linux-kernel

On Tue, Mar 03, 2015 at 06:00:19PM -0500, riel@redhat.com wrote:
> Ensure that cpus specified with the isolcpus= boot commandline
> option stay outside of the load balancing in the kernel scheduler.
> 
> Operations like load balancing can introduce unwanted latencies,
> which is exactly what the isolcpus= commandline is there to prevent.
> 
> Previously, simply creating a new cpuset, without even touching the
> cpuset.cpus field inside the new cpuset, would undo the effects of
> isolcpus=, by creating a scheduler domain spanning the whole system,
> and setting up load balancing inside that domain. The cpuset root
> cpuset.cpus file is read-only, so there was not even a way to undo
> that effect.
> 
> This does not impact the majority of cpusets users, since isolcpus=
> is a fairly specialized feature used for realtime purposes.
> 
> This version fixes the UP compilation issue, in the same way done
> for the other cpumasks.

Can you please repost with Li Zefan <lizefan@huawei.com> and
cgroups@vger.kernel.org cc'd?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler
  2015-03-09 16:12 [PATCH v4 RESEND " riel
@ 2015-03-09 16:12 ` riel
  0 siblings, 0 replies; 7+ messages in thread
From: riel @ 2015-03-09 16:12 UTC (permalink / raw)
  To: tj
  Cc: linux-kernel, cgroups, lizefan, Rik van Riel, Peter Zijlstra,
	Clark Williams, Ingo Molnar, Luiz Capitulino, David Rientjes,
	Mike Galbraith

From: Rik van Riel <riel@redhat.com>

Needed by the next patch. Also makes cpu_isolated_map present
when compiled without SMP and/or with CONFIG_NR_CPUS=1, like
the other cpu masks.

At some point we may want to clean things up so cpumasks do
not exist in UP kernels. Maybe something for the CONFIG_TINY
crowd.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: cgroups@vger.kernel.org
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/sched.h | 2 ++
 kernel/sched/core.c   | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6d77432e14ff..ca365d79480c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -329,6 +329,8 @@ extern asmlinkage void schedule_tail(struct task_struct *prev);
 extern void init_idle(struct task_struct *idle, int cpu);
 extern void init_idle_bootup_task(struct task_struct *idle);
 
+extern cpumask_var_t cpu_isolated_map;
+
 extern int runqueue_is_locked(int cpu);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..b578bb23410b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -306,6 +306,9 @@ __read_mostly int scheduler_running;
  */
 int sysctl_sched_rt_runtime = 950000;
 
+/* cpus with isolated domains */
+cpumask_var_t cpu_isolated_map;
+
 /*
  * this_rq_lock - lock this runqueue and disable interrupts.
  */
@@ -5811,9 +5814,6 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
 	update_top_cache_domain(cpu);
 }
 
-/* cpus with isolated domains */
-static cpumask_var_t cpu_isolated_map;
-
 /* Setup the mask of cpus configured for isolated domains */
 static int __init isolated_cpu_setup(char *str)
 {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-03-09 16:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-03 23:00 [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
2015-03-03 23:00 ` [PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler riel
2015-03-03 23:00 ` [PATCH 2/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets riel
2015-03-03 23:00 ` [PATCH 3/4] cpusets,isolcpus: add file to show isolated cpus in cpuset riel
2015-03-03 23:00 ` [PATCH 4/4] cpuset,isolcpus: document relationship between cpusets & isolcpus riel
2015-03-09  3:37 ` [PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets Tejun Heo
2015-03-09 16:12 [PATCH v4 RESEND " riel
2015-03-09 16:12 ` [PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler riel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.