* [PATCH v2 00/12] cpuset: separate configured masks and effective masks
@ 2013-10-11  9:49 Li Zefan
  2013-10-11  9:49 ` [PATCH v2 01/12] cpuset: add cs->effective_cpus and cs->effective_mems Li Zefan
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

This patchset introduces behavior changes, but only if you mount cgroupfs
with the sane_behavior option:

- We introduce the new interfaces cpuset.effective_cpus and cpuset.effective_mems,
  while cpuset.cpus and cpuset.mems will be the configured masks.

- The configured masks can be changed by writing cpuset.cpus/mems only. They
  won't be changed when hotplug happens.

- Users can configure cpus and mems without restrictions from the parent cpuset.
  The effective masks will enforce the hierarchical behavior.

- Users can also configure cpus and mems to include already-offlined CPUs/nodes.

- When a CPU/node is onlined, it will be brought back to the effective masks
  if it's in the configured masks.

- We build sched domains based on the effective cpumasks rather than the
  configured cpumasks.


v2:
- fixed two bugs
- made changelogs more verbose
- added more comments
- changed cs->real_{mems,cpus}_allowed to cs->effective_{mems,cpus}
- split "cpuset: enable onlined cpu/node in effective masks" into 2 patches
- exported cpuset.effective_{cpus,mems} unconditionally


Li Zefan (12):
  cpuset: add cs->effective_cpus and cs->effective_mems
  cpuset: update cpuset->effective_{cpus,mems} at hotplug
  cpuset: update cs->effective_{cpus,mems} when config changes
  cpuset: inherit ancestor's masks if effective_{cpus,mems} becomes empty
  cpuset: use effective cpumask to build sched domains
  cpuset: initialize top_cpuset's configured masks at mount
  cpuset: apply cs->effective_{cpus,mems}
  cpuset: make cs->{cpus,mems}_allowed as user-configured masks
  cpuset: refactor cpuset_hotplug_update_tasks()
  cpuset: enable onlined cpu/node in effective masks
  cpuset: allow writing offlined masks to cpuset.cpus/mems
  cpuset: export effective masks to userspace

 kernel/cpuset.c | 513 ++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 316 insertions(+), 197 deletions(-)

-- 
1.8.0.2


* [PATCH v2 01/12] cpuset: add cs->effective_cpus and cs->effective_mems
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
@ 2013-10-11  9:49 ` Li Zefan
  2013-10-11  9:49 ` [PATCH v2 02/12] cpuset: update cpuset->effective_{cpus,mems} at hotplug Li Zefan
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

We're going to have separate user-configured masks and effective ones.

Eventually the configured masks can only be changed by writing cpuset.cpus
and cpuset.mems, and they won't be restricted by the parent cpuset, while
the effective masks reflect cpu/memory hotplug and hierarchical restriction;
these are the real masks that apply to the tasks in the cpuset.

We calculate the effective mask this way:
  - top cpuset's effective_mask == online_mask, otherwise
  - cpuset's effective_mask == configured_mask & parent's effective_mask;
    if the result is empty, it inherits the parent's effective mask.

These behavior changes are for sane_behavior only. For !sane_behavior,
effective_mask and configured_mask are the same, so we won't break old
interfaces.
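
To make the rule concrete, here is a minimal userspace model of it. This
is only an illustration, not kernel code: it uses a plain 64-bit word in
place of cpumask_var_t/nodemask_t, and struct cs, compute_effective() and
the field names are hypothetical:

  #include <stdint.h>
  #include <stdio.h>

  struct cs {
  	uint64_t configured;	/* models cpus_allowed / mems_allowed */
  	uint64_t effective;	/* models effective_cpus / effective_mems */
  	struct cs *parent;	/* NULL for the top cpuset */
  };

  /* Apply the rule above to one cpuset. */
  static void compute_effective(struct cs *cs, uint64_t online)
  {
  	if (!cs->parent) {		/* top cpuset tracks the online mask */
  		cs->effective = online;
  		return;
  	}
  	cs->effective = cs->configured & cs->parent->effective;
  	if (!cs->effective)		/* empty: inherit the parent's mask */
  		cs->effective = cs->parent->effective;
  }

  int main(void)
  {
  	struct cs top = { .configured = ~0ULL };
  	struct cs child = { .configured = 0x6, .parent = &top }; /* CPUs 1-2 */

  	compute_effective(&top, 0x3);	/* only CPUs 0-1 are online */
  	compute_effective(&child, 0x3);
  	printf("%#llx\n", (unsigned long long)child.effective);	/* 0x2 */
  	return 0;
  }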

This patch adds and initializes the effective masks in struct cpuset.
The effective masks of the top cpuset are the same as its configured
masks, and a child cpuset inherits its parent's effective masks.

This won't introduce any behavior change.

v2:
- s/real_{mems,cpus}_allowed/effective_{mems,cpus}, suggested by Tejun.
- don't init effective masks in cpuset_css_online() if !sane_behavior

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 48 insertions(+), 11 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 6bf981e..e13fc2a 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -81,8 +81,14 @@ struct cpuset {
 	struct cgroup_subsys_state css;
 
 	unsigned long flags;		/* "unsigned long" so bitops work */
-	cpumask_var_t cpus_allowed;	/* CPUs allowed to tasks in cpuset */
-	nodemask_t mems_allowed;	/* Memory Nodes allowed to tasks */
+
+	/* user-configured CPUs and Memory Nodes allowed to tasks */
+	cpumask_var_t cpus_allowed;
+	nodemask_t mems_allowed;
+
+	/* effective CPUs and Memory Nodes allowed to tasks */
+	cpumask_var_t effective_cpus;
+	nodemask_t effective_mems;
 
 	/*
 	 * This is old Memory Nodes tasks took on.
@@ -381,13 +387,20 @@ static struct cpuset *alloc_trial_cpuset(struct cpuset *cs)
 	if (!trial)
 		return NULL;
 
-	if (!alloc_cpumask_var(&trial->cpus_allowed, GFP_KERNEL)) {
-		kfree(trial);
-		return NULL;
-	}
-	cpumask_copy(trial->cpus_allowed, cs->cpus_allowed);
+	if (!alloc_cpumask_var(&trial->cpus_allowed, GFP_KERNEL))
+		goto free_cs;
+	if (!alloc_cpumask_var(&trial->effective_cpus, GFP_KERNEL))
+		goto free_cpus;
 
+	cpumask_copy(trial->cpus_allowed, cs->cpus_allowed);
+	cpumask_copy(trial->effective_cpus, cs->effective_cpus);
 	return trial;
+
+free_cpus:
+	free_cpumask_var(trial->cpus_allowed);
+free_cs:
+	kfree(trial);
+	return NULL;
 }
 
 /**
@@ -396,6 +409,7 @@ static struct cpuset *alloc_trial_cpuset(struct cpuset *cs)
  */
 static void free_trial_cpuset(struct cpuset *trial)
 {
+	free_cpumask_var(trial->effective_cpus);
 	free_cpumask_var(trial->cpus_allowed);
 	kfree(trial);
 }
@@ -1948,18 +1962,26 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
 	if (!cs)
 		return ERR_PTR(-ENOMEM);
-	if (!alloc_cpumask_var(&cs->cpus_allowed, GFP_KERNEL)) {
-		kfree(cs);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!alloc_cpumask_var(&cs->cpus_allowed, GFP_KERNEL))
+		goto free_cs;
+	if (!alloc_cpumask_var(&cs->effective_cpus, GFP_KERNEL))
+		goto free_cpus;
 
 	set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
 	cpumask_clear(cs->cpus_allowed);
 	nodes_clear(cs->mems_allowed);
+	cpumask_clear(cs->effective_cpus);
+	nodes_clear(cs->effective_mems);
 	fmeter_init(&cs->fmeter);
 	cs->relax_domain_level = -1;
 
 	return &cs->css;
+
+free_cpus:
+	free_cpumask_var(cs->cpus_allowed);
+free_cs:
+	kfree(cs);
+	return ERR_PTR(-ENOMEM);
 }
 
 static int cpuset_css_online(struct cgroup_subsys_state *css)
@@ -1982,6 +2004,13 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 
 	number_of_cpusets++;
 
+	mutex_lock(&callback_mutex);
+	if (cgroup_sane_behavior(cs->css.cgroup)) {
+		cpumask_copy(cs->effective_cpus, parent->effective_cpus);
+		cs->effective_mems = parent->effective_mems;
+	}
+	mutex_unlock(&callback_mutex);
+
 	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
 		goto out_unlock;
 
@@ -2041,6 +2070,7 @@ static void cpuset_css_free(struct cgroup_subsys_state *css)
 {
 	struct cpuset *cs = css_cs(css);
 
+	free_cpumask_var(cs->effective_cpus);
 	free_cpumask_var(cs->cpus_allowed);
 	kfree(cs);
 }
@@ -2071,9 +2101,13 @@ int __init cpuset_init(void)
 
 	if (!alloc_cpumask_var(&top_cpuset.cpus_allowed, GFP_KERNEL))
 		BUG();
+	if (!alloc_cpumask_var(&top_cpuset.effective_cpus, GFP_KERNEL))
+		BUG();
 
 	cpumask_setall(top_cpuset.cpus_allowed);
 	nodes_setall(top_cpuset.mems_allowed);
+	cpumask_setall(top_cpuset.effective_cpus);
+	nodes_setall(top_cpuset.effective_mems);
 
 	fmeter_init(&top_cpuset.fmeter);
 	set_bit(CS_SCHED_LOAD_BALANCE, &top_cpuset.flags);
@@ -2311,6 +2345,9 @@ void __init cpuset_init_smp(void)
 	top_cpuset.mems_allowed = node_states[N_MEMORY];
 	top_cpuset.old_mems_allowed = top_cpuset.mems_allowed;
 
+	cpumask_copy(top_cpuset.effective_cpus, cpu_active_mask);
+	top_cpuset.effective_mems = node_states[N_MEMORY];
+
 	register_hotmemory_notifier(&cpuset_track_online_nodes_nb);
 }
 
-- 
1.8.0.2



* [PATCH v2 02/12] cpuset: update cpuset->effective_{cpus,mems} at hotplug
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
  2013-10-11  9:49 ` [PATCH v2 01/12] cpuset: add cs->effective_cpus and cs->effective_mems Li Zefan
@ 2013-10-11  9:49 ` Li Zefan
  2013-10-11  9:50 ` [PATCH v2 03/12] cpuset: update cs->effective_{cpus,mems} when config changes Li Zefan
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

We're going to have separate user-configured masks and effective ones.

Eventually the configured masks can only be changed by writing cpuset.cpus
and cpuset.mems, and they won't be restricted by the parent cpuset, while
the effective masks reflect cpu/memory hotplug and hierarchical restriction;
these are the real masks that apply to the tasks in the cpuset.

We calculate the effective mask this way:
  - top cpuset's effective_mask == online_mask, otherwise
  - cpuset's effective_mask == configured_mask & parent's effective_mask;
    if the result is empty, it inherits the parent's effective mask.

These behavior changes are for sane_behavior only. For !sane_behavior,
effective_mask and configured_mask are the same, so we won't break old
interfaces.

To make cs->effective_{cpus,mems} the effective masks, we need to:
  - change the effective masks at hotplug
  - change the effective masks at config change
  - take on ancestor's mask when the effective mask is empty

The first item is done here.

This won't introduce any behavior change.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index e13fc2a..d0ccde2 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2186,6 +2186,7 @@ retry:
 
 	mutex_lock(&callback_mutex);
 	cpumask_andnot(cs->cpus_allowed, cs->cpus_allowed, &off_cpus);
+	cpumask_andnot(cs->effective_cpus, cs->effective_cpus, &off_cpus);
 	mutex_unlock(&callback_mutex);
 
 	/*
@@ -2200,6 +2201,7 @@ retry:
 
 	mutex_lock(&callback_mutex);
 	nodes_andnot(cs->mems_allowed, cs->mems_allowed, off_mems);
+	nodes_andnot(cs->effective_mems, cs->effective_mems, off_mems);
 	mutex_unlock(&callback_mutex);
 
 	/*
@@ -2263,6 +2265,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 	if (cpus_updated) {
 		mutex_lock(&callback_mutex);
 		cpumask_copy(top_cpuset.cpus_allowed, &new_cpus);
+		cpumask_copy(top_cpuset.effective_cpus, &new_cpus);
 		mutex_unlock(&callback_mutex);
 		/* we don't mess with cpumasks of tasks in top_cpuset */
 	}
@@ -2271,6 +2274,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 	if (mems_updated) {
 		mutex_lock(&callback_mutex);
 		top_cpuset.mems_allowed = new_mems;
+		top_cpuset.effective_mems = new_mems;
 		mutex_unlock(&callback_mutex);
 		update_tasks_nodemask(&top_cpuset, NULL);
 	}
-- 
1.8.0.2



* [PATCH v2 03/12] cpuset: update cs->effective_{cpus,mems} when config changes
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
  2013-10-11  9:49 ` [PATCH v2 01/12] cpuset: add cs->effective_cpus and cs->effective_mems Li Zefan
  2013-10-11  9:49 ` [PATCH v2 02/12] cpuset: update cpuset->effective_{cpus,mems} at hotplug Li Zefan
@ 2013-10-11  9:50 ` Li Zefan
  2013-10-15 15:18   ` Tejun Heo
  2013-10-11  9:50 ` [PATCH v2 04/12] cpuset: inherit ancestor's masks if effective_{cpus,mems} becomes empty Li Zefan
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:50 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

We're going to have separate user-configured masks and effective ones.

Eventually the configured masks can only be changed by writing cpuset.cpus
and cpuset.mems, and they won't be restricted by the parent cpuset, while
the effective masks reflect cpu/memory hotplug and hierarchical restriction;
these are the real masks that apply to the tasks in the cpuset.

We calculate the effective mask this way:
  - top cpuset's effective_mask == online_mask, otherwise
  - cpuset's effective_mask == configured_mask & parent's effective_mask;
    if the result is empty, it inherits the parent's effective mask.

These behavior changes are for sane_behavior only. For !sane_behavior,
effective_mask and configured_mask are the same, so we won't break old
interfaces.

To make cs->effective_{cpus,mems} the effective masks, we need to:
  - change the effective masks at hotplug
  - change the effective masks at config change
  - take on ancestor's mask when the effective mask is empty

The second item is done here. We don't need to treat root_cs specially
in update_cpumasks_hier(). While at it, remove the redundant variable
is_load_balanced.

This won't introduce any behavior change.
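
As a sketch of the resulting propagation, reusing the bitmask model from
the patch 01 notes (with hypothetical child/sibling links added to struct
cs). The kernel walks the css tree iteratively under RCU with css_tryget()
instead of recursing, and the empty-mask inheritance shown here is only
added by the next patch:

  static void update_tasks(struct cs *cs)	/* stub for the task updates */
  {
  	(void)cs;
  }

  /* Never called on the top cpuset, so cs->parent is non-NULL. */
  static void update_hier(struct cs *cs)
  {
  	uint64_t new = cs->configured & cs->parent->effective;

  	if (!new)			/* empty: inherit (see patch 04) */
  		new = cs->parent->effective;
  	if (new == cs->effective)	/* unchanged: prune the whole subtree */
  		return;
  	cs->effective = new;
  	update_tasks(cs);
  	for (struct cs *c = cs->child; c; c = c->sibling)
  		update_hier(c);
  }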

v2:
- revise the comment in update_{cpu,node}masks_hier(), suggested by Tejun.
- fix to use @cp instead of @cs in these two functions.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 115 ++++++++++++++++++++++++++++++++------------------------
 1 file changed, 66 insertions(+), 49 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index d0ccde2..bdc6047 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -879,39 +879,49 @@ static void update_tasks_cpumask(struct cpuset *cs, struct ptr_heap *heap)
 	css_scan_tasks(&cs->css, NULL, cpuset_change_cpumask, cs, heap);
 }
 
-/*
- * update_tasks_cpumask_hier - Update the cpumasks of tasks in the hierarchy.
- * @root_cs: the root cpuset of the hierarchy
- * @update_root: update root cpuset or not?
+/**
+ * update_cpumasks_hier - Update effective cpumasks and tasks in the subtree
+ * @cs: the cpuset to consider
+ * @trialcs: the trial cpuset
  * @heap: the heap used by css_scan_tasks()
  *
- * This will update cpumasks of tasks in @root_cs and all other empty cpusets
- * which take on cpumask of @root_cs.
- *
- * Called with cpuset_mutex held
+ * When the configured cpumask is changed, the effective cpumasks of this cpuset
+ * and all its descendants need to be updated.
  */
-static void update_tasks_cpumask_hier(struct cpuset *root_cs,
-				      bool update_root, struct ptr_heap *heap)
+static void update_cpumasks_hier(struct cpuset *cs, struct cpuset *trialcs,
+				 struct ptr_heap *heap)
 {
-	struct cpuset *cp;
 	struct cgroup_subsys_state *pos_css;
+	struct cpuset *cp;
 
 	rcu_read_lock();
-	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
-		if (cp == root_cs) {
-			if (!update_root)
-				continue;
-		} else {
-			/* skip the whole subtree if @cp have some CPU */
-			if (!cpumask_empty(cp->cpus_allowed)) {
-				pos_css = css_rightmost_descendant(pos_css);
-				continue;
-			}
+	cpuset_for_each_descendant_pre(cp, pos_css, cs) {
+		struct cpuset *parent = parent_cs(cp);
+		struct cpumask *new_cpus = trialcs->effective_cpus;
+
+		cpumask_and(new_cpus, cp->cpus_allowed,
+			    parent->effective_cpus);
+
+		/*
+		 * Skip the whole subtree if the cpumask remains the same
+		 * and isn't empty. If it's empty, we need to update tasks
+		 * to take on an ancestor's cpumask.
+		 */
+		if (cpumask_equal(new_cpus, cp->effective_cpus) &&
+		    ((cp == cs) || !cpumask_empty(new_cpus))) {
+			pos_css = css_rightmost_descendant(pos_css);
+			continue;
 		}
+
 		if (!css_tryget(&cp->css))
 			continue;
+
 		rcu_read_unlock();
 
+		mutex_lock(&callback_mutex);
+		cpumask_copy(cp->effective_cpus, new_cpus);
+		mutex_unlock(&callback_mutex);
+
 		update_tasks_cpumask(cp, heap);
 
 		rcu_read_lock();
@@ -930,7 +940,6 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
 {
 	struct ptr_heap heap;
 	int retval;
-	int is_load_balanced;
 
 	/* top_cpuset.cpus_allowed tracks cpu_online_mask; it's read-only */
 	if (cs == &top_cpuset)
@@ -965,17 +974,15 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
 	if (retval)
 		return retval;
 
-	is_load_balanced = is_sched_load_balance(trialcs);
-
 	mutex_lock(&callback_mutex);
 	cpumask_copy(cs->cpus_allowed, trialcs->cpus_allowed);
 	mutex_unlock(&callback_mutex);
 
-	update_tasks_cpumask_hier(cs, true, &heap);
+	update_cpumasks_hier(cs, trialcs, &heap);
 
 	heap_free(&heap);
 
-	if (is_load_balanced)
+	if (is_sched_load_balance(cs))
 		rebuild_sched_domains_locked();
 	return 0;
 }
@@ -1136,40 +1143,50 @@ static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
 	cpuset_being_rebound = NULL;
 }
 
-/*
- * update_tasks_nodemask_hier - Update the nodemasks of tasks in the hierarchy.
- * @cs: the root cpuset of the hierarchy
- * @update_root: update the root cpuset or not?
+/**
+ * update_nodemasks_hier - Update effective nodemasks and tasks in the subtree
+ * @cs: the cpuset to consider
+ * @trialcs: the trial cpuset
  * @heap: the heap used by css_scan_tasks()
  *
- * This will update nodemasks of tasks in @root_cs and all other empty cpusets
- * which take on nodemask of @root_cs.
- *
- * Called with cpuset_mutex held
+ * When the configured nodemask is changed, the effective nodemasks of this cpuset
+ * and all its descendants need to be updated.
  */
-static void update_tasks_nodemask_hier(struct cpuset *root_cs,
-				       bool update_root, struct ptr_heap *heap)
+static void update_nodemasks_hier(struct cpuset *cs, struct cpuset *trialcs,
+				 struct ptr_heap *heap)
 {
-	struct cpuset *cp;
 	struct cgroup_subsys_state *pos_css;
+	struct cpuset *cp;
 
 	rcu_read_lock();
-	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
-		if (cp == root_cs) {
-			if (!update_root)
-				continue;
-		} else {
-			/* skip the whole subtree if @cp have some CPU */
-			if (!nodes_empty(cp->mems_allowed)) {
-				pos_css = css_rightmost_descendant(pos_css);
-				continue;
-			}
+	cpuset_for_each_descendant_pre(cp, pos_css, cs) {
+		struct cpuset *parent = parent_cs(cp);
+		nodemask_t *new_mems = &trialcs->effective_mems;
+
+		nodes_and(*new_mems, cp->mems_allowed,
+			  parent->effective_mems);
+
+		/*
+		 * Skip the whole subtree if the nodemask remains the same
+		 * and isn't empty. If it's empty, we need to update tasks
+		 * to take on an ancestor's nodemask.
+		 */
+		if (nodes_equal(*new_mems, cp->effective_mems) &&
+		    ((cp == cs) || !nodes_empty(*new_mems))) {
+			pos_css = css_rightmost_descendant(pos_css);
+			continue;
 		}
+
 		if (!css_tryget(&cp->css))
 			continue;
+
 		rcu_read_unlock();
 
-		update_tasks_nodemask(cp, heap);
+		mutex_lock(&callback_mutex);
+		cp->effective_mems = *new_mems;
+		mutex_unlock(&callback_mutex);
+
+		update_tasks_nodemask(cp, heap);
 
 		rcu_read_lock();
 		css_put(&cp->css);
@@ -1241,7 +1258,7 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 	cs->mems_allowed = trialcs->mems_allowed;
 	mutex_unlock(&callback_mutex);
 
-	update_tasks_nodemask_hier(cs, true, &heap);
+	update_nodemasks_hier(cs, trialcs, &heap);
 
 	heap_free(&heap);
 done:
-- 
1.8.0.2



* [PATCH v2 04/12] cpuset: inherit ancestor's masks if effective_{cpus,mems} becomes empty
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (2 preceding siblings ...)
  2013-10-11  9:50 ` [PATCH v2 03/12] cpuset: update cs->effective_{cpus,mems} when config changes Li Zefan
@ 2013-10-11  9:50 ` Li Zefan
  2013-10-11  9:50 ` [PATCH v2 05/12] cpuset: use effective cpumask to build sched domains Li Zefan
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:50 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

We're going to have separate user-configured masks and effective ones.

Eventually the configured masks can only be changed by writing cpuset.cpus
and cpuset.mems, and they won't be restricted by the parent cpuset, while
the effective masks reflect cpu/memory hotplug and hierarchical restriction;
these are the real masks that apply to the tasks in the cpuset.

We calculate the effective mask this way:
  - top cpuset's effective_mask == online_mask, otherwise
  - cpuset's effective_mask == configured_mask & parent's effective_mask;
    if the result is empty, it inherits the parent's effective mask.

These behavior changes are for sane_behavior only. For !sane_behavior,
effective_mask and configured_mask are the same, so we won't break old
interfaces.

To make cs->effective_{cpus,mems} the effective masks, we need to:
  - change the effective masks at hotplug
  - change the effective masks at config change
  - take on ancestor's mask when the effective mask is empty

This won't introduce any behavior change.

v2:
- Add comments to explain that effective masks are the same as configured
masks for !sane_behavior.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index bdc6047..6723b88 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -899,16 +899,22 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpuset *trialcs,
 		struct cpuset *parent = parent_cs(cp);
 		struct cpumask *new_cpus = trialcs->effective_cpus;
 
-		cpumask_and(new_cpus, cp->cpus_allowed,
-			    parent->effective_cpus);
+		/*
+		 * If !sane_behavior, new_cpus will equal cpus_allowed,
+		 * which is not empty, so it's guaranteed the effective mask
+		 * is the same as the configured mask.
+		 */
+		cpumask_and(new_cpus, cp->cpus_allowed, parent->effective_cpus);
 
 		/*
-		 * Skip the whole subtree if the cpumask remains the same
-		 * and isn't empty. If it's empty, we need to update tasks
-		 * to take on an ancestor's cpumask.
+		 * If it becomes empty, inherit the effective mask of the
+		 * parent, which is guaranteed to have some CPUs.
 		 */
-		if (cpumask_equal(new_cpus, cp->effective_cpus) &&
-		    ((cp == cs) || !cpumask_empty(new_cpus))) {
+		if (cpumask_empty(new_cpus))
+			cpumask_copy(new_cpus, parent->effective_cpus);
+
+		/* Skip the whole subtree if the cpumask remains the same. */
+		if (cpumask_equal(new_cpus, cp->effective_cpus)) {
 			pos_css = css_rightmost_descendant(pos_css);
 			continue;
 		}
@@ -1163,16 +1169,22 @@ static void update_nodemasks_hier(struct cpuset *cs, struct cpuset *trialcs,
 		struct cpuset *parent = parent_cs(cp);
 		nodemask_t *new_mems = &trialcs->effective_mems;
 
-		nodes_and(*new_mems, cp->mems_allowed,
-			  parent->effective_mems);
+		/*
+		 * If !sane_behavior, new_mems will equal mems_allowed,
+		 * which is not empty, so it's guaranteed the effective mask
+		 * is the same as the configured mask.
+		 */
+		nodes_and(*new_mems, cp->mems_allowed, parent->effective_mems);
 
 		/*
-		 * Skip the whole subtree if the nodemask remains the same
-		 * and isn't empty. If it's empty, we need to update tasks
-		 * to take on an ancestor's nodemask.
+		 * If it becomes empty, inherit the effective mask of the
+		 * parent, which is guaranteed to have some MEMs.
 		 */
-		if (nodes_equal(*new_mems, cp->effective_mems) &&
-		    ((cp == cs) || !nodes_empty(*new_mems))) {
+		if (nodes_empty(*new_mems))
+			*new_mems = parent->effective_mems;
+
+		/* Skip the whole subtree if the nodemask is not changed. */
+		if (nodes_equal(*new_mems, cp->effective_mems)) {
 			pos_css = css_rightmost_descendant(pos_css);
 			continue;
 		}
@@ -2203,7 +2215,11 @@ retry:
 
 	mutex_lock(&callback_mutex);
 	cpumask_andnot(cs->cpus_allowed, cs->cpus_allowed, &off_cpus);
+
+	/* Inherit the effective mask of the parent, if it becomes empty. */
 	cpumask_andnot(cs->effective_cpus, cs->effective_cpus, &off_cpus);
+	if (sane && cpumask_empty(cs->effective_cpus))
+		cpumask_copy(cs->effective_cpus, parent_cs(cs)->effective_cpus);
 	mutex_unlock(&callback_mutex);
 
 	/*
@@ -2218,7 +2234,11 @@ retry:
 
 	mutex_lock(&callback_mutex);
 	nodes_andnot(cs->mems_allowed, cs->mems_allowed, off_mems);
+
+	/* Inherit the effective mask of the parent, if it becomes empty */
 	nodes_andnot(cs->effective_mems, cs->effective_mems, off_mems);
+	if (sane && nodes_empty(cs->effective_mems))
+		cs->effective_mems = parent_cs(cs)->effective_mems;
 	mutex_unlock(&callback_mutex);
 
 	/*
-- 
1.8.0.2


* [PATCH v2 05/12] cpuset: use effective cpumask to build sched domains
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (3 preceding siblings ...)
  2013-10-11  9:50 ` [PATCH v2 04/12] cpuset: inherit ancestor's masks if effective_{cpus,mems} becomes empty Li Zefan
@ 2013-10-11  9:50 ` Li Zefan
  2013-10-15 15:25   ` Tejun Heo
  2013-10-11  9:50 ` [PATCH v2 06/12] cpuset: initialize top_cpuset's configured masks at mount Li Zefan
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:50 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

We're going to have separate user-configured masks and effective ones.

Eventually the configured masks can only be changed by writing cpuset.cpus
and cpuset.mems, and they won't be restricted by the parent cpuset, while
the effective masks reflect cpu/memory hotplug and hierarchical restriction;
these are the real masks that apply to the tasks in the cpuset.

We calculate the effective mask this way:
  - top cpuset's effective_mask == online_mask, otherwise
  - cpuset's effective_mask == configured_mask & parent's effective_mask;
    if the result is empty, it inherits the parent's effective mask.

These behavior changes are for sane_behavior only. For !sane_behavior,
effective_mask and configured_mask are the same, so we won't break old
interfaces.

This patch updates cpuset to use the effective masks to build sched domains.

This won't introduce any behavior change.
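
In the bitmask model from the patch 01 notes, the overlap test that drives
domain partitioning now looks at what tasks can actually run on rather
than the configured values (a sketch, not the kernel implementation):

  /* Two cpusets must share a sched domain iff their effective CPU
   * masks intersect; the configured masks may contain offlined CPUs. */
  static int overlaps(const struct cs *a, const struct cs *b)
  {
  	return (a->effective & b->effective) != 0;
  }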

v2:
- Add a comment for the call of rebuild_sched_domains(), suggested
by Tejun.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 6723b88..360e547 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -499,11 +499,11 @@ out:
 #ifdef CONFIG_SMP
 /*
  * Helper routine for generate_sched_domains().
- * Do cpusets a, b have overlapping cpus_allowed masks?
+ * Do cpusets a, b have overlapping effective_cpus masks?
  */
 static int cpusets_overlap(struct cpuset *a, struct cpuset *b)
 {
-	return cpumask_intersects(a->cpus_allowed, b->cpus_allowed);
+	return cpumask_intersects(a->effective_cpus, b->effective_cpus);
 }
 
 static void
@@ -620,7 +620,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
 			*dattr = SD_ATTR_INIT;
 			update_domain_attr_tree(dattr, &top_cpuset);
 		}
-		cpumask_copy(doms[0], top_cpuset.cpus_allowed);
+		cpumask_copy(doms[0], top_cpuset.effective_cpus);
 
 		goto done;
 	}
@@ -727,7 +727,7 @@ restart:
 			struct cpuset *b = csa[j];
 
 			if (apn == b->pn) {
-				cpumask_or(dp, dp, b->cpus_allowed);
+				cpumask_or(dp, dp, b->effective_cpus);
 				if (dattr)
 					update_domain_attr_tree(dattr + nslot, b);
 
@@ -893,6 +893,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpuset *trialcs,
 {
 	struct cgroup_subsys_state *pos_css;
 	struct cpuset *cp;
+	bool need_rebuild_sched_domains = false;
 
 	rcu_read_lock();
 	cpuset_for_each_descendant_pre(cp, pos_css, cs) {
@@ -930,10 +931,21 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpuset *trialcs,
 
 		update_tasks_cpumask(cp, heap);
 
+		/*
+		 * If the effective cpumask of any non-empty cpuset is
+		 * changed, we need to rebuild sched domains.
+		 */
+		if (!cpumask_empty(cp->cpus_allowed) &&
+		    is_sched_load_balance(cp))
+			need_rebuild_sched_domains = true;
+
 		rcu_read_lock();
 		css_put(&cp->css);
 	}
 	rcu_read_unlock();
+
+	if (need_rebuild_sched_domains)
+		rebuild_sched_domains_locked();
 }
 
 /**
@@ -987,9 +999,6 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
 	update_cpumasks_hier(cs, trialcs, &heap);
 
 	heap_free(&heap);
-
-	if (is_sched_load_balance(cs))
-		rebuild_sched_domains_locked();
 	return 0;
 }
 
-- 
1.8.0.2



* [PATCH v2 06/12] cpuset: initialize top_cpuset's configured masks at mount
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (4 preceding siblings ...)
  2013-10-11  9:50 ` [PATCH v2 05/12] cpuset: use effective cpumask to build sched domains Li Zefan
@ 2013-10-11  9:50 ` Li Zefan
  2013-10-11  9:51 ` [PATCH v2 07/12] cpuset: apply cs->effective_{cpus,mems} Li Zefan
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:50 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

As we now have to support both sane_behavior and !sane_behavior,
top_cpuset's configured masks need to be initialized accordingly.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 360e547..5c53ba5 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2113,8 +2113,27 @@ static void cpuset_css_free(struct cgroup_subsys_state *css)
 	kfree(cs);
 }
 
+void cpuset_bind(struct cgroup_subsys_state *root_css)
+{
+	mutex_lock(&cpuset_mutex);
+	mutex_lock(&callback_mutex);
+
+	if (cgroup_sane_behavior(root_css->cgroup)) {
+		cpumask_copy(top_cpuset.cpus_allowed, cpu_possible_mask);
+		top_cpuset.mems_allowed = node_possible_map;
+	} else {
+		cpumask_copy(top_cpuset.cpus_allowed,
+			     top_cpuset.effective_cpus);
+		top_cpuset.mems_allowed = top_cpuset.effective_mems;
+	}
+
+	mutex_unlock(&callback_mutex);
+	mutex_unlock(&cpuset_mutex);
+}
+
 struct cgroup_subsys cpuset_subsys = {
 	.name = "cpuset",
+	.bind = cpuset_bind,
 	.css_alloc = cpuset_css_alloc,
 	.css_online = cpuset_css_online,
 	.css_offline = cpuset_css_offline,
-- 
1.8.0.2



* [PATCH v2 07/12] cpuset: apply cs->effective_{cpus,mems}
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (5 preceding siblings ...)
  2013-10-11  9:50 ` [PATCH v2 06/12] cpuset: initialize top_cpuset's configured masks at mount Li Zefan
@ 2013-10-11  9:51 ` Li Zefan
  2013-10-11  9:51 ` [PATCH v2 08/12] cpuset: make cs->{cpus,mems}_allowed as user-configured masks Li Zefan
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

Now we can use cs->effective_{cpus,mems} as the effective masks. They
are used whenever:

- we update tasks' cpus_allowed/mems_allowed,
- we want to retrieve task_cs(tsk)'s cpus_allowed/mems_allowed.

They actually replace effective_{cpu,node}mask_cpuset().

effective_mask == configured_mask & parent's effective_mask, except when
the result is empty, in which case it inherits the parent's effective_mask.
The result equals the mask computed from effective_{cpu,node}mask_cpuset().

This won't affect the original !sane_behavior, because in this case we
make sure the effective masks are always the same as the user-configured
masks.
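
In the bitmask model from the patch 01 notes, the removed helpers walked
up past cpusets with empty configured masks; here is a sketch of what they
computed (hypothetical name, not the kernel code):

  /* What effective_cpumask_cpuset() used to do: find the nearest
   * ancestor with a non-empty configured mask.  Per the changelog
   * above, with patches 03-04 in place the result equals
   * cs->effective, so callers can read that field directly. */
  static uint64_t old_effective(const struct cs *cs)
  {
  	while (!cs->configured)
  		cs = cs->parent;
  	return cs->configured;
  }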

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 83 ++++++++++-----------------------------------------------
 1 file changed, 14 insertions(+), 69 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 5c53ba5..040ec59 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -318,9 +318,9 @@ static struct file_system_type cpuset_fs_type = {
  */
 static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
 {
-	while (!cpumask_intersects(cs->cpus_allowed, cpu_online_mask))
+	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask))
 		cs = parent_cs(cs);
-	cpumask_and(pmask, cs->cpus_allowed, cpu_online_mask);
+	cpumask_and(pmask, cs->effective_cpus, cpu_online_mask);
 }
 
 /*
@@ -336,9 +336,9 @@ static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
  */
 static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
 {
-	while (!nodes_intersects(cs->mems_allowed, node_states[N_MEMORY]))
+	while (!nodes_intersects(cs->effective_mems, node_states[N_MEMORY]))
 		cs = parent_cs(cs);
-	nodes_and(*pmask, cs->mems_allowed, node_states[N_MEMORY]);
+	nodes_and(*pmask, cs->effective_mems, node_states[N_MEMORY]);
 }
 
 /*
@@ -803,45 +803,6 @@ void rebuild_sched_domains(void)
 	mutex_unlock(&cpuset_mutex);
 }
 
-/*
- * effective_cpumask_cpuset - return nearest ancestor with non-empty cpus
- * @cs: the cpuset in interest
- *
- * A cpuset's effective cpumask is the cpumask of the nearest ancestor
- * with non-empty cpus. We use effective cpumask whenever:
- * - we update tasks' cpus_allowed. (they take on the ancestor's cpumask
- *   if the cpuset they reside in has no cpus)
- * - we want to retrieve task_cs(tsk)'s cpus_allowed.
- *
- * Called with cpuset_mutex held. cpuset_cpus_allowed_fallback() is an
- * exception. See comments there.
- */
-static struct cpuset *effective_cpumask_cpuset(struct cpuset *cs)
-{
-	while (cpumask_empty(cs->cpus_allowed))
-		cs = parent_cs(cs);
-	return cs;
-}
-
-/*
- * effective_nodemask_cpuset - return nearest ancestor with non-empty mems
- * @cs: the cpuset in interest
- *
- * A cpuset's effective nodemask is the nodemask of the nearest ancestor
- * with non-empty memss. We use effective nodemask whenever:
- * - we update tasks' mems_allowed. (they take on the ancestor's nodemask
- *   if the cpuset they reside in has no mems)
- * - we want to retrieve task_cs(tsk)'s mems_allowed.
- *
- * Called with cpuset_mutex held.
- */
-static struct cpuset *effective_nodemask_cpuset(struct cpuset *cs)
-{
-	while (nodes_empty(cs->mems_allowed))
-		cs = parent_cs(cs);
-	return cs;
-}
-
 /**
  * cpuset_change_cpumask - make a task's cpus_allowed the same as its cpuset's
  * @tsk: task to test
@@ -856,9 +817,8 @@ static struct cpuset *effective_nodemask_cpuset(struct cpuset *cs)
 static void cpuset_change_cpumask(struct task_struct *tsk, void *data)
 {
 	struct cpuset *cs = data;
-	struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
 
-	set_cpus_allowed_ptr(tsk, cpus_cs->cpus_allowed);
+	set_cpus_allowed_ptr(tsk, cs->effective_cpus);
 }
 
 /**
@@ -1026,14 +986,12 @@ static void cpuset_migrate_mm(struct mm_struct *mm, const nodemask_t *from,
 							const nodemask_t *to)
 {
 	struct task_struct *tsk = current;
-	struct cpuset *mems_cs;
 
 	tsk->mems_allowed = *to;
 
 	do_migrate_pages(mm, from, to, MPOL_MF_MOVE_ALL);
 
-	mems_cs = effective_nodemask_cpuset(task_cs(tsk));
-	guarantee_online_mems(mems_cs, &tsk->mems_allowed);
+	guarantee_online_mems(task_cs(tsk), &tsk->mems_allowed);
 }
 
 /*
@@ -1128,13 +1086,12 @@ static void *cpuset_being_rebound;
 static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
 {
 	static nodemask_t newmems;	/* protected by cpuset_mutex */
-	struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
 	struct cpuset_change_nodemask_arg arg = { .cs = cs,
 						  .newmems = &newmems };
 
 	cpuset_being_rebound = cs;		/* causes mpol_dup() rebind */
 
-	guarantee_online_mems(mems_cs, &newmems);
+	guarantee_online_mems(cs, &newmems);
 
 	/*
 	 * The mpol_rebind_mm() call takes mmap_sem, which we couldn't
@@ -1572,8 +1529,6 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
 							cpuset_subsys_id);
 	struct cpuset *cs = css_cs(css);
 	struct cpuset *oldcs = css_cs(oldcss);
-	struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
-	struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
 
 	mutex_lock(&cpuset_mutex);
 
@@ -1581,9 +1536,9 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
 	if (cs == &top_cpuset)
 		cpumask_copy(cpus_attach, cpu_possible_mask);
 	else
-		guarantee_online_cpus(cpus_cs, cpus_attach);
+		guarantee_online_cpus(cs, cpus_attach);
 
-	guarantee_online_mems(mems_cs, &cpuset_attach_nodemask_to);
+	guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
 
 	cgroup_taskset_for_each(task, css, tset) {
 		/*
@@ -1600,11 +1555,9 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
 	 * Change mm, possibly for multiple threads in a threadgroup. This is
 	 * expensive and may sleep.
 	 */
-	cpuset_attach_nodemask_to = cs->mems_allowed;
+	cpuset_attach_nodemask_to = cs->effective_mems;
 	mm = get_task_mm(leader);
 	if (mm) {
-		struct cpuset *mems_oldcs = effective_nodemask_cpuset(oldcs);
-
 		mpol_rebind_mm(mm, &cpuset_attach_nodemask_to);
 
 		/*
@@ -1615,7 +1568,7 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
 		 * mm from.
 		 */
 		if (is_memory_migrate(cs)) {
-			cpuset_migrate_mm(mm, &mems_oldcs->old_mems_allowed,
+			cpuset_migrate_mm(mm, &oldcs->old_mems_allowed,
 					  &cpuset_attach_nodemask_to);
 		}
 		mmput(mm);
@@ -2433,23 +2386,17 @@ void __init cpuset_init_smp(void)
 
 void cpuset_cpus_allowed(struct task_struct *tsk, struct cpumask *pmask)
 {
-	struct cpuset *cpus_cs;
-
 	mutex_lock(&callback_mutex);
 	task_lock(tsk);
-	cpus_cs = effective_cpumask_cpuset(task_cs(tsk));
-	guarantee_online_cpus(cpus_cs, pmask);
+	guarantee_online_cpus(task_cs(tsk), pmask);
 	task_unlock(tsk);
 	mutex_unlock(&callback_mutex);
 }
 
 void cpuset_cpus_allowed_fallback(struct task_struct *tsk)
 {
-	struct cpuset *cpus_cs;
-
 	rcu_read_lock();
-	cpus_cs = effective_cpumask_cpuset(task_cs(tsk));
-	do_set_cpus_allowed(tsk, cpus_cs->cpus_allowed);
+	do_set_cpus_allowed(tsk, task_cs(tsk)->effective_cpus);
 	rcu_read_unlock();
 
 	/*
@@ -2488,13 +2435,11 @@ void cpuset_init_current_mems_allowed(void)
 
 nodemask_t cpuset_mems_allowed(struct task_struct *tsk)
 {
-	struct cpuset *mems_cs;
 	nodemask_t mask;
 
 	mutex_lock(&callback_mutex);
 	task_lock(tsk);
-	mems_cs = effective_nodemask_cpuset(task_cs(tsk));
-	guarantee_online_mems(mems_cs, &mask);
+	guarantee_online_mems(task_cs(tsk), &mask);
 	task_unlock(tsk);
 	mutex_unlock(&callback_mutex);
 
-- 
1.8.0.2



* [PATCH v2 08/12] cpuset: make cs->{cpus,mems}_allowed as user-configured masks
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (6 preceding siblings ...)
  2013-10-11  9:51 ` [PATCH v2 07/12] cpuset: apply cs->effective_{cpus,mems} Li Zefan
@ 2013-10-11  9:51 ` Li Zefan
  2013-10-11  9:51 ` [PATCH v2 09/12] cpuset: refactor cpuset_hotplug_update_tasks() Li Zefan
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

Now that we've used effective masks to enforce the hierarchical behavior,
we can use cs->{cpus,mems}_allowed as the configured masks.

Configured masks can be changed by writing cpuset.cpus and cpuset.mems
only. The new behaviors are:

- They won't be changed by hotplug anymore.
- They won't be limited by their parent's masks.

This is a behavior change, but it won't take effect unless cgroupfs is
mounted with sane_behavior.
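
In the bitmask model from the patch 01 notes, the hotplug path then
becomes something like this (a sketch; @sane mirrors
cgroup_sane_behavior(), and compute_effective() is the hypothetical
helper from those notes):

  static void hotplug(struct cs *cs, uint64_t online, int sane)
  {
  	if (!sane)			/* old behavior: shrink for good */
  		cs->configured &= online;
  	compute_effective(cs, online);	/* effective masks always track */
  }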

v2:
- Add comments to explain the differences between configured masks and
effective masks.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 51 ++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 40 insertions(+), 11 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 040ec59..e47115e 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -82,6 +82,26 @@ struct cpuset {
 
 	unsigned long flags;		/* "unsigned long" so bitops work */
 
+	/*
+	 * If sane_behavior is set:
+	 *
+	 * The user-configured masks can only be changed by writing to
+	 * cpuset.cpus and cpuset.mems, and won't be limited by the
+	 * parent masks.
+	 *
+	 * The effective masks are the real masks that apply to the tasks
+	 * in the cpuset. They may be changed if the configured masks are
+	 * changed or hotplug happens.
+	 *
+	 * effective_mask == configured_mask & parent's effective_mask,
+	 * and if it ends up empty, it will inherit the parent's mask.
+	 *
+	 *
+	 * If sane_behavior is not set:
+	 *
+	 * The user-configured masks are always the same as the effective masks.
+	 */
+
 	/* user-configured CPUs and Memory Nodes allowed to tasks */
 	cpumask_var_t cpus_allowed;
 	nodemask_t mems_allowed;
@@ -455,9 +475,13 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 
 	par = parent_cs(cur);
 
-	/* We must be a subset of our parent cpuset */
+	/*
+	 * We must be a subset of our parent cpuset, unless sane_behavior
+	 * flag is set.
+	 */
 	ret = -EACCES;
-	if (!is_cpuset_subset(trial, par))
+	if (!cgroup_sane_behavior(cur->css.cgroup) &&
+	    !is_cpuset_subset(trial, par))
 		goto out;
 
 	/*
@@ -779,7 +803,7 @@ static void rebuild_sched_domains_locked(void)
 	 * passing doms with offlined cpu to partition_sched_domains().
 	 * Anyways, hotplug work item will rebuild sched domains.
 	 */
-	if (!cpumask_equal(top_cpuset.cpus_allowed, cpu_active_mask))
+	if (!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
 		goto out;
 
 	/* Generate domain masks and attrs */
@@ -2191,11 +2215,12 @@ retry:
 		goto retry;
 	}
 
-	cpumask_andnot(&off_cpus, cs->cpus_allowed, top_cpuset.cpus_allowed);
-	nodes_andnot(off_mems, cs->mems_allowed, top_cpuset.mems_allowed);
+	cpumask_andnot(&off_cpus, cs->effective_cpus, top_cpuset.effective_cpus);
+	nodes_andnot(off_mems, cs->effective_mems, top_cpuset.effective_mems);
 
 	mutex_lock(&callback_mutex);
-	cpumask_andnot(cs->cpus_allowed, cs->cpus_allowed, &off_cpus);
+	if (!sane)
+		cpumask_andnot(cs->cpus_allowed, cs->cpus_allowed, &off_cpus);
 
 	/* Inherit the effective mask of the parent, if it becomes empty. */
 	cpumask_andnot(cs->effective_cpus, cs->effective_cpus, &off_cpus);
@@ -2214,7 +2239,8 @@ retry:
 		update_tasks_cpumask(cs, NULL);
 
 	mutex_lock(&callback_mutex);
-	nodes_andnot(cs->mems_allowed, cs->mems_allowed, off_mems);
+	if (!sane)
+		nodes_andnot(cs->mems_allowed, cs->mems_allowed, off_mems);
 
 	/* Inherit the effective mask of the parent, if it becomes empty */
 	nodes_andnot(cs->effective_mems, cs->effective_mems, off_mems);
@@ -2269,6 +2295,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 	static cpumask_t new_cpus;
 	static nodemask_t new_mems;
 	bool cpus_updated, mems_updated;
+	bool sane = cgroup_sane_behavior(top_cpuset.css.cgroup);
 
 	mutex_lock(&cpuset_mutex);
 
@@ -2276,13 +2303,14 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 	cpumask_copy(&new_cpus, cpu_active_mask);
 	new_mems = node_states[N_MEMORY];
 
-	cpus_updated = !cpumask_equal(top_cpuset.cpus_allowed, &new_cpus);
-	mems_updated = !nodes_equal(top_cpuset.mems_allowed, new_mems);
+	cpus_updated = !cpumask_equal(top_cpuset.effective_cpus, &new_cpus);
+	mems_updated = !nodes_equal(top_cpuset.effective_mems, new_mems);
 
 	/* synchronize cpus_allowed to cpu_active_mask */
 	if (cpus_updated) {
 		mutex_lock(&callback_mutex);
-		cpumask_copy(top_cpuset.cpus_allowed, &new_cpus);
+		if (!sane)
+			cpumask_copy(top_cpuset.cpus_allowed, &new_cpus);
 		cpumask_copy(top_cpuset.effective_cpus, &new_cpus);
 		mutex_unlock(&callback_mutex);
 		/* we don't mess with cpumasks of tasks in top_cpuset */
@@ -2291,7 +2319,8 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 	/* synchronize mems_allowed to N_MEMORY */
 	if (mems_updated) {
 		mutex_lock(&callback_mutex);
-		top_cpuset.mems_allowed = new_mems;
+		if (!sane)
+			top_cpuset.mems_allowed = new_mems;
 		top_cpuset.effective_mems = new_mems;
 		mutex_unlock(&callback_mutex);
 		update_tasks_nodemask(&top_cpuset, NULL);
-- 
1.8.0.2



* [PATCH v2 09/12] cpuset: refactor cpuset_hotplug_update_tasks()
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (7 preceding siblings ...)
  2013-10-11  9:51 ` [PATCH v2 08/12] cpuset: make cs->{cpus,mems}_allowed as user-configured masks Li Zefan
@ 2013-10-11  9:51 ` Li Zefan
  2013-10-11  9:51 ` [PATCH v2 10/12] cpuset: enable onlined cpu/node in effective masks Li Zefan
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

We mix the handling for both sane_behavior and !sane_behavior in the
same function, and it's quite messy, so split it into two functions.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 118 ++++++++++++++++++++++++++++++--------------------------
 1 file changed, 63 insertions(+), 55 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index e47115e..cefc8f4 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2186,6 +2186,65 @@ static void remove_tasks_in_empty_cpuset(struct cpuset *cs)
 	}
 }
 
+static void hotplug_update_tasks_insane(struct cpuset *cs,
+					struct cpumask *off_cpus,
+					nodemask_t *off_mems)
+{
+	bool is_empty;
+
+	mutex_lock(&callback_mutex);
+	cpumask_andnot(cs->cpus_allowed, cs->cpus_allowed, off_cpus);
+	cpumask_andnot(cs->effective_cpus, cs->effective_cpus, off_cpus);
+	nodes_andnot(cs->mems_allowed, cs->mems_allowed, *off_mems);
+	nodes_andnot(cs->effective_mems, cs->effective_mems, *off_mems);
+	mutex_unlock(&callback_mutex);
+
+	/*
+	 * Don't call update_tasks_cpumask() if the cpuset becomes empty,
+	 * as the tasks will be migrated to an ancestor.
+	 */
+	if (!cpumask_empty(off_cpus) && !cpumask_empty(cs->cpus_allowed))
+		update_tasks_cpumask(cs, NULL);
+	if (!nodes_empty(*off_mems) && !nodes_empty(cs->mems_allowed))
+		update_tasks_nodemask(cs, NULL);
+
+	is_empty = cpumask_empty(cs->cpus_allowed) ||
+		   nodes_empty(cs->mems_allowed);
+
+	mutex_unlock(&cpuset_mutex);
+
+	/*
+	 * Move tasks to the nearest ancestor with execution resources.
+	 * This is a full cgroup operation which will also call back into
+	 * cpuset. Should be done outside any lock.
+	 */
+	if (is_empty)
+		remove_tasks_in_empty_cpuset(cs);
+
+	mutex_lock(&cpuset_mutex);
+}
+
+static void hotplug_update_tasks_sane(struct cpuset *cs,
+				      struct cpumask *off_cpus,
+				      nodemask_t *off_mems)
+{
+	mutex_lock(&callback_mutex);
+	cpumask_andnot(cs->effective_cpus, cs->effective_cpus, off_cpus);
+	if (cpumask_empty(cs->effective_cpus))
+		cpumask_copy(cs->effective_cpus,
+			     parent_cs(cs)->effective_cpus);
+
+	nodes_andnot(cs->effective_mems, cs->effective_mems, *off_mems);
+	if (nodes_empty(cs->effective_mems))
+		cs->effective_mems = parent_cs(cs)->effective_mems;
+	mutex_unlock(&callback_mutex);
+
+	if (!cpumask_empty(off_cpus))
+		update_tasks_cpumask(cs, NULL);
+	if (!nodes_empty(*off_mems))
+		update_tasks_nodemask(cs, NULL);
+}
+
 /**
  * cpuset_hotplug_update_tasks - update tasks in a cpuset for hotunplug
  * @cs: cpuset in interest
@@ -2198,9 +2257,6 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs)
 {
 	static cpumask_t off_cpus;
 	static nodemask_t off_mems;
-	bool is_empty;
-	bool sane = cgroup_sane_behavior(cs->css.cgroup);
-
 retry:
 	wait_event(cpuset_attach_wq, cs->attach_in_progress == 0);
 
@@ -2218,60 +2274,12 @@ retry:
 	cpumask_andnot(&off_cpus, cs->effective_cpus, top_cpuset.effective_cpus);
 	nodes_andnot(off_mems, cs->effective_mems, top_cpuset.effective_mems);
 
-	mutex_lock(&callback_mutex);
-	if (!sane)
-		cpumask_andnot(cs->cpus_allowed, cs->cpus_allowed, &off_cpus);
-
-	/* Inherit the effective mask of the parent, if it becomes empty. */
-	cpumask_andnot(cs->effective_cpus, cs->effective_cpus, &off_cpus);
-	if (sane && cpumask_empty(cs->effective_cpus))
-		cpumask_copy(cs->effective_cpus, parent_cs(cs)->effective_cpus);
-	mutex_unlock(&callback_mutex);
-
-	/*
-	 * If sane_behavior flag is set, we need to update tasks' cpumask
-	 * for empty cpuset to take on ancestor's cpumask. Otherwise, don't
-	 * call update_tasks_cpumask() if the cpuset becomes empty, as
-	 * the tasks in it will be migrated to an ancestor.
-	 */
-	if ((sane && cpumask_empty(cs->cpus_allowed)) ||
-	    (!cpumask_empty(&off_cpus) && !cpumask_empty(cs->cpus_allowed)))
-		update_tasks_cpumask(cs, NULL);
-
-	mutex_lock(&callback_mutex);
-	if (!sane)
-		nodes_andnot(cs->mems_allowed, cs->mems_allowed, off_mems);
-
-	/* Inherit the effective mask of the parent, if it becomes empty */
-	nodes_andnot(cs->effective_mems, cs->effective_mems, off_mems);
-	if (sane && nodes_empty(cs->effective_mems))
-		cs->effective_mems = parent_cs(cs)->effective_mems;
-	mutex_unlock(&callback_mutex);
-
-	/*
-	 * If sane_behavior flag is set, we need to update tasks' nodemask
-	 * for empty cpuset to take on ancestor's nodemask. Otherwise, don't
-	 * call update_tasks_nodemask() if the cpuset becomes empty, as
-	 * the tasks in it will be migratd to an ancestor.
-	 */
-	if ((sane && nodes_empty(cs->mems_allowed)) ||
-	    (!nodes_empty(off_mems) && !nodes_empty(cs->mems_allowed)))
-		update_tasks_nodemask(cs, NULL);
-
-	is_empty = cpumask_empty(cs->cpus_allowed) ||
-		nodes_empty(cs->mems_allowed);
+	if (cgroup_sane_behavior(cs->css.cgroup))
+		hotplug_update_tasks_sane(cs, &off_cpus, &off_mems);
+	else
+		hotplug_update_tasks_insane(cs, &off_cpus, &off_mems);
 
 	mutex_unlock(&cpuset_mutex);
-
-	/*
-	 * If sane_behavior flag is set, we'll keep tasks in empty cpusets.
-	 *
-	 * Otherwise move tasks to the nearest ancestor with execution
-	 * resources.  This is full cgroup operation which will
-	 * also call back into cpuset.  Should be done outside any lock.
-	 */
-	if (!sane && is_empty)
-		remove_tasks_in_empty_cpuset(cs);
 }
 
 /**
-- 
1.8.0.2



* [PATCH v2 10/12] cpuset: enable onlined cpu/node in effective masks
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (8 preceding siblings ...)
  2013-10-11  9:51 ` [PATCH v2 09/12] cpuset: refactor cpuset_hotplug_update_tasks() Li Zefan
@ 2013-10-11  9:51 ` Li Zefan
  2013-10-11  9:51 ` [PATCH v2 11/12] cpuset: allow writing offlined masks to cpuset.cpus/mems Li Zefan
  2013-10-11  9:52 ` [PATCH v2 12/12] cpuset: export effective masks to userspace Li Zefan
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

First, offline cpu1:

  # echo 0-1 > cpuset.cpus
  # echo 0 > /sys/devices/system/cpu/cpu1/online
  # cat cpuset.cpus
  0-1
  # cat cpuset.effective_cpus
  0

Then online it:

  # echo 1 > /sys/devices/system/cpu/cpu1/online
  # cat cpuset.cpus
  0-1
  # cat cpuset.effective_cpus
  0-1

As shown, cpuset brings the onlined CPU back into the effective mask.

The implementation is quite straightforward. Instead of calculating the
offlined cpus/mems and doing the updates, we just set the new effective_mask
to online_mask & configured_mask.
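
In the bitmask model from the patch 01 notes, the change is from
subtracting the offlined bits to recomputing from scratch (a sketch):

  static void hotplug_update(struct cs *cs, uint64_t off)
  {
  	/* old: cs->effective &= ~off; onlined CPUs never came back */
  	(void)off;

  	/* new: recompute, so onlined CPUs in the configured mask return */
  	cs->effective = cs->configured & cs->parent->effective;
  	if (!cs->effective)
  		cs->effective = cs->parent->effective;
  }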

This is a behavior change for sane_behavior only; !sane_behavior won't
be affected.

v2:
- make the refactoring of cpuset_hotplug_update_tasks() a separate patch,
suggested by Tejun.
- make hotplug_update_tasks_insane() use @new_cpus and @new_mems as
hotplug_update_tasks_sane() does.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 64 ++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 36 insertions(+), 28 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index cefc8f4..e71c04f 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2186,26 +2186,27 @@ static void remove_tasks_in_empty_cpuset(struct cpuset *cs)
 	}
 }
 
-static void hotplug_update_tasks_insane(struct cpuset *cs,
-					struct cpumask *off_cpus,
-					nodemask_t *off_mems)
+static void
+hotplug_update_tasks_insane(struct cpuset *cs,
+			    struct cpumask *new_cpus, nodemask_t *new_mems,
+			    bool cpus_updated, bool mems_updated)
 {
 	bool is_empty;
 
 	mutex_lock(&callback_mutex);
-	cpumask_andnot(cs->cpus_allowed, cs->cpus_allowed, off_cpus);
-	cpumask_andnot(cs->effective_cpus, cs->effective_cpus, off_cpus);
-	nodes_andnot(cs->mems_allowed, cs->mems_allowed, *off_mems);
-	nodes_andnot(cs->effective_mems, cs->effective_mems, *off_mems);
+	cpumask_copy(cs->cpus_allowed, new_cpus);
+	cpumask_copy(cs->effective_cpus, new_cpus);
+	cs->mems_allowed = *new_mems;
+	cs->effective_mems = *new_mems;
 	mutex_unlock(&callback_mutex);
 
 	/*
 	 * Don't call update_tasks_cpumask() if the cpuset becomes empty,
 	 * as the tasks will be migrated to an ancestor.
 	 */
-	if (!cpumask_empty(off_cpus) && !cpumask_empty(cs->cpus_allowed))
+	if (cpus_updated && !cpumask_empty(cs->cpus_allowed))
 		update_tasks_cpumask(cs, NULL);
-	if (!nodes_empty(*off_mems) && !nodes_empty(cs->mems_allowed))
+	if (mems_updated && !nodes_empty(cs->mems_allowed))
 		update_tasks_nodemask(cs, NULL);
 
 	is_empty = cpumask_empty(cs->cpus_allowed) ||
@@ -2224,24 +2225,24 @@ static void hotplug_update_tasks_insane(struct cpuset *cs,
 	mutex_lock(&cpuset_mutex);
 }
 
-static void hotplug_update_tasks_sane(struct cpuset *cs,
-				      struct cpumask *off_cpus,
-				      nodemask_t *off_mems)
+static void
+hotplug_update_tasks_sane(struct cpuset *cs,
+			  struct cpumask *new_cpus, nodemask_t *new_mems,
+			  bool cpus_updated, bool mems_updated)
 {
+	if (cpumask_empty(new_cpus))
+		cpumask_copy(new_cpus, parent_cs(cs)->effective_cpus);
+	if (nodes_empty(*new_mems))
+		*new_mems = parent_cs(cs)->effective_mems;
+
 	mutex_lock(&callback_mutex);
-	cpumask_andnot(cs->effective_cpus, cs->effective_cpus, off_cpus);
-	if (cpumask_empty(cs->effective_cpus))
-		cpumask_copy(cs->effective_cpus,
-			     parent_cs(cs)->effective_cpus);
-
-	nodes_andnot(cs->effective_mems, cs->effective_mems, *off_mems);
-	if (nodes_empty(cs->effective_mems))
-		cs->effective_mems = parent_cs(cs)->effective_mems;
+	cpumask_copy(cs->effective_cpus, new_cpus);
+	cs->effective_mems = *new_mems;
 	mutex_unlock(&callback_mutex);
 
-	if (!cpumask_empty(off_cpus))
+	if (cpus_updated)
 		update_tasks_cpumask(cs, NULL);
-	if (!nodes_empty(*off_mems))
+	if (mems_updated)
 		update_tasks_nodemask(cs, NULL);
 }
 
@@ -2255,8 +2256,10 @@ static void hotplug_update_tasks_sane(struct cpuset *cs,
  */
 static void cpuset_hotplug_update_tasks(struct cpuset *cs)
 {
-	static cpumask_t off_cpus;
-	static nodemask_t off_mems;
+	static cpumask_t new_cpus;
+	static nodemask_t new_mems;
+	bool cpus_updated;
+	bool mems_updated;
 retry:
 	wait_event(cpuset_attach_wq, cs->attach_in_progress == 0);
 
@@ -2271,13 +2274,18 @@ retry:
 		goto retry;
 	}
 
-	cpumask_andnot(&off_cpus, cs->effective_cpus, top_cpuset.effective_cpus);
-	nodes_andnot(off_mems, cs->effective_mems, top_cpuset.effective_mems);
+	cpumask_and(&new_cpus, cs->cpus_allowed, parent_cs(cs)->effective_cpus);
+	nodes_and(new_mems, cs->mems_allowed, parent_cs(cs)->effective_mems);
+
+	cpus_updated = !cpumask_equal(&new_cpus, cs->effective_cpus);
+	mems_updated = !nodes_equal(new_mems, cs->effective_mems);
 
 	if (cgroup_sane_behavior(cs->css.cgroup))
-		hotplug_update_tasks_sane(cs, &off_cpus, &off_mems);
+		hotplug_update_tasks_sane(cs, &new_cpus, &new_mems,
+					  cpus_updated, mems_updated);
 	else
-		hotplug_update_tasks_insane(cs, &off_cpus, &off_mems);
+		hotplug_update_tasks_insane(cs, &new_cpus, &new_mems,
+					    cpus_updated, mems_updated);
 
 	mutex_unlock(&cpuset_mutex);
 }
-- 
1.8.0.2



* [PATCH v2 11/12] cpuset: allow writing offlined masks to cpuset.cpus/mems
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (9 preceding siblings ...)
  2013-10-11  9:51 ` [PATCH v2 10/12] cpuset: enable onlined cpu/node in effective masks Li Zefan
@ 2013-10-11  9:51 ` Li Zefan
  2013-10-15 15:36   ` Tejun Heo
  2013-10-11  9:52 ` [PATCH v2 12/12] cpuset: export effective masks to userspace Li Zefan
  11 siblings, 1 reply; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

As the configured masks won't be limited by their parent, and the top
cpuset's configured masks won't change when hotplug happens, it's natural
to allow writing offlined masks to the configured masks.
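
In the bitmask model from the patch 01 notes, the check in
update_cpumask()/update_nodemask() relaxes from the online mask to the
top cpuset's configured mask (a sketch, with hypothetical names):

  #include <errno.h>
  #include <stdint.h>

  /* was: the new mask had to be a subset of the online mask; now it
   * only has to stay within the top cpuset's configured mask, which
   * may include offlined CPUs/nodes. */
  static int validate_new_mask(uint64_t new_cfg, uint64_t top_cfg)
  {
  	return (new_cfg & ~top_cfg) ? -EINVAL : 0;
  }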

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index e71c04f..a98723d 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -960,7 +960,8 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
 		if (retval < 0)
 			return retval;
 
-		if (!cpumask_subset(trialcs->cpus_allowed, cpu_active_mask))
+		if (!cpumask_subset(trialcs->cpus_allowed,
+				    top_cpuset.cpus_allowed))
 			return -EINVAL;
 	}
 
@@ -1238,8 +1239,8 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 			goto done;
 
 		if (!nodes_subset(trialcs->mems_allowed,
-				node_states[N_MEMORY])) {
-			retval =  -EINVAL;
+				  top_cpuset.mems_allowed)) {
+			retval = -EINVAL;
 			goto done;
 		}
 	}
-- 
1.8.0.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 12/12] cpuset: export effective masks to userspace
  2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
                   ` (10 preceding siblings ...)
  2013-10-11  9:51 ` [PATCH v2 11/12] cpuset: allow writing offlined masks to cpuset.cpus/mems Li Zefan
@ 2013-10-11  9:52 ` Li Zefan
  11 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-10-11  9:52 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, cgroups

cpuset.cpus and cpuset.mems are the configured masks, and we need to
export the effective masks to userspace, so that users know the real
cpus_allowed and mems_allowed that apply to the tasks in a cpuset.

cpuset.effective_cpus and cpuset.effective_mems are created
unconditionally, for both sane_behavior and !sane_behavior.

v2:
- export those masks unconditionally, suggested by Tejun.
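
A minimal read-side sketch (the mount point and the "test" cpuset
below are assumptions, not paths established by this patch):

#include <stdio.h>

int main(void)
{
	char buf[256];
	FILE *f = fopen("/sys/fs/cgroup/cpuset/test/cpuset.effective_cpus", "r");

	if (!f)
		return 1;
	/* prints e.g. "0-3" -- the CPUs actually usable by tasks here */
	if (fgets(buf, sizeof(buf), f))
		printf("effective_cpus: %s", buf);
	fclose(f);
	return 0;
}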

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index a98723d..c8ba514 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1614,6 +1614,8 @@ typedef enum {
 	FILE_MEMORY_MIGRATE,
 	FILE_CPULIST,
 	FILE_MEMLIST,
+	FILE_EFFECTIVE_CPULIST,
+	FILE_EFFECTIVE_MEMLIST,
 	FILE_CPU_EXCLUSIVE,
 	FILE_MEM_EXCLUSIVE,
 	FILE_MEM_HARDWALL,
@@ -1762,23 +1764,23 @@ out_unlock:
  * across a page fault.
  */
 
-static size_t cpuset_sprintf_cpulist(char *page, struct cpuset *cs)
+static size_t cpuset_sprintf_cpulist(char *page, struct cpumask *pmask)
 {
 	size_t count;
 
 	mutex_lock(&callback_mutex);
-	count = cpulist_scnprintf(page, PAGE_SIZE, cs->cpus_allowed);
+	count = cpulist_scnprintf(page, PAGE_SIZE, pmask);
 	mutex_unlock(&callback_mutex);
 
 	return count;
 }
 
-static size_t cpuset_sprintf_memlist(char *page, struct cpuset *cs)
+static size_t cpuset_sprintf_memlist(char *page, nodemask_t mask)
 {
 	size_t count;
 
 	mutex_lock(&callback_mutex);
-	count = nodelist_scnprintf(page, PAGE_SIZE, cs->mems_allowed);
+	count = nodelist_scnprintf(page, PAGE_SIZE, mask);
 	mutex_unlock(&callback_mutex);
 
 	return count;
@@ -1802,10 +1804,16 @@ static ssize_t cpuset_common_file_read(struct cgroup_subsys_state *css,
 
 	switch (type) {
 	case FILE_CPULIST:
-		s += cpuset_sprintf_cpulist(s, cs);
+		s += cpuset_sprintf_cpulist(s, cs->cpus_allowed);
 		break;
 	case FILE_MEMLIST:
-		s += cpuset_sprintf_memlist(s, cs);
+		s += cpuset_sprintf_memlist(s, cs->mems_allowed);
+		break;
+	case FILE_EFFECTIVE_CPULIST:
+		s += cpuset_sprintf_cpulist(s, cs->effective_cpus);
+		break;
+	case FILE_EFFECTIVE_MEMLIST:
+		s += cpuset_sprintf_memlist(s, cs->effective_mems);
 		break;
 	default:
 		retval = -EINVAL;
@@ -1880,6 +1888,13 @@ static struct cftype files[] = {
 	},
 
 	{
+		.name = "effective_cpus",
+		.read = cpuset_common_file_read,
+		.max_write_len = (100U + 6 * NR_CPUS),
+		.private = FILE_EFFECTIVE_CPULIST,
+	},
+
+	{
 		.name = "mems",
 		.read = cpuset_common_file_read,
 		.write_string = cpuset_write_resmask,
@@ -1888,6 +1903,13 @@ static struct cftype files[] = {
 	},
 
 	{
+		.name = "effective_mems",
+		.read = cpuset_common_file_read,
+		.max_write_len = (100U + 6 * MAX_NUMNODES),
+		.private = FILE_EFFECTIVE_MEMLIST,
+	},
+
+	{
 		.name = "cpu_exclusive",
 		.read_u64 = cpuset_read_u64,
 		.write_u64 = cpuset_write_u64,
-- 
1.8.0.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 03/12] cpuset: update cs->effective_{cpus,mems} when config changes
  2013-10-11  9:50 ` [PATCH v2 03/12] cpuset: update cs->effective_{cpus,mems} when config changes Li Zefan
@ 2013-10-15 15:18   ` Tejun Heo
  0 siblings, 0 replies; 16+ messages in thread
From: Tejun Heo @ 2013-10-15 15:18 UTC (permalink / raw)
  To: Li Zefan; +Cc: LKML, cgroups

On Fri, Oct 11, 2013 at 05:50:04PM +0800, Li Zefan wrote:
> +	cpuset_for_each_descendant_pre(cp, pos_css, cs) {
> +		struct cpuset *parent = parent_cs(cp);
> +		struct cpumask *new_cpus = trialcs->effective_cpus;
> +
> +		cpumask_and(new_cpus, cp->cpus_allowed,
> +			    parent->effective_cpus);

So, @trialcs is only passed in to use its ->effective_cpus as a
temporary buffer?  If allocating from inside the function isn't an
option, wouldn't it be better to pass in a cpumask * instead of the
whole trialcs and explicitly note that the argument is used as a
temp var?
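
Something like this, I mean (an untested sketch of the suggested
interface, with the propagation step elided):

static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
{
	struct cpuset *cp;
	struct cgroup_subsys_state *pos_css;

	cpuset_for_each_descendant_pre(cp, pos_css, cs) {
		struct cpuset *parent = parent_cs(cp);

		/* @new_cpus is caller-provided scratch space */
		cpumask_and(new_cpus, cp->cpus_allowed,
			    parent->effective_cpus);
		/* ... copy @new_cpus to cp->effective_cpus as before ... */
	}
}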

...
> +	cpuset_for_each_descendant_pre(cp, pos_css, cs) {
> +		struct cpuset *parent = parent_cs(cp);
> +		nodemask_t *new_mems = &trialcs->effective_mems;
> +
> +		nodes_and(*new_mems, cp->mems_allowed,
> +			  parent->effective_mems);

Ditto.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 05/12] cpuset: use effective cpumask to build sched domains
  2013-10-11  9:50 ` [PATCH v2 05/12] cpuset: use effective cpumask to build sched domains Li Zefan
@ 2013-10-15 15:25   ` Tejun Heo
  0 siblings, 0 replies; 16+ messages in thread
From: Tejun Heo @ 2013-10-15 15:25 UTC (permalink / raw)
  To: Li Zefan; +Cc: LKML, cgroups

On Fri, Oct 11, 2013 at 05:50:38PM +0800, Li Zefan wrote:
> @@ -930,10 +931,21 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpuset *trialcs,
>  
>  		update_tasks_cpumask(cp, heap);
>  
> +		/*
> +		 * If the effective cpumask of any non-empty cpuset is
> +		 * changed, we need to rebuild sched domains.
> +		 */
> +		if (!cpumask_empty(cp->cpus_allowed) &&
> +		    is_sched_load_balance(cp))
> +			need_rebuild_sched_domains = true;
> +

Can you please explain *why* this change is being made in the patch
description?  The patch description doesn't give me anything, and the
comment explains "what" but not why it's moved from update_cpumask()
into this function with an extra condition.
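
(If I'm reading it right, the flag is consumed once after the
descendant walk, e.g.

	if (need_rebuild_sched_domains)
		rebuild_sched_domains_locked();

so that a single rebuild covers the whole hierarchy update -- but
that's exactly the kind of thing the description should state.)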

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 11/12] cpuset: allow writing offlined masks to cpuset.cpus/mems
  2013-10-11  9:51 ` [PATCH v2 11/12] cpuset: allow writing offlined masks to cpuset.cpus/mems Li Zefan
@ 2013-10-15 15:36   ` Tejun Heo
  0 siblings, 0 replies; 16+ messages in thread
From: Tejun Heo @ 2013-10-15 15:36 UTC (permalink / raw)
  To: Li Zefan; +Cc: LKML, cgroups

On Fri, Oct 11, 2013 at 05:51:54PM +0800, Li Zefan wrote:
> As the configured masks are no longer limited by the parent cpuset,
> and the top cpuset's masks won't change when hotplug happens, it's
> natural to allow masks containing offlined CPUs/nodes to be written
> to the configured masks.
> 
> Signed-off-by: Li Zefan <lizefan@huawei.com>
> ---
>  kernel/cpuset.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index e71c04f..a98723d 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -960,7 +960,8 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
>  		if (retval < 0)
>  			return retval;
>  
> -		if (!cpumask_subset(trialcs->cpus_allowed, cpu_active_mask))
> +		if (!cpumask_subset(trialcs->cpus_allowed,
> +				    top_cpuset.cpus_allowed))

Shouldn't this be gated by sane_behavior?

>  
> @@ -1238,8 +1239,8 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
>  			goto done;
>  
>  		if (!nodes_subset(trialcs->mems_allowed,
> -				node_states[N_MEMORY])) {
> -			retval =  -EINVAL;
> +				  top_cpuset.mems_allowed)) {
> +			retval = -EINVAL;

Ditto.
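
IOW, something like the following (untested):

	if (cgroup_sane_behavior(cs->css.cgroup)) {
		if (!cpumask_subset(trialcs->cpus_allowed,
				    top_cpuset.cpus_allowed))
			return -EINVAL;
	} else {
		if (!cpumask_subset(trialcs->cpus_allowed,
				    cpu_active_mask))
			return -EINVAL;
	}

with the equivalent node_states[N_MEMORY] fallback for mems.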

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread

Thread overview: 16+ messages
2013-10-11  9:49 [PATCH v2 00/12] cpuset: separate configured masks and effective masks Li Zefan
2013-10-11  9:49 ` [PATCH v2 01/12] cpuset: add cs->effective_cpus and cs->effective_mems Li Zefan
2013-10-11  9:49 ` [PATCH v2 02/12] cpuset: update cpuset->effective_{cpus,mems} at hotplug Li Zefan
2013-10-11  9:50 ` [PATCH v2 03/12] cpuset: update cs->effective_{cpus,mems} when config changes Li Zefan
2013-10-15 15:18   ` Tejun Heo
2013-10-11  9:50 ` [PATCH v2 04/12] cpuset: inherit ancestor's masks if effective_{cpus,mems} becomes empty Li Zefan
2013-10-11  9:50 ` [PATCH v2 05/12] cpuset: use effective cpumask to build sched domains Li Zefan
2013-10-15 15:25   ` Tejun Heo
2013-10-11  9:50 ` [PATCH v2 06/12] cpuset: initialize top_cpuset's configured masks at mount Li Zefan
2013-10-11  9:51 ` [PATCH v2 07/12] cpuset: apply cs->effective_{cpus,mems} Li Zefan
2013-10-11  9:51 ` [PATCH v2 08/12] cpuset: make cs->{cpus,mems}_allowed as user-configured masks Li Zefan
2013-10-11  9:51 ` [PATCH v2 09/12] cpuset: refactor cpuset_hotplug_update_tasks() Li Zefan
2013-10-11  9:51 ` [PATCH v2 10/12] cpuset: enable onlined cpu/node in effective masks Li Zefan
2013-10-11  9:51 ` [PATCH v2 11/12] cpuset: allow writing offlined masks to cpuset.cpus/mems Li Zefan
2013-10-15 15:36   ` Tejun Heo
2013-10-11  9:52 ` [PATCH v2 12/12] cpuset: export effective masks to userspace Li Zefan
