linux-kernel.vger.kernel.org archive mirror
* [PATCH] cpuset: hotunplug cpus and mems in all cpusets
From: Paul Jackson @ 2006-08-29  6:08 UTC
  To: Andrew Morton
  Cc: nathanl, Simon.Derr, linux-kernel, ntl, y-goto, Anton Blanchard,
	Paul Jackson, Dave Hansen, kamezawa.hiroyu

From: Paul Jackson <pj@sgi.com>

The cpuset code handling hot unplug of CPUs or Memory Nodes
was incorrect - it could remove a CPU or Node from the top
cpuset, while leaving it still in some child cpusets.

One basic rule of cpusets is that each cpuset's cpus and mems
are subsets of its parent's.  The cpuset hot unplug code
violated this rule.
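
The subset rule above can be checked mechanically.  What follows is a
minimal userspace sketch, not kernel code: struct cs_node, its
child/sibling links and the one-bit-per-CPU unsigned long mask are all
hypothetical stand-ins for the kernel's struct cpuset and cpumask_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cpuset: one bit per CPU, children on a list. */
struct cs_node {
	unsigned long cpus;      /* CPUs allowed in this cpuset */
	struct cs_node *child;   /* first child, or NULL */
	struct cs_node *sibling; /* next sibling, or NULL */
};

/*
 * Return true if every cpuset below 'parent' uses only CPUs that
 * its own parent also has.  Recursive, on depth of the subtree.
 */
static bool subtree_obeys_subset_rule(const struct cs_node *parent)
{
	const struct cs_node *c;

	assert(parent != NULL);
	for (c = parent->child; c; c = c->sibling) {
		if (c->cpus & ~parent->cpus)
			return false;	/* child has a CPU its parent lacks */
		if (!subtree_obeys_subset_rule(c))
			return false;
	}
	return true;
}
```

In this model, clearing a CPU bit in the top node alone while a child
still holds that bit makes the check fail, which is the situation the
old hot unplug code could produce.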

So the cpuset hotunplug handler must walk down the tree,
removing any removed CPU or Node from all cpusets.

However, it is not allowed to make a cpuset's cpus or mems
become empty.  They can only transition from empty to non-empty,
not back.

So if the last CPU or Node would be removed from a cpuset by
the above walk, we scan back up the cpuset hierarchy, finding
the nearest ancestor that still has something online, and copy
its CPU or Memory placement.
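
The walk described above can be modelled in userspace roughly as
follows.  This is only a sketch under stated assumptions: the names
model_cpuset, nearest_online, prune_offline and model_hotplug_event
are hypothetical, masks are plain unsigned longs rather than
cpumask_t, only CPUs are modelled (mems would be handled the same
way), and there is no locking.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical model of a cpuset: one bit per CPU. */
struct model_cpuset {
	struct model_cpuset *parent;
	struct model_cpuset *children[MAX_CHILDREN];
	size_t nr_children;
	unsigned long cpus;
};

/*
 * Scan up from 'cs' to the nearest cpuset that still has an online
 * CPU and return that cpuset's online CPUs.  If none is found, fall
 * back to everything that is online.
 */
static unsigned long nearest_online(const struct model_cpuset *cs,
				    unsigned long online)
{
	assert(cs != NULL);
	while (cs && !(cs->cpus & online))
		cs = cs->parent;
	return cs ? (cs->cpus & online) : online;
}

/*
 * Walk down the tree, dropping offline CPUs from every cpuset.
 * A cpuset that would become empty copies its nearest ancestor's
 * online CPUs instead.  Cpusets that were already empty stay empty.
 */
static void prune_offline(struct model_cpuset *cur, unsigned long online)
{
	size_t i;

	for (i = 0; i < cur->nr_children; i++) {
		struct model_cpuset *c = cur->children[i];

		prune_offline(c, online);
		if (c->cpus)
			c->cpus = nearest_online(c, online);
	}
}

/* Fix up the children first, then make the top track what is online. */
static void model_hotplug_event(struct model_cpuset *top,
				unsigned long online)
{
	prune_offline(top, online);
	top->cpus = online;
}
```

For example, with a top cpuset holding CPUs 0-3, a child holding CPUs
2-3 and a grandchild holding only CPU 3, taking CPU 3 offline leaves
both the child and the grandchild with CPU 2 in this model: the
grandchild would have gone empty, so it copies the child's remaining
online CPU.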

Signed-off-by: Paul Jackson <pj@sgi.com>

---

Anton or Nathan - can you test this again?  I still lack any
hotplug enabled test rig.  It builds, boots and still does
the usual cpuset unit tests just fine on ia64.

Andrew - I see you've sent one of these patches onto Linus:
    cpuset: top_cpuset tracks hotplug changes to cpu_online_map
but you are still holding in *-mm this patch:
    cpuset: top_cpuset tracks hotplug changes to node_online_map
So far as I can see - ** good **.  I am not aware of any reason
to hurry sending along the second patch above, or this new patch
here.  Whether they end up in 2.6.18 or 2.6.19 doesn't matter
so far as I know.

I think I am done with this patch series now -- but I thought
that the last two times as well, so that apparently means
little.


 kernel/cpuset.c |   87 +++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 70 insertions(+), 17 deletions(-)

--- 2.6.18-rc4-mm3.orig/kernel/cpuset.c	2006-08-28 19:18:39.384949360 -0700
+++ 2.6.18-rc4-mm3/kernel/cpuset.c	2006-08-28 22:29:38.668178349 -0700
@@ -2040,48 +2040,101 @@ out:
 	return err;
 }
 
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
 /*
- * The top_cpuset tracks what CPUs and Memory Nodes are online,
- * period.  This is necessary in order to make cpusets transparent
- * (of no affect) on systems that are actively using CPU hotplug
- * but making no active use of cpusets.
+ * If common_cpu_mem_hotplug_unplug(), below, unplugs any CPUs
+ * or memory nodes, we need to walk over the cpuset hierarchy,
+ * removing that CPU or node from all cpusets.  If this removes the
+ * last CPU or node from a cpuset, then the guarantee_online_cpus()
+ * or guarantee_online_mems() code will use that emptied cpuset's
+ * nearest ancestor's online CPUs or nodes.  Cpusets already empty of
+ * CPUs or nodes are left empty.
+ *
+ * This routine is intentionally inefficient in a couple of regards.
+ * It will check all cpusets in a subtree even if the top cpuset of
+ * the subtree has no offline CPUs or nodes.  It checks both CPUs and
+ * nodes, even though the caller could have been coded to know that
+ * only one of CPUs or nodes needed to be checked on a given call.
+ * This was done to minimize text size rather than cpu cycles.
  *
- * This routine ensures that top_cpuset.cpus_allowed tracks
- * cpu_online_map on each CPU hotplug (cpuhp) event.
+ * Call with both manage_mutex and callback_mutex held.
+ *
+ * Recursive, on depth of cpuset subtree.
  */
 
-#ifdef CONFIG_HOTPLUG_CPU
-static int cpuset_handle_cpuhp(struct notifier_block *nb,
-				unsigned long phase, void *cpu)
+static void guarantee_online_cpus_mems_in_subtree(const struct cpuset *cur)
+{
+	struct cpuset *c;
+
+	/* Each child cpuset's CPUs and mems must be online */
+	list_for_each_entry(c, &cur->children, sibling) {
+		guarantee_online_cpus_mems_in_subtree(c);
+		if (!cpus_empty(c->cpus_allowed))
+			guarantee_online_cpus(c, &c->cpus_allowed);
+		if (!nodes_empty(c->mems_allowed))
+			guarantee_online_mems(c, &c->mems_allowed);
+	}
+}
+
+/*
+ * The cpus_allowed and mems_allowed masks in the top_cpuset track
+ * cpu_online_map and node_online_map.  Force the top cpuset to track
+ * what's online after any CPU or memory node hotplug or unplug event.
+ *
+ * To ensure that we don't remove a CPU or node from the top cpuset
+ * that is currently in use by a child cpuset (which would violate
+ * the rule that cpusets must be subsets of their parent), we first
+ * call the recursive routine guarantee_online_cpus_mems_in_subtree().
+ *
+ * Since there are two callers of this routine, one for CPU hotplug
+ * events and one for memory node hotplug events, we could have coded
+ * two separate routines here.  We code it as a single common routine
+ * in order to minimize text size.
+ */
+
+static void common_cpu_mem_hotplug_unplug(void)
 {
 	mutex_lock(&manage_mutex);
 	mutex_lock(&callback_mutex);
 
+	guarantee_online_cpus_mems_in_subtree(&top_cpuset);
 	top_cpuset.cpus_allowed = cpu_online_map;
+	top_cpuset.mems_allowed = node_online_map;
 
 	mutex_unlock(&callback_mutex);
 	mutex_unlock(&manage_mutex);
+}
+#endif
+
+#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * The top_cpuset tracks what CPUs and Memory Nodes are online,
+ * period.  This is necessary in order to make cpusets transparent
+ * (of no effect) on systems that are actively using CPU hotplug
+ * but making no active use of cpusets.
+ *
+ * This routine ensures that top_cpuset.cpus_allowed tracks
+ * cpu_online_map on each CPU hotplug (cpuhp) event.
+ */
 
+static int cpuset_handle_cpuhp(struct notifier_block *nb,
+				unsigned long phase, void *cpu)
+{
+	common_cpu_mem_hotplug_unplug();
 	return 0;
 }
 #endif
 
+#ifdef CONFIG_MEMORY_HOTPLUG
 /*
  * Keep top_cpuset.mems_allowed tracking node_online_map.
  * Call this routine anytime after you change node_online_map.
  * See also the previous routine cpuset_handle_cpuhp().
  */
 
-#ifdef CONFIG_MEMORY_HOTPLUG
 void cpuset_track_online_nodes()
 {
-	mutex_lock(&manage_mutex);
-	mutex_lock(&callback_mutex);
-
-	top_cpuset.mems_allowed = node_online_map;
-
-	mutex_unlock(&callback_mutex);
-	mutex_unlock(&manage_mutex);
+	common_cpu_mem_hotplug_unplug();
 }
 #endif
 

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: [PATCH] cpuset: hotunplug cpus and mems in all cpusets
From: Andrew Morton @ 2006-08-29  6:19 UTC
  To: Paul Jackson
  Cc: nathanl, Simon.Derr, linux-kernel, ntl, y-goto, Anton Blanchard,
	Dave Hansen, kamezawa.hiroyu

On Mon, 28 Aug 2006 23:08:24 -0700
Paul Jackson <pj@sgi.com> wrote:

> The cpuset code handling hot unplug of CPUs or Memory Nodes
> was incorrect - it could remove a CPU or Node from the top
> cpuset, while leaving it still in some child cpusets.
> 
> One basic rule of cpusets is that each cpuset's cpus and mems
> are subsets of its parent's.  The cpuset hot unplug code
> violated this rule.
> 
> So the cpuset hotunplug handler must walk down the tree,
> removing any removed CPU or Node from all cpusets.
> 
> However, it is not allowed to make a cpuset's cpus or mems
> become empty.  They can only transition from empty to non-empty,
> not back.
> 
> So if the last CPU or Node would be removed from a cpuset by
> the above walk, we scan back up the cpuset hierarchy, finding
> the nearest ancestor that still has something online, and copy
> its CPU or Memory placement.

Did you consider failing the hotremove request instead?


* Re: [PATCH] cpuset: hotunplug cpus and mems in all cpusets
From: Paul Jackson @ 2006-08-29  7:15 UTC
  To: Andrew Morton
  Cc: nathanl, Simon.Derr, linux-kernel, ntl, y-goto, anton, haveblue,
	kamezawa.hiroyu

Andrew wrote:
> Did you consider failing the hotremove request instead?

Eh ... we'd end up with another complaint from the hotplug
folks, in another year, when some obscure constraint on
nested cpusets thwarted their efforts to unplug something.

It's best if cpusets just deals with it somehow, and doesn't
complain about the comings and goings of hardware.

At least, that's my take on it ...

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401


* Re: [PATCH] cpuset: hotunplug cpus and mems in all cpusets
From: Paul Jackson @ 2006-08-31 10:21 UTC
  To: Paul Jackson
  Cc: akpm, nathanl, Simon.Derr, linux-kernel, ntl, y-goto, anton,
	haveblue, kamezawa.hiroyu

Anton or Nathan - did you get a chance to test this patch?

I finally got my hands on a hotplug capable system - just 4 CPUs and 1
memory node (SMP, not NUMA).  This patch plugged and unplugged CPUs ok
from what I could see, updating the cpuset 'cpus' files as intended.

I am confident it's a good patch.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401
