linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched/topology: Use Identity node only if required
@ 2018-08-08  7:09 Srikar Dronamraju
  2018-08-08  7:58 ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-08  7:09 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: LKML, Mel Gorman, Rik van Riel, Srikar Dronamraju,
	Thomas Gleixner, Michael Ellerman, Heiko Carstens,
	Suravee Suthikulpanit, Andre Wild

With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
sched domain") the scheduler introduces an extra NUMA level. However that
leads to:

 - NUMA topology on 2-node systems is no longer marked as NUMA_DIRECT.
   After this commit, it gets reported as NUMA_BACKPLANE, because
   sched_domains_numa_levels now equals 2 on 2-node systems.

 - An extra NUMA sched domain gets added and then degenerated on most
   machines.  The identity node is only needed on very few systems.
   Also, all non-NUMA systems end up populating the
   sched_domains_numa_distance and sched_domains_numa_masks tables.

 - On shared LPARs, such as on powerpc, this extra sched domain creation
   can lead to repeated RCU stalls, sometimes even causing unresponsive
   systems on boot. On such stalls, it was noticed that in
   init_sched_groups_capacity() the group walk never terminates
   (sg != sd->groups is always true).

INFO: rcu_sched self-detected stall on CPU
 1-....: (240039 ticks this GP) idle=c32/1/4611686018427387906 softirq=782/782 fqs=80012
  (t=240039 jiffies g=6272 c=6271 q=263040)
NMI backtrace for cpu 1
CPU: 1 PID: 1576 Comm: kworker/1:1 Kdump: loaded Tainted: G            E     4.18.0-rc7-master+ #42
Workqueue: events topology_work_fn
Call Trace:
[c00000832132f190] [c0000000009557ac] dump_stack+0xb0/0xf4 (unreliable)
[c00000832132f1d0] [c00000000095ed54] nmi_cpu_backtrace+0x1b4/0x230
[c00000832132f270] [c00000000095efac] nmi_trigger_cpumask_backtrace+0x1dc/0x220
[c00000832132f310] [c00000000005f77c] arch_trigger_cpumask_backtrace+0x2c/0x40
[c00000832132f330] [c0000000001a32d4] rcu_dump_cpu_stacks+0x100/0x15c
[c00000832132f380] [c0000000001a2024] rcu_check_callbacks+0x894/0xaa0
[c00000832132f4a0] [c0000000001ace9c] update_process_times+0x4c/0xa0
[c00000832132f4d0] [c0000000001c5400] tick_sched_handle.isra.13+0x50/0x80
[c00000832132f4f0] [c0000000001c549c] tick_sched_timer+0x6c/0xd0
[c00000832132f530] [c0000000001ae044] __hrtimer_run_queues+0x134/0x360
[c00000832132f5b0] [c0000000001aeea4] hrtimer_interrupt+0x124/0x300
[c00000832132f660] [c000000000024a04] timer_interrupt+0x114/0x2f0
[c00000832132f6c0] [c0000000000090f4] decrementer_common+0x114/0x120
--- interrupt: 901 at __bitmap_weight+0x70/0x100
    LR = __bitmap_weight+0x78/0x100
[c00000832132f9b0] [c0000000009bb738] __func__.61127+0x0/0x20 (unreliable)
[c00000832132fa00] [c00000000016c178] build_sched_domains+0xf98/0x13f0
[c00000832132fb30] [c00000000016d73c] partition_sched_domains+0x26c/0x440
[c00000832132fc20] [c0000000001ee284] rebuild_sched_domains_locked+0x64/0x80
[c00000832132fc50] [c0000000001f11ec] rebuild_sched_domains+0x3c/0x60
[c00000832132fc80] [c00000000007e1c4] topology_work_fn+0x24/0x40
[c00000832132fca0] [c000000000126704] process_one_work+0x1a4/0x470
[c00000832132fd30] [c000000000126a68] worker_thread+0x98/0x540
[c00000832132fdc0] [c00000000012f078] kthread+0x168/0x1b0
[c00000832132fe30] [c00000000000b65c] ret_from_kernel_thread+0x5c/0x80

A similar problem was also reported earlier at
https://lwn.net/ml/linux-kernel/20180512100233.GB3738@osiris/

One easy alternative would be to use a hint from architectures that
actually want the identity node.

Fixes: 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/x86/include/asm/topology.h |  2 ++
 arch/x86/kernel/smpboot.c       |  5 +++++
 kernel/sched/topology.c         | 34 ++++++++++++++++++++++++----------
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index c1d2a9892352..524cb900e273 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -79,7 +79,9 @@ extern void setup_node_to_cpumask_map(void);
 
 extern int __node_distance(int, int);
 #define node_distance(a, b) __node_distance(a, b)
+#define arch_supports_identity_node arch_supports_identity_node
 
+extern int arch_supports_identity_node(void);
 #else /* !CONFIG_NUMA */
 
 static inline int numa_node_id(void)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index db9656e13ea0..08de8ca06232 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1346,6 +1346,11 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
 	mtrr_aps_init();
 }
 
+int arch_supports_identity_node(void)
+{
+	return x86_has_numa_in_package;
+}
+
 static int __initdata setup_possible_cpus = -1;
 static int __init _setup_possible_cpus(char *str)
 {
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 56a0fed30c0a..8f61df23948a 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1322,20 +1322,30 @@ static void init_numa_topology_type(void)
 	}
 }
 
+#ifndef arch_supports_identity_node
+static inline int arch_supports_identity_node(void)
+{
+	return 0;
+}
+#endif
+
 void sched_init_numa(void)
 {
 	int next_distance, curr_distance = node_distance(0, 0);
 	struct sched_domain_topology_level *tl;
-	int level = 0;
+	int numa_in_package, level = 0;
 	int i, j, k;
 
 	sched_domains_numa_distance = kzalloc(sizeof(int) * nr_node_ids, GFP_KERNEL);
 	if (!sched_domains_numa_distance)
 		return;
 
-	/* Includes NUMA identity node at level 0. */
-	sched_domains_numa_distance[level++] = curr_distance;
-	sched_domains_numa_levels = level;
+	numa_in_package = arch_supports_identity_node();
+	if (numa_in_package) {
+		/* Includes NUMA identity node at level 0. */
+		sched_domains_numa_distance[level++] = curr_distance;
+		sched_domains_numa_levels = level;
+	}
 
 	/*
 	 * O(nr_nodes^2) deduplicating selection sort -- in order to find the
@@ -1445,19 +1455,23 @@ void sched_init_numa(void)
 	for (i = 0; sched_domain_topology[i].mask; i++)
 		tl[i] = sched_domain_topology[i];
 
+	j  = 0;
 	/*
 	 * Add the NUMA identity distance, aka single NODE.
 	 */
-	tl[i++] = (struct sched_domain_topology_level){
-		.mask = sd_numa_mask,
-		.numa_level = 0,
-		SD_INIT_NAME(NODE)
-	};
+	if (numa_in_package) {
+		tl[i++] = (struct sched_domain_topology_level){
+			.mask = sd_numa_mask,
+			.numa_level = 0,
+			SD_INIT_NAME(NODE)
+		};
+		j++;
+	}
 
 	/*
 	 * .. and append 'j' levels of NUMA goodness.
 	 */
-	for (j = 1; j < level; i++, j++) {
+	for (; j < level; i++, j++) {
 		tl[i] = (struct sched_domain_topology_level){
 			.mask = sd_numa_mask,
 			.sd_flags = cpu_numa_flags,
-- 
2.12.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-08  7:09 [PATCH] sched/topology: Use Identity node only if required Srikar Dronamraju
@ 2018-08-08  7:58 ` Peter Zijlstra
  2018-08-08  8:19   ` Srikar Dronamraju
  2018-08-10 16:45   ` Srikar Dronamraju
  0 siblings, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-08  7:58 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	Andre Wild

On Wed, Aug 08, 2018 at 12:39:31PM +0530, Srikar Dronamraju wrote:
> With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
> sched domain") the scheduler introduces an extra NUMA level. However that
> leads to:
> 
>  - NUMA topology on 2-node systems is no longer marked as NUMA_DIRECT.
>    After this commit, it gets reported as NUMA_BACKPLANE, because
>    sched_domains_numa_levels now equals 2 on 2-node systems.
> 
>  - An extra NUMA sched domain gets added and then degenerated on most
>    machines.  The identity node is only needed on very few systems.
>    Also, all non-NUMA systems end up populating the
>    sched_domains_numa_distance and sched_domains_numa_masks tables.
> 
>  - On shared LPARs, such as on powerpc, this extra sched domain creation
>    can lead to repeated RCU stalls, sometimes even causing unresponsive
>    systems on boot. On such stalls, it was noticed that in
>    init_sched_groups_capacity() the group walk never terminates
>    (sg != sd->groups is always true).

The idea was that if the topology level is redundant (as it often is),
then the degenerate code would take it out.
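
Roughly (a sketch from memory, not verbatim kernel code), cpu_attach_domain()
prunes redundant levels like this:

	/* walk up from the lowest domain and splice out parents that add nothing */
	for (tmp = sd; tmp; ) {
		struct sched_domain *parent = tmp->parent;

		if (!parent)
			break;

		if (sd_parent_degenerate(tmp, parent)) {
			tmp->parent = parent->parent;
			if (parent->parent)
				parent->parent->child = tmp;
			/* the redundant parent is then destroyed */
		} else {
			tmp = parent;
		}
	}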

Why is that not working (right) and can we fix that instead?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-08  7:58 ` Peter Zijlstra
@ 2018-08-08  8:19   ` Srikar Dronamraju
  2018-08-08  8:43     ` Peter Zijlstra
  2018-08-10 16:45   ` Srikar Dronamraju
  1 sibling, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-08  8:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	Andre Wild

* Peter Zijlstra <peterz@infradead.org> [2018-08-08 09:58:41]:

> On Wed, Aug 08, 2018 at 12:39:31PM +0530, Srikar Dronamraju wrote:
> >  - NUMA topology on 2-node systems is no longer marked as NUMA_DIRECT.
> >    After this commit, it gets reported as NUMA_BACKPLANE, because
> >    sched_domains_numa_levels now equals 2 on 2-node systems.
> > 
> >  - An extra NUMA sched domain gets added and then degenerated on most
> >    machines.  The identity node is only needed on very few systems.
> >    Also, all non-NUMA systems end up populating the
> >    sched_domains_numa_distance and sched_domains_numa_masks tables.
> > 
> >  - On shared LPARs, such as on powerpc, this extra sched domain creation
> >    can lead to repeated RCU stalls, sometimes even causing unresponsive
> >    systems on boot. On such stalls, it was noticed that in
> >    init_sched_groups_capacity() the group walk never terminates
> >    (sg != sd->groups is always true).
> 
> The idea was that if the topology level is redundant (as it often is);
> then the degenerate code would take it out.
> 
> Why is that not working (right) and can we fix that instead?
> 

All I have found is that regular NUMA sched_domains use the OVERLAP flag,
which in turn results in build_overlap_sched_groups(). The identity node uses
build_sched_groups(). Somehow build_sched_groups() is unable to create the
group list correctly. I am still trying to see why that makes a difference.

I have tried passing .flags = SDTL_OVERLAP to the identity node and
that works well.
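
For reference, that experiment amounts to this change to the identity node
entry in sched_init_numa() (illustrative only):

	/* NUMA identity node, but forced to use independent (overlapping) groups */
	tl[i++] = (struct sched_domain_topology_level){
		.mask = sd_numa_mask,
		.numa_level = 0,
		.flags = SDTL_OVERLAP,
		SD_INIT_NAME(NODE)
	};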

However, I still think that if in the majority of cases the identity node is
going to be redundant, then we should use a hint.

We could fix the NUMA topology to be NUMA_DIRECT for 2-node machines by
checking if sched_domains_numa_levels == 2, but then I don't know what it
means for a system that has only a NODE but no NUMA level.

i.e. what should we say the NUMA topology type is for a machine that has only
a NODE but no NUMA sched_domain?

-- 
Thanks and Regards
Srikar


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-08  8:19   ` Srikar Dronamraju
@ 2018-08-08  8:43     ` Peter Zijlstra
  2018-08-08  9:30       ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-08  8:43 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	Andre Wild

On Wed, Aug 08, 2018 at 01:19:42AM -0700, Srikar Dronamraju wrote:

> However, I still think if majority of the cases the identity node is
> going to be redundant, then we should use hint.

No, same way we always generate SMT domains when we have
CONFIG_SCHED_SMT irrespective of the hardware having SMT.

Also, an arch hook like you did is just fugly. If anything, you add the
single node thing to the regular topology setup of x86_numa_in_package.

But the thing is, even for x86_numa_in_package the single node domain is
mostly redundant because the MC domain will match the NODE domain most
times.

> We could fix the numa topology to be NUMA_DIRECT for 2 node machines, by
> checking if sched_domains_numa_levels == 2, but then I dont know what it
> means for a system that has only NODE but not NUMA level.

You have a point there; I think we should not have added the NODE thing
to sched_domains_numa_level, let me see if we can fix that sanely.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-08  8:43     ` Peter Zijlstra
@ 2018-08-08  9:30       ` Peter Zijlstra
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-08  9:30 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	Andre Wild

On Wed, Aug 08, 2018 at 10:43:02AM +0200, Peter Zijlstra wrote:
> On Wed, Aug 08, 2018 at 01:19:42AM -0700, Srikar Dronamraju wrote:

> > We could fix the numa topology to be NUMA_DIRECT for 2 node machines, by
> > checking if sched_domains_numa_levels == 2, but then I dont know what it
> > means for a system that has only NODE but not NUMA level.
> 
> You have a point there; I think we should not have added the NODE thing
> to sched_domains_numa_level, let me see if we can fix that sanely.

Hurm, looking this over I don't see anything better than changing that
NUMA_DIRECT test to <= 2.

But as the comment says, DIRECT means: all nodes are directly connected,
or not a NUMA system. A system with a single node is not a NUMA system
and would thus qualify.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-08  7:58 ` Peter Zijlstra
  2018-08-08  8:19   ` Srikar Dronamraju
@ 2018-08-10 16:45   ` Srikar Dronamraju
  2018-08-29  8:43     ` Peter Zijlstra
  1 sibling, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-10 16:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Andre Wild

* Peter Zijlstra <peterz@infradead.org> [2018-08-08 09:58:41]:

> On Wed, Aug 08, 2018 at 12:39:31PM +0530, Srikar Dronamraju wrote:
> > With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
> > sched domain") the scheduler introduces an extra NUMA level. However that
> > leads to:
> > 
> >  - NUMA topology on 2-node systems is no longer marked as NUMA_DIRECT.
> >    After this commit, it gets reported as NUMA_BACKPLANE, because
> >    sched_domains_numa_levels now equals 2 on 2-node systems.
> > 
> >  - An extra NUMA sched domain gets added and then degenerated on most
> >    machines.  The identity node is only needed on very few systems.
> >    Also, all non-NUMA systems end up populating the
> >    sched_domains_numa_distance and sched_domains_numa_masks tables.
> > 
> >  - On shared LPARs, such as on powerpc, this extra sched domain creation
> >    can lead to repeated RCU stalls, sometimes even causing unresponsive
> >    systems on boot. On such stalls, it was noticed that in
> >    init_sched_groups_capacity() the group walk never terminates
> >    (sg != sd->groups is always true).
> 
> The idea was that if the topology level is redundant (as it often is);
> then the degenerate code would take it out.
> 
> Why is that not working (right) and can we fix that instead?
> 

Here is my analysis on another box showing the same issue.
numactl output:

available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39 64 65 66 67 68 69 70 71 96 97 98 99 100 101 102 103 128 129 130 131 132 133 134 135 160 161 162 163 164 165 166 167 192 193 194 195 196 197 198 199 224 225 226 227 228 229 230 231 256 257 258 259 260 261 262 263 288 289 290 291 292 293 294 295
node 0 size: 536888 MB
node 0 free: 533582 MB
node 1 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63 88 89 90 91 92 93 94 95 120 121 122 123 124 125 126 127 152 153 154 155 156 157 158 159 184 185 186 187 188 189 190 191 216 217 218 219 220 221 222 223 248 249 250 251 252 253 254 255 280 281 282 283 284 285 286 287
node 1 size: 502286 MB
node 1 free: 501283 MB
node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55 80 81 82 83 84 85 86 87 112 113 114 115 116 117 118 119 144 145 146 147 148 149 150 151 176 177 178 179 180 181 182 183 208 209 210 211 212 213 214 215 240 241 242 243 244 245 246 247 272 273 274 275 276 277 278 279
node 2 size: 503054 MB
node 2 free: 502854 MB
node 3 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47 72 73 74 75 76 77 78 79 104 105 106 107 108 109 110 111 136 137 138 139 140 141 142 143 168 169 170 171 172 173 174 175 200 201 202 203 204 205 206 207 232 233 234 235 236 237 238 239 264 265 266 267 268 269 270 271 296 297 298 299 300 301 302 303
node 3 size: 503310 MB
node 3 free: 498465 MB
node distances:
node   0   1   2   3
  0:  10  40  40  40
  1:  40  10  40  40
  2:  40  40  10  40
  3:  40  40  40  10

Extracting the contents of dmesg using the sched_debug kernel parameter:

CPU0 attaching NULL sched-domain.
CPU1 attaching NULL sched-domain.
....
....
CPU302 attaching NULL sched-domain.
CPU303 attaching NULL sched-domain.
BUG: arch topology borken
     the DIE domain not a subset of the NODE domain
BUG: arch topology borken
     the DIE domain not a subset of the NODE domain
.....
.....
BUG: arch topology borken
     the DIE domain not a subset of the NODE domain
BUG: arch topology borken
     the DIE domain not a subset of the NODE domain
BUG: arch topology borken
     the DIE domain not a subset of the NODE domain

 CPU0 attaching sched-domain(s):
   domain-2: sdA, span=0-303 level=NODE
    groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
CPU1  attaching sched-domain(s):
   domain-2: sdB, span=0-303 level=NODE
[  367.739387]     groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }


CPU8  attaching sched-domain(s):
   domain-2: sdC, span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
    groups: sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
    domain-3: sdD, span=0-303 level=NUMA
     groups: sgX 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sgY 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgZ 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
ERROR: groups don't span domain->span

CPU9  attaching sched-domain(s):
   domain-2: sdE span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
    groups: sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
    domain-3: sdF span=0-303 level=NUMA
     groups: sgP 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sgQ 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgR 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
ERROR: groups don't span domain->span


Trying to summarize further:

+ NODE sched domain groups are initialised with build_sched_groups() (which
tries to share groups)
+ NUMA sched domain groups are initialised with build_overlap_sched_groups()

Cpu 0: sdA->groups  sgL  ->next= sgM ->next= sgN ->next= sgO
Cpu 1: sdB->groups  sgL  ->next= sgM ->next= sgN ->next= sgO

However
Cpu 8: sdC->groups -> sgM ->next= sgM  (NODE)
Cpu 8: sdD->groups  sgX  ->next= sgY ->next= sgZ (NUMA)
Cpu 9: sdE->groups -> sgM ->next= sgM  (NODE)
Cpu 9: sdF->groups  sgP  ->next= sgQ ->next= sgR (NUMA)

In init_sched_groups_capacity(), when we start with sdB->groups and reach sgM,
sgM->next happens to be sgM, but sdB->groups != sgM, so the walk never
terminates.
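
The walk in question, paraphrasing init_sched_groups_capacity() (not verbatim
kernel code):

	struct sched_group *sg = sd->groups;

	do {
		/* update group weight / capacity ... */
		sg = sg->next;
	} while (sg != sd->groups);	/* never terminates once sgM->next == sgM */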

With non-identity NUMA sched_domains, build_overlap_sched_groups creates new
groups per sched-domain, so the problem is masked.

i.e. on a topology update, the sched_domains_numa_masks aren't getting updated,
causing very weird sched domains. The identity node sched domain further
complicates the problem.

One solution would be to expose sched_domains_numa_masks_set/clear so that the
archs can help build correct/proper sched_domains.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] sched/topology: Set correct numa topology type
       [not found] <reply-to=<20180808081942.GA37418@linux.vnet.ibm.com>
@ 2018-08-10 17:00 ` Srikar Dronamraju
  2018-08-10 17:00   ` [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch Srikar Dronamraju
                     ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-10 17:00 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Srikar Dronamraju, LKML, Mel Gorman, Rik van Riel,
	Thomas Gleixner, Michael Ellerman, Heiko Carstens,
	Suravee Suthikulpanit, Andre Wild, linuxppc-dev

With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
sched domain") the scheduler introduces a new NUMA level. However this
leads to the NUMA topology on 2-node systems no longer being marked as
NUMA_DIRECT. After this commit, it gets reported as NUMA_BACKPLANE,
because sched_domains_numa_levels is now 2 on 2-node systems.

Fix this by allowing systems that have up to 2 NUMA levels to be set as
NUMA_DIRECT.

While here, remove code that assumes 'level' can be 0.

Fixes: 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 kernel/sched/topology.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index a6e6b855ba81..cec3ee3ed320 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1315,7 +1315,7 @@ static void init_numa_topology_type(void)
 
 	n = sched_max_numa_distance;
 
-	if (sched_domains_numa_levels <= 1) {
+	if (sched_domains_numa_levels <= 2) {
 		sched_numa_topology_type = NUMA_DIRECT;
 		return;
 	}
@@ -1400,9 +1400,6 @@ void sched_init_numa(void)
 			break;
 	}
 
-	if (!level)
-		return;
-
 	/*
 	 * 'level' contains the number of unique distances
 	 *
-- 
2.12.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch
  2018-08-10 17:00 ` [PATCH 1/2] sched/topology: Set correct numa topology type Srikar Dronamraju
@ 2018-08-10 17:00   ` Srikar Dronamraju
  2018-08-29  8:02     ` Peter Zijlstra
  2018-08-21 11:02   ` [PATCH 1/2] sched/topology: Set correct numa topology type Srikar Dronamraju
  2018-09-10 10:06   ` [tip:sched/core] sched/topology: Set correct NUMA " tip-bot for Srikar Dronamraju
  2 siblings, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-10 17:00 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Srikar Dronamraju, LKML, Mel Gorman, Rik van Riel,
	Thomas Gleixner, Michael Ellerman, Heiko Carstens,
	Suravee Suthikulpanit, Andre Wild, linuxppc-dev

With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
sched domain") the scheduler introduces a new NUMA level. However on shared
LPARs, such as on powerpc, this extra sched domain creation can lead to
repeated RCU stalls, sometimes even causing unresponsive systems on
boot. On such stalls, it was noticed that in init_sched_groups_capacity()
the group walk never terminates (sg != sd->groups is always true).

INFO: rcu_sched self-detected stall on CPU
 1-....: (240039 ticks this GP) idle=c32/1/4611686018427387906 softirq=782/782 fqs=80012
  (t=240039 jiffies g=6272 c=6271 q=263040)
NMI backtrace for cpu 1
CPU: 1 PID: 1576 Comm: kworker/1:1 Kdump: loaded Tainted: G            E     4.18.0-rc7-master+ #42
Workqueue: events topology_work_fn
Call Trace:
[c00000832132f190] [c0000000009557ac] dump_stack+0xb0/0xf4 (unreliable)
[c00000832132f1d0] [c00000000095ed54] nmi_cpu_backtrace+0x1b4/0x230
[c00000832132f270] [c00000000095efac] nmi_trigger_cpumask_backtrace+0x1dc/0x220
[c00000832132f310] [c00000000005f77c] arch_trigger_cpumask_backtrace+0x2c/0x40
[c00000832132f330] [c0000000001a32d4] rcu_dump_cpu_stacks+0x100/0x15c
[c00000832132f380] [c0000000001a2024] rcu_check_callbacks+0x894/0xaa0
[c00000832132f4a0] [c0000000001ace9c] update_process_times+0x4c/0xa0
[c00000832132f4d0] [c0000000001c5400] tick_sched_handle.isra.13+0x50/0x80
[c00000832132f4f0] [c0000000001c549c] tick_sched_timer+0x6c/0xd0
[c00000832132f530] [c0000000001ae044] __hrtimer_run_queues+0x134/0x360
[c00000832132f5b0] [c0000000001aeea4] hrtimer_interrupt+0x124/0x300
[c00000832132f660] [c000000000024a04] timer_interrupt+0x114/0x2f0
[c00000832132f6c0] [c0000000000090f4] decrementer_common+0x114/0x120
--- interrupt: 901 at __bitmap_weight+0x70/0x100
    LR = __bitmap_weight+0x78/0x100
[c00000832132f9b0] [c0000000009bb738] __func__.61127+0x0/0x20 (unreliable)
[c00000832132fa00] [c00000000016c178] build_sched_domains+0xf98/0x13f0
[c00000832132fb30] [c00000000016d73c] partition_sched_domains+0x26c/0x440
[c00000832132fc20] [c0000000001ee284] rebuild_sched_domains_locked+0x64/0x80
[c00000832132fc50] [c0000000001f11ec] rebuild_sched_domains+0x3c/0x60
[c00000832132fc80] [c00000000007e1c4] topology_work_fn+0x24/0x40
[c00000832132fca0] [c000000000126704] process_one_work+0x1a4/0x470
[c00000832132fd30] [c000000000126a68] worker_thread+0x98/0x540
[c00000832132fdc0] [c00000000012f078] kthread+0x168/0x1b0
[c00000832132fe30] [c00000000000b65c] ret_from_kernel_thread+0x5c/0x80

A similar problem was also reported earlier at
https://lwn.net/ml/linux-kernel/20180512100233.GB3738@osiris/

Allow the arch to set and clear the masks corresponding to the NUMA sched
domains.

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Fixes: 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 include/linux/sched/topology.h | 6 ++++++
 kernel/sched/sched.h           | 4 ----
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 26347741ba50..13c7baeb7789 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -52,6 +52,12 @@ static inline int cpu_numa_flags(void)
 {
 	return SD_NUMA;
 }
+
+extern void sched_domains_numa_masks_set(unsigned int cpu);
+extern void sched_domains_numa_masks_clear(unsigned int cpu);
+#else
+static inline void sched_domains_numa_masks_set(unsigned int cpu) { }
+static inline void sched_domains_numa_masks_clear(unsigned int cpu) { }
 #endif
 
 extern int arch_asym_cpu_priority(int cpu);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c7742dcc136c..1028f3df8777 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1057,12 +1057,8 @@ extern bool find_numa_distance(int distance);
 
 #ifdef CONFIG_NUMA
 extern void sched_init_numa(void);
-extern void sched_domains_numa_masks_set(unsigned int cpu);
-extern void sched_domains_numa_masks_clear(unsigned int cpu);
 #else
 static inline void sched_init_numa(void) { }
-static inline void sched_domains_numa_masks_set(unsigned int cpu) { }
-static inline void sched_domains_numa_masks_clear(unsigned int cpu) { }
 #endif
 
 #ifdef CONFIG_NUMA_BALANCING
-- 
2.12.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] sched/topology: Set correct numa topology type
  2018-08-10 17:00 ` [PATCH 1/2] sched/topology: Set correct numa topology type Srikar Dronamraju
  2018-08-10 17:00   ` [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch Srikar Dronamraju
@ 2018-08-21 11:02   ` Srikar Dronamraju
  2018-08-21 13:59     ` Peter Zijlstra
  2018-09-10 10:06   ` [tip:sched/core] sched/topology: Set correct NUMA " tip-bot for Srikar Dronamraju
  2 siblings, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-21 11:02 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	Andre Wild, linuxppc-dev

* Srikar Dronamraju <srikar@linux.vnet.ibm.com> [2018-08-10 22:30:18]:

> > With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
> > sched domain") the scheduler introduces a new NUMA level. However this
> > leads to the NUMA topology on 2-node systems no longer being marked as
> > NUMA_DIRECT. After this commit, it gets reported as NUMA_BACKPLANE,
> > because sched_domains_numa_levels is now 2 on 2-node systems.
> > 
> > Fix this by allowing systems that have up to 2 NUMA levels to be set as
> > NUMA_DIRECT.
> > 
> > While here, remove code that assumes 'level' can be 0.
> > 
> > Fixes: 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---

Hey Peter,

Did you look at these two patches?

-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] sched/topology: Set correct numa topology type
  2018-08-21 11:02   ` [PATCH 1/2] sched/topology: Set correct numa topology type Srikar Dronamraju
@ 2018-08-21 13:59     ` Peter Zijlstra
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-21 13:59 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	Andre Wild, linuxppc-dev

On Tue, Aug 21, 2018 at 04:02:58AM -0700, Srikar Dronamraju wrote:
> * Srikar Dronamraju <srikar@linux.vnet.ibm.com> [2018-08-10 22:30:18]:
> 
> > > With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
> > > sched domain") the scheduler introduces a new NUMA level. However this
> > > leads to the NUMA topology on 2-node systems no longer being marked as
> > > NUMA_DIRECT. After this commit, it gets reported as NUMA_BACKPLANE,
> > > because sched_domains_numa_levels is now 2 on 2-node systems.
> > > 
> > > Fix this by allowing systems that have up to 2 NUMA levels to be set as
> > > NUMA_DIRECT.
> > > 
> > > While here, remove code that assumes 'level' can be 0.
> > > 
> > > Fixes: 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")
> > Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> > ---
> 
> Hey Peter,
> 
> Did you look at these two patches?

Nope, been on holidays and the inbox is an even bigger mess than normal.
I'll get to it, eventually :/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch
  2018-08-10 17:00   ` [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch Srikar Dronamraju
@ 2018-08-29  8:02     ` Peter Zijlstra
  2018-08-31 10:27       ` Srikar Dronamraju
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-29  8:02 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	Andre Wild, linuxppc-dev

On Fri, Aug 10, 2018 at 10:30:19PM +0530, Srikar Dronamraju wrote:
> With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
> sched domain") the scheduler introduces a new NUMA level. However on shared
> LPARs, such as on powerpc, this extra sched domain creation can lead to
> repeated RCU stalls, sometimes even causing unresponsive systems on
> boot. On such stalls, it was noticed that in init_sched_groups_capacity()
> the group walk never terminates (sg != sd->groups is always true).
> 
> INFO: rcu_sched self-detected stall on CPU
>  1-....: (240039 ticks this GP) idle=c32/1/4611686018427387906 softirq=782/782 fqs=80012
>   (t=240039 jiffies g=6272 c=6271 q=263040)
> NMI backtrace for cpu 1

> --- interrupt: 901 at __bitmap_weight+0x70/0x100
>     LR = __bitmap_weight+0x78/0x100
> [c00000832132f9b0] [c0000000009bb738] __func__.61127+0x0/0x20 (unreliable)
> [c00000832132fa00] [c00000000016c178] build_sched_domains+0xf98/0x13f0
> [c00000832132fb30] [c00000000016d73c] partition_sched_domains+0x26c/0x440
> [c00000832132fc20] [c0000000001ee284] rebuild_sched_domains_locked+0x64/0x80
> [c00000832132fc50] [c0000000001f11ec] rebuild_sched_domains+0x3c/0x60
> [c00000832132fc80] [c00000000007e1c4] topology_work_fn+0x24/0x40
> [c00000832132fca0] [c000000000126704] process_one_work+0x1a4/0x470
> [c00000832132fd30] [c000000000126a68] worker_thread+0x98/0x540
> [c00000832132fdc0] [c00000000012f078] kthread+0x168/0x1b0
> [c00000832132fe30] [c00000000000b65c]
> ret_from_kernel_thread+0x5c/0x80
> 
> A similar problem was also reported earlier at
> https://lwn.net/ml/linux-kernel/20180512100233.GB3738@osiris/
> 
> Allow the arch to set and clear the masks corresponding to the NUMA sched
> domains.

What this Changelog fails to do is explain the problem and motivate why
this is the right solution.

As-is, this reads like, something's buggered, I changed this random thing
and it now works.

So what is causing that domain construction error?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-10 16:45   ` Srikar Dronamraju
@ 2018-08-29  8:43     ` Peter Zijlstra
  2018-08-29  8:57       ` Peter Zijlstra
  2018-08-31 10:22       ` Srikar Dronamraju
  0 siblings, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-29  8:43 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Andre Wild

On Fri, Aug 10, 2018 at 09:45:33AM -0700, Srikar Dronamraju wrote:

> available: 4 nodes (0-3)
> node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39 64 65 66 67 68 69 70 71 96 97 98 99 100 101 102 103 128 129 130 131 132 133 134 135 160 161 162 163 164 165 166 167 192 193 194 195 196 197 198 199 224 225 226 227 228 229 230 231 256 257 258 259 260 261 262 263 288 289 290 291 292 293 294 295
> node 0 size: 536888 MB
> node 0 free: 533582 MB
> node 1 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63 88 89 90 91 92 93 94 95 120 121 122 123 124 125 126 127 152 153 154 155 156 157 158 159 184 185 186 187 188 189 190 191 216 217 218 219 220 221 222 223 248 249 250 251 252 253 254 255 280 281 282 283 284 285 286 287
> node 1 size: 502286 MB
> node 1 free: 501283 MB
> node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55 80 81 82 83 84 85 86 87 112 113 114 115 116 117 118 119 144 145 146 147 148 149 150 151 176 177 178 179 180 181 182 183 208 209 210 211 212 213 214 215 240 241 242 243 244 245 246 247 272 273 274 275 276 277 278 279
> node 2 size: 503054 MB
> node 2 free: 502854 MB
> node 3 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47 72 73 74 75 76 77 78 79 104 105 106 107 108 109 110 111 136 137 138 139 140 141 142 143 168 169 170 171 172 173 174 175 200 201 202 203 204 205 206 207 232 233 234 235 236 237 238 239 264 265 266 267 268 269 270 271 296 297 298 299 300 301 302 303
> node 3 size: 503310 MB
> node 3 free: 498465 MB
> node distances:
> node   0   1   2   3
>   0:  10  40  40  40
>   1:  40  10  40  40
>   2:  40  40  10  40
>   3:  40  40  40  10
> 
> Extracting the contents of dmesg using sched_debug kernel parameter
> 
> CPU0 attaching NULL sched-domain.
> CPU1 attaching NULL sched-domain.
> ....
> ....
> CPU302 attaching NULL sched-domain.
> CPU303 attaching NULL sched-domain.
> BUG: arch topology borken
>      the DIE domain not a subset of the NODE domain

^^^^^ CLUE!!

but nowhere did you show what it thinks the DIE mask is.

>  CPU0 attaching sched-domain(s):
>    domain-2: sdA, span=0-303 level=NODE
>     groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> CPU1  attaching sched-domain(s):
>    domain-2: sdB, span=0-303 level=NODE
> [  367.739387]     groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }

You forgot to provide the rest of it... what's domain-[01] look like?

DIE(j) should be:

	cpu_cpu_mask(j) := cpumask_of_node(cpu_to_node(j))

and NODE(j) should be:

	\Union_k cpumask_of_node(k) ; where node_distance(j,k) <= node_distance(0,0)

which, _should_ reduce to:

	cpumask_of_node(j)

and thus DIE and NODE _should_ be the same here.
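
For reference, the level 0 mask is built in sched_init_numa() roughly like
this (paraphrased, not verbatim):

	/* sched_domains_numa_masks[0][j] */
	for_each_node(k) {
		if (node_distance(j, k) > sched_domains_numa_distance[0])
			continue;
		cpumask_or(mask, mask, cpumask_of_node(k));
	}

with sched_domains_numa_distance[0] == node_distance(0,0), so the mask can
only contain nodes at local distance.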

So what's going sideways?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-29  8:43     ` Peter Zijlstra
@ 2018-08-29  8:57       ` Peter Zijlstra
  2018-08-31 10:22       ` Srikar Dronamraju
  1 sibling, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-29  8:57 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Andre Wild

On Wed, Aug 29, 2018 at 10:43:48AM +0200, Peter Zijlstra wrote:
> DIE(j) should be:
> 
> 	cpu_cpu_mask(j) := cpumask_of_node(cpu_to_node(j))

FWIW, I was expecting that to be topology_core_cpumask(), so I'm a
little confused myself just now.

> and NODE(j) should be:
> 
> 	\Union_k cpumask_of_node(k) ; where node_distance(j,k) <= node_distance(0,0)
> 
> which, _should_ reduce to:
> 
> 	cpumask_of_node(j)
> 
> and thus DIE and NODE _should_ be the same here.
> 
> So what's going sideways?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-29  8:43     ` Peter Zijlstra
  2018-08-29  8:57       ` Peter Zijlstra
@ 2018-08-31 10:22       ` Srikar Dronamraju
  2018-08-31 10:41         ` Peter Zijlstra
  1 sibling, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-31 10:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Andre Wild, Benjamin Herrenschmidt

* Peter Zijlstra <peterz@infradead.org> [2018-08-29 10:43:48]:

> On Fri, Aug 10, 2018 at 09:45:33AM -0700, Srikar Dronamraju wrote:
> 
> > ....
> > CPU302 attaching NULL sched-domain.
> > CPU303 attaching NULL sched-domain.
> > BUG: arch topology borken
> >      the DIE domain not a subset of the NODE domain
> 
> ^^^^^ CLUE!!
> 
> but nowhere did you show what it thinks the DIE mask is.
> 
> >  CPU0 attaching sched-domain(s):
> >    domain-2: sdA, span=0-303 level=NODE
> >     groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> > CPU1  attaching sched-domain(s):
> >    domain-2: sdB, span=0-303 level=NODE
> > [  367.739387]     groups: sg=sgL 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, sgM 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, sdN 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, sgO 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> 
> You forgot to provide the rest of it... what's domain-[01] look like?

At boot: Before topology update.

For  CPU 0 
domain-0: span=0-7 level=SMT
 groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }
 domain-1: span=0-303 level=DIE
  groups: 0:{ span=0-7 cap=8192 }, 8:{ span=8-15 cap=8192 }, 16:{ span=16-23 cap=8192 }, 24:{ span=24-31 cap=8192 }, 32:{ span=32-39 cap=8192 }, 40:{ span=40-47 cap=8192 }, 48:{ span=48-55 cap=8192 }, 56:{ span=56-63 cap=8192 }, 64:{ span=64-71 cap=8192 }, 72:{ span=72-79 cap=8192 }, 80:{ span=80-87 cap=8192 }, 88:{ span=88-95 cap=8192 }, 96:{ span=96-103 cap=8192 }, 104:{ span=104-111 cap=8192 }, 112:{ span=112-119 cap=8192 }, 120:{ span=120-127 cap=8192 }, 128:{ span=128-135 cap=8192 }, 136:{ span=136-143 cap=8192 }, 144:{ span=144-151 cap=8192 }, 152:{ span=152-159 cap=8192 }, 160:{ span=160-167 cap=8192 }, 168:{ span=168-175 cap=8192 }, 176:{ span=176-183 cap=8192 }, 184:{ span=184-191 cap=8192 }, 192:{ span=192-199 cap=8192 }, 200:{ span=200-207 cap=8192 }, 208:{ span=208-215 cap=8192 }, 216:{ span=216-223 cap=8192 }, 224:{ span=224-231 cap=8192 }, 232:{ span=232-239 cap=8192 }, 240:{ span=240-247 cap=8192 }, 248:{ span=248-255 cap=8192 }, 256:{ span=256-263 cap=8192 }, 264:{ sp
 an=264-271 cap=8192 }, 272:{ span=272-279 cap=8192 }, 280:{ span=280-287 cap=8192 }, 288:{ span=288-295 cap=8192 }, 296:{ span=296-303 cap=8192 }

For  CPU 1 
domain-0: span=0-7 level=SMT
 groups: 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }, 0:{ span=0 }
 domain-1: span=0-303 level=DIE
  groups: 0:{ span=0-7 cap=8192 }, 8:{ span=8-15 cap=8192 }, 16:{ span=16-23 cap=8192 }, 24:{ span=24-31 cap=8192 }, 32:{ span=32-39 cap=8192 }, 40:{ span=40-47 cap=8192 }, 48:{ span=48-55 cap=8192 }, 56:{ span=56-63 cap=8192 }, 64:{ span=64-71 cap=8192 }, 72:{ span=72-79 cap=8192 }, 80:{ span=80-87 cap=8192 }, 88:{ span=88-95 cap=8192 }, 96:{ span=96-103 cap=8192 }, 104:{ span=104-111 cap=8192 }, 112:{ span=112-119 cap=8192 }, 120:{ span=120-127 cap=8192 }, 128:{ span=128-135 cap=8192 }, 136:{ span=136-143 cap=8192 }, 144:{ span=144-151 cap=8192 }, 152:{ span=152-159 cap=8192 }, 160:{ span=160-167 cap=8192 }, 168:{ span=168-175 cap=8192 }, 176:{ span=176-183 cap=8192 }, 184:{ span=184-191 cap=8192 }, 192:{ span=192-199 cap=8192 }, 200:{ span=200-207 cap=8192 }, 208:{ span=208-215 cap=8192}, 216:{ span=216-223 cap=8192 }, 224:{ span=224-231 cap=8192 }, 232:{ span=232-239 cap=8192 }, 240:{ span=240-247 cap=8192 }, 248:{ span=248-255 cap=8192 }, 256:{ span=256-263 cap=8192 }, 264:{ spa
 n=264-271 cap=8192 }, 272:{ span=272-279 cap=8192 }, 280:{ span=280-287 cap=8192 }, 288:{ span=288-295 cap=8192 }, 296:{ span=296-303 cap=8192 }


For  CPU 8
domain-0: span=8-15 level=SMT
 groups: 8:{ span=8 }, 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }
 domain-1: span=0-303 level=DIE
  groups: 8:{ span=8-15 cap=8192 }, 16:{ span=16-23 cap=8192 }, 24:{ span=24-31 cap=8192 }, 32:{ span=32-39 cap=8192 }, 40:{ span=40-47 cap=8192 }, 48:{ span=48-55 cap=8192 }, 56:{ span=56-63 cap=8192 }, 64:{ span=64-71 cap=8192 }, 72:{ span=72-79 cap=8192 }, 80:{ span=80-87 cap=8192 }, 88:{ span=88-95 cap=8192 }, 96:{ span=96-103 cap=8192 }, 104:{ span=104-111 cap=8192 }, 112:{ span=112-119 cap=8192 }, 120:{ span=120-127 cap=8192 }, 128:{ span=128-135 cap=8192 }, 136:{ span=136-143 cap=8192 }, 144:{ span=144-151 cap=8192 }, 152:{ span=152-159 cap=8192 }, 160:{ span=160-167 cap=8192 }, 168:{ span=168-175 cap=8192 }, 176:{ span=176-183 cap=8192 }, 184:{ span=184-191 cap=8192 }, 192:{ span=192-199 cap=8192 }, 200:{ span=200-207 cap=8192 }, 208:{ span=208-215 cap=8192 }, 216:{ span=216-223 cap=8192 }, 224:{ span=224-231 cap=8192 }, 232:{ span=232-239 cap=8192 }, 240:{ span=240-247 cap=8192 }, 248:{ span=248-255 cap=8192 }, 256:{ span=256-263 cap=8192 }, 264:{ span=264-271 cap=8192 }, 27
 2:{ span=272-279 cap=8192 }, 280:{ span=280-287 cap=8192 }, 288:{ span=288-295 cap=8192 }, 296:{ span=296-303 cap=8192 }, 0:{ span=0-7 cap=8192 }

For  CPU 9 
domain-0: span=8-15 level=SMT
 groups: 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }, 8:{ span=8 }
 domain-1: span=0-303 level=DIE
  groups: 8:{ span=8-15 cap=8192 }, 16:{ span=16-23 cap=8192 }, 24:{ span=24-31 cap=8192 }, 32:{ span=32-39 cap=8192 }, 40:{ span=40-47 cap=8192 }, 48:{ span=48-55 cap=8192 }, 56:{ span=56-63 cap=8192 }, 64:{ span=64-71 cap=8192 }, 72:{ span=72-79 cap=8192 }, 80:{ span=80-87 cap=8192 }, 88:{ span=88-95 cap=8192 }, 96:{ span=96-103 cap=8192 }, 104:{ span=104-111 cap=8192 }, 112:{ span=112-119 cap=8192 }, 120:{ span=120-127 cap=8192 }, 128:{ span=128-135 cap=8192 }, 136:{ span=136-143 cap=8192 }, 144:{ span=144-151 cap=8192 }, 152:{ span=152-159 cap=8192 }, 160:{ span=160-167 cap=8192 }, 168:{ span=168-175 cap=8192 }, 176:{ span=176-183 cap=8192 }, 184:{ span=184-191 cap=8192 }, 192:{ span=192-199 cap=8192 }, 200:{ span=200-207 cap=8192 }, 208:{ span=208-215 cap=8192 }, 216:{ span=216-223 cap=8192 }, 224:{ span=224-231 cap=8192 }, 232:{ span=232-239 cap=8192 }, 240:{ span=240-247 cap=8192 }, 248:{ span=248-255 cap=8192 }, 256:{ span=256-263 cap=8192 }, 264:{ span=264-271 cap=8192 }, 27
 2:{ span=272-279 cap=8192 }, 280:{ span=280-287 cap=8192 }, 288:{ span=288-295 cap=8192 }, 296:{ span=296-303 cap=8192 }, 0:{ span=0-7 cap=8192 }


After topology update.

For CPU 0
domain-0: span=0-7 level=SMT
 groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }
 domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE
  groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 }
  domain-2: span=0-303 level=NODE
   groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }

For CPU 1
domain-0: span=0-7 level=SMT
 groups: 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }, 0:{ span=0 }
 domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE
  groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 }
  domain-2: span=0-303 level=NODE
   groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }


For CPU 8
 domain-0: span=8-15 level=SMT
  groups: 8:{ span=8 }, 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }
  domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE
   groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 }
   domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
    groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
    domain-3: span=0-303 level=NUMA
     groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
ERROR: groups don't span domain->span

For CPU 9
 domain-0: span=8-15 level=SMT
  groups: 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }, 8:{ span=8 }
  domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE
   groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 }
   domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
    groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
    domain-3: span=0-303 level=NUMA
     groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
ERROR: groups don't span domain->span

-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch
  2018-08-29  8:02     ` Peter Zijlstra
@ 2018-08-31 10:27       ` Srikar Dronamraju
  2018-08-31 11:12         ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-31 10:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Benjamin Herrenschmidt

* Peter Zijlstra <peterz@infradead.org> [2018-08-29 10:02:19]:

> On Fri, Aug 10, 2018 at 10:30:19PM +0530, Srikar Dronamraju wrote:
> > With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node
> > sched domain") the scheduler introduces a new NUMA level. However on shared
> > LPARs, such as on powerpc, this extra sched domain creation can lead to
> > repeated RCU stalls, sometimes even causing unresponsive systems on
> > boot. On such stalls, it was noticed that in init_sched_groups_capacity()
> > the group walk never terminates (sg != sd->groups is always true).
> > 
> > INFO: rcu_sched self-detected stall on CPU
> >  1-....: (240039 ticks this GP) idle=c32/1/4611686018427387906 softirq=782/782 fqs=80012
> >   (t=240039 jiffies g=6272 c=6271 q=263040)
> > NMI backtrace for cpu 1
> 
> > --- interrupt: 901 at __bitmap_weight+0x70/0x100
> >     LR = __bitmap_weight+0x78/0x100
> > [c00000832132f9b0] [c0000000009bb738] __func__.61127+0x0/0x20 (unreliable)
> > [c00000832132fa00] [c00000000016c178] build_sched_domains+0xf98/0x13f0
> > [c00000832132fb30] [c00000000016d73c] partition_sched_domains+0x26c/0x440
> > [c00000832132fc20] [c0000000001ee284] rebuild_sched_domains_locked+0x64/0x80
> > [c00000832132fc50] [c0000000001f11ec] rebuild_sched_domains+0x3c/0x60
> > [c00000832132fc80] [c00000000007e1c4] topology_work_fn+0x24/0x40
> > [c00000832132fca0] [c000000000126704] process_one_work+0x1a4/0x470
> > [c00000832132fd30] [c000000000126a68] worker_thread+0x98/0x540
> > [c00000832132fdc0] [c00000000012f078] kthread+0x168/0x1b0
> > [c00000832132fe30] [c00000000000b65c]
> > ret_from_kernel_thread+0x5c/0x80
> > 
> > A similar problem was also reported earlier at
> > https://lwn.net/ml/linux-kernel/20180512100233.GB3738@osiris/
> > 
> > Allow the arch to set and clear the masks corresponding to the NUMA sched
> > domains.
> 

> What this Changelog fails to do is explain the problem and motivate why
> this is the right solution.
> 
> As-is, this reads like, something's buggered, I changed this random thing
> and it now works.
> 
> So what is causing that domain construction error?
> 

Powerpc LPARs running on PHYP have 2 modes: dedicated and shared.

Dedicated LPARs are similar to a KVM guest with vcpupin.

Shared LPARs are similar to a KVM guest without any pinning. When running in
shared LPAR mode, PHYP allows overcommitting. Now if more LPARs are
created/destroyed, PHYP will internally move / consolidate the cores. The
objective is similar to what autonuma tries to achieve on the host, but with a
different approach (consolidating onto optimal nodes to achieve the best
possible output).  This means that the actual underlying CPU/node
mapping has changed. PHYP will propagate an event upwards to the LPAR.  The
LPAR / OS can choose to ignore or act on it.

We have found that acting on the event can provide up to 40% improvement
over ignoring the event. Acting on the event means moving the CPU from
one node to the other, and topology_work_fn() does exactly that.

In the case where we didn't have the identity NODE sched domain, we would
build independent (aka overlapping) sched_groups. With the identity NODE
sched domain's introduction, we try to reuse sched_groups (aka
non-overlapping). This results in the above, which I tried to explain in
https://lwn.net/ml/linux-kernel/20180810164533.GB42350@linux.vnet.ibm.com

In the typical case above, let's take 2 nodes and 8 cores, with each core
having 8 SMT threads.  Initially all the 8 cores might come from node 0.
Hence sched_domains_numa_masks[NODE][node1] and
sched_domains_numa_masks[NUMA][node1], which are set at sched_init_numa()
time, will have blank cpumasks.

Let's say PHYP decides to move some of the load to another node, node 1, which
till now had 0 CPUs.  Hence we will see

"BUG: arch topology borken \n the DIE domain not a subset of the NODE
domain", which is probably okay. This problem was present even before the
NODE domain was created, and systems still booted and ran.

However, with the introduction of the NODE sched_domain,
init_sched_groups_capacity() gets called for non-overlapping sched_domains,
which gets us into even worse problems. Here we end up in a situation where
sgA->sgB->sgC->sgD->sgA gets converted into sgA->sgB->sgC->sgB, which ends up
creating CPU stalls.

So the request is to expose sched_domains_numa_masks_set() /
sched_domains_numa_masks_clear() to the arch, so that on a topology update,
i.e. an event from PHYP, the arch sets the masks correctly. The scheduler
seems to take care of everything else. A rough sketch of the intended arch
usage follows below.
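
Illustrative only (the exact powerpc flow in topology_work_fn() /
arch_update_cpu_topology() differs in detail); something like:

	/* cpu is moving from its current node to new_nid on a PHYP event */
	sched_domains_numa_masks_clear(cpu);
	set_cpu_numa_node(cpu, new_nid);
	sched_domains_numa_masks_set(cpu);
	/* ... followed by the usual rebuild_sched_domains() */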

-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-31 10:22       ` Srikar Dronamraju
@ 2018-08-31 10:41         ` Peter Zijlstra
  2018-08-31 11:26           ` Srikar Dronamraju
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-31 10:41 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Andre Wild, Benjamin Herrenschmidt

On Fri, Aug 31, 2018 at 03:22:48AM -0700, Srikar Dronamraju wrote:

> At boot: Before topology update.

How does that work; you do SMP bringup _before_ you know the topology !?

> After topology update.
> 
> For CPU 0
> domain-0: span=0-7 level=SMT
>  groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }
>  domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE
>   groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 }
>   domain-2: span=0-303 level=NODE
>    groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> 
> For CPU 1
> domain-0: span=0-7 level=SMT
>  groups: 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }, 0:{ span=0 }
>  domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE
>   groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 }
>   domain-2: span=0-303 level=NODE
>    groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> 
> 
> For CPU 8
>  domain-0: span=8-15 level=SMT
>   groups: 8:{ span=8 }, 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }
>   domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE
>    groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 }
>    domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
>     groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
>     domain-3: span=0-303 level=NUMA
>      groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> ERROR: groups don't span domain->span
> 
> For CPU 9
>  domain-0: span=8-15 level=SMT
>   groups: 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }, 8:{ span=8 }
>   domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE
>    groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 }
>    domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
>     groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
>     domain-3: span=0-303 level=NUMA
>      groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> ERROR: groups don't span domain->span

This is all very confused... and does not include the error we saw
earlier.

CPU 0 has: SMT, DIE, NODE
CPU 8 has: SMT, DIE, NODE, NUMA

Something is completely buggered in your topology setup.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch
  2018-08-31 10:27       ` Srikar Dronamraju
@ 2018-08-31 11:12         ` Peter Zijlstra
  2018-08-31 11:26           ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-31 11:12 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Benjamin Herrenschmidt

On Fri, Aug 31, 2018 at 03:27:24AM -0700, Srikar Dronamraju wrote:
> * Peter Zijlstra <peterz@infradead.org> [2018-08-29 10:02:19]:


> Powerpc lpars running on Phyp have 2 modes: dedicated and shared.
> 
> Dedicated lpars are similar to a kvm guest with vcpupin.

Like I know what that means... I'm not big on virt. I suppose you're
saying it has a fixed virt to phys mapping.

> Shared lpars are similar to kvm guests without any pinning. When running
> shared lpar mode, Phyp allows overcommitting. Now if more lpars are
> created/destroyed, Phyp will internally move / consolidate the cores. The
> objective is similar to what autonuma tries to achieve on the host, but with
> a different approach (consolidating to optimal nodes to achieve the best
> possible output). This would mean that the actual underlying cpus/node
> mapping has changed.

AFAIK Linux can _not_ handle cpu:node relations changing. And I'm pretty
sure I told you that before.

> Phyp will propagate an event upwards to the lpar.  The
> lpar / OS can choose to ignore or act on it.
>
> We have found that acting on the event provides up to a 40% improvement
> over ignoring it. Acting on the event means moving the cpu from
> one node to the other, and topology_work_fn does exactly that.

How? Last time I checked there was a ton of code that relies on
cpu_to_node() not changing during the runtime of the kernel.

Stuff like the per-cpu memory allocations are done using the boot time
cpu_to_node() map for instance. Similarly, kthread creation uses the
cpu_to_node() map at the time of creation.

A lot of stuff is not re-evaluated. If you're dynamically changing the
node map, you're in for a world of hurt.
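
For example, something along these lines (illustrative, from memory of
kernel/kthread.c) bakes in whatever cpu_to_node() returns at creation time:

	/*
	 * Illustrative: the NUMA node used for the kthread's allocations is
	 * whatever cpu_to_node(cpu) returns when the thread is created;
	 * nothing re-evaluates it if the cpu later ends up on a different
	 * node.  threadfn/data/cpu come from the surrounding context.
	 */
	struct task_struct *p;

	p = kthread_create_on_node(threadfn, data, cpu_to_node(cpu),
				   "worker/%u", cpu);
	if (!IS_ERR(p))
		kthread_bind(p, cpu);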

> In the case where we didn't have the NUMA sched domain, we would build
> independent (aka overlap) sched_groups. With the NUMA sched domain
> introduction, we try to reuse sched_groups (aka non-overlap). This results
> in the above, which I tried to explain in
> https://lwn.net/ml/linux-kernel/20180810164533.GB42350@linux.vnet.ibm.com

That email was a ton of confusion; you show an error and you don't
explain how you get there.

> In the typical case above, let's take a system with 2 nodes and 8 cores,
> each core having 8 SMT threads. Initially all the 8 cores might come from
> node 0. Hence sched_domains_numa_masks[NODE][node1] and
> sched_domains_numa_masks[NUMA][node1], which are set in sched_init_numa(),
> will have blank cpumasks.
> 
> Let's say Phyp decides to move some of the load to another node, node 1,
> which till now has 0 cpus. Hence we will see
>
> "BUG: arch topology borken \n the DIE domain not a subset of the NODE
> domain", which is probably okay. This problem was present even before the
> NODE domain was created, and systems still booted and ran.

No that is _NOT_ OKAY. The fact that it boots and runs just means we
cope with it, but it violates a base assumption when building domains.

> However with the introduction of the NODE sched_domain,
> init_sched_groups_capacity() gets called for non-overlap sched_domains,
> which gets us into even worse problems. Here we end up in a situation where
> sgA->sgB->sgC->sgD->sgA gets converted into sgA->sgB->sgC->sgB, which ends
> up creating cpu stalls.
> 
> So the request is to expose sched_domains_numa_masks_set() /
> sched_domains_numa_masks_clear() to the arch, so that on a topology update,
> i.e. an event from Phyp, the arch can set the masks correctly. The scheduler
> seems to take care of everything else.

NAK, not until you've fixed every cpu_to_node() user in the kernel to
deal with that mask changing.

This is absolutely insane.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-31 10:41         ` Peter Zijlstra
@ 2018-08-31 11:26           ` Srikar Dronamraju
  2018-08-31 12:06             ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-31 11:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Andre Wild, Benjamin Herrenschmidt

* Peter Zijlstra <peterz@infradead.org> [2018-08-31 12:41:15]:

> On Fri, Aug 31, 2018 at 03:22:48AM -0700, Srikar Dronamraju wrote:
> 
> > At boot: Before topology update.
> 
> How does that work; you do SMP bringup _before_ you know the topology !?
> 

If you look at the other mail that I sent, the system boots to its regular
state with a certain topology. The hypervisor might detect and push topology
updates after the system has been booted and initialized. This topology
update can happen much later after boot: we boot with a particular topology
and, at a later point in time, the topology update event occurs.


> > After topology update.
> > 
> > For CPU 0
> > domain-0: span=0-7 level=SMT
> >  groups: 0:{ span=0 }, 1:{ span=1 }, 2:{ span=2 }, 3:{ span=3 }, 4:{ span=4 }, 5:{ span=5 }, 6:{ span=6 }, 7:{ span=7 }
> >  domain-1: span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 level=DIE
> >   groups: 0:{ span=0-7 cap=8192 }, 32:{ span=32-39 cap=8192 }, 64:{ span=64-71 cap=8192 }, 96:{ span=96-103 cap=8192 }, 128:{ span=128-135 cap=8192 }, 160:{ span=160-167 cap=8192 }, 192:{ span=192-199 cap=8192 }, 224:{ span=224-231 cap=8192 }, 256:{ span=256-263 cap=8192 }, 288:{ span=288-295 cap=8192 }
> >   domain-2: span=0-303 level=NODE
> >    groups: 0:{ span=0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295 cap=81920 }, 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> > 

> > For CPU 9
> >  domain-0: span=8-15 level=SMT
> >   groups: 9:{ span=9 }, 10:{ span=10 }, 11:{ span=11 }, 12:{ span=12 }, 13:{ span=13 }, 14:{ span=14 }, 15:{ span=15 }, 8:{ span=8 }
> >   domain-1: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=DIE
> >    groups: 8:{ span=8-15 cap=8192 }, 40:{ span=40-47 cap=8192 }, 72:{ span=72-79 cap=8192 }, 104:{ span=104-111 cap=8192 }, 136:{ span=136-143 cap=8192 }, 168:{ span=168-175 cap=8192 }, 200:{ span=200-207 cap=8192 }, 232:{ span=232-239 cap=8192 }, 264:{ span=264-271 cap=8192 }, 296:{ span=296-303 cap=8192 }
> >    domain-2: span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 level=NODE
> >     groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }
> >     domain-3: span=0-303 level=NUMA
> >      groups: 8:{ span=8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303 cap=81920 }, 16:{ span=16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279 cap=73728 }, 24:{ span=24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287 cap=73728 }
> > ERROR: groups don't span domain->span
> 
> This is all very confused... and does not include the error we saw
> earlier.

> 
> CPU 0 has: SMT, DIE, NODE
> CPU 8 has: SMT, DIE, NODE, NUMA
> 

This was the same in my previous posting too. Before the topology update
happened, all the cpus would be in SMT, DIE. The topology updates can be
disabled using the kernel parameter topology_updates=off. It's documented at
https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html as

      topology_updates= [KNL, PPC, NUMA] Format: {off} Specify if the kernel
      should ignore (off) topology updates sent by the hypervisor to this
      LPAR.

and is not something new in powerpc.

> Something is completely buggered in your topology setup.
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch
  2018-08-31 11:12         ` Peter Zijlstra
@ 2018-08-31 11:26           ` Peter Zijlstra
  2018-08-31 11:53             ` Srikar Dronamraju
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-31 11:26 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Benjamin Herrenschmidt

On Fri, Aug 31, 2018 at 01:12:53PM +0200, Peter Zijlstra wrote:
> NAK, not until you've fixed every cpu_to_node() user in the kernel to
> deal with that mask changing.

Also, what happens if userspace reads that information; uses libnuma and
then you go and shift the world underneath their feet?

> This is absolutely insane.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch
  2018-08-31 11:26           ` Peter Zijlstra
@ 2018-08-31 11:53             ` Srikar Dronamraju
  2018-08-31 12:05               ` Peter Zijlstra
  2018-08-31 12:08               ` Peter Zijlstra
  0 siblings, 2 replies; 24+ messages in thread
From: Srikar Dronamraju @ 2018-08-31 11:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Benjamin Herrenschmidt

* Peter Zijlstra <peterz@infradead.org> [2018-08-31 13:26:39]:

> On Fri, Aug 31, 2018 at 01:12:53PM +0200, Peter Zijlstra wrote:
> > NAK, not until you've fixed every cpu_to_node() user in the kernel to
> > deal with that mask changing.
> 
> Also, what happens if userspace reads that information; uses libnuma and
> then you go and shift the world underneath their feet?
> 
> > This is absolutely insane.
> 

The topology events are supposed to be very rare.
From whatever small experiments I have done till now, unless tasks are
bound to both cpu and memory, they seem to cope well with topology
updates. I know things weren't optimal after a topology change, but they
worked. Now, after 051f3ca02e46 "Introduce NUMA identity node sched
domain", systems stall. I am only exploring ways to keep them working
as well as they did before that commit.

-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch
  2018-08-31 11:53             ` Srikar Dronamraju
@ 2018-08-31 12:05               ` Peter Zijlstra
  2018-08-31 12:08               ` Peter Zijlstra
  1 sibling, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-31 12:05 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Benjamin Herrenschmidt

On Fri, Aug 31, 2018 at 04:53:50AM -0700, Srikar Dronamraju wrote:
> * Peter Zijlstra <peterz@infradead.org> [2018-08-31 13:26:39]:
> 
> > On Fri, Aug 31, 2018 at 01:12:53PM +0200, Peter Zijlstra wrote:
> > > NAK, not until you've fixed every cpu_to_node() user in the kernel to
> > > deal with that mask changing.
> > 
> > Also, what happens if userspace reads that information; uses libnuma and
> > then you go and shift the world underneath their feet?
> > 
> > > This is absolutely insane.
> > 
> 
> The topology events are supposed to be very rare.
> From whatever small experiments I have done till now, unless tasks are
> bound to both cpu and memory, they seem to cope well with topology
> updates. I know things weren't optimal after a topology change, but they
> worked. Now, after 051f3ca02e46 "Introduce NUMA identity node sched
> domain", systems stall. I am only exploring ways to keep them working
> as well as they did before that commit.

I'm saying things were fundamentally buggered and this just made it show.

If you cannot guarantee cpu:node relations, you do not have NUMA, end of
story.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] sched/topology: Use Identity node only if required
  2018-08-31 11:26           ` Srikar Dronamraju
@ 2018-08-31 12:06             ` Peter Zijlstra
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-31 12:06 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Andre Wild, Benjamin Herrenschmidt

On Fri, Aug 31, 2018 at 04:56:18PM +0530, Srikar Dronamraju wrote:
> This was the same in my previous posting too. Before the topology update
> happened, all the cpus would be in SMT, DIE. The topology updates can be
> disabled using a kernel parameter topology_updates=off. Its documented under
> https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html as
> 
>       topology_updates= [KNL, PPC, NUMA] Format: {off} Specify if the kernel
>       should ignore (off) topology updates sent by the hypervisor to this
>       LPAR.
> 
> and is not something new in powerpc.

Doesn't mean it isn't utterly broken.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch
  2018-08-31 11:53             ` Srikar Dronamraju
  2018-08-31 12:05               ` Peter Zijlstra
@ 2018-08-31 12:08               ` Peter Zijlstra
  1 sibling, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-08-31 12:08 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, LKML, Mel Gorman, Rik van Riel, Thomas Gleixner,
	Michael Ellerman, Heiko Carstens, Suravee Suthikulpanit,
	linuxppc-dev, Benjamin Herrenschmidt

On Fri, Aug 31, 2018 at 04:53:50AM -0700, Srikar Dronamraju wrote:

> The topology events are supposed to be very rare.
> From whatever small experiments I have done till now, unless tasks are
> bound to both cpu and memory, they seem to cope well with topology
> updates.

IOW, if you're not using NUMA, it works if you change the NUMA setup.

You don't see anything wrong with that?!

Those programs would work as well if you didn't expose the NUMA stuff,
because they're not using it anyway.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/topology: Set correct NUMA topology type
  2018-08-10 17:00 ` [PATCH 1/2] sched/topology: Set correct numa topology type Srikar Dronamraju
  2018-08-10 17:00   ` [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch Srikar Dronamraju
  2018-08-21 11:02   ` [PATCH 1/2] sched/topology: Set correct numa topology type Srikar Dronamraju
@ 2018-09-10 10:06   ` tip-bot for Srikar Dronamraju
  2 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Srikar Dronamraju @ 2018-09-10 10:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, linuxppc-dev, torvalds, wild,
	suravee.suthikulpanit, hpa, mingo, mpe, heiko.carstens, tglx,
	peterz, riel, srikar, mgorman

Commit-ID:  e5e96fafd9028b1478b165db78c52d981c14f471
Gitweb:     https://git.kernel.org/tip/e5e96fafd9028b1478b165db78c52d981c14f471
Author:     Srikar Dronamraju <srikar@linux.vnet.ibm.com>
AuthorDate: Fri, 10 Aug 2018 22:30:18 +0530
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 10 Sep 2018 10:13:45 +0200

sched/topology: Set correct NUMA topology type

With the following commit:

  051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")

the scheduler introduced a new NUMA level. However, this leads to the NUMA topology
on 2 node systems no longer being marked as NUMA_DIRECT.

After this commit, it gets reported as NUMA_BACKPLANE, because
sched_domains_numa_level is now 2 on 2 node systems.

Fix this by allowing systems that have up to 2 NUMA levels to be set as
NUMA_DIRECT.

While here remove code that assumes that level can be 0.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andre Wild <wild@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Fixes: 051f3ca02e46 "Introduce NUMA identity node sched domain"
Link: http://lkml.kernel.org/r/1533920419-17410-1-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/topology.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 56a0fed30c0a..505a41c42b96 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1295,7 +1295,7 @@ static void init_numa_topology_type(void)
 
 	n = sched_max_numa_distance;
 
-	if (sched_domains_numa_levels <= 1) {
+	if (sched_domains_numa_levels <= 2) {
 		sched_numa_topology_type = NUMA_DIRECT;
 		return;
 	}
@@ -1380,9 +1380,6 @@ void sched_init_numa(void)
 			break;
 	}
 
-	if (!level)
-		return;
-
 	/*
 	 * 'level' contains the number of unique distances
 	 *

^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-09-10 10:06 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-08  7:09 [PATCH] sched/topology: Use Identity node only if required Srikar Dronamraju
2018-08-08  7:58 ` Peter Zijlstra
2018-08-08  8:19   ` Srikar Dronamraju
2018-08-08  8:43     ` Peter Zijlstra
2018-08-08  9:30       ` Peter Zijlstra
2018-08-10 16:45   ` Srikar Dronamraju
2018-08-29  8:43     ` Peter Zijlstra
2018-08-29  8:57       ` Peter Zijlstra
2018-08-31 10:22       ` Srikar Dronamraju
2018-08-31 10:41         ` Peter Zijlstra
2018-08-31 11:26           ` Srikar Dronamraju
2018-08-31 12:06             ` Peter Zijlstra
     [not found] <reply-to=<20180808081942.GA37418@linux.vnet.ibm.com>
2018-08-10 17:00 ` [PATCH 1/2] sched/topology: Set correct numa topology type Srikar Dronamraju
2018-08-10 17:00   ` [PATCH 2/2] sched/topology: Expose numa_mask set/clear functions to arch Srikar Dronamraju
2018-08-29  8:02     ` Peter Zijlstra
2018-08-31 10:27       ` Srikar Dronamraju
2018-08-31 11:12         ` Peter Zijlstra
2018-08-31 11:26           ` Peter Zijlstra
2018-08-31 11:53             ` Srikar Dronamraju
2018-08-31 12:05               ` Peter Zijlstra
2018-08-31 12:08               ` Peter Zijlstra
2018-08-21 11:02   ` [PATCH 1/2] sched/topology: Set correct numa topology type Srikar Dronamraju
2018-08-21 13:59     ` Peter Zijlstra
2018-09-10 10:06   ` [tip:sched/core] sched/topology: Set correct NUMA " tip-bot for Srikar Dronamraju
