linux-kernel.vger.kernel.org archive mirror
* Minature NUMA scheduler
@ 2003-01-09 23:54 Martin J. Bligh
From: Martin J. Bligh @ 2003-01-09 23:54 UTC (permalink / raw)
  To: Erich Focht, Michael Hohnbaum, Robert Love, Ingo Molnar
  Cc: linux-kernel, lse-tech

I tried a small experiment today: I restricted the O(1) scheduler
to balance only within a node. Coupled with the small initial load
balancing patch floating around, this covers 95% of cases, is a
trivial change (3 lines), performs just as well as Erich's patch on
a kernel compile, and actually does better on schedbench.
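
To illustrate the idea of balancing only within a node, here is a minimal
userspace sketch. It is not the kernel code: the toy_* names, the 8-CPU
machine, and the 2-node topology are assumptions made up for the example.
It only shows how a CPU bitmask narrows the search for the busiest
runqueue to the local node.

```c
#include <stddef.h>

#define NR_CPUS 8	/* assumed toy machine: 2 nodes x 4 CPUs */

/* Stand-in for a per-CPU runqueue; only the load matters here. */
struct toy_rq {
	int nr_running;
};

/*
 * Return the index of the most loaded CPU whose bit is set in cpumask,
 * skipping this_cpu. Returns -1 if no candidate has more than one
 * runnable task.
 */
static int toy_find_busiest(const struct toy_rq *rqs, int this_cpu,
			    unsigned long cpumask)
{
	int i, busiest = -1, max_load = 1;

	for (i = 0; i < NR_CPUS; i++) {
		if (i == this_cpu || !((1UL << i) & cpumask))
			continue;
		if (rqs[i].nr_running > max_load) {
			max_load = rqs[i].nr_running;
			busiest = i;
		}
	}
	return busiest;
}

/* Assumed topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1. */
static unsigned long toy_node_mask(int cpu)
{
	return cpu < 4 ? 0x0fUL : 0xf0UL;
}
```

Passing toy_node_mask(this_cpu) keeps the search on the local node, while
passing an all-ones mask gives the old search over every CPU.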

This is NOT meant to be a replacement for the code Erich wrote,
it's meant to be a simple way to get integration and acceptance.
Code that just forks and never execs will stay on one node - but
we can take the code Erich wrote and put it in a separate rebalancer
that fires much less often to do a cross-node rebalance. All of that
would be under #ifdef CONFIG_NUMA; the only thing that would touch
mainline is these three lines of change, and it's trivial to see
they're completely equivalent to the current code on non-NUMA systems.
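
One way to picture the two-level scheme is sketched below. This is purely
hypothetical: TOY_NUMA stands in for CONFIG_NUMA, the interval of 10 ticks
is an arbitrary value that does not come from this mail, and Erich's
cross-node code is reduced to "search all CPUs". The point is only that
the frequent path sees the node mask, the rare path sees every CPU, and
with TOY_NUMA set to 0 both paths see the same mask.

```c
#define TOY_CROSS_NODE_INTERVAL 10	/* hypothetical value, not from the mail */

/* Hypothetical switch standing in for CONFIG_NUMA. */
#ifndef TOY_NUMA
#define TOY_NUMA 1
#endif

#define TOY_ALL_CPUS 0xffUL		/* assumed 8-CPU machine */

/* CPUs sharing a node with cpu; every CPU when NUMA is compiled out. */
static unsigned long toy_node_mask_of(int cpu)
{
#if TOY_NUMA
	return cpu < 4 ? 0x0fUL : 0xf0UL;	/* assumed 2 nodes x 4 CPUs */
#else
	(void)cpu;
	return TOY_ALL_CPUS;
#endif
}

/*
 * Choose which CPUs a balance pass may pull from: the local node on most
 * ticks, all CPUs on every TOY_CROSS_NODE_INTERVAL-th tick.
 */
static unsigned long toy_balance_mask(int this_cpu, unsigned long tick)
{
	if (tick % TOY_CROSS_NODE_INTERVAL == 0)
		return TOY_ALL_CPUS;
	return toy_node_mask_of(this_cpu);
}
```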

I also believe this is the more correct design: it should result in
much less cross-node migration of tasks, and less scanning of remote
runqueues.

Opinions / comments?

M.

Kernbench:
                                   Elapsed        User      System         CPU
                   2.5.54-mjb3      19.41s     186.38s     39.624s     1191.4%
          2.5.54-mjb3-mjbsched     19.508s    186.356s     39.888s     1164.6%

Schedbench 4:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                   2.5.54-mjb3        0.00       35.14       88.82        0.64
          2.5.54-mjb3-mjbsched        0.00       31.84       88.91        0.49

Schedbench 8:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                   2.5.54-mjb3        0.00       47.55      269.36        1.48
          2.5.54-mjb3-mjbsched        0.00       41.01      252.34        1.07

Schedbench 16:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                   2.5.54-mjb3        0.00       76.53      957.48        4.17
          2.5.54-mjb3-mjbsched        0.00       69.01      792.71        2.74

Schedbench 32:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                   2.5.54-mjb3        0.00      145.20     1993.97       11.05
          2.5.54-mjb3-mjbsched        0.00      117.47     1798.93        5.95

Schedbench 64:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                   2.5.54-mjb3        0.00      307.80     4643.55       20.36
          2.5.54-mjb3-mjbsched        0.00      241.04     3589.55       12.74
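
For reading the schedbench tables above, the relative change in elapsed
time can be worked out as below. The helper names are made up for this
example; the numbers are copied from the Elapsed columns. The reduction
comes out at roughly 9% to 22%, and is largest for the 32 and 64 runs.

```c
#include <stdio.h>

/* Percentage by which patched is lower than base. */
static double pct_reduction(double base, double patched)
{
	return 100.0 * (base - patched) / base;
}

/* Print the elapsed-time reduction for each schedbench run above. */
static void print_schedbench_elapsed(void)
{
	static const int runs[] = { 4, 8, 16, 32, 64 };
	static const double base[] = { 35.14, 47.55, 76.53, 145.20, 307.80 };
	static const double patched[] = { 31.84, 41.01, 69.01, 117.47, 241.04 };
	size_t i;

	for (i = 0; i < sizeof(runs) / sizeof(runs[0]); i++)
		printf("Schedbench %2d: %5.1f%% less elapsed time\n",
		       runs[i], pct_reduction(base[i], patched[i]));
}
```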

-----------------------------------------

diff -purN -X /home/mbligh/.diff.exclude virgin/kernel/sched.c mjbsched/kernel/sched.c
--- virgin/kernel/sched.c	Mon Dec  9 18:46:15 2002
+++ mjbsched/kernel/sched.c	Thu Jan  9 14:09:17 2003
@@ -654,7 +654,7 @@ static inline unsigned int double_lock_b
 /*
  * find_busiest_queue - find the busiest runqueue.
  */
-static inline runqueue_t *find_busiest_queue(runqueue_t *this_rq, int this_cpu, int idle, int *imbalance)
+static inline runqueue_t *find_busiest_queue(runqueue_t *this_rq, int this_cpu, int idle, int *imbalance, unsigned long cpumask)
 {
 	int nr_running, load, max_load, i;
 	runqueue_t *busiest, *rq_src;
@@ -689,7 +689,7 @@ static inline runqueue_t *find_busiest_q
 	busiest = NULL;
 	max_load = 1;
 	for (i = 0; i < NR_CPUS; i++) {
-		if (!cpu_online(i))
+		if (!cpu_online(i) || !((1UL << i) & cpumask))
 			continue;
 
 		rq_src = cpu_rq(i);
@@ -764,7 +764,8 @@ static void load_balance(runqueue_t *thi
 	struct list_head *head, *curr;
 	task_t *tmp;
 
-	busiest = find_busiest_queue(this_rq, this_cpu, idle, &imbalance);
+	busiest = find_busiest_queue(this_rq, this_cpu, idle, &imbalance,
+				__node_to_cpu_mask(__cpu_to_node(this_cpu)));
 	if (!busiest)
 		goto out;
 

---------------------------------------------------

A tiny change to the current ilb (initial load balancing) patch is also
needed, so that it stops using the loop_over_node macro from the first patch:

diff -purN -X /home/mbligh/.diff.exclude ilbold/kernel/sched.c ilbnew/kernel/sched.c
--- ilbold/kernel/sched.c	Thu Jan  9 15:20:53 2003
+++ ilbnew/kernel/sched.c	Thu Jan  9 15:27:49 2003
@@ -2213,6 +2213,7 @@ static void sched_migrate_task(task_t *p
 static int sched_best_cpu(struct task_struct *p)
 {
 	int i, minload, load, best_cpu, node = 0;
+	unsigned long cpumask;
 
 	best_cpu = task_cpu(p);
 	if (cpu_rq(best_cpu)->nr_running <= 2)
@@ -2226,9 +2227,11 @@ static int sched_best_cpu(struct task_st
 			node = i;
 		}
 	}
+
 	minload = 10000000;
-	loop_over_node(i,node) {
-		if (!cpu_online(i))
+	cpumask = __node_to_cpu_mask(node);
+	for (i = 0; i < NR_CPUS; ++i) {
+		if (!(cpumask & (1UL << i)))
 			continue;
 		if (cpu_rq(i)->nr_running < minload) {
 			best_cpu = i;



