* Re: O(1) scheduler gives big boost to tbench 192
@ 2002-05-03 16:37 John Hawkes
0 siblings, 0 replies; 24+ messages in thread
From: John Hawkes @ 2002-05-03 16:37 UTC (permalink / raw)
To: linux-kernel; +Cc: rwhron
From: <rwhron@earthlink.net>
...
> tbench 192 is an anomaly test too. AIM looks like a nice
> "mixed" bench. Do you have any scripts for it? I'd like
> to use AIM too.
Try http://www.caldera.com/developers/community/contrib/aim.html for a tarball
with everything you'll need.
The "Multiuser Shared System Mix" (aka "workfile.shared") is the one I use.
You'll need several disk spindles to keep it compute-bound, though. Several
of the disk subtests, especially the sync_* tests, quickly drive one or two
spindles to their max transaction rates, and from that point AIM7 will be
I/O-bound and produce a largely idle system, which isn't very interesting if
you're trying to examine CPU scheduler performance with high process counts.
One thing you can do is to comment out the three sync_* tests in the
workfile.shared configuration file, and then watch your idle time with
something like vmstat. Experiment with commenting out more disk subtests,
like creat-clo, disk_cp, and disk_src, one by one, until AIM7 becomes
compute-bound.
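For illustration, the resulting workfile edit might look something like
the excerpt below. The entry format and the exact sync_* test names are
assumptions from memory rather than a copy of a real workfile.shared,
so check them against your own file:
# workfile.shared (excerpt): the three sync_* subtests disabled
# 1 sync_disk_cp
# 1 sync_disk_rw
# 1 sync_disk_wrt
#
# Next candidates to disable, one by one, if still I/O-bound:
1 creat-clo
1 disk_cp
1 disk_src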
John Hawkes
* Re: O(1) scheduler gives big boost to tbench 192
@ 2002-05-20 12:46 rwhron
0 siblings, 0 replies; 24+ messages in thread
From: rwhron @ 2002-05-20 12:46 UTC (permalink / raw)
To: linux-kernel; +Cc: kravetz, jamagallon, rml
> On Tue, May 07, 2002 at 04:39:34PM -0700, Robert Love wrote:
> > It is just for pipes we previously used sync, no?
On Tue, 7 May 2002 16:48:57 -0700, Mike Kravetz wrote:
> That's the only thing I know of that used it.
> I'd really like to know if there are any real workloads that
> benefited from this feature, rather than just some benchmark.
> I can do some research, but was hoping someone on this list
> might remember. If there is a valid workload, I'll propose
> a patch.
On Mon, 13 May 2002 02:06:31 +0200, J.A. Magallon wrote:
> - Re-introduction of wake_up_sync to make pipes run fast again. No idea
> whether this is useful or not; that is the point, to test it
2.4.19-pre8-jam2 showed slightly better performance on the quad Xeon
for most benchmarks with 25-wake_up_sync backed out. However, it's
not clear to me that 25-wake_up_sync was the proper patch to back out for this
test, as there wasn't a dramatic change in Pipe latency or bandwidth
without it.
There was a >300% improvement in lmbench Pipe bandwidth and latency
comparing pre8-jam2 to pre7-jam6.
Average of 25 lmbench runs on jam2 kernels, 12 on the others:
2.4.19-pre8-jam2-nowuos (backed out 25-wake_up_sync patch)
*Local* Communication latencies in microseconds - smaller is better
                                   AF
kernel                   Pipe     UNIX
-----------------------  -----    -----
2.4.19-pre7-jam6         29.51    42.37
2.4.19-pre8              10.73    29.94
2.4.19-pre8-aa2          12.45    29.53
2.4.19-pre8-ac1          35.39    45.59
2.4.19-pre8-jam2          7.70    15.27
2.4.19-pre8-jam2-nowuos   7.74    14.93
*Local* Communication bandwidths in MB/s - bigger is better
                                      AF
kernel                    Pipe       UNIX
-----------------------   ------    ------
2.4.19-pre7-jam6           66.41    260.39
2.4.19-pre8               468.57    273.32
2.4.19-pre8-aa2           418.09    273.59
2.4.19-pre8-ac1           110.62    241.06
2.4.19-pre8-jam2          545.66    233.68
2.4.19-pre8-jam2-nowuos   544.57    246.53
The kernel build test, which applies patches through a pipe
and compiles with -pipe, didn't reflect an improvement.
kernel                    average  min_time  max_time  runs  notes
2.4.19-pre7-jam6            237.0       235       239     3  All successful
2.4.19-pre8                 239.7       238       241     3  All successful
2.4.19-pre8-aa2             237.7       237       238     3  All successful
2.4.19-pre8-ac1             239.3       238       241     3  All successful
2.4.19-pre8-jam2            240.0       238       241     3  All successful
2.4.19-pre8-jam2-nowuos     238.7       236       241     3  All successful
I don't know how much of the kernel build test is dependent on
pipe performance. There is probably a better "real world"
measurement.
On a single-processor box, there was an improvement in the kernel build
between pre7-jam6 and pre8-jam2. That was based on only one sample, though.
Xeon page:
http://home.earthlink.net/~rwhron/kernel/bigbox.html
Latest on uniproc:
http://home.earthlink.net/~rwhron/kernel/latest.html
--
Randy Hron
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-08 8:50 ` Andrea Arcangeli
@ 2002-05-09 23:18 ` Mike Kravetz
0 siblings, 0 replies; 24+ messages in thread
From: Mike Kravetz @ 2002-05-09 23:18 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: rwhron, mingo, gh, linux-kernel, alan
On Wed, May 08, 2002 at 10:50:49AM +0200, Andrea Arcangeli wrote:
>
> I would like it if you could pass along your changes to the O(1) scheduler
> to resurrect the sync-wakeup.
>
Here is a patch to reintroduce __wake_up_sync() in 2.5.14.
It would be interesting to see if this helps in some of the
areas where people have seen a drop in performance. Since
'full pipes' are the only users of this code, I would only
expect to see benefit in workloads expecting high pipe
bandwidth.
Let me know what you think.
--
Mike
diff -Naur linux-2.5.14/fs/pipe.c linux-2.5.14-pipe/fs/pipe.c
--- linux-2.5.14/fs/pipe.c Mon May 6 03:37:52 2002
+++ linux-2.5.14-pipe/fs/pipe.c Wed May 8 22:48:39 2002
@@ -116,7 +116,7 @@
* writers synchronously that there is more
* room.
*/
- wake_up_interruptible(PIPE_WAIT(*inode));
+ wake_up_interruptible_sync(PIPE_WAIT(*inode));
if (!PIPE_EMPTY(*inode))
BUG();
goto do_more_read;
diff -Naur linux-2.5.14/include/linux/sched.h linux-2.5.14-pipe/include/linux/sched.h
--- linux-2.5.14/include/linux/sched.h Mon May 6 03:37:54 2002
+++ linux-2.5.14-pipe/include/linux/sched.h Thu May 9 20:47:19 2002
@@ -485,6 +485,11 @@
#define wake_up_interruptible(x) __wake_up((x),TASK_INTERRUPTIBLE, 1)
#define wake_up_interruptible_nr(x, nr) __wake_up((x),TASK_INTERRUPTIBLE, nr)
#define wake_up_interruptible_all(x) __wake_up((x),TASK_INTERRUPTIBLE, 0)
+#ifdef CONFIG_SMP
+#define wake_up_interruptible_sync(x) __wake_up_sync((x),TASK_INTERRUPTIBLE, 1)
+#else
+#define wake_up_interruptible_sync(x) __wake_up((x),TASK_INTERRUPTIBLE, 1)
+#endif
asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struct rusage * ru);
extern int in_group_p(gid_t);
diff -Naur linux-2.5.14/kernel/sched.c linux-2.5.14-pipe/kernel/sched.c
--- linux-2.5.14/kernel/sched.c Mon May 6 03:37:57 2002
+++ linux-2.5.14-pipe/kernel/sched.c Thu May 9 21:04:13 2002
@@ -329,27 +329,38 @@
* "current->state = TASK_RUNNING" to mark yourself runnable
* without the overhead of this.
*/
-static int try_to_wake_up(task_t * p)
+static int try_to_wake_up(task_t * p, int sync)
{
unsigned long flags;
int success = 0;
runqueue_t *rq;
+repeat_lock_task:
rq = task_rq_lock(p, &flags);
- p->state = TASK_RUNNING;
if (!p->array) {
+ if (unlikely(sync)) {
+ if (p->thread_info->cpu != smp_processor_id()) {
+ p->thread_info->cpu = smp_processor_id();
+ task_rq_unlock(rq, &flags);
+ goto repeat_lock_task;
+ }
+ }
activate_task(p, rq);
+ /*
+ * If sync is set, a resched_task() is a NOOP
+ */
if (p->prio < rq->curr->prio)
resched_task(rq->curr);
success = 1;
}
+ p->state = TASK_RUNNING;
task_rq_unlock(rq, &flags);
return success;
}
int wake_up_process(task_t * p)
{
- return try_to_wake_up(p);
+ return try_to_wake_up(p, 0);
}
void wake_up_forked_process(task_t * p)
@@ -872,7 +883,7 @@
* started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
* zero in this (rare) case, and we handle it by continuing to scan the queue.
*/
-static inline void __wake_up_common(wait_queue_head_t *q, unsigned int mode, int nr_exclusive)
+static inline void __wake_up_common(wait_queue_head_t *q, unsigned int mode, int nr_exclusive, int sync)
{
struct list_head *tmp;
unsigned int state;
@@ -883,7 +894,7 @@
curr = list_entry(tmp, wait_queue_t, task_list);
p = curr->task;
state = p->state;
- if ((state & mode) && try_to_wake_up(p) &&
+ if ((state & mode) && try_to_wake_up(p, sync) &&
((curr->flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive))
break;
}
@@ -897,7 +908,22 @@
return;
wq_read_lock_irqsave(&q->lock, flags);
- __wake_up_common(q, mode, nr_exclusive);
+ __wake_up_common(q, mode, nr_exclusive, 0);
+ wq_read_unlock_irqrestore(&q->lock, flags);
+}
+
+void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr_exclusive)
+{
+ unsigned long flags;
+
+ if (unlikely(!q))
+ return;
+
+ wq_read_lock_irqsave(&q->lock, flags);
+ if (likely(nr_exclusive))
+ __wake_up_common(q, mode, nr_exclusive, 1);
+ else
+ __wake_up_common(q, mode, nr_exclusive, 0);
wq_read_unlock_irqrestore(&q->lock, flags);
}
@@ -907,7 +933,7 @@
spin_lock_irqsave(&x->wait.lock, flags);
x->done++;
- __wake_up_common(&x->wait, TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE, 1);
+ __wake_up_common(&x->wait, TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE, 1, 0);
spin_unlock_irqrestore(&x->wait.lock, flags);
}
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-08 17:02 ` Mike Kravetz
@ 2002-05-09 0:26 ` Jussi Laako
0 siblings, 0 replies; 24+ messages in thread
From: Jussi Laako @ 2002-05-09 0:26 UTC (permalink / raw)
To: Mike Kravetz; +Cc: Robert Love, mingo, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1175 bytes --]
Mike Kravetz wrote:
>
> Is there anything simple I can do to check the latencies of the
> pthread_cond_*() functions? I'd like to do some analysis of
> scheduler behavior, but am unfamiliar with the user level code.
I just wrote a small test program (attached) for testing this. It is similar
to my app, but for this test the soundcard is just used more as a timing
source.
Adjust the buffer size so that it's on the edge of missing blocks, then
generate some system load to run it over the edge to see how sensitive it is
and how many blocks are lost. I can see a clear difference between a kernel
with the original scheduler and a kernel with the O(1) scheduler.
It's quite sensitive when you run it without setuid privileges, but the
normal situation is to run it at SCHED_FIFO.
Because of the famous ReiserFS tail-merging feature, I lost and rewrote part
of it after crashing the system with it. I found the binary overwriting the
top of it after reboot... I was happy to have copied it to another machine
over NFS just a few minutes before the crash, so not much was lost.
- Jussi Laako
--
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B 39DD A4DE 63EB C216 1E4B
Available at PGP keyservers
[-- Attachment #2: schedtest.c --]
[-- Type: application/x-unknown-content-type-cfile, Size: 5022 bytes --]
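The schedtest.c attachment itself is not reproduced in the archive. As
an illustration of the kind of measurement being discussed, a minimal
pthread_cond_* wakeup-latency probe might look like the sketch below (a
stand-in, not Jussi's program, which used the soundcard as its timing
source):

/* cond_lat.c -- minimal pthread_cond_* wakeup-latency probe.
 * A sketch of the kind of test discussed in this thread; NOT the
 * schedtest.c attachment above.
 * Build: gcc -O2 cond_lat.c -o cond_lat -lpthread
 */
#include <pthread.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

#define ITER 1000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static long posted;		/* signals sent so far */
static struct timeval sent;	/* timestamp of the last signal */

static void *waiter(void *arg)
{
	struct timeval woke;
	long i;

	for (i = 0; i < ITER; i++) {
		pthread_mutex_lock(&lock);
		while (posted <= i)	/* wait for the i-th signal */
			pthread_cond_wait(&cond, &lock);
		gettimeofday(&woke, NULL);
		printf("wakeup latency: %ld us\n",
		       (woke.tv_sec - sent.tv_sec) * 1000000L +
		       (woke.tv_usec - sent.tv_usec));
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t t;
	long i;

	pthread_create(&t, NULL, waiter, NULL);
	for (i = 0; i < ITER; i++) {
		usleep(2000);	/* give the waiter time to block again */
		pthread_mutex_lock(&lock);
		gettimeofday(&sent, NULL);
		posted++;
		pthread_cond_signal(&cond);
		pthread_mutex_unlock(&lock);
	}
	pthread_join(t, NULL);
	return 0;
}

Running it once under the stock scheduler and once under O(1) (and, per
the discussion, under SCHED_FIFO for the realistic case) lets you
compare the printed wakeup latencies directly.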
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-08 16:31 ` Robert Love
@ 2002-05-08 17:02 ` Mike Kravetz
2002-05-09 0:26 ` Jussi Laako
0 siblings, 1 reply; 24+ messages in thread
From: Mike Kravetz @ 2002-05-08 17:02 UTC (permalink / raw)
To: Robert Love; +Cc: Jussi Laako, mingo, linux-kernel
On Wed, May 08, 2002 at 09:31:39AM -0700, Robert Love wrote:
> On Wed, 2002-05-08 at 08:34, Jussi Laako wrote:
> >
> > Maybe this is the reason why the O(1) scheduler has big latencies with
> > pthread_cond_*() functions which the original scheduler doesn't have?
> > I think I tracked the problem down to try_to_wake_up(), but I was unable to
> > fix it.
>
> Ah, this could be the same case. I just looked into the definition of
> the condition variable pthread stuff, and it looks like it _could_ be
> implemented using pipes, but I do not see why it would be, per se. If it
> does not use pipes, then this sync issue is not at hand (only the pipe
> code passed 1 for the sync flag).
>
> If it does not use pipes, we could have another problem - but I doubt
> it. Maybe the benchmark is just another case where it shows worse
> performance due to some attribute of the scheduler or load balancer?
>
In some cases, the O(1) scheduler will produce higher latencies than
the old scheduler. On 'some' workloads/benchmarks the old scheduler
was better because it had a greater tendency to schedule tasks on the
same CPU. This is certainly the case with the lat_ctx and lat_pipe
components of LMbench. Note that this has nothing to do with the
wake_up sync behavior. Rather, it is the difference between scheduling
a new task on the current CPU as opposed to a 'remote' CPU. You can
schedule the task on the current CPU quicker, but this is not good for
optimal cache usage. I believe the O(1) scheduler makes the correct
trade off in this area.
Is there anything simple I can do to check the latencies of the
pthread_cond_*() functions? I'd like to do some analysis of
scheduler behavior, but am unfamiliar with the user level code.
--
Mike
* Re: O(1) scheduler gives big boost to tbench 192
@ 2002-05-08 16:39 Bill Davidsen
0 siblings, 0 replies; 24+ messages in thread
From: Bill Davidsen @ 2002-05-08 16:39 UTC (permalink / raw)
To: Linux-Kernel Mailing List
Forgive me if you feel I've clipped too much from your posting; I'm trying
to capture the points made by various folks without responding to each
message.
---------- Forwarded message ----------
From: Mike Kravetz <kravetz@us.ibm.com>
Date: Tue, 7 May 2002 15:13:56 -0700
I have experimented with reintroducing '__wake_up_sync' support
into the O(1) scheduler. The modifications are limited to the
'try_to_wake_up' routine as they were before. If the 'synchronous'
flag is set, then 'try_to_wake_up' tries to put the awakened task
on the same runqueue as the caller without forcing a reschedule.
If the task is not already on a runqueue, this is easy. If it is,
we give up. Result: the previous bandwidth numbers are restored.
BEFORE
------
Pipe latency: 6.5185 microseconds
Pipe bandwidth: 86.35 MB/sec
AFTER
-----
Pipe latency: 6.5723 microseconds
Pipe bandwidth: 540.13 MB/sec
---------- Forwarded message ----------
From: Andrea Arcangeli <andrea@suse.de>
So my hypothesis about the sync wakeup in the email below proved to be right:
http://marc.theaimsgroup.com/?l=linux-kernel&m=102050009725367&w=2
Many thanks for verifying this.
Personally, if the two tasks end up blocking waiting for each other,
then I prefer them to be on the same CPU. That was the whole point of
the optimization. If the pipe buffer is large enough not to require
the reader or writer to block, then we don't do the sync wakeup just
now (there's a detail with the reader that may block simply because
the writer is slow at writing, but it probably doesn't matter much).
There are many cases where a PAGE_SIZE of buffer gets filled in much
less than a timeslice, and for all those cases rescheduling the two
tasks one after the other on the same CPU is a win, just like the
benchmark shows. Think of the normal pipes we use from the shell, like
a "| grep something"; they are very common and they all want to be
handled as sync wakeups. In short, when loads of data pass through the
pipe at max bandwidth, the sync-wakeup is a definite win. If the pipe
never gets filled then the writer never sync-wakes; it just returns
from the write call asynchronously, but of course the pipe doesn't get
filled because it's not a max-bandwidth scenario, and so the producer
and the consumer are allowed to scale across multiple CPUs by the
design of the workload.
Comments?
I would like it if you could pass along your changes to the O(1) scheduler
to resurrect the sync-wakeup.
---------- Forwarded message ----------
From: Mike Kravetz <kravetz@us.ibm.com>
Date: Tue, 7 May 2002 15:43:22 -0700
I'm not sure if 'synchronous' is still being passed all the way
down to try_to_wake_up in your tree (since it was removed in 2.5).
This is based off a back port of O(1) to 2.4.18 that Robert Love
did. The rest of try_to_wake_up (the normal/common path) remains
the same.
---------- Forwarded message ----------
From: Robert Love <rml@tech9.net>
Date: 07 May 2002 16:39:34 -0700
Hm, interesting. When Ingo removed the sync variants of wake_up he did
it believing the load balancer would handle the case. Apparently, at
least in this case, that assumption was wrong.
I agree with your earlier statement, though - this benchmark may be a
case where it shows up negatively but in general the balancing is
preferred. I can think of plenty of workloads where that is the case.
I also wonder if over time the load balancer would end up putting the
tasks on the same CPU. That is something the quick pipe benchmark would
not show.
---------- Forwarded message ----------
From: Mike Kravetz <kravetz@us.ibm.com>
Date: Tue, 7 May 2002 16:48:57 -0700
On Tue, May 07, 2002 at 04:39:34PM -0700, Robert Love wrote:
> It is just for pipes we previously used sync, no?
That's the only thing I know of that used it.
I'd really like to know if there are any real workloads that
benefited from this feature, rather than just some benchmark.
I can do some research, but was hoping someone on this list
might remember. If there is a valid workload, I'll propose
a patch. However, I don't think we should be adding patches/
features just to help some benchmark that is unrelated to
real world use.
==== start original material ====
Got to change mailers...
Consider the command line:
grep pattern huge_log_file | cut -f1-2,5,7 | sed 's/stuff/things/' |
tee extract.tmp | less
Ideally I would like the pipes to run as fast as possible since I'm
waiting for results, using cache and one CPU where that is best, and using
all the CPUs needed if the machine is SMP and processing is complex. I
believe that the original code came closer to that ideal than the recent
code, and obviously I think the example is a "valid workload" since I do
stuff like that every time I look for/at server problems.
I believe the benchmark shows a performance issue which will occur in
normal usage.
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-08 15:34 ` Jussi Laako
@ 2002-05-08 16:31 ` Robert Love
2002-05-08 17:02 ` Mike Kravetz
0 siblings, 1 reply; 24+ messages in thread
From: Robert Love @ 2002-05-08 16:31 UTC (permalink / raw)
To: Jussi Laako; +Cc: Mike Kravetz, mingo, linux-kernel
On Wed, 2002-05-08 at 08:34, Jussi Laako wrote:
> Mike Kravetz wrote:
> >
> > I'd really like to know if there are any real workloads that
> > benefited from this feature, rather than just some benchmark.
>
> Maybe this is the reason why the O(1) scheduler has big latencies with
> pthread_cond_*() functions which the original scheduler doesn't have?
> I think I tracked the problem down to try_to_wake_up(), but I was unable to
> fix it.
Ah, this could be the same case. I just looked into the definition of
the condition variable pthread stuff, and it looks like it _could_ be
implemented using pipes, but I do not see why it would be, per se. If it
does not use pipes, then this sync issue is not at hand (only the pipe
code passed 1 for the sync flag).
If it does not use pipes, we could have another problem - but I doubt
it. Maybe the benchmark is just another case where it shows worse
performance due to some attribute of the scheduler or load balancer?
Robert Love
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-07 23:48 ` Mike Kravetz
@ 2002-05-08 15:34 ` Jussi Laako
2002-05-08 16:31 ` Robert Love
0 siblings, 1 reply; 24+ messages in thread
From: Jussi Laako @ 2002-05-08 15:34 UTC (permalink / raw)
To: Mike Kravetz; +Cc: Robert Love, mingo, linux-kernel
Mike Kravetz wrote:
>
> I'd really like to know if there are any real workloads that
> benefited from this feature, rather than just some benchmark.
Maybe this is the reason why the O(1) scheduler has big latencies with
pthread_cond_*() functions which the original scheduler doesn't have?
I think I tracked the problem down to try_to_wake_up(), but I was unable to
fix it.
- Jussi Laako
--
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B 39DD A4DE 63EB C216 1E4B
Available at PGP keyservers
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-07 22:13 ` Mike Kravetz
2002-05-07 22:44 ` Alan Cox
@ 2002-05-08 8:50 ` Andrea Arcangeli
2002-05-09 23:18 ` Mike Kravetz
1 sibling, 1 reply; 24+ messages in thread
From: Andrea Arcangeli @ 2002-05-08 8:50 UTC (permalink / raw)
To: Mike Kravetz; +Cc: rwhron, mingo, gh, linux-kernel, alan
On Tue, May 07, 2002 at 03:13:56PM -0700, Mike Kravetz wrote:
> I believe the decrease in pipe bandwidth is a direct result of the
> removal of the '__wake_up_sync' support. I'm not exactly sure what
> the arguments were for adding this support to the 'old' scheduler.
> However, it was only used by the 'pipe_write' code when it had to
> block after waking up the reader on the pipe. The 'bw_pipe'
> test exercised this code path. In the 'old' scheduler '__wake_up_sync'
> seemed to accomplish the following:
> 1) Eliminated (possibly) unnecessary schedules on 'remote' CPUs
> 2) Eliminated IPI latency by having both reader and writer
> execute on the same CPU
> 3) ? Took advantage of pipe data being in the CPU cache, by
> having the reader read data the writer just wrote into the
> cache. ?
> As I said, I'm not sure of the arguments for introducing this
> functionality in the 'old' scheduler. Hopefully, it was not
> just a 'benchmark enhancing' patch.
>
> I have experimented with reintroducing '__wake_up_sync' support
> into the O(1) scheduler. The modifications are limited to the
> 'try_to_wake_up' routine as they were before. If the 'synchronous'
> flag is set, then 'try_to_wake_up' tries to put the awakened task
> on the same runqueue as the caller without forcing a reschedule.
> If the task is not already on a runqueue, this is easy. If it is,
> we give up. Result: the previous bandwidth numbers are restored.
>
> BEFORE
> ------
> Pipe latency: 6.5185 microseconds
> Pipe bandwidth: 86.35 MB/sec
>
> AFTER
> -----
> Pipe latency: 6.5723 microseconds
> Pipe bandwidth: 540.13 MB/sec
So my hypothesis about the sync wakeup in the email below proved to be right:
http://marc.theaimsgroup.com/?l=linux-kernel&m=102050009725367&w=2
Many thanks for verifying this.
Personally, if the two tasks end up blocking waiting for each other,
then I prefer them to be on the same CPU. That was the whole point of
the optimization. If the pipe buffer is large enough not to require
the reader or writer to block, then we don't do the sync wakeup just
now (there's a detail with the reader that may block simply because
the writer is slow at writing, but it probably doesn't matter much).
There are many cases where a PAGE_SIZE of buffer gets filled in much
less than a timeslice, and for all those cases rescheduling the two
tasks one after the other on the same CPU is a win, just like the
benchmark shows. Think of the normal pipes we use from the shell, like
a "| grep something"; they are very common and they all want to be
handled as sync wakeups. In short, when loads of data pass through the
pipe at max bandwidth, the sync-wakeup is a definite win. If the pipe
never gets filled then the writer never sync-wakes; it just returns
from the write call asynchronously, but of course the pipe doesn't get
filled because it's not a max-bandwidth scenario, and so the producer
and the consumer are allowed to scale across multiple CPUs by the
design of the workload.
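As a rough sanity check of the timeslice claim (assuming a 4096-byte
PAGE_SIZE pipe buffer, the ~540 MB/sec figure measured above, and a
timeslice in the tens of milliseconds):

    time to fill one page:  4096 B / 540 MB/sec  ~=  7.5 microseconds
    one timeslice:          tens of milliseconds
    =>  thousands of fill/drain cycles fit inside a single timeslice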
Comments?
I would like it if you could pass along your changes to the O(1) scheduler
to resurrect the sync-wakeup.
Andrea
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-07 23:39 ` Robert Love
@ 2002-05-07 23:48 ` Mike Kravetz
2002-05-08 15:34 ` Jussi Laako
0 siblings, 1 reply; 24+ messages in thread
From: Mike Kravetz @ 2002-05-07 23:48 UTC (permalink / raw)
To: Robert Love; +Cc: Alan Cox, rwhron, mingo, gh, linux-kernel, andrea
On Tue, May 07, 2002 at 04:39:34PM -0700, Robert Love wrote:
> It is just for pipes we previously used sync, no?
That's the only thing I know of that used it.
I'd really like to know if there are any real workloads that
benefited from this feature, rather than just some benchmark.
I can do some research, but was hoping someone on this list
might remember. If there is a valid workload, I'll propose
a patch. However, I don't think we should be adding patches/
features just to help some benchmark that is unrelated to
real world use.
--
Mike
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-07 22:43 ` Mike Kravetz
@ 2002-05-07 23:39 ` Robert Love
2002-05-07 23:48 ` Mike Kravetz
0 siblings, 1 reply; 24+ messages in thread
From: Robert Love @ 2002-05-07 23:39 UTC (permalink / raw)
To: Mike Kravetz; +Cc: Alan Cox, rwhron, mingo, gh, linux-kernel, andrea
On Tue, 2002-05-07 at 15:43, Mike Kravetz wrote:
> I'm not doing any prefetches in the code (if that is what you are
> talking about). The code just moves the pipe reader to the same
> CPU as the pipe writer (which is about to block). Certainly, the
> pipe reader could take advantage of any data written by the writer
> still being in the cache.
Hm, interesting. When Ingo removed the sync variants of wake_up he did
it believing the load balancer would handle the case. Apparently, at
least in this case, that assumption was wrong.
I agree with your earlier statement, though - this benchmark may be a
case where it shows up negatively but in general the balancing is
preferred. I can think of plenty of workloads where that is the case.
I also wonder if over time the load balancer would end up putting the
tasks on the same CPU. That is something the quick pipe benchmark would
not show.
> I'm not sure if 'synchronous' is still being passed all the way
> down to try_to_wake_up in your tree (since it was removed in 2.5).
> This is based off a back port of O(1) to 2.4.18 that Robert Love
> did. The rest of try_to_wake_up (the normal/common path) remains
> the same.
In neither 2.5 nor the 2.4 backport I did (what is in -ac) do I think the
sync flag is being passed down, since the functionality was removed. The
functions were, I believe, rewritten to not have that parameter at all.
It is just for pipes we previously used sync, no?
Robert Love
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-07 22:13 ` Mike Kravetz
@ 2002-05-07 22:44 ` Alan Cox
2002-05-07 22:43 ` Mike Kravetz
2002-05-08 8:50 ` Andrea Arcangeli
1 sibling, 1 reply; 24+ messages in thread
From: Alan Cox @ 2002-05-07 22:44 UTC (permalink / raw)
To: Mike Kravetz; +Cc: rwhron, mingo, gh, linux-kernel, alan, andrea
> BEFORE
> ------
> Pipe latency: 6.5185 microseconds
> Pipe bandwidth: 86.35 MB/sec
>
> AFTER
> -----
> Pipe latency: 6.5723 microseconds
> Pipe bandwidth: 540.13 MB/sec
>
> Comments? If anyone would like to see/test the code (pretty simple
> really) let me know.
Are you doing prefetches on the pipe data in your system? I'm curious if
this is an SMP cross-processor pipe issue or simply cache behaviour?
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-07 22:44 ` Alan Cox
@ 2002-05-07 22:43 ` Mike Kravetz
2002-05-07 23:39 ` Robert Love
0 siblings, 1 reply; 24+ messages in thread
From: Mike Kravetz @ 2002-05-07 22:43 UTC (permalink / raw)
To: Alan Cox; +Cc: rwhron, mingo, gh, linux-kernel, andrea
On Tue, May 07, 2002 at 11:44:35PM +0100, Alan Cox wrote:
> > BEFORE
> > ------
> > Pipe latency: 6.5185 microseconds
> > Pipe bandwidth: 86.35 MB/sec
> >
> > AFTER
> > -----
> > Pipe latency: 6.5723 microseconds
> > Pipe bandwidth: 540.13 MB/sec
> >
> > Comments? If anyone would like to see/test the code (pretty simple
> > really) let me know.
>
> Are you doing prefetches on the pipe data in your system? I'm curious if
> this is an SMP cross-processor pipe issue or simply cache behaviour?
I'm not doing any prefetches in the code (if that is what you are
talking about). The code just moves the pipe reader to the same
CPU as the pipe writer (which is about to block). Certainly, the
pipe reader could take advantage of any data written by the writer
still being in the cache.
The code added to 'try_to_wake_up' looks something like this:
	if (unlikely(synchronous)) {
		rq = lock_task_rq(p, &flags);
		if (p->array || p->state == TASK_RUNNING) {
			/* We're too late */
			unlock_task_rq(rq, &flags);
			return success;
		}
		p->cpu = smp_processor_id(); /* Change CPU id */
		/*
		 * The task's runqueue changed along with its CPU, so
		 * drop the old runqueue lock and take the new one.
		 */
		unlock_task_rq(rq, &flags);
		rq = lock_task_rq(p, &flags);
		p->state = TASK_RUNNING;
		if (!p->array) {
			activate_task(p, rq);
		}
		unlock_task_rq(rq, &flags);
		return success;
	}
I'm not sure if 'synchronous' is still being passed all the way
down to try_to_wake_up in your tree (since it was removed in 2.5).
This is based off a back port of O(1) to 2.4.18 that Robert Love
did. The rest of try_to_wake_up (the normal/common path) remains
the same.
--
Mike
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-03 13:38 rwhron
2002-05-03 20:29 ` Gerrit Huizenga
@ 2002-05-07 22:13 ` Mike Kravetz
2002-05-07 22:44 ` Alan Cox
2002-05-08 8:50 ` Andrea Arcangeli
1 sibling, 2 replies; 24+ messages in thread
From: Mike Kravetz @ 2002-05-07 22:13 UTC (permalink / raw)
To: rwhron, mingo; +Cc: gh, linux-kernel, alan, andrea
On Fri, May 03, 2002 at 09:38:56AM -0400, rwhron@earthlink.net wrote:
>
> A side effect of O(1) in ac2 and jam6 on the 4 way box is a decrease
> in pipe bandwidth and an increase in pipe latency measured by lmbench:
>
Believe it or not, the increase in pipe latency could be considered
a desirable result. I believe that lat_pipe (the latency test) uses
two tasks that simply pass a token back and forth. With the 'old'
scheduler these two tasks (mostly) got scheduled and ran on the same
CPU, which produced the 'best results'. With the O(1) scheduler, tasks
have some affinity to the CPUs they last ran on. If the tasks end
up on different CPUs, then they will have a tendency to stay there.
In the case of lat_pipe, IPI latency (used to awaken/schedule a task
on another CPU) is added to every 'pipe transfer'. This is bad for
the benchmark, but good for most workloads where it is more important
to run with warm caches than to be scheduled as fast as possible.
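For reference, the lat_pipe pattern being described boils down to
something like the following user-level sketch (a rough approximation
of the idea, not the actual lmbench source; lmbench's exact
normalization of the reported figure may differ):

/* lat_pipe-style test: parent and child bounce a one-byte token
 * through a pair of pipes and the round trips are timed.  A sketch
 * of the idea, not the actual lmbench code. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

#define ROUNDS 100000

int main(void)
{
	int p2c[2], c2p[2];	/* parent->child and child->parent pipes */
	char token = 'x';
	struct timeval start, stop;
	long i, usecs;

	if (pipe(p2c) < 0 || pipe(c2p) < 0)
		exit(1);
	if (fork() == 0) {	/* child: echo the token back */
		for (;;) {
			if (read(p2c[0], &token, 1) != 1)
				exit(0);	/* parent went away */
			write(c2p[1], &token, 1);
		}
	}
	gettimeofday(&start, NULL);
	for (i = 0; i < ROUNDS; i++) {	/* parent: send, then wait */
		write(p2c[1], &token, 1);
		read(c2p[0], &token, 1);
	}
	gettimeofday(&stop, NULL);
	usecs = (stop.tv_sec - start.tv_sec) * 1000000L +
		(stop.tv_usec - start.tv_usec);
	printf("round trip: %.4f microseconds\n", (double)usecs / ROUNDS);
	return 0;
}

If the two processes land on different CPUs, every one of those round
trips pays the IPI latency described above.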
I believe the decrease in pipe bandwidth is a direct result of the
removal of the '__wake_up_sync' support. I'm not exactly sure what
the arguments were for adding this support to the 'old' scheduler.
However, it was only used by the 'pipe_write' code when it had to
block after waking up the reader on the pipe. The 'bw_pipe'
test exercised this code path. In the 'old' scheduler '__wake_up_sync'
seemed to accomplish the following:
1) Eliminated (possibly) unnecessary schedules on 'remote' CPUs
2) Eliminated IPI latency by having both reader and writer
execute on the same CPU
3) ? Took advantage of pipe data being in the CPU cache, by
having the reader read data the writer just wrote into the
cache. ?
As I said, I'm not sure of the arguments for introducing this
functionality in the 'old' scheduler. Hopefully, it was not
just a 'benchmark enhancing' patch.
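For concreteness, the bw_pipe pattern that exercises this path is
essentially a writer streaming data into a pipe faster than the reader
drains it, so the writer keeps hitting the full-pipe blocking path; a
rough user-level sketch of the pattern (an approximation, not the
actual lmbench source):

/* bw_pipe-style test: the child streams data into a pipe faster than
 * the parent drains it, so the writer keeps blocking on a full pipe.
 * A sketch of the pattern, not the actual lmbench code. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

#define CHUNK (64 * 1024)
#define TOTAL (64L * 1024 * 1024)

int main(void)
{
	static char buf[CHUNK];
	int fds[2];
	long n, moved = 0;
	struct timeval start, stop;
	double secs;

	if (pipe(fds) < 0)
		exit(1);
	if (fork() == 0) {	/* child: the writer */
		long left;
		for (left = TOTAL; left > 0; left -= CHUNK)
			write(fds[1], buf, CHUNK);  /* blocks when full */
		exit(0);
	}
	gettimeofday(&start, NULL);
	while (moved < TOTAL) {	/* parent: drain the pipe */
		n = read(fds[0], buf, CHUNK);
		if (n <= 0)
			break;
		moved += n;
	}
	gettimeofday(&stop, NULL);
	secs = (stop.tv_sec - start.tv_sec) +
	       (stop.tv_usec - start.tv_usec) / 1e6;
	printf("Pipe bandwidth: %.2f MB/sec\n",
	       moved / secs / (1024 * 1024));
	return 0;
}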
I have experimented with reintroducing '__wake_up_sync' support
into the O(1) scheduler. The modifications are limited to the
'try_to_wake_up' routine as they were before. If the 'synchronous'
flag is set, then 'try_to_wake_up' tries to put the awakened task
on the same runqueue as the caller without forcing a reschedule.
If the task is not already on a runqueue, this is easy. If it is,
we give up. Result: the previous bandwidth numbers are restored.
BEFORE
------
Pipe latency: 6.5185 microseconds
Pipe bandwidth: 86.35 MB/sec
AFTER
-----
Pipe latency: 6.5723 microseconds
Pipe bandwidth: 540.13 MB/sec
Comments? If anyone would like to see/test the code (pretty simple
really) let me know.
--
Mike
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-06 8:20 rwhron
@ 2002-05-06 16:42 ` Andrea Arcangeli
0 siblings, 0 replies; 24+ messages in thread
From: Andrea Arcangeli @ 2002-05-06 16:42 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel
On Mon, May 06, 2002 at 04:20:05AM -0400, rwhron@earthlink.net wrote:
> > BTW, Randy, I've seen that my tree runs slower with tiobench; that's probably
> > because I made the elevator anti-starvation logic more aggressive than
> > mainline and the other kernel trees (to help interactive usage), could
> > you try to run tiobench on -aa after elvtune -r 8192 -w 16384
> > /dev/hd[abcd] to verify? Thanks for the great benchmarking effort.
>
> I will have results on the big machine in a couple days. On the
> small machine, elvtune increases tiobench sequential reads by
> 30-50%, and lowers worst case latency a little.
OK, everything is fine then; thanks for the further benchmarks. Not sure
if I should increase the elvtune defaults: the max latency with 8
reading threads literally doubles (from a mean of 500/600 to 1200). OTOH
with 128 threads max latency even decreases (most probably because of
the higher mean throughput).
> And the reason fork is faster in -aa is partly thanks to the
> reschedule-child-first logic, which can easily be merged into
> mainline; it's already in 2.5.
>
> Is that part of the parent_timeslice patch? parent_timeslice helped
Yep.
> fork a little when I tried isolating patches to find what
> makes fork faster in -aa. It is more than one patch as far as
> I can tell.
>
> On the uniprocessor unixbench execl test, all -aa kernels going back
> at least to 2.4.15aa1 are about 20% faster than other trees, even those
> like jam and akpm's split VM. Fork in -aa for a more "real world"
> test (autoconf build) is about 8-10% faster than other kernel trees.
>
> On the quad Xeon, with its bigger L2 cache, the autoconf (fork test)
> difference between mainline and -aa is smaller. The -aa based VMs in
> aa, jam, and mainline have about a 15% edge over the rmap VM in ac and
> rmap. jam has a slight advantage for the autoconf build, possibly
> because of the O(1) effect, which is more likely to show up since more
> processes execute on the 4-way box.
>
> More quad Xeon at:
> http://home.earthlink.net/~rwhron/kernel/bigbox.html
>
>
> --
> Randy Hron
Andrea
* Re: O(1) scheduler gives big boost to tbench 192
@ 2002-05-06 8:20 rwhron
2002-05-06 16:42 ` Andrea Arcangeli
0 siblings, 1 reply; 24+ messages in thread
From: rwhron @ 2002-05-06 8:20 UTC (permalink / raw)
To: andrea; +Cc: linux-kernel
> BTW, Randy, I've seen that my tree runs slower with tiobench; that's probably
> because I made the elevator anti-starvation logic more aggressive than
> mainline and the other kernel trees (to help interactive usage), could
> you try to run tiobench on -aa after elvtune -r 8192 -w 16384
> /dev/hd[abcd] to verify? Thanks for the great benchmarking effort.
I will have results on the big machine in a couple days. On the
small machine, elvtune increases tiobench sequential reads by
30-50%, and lowers worst case latency a little.
More -aa at:
http://home.earthlink.net/~rwhron/kernel/aa.html
> And the reason fork is faster in -aa is partly thanks to the
> reschedule-child-first logic, which can easily be merged into
> mainline; it's already in 2.5.
Is that part of the parent_timeslice patch? parent_timeslice helped
fork a little when I tried isolating patches to find what
makes fork faster in -aa. It is more than one patch as far as
I can tell.
On the uniprocessor unixbench execl test, all -aa kernels going back
at least to 2.4.15aa1 are about 20% faster than other trees, even those
like jam and akpm's split VM. Fork in -aa for a more "real world"
test (autoconf build) is about 8-10% faster than other kernel trees.
On the quad Xeon, with its bigger L2 cache, the autoconf (fork test)
difference between mainline and -aa is smaller. The -aa based VMs in
aa, jam, and mainline have about a 15% edge over the rmap VM in ac and
rmap. jam has a slight advantage for the autoconf build, possibly
because of the O(1) effect, which is more likely to show up since more
processes execute on the 4-way box.
More quad Xeon at:
http://home.earthlink.net/~rwhron/kernel/bigbox.html
--
Randy Hron
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-03 20:29 ` Gerrit Huizenga
@ 2002-05-04 8:13 ` Andrea Arcangeli
0 siblings, 0 replies; 24+ messages in thread
From: Andrea Arcangeli @ 2002-05-04 8:13 UTC (permalink / raw)
To: Gerrit Huizenga; +Cc: rwhron, linux-kernel, alan
On Fri, May 03, 2002 at 01:29:11PM -0700, Gerrit Huizenga wrote:
In message <20020503093856.A27263@rushmore>, rwhron@earthlink.net writes:
> > > > > Rumor is that on some workloads MQ it outperforms O(1), but it
> > > > > may be that the latest (post K3?) O(1) is catching up?
> >
> > Is MQ based on the Davide Libenzi scheduler?
> > (a version of Davide's scheduler is in the -aa tree).
>
> No - Davide's is another variant. All three had similar goals
Davide's patch reduces the complexity of the scheduler from O(N), where
N is the number of tasks in the system, to O(N), where N is the number
of simultaneously running tasks in the system. It's also a simple
optimization, and it can make responsiveness even better than the
mainline scheduler. I know many people are using 2.4 with a few
thousand tasks with only a few of them (let's say a dozen) running
simultaneously, so while the O(1) scheduler would be even better, the
dyn-sched patch from Davide looked the most attractive for production
usage given its simplicity. I also refined it a little while merging it
with Davide's overview.
Soon I will also get into merging the O(1) scheduler, but first I want
to inspect the interactivity and pipe bandwidth effects, at least to
understand why they are getting affected (a first variable could be the
removal of the sync-wakeup; the O(1) scheduler needs to get the wakeup
anyway in order to run the task, but we can still teach the O(1)
scheduler that it must not try to reschedule the current task after
queueing the new one). In theory dyn-sched would be almost optimal for
well-written applications with only one task per CPU using async I/O,
but of course in many benchmarks (and in some real-life environments)
there is a huge number of tasks running simultaneously, and so
dyn-sched doesn't help much there compared to mainline.
BTW, Randy, I've seen that my tree runs slower with tiobench; that's
probably because I made the elevator anti-starvation logic more
aggressive than mainline and the other kernel trees (to help
interactive usage). Could you try to run tiobench on -aa after elvtune
-r 8192 -w 16384 /dev/hd[abcd] to verify? Thanks for the great
benchmarking effort.
And the reason fork is faster in -aa is partly thanks to the
reschedule-child-first logic, which can easily be merged into mainline;
it's already in 2.5.
> and similar changes. MQ was the "first" public one written by
> Mike Kravetz and Hubertus Franke with help from a number of others.
>
> > tbench 192 is an anomaly test too. AIM looks like a nice
> > "mixed" bench. Do you have any scripts for it? I'd like
> > to use AIM too.
>
> The SGI folks may be using more custom scripts. I think there
> is a reasonable set of options in the released package. OSDL
> might also be playing with it (Wookie, are you out here?). Sequent
> used to have a large set of scripts but I don't know where those
> are at the moment. I may check around.
>
> > A side effect of O(1) in ac2 and jam6 on the 4 way box is a decrease
> > in pipe bandwidth and an increase in pipe latency measured by lmbench:
>
> Not surprised. That seems to be one of our problems with
> volanomark testing at the moment and we have some hacks to help,
> one in TCP which allows the receiver to be scheduled on a "close"
> CPU which seems to help latency. Others are tweaks of the
> scheduler itself, with nothing conclusively better yet.
>
> gerrit
Andrea
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-03 13:38 rwhron
@ 2002-05-03 20:29 ` Gerrit Huizenga
2002-05-04 8:13 ` Andrea Arcangeli
2002-05-07 22:13 ` Mike Kravetz
1 sibling, 1 reply; 24+ messages in thread
From: Gerrit Huizenga @ 2002-05-03 20:29 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel, alan
In message <20020503093856.A27263@rushmore>, rwhron@earthlink.net writes:
> > > > Rumor is that on some workloads MQ it outperforms O(1), but it
> > > > may be that the latest (post K3?) O(1) is catching up?
>
> Is MQ based on the Davide Libenzi scheduler?
> (a version of Davide's scheduler is in the -aa tree).
No - Davide's is another variant. All three had similar goals
and similar changes. MQ was the "first" public one written by
Mike Kravetz and Hubertus Franke with help from a number of others.
> tbench 192 is an anomaly test too. AIM looks like a nice
> "mixed" bench. Do you have any scripts for it? I'd like
> to use AIM too.
The SGI folks may be using more custom scripts. I think there
is a reasonable set of options in the released package. OSDL
might also be playing with it (Wookie, are you out here?). Sequent
used to have a large set of scripts but I don't know where those
are at the moment. I may check around.
> A side effect of O(1) in ac2 and jam6 on the 4 way box is a decrease
> in pipe bandwidth and an increase in pipe latency measured by lmbench:
Not surprised. That seems to be one of our problems with
volanomark testing at the moment and we have some hacks to help,
one in TCP which allows the receiver to be scheduled on a "close"
CPU which seems to help latency. Others are tweaks of the
scheduler itself, with nothing conclusively better yet.
gerrit
* Re: O(1) scheduler gives big boost to tbench 192
@ 2002-05-03 13:38 rwhron
2002-05-03 20:29 ` Gerrit Huizenga
2002-05-07 22:13 ` Mike Kravetz
0 siblings, 2 replies; 24+ messages in thread
From: rwhron @ 2002-05-03 13:38 UTC (permalink / raw)
To: gh; +Cc: linux-kernel, alan
> > > Rumor is that on some workloads MQ outperforms O(1), but it
> > > may be that the latest (post K3?) O(1) is catching up?
Is MQ based on the Davide Libenzi scheduler?
(a version of Davide's scheduler is in the -aa tree).
> > I'd be interested to know what workloads ?
> AIM on large CPU count machines was the most significant I had heard
> about. Haven't measured recently on database load - we made a cut to
> O(1) some time back for simplicity. Supposedly volanomark was doing
> better for a while but again we haven't cut back to MQ in quite a while;
> trying instead to refine O(1). Volanomark is something of a scheduling
> anomaly though - sender/receiver timing on loopback affects scheduling
> decisions and overall throughput in ways that may or may not be consistent
> with real workloads. AIM is probably a better workload for "real life"
> random scheduling testing.
tbench 192 is an anomaly test too. AIM looks like a nice
"mixed" bench. Do you have any scripts for it? I'd like
to use AIM too.
A side effect of O(1) in ac2 and jam6 on the 4 way box is a decrease
in pipe bandwidth and an increase in pipe latency measured by lmbench:
kernel                   Pipe bandwidth in MB/s - bigger is better
-----------------------  ------
2.4.16                   383.93
2.4.19-pre3aa2           316.88
2.4.19-pre5              385.56
2.4.19-pre5-aa1          345.93
2.4.19-pre5-aa1-2g-hio   371.87
2.4.19-pre5-aa1-3g-hio   355.97
2.4.19-pre7              462.80
2.4.19-pre7-aa1          382.90
2.4.19-pre7-ac2           85.66
2.4.19-pre7-jam6          66.41
2.4.19-pre7-rl           464.60
2.4.19-pre7-rmap13       453.24
kernel                   Pipe latency in microseconds - smaller is better
-----------------------  -----
2.4.16                   12.73
2.4.19-pre3aa2           13.58
2.4.19-pre5              12.98
2.4.19-pre5-aa1          13.46
2.4.19-pre5-aa1-2g-hio   12.83
2.4.19-pre5-aa1-3g-hio   13.08
2.4.19-pre7              10.71
2.4.19-pre7-aa1          13.32
2.4.19-pre7-ac2          31.95
2.4.19-pre7-jam6         29.51
2.4.19-pre7-rl           10.71
2.4.19-pre7-rmap13       10.75
More at:
http://home.earthlink.net/~rwhron/kernel/bigbox.html
--
Randy Hron
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-03 0:14 ` Alan Cox
@ 2002-05-03 1:08 ` Gerrit Huizenga
0 siblings, 0 replies; 24+ messages in thread
From: Gerrit Huizenga @ 2002-05-03 1:08 UTC (permalink / raw)
To: Alan Cox; +Cc: rwhron, linux-kernel
In message <E173QiK-0005Bd-00@the-village.bc.nu>, Alan Cox writes:
>
> > Rumor is that on some workloads MQ outperforms O(1), but it
> > may be that the latest (post K3?) O(1) is catching up?
>
> I'd be interested to know what workloads ?
AIM on large CPU count machines was the most significant I had heard
about. Haven't measured recently on database load - we made a cut to
O(1) some time back for simplicity. Supposedly volanomark was doing
better for a while but again we haven't cut back to MQ in quite a while;
trying instead to refine O(1). Volanomark is something of a scheduling
anomaly though - sender/receiver timing on loopback affects scheduling
decisions and overall throughput in ways that may or may not be consistent
with real workloads. AIM is probably a better workload for "real life"
random scheduling testing.
gerrit
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-03 0:09 ` Gerrit Huizenga
2002-05-02 23:17 ` J.A. Magallon
@ 2002-05-03 0:14 ` Alan Cox
2002-05-03 1:08 ` Gerrit Huizenga
1 sibling, 1 reply; 24+ messages in thread
From: Alan Cox @ 2002-05-03 0:14 UTC (permalink / raw)
To: gh; +Cc: rwhron, linux-kernel
> If you are bored, you might compare this to the MQ scheduler
> at http://prdownloads.sourceforge.net/lse/2.4.14.mq-sched
>
> Also, I think rml did a backport of the 2.5.X version of O(1);
> I'm not sure if that is in -ac or -jam as yet.
-ac has Robert Love's backport of the additional fixes
> Rumor is that on some workloads MQ outperforms O(1), but it
> may be that the latest (post K3?) O(1) is catching up?
I'd be interested to know what workloads ?
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-02 21:36 rwhron
@ 2002-05-03 0:09 ` Gerrit Huizenga
2002-05-02 23:17 ` J.A. Magallon
2002-05-03 0:14 ` Alan Cox
0 siblings, 2 replies; 24+ messages in thread
From: Gerrit Huizenga @ 2002-05-03 0:09 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel
In message <20020502173656.A26986@rushmore>, rwhron@earthlink.net writes:
> On an OSDL 4 way x86 box the O(1) scheduler effect
> becomes obvious as the run queue gets large.
>
> 2.4.19-pre7-ac2 and 2.4.19-pre7-jam6 have the O(1) scheduler.
>
> At 192 processes, O(1) shows about 340% improvement in throughput.
> The dyn-sched in -aa appears to be somewhat improved over the
> standard scheduler.
>
> Numbers are in MB/second.
>
If you are bored, you might compare this to the MQ scheduler
at http://prdownloads.sourceforge.net/lse/2.4.14.mq-sched
Also, I think rml did a backport of the 2.5.X version of O(1);
I'm not sure if that is in -ac or -jam as yet.
Rumor is that on some workloads MQ outperforms O(1), but it
may be that the latest (post K3?) O(1) is catching up?
gerrit
* Re: O(1) scheduler gives big boost to tbench 192
2002-05-03 0:09 ` Gerrit Huizenga
@ 2002-05-02 23:17 ` J.A. Magallon
2002-05-03 0:14 ` Alan Cox
1 sibling, 0 replies; 24+ messages in thread
From: J.A. Magallon @ 2002-05-02 23:17 UTC (permalink / raw)
To: Gerrit Huizenga; +Cc: linux-kernel
On 2002.05.03 Gerrit Huizenga wrote:
>In message <20020502173656.A26986@rushmore>, rwhron@earthlink.net writes:
>> On an OSDL 4 way x86 box the O(1) scheduler effect
>> becomes obvious as the run queue gets large.
>>
>> 2.4.19-pre7-ac2 and 2.4.19-pre7-jam6 have the O(1) scheduler.
>>
>> At 192 processes, O(1) shows about 340% improvement in throughput.
>> The dyn-sched in -aa appears to be somewhat improved over the
>> standard scheduler.
>>
>> Numbers are in MB/second.
>>
>
>If you are bored, you might compare this to the MQ scheduler
>at http://prdownloads.sourceforge.net/lse/2.4.14.mq-sched
>
>Also, I think rml did a backport of the 2.5.X version of O(1);
>I'm not sure if that is in -ac or -jam as yet.
>
-jam6 is sched-O1-rml-2 (the backport).
>Rumor is that on some workloads MQ outperforms O(1), but it
>may be that the latest (post K3?) O(1) is catching up?
>
--
J.A. Magallon # Let the source be with you...
mailto:jamagallon@able.es
Mandrake Linux release 8.3 (Cooker) for i586
Linux werewolf 2.4.19-pre7-jam9 #2 SMP mié may 1 12:09:38 CEST 2002 i686
* O(1) scheduler gives big boost to tbench 192
@ 2002-05-02 21:36 rwhron
2002-05-03 0:09 ` Gerrit Huizenga
0 siblings, 1 reply; 24+ messages in thread
From: rwhron @ 2002-05-02 21:36 UTC (permalink / raw)
To: linux-kernel
On an OSDL 4 way x86 box the O(1) scheduler effect
becomes obvious as the run queue gets large.
2.4.19-pre7-ac2 and 2.4.19-pre7-jam6 have the O(1) scheduler.
At 192 processes, O(1) shows about 340% improvement in throughput.
The dyn-sched in -aa appears to be somewhat improved over the
standard scheduler.
Numbers are in MB/second.
tbench 192 processes
2.4.16                    29.39
2.4.17                    29.70
2.4.19-pre5               29.01
2.4.19-pre5-aa1           29.22
2.4.19-pre5-aa1-2g-hio    29.94
2.4.19-pre5-aa1-3g-hio    28.66
2.4.19-pre7               29.93
2.4.19-pre7-aa1           32.75
2.4.19-pre7-ac2          103.98
2.4.19-pre7-rmap13        29.46
2.4.19-pre7-jam6         104.98
2.4.19-pre7-rl            29.74
At 64 processes, O(1) helps a little. ac2 and jam6 have
the highest numbers here too.
tbench 64 processes
2.4.16                   101.99
2.4.17                   103.49
2.4.19-pre5-aa1          102.43
2.4.19-pre5-aa1-2g-hio   104.30
2.4.19-pre5-aa1-3g-hio   104.60
2.4.19-pre7              100.86
2.4.19-pre7-aa1          101.76
2.4.19-pre7-ac2          105.89
2.4.19-pre7-rmap13       100.94
2.4.19-pre7-rl            99.65
2.4.19-pre7-jam6         108.23
I've seen some benefit on a uniprocessor box running tbench 32
for kernels with O(1). Hmm, have to try tbench 192 on uniproc
and see if the difference is all scheduler overhead.
I'm putting together a page with more results on this machine.
It will be growing at:
http://home.earthlink.net/~rwhron/kernel/bigbox.html
--
Randy Hron
Thread overview: 24+ messages
2002-05-03 16:37 O(1) scheduler gives big boost to tbench 192 John Hawkes
-- strict thread matches above, loose matches on Subject: below --
2002-05-20 12:46 rwhron
2002-05-08 16:39 Bill Davidsen
2002-05-06 8:20 rwhron
2002-05-06 16:42 ` Andrea Arcangeli
2002-05-03 13:38 rwhron
2002-05-03 20:29 ` Gerrit Huizenga
2002-05-04 8:13 ` Andrea Arcangeli
2002-05-07 22:13 ` Mike Kravetz
2002-05-07 22:44 ` Alan Cox
2002-05-07 22:43 ` Mike Kravetz
2002-05-07 23:39 ` Robert Love
2002-05-07 23:48 ` Mike Kravetz
2002-05-08 15:34 ` Jussi Laako
2002-05-08 16:31 ` Robert Love
2002-05-08 17:02 ` Mike Kravetz
2002-05-09 0:26 ` Jussi Laako
2002-05-08 8:50 ` Andrea Arcangeli
2002-05-09 23:18 ` Mike Kravetz
2002-05-02 21:36 rwhron
2002-05-03 0:09 ` Gerrit Huizenga
2002-05-02 23:17 ` J.A. Magallon
2002-05-03 0:14 ` Alan Cox
2002-05-03 1:08 ` Gerrit Huizenga