* RE: lmbench results for 2.4 and 2.5 -- updated results
@ 2003-03-24 19:53 Pallipadi, Venkatesh
  2003-03-24 20:01 ` Larry McVoy
  0 siblings, 1 reply; 15+ messages in thread
From: Pallipadi, Venkatesh @ 2003-03-24 19:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds


> -----Original Message-----
> From: Linus Torvalds [mailto:torvalds@transmeta.com] 
> Sent: Monday, March 24, 2003 12:40 AM
> To: linux-kernel@vger.kernel.org
> Subject: Re: lmbench results for 2.4 and 2.5 -- updated results
> 
> 
> >--page fault (is this significant?)
> 
> I don't think so; there's something strange with the lmbench pagefault
> tests, they only have one significant digit of accuracy, and I don't even
> know what they are testing. Because of that lack of precision, it's
> hard to tell what the real change is.
> 

This single-digit accuracy comes from a minor integer-division bug in
lmbench: usecs/n is computed in integer arithmetic, so a latency of,
say, 2.08 usecs gets reported as just "2".
The appended LMbench patch should resolve it.

Thanks,
-Venkatesh

--- LMbench/src/lat_pagefault.c.org	Mon Mar 24 10:40:46 2003
+++ LMbench/src/lat_pagefault.c	Mon Mar 24 10:54:34 2003
@@ -67,5 +67,5 @@
 		n++;
 	}
 	use_int(sum);
-	fprintf(stderr, "Pagefaults on %s: %d usecs\n", file, usecs/n);
+	fprintf(stderr, "Pagefaults on %s: %f usecs\n", file, (1.0 * usecs) / n);
 }



* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24 19:53 lmbench results for 2.4 and 2.5 -- updated results Pallipadi, Venkatesh
@ 2003-03-24 20:01 ` Larry McVoy
  2003-03-24 21:09   ` Martin J. Bligh
  0 siblings, 1 reply; 15+ messages in thread
From: Larry McVoy @ 2003-03-24 20:01 UTC (permalink / raw)
  To: Pallipadi, Venkatesh; +Cc: linux-kernel, Linus Torvalds

On Mon, Mar 24, 2003 at 11:53:44AM -0800, Pallipadi, Venkatesh wrote:
> --- LMbench/src/lat_pagefault.c.org	Mon Mar 24 10:40:46 2003
> +++ LMbench/src/lat_pagefault.c	Mon Mar 24 10:54:34 2003
> @@ -67,5 +67,5 @@
>  		n++;
>  	}
>  	use_int(sum);
> -	fprintf(stderr, "Pagefaults on %s: %d usecs\n", file, usecs/n);
> +	fprintf(stderr, "Pagefaults on %s: %f usecs\n", file, (1.0 * usecs) / n);
>  }

It's been a long time since I've looked at this benchmark, has anyone 
stared at it and do you believe it measures anything useful?  If not,
I'll drop it from a future release.  If I remember correctly what I
was trying to do was to measure the cost of setting up the mapping
but I might be crackin smoke.
-- 
---
Larry McVoy              lm at bitmover.com          http://www.bitmover.com/lm


* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24 20:01 ` Larry McVoy
@ 2003-03-24 21:09   ` Martin J. Bligh
  2003-03-24 23:36     ` Andrew Morton
  0 siblings, 1 reply; 15+ messages in thread
From: Martin J. Bligh @ 2003-03-24 21:09 UTC (permalink / raw)
  To: Larry McVoy, Pallipadi, Venkatesh; +Cc: linux-kernel, Linus Torvalds

>> --- LMbench/src/lat_pagefault.c.org	Mon Mar 24 10:40:46 2003
>> +++ LMbench/src/lat_pagefault.c	Mon Mar 24 10:54:34 2003
>> @@ -67,5 +67,5 @@
>>  		n++;
>>  	}
>>  	use_int(sum);
>> -	fprintf(stderr, "Pagefaults on %s: %d usecs\n", file, usecs/n);
>> +	fprintf(stderr, "Pagefaults on %s: %f usecs\n", file, (1.0 * usecs) / n);
>>  }
> 
> It's been a long time since I've looked at this benchmark, has anyone 
> stared at it and do you believe it measures anything useful?  If not,
> I'll drop it from a future release.  If I remember correctly what I
> was trying to do was to measure the cost of setting up the mapping
> but I might be crackin smoke.

On a slightly related note, I played with lmbench a bit over the weekend,
but the results were too unstable to be useful ... they're also too short
to profile ;-( 

I presume it does 100 iterations of a test (like fork latency?). Or does 
it just do one? Can I make it do 1,000,000 iterations or something
fairly easily ? ;-) I didn't really look closely, just apt-get install
lmbench ... 

Thanks,

M.



* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24 22:04       ` Larry McVoy
@ 2003-03-24 22:04         ` Martin J. Bligh
  2003-03-24 22:23           ` Larry McVoy
  2003-03-24 22:19         ` Chris Friesen
  2003-03-25 18:23         ` Martin J. Bligh
  2 siblings, 1 reply; 15+ messages in thread
From: Martin J. Bligh @ 2003-03-24 22:04 UTC (permalink / raw)
  To: Larry McVoy, Andrew Morton; +Cc: venkatesh.pallipadi, linux-kernel, torvalds

>> > I presume it does 100 iterations of a test (like fork latency?). Or does 
>> > it just do one? Can I make it do 1,000,000 iterations or something
>> > fairly easily ? ;-) I didn't really look closely, just apt-get install
>> > lmbench ... 
>> 
>> Yes, that is something I've wanted several times.  Just a way to say "run
>> this test for ever so I can profile the thing".
>> 
>> Even a sleazy environment string would suffice.
> 
> It's been there; I suppose you need to read the source to figure it out,
> though the lmbench script also plays with this, I believe.

Yay! Thank you.
 
> work ~/LMbench2/bin/i686-pc-linux-gnu ENOUGH=1000000 time bw_pipe
> Pipe bandwidth: 655.37 MB/sec
> real    0m23.411s
> user    0m0.480s
> sys     0m1.180s
> 
> work ~/LMbench2/bin/i686-pc-linux-gnu time bw_pipe
> Pipe bandwidth: 809.81 MB/sec
> 
> real    0m2.821s
> user    0m0.480s
> sys     0m1.180s

Mmmm. Any idea why the results are so dramatically different? 655 vs 809?
Looks odd ;-)

m.



* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24 23:36     ` Andrew Morton
@ 2003-03-24 22:04       ` Larry McVoy
  2003-03-24 22:04         ` Martin J. Bligh
                           ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Larry McVoy @ 2003-03-24 22:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Martin J. Bligh, lm, venkatesh.pallipadi, linux-kernel, torvalds

On Mon, Mar 24, 2003 at 03:36:02PM -0800, Andrew Morton wrote:
> "Martin J. Bligh" <mbligh@aracnet.com> wrote:
> >
> > On a slightly related note, I played with lmbench a bit over the weekend,
> > but the results were too unstable to be useful ... they're also too short
> > to profile ;-( 
> > 
> > I presume it does 100 iterations of a test (like fork latency?). Or does 
> > it just do one? Can I make it do 1,000,000 iterations or something
> > fairly easily ? ;-) I didn't really look closely, just apt-get install
> > lmbench ... 
> 
> Yes, that is something I've wanted several times.  Just a way to say "run
> this test for ever so I can profile the thing".
> 
> Even a sleazy environment string would suffice.

It's been there; I suppose you need to read the source to figure it out,
though the lmbench script also plays with this, I believe.

work ~/LMbench2/bin/i686-pc-linux-gnu ENOUGH=1000000 time bw_pipe
Pipe bandwidth: 655.37 MB/sec
real    0m23.411s
user    0m0.480s
sys     0m1.180s

work ~/LMbench2/bin/i686-pc-linux-gnu time bw_pipe
Pipe bandwidth: 809.81 MB/sec

real    0m2.821s
user    0m0.480s
sys     0m1.180s


-- 
---
Larry McVoy              lm at bitmover.com          http://www.bitmover.com/lm


* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24 22:04       ` Larry McVoy
  2003-03-24 22:04         ` Martin J. Bligh
@ 2003-03-24 22:19         ` Chris Friesen
  2003-03-25 18:23         ` Martin J. Bligh
  2 siblings, 0 replies; 15+ messages in thread
From: Chris Friesen @ 2003-03-24 22:19 UTC (permalink / raw)
  To: Larry McVoy; +Cc: linux-kernel

Larry McVoy wrote:

> work ~/LMbench2/bin/i686-pc-linux-gnu ENOUGH=1000000 time bw_pipe
> Pipe bandwidth: 655.37 MB/sec
> real    0m23.411s
> user    0m0.480s
> sys     0m1.180s
> 
> work ~/LMbench2/bin/i686-pc-linux-gnu time bw_pipe
> Pipe bandwidth: 809.81 MB/sec
> 
> real    0m2.821s
> user    0m0.480s
> sys     0m1.180s

Why the difference?  Is it being scheduled out?  Should lmbench be (optionally) 
putting itself into a realtime scheduling class?
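
For what it's worth, a minimal sketch of what opting into a realtime class
might look like (SCHED_FIFO via sched_setscheduler(); the helper name is made
up, and whether lmbench should actually do this is exactly the question
above):

	#include <sched.h>
	#include <stdio.h>

	/*
	 * Hypothetical helper, not part of lmbench: try to move the calling
	 * process into SCHED_FIFO so it is not scheduled out in the middle
	 * of a timing run.  Needs sufficient privilege (historically root);
	 * on failure we just stay in the normal timesharing class.
	 */
	static void maybe_go_realtime(void)
	{
		struct sched_param sp = { .sched_priority = 1 };

		if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
			perror("sched_setscheduler (staying in default class)");
	}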

Chris

-- 
Chris Friesen                    | MailStop: 043/33/F10
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com



* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24 22:04         ` Martin J. Bligh
@ 2003-03-24 22:23           ` Larry McVoy
  0 siblings, 0 replies; 15+ messages in thread
From: Larry McVoy @ 2003-03-24 22:23 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Larry McVoy, Andrew Morton, venkatesh.pallipadi, linux-kernel, torvalds

> Mmmm. Any idea why the results are so dramatically different? 655 vs 809?

Yeah, two run-away mutt processes (*) eating up all the CPU.  When ENOUGH 
is small, i.e., less than a second or so, LMbench does a series of tests
and takes the mean (I believe, look at the source, lib_timing.c and *.h).
When ENOUGH is big it just does one run and reports that.  So the big run
was long enough that it was competing for time slices, while the default runs
are short enough that they get the whole slice.  It's actually possible to run
LMbench on a loaded system and get fairly accurate results if you have 
a decent enough clock.  

(*) I use rsh to get into the main machine here and ever since Red Hat 7.?
if I'm rsh-ed in from a laptop, put the laptop to sleep and the connection
gets dropped, my mutt sessions don't get SIGHUP or whatever they should 
get and they start sucking up CPU like there is no tomorrow.  Does anyone
know of a fix for this?
-- 
---
Larry McVoy              lm at bitmover.com          http://www.bitmover.com/lm


* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24 21:09   ` Martin J. Bligh
@ 2003-03-24 23:36     ` Andrew Morton
  2003-03-24 22:04       ` Larry McVoy
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2003-03-24 23:36 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: lm, venkatesh.pallipadi, linux-kernel, torvalds

"Martin J. Bligh" <mbligh@aracnet.com> wrote:
>
> On a slightly related note, I played with lmbench a bit over the weekend,
> but the results were too unstable to be useful ... they're also too short
> to profile ;-( 
> 
> I presume it does 100 iterations of a test (like fork latency?). Or does 
> it just do one? Can I make it do 1,000,000 iterations or something
> fairly easily ? ;-) I didn't really look closely, just apt-get install
> lmbench ... 

Yes, that is something I've wanted several times.  Just a way to say "run
this test for ever so I can profile the thing".

Even a sleazy environment string would suffice.



* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24 22:04       ` Larry McVoy
  2003-03-24 22:04         ` Martin J. Bligh
  2003-03-24 22:19         ` Chris Friesen
@ 2003-03-25 18:23         ` Martin J. Bligh
  2003-03-26  1:50           ` Larry McVoy
  2 siblings, 1 reply; 15+ messages in thread
From: Martin J. Bligh @ 2003-03-25 18:23 UTC (permalink / raw)
  To: Larry McVoy, Andrew Morton; +Cc: venkatesh.pallipadi, linux-kernel, torvalds

> work ~/LMbench2/bin/i686-pc-linux-gnu ENOUGH=1000000 time bw_pipe
> Pipe bandwidth: 655.37 MB/sec
> real    0m23.411s
> user    0m0.480s
> sys     0m1.180s
> 
> work ~/LMbench2/bin/i686-pc-linux-gnu time bw_pipe
> Pipe bandwidth: 809.81 MB/sec
> 
> real    0m2.821s
> user    0m0.480s
> sys     0m1.180s

OK, it's a bit more stable now ... before:

Process fork+exit: 294.4118 microseconds
Process fork+exit: 279.1500 microseconds
Process fork+exit: 280.0000 microseconds
Process fork+exit: 280.0000 microseconds
Process fork+exit: 277.2222 microseconds
Process fork+exit: 286.0000 microseconds
Process fork+exit: 277.6231 microseconds
Process fork+exit: 307.1176 microseconds
Process fork+exit: 295.4706 microseconds
Process fork+exit: 294.3529 microseconds

after:

Process fork+exit: 298.4124 microseconds
Process fork+exit: 298.6746 microseconds
Process fork+exit: 297.7784 microseconds
Process fork+exit: 294.8297 microseconds
Process fork+exit: 299.6249 microseconds
Process fork+exit: 297.6771 microseconds
Process fork+exit: 297.9801 microseconds
Process fork+exit: 293.1421 microseconds
Process fork+exit: 281.9868 microseconds

I can probably butcher that around by taking a few derived medians and
averages to get pretty consistent numbers out of it (std dev < 1% for 99%
of the time). Though 10 runs with ENOUGH=1000000 is kinda slow for all
tests, so I probably won't be able to do this by default for every version.
If there are any more suggestions on added stability, I'd love to hear them.

It's cool to have something big enough to profile too ;-)

Thanks very much,

M.



* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-25 18:23         ` Martin J. Bligh
@ 2003-03-26  1:50           ` Larry McVoy
  2003-03-26  2:09             ` Martin J. Bligh
  0 siblings, 1 reply; 15+ messages in thread
From: Larry McVoy @ 2003-03-26  1:50 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Larry McVoy, Andrew Morton, venkatesh.pallipadi, linux-kernel, torvalds

In general, LMbench optimizes for fast results over exactness.  You can
definitely get more accurate results by doing longer runs.  My view
at the time of writing it was that I was looking for the broad stroke
results because I was trying to measure differences between various
operating systems.  There was more than enough to show, so the results
didn't need to be precise; getting people to run the benchmark and
report results was more important.

If people are doing release runs to see if there are regressions, I
think that setting ENOUGH up to something longer is a good idea.
If there is enough interest, I could spend some time on this and
try and make a more accurate way to get results.  Let me know.

On Tue, Mar 25, 2003 at 10:23:50AM -0800, Martin J. Bligh wrote:
> > work ~/LMbench2/bin/i686-pc-linux-gnu ENOUGH=1000000 time bw_pipe
> > Pipe bandwidth: 655.37 MB/sec
> > real    0m23.411s
> > user    0m0.480s
> > sys     0m1.180s
> > 
> > work ~/LMbench2/bin/i686-pc-linux-gnu time bw_pipe
> > Pipe bandwidth: 809.81 MB/sec
> > 
> > real    0m2.821s
> > user    0m0.480s
> > sys     0m1.180s
> 
> OK, it's a bit more stable now ... before:
> 
> Process fork+exit: 294.4118 microseconds
> Process fork+exit: 279.1500 microseconds
> Process fork+exit: 280.0000 microseconds
> Process fork+exit: 280.0000 microseconds
> Process fork+exit: 277.2222 microseconds
> Process fork+exit: 286.0000 microseconds
> Process fork+exit: 277.6231 microseconds
> Process fork+exit: 307.1176 microseconds
> Process fork+exit: 295.4706 microseconds
> Process fork+exit: 294.3529 microseconds
> 
> after:
> 
> Process fork+exit: 298.4124 microseconds
> Process fork+exit: 298.6746 microseconds
> Process fork+exit: 297.7784 microseconds
> Process fork+exit: 294.8297 microseconds
> Process fork+exit: 299.6249 microseconds
> Process fork+exit: 297.6771 microseconds
> Process fork+exit: 297.9801 microseconds
> Process fork+exit: 293.1421 microseconds
> Process fork+exit: 281.9868 microseconds
> 
> I can probably butcher that around by taking a few derived medians and
> averages to get pretty consistent numbers out of it (std dev < 1% for 99%
> of the time). Though 10 runs with ENOUGH=1000000 is kinda slow for all
> tests, so I probably won't be able to do this by default for every version.
> If there are any more suggestions on added stability, I'd love to hear them.
> 
> It's cool to have something big enough to profile too ;-)
> 
> Thanks very much,
> 
> M.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
---
Larry McVoy              lm at bitmover.com          http://www.bitmover.com/lm


* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-26  1:50           ` Larry McVoy
@ 2003-03-26  2:09             ` Martin J. Bligh
  0 siblings, 0 replies; 15+ messages in thread
From: Martin J. Bligh @ 2003-03-26  2:09 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Andrew Morton, venkatesh.pallipadi, linux-kernel, torvalds

> In general, LMbench optimizes for fast results over exactness.  You can
> definitely get more accurate results by doing longer runs.  My view
> at the time of writing it was that I was looking for the broad stroke
> results because I was trying to measure differences between various
> operating systems.  There was more than enough to show, so the results
> didn't need to be precise; getting people to run the benchmark and
> report results was more important.
> 
> If people are doing release runs to see if there are regressions, I
> think that setting ENOUGH up to something longer is a good idea.
> If there is enough interest, I could spend some time on this and
> try and make a more accurate way to get results.  Let me know.

Well, I'd certainly be interested ... I think it's almost inevitable
that there's a bit of variation in timing ... the way I normally deal
with that is to do, say, 30 runs, sort the results, throw away the top
and bottom 10, and take the average of the middle 10.
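
For concreteness, a rough sketch of that scheme (30 samples, drop the 10
fastest and 10 slowest, average the middle 10); the function names and the
idea of folding it into lmbench's timing library are illustrative, not
lmbench's actual interface:

	#include <stdlib.h>

	/* Comparator for qsort() over double-valued timing samples. */
	static int cmp_double(const void *a, const void *b)
	{
		double x = *(const double *)a, y = *(const double *)b;
		return (x > y) - (x < y);
	}

	/*
	 * Illustrative trimmed mean: sort the n raw samples (usecs),
	 * discard the `trim` fastest and `trim` slowest, and average
	 * what is left.  n = 30, trim = 10 gives the "middle 10".
	 */
	static double trimmed_mean(double *samples, int n, int trim)
	{
		double sum = 0.0;
		int i;

		qsort(samples, n, sizeof(double), cmp_double);
		for (i = trim; i < n - trim; i++)
			sum += samples[i];
		return sum / (n - 2 * trim);
	}

(A std dev over the same kept samples would fall out of the same loop, which
ties in with the variance point further down.)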

Now at the moment, I presume you're presenting back an average of all
the runs you did ... if so, that loses the per-run data needed to calculate
that, so we have to do multiple runs, etc., and it all gets a bit slower.
If you can do that kind of stats op inside the lmbench tools, it
might help. Of course, I can do that outside as a wrapper, but then
there's the fork/exec setup time of the program to consider, etc etc.

The other interesting question is *why* there's so much variability
in results in the first place. Indicative of some inner kernel problem?
People have talked about page colouring, etc. before, but I've tried it
before and never managed to generate any data showing a benefit for the
std dev of runs or whatever. Incidentally, this means that giving the std
dev (of the used subset, and all results) from lmbench would be fun ;-)

The other problem was the "make it long enough to profile" thing ...
I think the ENOUGH trick you showed us is perfectly sufficient to solve
that one ;-)

Thanks,

M.



* RE: lmbench results for 2.4 and 2.5 -- updated results
@ 2003-03-24 20:11 Nakajima, Jun
  0 siblings, 0 replies; 15+ messages in thread
From: Nakajima, Jun @ 2003-03-24 20:11 UTC (permalink / raw)
  To: Larry McVoy, Pallipadi, Venkatesh; +Cc: linux-kernel, Linus Torvalds

I don't think it's measuring anything useful at this point, especially since the cost of start(0) and stop(0,0) (they eventually boil down to gettimeofday(), as far as I can tell from the code) is substantial compared to the page fault itself (sum += *end;). If you move them outside the while loop, I think it's better.

void
timeit(char *file, char *where, int size)
{
        char    *end = where + size - 16*1024;
        int     sum = 0;
        int     n = 0, usecs = 0;

        while (end > where) {
                start(0); 
                sum += *end;
                end -= 256*1024;
                usecs += stop(0,0);
                n++;
        }
        use_int(sum);
        fprintf(stderr, "Pagefaults on %s: %d usecs\n", file, usecs/n);
}
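
A rough sketch of that restructuring (one start()/stop() pair around the
whole walk instead of one per iteration, with Venkatesh's floating-point fix
folded in); it reuses the helpers from the excerpt above and is meant as an
illustration, not a tested lmbench patch:

	void
	timeit(char *file, char *where, int size)
	{
		char	*end = where + size - 16*1024;
		int	sum = 0;
		int	n = 0, usecs = 0;

		start(0);		/* one timestamp before the walk */
		while (end > where) {
			sum += *end;	/* the page fault being measured */
			end -= 256*1024;
			n++;
		}
		usecs = stop(0, 0);	/* one timestamp after the walk */
		use_int(sum);
		fprintf(stderr, "Pagefaults on %s: %f usecs\n", file,
			(1.0 * usecs) / n);
	}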



> -----Original Message-----
> From: Larry McVoy [mailto:lm@bitmover.com]
> Sent: Monday, March 24, 2003 12:01 PM
> To: Pallipadi, Venkatesh
> Cc: linux-kernel@vger.kernel.org; Linus Torvalds
> Subject: Re: lmbench results for 2.4 and 2.5 -- updated results
> 
> On Mon, Mar 24, 2003 at 11:53:44AM -0800, Pallipadi, Venkatesh wrote:
> > --- LMbench/src/lat_pagefault.c.org	Mon Mar 24 10:40:46 2003
> > +++ LMbench/src/lat_pagefault.c	Mon Mar 24 10:54:34 2003
> > @@ -67,5 +67,5 @@
> >  		n++;
> >  	}
> >  	use_int(sum);
> > -	fprintf(stderr, "Pagefaults on %s: %d usecs\n", file, usecs/n);
> > +	fprintf(stderr, "Pagefaults on %s: %f usecs\n", file, (1.0 * usecs) / n);
> >  }
> 
> It's been a long time since I've looked at this benchmark, has anyone
> stared at it and do you believe it measures anything useful?  If not,
> I'll drop it from a future release.  If I remember correctly what I
> was trying to do was to measure the cost of setting up the mapping
> but I might be crackin smoke.
> --
> ---
> Larry McVoy              lm at bitmover.com
> http://www.bitmover.com/lm
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24  8:39   ` Linus Torvalds
@ 2003-03-24  9:03     ` William Lee Irwin III
  0 siblings, 0 replies; 15+ messages in thread
From: William Lee Irwin III @ 2003-03-24  9:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, rwhron

Chris Friesen  <cfriesen@nortelnetworks.com> wrote:
>> The ones that stand out are:
>> --fork/exec (due to rmap I assume?)
>> --mmap (also due to rmap?)

On Mon, Mar 24, 2003 at 08:39:34AM +0000, Linus Torvalds wrote:
> Yes. You could try the objrmap patches, they are supposed to help. They
> may be in -mm, I'm not sure.

I recently asked Randy Hron which 2.5.x patches made the biggest
difference in the tests he's done. He pasted the following:

kernel                 null     null                      open   signal  signal     fork   execve  /bin/sh
                       call      I/O    stat   fstat     close  install  handle  process  process  process
2.5.65                 0.66  0.96298    3.60    1.48      5.31     1.92    3.89     1279     3233    13703
2.5.65-mm1             0.63  1.04114    3.65    1.57      6.39     2.29    3.92     1370     3621    13985
2.5.65-mm2             0.65  0.98654    3.64    1.46      6.88     1.91    3.94     1511     3676    13502
2.5.65-mm2-anobjrmap   0.66  0.96061    3.82    1.45      5.38     1.90    4.68     1414     3497    13169
2.2.23                 0.42  0.80455    4.76    1.24      5.77     1.43    2.74      788     2303    30829
2.4.21-pre4aa3         0.62  0.72201    3.44    1.02      5.32     1.41    3.43      848     2114    10117
2.4.21-pre5            0.62  0.75284    3.18    1.02      5.35     1.41    3.25      927     2559    11884
2.4.21-pre5-akpm       0.61  0.73119    3.32    1.02      5.28     1.41    3.16      865     2421    11636
2.5.63-mjb1            0.66  1.12795    4.01    1.64      6.66     1.92    4.49     1125     2793    12475
2.5.62-mjb2            0.64  1.09703    4.12    1.66      5.77     1.89    4.05     1128     2888    12669
2.5.63-mjb2            0.67  1.03824    4.12    1.66      5.87     1.90    4.39     1144     2985    12650
2.5.62-mm3             0.62  0.95155    4.72    1.42      7.55     1.90    3.92     1164     3073    13101


-- wli


* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-24  6:08 ` lmbench results for 2.4 and 2.5 -- updated results Chris Friesen
@ 2003-03-24  8:39   ` Linus Torvalds
  2003-03-24  9:03     ` William Lee Irwin III
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2003-03-24  8:39 UTC (permalink / raw)
  To: linux-kernel

In article <3E7EA0F6.8000308@nortelnetworks.com>,
Chris Friesen  <cfriesen@nortelnetworks.com> wrote:
>
>Here are the results of 2.4.20 and 2.5.65 with as close to matching configs as I 
>could make them.
>
>The ones that stand out are:
>--fork/exec (due to rmap I assume?)
>--mmap (also due to rmap?)

Yes. You could try the objrmap patches, they are supposed to help. They
may be in -mm, I'm not sure.

>--select latency (any ideas?)

I think this is due to the extra TCP debugging, but it might be
something else. To disable the debugging, remove the setting of 
NETIF_F_TSO in linux/drivers/net/loopback.c, and re-test:

        /* Current netfilter will die with oom linearizing large skbs,
         * however this will be cured before 2.5.x is done.
         */
        dev->features          |= NETIF_F_TSO;
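
In other words, the re-test amounts to dropping (or commenting out) that one
assignment and rebuilding; a sketch of the edited spot, shown as a comment-out
purely for illustration:

        /* Current netfilter will die with oom linearizing large skbs,
         * however this will be cured before 2.5.x is done.
         */
        /* dev->features          |= NETIF_F_TSO; */    /* disabled for the re-test */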

>--udp latency (related to select latency?)

I doubt it. But there might be some more overhead somewhere. You should
also run lmbench at least three times to get some feeling for the
variance of the numbers; it can be quite big.

>--page fault (is this significant?)

I don't think so; there's something strange with the lmbench pagefault
tests, they only have one significant digit of accuracy, and I don't even
know what they are testing. Because of that lack of precision, it's
hard to tell what the real change is.

>--tcp bandwidth (explained as debugging code)

See if the NETIF_F_TSO change makes any difference. If performance is
still bad, holler.

		Linus


* Re: lmbench results for 2.4 and 2.5 -- updated results
  2003-03-22 16:11 lmbench results for 2.4 and 2.5 Chris Friesen
@ 2003-03-24  6:08 ` Chris Friesen
  2003-03-24  8:39   ` Linus Torvalds
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Friesen @ 2003-03-24  6:08 UTC (permalink / raw)
  To: linux-kernel


Okay, I'm somewhat chagrined but a bit relieved at the same time.  Linus' 
comment about being sure that I'm testing the same setup prompted me to go
through and double-check my config. Turns out that I had some debug stuff turned
on.  Duh.

Here are the results of 2.4.20 and 2.5.65 with as close to matching configs as I 
could make them.

The ones that stand out are:
--fork/exec (due to rmap I assume?)
--mmap (also due to rmap?)
--select latency (any ideas?)
--udp latency (related to select latency?)
--page fault (is this significant?)
--tcp bandwidth (explained as debugging code)

Sorry about the bogus numbers last time around.

Chris


                  L M B E N C H  2 . 0   S U M M A R Y
                  ------------------------------------

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host            OS  Mhz null null      open selct sig  sig  fork exec sh
                         call  I/O stat clos TCP   inst hndl proc proc proc
---- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
doug  Linux 2.5.65  750 0.38 0.73 5.46 7.13  64.2 1.03 3.25 231. 1729 17.K
doug  Linux 2.4.20  750 0.37 0.50 3.84 5.48  17.5 0.96 3.36 185. 1373 15.K

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host            OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                    ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
---- ------------- ----- ------ ------ ------ ------ ------- -------
doug  Linux 2.5.65 1.420 2.9700  108.7   46.6  157.6    46.7   157.5
doug  Linux 2.4.20 1.120 2.3400   91.5   43.5  155.5    45.2   156.0

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host            OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                    ctxsw       UNIX         UDP         TCP conn
---- ------------- ----- ----- ---- ----- ----- ----- ----- ----
doug  Linux 2.5.65 1.420 7.642 11.4  21.9  45.0  27.2  60.5 104.
doug  Linux 2.4.20 1.120 6.606 10.3  15.8  40.9  26.2  56.5 82.9

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host            OS   0K File      10K File      Mmap    Prot    Page
                    Create Delete Create Delete  Latency Fault   Fault
---- ------------- ------ ------ ------ ------  ------- -----   -----
doug  Linux 2.5.65   64.8   21.0  165.6   42.0   2550.0 0.946 4.00000
doug  Linux 2.4.20   66.1   20.5  192.8   51.6   1612.0 0.764 2.00000

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host           OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                         UNIX      reread reread (libc) (hand) read write
---- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
doug  Linux 2.5.65 196. 107. 51.1  222.5  363.4  217.9  217.6 489. 326.0
doug  Linux 2.4.20 233. 111. 90.0  253.6  370.0  223.8  226.1 498. 328.9



-- 
Chris Friesen                    | MailStop: 043/33/F10
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com




Thread overview: 15+ messages
2003-03-24 19:53 lmbench results for 2.4 and 2.5 -- updated results Pallipadi, Venkatesh
2003-03-24 20:01 ` Larry McVoy
2003-03-24 21:09   ` Martin J. Bligh
2003-03-24 23:36     ` Andrew Morton
2003-03-24 22:04       ` Larry McVoy
2003-03-24 22:04         ` Martin J. Bligh
2003-03-24 22:23           ` Larry McVoy
2003-03-24 22:19         ` Chris Friesen
2003-03-25 18:23         ` Martin J. Bligh
2003-03-26  1:50           ` Larry McVoy
2003-03-26  2:09             ` Martin J. Bligh
  -- strict thread matches above, loose matches on Subject: below --
2003-03-24 20:11 Nakajima, Jun
2003-03-22 16:11 lmbench results for 2.4 and 2.5 Chris Friesen
2003-03-24  6:08 ` lmbench results for 2.4 and 2.5 -- updated results Chris Friesen
2003-03-24  8:39   ` Linus Torvalds
2003-03-24  9:03     ` William Lee Irwin III
