linux-kernel.vger.kernel.org archive mirror
* Re: [BENCHMARK] contest 0.50 results to date
@ 2002-10-05 18:28 Paolo Ciarrocchi
  2002-10-05 19:15 ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Ciarrocchi @ 2002-10-05 18:28 UTC (permalink / raw)
  To: conman, linux-kernel; +Cc: akpm, rmaureira, rcastro

And here are my results:
noload:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [3]              128.8   97      0       0       1.01
2.4.19-0.24pre4 [3]     127.4   98      0       0       0.99
2.5.40 [3]              134.4   96      0       0       1.05
2.5.40-nopree [3]       133.7   96      0       0       1.04

process_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [3]              194.1   60      134     40      1.52
2.4.19-0.24pre4 [3]     193.2   60      133     40      1.51
2.5.40 [3]              184.5   70      53      31      1.44
2.5.40-nopree [3]       286.4   45      163     55      2.24

io_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [3]              461.0   28      46      8       3.60
2.4.19-0.24pre4 [3]     235.4   55      26      10      1.84
2.5.40 [3]              293.6   45      25      8       2.29
2.5.40-nopree [3]       269.4   50      20      7       2.10

mem_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [3]              161.1   80      38      2       1.26
2.4.19-0.24pre4 [3]     181.2   76      253     19      1.41
2.5.40 [3]              163.0   80      34      2       1.27
2.5.40-nopree [3]       161.7   80      34      2       1.26

Comments?

Paolo

* Re: [BENCHMARK] contest 0.50 results to date
  2002-10-05 18:28 [BENCHMARK] contest 0.50 results to date Paolo Ciarrocchi
@ 2002-10-05 19:15 ` Andrew Morton
  2002-10-05 20:56   ` Rodrigo Souza de Castro
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Andrew Morton @ 2002-10-05 19:15 UTC (permalink / raw)
  To: Paolo Ciarrocchi; +Cc: conman, linux-kernel, rmaureira, rcastro

Paolo Ciarrocchi wrote:
> 
> And here are my results:
> noload:
> Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> 2.4.19 [3]              128.8   97      0       0       1.01
> 2.4.19-0.24pre4 [3]     127.4   98      0       0       0.99
> 2.5.40 [3]              134.4   96      0       0       1.05
> 2.5.40-nopree [3]       133.7   96      0       0       1.04
> 
> process_load:
> Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> 2.4.19 [3]              194.1   60      134     40      1.52
> 2.4.19-0.24pre4 [3]     193.2   60      133     40      1.51
> 2.5.40 [3]              184.5   70      53      31      1.44
> 2.5.40-nopree [3]       286.4   45      163     55      2.24
> 
> io_load:
> Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> 2.4.19 [3]              461.0   28      46      8       3.60
> 2.4.19-0.24pre4 [3]     235.4   55      26      10      1.84
> 2.5.40 [3]              293.6   45      25      8       2.29
> 2.5.40-nopree [3]       269.4   50      20      7       2.10
> 
> mem_load:
> Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> 2.4.19 [3]              161.1   80      38      2       1.26
> 2.4.19-0.24pre4 [3]     181.2   76      253     19      1.41
> 2.5.40 [3]              163.0   80      34      2       1.27
> 2.5.40-nopree [3]       161.7   80      34      2       1.26
> 

I think I'm going to have to be reminded what "Loads" and "LCPU"
mean, please.

For these sorts of tests, I think system-wide CPU% is an interesting
thing to track - both user and system.  If it is high then we're doing
well - doing real work.

The same isn't necessarily true of the compressed-cache kernel, because
it's doing extra work in-kernel, so CPU load comparisons there need
to be made with some caution.

Apart from observing overall CPU occupancy, we also need to monitor
fairness - one way of doing that is to measure the total kernel build
elapsed time.  Another way would be to observe how much actual progress
the streaming IO makes during the kernel build.
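
For illustration, a rough, untested sketch of what I mean by system-wide
CPU% - none of this exists in contest, the names are made up - sampling
/proc/stat before and after a command and reporting user+nice+system
occupancy over that interval:

    # Rough sketch only: sample /proc/stat around a command and report the
    # system-wide user+nice+system CPU% for that interval.
    import subprocess, sys

    def cpu_times():
        with open("/proc/stat") as f:
            # first line: "cpu  user nice system idle ..."
            return [float(x) for x in f.readline().split()[1:]]

    def system_cpu_percent(cmd):
        before = cpu_times()
        subprocess.call(cmd, shell=True)
        after = cpu_times()
        delta = [a - b for a, b in zip(after, before)]
        total = sum(delta)
        busy = delta[0] + delta[1] + delta[2]    # user + nice + system
        return 100.0 * busy / total if total else 0.0

    if __name__ == "__main__":
        print("system-wide CPU%%: %.1f" % system_cpu_percent(sys.argv[1]))

Wrapped around the whole kernel build it would give a build-wide figure,
though of course it counts everything else running on the box as well.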

What is "2.4.19-0.24pre4"?

I'd suggest that more tests be added.  Perhaps

- one competing streaming read

- several competing streaming reads

- competing "tar cf foo ./linux"

- competing "tar xf foo"

- competing "ls -lR > /dev/null"

It would be interesting to test -aa kernels as well - Andrea's kernels
tend to be well tuned.

* Re: [BENCHMARK] contest 0.50 results to date
  2002-10-05 19:15 ` Andrew Morton
@ 2002-10-05 20:56   ` Rodrigo Souza de Castro
  2002-10-06  1:03   ` Con Kolivas
  2002-10-06  5:38   ` load additions to contest Con Kolivas
  2 siblings, 0 replies; 8+ messages in thread
From: Rodrigo Souza de Castro @ 2002-10-05 20:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Paolo Ciarrocchi, conman, linux-kernel, rmaureira

On Sat, Oct 05, 2002 at 12:15:30PM -0700, Andrew Morton wrote:
> Paolo Ciarrocchi wrote:
> > 
[snip]
> > mem_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [3]              161.1   80      38      2       1.26
> > 2.4.19-0.24pre4 [3]     181.2   76      253     19      1.41
> > 2.5.40 [3]              163.0   80      34      2       1.27
> > 2.5.40-nopree [3]       161.7   80      34      2       1.26
> > 
> 
> I think I'm going to have to be reminded what "Loads" and "LCPU"
> mean, please.
> 
> For these sorts of tests, I think system-wide CPU% is an interesting
> thing to track - both user and system.  If it is high then we're doing
> well - doing real work.
> 
> The same isn't necessarily true of the compressed-cache kernel, because
> it's doing extra work in-kernel, so CPU load comparisons there need
> to be made with some caution.

Agreed. 

I guess scheduling is another important point here. Firstly, this
extra work, usually not substantial, may change the scheduling in the
system a little.

Secondly, given that compressed cache usually reduces the IO performed
by the system, it may change the scheduling depending on how much IO
it saves and on what the applications do. For example, mem_load
doesn't swap any pages when compressed cache is enabled (its data is
highly compressible), so it ends up using most of its CPU time slice.
On the vanilla kernel, mem_load is scheduled all the time to service
page faults.

-- 
Rodrigo



* Re: [BENCHMARK] contest 0.50 results to date
  2002-10-05 19:15 ` Andrew Morton
  2002-10-05 20:56   ` Rodrigo Souza de Castro
@ 2002-10-06  1:03   ` Con Kolivas
  2002-10-06  5:38   ` load additions to contest Con Kolivas
  2 siblings, 0 replies; 8+ messages in thread
From: Con Kolivas @ 2002-10-06  1:03 UTC (permalink / raw)
  To: Andrew Morton, Paolo Ciarrocchi; +Cc: linux-kernel, rmaureira, rcastro

On Sunday 06 Oct 2002 5:15 am, Andrew Morton wrote:
> Paolo Ciarrocchi wrote:
> > And here are my results:
> I think I'm going to have to be reminded what "Loads" and "LCPU"
> mean, please.

Loads for process_load is the number of iterations each load manages to
complete, divided by 10000. Loads for mem_load is the number of times
mem_load manages to access the RAM, divided by 1000. Loads for io_load is
the approximate number of megabytes it writes during the kernel compile,
divided by 100. The denominators were chosen to represent the data simply,
and they also correlate well with the significant digits.

LCPU% is the load's CPU% usage while it is running. The load is started 3
seconds before the kernel compile and takes a variable length of time to
finish, so LCPU% can be more than 100 minus the compile's CPU%.
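
To make the arithmetic concrete, a tiny hypothetical sketch (not the
actual contest code) that scales raw counters into the Loads column with
the denominators described above:

    # Hypothetical sketch, not contest itself: scale raw counters into the
    # Loads column using the denominators described above.
    DENOMINATOR = {
        "process_load": 10000,   # iterations completed by the load
        "mem_load": 1000,        # successful accesses to the ram
        "io_load": 100,          # approximate megabytes written
    }

    def loads_column(load_name, raw_count):
        return raw_count // DENOMINATOR[load_name]

    # e.g. an io_load run that wrote roughly 4600 megabytes during the
    # compile shows up in the table as loads_column("io_load", 4600) == 46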

> For these sorts of tests, I think system-wide CPU% is an interesting
> thing to track - both user and system.  If it is high then we're doing
> well - doing real work.

So the total CPU% being used during the kernel compile? CPU% + LCPU% should
be very close to this.  However, I'm not sure of the most accurate way to
find the average total CPU% used during just the kernel compile -
suggestions?

> The same isn't necessarily true of the compressed-cache kernel, because
> it's doing extra work in-kernel, so CPU load comparisons there need
> to be made with some caution.

That is clear, and also the reason I have a measure of work done by the load 
as well as just the lcpu% (which by itself is not very helpful).

> Apart from observing overall CPU occupancy, we also need to monitor
> fairness - one way of doing that is to measure the total kernel build
> elapsed time.  Another way would be to observe how much actual progress
> the streaming IO makes during the kernel build.

I believe that is what I'm already showing: the Time for each load is the
kernel build time, and Loads is the IO work done.

> What is "2.4.19-0.24pre4"?

Latest version of compressed cache. Note that in my testing of cc I used the 
optional LZO compression.

> I'd suggest that more tests be added.  Perhaps
>
> - one competing streaming read
>
> - several competing streaming reads
>
> - competing "tar cf foo ./linux"
>
> - competing "tar xf foo"
>
> - competing "ls -lR > /dev/null"
>

Sure, adding loads is easy enough; I just wasn't sure exactly what to add.
I'll give those a shot soon.

> It would be interesting to test -aa kernels as well - Andrea's kernels
> tend to be well tuned.

Where time permits, sure.

Regards,
Con.

* load additions to contest
  2002-10-05 19:15 ` Andrew Morton
  2002-10-05 20:56   ` Rodrigo Souza de Castro
  2002-10-06  1:03   ` Con Kolivas
@ 2002-10-06  5:38   ` Con Kolivas
  2002-10-06  6:11     ` Andrew Morton
  2 siblings, 1 reply; 8+ messages in thread
From: Con Kolivas @ 2002-10-06  5:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, rcastro, ciarrocchi

I've added some load conditions to an experimental version of contest 
(http://contest.kolivas.net) and here are some of the results I've obtained 
so far:

First, an explanation:
- Time shows how long it took to conduct the kernel compile in the
  presence of the load.
- CPU% shows what percentage of the CPU the kernel compile managed to use
  during compilation.
- Loads shows how many times the load managed to run while the kernel
  compile was happening.
- LCPU% shows the percentage of the CPU the load used while running.
- Ratio shows the ratio of kernel compilation time to the reference
  (2.4.19).

Use a fixed-width font to see the tables correctly.

tarc_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [2]              88.0    74      50      25      1.31
2.4.19-cc [1]           86.1    78      51      26      1.28
2.5.38 [1]              91.8    74      46      22      1.37
2.5.39 [1]              94.4    71      58      27      1.41
2.5.40 [1]              95.0    71      59      27      1.41
2.5.40-mm1 [1]          93.8    72      56      26      1.40

This load repeatedly creates a tar of the include directory of the linux
kernel. A decrease in performance is visible at 2.5.38 without a
concomitant increase in loads, but this had improved by 2.5.39.
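
For reference, a load of this sort is roughly the following shape - an
illustrative, untested sketch only, with made-up paths and a fixed
duration rather than contest's behaviour of stopping the load when the
compile finishes:

    # Illustrative only, not contest's implementation: repeatedly tar a
    # directory, counting completed runs.
    import subprocess, time

    def tarc_load(src="linux/include", out="tarc.tar", seconds=600):
        iterations = 0
        deadline = time.time() + seconds
        while time.time() < deadline:
            subprocess.call(["tar", "cf", out, src])
            iterations += 1
        return iterations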


tarx_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [2]              87.6    74      13      24      1.30
2.4.19-cc [1]           81.5    80      12      24      1.21
2.5.38 [1]              296.5   23      54      28      4.41
2.5.39 [1]              108.2   64      9       12      1.61
2.5.40 [1]              107.0   64      8       11      1.59
2.5.40-mm1 [1]          120.5   58      12      16      1.79

This load repeatedly extracts a tar of the include directory of the linux
kernel. A performance boost is seen with the compressed cache kernel,
consistent with this data being cached better (less IO). 2.5.38 shows very
heavy writing and pays a performance penalty for it. All the 2.5 kernels
show worse performance than the 2.4 kernels: the time taken to compile the
kernel is longer even though the amount of work done by the load has
decreased.


read_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [2]              134.1   54      14      5       2.00
2.4.19-cc [2]           92.5    72      22      20      1.38
2.5.38 [2]              100.5   76      9       5       1.50
2.5.39 [2]              101.3   74      14      6       1.51
2.5.40 [1]              101.5   73      13      5       1.51
2.5.40-mm1 [1]          104.5   74      9       5       1.56

This load repeatedly copies a file the size of physical memory to
/dev/null. Compressed caching shows the performance boost of caching more
of this data in physical RAM - the caveat is that this data is simple to
compress, so the advantage is overstated. The 2.5 kernels show equivalent
performance at 2.5.38 (time down, but at the expense of the load doing
less work) and better performance at 2.5.39-40 (time down with an
equivalent amount of load work being performed). 2.5.40-mm1 seems to
exhibit the same performance as 2.5.38.
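
Again purely illustrative (not contest's code): a read_load-style streamer
could size its file from /proc/meminfo and count full passes, something
like this:

    # Purely illustrative: size a file from /proc/meminfo, then repeatedly
    # read it and discard the data, counting full passes.
    import time

    def mem_bytes():
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1]) * 1024   # kB -> bytes
        raise RuntimeError("MemTotal not found")

    def make_file(path, size, chunk=1 << 20):
        with open(path, "wb") as f:
            for _ in range(0, size, chunk):
                f.write(b"\0" * chunk)   # zeros: highly compressible,
                                         # hence the caveat above

    def read_passes(path, seconds=600, chunk=1 << 20):
        passes, deadline = 0, time.time() + seconds
        while time.time() < deadline:
            with open(path, "rb") as f:
                while f.read(chunk):
                    pass
            passes += 1
        return passes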


lslr_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [2]              83.1    77      34      24      1.24
2.4.19-cc [1]           82.8    79      34      24      1.23
2.5.38 [1]              74.8    89      16      13      1.11
2.5.39 [1]              76.7    88      18      14      1.14
2.5.40 [1]              74.9    89      15      12      1.12
2.5.40-mm1 [1]          76.0    89      15      12      1.13

This load repeatedly does an `ls -lR > /dev/null`. The performance seems
to be similar overall, with the bias towards the kernel compilation being
completed sooner.

These were very interesting loads to conduct, as suggested by AKPM, and
depending on the feedback I get I will probably incorporate them into
contest.

Comments?
Con 

* Re: load additions to contest
  2002-10-06  5:38   ` load additions to contest Con Kolivas
@ 2002-10-06  6:11     ` Andrew Morton
  2002-10-06  6:56       ` Con Kolivas
  2002-10-06 12:07       ` Con Kolivas
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2002-10-06  6:11 UTC (permalink / raw)
  To: Con Kolivas; +Cc: linux-kernel, rcastro, ciarrocchi

Con Kolivas wrote:
> 
> ...
> 
> tarc_load:
> Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> 2.4.19 [2]              88.0    74      50      25      1.31
> 2.4.19-cc [1]           86.1    78      51      26      1.28
> 2.5.38 [1]              91.8    74      46      22      1.37
> 2.5.39 [1]              94.4    71      58      27      1.41
> 2.5.40 [1]              95.0    71      59      27      1.41
> 2.5.40-mm1 [1]          93.8    72      56      26      1.40
> 
> This load repeatedly creates a tar of the include directory of the linux
> kernel. You can see a decrease in performance was visible at 2.5.38 without a
> concomitant increase in loads, but this improved by 2.5.39.

Well the kernel compile took 7% longer, but the tar got 10% more
work done.  I expect this is a CPU scheduler artifact.  The scheduler
has changed so much, it's hard to draw any conclusions.

Everything there will be in cache.  I'd suggest that you increase the
size of the tarball a *lot*, so the two activities are competing for
disk.

> tarx_load:
> Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> 2.4.19 [2]              87.6    74      13      24      1.30
> 2.4.19-cc [1]           81.5    80      12      24      1.21
> 2.5.38 [1]              296.5   23      54      28      4.41
> 2.5.39 [1]              108.2   64      9       12      1.61
> 2.5.40 [1]              107.0   64      8       11      1.59
> 2.5.40-mm1 [1]          120.5   58      12      16      1.79
> 
> This load repeatedly extracts a tar  of the include directory of the linux
> kernel. A performance boost is noted by the compressed cache kernel
> consistent with this data being cached better (less IO). 2.5.38 shows very
> heavy writing and a performance penalty with that. All the 2.5 kernels show
> worse performance than the 2.4 kernels as the time taken to compile the
> kernel is longer even though the amount of work done by the load has
> decreased.

hm, that's interesting.  I assume the tar file is being extracted
into the same place each time?  Is tar overwriting the old version,
or are you unlinking the destination first?

It would be most interesting to rename the untarred tree, so nothing
is getting deleted.

Which filesystem are you using here?
 
> read_load:
> Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> 2.4.19 [2]              134.1   54      14      5       2.00
> 2.4.19-cc [2]           92.5    72      22      20      1.38
> 2.5.38 [2]              100.5   76      9       5       1.50
> 2.5.39 [2]              101.3   74      14      6       1.51
> 2.5.40 [1]              101.5   73      13      5       1.51
> 2.5.40-mm1 [1]          104.5   74      9       5       1.56
> 
> This load repeatedly copies a file the size of the physical memory to
> /dev/null. Compressed caching shows the performance boost of caching more of
> this data in physical ram - caveat is that this data would be simple to
> compress so the advantage is overstated. The 2.5 kernels show equivalent
> performance at 2.5.38 (time down at the expense of load down) but have better
> performance at 2.5.39-40 (time down with equivalent load being performed).
> 2.5.40-mm1 seems to exhibit the same performance as 2.5.38.

That's complex.  I expect there's a lot of eviction of executable
text happening here.  I'm working on tuning that up a bit.
 
> lslr_load:
> Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> 2.4.19 [2]              83.1    77      34      24      1.24
> 2.4.19-cc [1]           82.8    79      34      24      1.23
> 2.5.38 [1]              74.8    89      16      13      1.11
> 2.5.39 [1]              76.7    88      18      14      1.14
> 2.5.40 [1]              74.9    89      15      12      1.12
> 2.5.40-mm1 [1]          76.0    89      15      12      1.13
> 
> This load repeatedly does a `ls -lR >/dev/null`. The performance seems to be
> overall similar, with the bias towards the kernel compilation being performed
> sooner.

How many files were under the `ls -lR'?  I'd suggest "zillions", so
we get heavily into slab reclaim, and lots of inode and directory
cache thrashing and seeking...

* Re: load additions to contest
  2002-10-06  6:11     ` Andrew Morton
@ 2002-10-06  6:56       ` Con Kolivas
  2002-10-06 12:07       ` Con Kolivas
  1 sibling, 0 replies; 8+ messages in thread
From: Con Kolivas @ 2002-10-06  6:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, rcastro, ciarrocchi

On Sunday 06 Oct 2002 4:11 pm, Andrew Morton wrote:
> Con Kolivas wrote:
> > ...
> >
> > tarc_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [2]              88.0    74      50      25      1.31
> > 2.4.19-cc [1]           86.1    78      51      26      1.28
> > 2.5.38 [1]              91.8    74      46      22      1.37
> > 2.5.39 [1]              94.4    71      58      27      1.41
> > 2.5.40 [1]              95.0    71      59      27      1.41
> > 2.5.40-mm1 [1]          93.8    72      56      26      1.40
> >
> > This load repeatedly creates a tar of the include directory of the linux
> > kernel. You can see a decrease in performance was visible at 2.5.38
> > without a concomitant increase in loads, but this improved by 2.5.39.
>
> Well the kernel compile took 7% longer, but the tar got 10% more
> work done.  I expect this is a CPU scheduler artifact.  The scheduler
> has changed so much, it's hard to draw any conclusions.
>
> Everything there will be in cache.  I'd suggest that you increase the
> size of the tarball a *lot*, so the two activities are competing for
> disk.

OK, I'll go back to the original idea of tarring the whole kernel
directory. It needs to be something of constant size, obviously.

>
> > tarx_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [2]              87.6    74      13      24      1.30
> > 2.4.19-cc [1]           81.5    80      12      24      1.21
> > 2.5.38 [1]              296.5   23      54      28      4.41
> > 2.5.39 [1]              108.2   64      9       12      1.61
> > 2.5.40 [1]              107.0   64      8       11      1.59
> > 2.5.40-mm1 [1]          120.5   58      12      16      1.79
> >
> > This load repeatedly extracts a tar  of the include directory of the
> > linux kernel. A performance boost is noted by the compressed cache kernel
> > consistent with this data being cached better (less IO). 2.5.38 shows
> > very heavy writing and a performance penalty with that. All the 2.5
> > kernels show worse performance than the 2.4 kernels as the time taken to
> > compile the kernel is longer even though the amount of work done by the
> > load has decreased.
>
> hm, that's interesting.  I assume the tar file is being extracted
> into the same place each time?  Is tar overwriting the old version,
> or are you unlinking the destination first?

Into the same place and overwriting the original.

>
> It would be most interesting to rename the untarred tree, so nothing
> is getting deleted.

Ok, this is going to take up a lot of space though.

>
> Which filesystem are you using here?

ReiserFS (sorry, I don't have any other hardware/fs to test on)

>
> > read_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [2]              134.1   54      14      5       2.00
> > 2.4.19-cc [2]           92.5    72      22      20      1.38
> > 2.5.38 [2]              100.5   76      9       5       1.50
> > 2.5.39 [2]              101.3   74      14      6       1.51
> > 2.5.40 [1]              101.5   73      13      5       1.51
> > 2.5.40-mm1 [1]          104.5   74      9       5       1.56
> >
> > This load repeatedly copies a file the size of the physical memory to
> > /dev/null. Compressed caching shows the performance boost of caching more
> > of this data in physical ram - caveat is that this data would be simple
> > to compress so the advantage is overstated. The 2.5 kernels show
> > equivalent performance at 2.5.38 (time down at the expense of load down)
> > but have better performance at 2.5.39-40 (time down with equivalent load
> > being performed). 2.5.40-mm1 seems to exhibit the same performance as
> > 2.5.38.
>
> That's complex.  I expect there's a lot of eviction of executable
> text happening here.  I'm working on tuning that up a bit.
>
> > lslr_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [2]              83.1    77      34      24      1.24
> > 2.4.19-cc [1]           82.8    79      34      24      1.23
> > 2.5.38 [1]              74.8    89      16      13      1.11
> > 2.5.39 [1]              76.7    88      18      14      1.14
> > 2.5.40 [1]              74.9    89      15      12      1.12
> > 2.5.40-mm1 [1]          76.0    89      15      12      1.13
> >
> > This load repeatedly does a `ls -lR >/dev/null`. The performance seems to
> > be overall similar, with the bias towards the kernel compilation being
> > performed sooner.
>
> How many files were under the `ls -lR'?  I'd suggest "zillions", so
> we get heavily into slab reclaim, and lots of inode and directory
> cache thrashing and seeking...

The ls -lR was over an entire kernel tree (to remain constant between
runs). I don't think I can keep it constant and make it much bigger
without creating some sort of fake dir tree, unless you can suggest a
different approach. I guess an overall `ls -lR /` will not differ much in
size between runs, if you think that would be satisfactory.
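
If a fake dir tree turns out to be the way to go, something along these
lines would give a constant, arbitrarily large set of inodes to walk -
a hypothetical helper, with names and counts picked out of the air:

    # Hypothetical helper, not part of contest: build a deterministic fake
    # directory tree with lots of small files so `ls -lR` has a constant
    # but much larger set of inodes and dentries to walk.
    import os

    def make_fake_tree(root="fake_tree", dirs=100, subdirs=100, files=50):
        for d in range(dirs):
            for s in range(subdirs):
                path = os.path.join(root, "d%03d" % d, "s%03d" % s)
                os.makedirs(path, exist_ok=True)
                for f in range(files):
                    # empty files are enough; ls -lR only touches metadata
                    open(os.path.join(path, "f%03d" % f), "w").close()

    # 100 * 100 * 50 = 500,000 files; tune the counts to the disk available.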

Con

* Re: load additions to contest
  2002-10-06  6:11     ` Andrew Morton
  2002-10-06  6:56       ` Con Kolivas
@ 2002-10-06 12:07       ` Con Kolivas
  1 sibling, 0 replies; 8+ messages in thread
From: Con Kolivas @ 2002-10-06 12:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, rcastro, ciarrocchi

Here are the modifications as you suggested.

On Sunday 06 Oct 2002 4:11 pm, Andrew Morton wrote:
> Con Kolivas wrote:
> > ...
> >
> > tarc_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [2]              88.0    74      50      25      1.31
> > 2.4.19-cc [1]           86.1    78      51      26      1.28
> > 2.5.38 [1]              91.8    74      46      22      1.37
> > 2.5.39 [1]              94.4    71      58      27      1.41
> > 2.5.40 [1]              95.0    71      59      27      1.41
> > 2.5.40-mm1 [1]          93.8    72      56      26      1.40
> >
> > This load repeatedly creates a tar of the include directory of the linux
> > kernel. You can see a decrease in performance was visible at 2.5.38
> > without a concomitant increase in loads, but this improved by 2.5.39.

tarc_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [2]              106.5   70      1       8       1.59
2.5.38 [1]              97.2    79      1       6       1.45
2.5.39 [1]              91.8    83      1       6       1.37
2.5.40 [1]              96.9    80      1       6       1.44
2.5.40-mm1 [1]          94.4    81      1       6       1.41

This version tars the whole kernel tree. No files are overwritten this time. 
The results are definitely different, but the resolution in loads makes that 
value completely unhelpful. 

>
> Well the kernel compile took 7% longer, but the tar got 10% more
> work done.  I expect this is a CPU scheduler artifact.  The scheduler
> has changed so much, it's hard to draw any conclusions.
>
> Everything there will be in cache.  I'd suggest that you increase the
> size of the tarball a *lot*, so the two activities are competing for
> disk.
>
> > tarx_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [2]              87.6    74      13      24      1.30
> > 2.4.19-cc [1]           81.5    80      12      24      1.21
> > 2.5.38 [1]              296.5   23      54      28      4.41
> > 2.5.39 [1]              108.2   64      9       12      1.61
> > 2.5.40 [1]              107.0   64      8       11      1.59
> > 2.5.40-mm1 [1]          120.5   58      12      16      1.79
> >
> > This load repeatedly extracts a tar  of the include directory of the
> > linux kernel. A performance boost is noted by the compressed cache kernel
> > consistent with this data being cached better (less IO). 2.5.38 shows
> > very heavy writing and a performance penalty with that. All the 2.5
> > kernels show worse performance than the 2.4 kernels as the time taken to
> > compile the kernel is longer even though the amount of work done by the
> > load has decreased.

tarx_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [1]              132.4   55      2       9       1.97
2.5.38 [1]              120.5   63      2       8       1.79
2.5.39 [1]              108.3   69      1       6       1.61
2.5.40 [1]              110.7   68      1       6       1.65
2.5.40-mm1 [1]          191.5   39      3       7       2.85

This version extracts a tar of the whole kernel tree. No files are overwritten 
this time. Once again the results are very different, and the loads' 
resolution is almost too low for comparison. 

>
> hm, that's interesting.  I assume the tar file is being extracted
> into the same place each time?  Is tar overwriting the old version,
> or are you unlinking the destination first?
>
> It would be most interesting to rename the untarred tree, so nothing
> is getting deleted.
>
> Which filesystem are you using here?
>
> > read_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [2]              134.1   54      14      5       2.00
> > 2.4.19-cc [2]           92.5    72      22      20      1.38
> > 2.5.38 [2]              100.5   76      9       5       1.50
> > 2.5.39 [2]              101.3   74      14      6       1.51
> > 2.5.40 [1]              101.5   73      13      5       1.51
> > 2.5.40-mm1 [1]          104.5   74      9       5       1.56
> >
> > This load repeatedly copies a file the size of the physical memory to
> > /dev/null. Compressed caching shows the performance boost of caching more
> > of this data in physical ram - caveat is that this data would be simple
> > to compress so the advantage is overstated. The 2.5 kernels show
> > equivalent performance at 2.5.38 (time down at the expense of load down)
> > but have better performance at 2.5.39-40 (time down with equivalent load
> > being performed). 2.5.40-mm1 seems to exhibit the same performance as
> > 2.5.38.
>
> That's complex.  I expect there's a lot of eviction of executable
> text happening here.  I'm working on tuning that up a bit.
>
> > lslr_load:
> > Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
> > 2.4.19 [2]              83.1    77      34      24      1.24
> > 2.4.19-cc [1]           82.8    79      34      24      1.23
> > 2.5.38 [1]              74.8    89      16      13      1.11
> > 2.5.39 [1]              76.7    88      18      14      1.14
> > 2.5.40 [1]              74.9    89      15      12      1.12
> > 2.5.40-mm1 [1]          76.0    89      15      12      1.13
> >
> > This load repeatedly does a `ls -lR >/dev/null`. The performance seems to
> > be overall similar, with the bias towards the kernel compilation being
> > performed sooner.

lslr_load:
Kernel [runs]           Time    CPU%    Loads   LCPU%   Ratio
2.4.19 [1]              89.8    77      1       20      1.34
2.5.38 [1]              99.1    71      1       20      1.48
2.5.39 [1]              101.3   70      2       24      1.51
2.5.40 [1]              97.0    72      1       21      1.44
2.5.40-mm1 [1]          96.6    73      1       22      1.44

This version does `ls -lR /`. Note the balance is swayed toward taking longer 
to compile the kernel here, and loads' resolution is lost.

>
> How many files were under the `ls -lR'?  I'd suggest "zillions", so
> we get heavily into slab reclaim, and lots of inode and directory
> cache thrashing and seeking...

I hope this is helpful in some way. In this guise it would be difficult to
release these as a standard part of contest (fast hardware could
theoretically use up all the disk space with this). I'm more than happy to
conduct personalised tests for linux kernel development as needed, though.

Con
