linux-kernel.vger.kernel.org archive mirror
* 2.5.8 final -
@ 2002-04-14 21:06 J Sloan
  2002-04-15  5:46 ` 2.5.8 final - another data point J Sloan
  2002-04-15 14:15 ` 2.5.8 final - Luigi Genoni
  0 siblings, 2 replies; 9+ messages in thread
From: J Sloan @ 2002-04-14 21:06 UTC (permalink / raw)
  To: linux kernel

Observations -

The UP fix for the setup_per_cpu_areas compile
issue apparently didn't make it into 2.5.8 final,
so we had to apply the patch from 2.5.8-pre3
to get it to compile.
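
For reference, the shape of the fix is roughly the sketch
below - this is only an illustration of the UP stub involved,
not the actual 2.5.8-pre3 patch:

/* init/main.c -- illustrative sketch only, not the actual 2.5.8-pre3 patch */
#ifdef CONFIG_SMP
static void __init setup_per_cpu_areas(void)
{
	/* allocate and initialise one copy of the per-CPU data
	 * section for each possible CPU (SMP case, unchanged) */
}
#else
/* UP fix: the single static copy already exists, so the call made
 * unconditionally from start_kernel() becomes a no-op */
static inline void setup_per_cpu_areas(void) { }
#endif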

That said, everything works: all services are
running, all devices are working, and XFree86 is happy.

P4-B/1600, genuine Intel motherboard, running RH 7.2 + Rawhide

It also passes the Quake 3 Arena (q3a) test with snappy results

:-)

Joe




* Re: 2.5.8 final - another data point
  2002-04-14 21:06 2.5.8 final - J Sloan
@ 2002-04-15  5:46 ` J Sloan
  2002-04-15  6:35   ` J Sloan
  2002-04-15  7:18   ` Andrew Morton
  2002-04-15 14:15 ` 2.5.8 final - Luigi Genoni
  1 sibling, 2 replies; 9+ messages in thread
From: J Sloan @ 2002-04-15  5:46 UTC (permalink / raw)
  To: linux kernel; +Cc: J Sloan

J Sloan wrote:

> Observations -
>
> The up-fix for the setup_per_cpu_areas compile
> issue apparently didn't make it into 2.5.8-final,
> so we had to apply the patch from 2.5.8-pre3
> to get it to compile.
>
> That said, however, everything works, all services
> are running, all devices working, Xfree is happy.

Stop me if you've heard this one before -

But there is one additional observation:

dbench performance has regressed significantly
since 2.5.8-pre1: throughput is equivalent up to
8 instances, but at 16 and above, 2.5.8 final takes
a nosedive. Throughput at 128 instances is roughly
20% of 2.5.8-pre1 - which is itself not up to 2.4.xx
levels. I realize the BIO layer has been through
heavy surgery and is nowhere near optimized, but
this is just a data point...

hdparm -t shows normal performance levels,
for what it's worth

2.5.8-pre1
--------------
Throughput 151.152 MB/sec (NB=188.94 MB/sec  1511.52 MBit/sec)  1 procs
Throughput 152.177 MB/sec (NB=190.221 MB/sec  1521.77 MBit/sec)  2 procs
Throughput 151.965 MB/sec (NB=189.957 MB/sec  1519.65 MBit/sec)  4 procs
Throughput 151.068 MB/sec (NB=188.835 MB/sec  1510.68 MBit/sec)  8 procs
Throughput 43.0191 MB/sec (NB=53.7738 MB/sec  430.191 MBit/sec)  16 procs
Throughput 9.65171 MB/sec (NB=12.0646 MB/sec  96.5171 MBit/sec)  32 procs
Throughput 37.8267 MB/sec (NB=47.2833 MB/sec  378.267 MBit/sec)  64 procs
Throughput 14.0459 MB/sec (NB=17.5573 MB/sec  140.459 MBit/sec)  80 procs
Throughput 16.2971 MB/sec (NB=20.3714 MB/sec  162.971 MBit/sec)  128 procs

2.5.8-final
---------------
Throughput 152.948 MB/sec (NB=191.185 MB/sec  1529.48 MBit/sec)  1 procs
Throughput 151.597 MB/sec (NB=189.497 MB/sec  1515.97 MBit/sec)  2 procs
Throughput 150.377 MB/sec (NB=187.972 MB/sec  1503.77 MBit/sec)  4 procs
Throughput 150.159 MB/sec (NB=187.698 MB/sec  1501.59 MBit/sec)  8 procs
Throughput 7.25691 MB/sec (NB=9.07113 MB/sec  72.5691 MBit/sec)  16 procs
Throughput 6.36332 MB/sec (NB=7.95415 MB/sec  63.6332 MBit/sec)  32 procs
Throughput 5.55008 MB/sec (NB=6.9376 MB/sec  55.5008 MBit/sec)  64 procs
Throughput 5.82333 MB/sec (NB=7.27916 MB/sec  58.2333 MBit/sec)  80 procs
Throughput 3.40741 MB/sec (NB=4.25926 MB/sec  34.0741 MBit/sec)  128 procs









* Re: 2.5.8 final - another data point
  2002-04-15  5:46 ` 2.5.8 final - another data point J Sloan
@ 2002-04-15  6:35   ` J Sloan
  2002-04-15  7:27     ` Andrew Morton
  2002-04-15  7:18   ` Andrew Morton
  1 sibling, 1 reply; 9+ messages in thread
From: J Sloan @ 2002-04-15  6:35 UTC (permalink / raw)
  To: J Sloan; +Cc: linux kernel

FWIW -

One other observation was the numerous
syslog entries generated during the test,
which were as follows:


Apr 14 20:40:35 neo kernel: invalidate: busy buffer
Apr 14 20:41:15 neo last message repeated 72 times
Apr 14 20:44:41 neo last message repeated 36 times
Apr 14 20:45:24 neo last message repeated 47 times


J Sloan wrote:

> dbench performance has regressed significantly
> since 2.5.8-pre1; 

>
> 2.5.8-pre1
> --------------
> Throughput 37.8267 MB/sec (NB=47.2833 MB/sec  378.267 MBit/sec)  64 procs
> Throughput 14.0459 MB/sec (NB=17.5573 MB/sec  140.459 MBit/sec)  80 procs
> Throughput 16.2971 MB/sec (NB=20.3714 MB/sec  162.971 MBit/sec)  128 
> procs
>
> 2.5.8-final
> ---------------
> Throughput 5.55008 MB/sec (NB=6.9376 MB/sec  55.5008 MBit/sec)  64 procs
> Throughput 5.82333 MB/sec (NB=7.27916 MB/sec  58.2333 MBit/sec)  80 procs
> Throughput 3.40741 MB/sec (NB=4.25926 MB/sec  34.0741 MBit/sec)  128 
> procs
>





* Re: 2.5.8 final - another data point
  2002-04-15  5:46 ` 2.5.8 final - another data point J Sloan
  2002-04-15  6:35   ` J Sloan
@ 2002-04-15  7:18   ` Andrew Morton
  2002-04-15  8:14     ` J Sloan
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2002-04-15  7:18 UTC (permalink / raw)
  To: J Sloan; +Cc: linux kernel

J Sloan wrote:
> 
> ...
> dbench performance has regressed significantly
> since 2.5.8-pre1; the performance is equivalent
> up to 8 instances, but at 16 and above, 2.5.8 final
> takes a nosedive. Performance at 128 instances
> is approximately 20% of the throughput of
> 2.5.8-pre1 - which is in turn not up to 2.4.xx
> performance levels. I realize that the BIO has
> been through heavy surgery, and nowhere near
> optimized, but this is just a data point...

It's not related to BIO.  dbench is all about higher-level
memory management, high-level IO scheduling and butterfly
wings.
 
> ...
> Throughput 151.068 MB/sec (NB=188.835 MB/sec  1510.68 MBit/sec)  8 procs
> Throughput 43.0191 MB/sec (NB=53.7738 MB/sec  430.191 MBit/sec)  16 procs
> Throughput 9.65171 MB/sec (NB=12.0646 MB/sec  96.5171 MBit/sec)  32 procs
> Throughput 37.8267 MB/sec (NB=47.2833 MB/sec  378.267 MBit/sec)  64 procs

Consider that 32 proc line for a while.

>....
> 2.5.8-final
> ---------------
> Throughput 152.948 MB/sec (NB=191.185 MB/sec  1529.48 MBit/sec)  1 procs
> Throughput 151.597 MB/sec (NB=189.497 MB/sec  1515.97 MBit/sec)  2 procs
> Throughput 150.377 MB/sec (NB=187.972 MB/sec  1503.77 MBit/sec)  4 procs
> Throughput 150.159 MB/sec (NB=187.698 MB/sec  1501.59 MBit/sec)  8 procs
> Throughput 7.25691 MB/sec (NB=9.07113 MB/sec  72.5691 MBit/sec)  16 procs
> Throughput 6.36332 MB/sec (NB=7.95415 MB/sec  63.6332 MBit/sec)  32 procs

It's obviously fallen over some cliff.  Conceivably the larger readahead
window causes this.  How much memory does the machine have? `dbench 64'
on a 512 meg setup certainly causes readahead thrashing.  You can
stick a `printk("ouch");' into handle_ra_thrashing() and watch it...
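
Something like this is all that's needed - the file, signature
and body here are only indicative, the added printk is the
whole point:

/* mm/readahead.c -- indicative sketch; exact file, signature and body
 * may differ, only the added printk() matters */
static void handle_ra_thrashing(struct file *file)
{
	printk("ouch\n");	/* fires once per detected readahead thrash */

	/* ... existing logic that shrinks the readahead window ... */
}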


But really, all this stuff is in churn at present. I have patches here
which take `dbench 64' on 512 megs from this:


2.5.8:
Throughput 12.7343 MB/sec (NB=15.9179 MB/sec  127.343 MBit/sec)

to this:

2.5.8-akpm:
Throughput 49.2223 MB/sec (NB=61.5278 MB/sec  492.223 MBit/sec)

This is partly by just throwing more memory at it.  The gap
widens on highmem...

And that code isn't tuned yet - I do know that threads are getting
blocked by each other at the inode level.  And that ext2 is serialising
itself at the lock_super() level, and that if you fix that,
threads serialise on slab's cache_chain_sem (which is pretty
amazing...).
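
Concretely: every block allocation in an ext2 filesystem takes
the superblock lock, so all writers into one fs queue up behind
each other - sketched below, with the real signature and the
bitmap search simplified away:

/* fs/ext2/balloc.c -- simplified (real signature and bitmap scan
 * omitted) to show just the serialisation point */
int ext2_new_block(struct inode *inode, unsigned long goal, int *err)
{
	struct super_block *sb = inode->i_sb;
	int block = 0;

	lock_super(sb);		/* every allocator in this fs queues here */
	/* ... scan the block bitmaps, claim a free block ... */
	unlock_super(sb);

	return block;
}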

Patience.  2.5.later-on will perform well.  :)

-


* Re: 2.5.8 final - another data point
  2002-04-15  6:35   ` J Sloan
@ 2002-04-15  7:27     ` Andrew Morton
  2002-04-15  8:02       ` J Sloan
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2002-04-15  7:27 UTC (permalink / raw)
  To: J Sloan; +Cc: linux kernel

J Sloan wrote:
> 
> FWIW -
> 
> One other observation was the numerous
> syslog entries generated during the test,
> which were as follows:
> 
> Apr 14 20:40:35 neo kernel: invalidate: busy buffer
> Apr 14 20:41:15 neo last message repeated 72 times
> Apr 14 20:44:41 neo last message repeated 36 times
> Apr 14 20:45:24 neo last message repeated 47 times
> 

If that is happening during the dbench run, then something
is wrong.

What filesystem and I/O drivers are you using?  LVM?
RAID?

Please replace that line in fs/buffer.c:invalidate_bdev()
with a BUG(), or a show_stack(0), and send the ksymoops output.
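
i.e. something along these lines - the surrounding context is
abbreviated and the guard shown stands in for whatever wraps the
existing printk; only the swap matters:

	/* fs/buffer.c, invalidate_bdev() -- abbreviated context */
	if (buffer_busy(bh)) {
		/* printk("invalidate: busy buffer\n"); */
		BUG();		/* or show_stack(0) to log a trace and keep going */
	}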

Thanks.

-


* Re: 2.5.8 final - another data point
  2002-04-15  7:27     ` Andrew Morton
@ 2002-04-15  8:02       ` J Sloan
  0 siblings, 0 replies; 9+ messages in thread
From: J Sloan @ 2002-04-15  8:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux kernel

Andrew Morton wrote:

>J Sloan wrote:
>
>>
>>Apr 14 20:40:35 neo kernel: invalidate: busy buffer
>>
>
>If that is happening during the dbench run, then something
>is wrong.
>
I am reasonably sure that's when it was happening.

>
>
>What filesystem and I/O drivers are you using?  LVM?
>RAID?
>
Actually just plain old ext2 on IDE drives -

>
>Please replace that line in fs:buffer.c:invalidate_bdev()
>with a BUG(), or show_stack(0), send the ksymoops output.
>
OK, will do -

Joe





* Re: 2.5.8 final - another data point
  2002-04-15  7:18   ` Andrew Morton
@ 2002-04-15  8:14     ` J Sloan
  0 siblings, 0 replies; 9+ messages in thread
From: J Sloan @ 2002-04-15  8:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux kernel

Andrew Morton wrote:

>It's not related to BIO.  dbench is all about higher-level
>memory management, high-level IO scheduling and butterfly
>wings.
>
Yes, no doubt - and a lot of other deep magic
which is only dimly perceived by the likes
of yours truly...

>>
>>Throughput 150.159 MB/sec (NB=187.698 MB/sec  1501.59 MBit/sec)  8 procs
>>Throughput 7.25691 MB/sec (NB=9.07113 MB/sec  72.5691 MBit/sec)  16 procs
>>Throughput 6.36332 MB/sec (NB=7.95415 MB/sec  63.6332 MBit/sec)  32 procs
>>
>
>It's obviously fallen over some cliff.  Conceivably the larger readahead
>window causes this.  How much memory does the machine have? 
>
The box has 512 MB RAM -

>`dbench 64'
>on a 512 meg setup certainly causes readahead thrashing.  You can
>stick a `printk("ouch");' into handle_ra_thrashing() and watch it...
>
hmm - OK, will try that -

Just for giggles, same machine with 2.4.19-pre4-ac4 -

Throughput 150.979 MB/sec (NB=188.723 MB/sec  1509.79 MBit/sec)  1 procs
Throughput 150.796 MB/sec (NB=188.496 MB/sec  1507.96 MBit/sec)  2 procs
Throughput 151.185 MB/sec (NB=188.982 MB/sec  1511.85 MBit/sec)  4 procs
Throughput 141.255 MB/sec (NB=176.568 MB/sec  1412.55 MBit/sec)  8 procs
Throughput 105.066 MB/sec (NB=131.332 MB/sec  1050.66 MBit/sec)  16 procs
Throughput 69.3542 MB/sec (NB=86.6928 MB/sec  693.542 MBit/sec)  32 procs
Throughput 32.4904 MB/sec (NB=40.613 MB/sec  324.904 MBit/sec)  64 procs
Throughput 30.4824 MB/sec (NB=38.103 MB/sec  304.824 MBit/sec)  80 procs
Throughput 19.0265 MB/sec (NB=23.7832 MB/sec  190.265 MBit/sec)  128 procs

>
>
>Patience.  2.5.later-on will perform well.  :)
>
Oh, yes -

It's already quite usable for some workloads, and the
latency for workstation use is quite good - I am looking
forward to the maturation of this diamond in the rough.

:-)

Joe




* Re: 2.5.8 final -
  2002-04-14 21:06 2.5.8 final - J Sloan
  2002-04-15  5:46 ` 2.5.8 final - another data point J Sloan
@ 2002-04-15 14:15 ` Luigi Genoni
  2002-04-15 14:55   ` David S. Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Luigi Genoni @ 2002-04-15 14:15 UTC (permalink / raw)
  To: J Sloan; +Cc: linux kernel

Oh well, on sparc64 setup_per_cpu_areas() simply is
not declared, since sparc64 does not use GENERIC_PER_CPU.

Then asm/cacheflush.h, which linux/highmem.h requires,
does not exist.

And then PREEMPT_ACTIVE is not defined...

It seems I cannot test this release on sparc64 either, sigh!
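
Just to illustrate what the build is asking for (my guess at the
shape of it, not the actual tree):

/* Illustration only: a guess at what the sparc64 build is missing,
 * not the actual fix */

/* 1. a setup_per_cpu_areas() for ports without GENERIC_PER_CPU,
 *    e.g. an empty stub if the port keeps a single static copy: */
void __init setup_per_cpu_areas(void) { }

/* 2. an include/asm-sparc64/cacheflush.h gathering the existing
 *    flush_cache_*()/flush_dcache_page() definitions, now that
 *    linux/highmem.h pulls in <asm/cacheflush.h> directly */

/* 3. a PREEMPT_ACTIVE definition for the preempt code, e.g.: */
#define PREEMPT_ACTIVE	0x4000000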

On Sun, 14 Apr 2002, J Sloan wrote:

> Observations -
>
> The up-fix for the setup_per_cpu_areas compile
> issue apparently didn't make it into 2.5.8-final,
> so we had to apply the patch from 2.5.8-pre3
> to get it to compile.
>
> That said, however, everything works, all services
> are running, all devices working, Xfree is happy.
>
> P4-B/1600,  genuine intel mobo running RH 7.2+rawhide
>
> It also passes the q3a test with snappy results
>
> :-)
>
> Joe
>
>



* Re: 2.5.8 final -
  2002-04-15 14:15 ` 2.5.8 final - Luigi Genoni
@ 2002-04-15 14:55   ` David S. Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2002-04-15 14:55 UTC (permalink / raw)
  To: kernel; +Cc: joe, linux-kernel

   From: Luigi Genoni <kernel@Expansa.sns.it>
   Date: Mon, 15 Apr 2002 16:15:04 +0200 (CEST)

   OH well, on sparc64 setup_per_cpu_areas()  simply is
   not declared, since it is not a GENERIC_PER_CPU.
   
   then asm/cacheflush.h, required by linux/highmem.h,
   does not exist.
   
   And then PREEMPT_ACTIVE is not defined...
   
   it seems that I could not test under sparc64 also this release, sigh!

I just haven't pushed my tree yet; it will be fixed soon.
I've been busy with other things this weekend...
