* Some benchmarks on ARM
@ 2010-07-02 18:02 Robert Schwebel
2010-07-02 20:34 ` Magnus Lilja
` (6 more replies)
0 siblings, 7 replies; 29+ messages in thread
From: Robert Schwebel @ 2010-07-02 18:02 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
We have recently made some benchmarks, in order to get a little bit
better feeling about where ARM cpus are today, especially when it comes
to the "recent" ones, and in comparison to the Atom. So we collected a
few benchmarks (most from lmbench) and did some actual measurements.
Here is a little article:
http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
I'm pretty sure that there are quite a few things where people on LAKML
have good ideas where the effects come from or how to improve the
methodology - so I'd be glad to get some feedback from the community!
All measurements have been done on 2.6.34.
Thanks,
rsc
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
* Some benchmarks on ARM
2010-07-02 18:02 Some benchmarks on ARM Robert Schwebel
@ 2010-07-02 20:34 ` Magnus Lilja
2010-07-05 12:24 ` Marc Kleine-Budde
2010-07-03 5:44 ` Nicolas Pitre
` (5 subsequent siblings)
6 siblings, 1 reply; 29+ messages in thread
From: Magnus Lilja @ 2010-07-02 20:34 UTC (permalink / raw)
To: linux-arm-kernel
Hi Robert,
On 2010-07-02 20:02, Robert Schwebel wrote:
> Hi,
>
> We have recently made some benchmarks, in order to get a little bit
> better feeling about where ARM cpus are today, especially when it comes
> to the "recent" ones, and in comparison to the Atom. So we collected a
> few benchmarks (most from lmbench) and did some actual measurements.
>
> Here is a little article:
> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
>
> I'm pretty sure that there are quite a few things where people on LAKML
> have good ideas where the effects come from or how to improve the
> methodology - so I'd be glad to get some feedback from the community!
It would be nice if you could add the exact command lines you used for the different tests so it's easy to run the same tests on other boards as well. I suppose you added some flag to gcc to make use of the floating point hardware in i.MX35?
Could you also add information on the RAM characteristics of each board? DDR1/DDR2, 16 bit/32 bit wide, and speed (MHz).
Regards, Magnus Lilja
* Some benchmarks on ARM
2010-07-02 18:02 Some benchmarks on ARM Robert Schwebel
2010-07-02 20:34 ` Magnus Lilja
@ 2010-07-03 5:44 ` Nicolas Pitre
2010-07-05 13:04 ` Maurus Cuelenaere
2010-07-03 19:48 ` Baruch Siach
` (4 subsequent siblings)
6 siblings, 1 reply; 29+ messages in thread
From: Nicolas Pitre @ 2010-07-03 5:44 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, 2 Jul 2010, Robert Schwebel wrote:
> Hi,
>
> We have recently made some benchmarks, in order to get a little bit
> better feeling about where ARM cpus are today, especially when it comes
> to the "recent" ones, and in comparison to the Atom. So we collected a
> few benchmarks (most from lmbench) and did some actual measurements.
>
> Here is a little article:
> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
>
> I'm pretty sure that there are quite a few things where people on LAKML
> have good ideas where the effects come from or how to improve the
> methodology - so I'd be glad to get some feedback from the community!
It would be nice if you could add measurements for recent Marvell
products there, such as the Kirkwood (think SheevaPlug or the like
running at 1.2 GHz), or Dove. I would expect memory throughput on those
to be quite good.
Also, your article completely misses the other metric, which is
power consumption. After all, what would be most interesting is not the
various performance numbers per MHz but those performance numbers per
Watt.
Nicolas
* Some benchmarks on ARM
2010-07-02 18:02 Some benchmarks on ARM Robert Schwebel
2010-07-02 20:34 ` Magnus Lilja
2010-07-03 5:44 ` Nicolas Pitre
@ 2010-07-03 19:48 ` Baruch Siach
2010-07-03 20:08 ` Gilles Chanteperdrix
` (3 subsequent siblings)
6 siblings, 0 replies; 29+ messages in thread
From: Baruch Siach @ 2010-07-03 19:48 UTC (permalink / raw)
To: linux-arm-kernel
Hi Robert,
On Fri, Jul 02, 2010 at 08:02:57PM +0200, Robert Schwebel wrote:
> We have recently made some benchmarks, in order to get a little bit
> better feeling about where ARM cpus are today, especially when it comes
> to the "recent" ones, and in comparison to the Atom. So we collected a
> few benchmarks (most from lmbench) and did some actual measurements.
>
> Here is a little article:
> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
The ARM architecture version in the list of tested hardware is wrong for the
ARM1136 and Cortex-A8 cores. Should be ARMv6 and ARMv7, respectively.
baruch
--
~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -
* Some benchmarks on ARM
2010-07-02 18:02 Some benchmarks on ARM Robert Schwebel
` (2 preceding siblings ...)
2010-07-03 19:48 ` Baruch Siach
@ 2010-07-03 20:08 ` Gilles Chanteperdrix
2010-07-03 20:28 ` Russell King - ARM Linux
2010-07-05 8:51 ` Colin Tuckley
` (2 subsequent siblings)
6 siblings, 1 reply; 29+ messages in thread
From: Gilles Chanteperdrix @ 2010-07-03 20:08 UTC (permalink / raw)
To: linux-arm-kernel
Robert Schwebel wrote:
> Hi,
>
> We have recently made some benchmarks, in order to get a little bit
> better feeling about where ARM cpus are today, especially when it comes
> to the "recent" ones, and in comparison to the Atom. So we collected a
> few benchmarks (most from lmbench) and did some actual measurements.
>
> Here is a little article:
> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
>
> I'm pretty sure that there are quite a few things where people on LAKML
> have good ideas where the effects come from or how to improve the
> methodology - so I'd be glad to get some feedback from the community!
>
> All measurements have been done on 2.6.34.
The context switch time for PXA270 looks really suspicious. The worst
case context switch time of an AT91RM9200, an armv4 running at 180MHz,
is less than 300us, so I doubt that the context switch time of a PXA
can be that much worse.
--
Gilles.
* Some benchmarks on ARM
2010-07-03 20:08 ` Gilles Chanteperdrix
@ 2010-07-03 20:28 ` Russell King - ARM Linux
2010-07-04 9:47 ` Gilles Chanteperdrix
0 siblings, 1 reply; 29+ messages in thread
From: Russell King - ARM Linux @ 2010-07-03 20:28 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Jul 03, 2010 at 10:08:29PM +0200, Gilles Chanteperdrix wrote:
> Robert Schwebel wrote:
> > Hi,
> >
> > We have recently made some benchmarks, in order to get a little bit
> > better feeling about where ARM cpus are today, especially when it comes
> > to the "recent" ones, and in comparison to the Atom. So we collected a
> > few benchmarks (most from lmbench) and did some actual measurements.
> >
> > Here is a little article:
> > http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
> >
> > I'm pretty sure that there are quite a few things where people on LAKML
> > have good ideas where the effects come from or how to improve the
> > methodology - so I'd be glad to get some feedback from the community!
> >
> > All measurements have been done on 2.6.34.
>
> The context switch time for PXA270 looks really suspicious. The worst
> case context switch time of an AT91RM9200, an armv4 running at 180MHz,
> is less than 300us, so, I doubt that the context switch time of a PXA
> can be that much worse.
The measurement is of the thread and MM switch time, which'll involve
cache flushes on <= ARMv5.
On PXA, we have to 'read' (via means of D cache line allocations) 32K
of data into the cache in order to cause the existing data to be written
out. These are done from a range of addresses which don't exist in the
page tables, and so should not cause any bus activity other than the
write-outs. However, the cache still has to interact with the MMU to
try to fetch the requested data - which probably consumes some cycles.
ARM920 on the other hand can walk through every cache line and clean+
invalidate it. It has 64 lines in each segment, and 8 segments, each
line 32 bytes long, which gives a cache size of 16K.
I'd therefore expect ARM920's cache flushing to be quicker (in terms
of cycles consumed) than PXA.
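For reference, the cache geometry arithmetic above can be written out as a
small Python sketch. The ARM920 figures (64 lines per segment, 8 segments,
32-byte lines) and the 32K PXA figure come from this message; the 32-byte
line size assumed for the PXA data cache and the function names are
illustrative assumptions, not taken from the kernel source.

```python
def cache_size_bytes(lines_per_segment, segments, line_bytes):
    """Total size of a cache organised as segments of fixed-size lines."""
    return lines_per_segment * segments * line_bytes


def lines_to_touch(cache_bytes, line_bytes):
    """Number of cache lines that must be visited to cover the whole cache."""
    return cache_bytes // line_bytes


# ARM920: clean+invalidate walks every line of the cache.
arm920_size = cache_size_bytes(lines_per_segment=64, segments=8, line_bytes=32)
arm920_ops = lines_to_touch(arm920_size, 32)

# PXA: 32K are "read" to force the write-back (32-byte lines assumed here).
pxa_ops = lines_to_touch(32 * 1024, 32)

print(arm920_size, arm920_ops, pxa_ops)
```

Under these assumptions the PXA flush touches twice as many lines as the
ARM920 walk, before counting any extra cycles for the MMU interaction.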
* Some benchmarks on ARM
2010-07-03 20:28 ` Russell King - ARM Linux
@ 2010-07-04 9:47 ` Gilles Chanteperdrix
0 siblings, 0 replies; 29+ messages in thread
From: Gilles Chanteperdrix @ 2010-07-04 9:47 UTC (permalink / raw)
To: linux-arm-kernel
Russell King - ARM Linux wrote:
> On Sat, Jul 03, 2010 at 10:08:29PM +0200, Gilles Chanteperdrix wrote:
>> Robert Schwebel wrote:
>>> Hi,
>>>
>>> We have recently made some benchmarks, in order to get a little bit
>>> better feeling about where ARM cpus are today, especially when it comes
>>> to the "recent" ones, and in comparison to the Atom. So we collected a
>>> few benchmarks (most from lmbench) and did some actual measurements.
>>>
>>> Here is a little article:
>>> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
>>>
>>> I'm pretty sure that there are quite a few things where people on LAKML
>>> have good ideas where the effects come from or how to improve the
>>> methodology - so I'd be glad to get some feedback from the community!
>>>
>>> All measurements have been done on 2.6.34.
>> The context switch time for PXA270 looks really suspicious. The worst
>> case context switch time of an AT91RM9200, an armv4 running at 180MHz,
>> is less than 300us, so, I doubt that the context switch time of a PXA
>> can be that much worse.
>
> The measurement is of the thread and MM switch time, which'll involve
> cache flushes on <= ARMv5.
>
> On PXA, we have to 'read' (via means of D cache line allocations) 32K
> of data into the cache in order to cause the existing data to be written
> out. These are done from a range of addresses which don't exist in the
> page tables, and so should not cause any bus activity other than the
> write-outs. However, the cache still has to interact with the MMU to
> try to fetch the requested data - which probably consumes some cycles.
>
> ARM920 on the other hand can walk through every cache line and clean+
> invalidate it. It has 64 lines in each segment, and 8 segments, each
> line 32 bytes long, which gives a cache size of 16K.
>
> I'd therefore expect ARM920's cache flushing to be quicker (in terms
> of cycles consumed) than PXA.
Ok. But the cache flush accounts for only around 70us of the 290us
of context switch time of the AT91RM9200. Looks like the cache flush on
PXA would have to be a lot longer.
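As a rough cross-check of these figures, the 70us can be converted into CPU
cycles. This Python sketch uses only numbers from this thread (180MHz, 70us,
290us, and the 16K cache with 32-byte lines mentioned for the ARM920). It
assumes the whole flush time is spent walking the data cache lines, which is
a simplification, and the names are made up for illustration.

```python
def cycles(time_us, clock_mhz):
    """CPU cycles elapsed in time_us microseconds at clock_mhz MHz."""
    return time_us * clock_mhz  # us * MHz = cycles


flush_cycles = cycles(70, 180)          # cache flush time on the AT91RM9200
cache_lines = (16 * 1024) // 32         # 16K cache, 32-byte lines
cycles_per_line = flush_cycles / cache_lines
flush_share = 70 / 290                  # fraction of the context switch time

print(flush_cycles, cache_lines, round(cycles_per_line, 1), round(flush_share * 100))
```

The last value is the flush share of the context switch in percent; the
remaining time has to come from something other than the cache walk.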
--
Gilles.
* Some benchmarks on ARM
2010-07-02 18:02 Some benchmarks on ARM Robert Schwebel
` (3 preceding siblings ...)
2010-07-03 20:08 ` Gilles Chanteperdrix
@ 2010-07-05 8:51 ` Colin Tuckley
2010-07-05 12:29 ` Marc Kleine-Budde
2010-07-05 12:41 ` Marc Kleine-Budde
2010-07-29 16:54 ` Robert Schwebel
2010-08-19 5:36 ` shiraz hashim
6 siblings, 2 replies; 29+ messages in thread
From: Colin Tuckley @ 2010-07-05 8:51 UTC (permalink / raw)
To: linux-arm-kernel
> -----Original Message-----
> kernel-bounces at lists.infradead.org] On Behalf Of Robert Schwebel
> Sent: 02 July 2010 19:03
> We have recently made some benchmarks, in order to get a little bit
> better feeling about where ARM cpus are today, especially when it comes
> to the "recent" ones, and in comparison to the Atom. So we collected a
> few benchmarks (most from lmbench) and did some actual measurements.
You have the family wrong for (at least) the Cortex-A8; it's a v7. This probably accounts for some of its slow relative performance, since you probably used incorrect compiler switches.
> All measurements have been done on 2.6.34.
How were the kernels compiled? Was thumb mode used for the ones that support it?
You didn't mention the cache state anywhere - it can make a big difference.
Regards,
Colin
--
Colin Tuckley - ARM Ltd.
110 Fulbourn Rd
Cambridge, CB1 9NJ
Tel: +44 1223 400536
* Some benchmarks on ARM
2010-07-02 20:34 ` Magnus Lilja
@ 2010-07-05 12:24 ` Marc Kleine-Budde
2010-07-05 14:00 ` Russell King - ARM Linux
0 siblings, 1 reply; 29+ messages in thread
From: Marc Kleine-Budde @ 2010-07-05 12:24 UTC (permalink / raw)
To: linux-arm-kernel
Magnus Lilja wrote:
> Hi Robert,
>
> On 2010-07-02 20:02, Robert Schwebel wrote:
>> Hi,
>>
>> We have recently made some benchmarks, in order to get a little bit
>> better feeling about where ARM cpus are today, especially when it comes
>> to the "recent" ones, and in comparison to the Atom. So we collected a
>> few benchmarks (most from lmbench) and did some actual measurements.
>>
>> Here is a little article:
>> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
>>
>> I'm pretty sure that there are quite a few things where people on LAKML
>> have good ideas where the effects come from or how to improve the
>> methodology - so I'd be glad to get some feedback from the community!
>
>
> It would be nice if you could add the exact command lines you used
> for the different tests so it's easy to run the same tests on other
> boards as well. I suppose you added some flag to gcc to make use of
> the floating point hardware in i.MX35?
We used a compiler that generates hardware floating point instructions by
default. gcc was configured with:
"--with-float=softfp --with-fpu=vfp --with-cpu=arm1136jf-s"
Here is the complete output of gcc -v:
Using built-in specs.
Target: arm-1136jfs-linux-gnueabi
Configured with:
/home/frogger/pengutronix/toolchain/releases/OSELAS.Toolchain-1.99.3/platform-arm-1136jfs-linux-gnueabi-gcc-4.3.2-glibc-2.8-binutils-2.19-kernel-2.6.27-sanitized/build-cross/gcc-4.3.2/configure
--target=arm-1136jfs-linux-gnueabi
--with-sysroot=/home/frogger/pengutronix/toolchain/releases/OSELAS.Toolchain-1.99.3/inst/opt/OSELAS.Toolchain-1.99.3/arm-1136jfs-linux-gnueabi/gcc-4.3.2-glibc-2.8-binutils-2.19-kernel-2.6.27-sanitized/sysroot-arm-1136jfs-linux-gnueabi
--disable-multilib --with-float=softfp --with-fpu=vfp
--with-cpu=arm1136jf-s --enable-__cxa_atexit --disable-sjlj-exceptions
--disable-nls --disable-decimal-float --disable-fixed-point
--disable-win32-registry --enable-symvers=gnu
--with-pkgversion=OSELAS.Toolchain-1.99.3
--with-gmp=/home/frogger/pengutronix/toolchain/releases/OSELAS.Toolchain-1.99.3/platform-arm-1136jfs-linux-gnueabi-gcc-4.3.2-glibc-2.8-binutils-2.19-kernel-2.6.27-sanitized/sysroot-host
--with-mpfr=/home/frogger/pengutronix/toolchain/releases/OSELAS.Toolchain-1.99.3/platform-arm-1136jfs-linux-gnueabi-gcc-4.3.2-glibc-2.8-binutils-2.19-kernel-2.6.27-sanitized/sysroot-host
--prefix=/home/frogger/pengutronix/toolchain/releases/OSELAS.Toolchain-1.99.3/inst/opt/OSELAS.Toolchain-1.99.3/arm-1136jfs-linux-gnueabi/gcc-4.3.2-glibc-2.8-binutils-2.19-kernel-2.6.27-sanitized
--enable-languages=c,c++ --enable-threads=posix --enable-c99
--enable-long-long --enable-libstdcxx-debug --enable-profile
--enable-shared --enable-libssp --enable-checking=release
Thread model: posix
gcc version 4.3.2 (OSELAS.Toolchain-1.99.3)
cheers, Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
* Some benchmarks on ARM
2010-07-05 8:51 ` Colin Tuckley
@ 2010-07-05 12:29 ` Marc Kleine-Budde
2010-07-05 12:41 ` Marc Kleine-Budde
1 sibling, 0 replies; 29+ messages in thread
From: Marc Kleine-Budde @ 2010-07-05 12:29 UTC (permalink / raw)
To: linux-arm-kernel
Colin Tuckley wrote:
>> -----Original Message-----
>> kernel-bounces at lists.infradead.org] On Behalf Of Robert Schwebel
>> Sent: 02 July 2010 19:03
>
>> We have recently made some benchmarks, in order to get a little bit
>> better feeling about where ARM cpus are today, especially when it comes
>> to the "recent" ones, and in comparison to the Atom. So we collected a
>> few benchmarks (most from lmbench) and did some actual measurements.
>
> You have the family wrong for (at least) the Cortex-A8; it's a v7. This probably accounts for some of its slow relative performance, since you probably used incorrect compiler switches.
>
>> All measurements have been done on 2.6.34.
>
> How were the kernels compiled? Was thumb mode used for the ones that support it?
gcc version 4.3.2 was used for all kernels. The kernel .config will be added
to the website.
> You didn't mention the cache state anywhere - it can make a big difference.
cheers, Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
* Some benchmarks on ARM
2010-07-05 8:51 ` Colin Tuckley
2010-07-05 12:29 ` Marc Kleine-Budde
@ 2010-07-05 12:41 ` Marc Kleine-Budde
2010-07-05 12:45 ` Marc Kleine-Budde
1 sibling, 1 reply; 29+ messages in thread
From: Marc Kleine-Budde @ 2010-07-05 12:41 UTC (permalink / raw)
To: linux-arm-kernel
Colin Tuckley wrote:
>> -----Original Message-----
>> kernel-bounces at lists.infradead.org] On Behalf Of Robert Schwebel
>> Sent: 02 July 2010 19:03
>
>> We have recently made some benchmarks, in order to get a little bit
>> better feeling about where ARM cpus are today, especially when it comes
>> to the "recent" ones, and in comparison to the Atom. So we collected a
>> few benchmarks (most from lmbench) and did some actual measurements.
>
> You have the family wrong for (at least) the Cortex-A8; it's a v7. This probably accounts for some of its slow relative performance, since you probably used incorrect compiler switches.
>
>> All measurements have been done on 2.6.34.
>
> How were the kernels compiled? Was thumb mode used for the ones that support it?
>
> You didn't mention the cache state anywhere - it can make a big difference.
After booting, all tests are done via ssh; the system was "idle" otherwise.
We're just repeating the tests with "echo 1 > /proc/sys/vm/drop_caches"
prior to each test.
cheers, Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
* Some benchmarks on ARM
2010-07-05 12:41 ` Marc Kleine-Budde
@ 2010-07-05 12:45 ` Marc Kleine-Budde
0 siblings, 0 replies; 29+ messages in thread
From: Marc Kleine-Budde @ 2010-07-05 12:45 UTC (permalink / raw)
To: linux-arm-kernel
Marc Kleine-Budde wrote:
> Colin Tuckley wrote:
>>> -----Original Message-----
>>> kernel-bounces at lists.infradead.org] On Behalf Of Robert Schwebel
>>> Sent: 02 July 2010 19:03
>>> We have recently made some benchmarks, in order to get a little bit
>>> better feeling about where ARM cpus are today, especially when it comes
>>> to the "recent" ones, and in comparison to the Atom. So we collected a
>>> few benchmarks (most from lmbench) and did some actual measurements.
>> You have the family wrong for (at least) the Cortex-A8; it's a v7. This probably accounts for some of its slow relative performance, since you probably used incorrect compiler switches.
>>
>>> All measurements have been done on 2.6.34.
>> How were the kernels compiled? Was thumb mode used for the ones that support it?
>>
>> You didn't mention the cache state anywhere - it can make a big difference.
>
> After booting all tests are done via ssh, the system was "idle" otherwise.
>
> We're just repeating the tests with "echo 1 > /proc/sys/vm/drop_caches"
> prior to each test.
We'll use "sync; echo 3 > /proc/sys/vm/drop_caches" instead.
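A minimal sketch of how such a "drop caches before every test" step could be
scripted, in Python. Only the sync and the write of 3 to
/proc/sys/vm/drop_caches come from this thread; the helper names are
hypothetical, and writing to the real file needs root on the target.

```python
import os
import subprocess

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_caches(path=DROP_CACHES, mode=3):
    """Flush dirty data to disk, then ask the kernel to drop clean caches."""
    os.sync()  # same role as the "sync" in the shell one-liner
    with open(path, "w") as f:
        f.write(f"{mode}\n")  # 3 = page cache plus dentries and inodes


def run_cold(cmd, path=DROP_CACHES):
    """Run one benchmark command (a list of arguments) with cold caches."""
    drop_caches(path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

A call such as run_cold(["./lat_syscall", "open"]) would then repeat the
drop before each individual measurement. The path parameter only exists so
the helper can be tried against a scratch file without root.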
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
* Some benchmarks on ARM
2010-07-03 5:44 ` Nicolas Pitre
@ 2010-07-05 13:04 ` Maurus Cuelenaere
2010-07-05 13:23 ` Robert Schwebel
0 siblings, 1 reply; 29+ messages in thread
From: Maurus Cuelenaere @ 2010-07-05 13:04 UTC (permalink / raw)
To: linux-arm-kernel
On 03-07-10 07:44, Nicolas Pitre wrote:
> On Fri, 2 Jul 2010, Robert Schwebel wrote:
>
>> Hi,
>>
>> We have recently made some benchmarks, in order to get a little bit
>> better feeling about where ARM cpus are today, especially when it comes
>> to the "recent" ones, and in comparison to the Atom. So we collected a
>> few benchmarks (most from lmbench) and did some actual measurements.
>>
>> Here is a little article:
>> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
>>
>> I'm pretty sure that there are quite a few things where people on LAKML
>> have good ideas where the effects come from or how to improve the
>> methodology - so I'd be glad to get some feedback from the community!
>
> It would be nice if you could add measurements for recent Marvell
> products there, such as the Kirkwood (think SheevaPlug or the like
> running at 1.2 GHz), or Dove. I would expect memory throughput on those
> to be quite good.
Some quick tests of lmbench on a Sheevaplug:
mcuelenaere@kot:/dev/shm/lmbench3/bin/armv5tel-linux-gnu$ ./lat_ops
integer bit: 0.85 nanoseconds
integer add: 0.02 nanoseconds
integer mul: 0.42 nanoseconds
integer div: 147.77 nanoseconds
integer mod: 36.94 nanoseconds
int64 bit: 1.71 nanoseconds
int64 add: 0.04 nanoseconds
int64 mul: 0.92 nanoseconds
int64 div: 425.89 nanoseconds
int64 mod: 273.85 nanoseconds
float add: 36.25 nanoseconds
float mul: 30.32 nanoseconds
float div: 161.29 nanoseconds
double add: 51.21 nanoseconds
double mul: 46.31 nanoseconds
double div: 542.06 nanoseconds
float bogomflops: 325.59 nanoseconds
double bogomflops: 799.14 nanoseconds
mcuelenaere@kot:/dev/shm/lmbench3/bin/armv5tel-linux-gnu$ mbw 128
Long uses 4 bytes. Allocating 2*33554432 elements = 268435456 bytes of memory.
Using 262144 bytes as blocks for memcpy block copy test.
Getting down to business... Doing 10 runs per test.
0 Method: MEMCPY Elapsed: 0.48203 MiB: 128.00000 Copy: 265.546 MiB/s
1 Method: MEMCPY Elapsed: 0.48165 MiB: 128.00000 Copy: 265.751 MiB/s
2 Method: MEMCPY Elapsed: 0.48163 MiB: 128.00000 Copy: 265.764 MiB/s
3 Method: MEMCPY Elapsed: 0.49714 MiB: 128.00000 Copy: 257.473 MiB/s
4 Method: MEMCPY Elapsed: 0.48168 MiB: 128.00000 Copy: 265.737 MiB/s
5 Method: MEMCPY Elapsed: 0.48163 MiB: 128.00000 Copy: 265.764 MiB/s
6 Method: MEMCPY Elapsed: 0.49695 MiB: 128.00000 Copy: 257.570 MiB/s
7 Method: MEMCPY Elapsed: 0.48196 MiB: 128.00000 Copy: 265.579 MiB/s
8 Method: MEMCPY Elapsed: 0.48164 MiB: 128.00000 Copy: 265.761 MiB/s
9 Method: MEMCPY Elapsed: 0.49695 MiB: 128.00000 Copy: 257.570 MiB/s
AVG Method: MEMCPY Elapsed: 0.48633 MiB: 128.00000 Copy: 263.198 MiB/s
0 Method: DUMB Elapsed: 0.29804 MiB: 128.00000 Copy: 429.475 MiB/s
1 Method: DUMB Elapsed: 0.29807 MiB: 128.00000 Copy: 429.429 MiB/s
2 Method: DUMB Elapsed: 0.29815 MiB: 128.00000 Copy: 429.310 MiB/s
3 Method: DUMB Elapsed: 0.29800 MiB: 128.00000 Copy: 429.530 MiB/s
4 Method: DUMB Elapsed: 0.31337 MiB: 128.00000 Copy: 408.458 MiB/s
5 Method: DUMB Elapsed: 0.29805 MiB: 128.00000 Copy: 429.462 MiB/s
6 Method: DUMB Elapsed: 0.29808 MiB: 128.00000 Copy: 429.411 MiB/s
7 Method: DUMB Elapsed: 0.29801 MiB: 128.00000 Copy: 429.510 MiB/s
8 Method: DUMB Elapsed: 0.29809 MiB: 128.00000 Copy: 429.403 MiB/s
9 Method: DUMB Elapsed: 0.31339 MiB: 128.00000 Copy: 408.437 MiB/s
AVG Method: DUMB Elapsed: 0.30113 MiB: 128.00000 Copy: 425.072 MiB/s
0 Method: MCBLOCK Elapsed: 0.21906 MiB: 128.00000 Copy: 584.317 MiB/s
1 Method: MCBLOCK Elapsed: 0.21554 MiB: 128.00000 Copy: 593.852 MiB/s
2 Method: MCBLOCK Elapsed: 0.21577 MiB: 128.00000 Copy: 593.238 MiB/s
3 Method: MCBLOCK Elapsed: 0.21671 MiB: 128.00000 Copy: 590.646 MiB/s
4 Method: MCBLOCK Elapsed: 0.21479 MiB: 128.00000 Copy: 595.942 MiB/s
5 Method: MCBLOCK Elapsed: 0.23519 MiB: 128.00000 Copy: 544.232 MiB/s
6 Method: MCBLOCK Elapsed: 0.21705 MiB: 128.00000 Copy: 589.734 MiB/s
7 Method: MCBLOCK Elapsed: 0.59684 MiB: 128.00000 Copy: 214.464 MiB/s
8 Method: MCBLOCK Elapsed: 0.21699 MiB: 128.00000 Copy: 589.889 MiB/s
9 Method: MCBLOCK Elapsed: 0.21418 MiB: 128.00000 Copy: 597.642 MiB/s
AVG Method: MCBLOCK Elapsed: 0.25621 MiB: 128.00000 Copy: 499.589 MiB/s
Couldn't get lat_ctx to work.
mcuelenaere@kot:/dev/shm/lmbench3/bin/armv5tel-linux-gnu$ ./lat_syscall open
Simple open/close: 7.2754 microseconds
mcuelenaere@kot:/dev/shm/lmbench3/bin/armv5tel-linux-gnu$ ./lat_syscall open /dev/shm/lmbench3.tar
Simple open/close: 6.9399 microseconds
mcuelenaere@kot:/dev/shm/lmbench3/bin/armv5tel-linux-gnu$ ./lat_proc fork
Process fork+exit: 763.5714 microseconds
mcuelenaere@kot:/dev/shm/lmbench3/bin/armv5tel-linux-gnu$ cat /proc/cpuinfo
Processor : Feroceon 88FR131 rev 1 (v5l)
BogoMIPS : 1192.75
Features : swp half thumb fastmult edsp
CPU implementer : 0x56
CPU architecture: 5TE
CPU variant : 0x2
CPU part : 0x131
CPU revision : 1
Hardware : Marvell SheevaPlug Reference Board
Revision : 0000
Serial : 0000000000000000
I'm not sure if I'm doing this right, but it looks like the Sheevaplug beats all ARM chips (except
on FP) on the tests done at [1]. Looks like these tests heavily depend on the clock frequency.
[1]: http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
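When comparing mbw runs like the ones above across boards, a single slow run
can pull the average down noticeably (MCBLOCK run 7 above reports 214.464
MiB/s while the other runs are between roughly 544 and 598). The following
Python sketch parses per-run lines in the format shown above and reports
both mean and median per method. It is an illustration only, not part of mbw
or lmbench, and the function name is made up.

```python
import re
import statistics

# Matches per-run lines such as:
# "0 Method: MEMCPY Elapsed: 0.48203 MiB: 128.00000 Copy: 265.546 MiB/s"
MBW_RUN = re.compile(
    r"^(\d+)\s+Method:\s+(\w+)\s+Elapsed:\s+([\d.]+)\s+"
    r"MiB:\s+([\d.]+)\s+Copy:\s+([\d.]+)\s+MiB/s"
)


def summarize_mbw(output):
    """Return {method: (mean, median)} of the per-run copy rates in MiB/s."""
    rates = {}
    for line in output.splitlines():
        match = MBW_RUN.match(line.strip())
        if match:  # "AVG" lines have no run number and are skipped
            rates.setdefault(match.group(2), []).append(float(match.group(5)))
    return {
        method: (statistics.mean(values), statistics.median(values))
        for method, values in rates.items()
    }
```

For the ten MCBLOCK runs above, the median of the per-run rates comes out
at roughly 590 MiB/s and their arithmetic mean at roughly 549 MiB/s; mbw's
own AVG line (499.589) is lower still and appears to be derived from the
averaged elapsed time. The median is probably the more robust figure to
compare between boards.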
--
Maurus Cuelenaere
* Some benchmarks on ARM
2010-07-05 13:04 ` Maurus Cuelenaere
@ 2010-07-05 13:23 ` Robert Schwebel
2010-07-05 13:31 ` Mike Rapoport
2010-07-06 14:02 ` Pavel Machek
0 siblings, 2 replies; 29+ messages in thread
From: Robert Schwebel @ 2010-07-05 13:23 UTC (permalink / raw)
To: linux-arm-kernel
Maurus,
On Mon, Jul 05, 2010 at 03:04:33PM +0200, Maurus Cuelenaere wrote:
> Some quick tests of lmbench on a Sheevaplug:
Thanks a lot - we'll update the document soon, with the exact command
lines included.
> I'm not sure if I'm doing this right, but it looks like the Sheevaplug
> beats all ARM chips (except on FP) on the tests done at [1]. Looks
> like these tests heavily depend on the clock frequency.
If anyone has an idea for better benchmarks, I'd be very interested. We
have been searching for benchmarks which:
- can be easily cross compiled to all involved platforms
- show the different aspects of the systems
rsc
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
* Some benchmarks on ARM
2010-07-05 13:23 ` Robert Schwebel
@ 2010-07-05 13:31 ` Mike Rapoport
2010-07-05 13:42 ` Robert Schwebel
2010-07-05 13:53 ` Marek Vasut
2010-07-06 14:02 ` Pavel Machek
1 sibling, 2 replies; 29+ messages in thread
From: Mike Rapoport @ 2010-07-05 13:31 UTC (permalink / raw)
To: linux-arm-kernel
Robert Schwebel wrote:
> Maurus,
>
> On Mon, Jul 05, 2010 at 03:04:33PM +0200, Maurus Cuelenaere wrote:
>> Some quick tests of lmbench on a Sheevaplug:
>
> Thanks a lot - we'll update the document soon, with the exact command
> lines included.
>
>> I'm not sure if I'm doing this right, but it looks like the Sheevaplug
>> beats all ARM chips (except on FP) on the tests done at [1]. Looks
>> like these tests heavily depend on the clock frequency.
>
> If anyone has an idea for better benchmarks, I'd be very interested. We
> have been searching for benchmarks which:
>
> - can be easily cross compiled to all involved platforms
> - show the different aspects of the systems
Native kernel build? ;-)
> rsc
--
Sincerely yours,
Mike.
* Some benchmarks on ARM
2010-07-05 13:31 ` Mike Rapoport
@ 2010-07-05 13:42 ` Robert Schwebel
2010-07-05 14:15 ` Nicolas Pitre
2010-07-05 13:53 ` Marek Vasut
1 sibling, 1 reply; 29+ messages in thread
From: Robert Schwebel @ 2010-07-05 13:42 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jul 05, 2010 at 04:31:27PM +0300, Mike Rapoport wrote:
> Native kernel build? ;-)
Hmm, I don't have a native compiler for all platforms yet.
rsc
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
* Some benchmarks on ARM
2010-07-05 13:31 ` Mike Rapoport
2010-07-05 13:42 ` Robert Schwebel
@ 2010-07-05 13:53 ` Marek Vasut
1 sibling, 0 replies; 29+ messages in thread
From: Marek Vasut @ 2010-07-05 13:53 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 5 July 2010 15:31:27, Mike Rapoport wrote:
> Robert Schwebel wrote:
> > Maurus,
> >
> > On Mon, Jul 05, 2010 at 03:04:33PM +0200, Maurus Cuelenaere wrote:
> >> Some quick tests of lmbench on a Sheevaplug:
> > Thanks a lot - we'll update the document soon, with the exact command
> > lines included.
> >
> >> I'm not sure if I'm doing this right, but it looks like the Sheevaplug
> >> beats all ARM chips (except on FP) on the tests done at [1]. Looks
> >> like these tests heavily depend on the clock frequency.
> >
> > If anyone has an idea for better benchmarks, I'd be very interested. We
> > have been searching for benchmarks which:
> >
> > - can be easily cross compiled to all involved platforms
> > - show the different aspects of the systems
>
> Native kernel build? ;-)
about 7 hours on xscale-pxa270 with 256MB of SDRAM ... compiling all xscale
platforms in :)
>
> > rsc
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-05 12:24 ` Marc Kleine-Budde
@ 2010-07-05 14:00 ` Russell King - ARM Linux
2010-07-05 15:14 ` Måns Rullgård
0 siblings, 1 reply; 29+ messages in thread
From: Russell King - ARM Linux @ 2010-07-05 14:00 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jul 05, 2010 at 02:24:03PM +0200, Marc Kleine-Budde wrote:
> We used a compiler generating hardware floating point instruction by
> default:
>
> gcc was configured with:
> "--with-float=softfp --with-fpu=vfp --with-cpu=arm1136jf-s"
It is unclear whether this is generating hardware floating point instructions.
The --with-float=softfp suggests it's using soft-fp.
Please check whether the resulting binaries are using hard-fp or soft-fp
by looking for VFP instructions.
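One way to do that check, as a rough sketch rather than a definitive tool: disassemble the binary with `objdump -d` and count VFP mnemonics against calls to the EABI soft-float helpers. The function name `classify_fp_usage`, the mnemonic patterns, and the sample disassembly below are assumptions made for illustration; the patterns are a heuristic, not a complete list of the VFP instruction set.

```python
import re

# Heuristic patterns (assumptions for illustration, not a complete list).
# UAL-style VFP mnemonics such as vadd.f64 or vmov; NEON code would also match.
VFP_UAL = re.compile(
    r"^v(add|sub|mul|div|mla|mls|sqrt|abs|neg|cmpe?|cvt|ldr|str|mov|mrs|msr)(\.|$)"
)
# Pre-UAL VFP mnemonics such as faddd, fmuls, fldd, fmsr.
VFP_LEGACY = re.compile(
    r"^f((add|sub|mul|div|mac|sqrt|cmpe?|ld|st|cpy|sito|uito|tosiz?|touiz?|cvt)[sd]{1,2}"
    r"|msr|mrs|mdrr|mrrd|mstat)$"
)
# EABI soft-float helper routines from the compiler runtime, e.g. __aeabi_dmul.
SOFT_HELPER = re.compile(r"__aeabi_[fd](add|sub|mul|div)")


def classify_fp_usage(disassembly):
    """Count VFP instructions and soft-float helper calls in objdump -d text."""
    counts = {"vfp": 0, "soft_calls": 0}
    for line in disassembly.splitlines():
        # objdump separates address, encoding, mnemonic and operands with tabs.
        parts = line.split("\t")
        if len(parts) < 3:
            continue
        mnemonic = parts[2].strip()
        operands = parts[3] if len(parts) > 3 else ""
        if VFP_UAL.match(mnemonic) or VFP_LEGACY.match(mnemonic):
            counts["vfp"] += 1
        elif SOFT_HELPER.search(operands):
            counts["soft_calls"] += 1
    return counts


# Illustrative sample only; not taken from any binary discussed in this thread.
SAMPLE = "\n".join([
    "    8400:\tee300b01 \tvadd.f64\td0, d0, d1",
    "    8404:\tec410b10 \tvmov\td0, r0, r1",
    "    8408:\teb000012 \tbl\t8458 <__aeabi_dmul>",
    "    840c:\te12fff1e \tbx\tlr",
])

print(classify_fp_usage(SAMPLE))
```

A binary that shows only helper calls and no VFP mnemonics is most likely pure soft-float; a binary with VFP mnemonics is executing floating point in hardware, whatever calling convention it uses.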
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-05 13:42 ` Robert Schwebel
@ 2010-07-05 14:15 ` Nicolas Pitre
2010-07-06 5:36 ` Mike Rapoport
0 siblings, 1 reply; 29+ messages in thread
From: Nicolas Pitre @ 2010-07-05 14:15 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 5 Jul 2010, Robert Schwebel wrote:
> On Mon, Jul 05, 2010 at 04:31:27PM +0300, Mike Rapoport wrote:
> > Native kernel build? ;-)
>
> Hmm, I don't have a native compiler for all platforms yet.
Kernel compile is a really bad test in this context anyway, unless you
compile the exact same kernel target using the same config with the SAME
gcc version on all test machines, and using the same filesystem type and
medium.
It is best to keep kernel compilation test for validation of
improvements to a single system.
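To make the comparability condition concrete, here is a minimal sketch that lists which of the factors named above differ between machines before any build times are compared. The key names, the helper `mismatched_keys`, and the two machine descriptions are hypothetical and exist only for illustration.

```python
# The factors named above; the key names are assumptions for this sketch.
COMPARISON_KEYS = ("target", "config", "gcc", "filesystem", "medium")


def mismatched_keys(environments):
    """Return the keys whose values differ between the given machines."""
    return [
        key for key in COMPARISON_KEYS
        if len({env[key] for env in environments}) > 1
    ]


# Hypothetical machine descriptions, invented for illustration only.
board_a = {"target": "zImage", "config": "defconfig-x", "gcc": "4.4.1",
           "filesystem": "ext3", "medium": "usb-disk"}
board_b = dict(board_a, gcc="4.3.2", medium="sd-card")

print(mismatched_keys([board_a, board_b]))  # ['gcc', 'medium']
```

An empty list is the precondition for treating the build times as a cross-machine comparison at all.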
Nicolas
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-05 14:00 ` Russell King - ARM Linux
@ 2010-07-05 15:14 ` Måns Rullgård
0 siblings, 0 replies; 29+ messages in thread
From: Måns Rullgård @ 2010-07-05 15:14 UTC (permalink / raw)
To: linux-arm-kernel
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> On Mon, Jul 05, 2010 at 02:24:03PM +0200, Marc Kleine-Budde wrote:
>> We used a compiler generating hardware floating point instruction by
>> default:
>>
>> gcc was configured with:
>> "--with-float=softfp --with-fpu=vfp --with-cpu=arm1136jf-s"
>
> It is unclear whether this is generating hardware floating point instructions.
> The --with-float=softfp suggests it's using soft-fp.
That configuration will generate hardware vfp instructions while using
softfloat calling conventions (arguments passed in integer registers).
A pure softfloat configuration would be --with-float=soft.
--
Måns Rullgård
mans at mansr.com
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-05 14:15 ` Nicolas Pitre
@ 2010-07-06 5:36 ` Mike Rapoport
0 siblings, 0 replies; 29+ messages in thread
From: Mike Rapoport @ 2010-07-06 5:36 UTC (permalink / raw)
To: linux-arm-kernel
Nicolas Pitre wrote:
> On Mon, 5 Jul 2010, Robert Schwebel wrote:
>
>> On Mon, Jul 05, 2010 at 04:31:27PM +0300, Mike Rapoport wrote:
>>> Native kernel build? ;-)
>> Hmm, I don't have a native compiler for all platforms yet.
>
> Kernel compile is a really bad test in this context anyway, unless you
> compile the exact same kernel target using the same config with the SAME
> gcc version on all test machines, and using the same filesystem type and
> medium.
It's quite possible with, say, Debian on USB disk...
> It is best to keep kernel compilation test for validation of
> improvements to a single system.
>
>
> Nicolas
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-05 13:23 ` Robert Schwebel
2010-07-05 13:31 ` Mike Rapoport
@ 2010-07-06 14:02 ` Pavel Machek
1 sibling, 0 replies; 29+ messages in thread
From: Pavel Machek @ 2010-07-06 14:02 UTC (permalink / raw)
To: linux-arm-kernel
Hi!
> > I'm not sure if I'm doing this right, but it looks like the Sheevaplug
> > beats all ARM chips (except on FP) on the tests done at [1]. Looks
> > like these tests heavily depend on the clock frequency.
>
> If anyone has an idea for better benchmarks, I'd be very interested. We
> have been searching for benchmarks which:
>
> - can be easily cross compiled to all involved platforms
> - show the different aspects of the systems
You tested stuff like syscalls/fork latencies... but usually most of
the time is spent in kernel. What about pure userlevel computing
stuff, like mp3 encoding (lame), maybe video transcoding, maybe grep
could be benchmarked, and maybe factor is worth benchmarking
(http://pavelmachek.livejournal.com/77425.html)...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-02 18:02 Some benchmarks on ARM Robert Schwebel
` (4 preceding siblings ...)
2010-07-05 8:51 ` Colin Tuckley
@ 2010-07-29 16:54 ` Robert Schwebel
2010-07-30 10:19 ` Richard Cochran
2010-08-19 5:36 ` shiraz hashim
6 siblings, 1 reply; 29+ messages in thread
From: Robert Schwebel @ 2010-07-29 16:54 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Jul 02, 2010 at 08:02:57PM +0200, Robert Schwebel wrote:
> We have recently made some benchmarks, in order to get a little bit
> better feeling about where ARM cpus are today, especially when it
> comes to the "recent" ones, and in comparison to the Atom.
Thanks to everyone who posted feedback!
An updated version of the article is now here:
http://www.pengutronix.de/development/kernel/arm-benchmarks-20100729_en.html
Changes:
- all kernels ported to 2.6.34
- Atom without -rt
- all kernel configs available
- all benchmark commandlines added
- more info about compilers and generated code
- info about memory types and bus widths
rsc
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-29 16:54 ` Robert Schwebel
@ 2010-07-30 10:19 ` Richard Cochran
2010-07-30 11:40 ` Gilles Chanteperdrix
0 siblings, 1 reply; 29+ messages in thread
From: Richard Cochran @ 2010-07-30 10:19 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Jul 29, 2010 at 06:54:13PM +0200, Robert Schwebel wrote:
> Thanks to everyone who posted feedback!
>
> An updated version of the article is now here:
> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100729_en.html
Nice report.
It would be interesting for me if you could give the FCSE patch a try
on the v5 machines. Any chance of that happening?
Thanks,
Richard
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-30 10:19 ` Richard Cochran
@ 2010-07-30 11:40 ` Gilles Chanteperdrix
0 siblings, 0 replies; 29+ messages in thread
From: Gilles Chanteperdrix @ 2010-07-30 11:40 UTC (permalink / raw)
To: linux-arm-kernel
Richard Cochran wrote:
> On Thu, Jul 29, 2010 at 06:54:13PM +0200, Robert Schwebel wrote:
>> Thanks to everyone who posted feedback!
>>
>> An updated version of the article is now here:
>> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100729_en.html
>
> Nice report.
>
> It would be interesting for me if you could give the FCSE patch a try
> on the v5 machines. Any chance of that happening?
As I already said, I am very suspicious about the results of this
benchmark on PXA. We get user-space scheduling latencies under 300us on
PXA with Xenomai, and as you know, the worst case user-space scheduling
latency includes a context switch, so, this means that the context
switch is less than 300us. However, these benchmarks show some context
switches around 600us, so I suspect the measurement measures more than
just a context switch, maybe the execution of a long standing interrupt
or more probably a soft irq.
The gain induced by the FCSE patch is between 50 and 100us on the
machines where we measured it, so, it will not make a big difference on
a context switch time of 600us.
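As a quick worked check of that arithmetic (a sketch only; the helper name `relative_gain` is made up for illustration, and the figures are the ones quoted above):

```python
def relative_gain(saving_us, baseline_us):
    """Fraction of a baseline time removed by a fixed saving."""
    return saving_us / baseline_us


for baseline in (600, 300):
    for saving in (50, 100):
        share = relative_gain(saving, baseline)
        print(f"{saving} us saved on a {baseline} us switch: {share:.1%}")
```

Against the 600us figure the saving is roughly 8% to 17%; against a 300us switch the same saving would be about 17% to 33%, which is why the baseline measurement matters so much here.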
Anyway, I have not worked on the FCSE patch for 2.6.34; I was waiting
for 2.6.35 to be released to work on both at once, but if anyone is
interested, I can get it working before that.
--
Gilles.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-07-02 18:02 Some benchmarks on ARM Robert Schwebel
` (5 preceding siblings ...)
2010-07-29 16:54 ` Robert Schwebel
@ 2010-08-19 5:36 ` shiraz hashim
2010-08-19 6:28 ` Robert Schwebel
6 siblings, 1 reply; 29+ messages in thread
From: shiraz hashim @ 2010-08-19 5:36 UTC (permalink / raw)
To: linux-arm-kernel
Hi Robert,
On Fri, Jul 2, 2010 at 11:32 PM, Robert Schwebel
<r.schwebel@pengutronix.de> wrote:
> Hi,
>
> We have recently made some benchmarks, in order to get a little bit
> better feeling about where ARM cpus are today, especially when it comes
> to the "recent" ones, and in comparison to the Atom. So we collected a
> few benchmarks (most from lmbench) and did some actual measurements.
>
> Here is a little article:
> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
Thanks for this link, it was quite informative. Could you also please
share the detailed report of the lmbench tests which you ran on these
platforms? For example, the average CPU bandwidth figures (as in the bw_mem
test) do not show the significant impact of caches and DDR.
--
regards
Shiraz Hashim
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-08-19 5:36 ` shiraz hashim
@ 2010-08-19 6:28 ` Robert Schwebel
2010-08-19 7:10 ` shiraz hashim
0 siblings, 1 reply; 29+ messages in thread
From: Robert Schwebel @ 2010-08-19 6:28 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Thu, Aug 19, 2010 at 11:06:04AM +0530, shiraz hashim wrote:
> > Here is a little article:
> > http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
>
> Thanks for this link, it was quite informative. Could you also please
> share the detailed report of the lmbench tests which you ran on these
> platforms? For example, the average CPU bandwidth figures (as in the bw_mem
> test) do not show the significant impact of caches and DDR.
Please check the updated version of the article - the page you've linked
above contains a link at the top. The new version contains all the
commands which have been used.
rsc
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
2010-08-19 6:28 ` Robert Schwebel
@ 2010-08-19 7:10 ` shiraz hashim
0 siblings, 0 replies; 29+ messages in thread
From: shiraz hashim @ 2010-08-19 7:10 UTC (permalink / raw)
To: linux-arm-kernel
Thanks Robert,
On Thu, Aug 19, 2010 at 11:58 AM, Robert Schwebel
<r.schwebel@pengutronix.de> wrote:
> Hi,
>
> On Thu, Aug 19, 2010 at 11:06:04AM +0530, shiraz hashim wrote:
>> > Here is a little article:
>> > http://www.pengutronix.de/development/kernel/arm-benchmarks-20100702_en.html
>>
>> Thanks for this link, it was quite informative. Could you also please
>> share the detailed report of the lmbench tests which you ran on these
>> platforms? For example, the average CPU bandwidth figures (as in the bw_mem
>> test) do not show the significant impact of caches and DDR.
>
> Please check the updated version of the article - the page you've linked
> above contains a link at the top. The new version contains all the
> commands which have been used.
Now I see it; it is much clearer.
--
regards
Shiraz Hashim
^ permalink raw reply [flat|nested] 29+ messages in thread
* Some benchmarks on ARM
@ 2010-07-30 14:47 Tomasz Stanislawski
0 siblings, 0 replies; 29+ messages in thread
From: Tomasz Stanislawski @ 2010-07-30 14:47 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Jul 29, 2010 at 06:54:13PM +0200, Robert Schwebel wrote:
> Thanks to everyone who posted feedback!
>
> An updated version of the article is now here:
> http://www.pengutronix.de/development/kernel/arm-benchmarks-20100729_en.html
Thank you for the interesting analysis. I have run your tests on the S5PC110
(HummingBird). I cannot tell what the memory type is. The results fall in line
with the BeagleBoard results once clock scaling is taken into account, except
for test 3 (context switch), which is three times as fast for a 60% clock
advantage. There are also minor differences in the memory-related tests
(bandwidth and fork). All of these appear to be driven by the memory subsystem.
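To show what taking clock scaling into account means for test 3, here is a small sketch that divides the observed speedup by the clock ratio. The helper name `residual_speedup` is invented for illustration; the inputs are only the two figures stated above (three times as fast, 60% clock advantage).

```python
def residual_speedup(observed_speedup, clock_ratio):
    """Speedup left over after dividing out the clock frequency ratio."""
    return observed_speedup / clock_ratio


# 60% clock advantage -> ratio 1.6; test 3 is reported as three times as fast.
leftover = residual_speedup(3.0, 1.6)
print(f"speedup not explained by clock: {leftover:.2f}x")
```

A residual close to 1.0 would mean the clock explains the result; a residual approaching 2x points at something other than the core clock, such as the memory subsystem.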
LMBench was compiled with the command:
make CC="arm-linux-gnueabi-gcc -O2 -march=armv7-a -mtune=cortex-a8
-mfloat-abi=softfp -mfpu=neon"
The compiler was vanilla GCC 4.4.1.
LMBench version: 3.0-a9
Environment:
Debian:~/lmbench/bin/i686-pc-linux-gnu# uname -a
Linux Debian 2.6.34-rc6 #2 PREEMPT Fri Jul 30 15:31:57 CEST 2010 armv7l GNU/Linux
Debian:~/lmbench/bin/i686-pc-linux-gnu# cat /proc/cpuinfo
Processor : ARMv7 Processor rev 2 (v7l)
BogoMIPS : 797.90
Features : swp half thumb fastmult vfp edsp neon vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc08
CPU revision : 2
----------------------------------------------
Test 1
Debian:~/lmbench/bin/# lat_ops -W 100 -N 100
Result: 12.52
Comments: The latency of a single instruction is closely related to the depth
of the CPU's pipeline and as such is not a reliable indicator of usable
performance. The mean IPC (instructions per cycle) is more closely related to
the processor's speed.
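A toy model of that distinction, under stated assumptions: a fully pipelined unit that can start one operation per cycle, with an invented latency of 4 cycles. The function names and numbers are hypothetical and do not describe any particular core.

```python
import math


def cycles_dependent(n_ops, latency):
    """Each operation waits for the previous result: n_ops * latency cycles."""
    return n_ops * latency


def cycles_pipelined(n_ops, latency, issue_per_cycle=1):
    """Independent operations overlap: fill the pipeline once, then stream."""
    return latency + math.ceil(n_ops / issue_per_cycle) - 1


n, lat = 1000, 4  # hypothetical operation count and latency in cycles
print(cycles_dependent(n, lat))  # 4000
print(cycles_pipelined(n, lat))  # 1003
```

In this model a chain of dependent operations exposes the latency in full, while independent operations run at close to one per cycle regardless of pipeline depth.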
----------------------------------------------
Test 2
Result: 340.92
----------------------------------------------
Test 3
Result: 11.24
----------------------------------------------
Test 4
Result: 6.8816
----------------------------------------------
Test 5
Result: 780.1429
Best regards,
Tomasz Stanislawski
Samsung Poland R&D Center
^ permalink raw reply [flat|nested] 29+ messages in thread
Thread overview: 29+ messages
2010-07-02 18:02 Some benchmarks on ARM Robert Schwebel
2010-07-02 20:34 ` Magnus Lilja
2010-07-05 12:24 ` Marc Kleine-Budde
2010-07-05 14:00 ` Russell King - ARM Linux
2010-07-05 15:14 ` Måns Rullgård
2010-07-03 5:44 ` Nicolas Pitre
2010-07-05 13:04 ` Maurus Cuelenaere
2010-07-05 13:23 ` Robert Schwebel
2010-07-05 13:31 ` Mike Rapoport
2010-07-05 13:42 ` Robert Schwebel
2010-07-05 14:15 ` Nicolas Pitre
2010-07-06 5:36 ` Mike Rapoport
2010-07-05 13:53 ` Marek Vasut
2010-07-06 14:02 ` Pavel Machek
2010-07-03 19:48 ` Baruch Siach
2010-07-03 20:08 ` Gilles Chanteperdrix
2010-07-03 20:28 ` Russell King - ARM Linux
2010-07-04 9:47 ` Gilles Chanteperdrix
2010-07-05 8:51 ` Colin Tuckley
2010-07-05 12:29 ` Marc Kleine-Budde
2010-07-05 12:41 ` Marc Kleine-Budde
2010-07-05 12:45 ` Marc Kleine-Budde
2010-07-29 16:54 ` Robert Schwebel
2010-07-30 10:19 ` Richard Cochran
2010-07-30 11:40 ` Gilles Chanteperdrix
2010-08-19 5:36 ` shiraz hashim
2010-08-19 6:28 ` Robert Schwebel
2010-08-19 7:10 ` shiraz hashim
2010-07-30 14:47 Tomasz Stanislawski