* 2.6.33.7-rt29 PREEMPT_RT worse latency than PREEMPT_DESKTOP on AT91?
@ 2010-08-23 16:39 Agustin Ferrin Pozuelo
  2010-08-27 10:35 ` Thomas Gleixner
  0 siblings, 1 reply; 5+ messages in thread
From: Agustin Ferrin Pozuelo @ 2010-08-23 16:39 UTC (permalink / raw)
  To: linux-rt-users

Hello,

I am fine tuning the configuration for an ARM system derived from 
AT91SAM9263-EK.

My goal is to minimize latency, and I am using "cyclictest" from 
rt-tools v0.78 for measuring it.

I get consistently better latency with PREEMPT_DESKTOP over what I get 
with PREEMPT_RT. This is an example for a very simple test run, which 
reflects the overall results I am getting:

    ### PREEMPT-RT, HRT, no NO_HZ, no RTC, no USE_SLOW_CLOCK

    root@at91sam9263cpc:~# time cyclictest -i 700000 -r -p 80 -l 33
    Clock resolution: 0.000000001 (s.ns)
    policy: fifo: loadavg: 0.11 0.10 0.04 1/49 884

    T: 0 (  861) P:80 I:700000 C:     33 Min:    370 Act:  627 Avg:  437 Max:     627
    real	0m 23.51s
    user	0m 0.38s
    sys	0m 2.20s

    ### PREEMPT-DESKTOP (no RT), HRT, no NO_HZ, no USE_SLOW_CLOCK
    root@at91sam9263cpc:~# time cyclictest -i 700000 -r -p 80 -l 33
    Clock resolution: 0.000000001 (s.ns)
    policy: fifo: loadavg: 0.24 0.08 0.03 1/46 600

    T: 0 (  574) P:80 I:700000 C:     33 Min:    173 Act:  196 Avg:  222 Max:     378
    real	0m 23.66s
    user	0m 0.32s
    sys	0m 1.29s
       

That is, I get much better min, avg, and max latency with 
"PREEMPT_DESKTOP". How can this be?
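
The gap between the two result lines above can be computed mechanically. This is only a minimal sketch: it assumes the "Key: value" layout seen in this cyclictest output, and the helper names (parse_cyclictest_line, latency_delta) are made up for illustration.

```python
import re

# Result lines copied verbatim from the two runs above.
RT_LINE = "T: 0 (  861) P:80 I:700000 C:     33 Min:    370 Act:  627 Avg:  437 Max:     627"
DESKTOP_LINE = "T: 0 (  574) P:80 I:700000 C:     33 Min:    173 Act:  196 Avg:  222 Max:     378"

# Assumption: the cycle count and latency fields always appear as "Key: integer".
FIELD_RE = re.compile(r"(C|Min|Act|Avg|Max):\s*(\d+)")


def parse_cyclictest_line(line):
    """Return the cycle count and latency fields (in us) of one result line."""
    return {key: int(value) for key, value in FIELD_RE.findall(line)}


def latency_delta(a, b):
    """Per-field difference a - b for the min, avg and max latencies."""
    return {key: a[key] - b[key] for key in ("Min", "Avg", "Max")}


rt = parse_cyclictest_line(RT_LINE)
desktop = parse_cyclictest_line(DESKTOP_LINE)
print("cycles:", rt["C"], desktop["C"])
print("RT minus DESKTOP (us):", latency_delta(rt, desktop))
```

With only 33 cycles per run, these differences describe this particular short test rather than the worst case of either kernel.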

Anyway, the results seem worse than the <150us worst case obtained with
2.6.24.7-rt27 by Remy Bohmer
<https://rt.wiki.kernel.org/index.php/CONFIG_PREEMPT_RT_Patch#Platforms_Tested_and_in_Use_with_CONFIG_PREEMPT_RT>,
though he was apparently not using 'Tickless and HRT' because of CPU
usage issues... But without those I can't get cyclictest to run properly!

My guess is that the RT patch adds code (checks, locks) to many critical
sections, and that this code happens to be especially slow on ARM9...

Clues & comments welcome!
--Agustín.

-- 
[CG logo]

Agustín Ferrín Pozuelo

Embedded Systems Engineer

CG Power Systems Ireland Limited

Automation Systems Division

Herbert House, Harmony Row, Dublin 2, Ireland.

Phone: +353 1 4153700 Web: www.cgglobal.com <http://www.cgglobal.com>

Save the environment. Please print only if essential.



CG DISCLAIMER: This email contains confidential information. It is intended exclusively for the addressees. If you are not an addressee, you must not store, transmit or disclose its contents. Instead please notify the sender immediately; and permanently delete this e-mail from your computer systems. We have taken reasonable precautions to ensure that no viruses are present. However, you must check this email and the attachments, for viruses. We accept no liability whatsoever, for any detriment caused by any transmitted virus.

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: 2.6.33.7-rt29 PREEMPT_RT worse latency than PREEMPT_DESKTOP on AT91?
  2010-08-23 16:39 2.6.33.7-rt29 PREEMPT_RT worse latency than PREEMPT_DESKTOP on AT91? Agustin Ferrin Pozuelo
@ 2010-08-27 10:35 ` Thomas Gleixner
  2010-09-15 15:42   ` Agustin Ferrin Pozuelo
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2010-08-27 10:35 UTC (permalink / raw)
  To: Agustin Ferrin Pozuelo; +Cc: linux-rt-users

On Mon, 23 Aug 2010, Agustin Ferrin Pozuelo wrote:

> Hello,
> 
> I am fine tuning the configuration for an ARM system derived from
> AT91SAM9263-EK.
> 
> My goal is to minimize latency, and I am using "cyclictest" from rt-tools
> v0.78 for measuring it.
> 
> I get consistently better latency with PREEMPT_DESKTOP over what I get with
> PREEMPT_RT. This is an example for a very simple test run, which reflects the
> overall results I am getting:
> 
>    ### PREEMPT-RT, HRT, no NO_HZ, no RTC, no USE_SLOW_CLOCK
> 
>    root@at91sam9263cpc:~# time cyclictest -i 700000 -r -p 80 -l 33
>    Clock resolution: 0.000000001 (s.ns)
>    policy: fifo: loadavg: 0.11 0.10 0.04 1/49 884
> 
>    T: 0 (  861) P:80 I:700000 C:     33 Min:    370 Act:  627 Avg:  437 Max:     627
>    real	0m 23.51s
>    user	0m 0.38s
>    sys	0m 2.20s
> 
>    ### PREEMPT-DESKTOP (no RT), HRT, no NO_HZ, no USE_SLOW_CLOCK
>    root@at91sam9263cpc:~# time cyclictest -i 700000 -r -p 80 -l 33
>    Clock resolution: 0.000000001 (s.ns)
>    policy: fifo: loadavg: 0.24 0.08 0.03 1/46 600
> 
>    T: 0 (  574) P:80 I:700000 C:     33 Min:    173 Act:  196 Avg:  222 Max:     378
>    real	0m 23.66s
>    user	0m 0.32s
>    sys	0m 1.29s

33 loops are not really giving you any useful information. Also run
both tests with some background load and not on a fully idle system.

Thanks,

	tglx



* Re: 2.6.33.7-rt29 PREEMPT_RT worse latency than PREEMPT_DESKTOP on AT91?
  2010-08-27 10:35 ` Thomas Gleixner
@ 2010-09-15 15:42   ` Agustin Ferrin Pozuelo
  2010-09-15 16:08     ` Thomas Gleixner
  0 siblings, 1 reply; 5+ messages in thread
From: Agustin Ferrin Pozuelo @ 2010-09-15 15:42 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-rt-users

  On 27/08/10 11:35, Thomas Gleixner wrote:
> On Mon, 23 Aug 2010, Agustin Ferrin Pozuelo wrote:
>> Hello,
>>
>> I am fine tuning the configuration for an ARM system derived from
>> AT91SAM9263-EK.
>>
>> My goal is to minimize latency, and I am using "cyclictest" from rt-tools
>> v0.78 for measuring it.
>>
>> I get consistently better latency with PREEMPT_DESKTOP over what I get with
>> PREEMPT_RT. This is an example for a very simple test run, which reflects the
>> overall results I am getting:
>>
>>     ### PREEMPT-RT, HRT, no NO_HZ, no RTC, no USE_SLOW_CLOCK
>>
>>     root@at91sam9263cpc:~# time cyclictest -i 700000 -r -p 80 -l 33
>>     Clock resolution: 0.000000001 (s.ns)
>>     policy: fifo: loadavg: 0.11 0.10 0.04 1/49 884
>>
>>     T: 0 (  861) P:80 I:700000 C:     33 Min:    370 Act:  627 Avg:  437 Max:     627
>>     real	0m 23.51s
>>     user	0m 0.38s
>>     sys	0m 2.20s
>>
>>     ### PREEMPT-DESKTOP (no RT), HRT, no NO_HZ, no USE_SLOW_CLOCK
>>     root@at91sam9263cpc:~# time cyclictest -i 700000 -r -p 80 -l 33
>>     Clock resolution: 0.000000001 (s.ns)
>>     policy: fifo: loadavg: 0.24 0.08 0.03 1/46 600
>>
>>     T: 0 (  574) P:80 I:700000 C:     33 Min:    173 Act:  196 Avg:  222 Max:     378
>>     real	0m 23.66s
>>     user	0m 0.32s
>>     sys	0m 1.29s
> 33 loops are not really giving you any useful information. Also run
> both tests with some background load and not on a fully idle system.
Sorry, my test setup is not very comprehensive at the moment. I have 
some overnight/overweekend results at hand with millions of loops:

     PREEMPT-DESKTOP:
     root@at91sam9263cpc:~# time cyclictest -i 70000 -p 80 -h 700 -r
     Clock resolution: 0.000000001 (s.ns)
     policy: fifo: loadavg: 0.07 0.09 0.02 1/38 8770
     T: 0 (  980) P:80 I:70000 C:1328241 Min:    135 Act:  218 Avg:  204 Max:     720

     PREEMPT-RT:
     root@at91sam9263cpc:~# time cyclictest -i 70000 -p 80 -h 700 -r
     Clock resolution: 0.000000001 (s.ns)
     policy: fifo: loadavg: 0.97 1.12 1.18 4/51 31290
     T: 0 (  675) P:80 I:70000 C:3688312 Min:    244 Act:  358 Avg:  395 Max:     732

Min and avg latencies are much worse with PREEMPT-RT on an otherwise
identical setup. The worst case is similar, though.
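
For scale, the loop counts can be converted into wall-clock coverage. A small sketch, using only the interval (-i, in microseconds) and cycle count (C:) values quoted in this thread; the function name is made up for illustration.

```python
def run_duration_hours(interval_us, loops):
    """Wall-clock time covered by `loops` cycles of `interval_us` microseconds."""
    return interval_us * loops / 1e6 / 3600.0


# (interval in us, cycle count) taken from the cyclictest runs in this thread.
runs = {
    "33-loop quick test (-i 700000)": (700000, 33),
    "PREEMPT-DESKTOP long run (-i 70000)": (70000, 1328241),
    "PREEMPT-RT long run (-i 70000)": (70000, 3688312),
}

for name, (interval_us, loops) in runs.items():
    print(f"{name}: {run_duration_hours(interval_us, loops):.2f} h")
```

The quick test covers about 23 seconds, while the long runs cover roughly 26 and 72 hours. Since the two long runs differ in length, their Max values are not strictly like-for-like.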

The system is not loaded, but many interrupts occur because I am running
the test interactively over ssh+ethernet. (I had to force ethernet to
10Mbit to avoid frequent buffer underruns due to a sam9263 bug.)

I am assuming context switching is more expensive on PREEMPT-RT under 
ARM9, where it seems already a bit expensive.

--Agustín.

* Re: 2.6.33.7-rt29 PREEMPT_RT worse latency than PREEMPT_DESKTOP on AT91?
  2010-09-15 15:42   ` Agustin Ferrin Pozuelo
@ 2010-09-15 16:08     ` Thomas Gleixner
  2010-09-16 14:31       ` Agustin Ferrin Pozuelo
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2010-09-15 16:08 UTC (permalink / raw)
  To: Agustin Ferrin Pozuelo; +Cc: linux-rt-users

On Wed, 15 Sep 2010, Agustin Ferrin Pozuelo wrote:
>  On 27/08/10 11:35, Thomas Gleixner wrote:
> > On Mon, 23 Aug 2010, Agustin Ferrin Pozuelo wrote:
> Sorry, my test setup is not very comprehensive at the moment. I have some
> overnight/overweekend results at hand with millions of loops:
> 
>     PREEMPT-DESKTOP:
>     root@at91sam9263cpc:~# time cyclictest -i 70000 -p 80 -h 700 -r

Oh, you are running cyclictest with a signal based timer and relative
time. Any reason for this ?

What happens if you change the command line to

   cyclictest -i 70000 -p 80 -n

> I am assuming context switching is more expensive on PREEMPT-RT under ARM9,
> where it seems already a bit expensive.

No, the context switch is equally expensive, but signal delivery might
be a bit more overhead on -RT

Thanks,

	tglx


* Re: 2.6.33.7-rt29 PREEMPT_RT worse latency than PREEMPT_DESKTOP on AT91?
  2010-09-15 16:08     ` Thomas Gleixner
@ 2010-09-16 14:31       ` Agustin Ferrin Pozuelo
  0 siblings, 0 replies; 5+ messages in thread
From: Agustin Ferrin Pozuelo @ 2010-09-16 14:31 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-rt-users

  On 15/09/10 17:08, Thomas Gleixner wrote:
> On Wed, 15 Sep 2010, Agustin Ferrin Pozuelo wrote:
>>   On 27/08/10 11:35, Thomas Gleixner wrote:
>>> On Mon, 23 Aug 2010, Agustin Ferrin Pozuelo wrote:
>> Sorry, my test setup is not very comprehensive at the moment. I have some
>> overnight/overweekend results at hand with millions of loops:
>>
>>      PREEMPT-DESKTOP:
>>      root@at91sam9263cpc:~# time cyclictest -i 70000 -p 80 -h 700 -r
> Oh, you are running cyclictest with a signal based timer and relative
> time. Any reason for this ?
Not sure about the "signal based timer", is there an alternative to it? 
(clock_nanosleep?)

I added the relative time setting "-r" to prevent issues with some
specific configurations, especially when the interval was smaller than
the max latency, in which case cyclictest would freeze or give unusable
results. That happened often before I applied the workaround for the
9263 ethernet bug, which induced long latencies (not RT at all).
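
The difference between absolute and relative deadlines can be sketched in a few lines. This is not cyclictest's code, only an illustration of the deadline bookkeeping under the assumption that "relative" means re-arming from the actual wake-up time. The function name is made up, and Python wake-up latencies will be far larger than what cyclictest measures.

```python
import time


def measure_wakeup_latency(interval_ns, loops, relative=False):
    """Sleep until each deadline and return how late each wake-up was, in us."""
    latencies_us = []
    next_deadline = time.monotonic_ns() + interval_ns
    for _ in range(loops):
        remaining_ns = next_deadline - time.monotonic_ns()
        if remaining_ns > 0:
            time.sleep(remaining_ns / 1e9)
        now = time.monotonic_ns()
        latencies_us.append((now - next_deadline) / 1000.0)
        if relative:
            # Re-arm from the actual wake-up: a late cycle does not carry over.
            next_deadline = now + interval_ns
        else:
            # Fixed grid: if one wake-up is later than the interval, the next
            # deadline is already in the past and the lateness carries over.
            next_deadline += interval_ns
    return latencies_us


if __name__ == "__main__":
    for relative in (False, True):
        samples = measure_wakeup_latency(1_000_000, 20, relative=relative)
        mode = "relative" if relative else "absolute"
        print(f"{mode}: min={min(samples):.0f} us "
              f"avg={sum(samples) / len(samples):.0f} us max={max(samples):.0f} us")
```

In the absolute variant a single stall longer than the interval pushes every following deadline into the past, which is one plausible reading of the unusable results described above.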

> What happens if you change the command line to
>
>     cyclictest -i 70000 -p 80 -n
>
Interesting, the "-n" setting (use clock_nanosleep) brings down the min
and avg latencies by 35~40 us (~70 when using "-r") on the
PREEMPT-DESKTOP configuration:

     root@at91sam9263cpc:~# cyclictest -i 70000 -p 80 -l 1000
     Clock resolution: 0.000000001 (s.ns)
     policy: fifo: loadavg: 0.14 0.13 0.06 1/36 954
     T: 0 (  876) P:80 I:70000 C:   1000 Min:    107 Act:  135 Avg:  164 Max:     262

     root@at91sam9263cpc:~# cyclictest -i 70000 -p 80 -l 1000 -n
     Clock resolution: 0.000000001 (s.ns)
     policy: fifo: loadavg: 0.14 0.12 0.05 1/37 874
     T: 0 (  795) P:80 I:70000 C:   1000 Min:     72 Act:  127 Avg:  125 Max:     275

     root@at91sam9263cpc:~# cyclictest -i 70000 -p 80 -l 1000 -r
     Clock resolution: 0.000000001 (s.ns)
     policy: fifo: loadavg: 0.04 0.10 0.06 1/36 1035
     T: 0 (  956) P:80 I:70000 C:   1000 Min:    145 Act:  205 Avg:  203 Max:     348

     root@at91sam9263cpc:~# cyclictest -i 70000 -p 80 -l 1000 -r -n
     Clock resolution: 0.000000001 (s.ns)
     policy: fifo: loadavg: 0.01 0.07 0.05 2/37 1115
     T: 0 ( 1037) P:80 I:70000 C:   1000 Min:     87 Act:  135 Avg:  150 Max:     439

(Longer runs giving similar results)
Note that "-r" seems to lower CPU usage; is loadavg 0.14 == 14% or 0.0014%?
Right now I don't have the PREEMPT-RT version available for this test;
should I expect a big improvement there?
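
On the loadavg question: the five values cyclictest prints after "loadavg:" match the layout of /proc/loadavg, that is, the 1, 5 and 15 minute load averages, then runnable/total tasks, then the most recent PID. A load average is an average count of runnable (or uninterruptibly waiting) tasks, not a percentage. On a single-core system, 0.14 therefore means about 0.14 tasks wanting the CPU on average, which is much closer to "14%" than to "0.0014%". A minimal parsing sketch, with a made-up function name:

```python
def parse_loadavg(text):
    """Split a /proc/loadavg-style string into named fields."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "load_1min": float(fields[0]),
        "load_5min": float(fields[1]),
        "load_15min": float(fields[2]),
        "runnable_tasks": int(runnable),
        "total_tasks": int(total),
        "last_pid": int(fields[4]),
    }


# Values copied from the first "policy: fifo: loadavg: ..." line above.
sample = parse_loadavg("0.14 0.13 0.06 1/36 954")
print(sample)
```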

>> I am assuming context switching is more expensive on PREEMPT-RT under ARM9,
>> where it seems already a bit expensive.
> No, the context switch is equally expensive, but signal delivery might
> be a bit more overhead on -RT
Thanks for the clarification and tips!
Regards,
--Agustín.

