linux-kernel.vger.kernel.org archive mirror
* Scheduler degradation since 2.5.66
From: Guillaume Foliard @ 2003-12-14 19:48 UTC
  To: linux-kernel

Hello,

I have been playing with the 2.5/2.6 kernels for around six months now,
and I was quite pleased to see that the soft real-time behaviour of
2.5.65 was much better than that of 2.4.x. Since then I have tried most
of the 2.5/2.6 versions, but recently someone warned me about a
degradation in 2.6.0-test6. To show the degradation since 2.5.66 I have
run a simple test program on most of the versions. This program measures
the time it takes for a process to be woken up after a call to nanosleep.
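
In essence the program does something like this (a simplified sketch of
the idea, not the exact code):

/* sketch of the measurement: ask nanosleep for 1 ms in a loop and
 * record how long each wakeup really took */
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

int main(void)
{
        struct timespec req = { 0, 1000000 };   /* request a 1 ms sleep */
        struct timeval before, after;
        long elapsed_us;
        int i;

        for (i = 0; i < 1000; i++) {
                gettimeofday(&before, NULL);
                nanosleep(&req, NULL);
                gettimeofday(&after, NULL);
                elapsed_us = (after.tv_sec - before.tv_sec) * 1000000L
                           + (after.tv_usec - before.tv_usec);
                printf("%ld\n", elapsed_us);    /* wakeup time in us */
        }
        return 0;
}
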
As the results are plots, please visit this small website for more
information: http://perso.wanadoo.fr/kayakgabon/linux
I'm ready to perform more tests or provide more information if necessary.

Guillaume


* Re: Scheduler degradation since 2.5.66
From: Nick Piggin @ 2003-12-15  2:45 UTC
  To: Guillaume Foliard; +Cc: linux-kernel, george anzinger



Guillaume Foliard wrote:

>Hello,
>
>I have been playing with the 2.5/2.6 kernels for around six months now,
>and I was quite pleased to see that the soft real-time behaviour of
>2.5.65 was much better than that of 2.4.x. Since then I have tried most
>of the 2.5/2.6 versions, but recently someone warned me about a
>degradation in 2.6.0-test6. To show the degradation since 2.5.66 I have
>run a simple test program on most of the versions. This program measures
>the time it takes for a process to be woken up after a call to nanosleep.
>As the results are plots, please visit this small website for more
>information: http://perso.wanadoo.fr/kayakgabon/linux
>I'm ready to perform more tests or provide more information if necessary.
>

This isn't a problem with the scheduler, it's a problem with
sys_nanosleep. timespec_to_jiffies({1000000 ns}) returns 2 jiffies, and
nanosleep adds an extra one and asks to sleep for that long (i.e. 3 ms).
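
From memory, the relevant lines in kernel/timer.c are roughly this
(paraphrased, not an exact quote):

        /* sys_nanosleep, paraphrased: round the request up to jiffies,
         * add one guard jiffy if any time at all was requested, then
         * sleep for that many ticks */
        expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
        current->state = TASK_INTERRUPTIBLE;
        expire = schedule_timeout(expire);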

The more erratic timings could be due to interactivity changes, as you
say, but you probably aren't running without RT priority.




* Re: Scheduler degradation since 2.5.66
From: Nick Piggin @ 2003-12-15  4:18 UTC
  To: Guillaume Foliard; +Cc: linux-kernel, george anzinger



Nick Piggin wrote:

>
>
> Guillaume Foliard wrote:
>
>> Hello,
>>
>> I have been playing with the 2.5/2.6 kernels for around six months
>> now, and I was quite pleased to see that the soft real-time behaviour
>> of 2.5.65 was much better than that of 2.4.x. Since then I have tried
>> most of the 2.5/2.6 versions, but recently someone warned me about a
>> degradation in 2.6.0-test6. To show the degradation since 2.5.66 I
>> have run a simple test program on most of the versions. This program
>> measures the time it takes for a process to be woken up after a call
>> to nanosleep.
>> As the results are plots, please visit this small website for more
>> information: http://perso.wanadoo.fr/kayakgabon/linux
>> I'm ready to perform more tests or provide more information if
>> necessary.
>>
>
> This isn't a problem with the scheduler, it's a problem with
> sys_nanosleep.
> timespec_to_jiffies({1000000 ns}) returns 2 jiffies, and nanosleep adds
> an extra one and asks to sleep for that long (i.e. 3 ms).


I think you should actually sleep for 2 jiffies here. You have asked
to sleep for _at least_ 1 real millisecond, and you really don't care
how many jiffies that is. Depending on when the last timer interrupt
fired, the next jiffy might arrive in only another microsecond.

So I think you really must sleep for that extra jiffy (but 3 is too
many, I think). Notice that your first graphs are actually bad, because
some sleeps are much less than 1000 us.

I don't know much about the timer code though; perhaps you do need to
sleep for 3 jiffies...

>
> The more erratic timings could be due to interactivity changes, as you
> say, but you probably aren't running without RT priority.


s/without/with



* Re: Scheduler degradation since 2.5.66
From: George Anzinger @ 2003-12-16  0:39 UTC
  To: Nick Piggin; +Cc: Guillaume Foliard, linux-kernel

Nick Piggin wrote:
> 
> 
> Nick Piggin wrote:
> 
>>
>>
>> Guillaume Foliard wrote:
>>
>>> Hello,
>>>
>>> I have been playing with the 2.5/2.6 kernels for around six months
>>> now, and I was quite pleased to see that the soft real-time
>>> behaviour of 2.5.65 was much better than that of 2.4.x. Since then I
>>> have tried most of the 2.5/2.6 versions, but recently someone warned
>>> me about a degradation in 2.6.0-test6. To show the degradation since
>>> 2.5.66 I have run a simple test program on most of the versions.
>>> This program measures the time it takes for a process to be woken up
>>> after a call to nanosleep.
>>> As the results are plots, please visit this small website for more
>>> information: http://perso.wanadoo.fr/kayakgabon/linux
>>> I'm ready to perform more tests or provide more information if
>>> necessary.
>>>
>>
>> This isn't a problem with the scheduler, it's a problem with
>> sys_nanosleep.
>> timespec_to_jiffies({1000000 ns}) returns 2 jiffies, and nanosleep adds
>> an extra one and asks to sleep for that long (i.e. 3 ms).
> 
> 
> 
> I think you should actually sleep for 2 jiffies here. You have asked
> to sleep for _at least_ 1 real millisecond, and you really don't care
> how many jiffies that is. Depending on when the last timer interrupt
> fired, the next jiffy might arrive in only another microsecond.
> 
> So I think you really must sleep for that extra jiffy (but 3 is too
> many, I think). Notice that your first graphs are actually bad,
> because some sleeps are much less than 1000 us.
> 
> I don't know much about the timer code though; perhaps you do need to
> sleep for 3 jiffies...


We get the request, at some time t between tick tt and tt+1, to sleep for
N ticks.  We round this up to the next higher tick count, convert to
jiffies (dropping any fraction), and then add 1.  So that should be 2,
right?  This is added to NOW which, in the test code, is pretty well
pinned to the last tick plus processing time.  So why do you see 3?

What is missing here is that the request was for 1.000000 ms and a tick
is really 0.999849 ms.  So the request is for a bit more than a tick,
which we are obligated to round up to 2 ticks.  Then, adding the 1-tick
guard, we get the 3 you are seeing.  Now if you actually look at the
elapsed time you should see it at about 2.999547 ms, ranging down to
1.999698 ms.
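
In code, the arithmetic comes out like this (a back-of-the-envelope
model of the round-up, not the kernel source):

#include <stdio.h>

#define TICK_NSEC 999849        /* one x86 PIT tick, just under 1 ms */

int main(void)
{
        long request_ns = 1000000;      /* nanosleep asked for 1.000000 ms */
        long ticks, min_ns, max_ns;

        /* round the request up to whole ticks: 1000000/999849 -> 2 */
        ticks = (request_ns + TICK_NSEC - 1) / TICK_NSEC;
        ticks += 1;                     /* the 1-tick guard -> 3 */

        /* the wakeup lands on a tick boundary, so the elapsed time
         * spans one full tick depending on where within a tick the
         * request arrived */
        max_ns = ticks * (long)TICK_NSEC;       /* 2999547 ns */
        min_ns = max_ns - TICK_NSEC;            /* 1999698 ns */
        printf("%ld ticks, elapsed %ld..%ld ns\n", ticks, min_ns, max_ns);
        return 0;
}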

Try running the test with a requested sleep time of something less than
0.999849 ms.  All this is for the x86, which uses this tick length
because it is the best the PIT can do to approximate 1 ms ticks.  You can
even vary this number to see exactly where the round-up actually happens.
Ah, life in the nano world :)


-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml



* Re: Scheduler degradation since 2.5.66
From: Jamie Lokier @ 2003-12-16  0:52 UTC
  To: George Anzinger; +Cc: Nick Piggin, Guillaume Foliard, linux-kernel

George Anzinger wrote:
> Try running the test with a requested sleep time of something less
> than 0.999849 ms.  All this is for the x86, which uses this tick
> length because it is the best the PIT can do to approximate 1 ms
> ticks.  You can even vary this number to see exactly where the
> round-up actually happens.  Ah, life in the nano world :)

Would it be better to program the PIT for the lowest frequency whose
tick is >= 1.0 ms?
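
For instance (my arithmetic, assuming the usual 1193180 Hz PIT input
clock):

#include <stdio.h>

int main(void)
{
        const double pit_hz = 1193180.0;        /* assumed PIT input clock */

        /* the current divisor gives a tick just under 1 ms;
         * one count more would put it just over */
        printf("divisor 1193: %.6f ms\n", 1000.0 * 1193 / pit_hz);
        printf("divisor 1194: %.6f ms\n", 1000.0 * 1194 / pit_hz);
        return 0;
}

With a tick just over 1 ms, a 1.000000 ms request would round up to a
single jiffy, so with the guard jiffy nanosleep(1 ms) would take 2 ticks
instead of 3.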

-- Jamie


* Re: Scheduler degradation since 2.5.66
From: George Anzinger @ 2003-12-16  7:45 UTC
  To: Jamie Lokier; +Cc: Nick Piggin, Guillaume Foliard, linux-kernel

Jamie Lokier wrote:
> George Anzinger wrote:
> 
>>Try running the test with a requested sleep time of something less than
>>0.999849 ms.  All this is for the x86, which uses this tick length
>>because it is the best the PIT can do to approximate 1 ms ticks.  You
>>can even vary this number to see exactly where the round-up actually
>>happens.  Ah, life in the nano world :)
> 
> 
> Would it be better to program the PIT for the lowest frequency whose
> tick is >= 1.0 ms?

Possibly.  I haven't attempted to analyze it.  I do know it would make
some of the math a bear.  Integers like to round down (read: truncate),
so...  But then, what we have isn't exactly fun :)

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml



* Re: Scheduler degradation since 2.5.66
From: Bill Davidsen @ 2004-01-15  0:43 UTC
  To: linux-kernel, George Anzinger
  Cc: Nick Piggin, Guillaume Foliard, linux-kernel

George Anzinger wrote:

> We get the request, at some time t between tick tt and tt+1, to sleep
> for N ticks.  We round this up to the next higher tick count, convert
> to jiffies (dropping any fraction), and then add 1.  So that should be
> 2, right?  This is added to NOW which, in the test code, is pretty
> well pinned to the last tick plus processing time.  So why do you see 3?
> 
> What is missing here is that the request was for 1.000000 ms and a
> tick is really 0.999849 ms.  So the request is for a bit more than a
> tick, which we are obligated to round up to 2 ticks.  Then, adding the
> 1-tick guard, we get the 3 you are seeing.  Now if you actually look
> at the elapsed time you should see it at about 2.999547 ms, ranging
> down to 1.999698 ms.

Clearly the rounding between what you want and the resolution of the 
hardware tick is never going to be perfect if there is a non-integer 
ratio between the values. If this is a real concern, you can play with 
the algorithm and/or go to a faster clock. Or both.

You might also be much happier simply setting target times 2 ms apart
and sleeping for target - NOW ns. That allows for the processing time.
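
In code, something like this (an untested sketch of the idea):

/* untested sketch: advance a fixed target by 2 ms each iteration and
 * sleep only for whatever remains of it */
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

int main(void)
{
        struct timeval now;
        long long target_us, now_us;
        int i;

        gettimeofday(&now, NULL);
        target_us = now.tv_sec * 1000000LL + now.tv_usec;

        for (i = 0; i < 1000; i++) {
                target_us += 2000;      /* next target, 2 ms after the last */
                gettimeofday(&now, NULL);
                now_us = now.tv_sec * 1000000LL + now.tv_usec;
                if (target_us > now_us) {
                        /* sleep only for target - NOW, so processing
                         * time is absorbed rather than accumulated */
                        struct timespec req =
                                { 0, (long)((target_us - now_us) * 1000) };
                        nanosleep(&req, NULL);
                }
                /* ... do the periodic work here ... */
        }
        return 0;
}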

If the kernel had a better idea of when the next tick would be, counting
from the last tick instead of from NOW, you could probably do better, but
I'm not suggesting that overhead be added to the tick code just in case
someone needs a better nanosleep. I don't know how well that would work
in the SMP case in any event. Sort of
   wait_ticks = 1 + int((NOW + delay - time_since_last_tick)/ns_per_tick)
or
   wait_ticks =
     int((NOW + delay - time_since_last_tick + ns_per_tick - 1)/ns_per_tick)

I think there's too much caution about going over, but without playing 
with the code I'm just dropping ideas.
-- 
bill davidsen <davidsen@tmr.com>
   CTO TMR Associates, Inc
   Doing interesting things with small computers since 1979


* Re: Scheduler degradation since 2.5.66
From: George Anzinger @ 2004-01-15  3:36 UTC
  To: Bill Davidsen; +Cc: Nick Piggin, Guillaume Foliard, linux-kernel

Bill Davidsen wrote:
> George Anzinger wrote:
> 
>> We get the request, at some time t between tick tt and tt+1, to sleep
>> for N ticks.  We round this up to the next higher tick count, convert
>> to jiffies (dropping any fraction), and then add 1.  So that should
>> be 2, right?  This is added to NOW which, in the test code, is pretty
>> well pinned to the last tick plus processing time.  So why do you see 3?
>>
>> What is missing here is that the request was for 1.000000 ms and a
>> tick is really 0.999849 ms.  So the request is for a bit more than a
>> tick, which we are obligated to round up to 2 ticks.  Then, adding
>> the 1-tick guard, we get the 3 you are seeing.  Now if you actually
>> look at the elapsed time you should see it at about 2.999547 ms,
>> ranging down to 1.999698 ms.
> 
> 
> Clearly the rounding between what you want and the resolution of the 
> hardware tick is never going to be perfect if there is a non-integer 
> ratio between the values. If this is a real concern, you can play with 
> the algorithm and/or go to a faster clock. Or both.
> 
> You might also be much happier simply setting target times 2 ms apart
> and sleeping for target - NOW ns. That allows for the processing time.
> 
> If the kernel had a better idea of when the next tick would be,
> counting from the last tick instead of from NOW, you could probably
> do better,

But then you would have better resolution.  For that, see the
high-res-timers patch in my signature, which will get you much closer but
still plays by the standard rules.

> but I'm not suggesting that overhead be added to the tick code just in
> case someone needs a better nanosleep. I don't know how well that
> would work in the SMP case in any event. Sort of
>   wait_ticks = 1 + int((NOW + delay - time_since_last_tick)/ns_per_tick)
> or
>   wait_ticks =
>     int((NOW + delay - time_since_last_tick + ns_per_tick - 1)/ns_per_tick)
> 
> I think there's too much caution about going over, but without playing 
> with the code I'm just dropping ideas.

The "caution" is around the standard that says "thou shalt never wake early" or 
words to that effect.

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml


