linux-kernel.vger.kernel.org archive mirror
* Jiffy based timers/timeouts can expire too soon.
@ 2004-12-02 16:05 David Vrabel
  2004-12-02 16:35 ` Chris Friesen
  2004-12-02 18:47 ` john stultz
  0 siblings, 2 replies; 5+ messages in thread
From: David Vrabel @ 2004-12-02 16:05 UTC (permalink / raw)
  To: Linux Kernel

Hi,

Jiffy based timers and timeouts can expire too soon because the timer 
interrupt accounts for lost ticks and can increment jiffies by more than 1.

Consider the following:

    unsigned long timeout = jiffies + 1;

    /* <--- timer interrupt here:
     *      jiffies += 2 (i.e., catching up one missed interrupt)
     */

    if (time_after(jiffies, timeout)) {
        /* but 1 tick worth of time hasn't (necessarily) elapsed */
    }

This was originally observed on an ARM platform[1] but the i386 timer 
interrupt appears to behave in a similar way.
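
To make the race concrete, here is a minimal userspace sketch of the case above. It is only a model, not kernel code: sim_time_after() mimics the kernel's time_after(), and struct sim_clock, sim_timer_irq() and sim_one_tick_timeout() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's time_after(a, b): true when a is
 * later than b, even if the counter has wrapped. */
static bool sim_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical model of the clock: a tick counter plus "real" time in us. */
struct sim_clock {
	unsigned long jiffies;
	unsigned long long now_us;
};

/* The timer interrupt fires at irq_us and adds `ticks` to jiffies
 * (ticks > 1 models catching up on lost interrupts). */
static void sim_timer_irq(struct sim_clock *c, unsigned long long irq_us,
			  unsigned long ticks)
{
	assert(irq_us >= c->now_us);
	c->now_us = irq_us;
	c->jiffies += ticks;
}

/* Arm a 1-tick timeout at arm_us, take one timer interrupt at irq_us, then
 * check it. Returns the real time (us) that had passed when the timeout
 * was seen as expired, or -1 if it had not expired yet. */
static long long sim_one_tick_timeout(unsigned long long arm_us,
				      unsigned long long irq_us,
				      unsigned long ticks)
{
	struct sim_clock c = { .jiffies = 100, .now_us = arm_us };
	unsigned long timeout = c.jiffies + 1;

	sim_timer_irq(&c, irq_us, ticks);
	if (sim_time_after(c.jiffies, timeout))
		return (long long)(c.now_us - arm_us);
	return -1;
}
```

Assuming HZ=100 (a 10000 us tick), sim_one_tick_timeout(0, 1000, 2) reports the timeout as expired after only 1000 us, whereas with ticks = 1 it is still pending.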

Is the solution here to:

1. Not use jiffies for timers/timeouts with only a few ticks?

or

2. Have two independent "jiffies": the existing one which is used for 
the wallclock only; and one which counts the number of timer interrupts 
and will guarantee that timers don't expire prematurely?

or

3. Something else?
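
As a rough sketch of what option 2 might look like (userspace C, with hypothetical names struct dual_ticks, dual_timer_irq() and dual_timeout_expired(); nothing like this exists in the kernel as far as I know):

```c
#include <assert.h>
#include <stdbool.h>

/* Two independent counters (hypothetical):
 *   wall_jiffies - catches up on lost ticks, used for the wallclock only
 *   irq_ticks    - advances by exactly 1 per timer interrupt taken */
struct dual_ticks {
	unsigned long wall_jiffies;
	unsigned long irq_ticks;
};

/* One timer interrupt that also accounts for `lost` missed interrupts. */
static void dual_timer_irq(struct dual_ticks *t, unsigned long lost)
{
	t->wall_jiffies += 1 + lost;	/* wallclock catches up */
	t->irq_ticks += 1;		/* timers see a single tick */
}

/* Same wrap-safe comparison as time_after(), applied to irq_ticks. */
static bool dual_timeout_expired(const struct dual_ticks *t,
				 unsigned long timeout)
{
	return (long)(timeout - t->irq_ticks) < 0;
}
```

With this split, an interrupt that catches up one lost tick moves wall_jiffies by 2 but irq_ticks by 1, so a timeout of irq_ticks + 1 is not reported as expired until a second interrupt arrives. The cost is that timers armed before the ticks were lost would run late.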

David Vrabel

[1] http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2004-December/025695.html
-- 
David Vrabel, Design Engineer

Arcom, Clifton Road           Tel: +44 (0)1223 411200 ext. 3233
Cambridge CB1 7EA, UK         Web: http://www.arcom.com/


* Re: Jiffy based timers/timeouts can expire too soon.
  2004-12-02 16:05 Jiffy based timers/timeouts can expire too soon David Vrabel
@ 2004-12-02 16:35 ` Chris Friesen
  2004-12-02 18:47 ` john stultz
  1 sibling, 0 replies; 5+ messages in thread
From: Chris Friesen @ 2004-12-02 16:35 UTC (permalink / raw)
  To: David Vrabel; +Cc: Linux Kernel

David Vrabel wrote:
> Hi,
> 
> Jiffy based timers and timeouts can expire too soon because the timer 
> interrupt accounts for lost ticks and can increment jiffies by more than 1.

On the other hand, you also need to account for lost ticks for timers that 
started before interrupts were turned off, otherwise they could run for extra time.

It would be nice to have some kind of constant frequency timestamp that 
increments regardless of sleep state or interrupt status, so that sleep periods 
and such are based on timestamps rather than ticks (the period of which can vary).

Chris



* Re: Jiffy based timers/timeouts can expire too soon.
  2004-12-02 16:05 Jiffy based timers/timeouts can expire too soon David Vrabel
  2004-12-02 16:35 ` Chris Friesen
@ 2004-12-02 18:47 ` john stultz
  2004-12-02 23:28   ` Anton Blanchard
  1 sibling, 1 reply; 5+ messages in thread
From: john stultz @ 2004-12-02 18:47 UTC (permalink / raw)
  To: David Vrabel; +Cc: Linux Kernel, mann

On Thu, 2004-12-02 at 08:05, David Vrabel wrote:
> Jiffy based timers and timeouts can expire too soon because the timer 
> interrupt accounts for lost ticks and can increment jiffies by more than 1.
> 
> Consider the following:
> 
>      unsigned long timeout = jiffies + 1;
> 
>     <--- timer interrupt here:
>          jiffies += 2 (i.e., catching up one missed interrupt)
> 
>     if (time_after(jiffies, timeout))
> 	/* but 1 tick worth of time hasn't (necessarily) elapsed */

Well, hopefully the lost tick detection code won't overcompensate, so
it shouldn't be an issue. However, as Tim Mann pointed out, due to
interrupt delay and queuing, it is seen on virtualized systems.

See http://www.ussg.iu.edu/hypermail/linux/kernel/0411.2/2293.html for
his explanation.

> This was originally observed on an ARM platform[1] but the i386 timer 
> interrupt appears to behave in a similar way.
> 
> Is the solution here to:
> 
> 1. Not use jiffies for timers/timeouts with only a few ticks?
> 
> or
> 
> 2. Have two independent "jiffies": the existing one which is used for 
> the wallclock only; and one which counts the number of timer interrupts 
> and will guarantee that timers don't expire prematurely?
> 
> or
> 
> 3. Something else?

Ideally (in my mind), we need to first get a stable and reliable time
base that isn't affected by lost interrupts. This is what I've been
(unfortunately not very frequently) working on w/ my time of day re-work
(See http://lwn.net/Articles/100665/ - although this is somewhat out of
date as I've been re-working some bits and need to post a new set of
patches soon). 

Once that is done, we can convert the timer subsystem to use units of
nanoseconds, rather than jiffies, for its time accounting. Interrupt
delay, loss, and queuing then become a latency issue, rather than a
correctness one.
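
To illustrate the idea only (this is a userspace sketch, not the time of day patches): a deadline held in nanoseconds against a monotonic clock cannot expire early because of how many ticks an interrupt adds. The helper names mono_now_ns(), deadline_after_ns() and deadline_expired() are made up for this example; clock_gettime(CLOCK_MONOTONIC) is the standard POSIX call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds (userspace stand-in for a
 * stable kernel time base). */
static uint64_t mono_now_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Absolute deadline delay_ns after now_ns. */
static uint64_t deadline_after_ns(uint64_t now_ns, uint64_t delay_ns)
{
	return now_ns + delay_ns;
}

/* True once now_ns has reached the deadline. A late check only adds
 * latency; it can never report expiry before the time has passed. */
static bool deadline_expired(uint64_t now_ns, uint64_t deadline_ns)
{
	return (int64_t)(now_ns - deadline_ns) >= 0;
}
```

A caller would compute deadline_after_ns(mono_now_ns(), delay) once and then test deadline_expired(mono_now_ns(), deadline) whenever it gets to run, however late that is.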

However, patches speak louder than words (and I need to do less talking
and more coding), so don't let me keep you from implementing your own
solution. 

thanks
-john



* Re: Jiffy based timers/timeouts can expire too soon.
  2004-12-02 18:47 ` john stultz
@ 2004-12-02 23:28   ` Anton Blanchard
  2004-12-16 20:38     ` George Anzinger
  0 siblings, 1 reply; 5+ messages in thread
From: Anton Blanchard @ 2004-12-02 23:28 UTC (permalink / raw)
  To: john stultz; +Cc: David Vrabel, Linux Kernel, mann

 
> Well, hopefully the lost tick detection code won't overcompensate, so
> it shouldn't be an issue. However, as Tim Mann pointed out, due to
> interrupt delay and queuing, it is seen on virtualized systems.

We saw this on ppc64 on earlier 2.6 kernels. There were some bugs with
the VM where interrupts would get disabled for a long time (we saw 20+
second periods). A SCSI timeout would occur on another CPU and at that
time irqs would get reenabled and 20 seconds of time would get replayed.

A bunch of timers would go off early and the SCSI adapter would explode.

Anton


* Re: Jiffy based timers/timeouts can expire too soon.
  2004-12-02 23:28   ` Anton Blanchard
@ 2004-12-16 20:38     ` George Anzinger
  0 siblings, 0 replies; 5+ messages in thread
From: George Anzinger @ 2004-12-16 20:38 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: john stultz, David Vrabel, Linux Kernel, mann

Anton Blanchard wrote:
> > Well, hopefully the lost tick detection code won't overcompensate, so
> > it shouldn't be an issue. However, as Tim Mann pointed out, due to
> > interrupt delay and queuing, it is seen on virtualized systems.
>
> We saw this on ppc64 on earlier 2.6 kernels. There were some bugs with
> the VM where interrupts would get disabled for a long time (we saw 20+
> second periods). A SCSI timeout would occur on another CPU and at that
> time irqs would get reenabled and 20 seconds of time would get replayed.
> 
> A bunch of timers would go off early and the SCSI adapter would explode.

The problem is that "most" code believes jiffies is right.  Under long interrupt 
off times, it is not.  I suspect that most of the early timers came from code 
that set the timer with the interrupt system off.  Some might say they got what 
they deserved :).

In the HRT patch, we always correct jiffies to the real value (by marking the 
TSC value at the last jiffies update and using that plus the current TSC to 
correct).  It would be rather easy to provide an interface to get the real 
current jiffies value, but it is another thing to correct all the code that 
uses jiffies.  Attempts to make jiffies a macro pick up far too many uses of 
the word in several name spaces to make it a reasonable thing to do.
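
A minimal sketch of that correction, in userspace C and under stated assumptions: this is not the HRT patch code, and struct jiffy_mark, its fields and corrected_jiffies() are hypothetical names used only to show the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Snapshot taken each time jiffies is advanced (hypothetical layout). */
struct jiffy_mark {
	unsigned long jiffies;		/* counter value at the last update */
	uint64_t cycles_at_update;	/* cycle counter (e.g. TSC) at that update */
	uint64_t cycles_per_jiffy;	/* calibration, assumed constant */
};

/* "Real" jiffies: the stored count plus however many whole tick periods
 * the cycle counter says have passed since the last update. */
static unsigned long corrected_jiffies(const struct jiffy_mark *m,
				       uint64_t cycles_now)
{
	uint64_t delta;

	assert(m->cycles_per_jiffy > 0);
	delta = cycles_now - m->cycles_at_update;
	return m->jiffies + (unsigned long)(delta / m->cycles_per_jiffy);
}
```

If interrupts have been off for 3.5 tick periods since the last update, this returns the stored jiffies plus 3, so a timer armed from the corrected value does not start from a stale base.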
-
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/


