linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: apic and 8254 wraparound ...
  2004-12-24  0:11 apic and 8254 wraparound Herbert Poetzl
@ 2004-12-23 23:37 ` Alan Cox
  2004-12-24 20:00   ` Herbert Poetzl
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Cox @ 2004-12-23 23:37 UTC (permalink / raw)
  To: Herbert Poetzl; +Cc: Ingo Molnar, Linux Kernel Mailing List

On Gwe, 2004-12-24 at 00:11, Herbert Poetzl wrote:
> I don't know what kind of errors the buggy Mercury/
> Neptune chipset timers can cause, maybe they need some
> special handling, but from the available code, what 
> about something like this:

Data sheet is on the intel site, but essentially latching bugs on the
reads.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* apic and 8254 wraparound ...
@ 2004-12-24  0:11 Herbert Poetzl
  2004-12-23 23:37 ` Alan Cox
  0 siblings, 1 reply; 7+ messages in thread
From: Herbert Poetzl @ 2004-12-24  0:11 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel


Hi Ingo! Folks!

I was investigating why linux-2.6.10-rc3 wasn't booting
with HZ set to higher values (like 5k,10k or 20k) on
certain machines/configurations and I tracked that down
to the wait_8254_wraparound() which does the following:

/* next tick in 8254 can be caught by catching timer wraparound */
static void __init wait_8254_wraparound(void)
{
        unsigned int curr_count, prev_count=~0;
        int delta;

        curr_count = get_8254_timer_count();

        do {
                prev_count = curr_count;
                curr_count = get_8254_timer_count();
                delta = curr_count-prev_count;

        /*
         * This limit for delta seems arbitrary, but it isn't, it's
         * slightly above the level of error a buggy Mercury/Neptune
         * chipset timer can cause.
         */

        } while (delta < 300);
}

further investigation showed that the inital value for
the timer is lower than 300 for higher HZ values, so
the kernel keeps spinning in this loop (which can be 
easily fixed by 'adjusting' the limit), but I ask myself, 
if this check can be simplified ...

I don't know what kind of errors the buggy Mercury/
Neptune chipset timers can cause, maybe they need some
special handling, but from the available code, what 
about something like this:


@@ -878,22 +878,14 @@ static unsigned int __init get_8254_time
 static void __init wait_8254_wraparound(void)
 {
 	unsigned int curr_count, prev_count=~0;
-	int delta;
 
 	curr_count = get_8254_timer_count();
 
 	do {
 		prev_count = curr_count;
 		curr_count = get_8254_timer_count();
-		delta = curr_count-prev_count;
 
-	/*
-	 * This limit for delta seems arbitrary, but it isn't, it's
-	 * slightly above the level of error a buggy Mercury/Neptune
-	 * chipset timer can cause.
-	 */
-
-	} while (delta < 300);
+	} while (curr_count <= prev_count);
 }
 
 /*


TIA,
Herbert


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: apic and 8254 wraparound ...
  2004-12-24 20:00   ` Herbert Poetzl
@ 2004-12-24 19:40     ` Alan Cox
  2004-12-25 22:48       ` Herbert Poetzl
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Cox @ 2004-12-24 19:40 UTC (permalink / raw)
  To: Herbert Poetzl; +Cc: Ingo Molnar, Linux Kernel Mailing List

On Gwe, 2004-12-24 at 20:00, Herbert Poetzl wrote:
> somehow I'm unable to locate it, I can see the 
> 430TX and 430HX but not the 430NX and 430LX ...
> 
> do you have an url for me?

Not to hand, its under the retired chipset stuff. The details are as
follows from memory

When you read one 8bit value from an 8254 timer the values latch for
read so that when you read the other half of the 16bit value you get the
value from the moment of the first read. On 
neptune that didn't work right so you got halves of two differing
samples. That means the error would be worst case a bit under 300 (257
for the wrap + a few for timing)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: apic and 8254 wraparound ...
  2004-12-23 23:37 ` Alan Cox
@ 2004-12-24 20:00   ` Herbert Poetzl
  2004-12-24 19:40     ` Alan Cox
  0 siblings, 1 reply; 7+ messages in thread
From: Herbert Poetzl @ 2004-12-24 20:00 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ingo Molnar, Linux Kernel Mailing List


Hi Alan!

On Thu, Dec 23, 2004 at 11:37:17PM +0000, Alan Cox wrote:
> On Gwe, 2004-12-24 at 00:11, Herbert Poetzl wrote:
> > I don't know what kind of errors the buggy Mercury/
> > Neptune chipset timers can cause, maybe they need some
> > special handling, but from the available code, what 
> > about something like this:
> 
> Data sheet is on the intel site, but essentially 
> latching bugs on the reads.

somehow I'm unable to locate it, I can see the 
430TX and 430HX but not the 430NX and 430LX ...

do you have an url for me?

TIA,
Herbert

> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: apic and 8254 wraparound ...
  2004-12-24 19:40     ` Alan Cox
@ 2004-12-25 22:48       ` Herbert Poetzl
  2004-12-26 18:08         ` Pavel Machek
  0 siblings, 1 reply; 7+ messages in thread
From: Herbert Poetzl @ 2004-12-25 22:48 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ingo Molnar, Linux Kernel Mailing List

On Fri, Dec 24, 2004 at 07:40:38PM +0000, Alan Cox wrote:
> On Gwe, 2004-12-24 at 20:00, Herbert Poetzl wrote:
> > somehow I'm unable to locate it, I can see the 
> > 430TX and 430HX but not the 430NX and 430LX ...
> > 
> > do you have an url for me?
> 
> Not to hand, its under the retired chipset stuff. The details are as
> follows from memory
> 
> When you read one 8bit value from an 8254 timer the values latch for
> read so that when you read the other half of the 16bit value you get the
> value from the moment of the first read. On 
> neptune that didn't work right so you got halves of two differing
> samples. That means the error would be worst case a bit under 300 (257
> for the wrap + a few for timing)

okay, I still wasn't able to find the documentation 
at the intel site, but I could extrapolate the issue
from your explanation (thanks by the way)

get_8254_timer_count() reads lo byte first, then the 
high byte, so assuming that the latch doesn't work
as expected on intel 430 NX and LX chipsets, can 
result in the following type of error:

counter >= 2^8 * N, 	LO is read (for example 0)
counter is decremented
counter <  2^8 * N  	HI is read (N - 1)

so the read value will be exactly 2^8 lower than
expected (assumed that the counter doesn't do more
than 256 counts between the two inb_p()s)

second the wrap-around will always happen _after_
the counter reached zero, so we can further assume
that the prev_count, has to be lower than 2^8, when
we observe a wraparound (otherwise we don't care)

let's further assume the counter does not decrement
more than 2^7 between two consecutive gets, then we
can change the wraparound check to something like
this:

        curr_count = get_8254_timer_count();

	do {
        	prev_count = curr_count;
	redo:
        	curr_count = get_8254_timer_count();

		/* workaround for broken Mercury/Neptune */
		if (prev_count - current_count >= 256)
	    		goto redo;

		/* ignore values far off from zero */
    		if (prev_count > 128)
	    		continue;

	} while (prev_count >= curr_count)


basically the check for (prev_count > 128) can be
removed but it feels a little more comfortable ...

would such change be acceptable for mainline?

TIA,
Herbert

> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: apic and 8254 wraparound ...
  2004-12-25 22:48       ` Herbert Poetzl
@ 2004-12-26 18:08         ` Pavel Machek
  2004-12-26 19:42           ` Herbert Poetzl
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Machek @ 2004-12-26 18:08 UTC (permalink / raw)
  To: Alan Cox, Ingo Molnar, Linux Kernel Mailing List

Hi!

> > When you read one 8bit value from an 8254 timer the values latch for
> > read so that when you read the other half of the 16bit value you get the
> > value from the moment of the first read. On 
> > neptune that didn't work right so you got halves of two differing
> > samples. That means the error would be worst case a bit under 300 (257
> > for the wrap + a few for timing)
> 
> okay, I still wasn't able to find the documentation 
> at the intel site, but I could extrapolate the issue
> from your explanation (thanks by the way)
> 
> get_8254_timer_count() reads lo byte first, then the 
> high byte, so assuming that the latch doesn't work
> as expected on intel 430 NX and LX chipsets, can 
> result in the following type of error:
> 
> counter >= 2^8 * N, 	LO is read (for example 0)
> counter is decremented
> counter <  2^8 * N  	HI is read (N - 1)
> 
> so the read value will be exactly 2^8 lower than
> expected (assumed that the counter doesn't do more
> than 256 counts between the two inb_p()s)
> 
> second the wrap-around will always happen _after_
> the counter reached zero, so we can further assume
> that the prev_count, has to be lower than 2^8, when
> we observe a wraparound (otherwise we don't care)
> 
> let's further assume the counter does not decrement
> more than 2^7 between two consecutive gets, then we
> can change the wraparound check to something like
> this:
> 
>         curr_count = get_8254_timer_count();
> 
> 	do {
>         	prev_count = curr_count;
> 	redo:
>         	curr_count = get_8254_timer_count();
> 
> 		/* workaround for broken Mercury/Neptune */
> 		if (prev_count - current_count >= 256)
> 	    		goto redo;
> 
> 		/* ignore values far off from zero */
>     		if (prev_count > 128)
> 	    		continue;
> 
> 	} while (prev_count >= curr_count)
> 
> 
> basically the check for (prev_count > 128) can be
> removed but it feels a little more comfortable ...
> 
> would such change be acceptable for mainline?

Not sure... Reading time is quite performance critical; doing it twice
would be bad. It should be acceptable if it was only done on
Mercury/Neptune systems.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: apic and 8254 wraparound ...
  2004-12-26 18:08         ` Pavel Machek
@ 2004-12-26 19:42           ` Herbert Poetzl
  0 siblings, 0 replies; 7+ messages in thread
From: Herbert Poetzl @ 2004-12-26 19:42 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Alan Cox, Ingo Molnar, Linux Kernel Mailing List

On Sun, Dec 26, 2004 at 07:08:06PM +0100, Pavel Machek wrote:
> Hi!
> 
> > > When you read one 8bit value from an 8254 timer the values latch for
> > > read so that when you read the other half of the 16bit value you get the
> > > value from the moment of the first read. On 
> > > neptune that didn't work right so you got halves of two differing
> > > samples. That means the error would be worst case a bit under 300 (257
> > > for the wrap + a few for timing)
> > 
> > okay, I still wasn't able to find the documentation 
> > at the intel site, but I could extrapolate the issue
> > from your explanation (thanks by the way)
> > 
> > get_8254_timer_count() reads lo byte first, then the 
> > high byte, so assuming that the latch doesn't work
> > as expected on intel 430 NX and LX chipsets, can 
> > result in the following type of error:
> > 
> > counter >= 2^8 * N, 	LO is read (for example 0)
> > counter is decremented
> > counter <  2^8 * N  	HI is read (N - 1)
> > 
> > so the read value will be exactly 2^8 lower than
> > expected (assumed that the counter doesn't do more
> > than 256 counts between the two inb_p()s)
> > 
> > second the wrap-around will always happen _after_
> > the counter reached zero, so we can further assume
> > that the prev_count, has to be lower than 2^8, when
> > we observe a wraparound (otherwise we don't care)
> > 
> > let's further assume the counter does not decrement
> > more than 2^7 between two consecutive gets, then we
> > can change the wraparound check to something like
> > this:
> > 
> >         curr_count = get_8254_timer_count();
> > 
> > 	do {
> >         	prev_count = curr_count;
> > 	redo:
> >         	curr_count = get_8254_timer_count();
> > 
> > 		/* workaround for broken Mercury/Neptune */
> > 		if (prev_count - current_count >= 256)
> > 	    		goto redo;
> > 
> > 		/* ignore values far off from zero */
> >     		if (prev_count > 128)
> > 	    		continue;
> > 
> > 	} while (prev_count >= curr_count)
> > 
> > 
> > basically the check for (prev_count > 128) can be
> > removed but it feels a little more comfortable ...
> > 
> > would such change be acceptable for mainline?
> 
> Not sure... Reading time is quite performance critical; doing it twice
> would be bad. It should be acceptable if it was only done on
> Mercury/Neptune systems.

this important detail got lost over the thread

static void __init wait_8254_wraparound(void)

so I guess it should not be _too_ critical ;)

best,
Herbert

> -- 
> People were complaining that M$ turns users into beta-testers...
> ...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2004-12-26 19:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-12-24  0:11 apic and 8254 wraparound Herbert Poetzl
2004-12-23 23:37 ` Alan Cox
2004-12-24 20:00   ` Herbert Poetzl
2004-12-24 19:40     ` Alan Cox
2004-12-25 22:48       ` Herbert Poetzl
2004-12-26 18:08         ` Pavel Machek
2004-12-26 19:42           ` Herbert Poetzl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).