linux-kernel.vger.kernel.org archive mirror
* Re: [BUG] 2.6.0-test2 loses time on 486
@ 2003-07-30 22:52 Mikael Pettersson
  2003-07-30 23:16 ` john stultz
  0 siblings, 1 reply; 9+ messages in thread
From: Mikael Pettersson @ 2003-07-30 22:52 UTC (permalink / raw)
  To: johnstul; +Cc: linux-kernel

On 30 Jul 2003 13:08:44 -0700, john stultz <johnstul@us.ibm.com> wrote:
>On Wed, 2003-07-30 at 12:19, Mikael Pettersson wrote:
>> On 29 Jul 2003 11:59:06 -0700, john stultz wrote:
>> >Hmm.  Sounds like you're losing interrupts. This can happen due to
>> >poorly behaving drivers (disabling interrupts for too long), or odd
>> >hardware. The change from HZ=100 to HZ=1000 probably made this more
>> >visible on your box, so could you try setting HZ back to 100 and see if
>> >that helps (you may still lose time, but at a much slower rate). 
>> 
>> Yep, reducing HZ to 100 in param.h eliminated the time losses.
>
>Ok, that's what I figured. 
>
>> >Also what drivers are you running with?
>> 
>> IDE, no chipset driver, NE2000 ISA NIC (no traffic during the
>> tests), AT keyboard + PS/2 mouse (unused during the tests).
>> 
>> The only things I can think of are:
>> - a 486 simply cannot keep up with HZ=1000
>> - the plain IDE driver w/o chipset & DMA support somehow
>>   is much worse in 2.5/2.6 than in 2.4
>> - the "no TSC" time-keeping code is broken
>
>Well, I suspect it's just the first. If you're not generating interrupts
>then I'm doubtful the IDE driver is at fault (although I'd believe it if
>you were losing time under load). Also the PIT based time source is
>pretty simple and hasn't functionally changed much (well, it has been
>moved around a bit). 
>
>It may be the timer interrupt has grown in cost since the argument to
>change HZ to 1000 was made. Although using the PIT there isn't much we
>do from a time of day perspective. If I can find a second, I'll see if I
>can compare interrupt overhead between 2.4 and 2.5. But I'd imagine the
>box would barely be usable if we're wasting all our time handling timer
>interrupts (is it usable??).

Well, the test the box was running (recompile 2.4.22-pre) generates
a lot of disk traffic, including swapping, since the box has so little
RAM (only 28M). So IDE interrupts are frequent and the box is both
CPU and I/O bound. I can still log in to it, type shell commands and
so on, but starting emacs would be a bad idea...
 
To test the "486 can't cope with HZ=1000" thesis I tried a RedHat
2.4.18-27.8 kernel which has a CONFIG_HZ option. Using 2.4.18-27.8
with CONFIG_HZ=1000, the box still lost time during the "recompile
2.4.22-pre" test, but only about 15 seconds per hour instead of 2
minutes per hour as it does with 2.6-test.
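
To put the two drift figures side by side, here is a small sketch in C
that converts a drift in seconds per hour into the number and share of
timer ticks that would have gone missing. It assumes that all of the
drift comes from lost timer interrupts and that each lost tick costs
exactly 1/HZ seconds; the function names are made up for illustration
and are not kernel code.

```c
/* Hypothetical helpers (not kernel code): assume every second of drift
 * is caused by lost timer ticks, each worth 1/HZ seconds. */

/* Number of timer ticks that must have been lost per hour. */
long lost_ticks_per_hour(long drift_sec_per_hour, long hz)
{
    return drift_sec_per_hour * hz;
}

/* Lost ticks as a share of all ticks in an hour, in percent.
 * The share does not depend on HZ: it is simply drift / 3600 s. */
double lost_tick_percent(long drift_sec_per_hour)
{
    return 100.0 * (double)drift_sec_per_hour / 3600.0;
}
```

Under those assumptions, with HZ=1000 the 2 minutes per hour on 2.6-test
would mean 120000 lost ticks per hour (about 3.3% of all ticks), and the
15 seconds per hour on 2.4.18-27.8 would mean 15000 lost ticks per hour
(about 0.42%).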

/Mikael

* Re: [BUG] 2.6.0-test2 loses time on 486
  2003-07-30 22:52 [BUG] 2.6.0-test2 loses time on 486 Mikael Pettersson
@ 2003-07-30 23:16 ` john stultz
  2003-07-31  6:06   ` Jan-Benedict Glaw
  0 siblings, 1 reply; 9+ messages in thread
From: john stultz @ 2003-07-30 23:16 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: lkml

On Wed, 2003-07-30 at 15:52, Mikael Pettersson wrote:
> On 30 Jul 2003 13:08:44 -0700, john stultz <johnstul@us.ibm.com> wrote:

> >Well, I suspect it's just the first. If you're not generating interrupts
> >then I'm doubtful the IDE driver is at fault (although I'd believe it if
> >you were losing time under load). Also the PIT based time source is
> >pretty simple and hasn't functionally changed much (well, it has been
> >moved around a bit). 
> >
> >It may be the timer interrupt has grown in cost since the argument to
> >change HZ to 1000 was made. Although using the PIT there isn't much we
> >do from a time of day perspective. If I can find a second, I'll see if I
> >can compare interrupt overhead between 2.4 and 2.5. But I'd imagine the
> >box would barely be usable if we're wasting all our time handling timer
> >interrupts (is it usable??).
> 
> Well, the test the box was running (recompile 2.4.22-pre) generates
> a lot of disk traffic, including swapping, since the box has so little
> RAM (only 28M). So IDE interrupts are frequent and the box is both
> CPU and I/O bound. I can still log in to it, type shell commands and
> so on, but starting emacs would be a bad idea...

Oh, if you're compiling then IDE is probably contributing to the
problem. However, I thought you said you lost time when idling as well?
 
> To test the "486 can't cope with HZ=1000" thesis I tried a RedHat
> 2.4.18-27.8 kernel which has a CONFIG_HZ option. Using 2.4.18-27.8
> with CONFIG_HZ=1000, the box still lost time during the "recompile
> 2.4.22-pre" test, but only about 15 seconds per hour instead of 2
> minutes per hour as it does with 2.6-test.

Ah, good call testing 2.4 w/ HZ=1000. Yeah, as for the difference between
2.4 and 2.6-test, I'm guessing something in do_timer_interrupt_hook()
has grown. Booting a 586+ system w/ "clock=pit" and instrumenting that
function w/ rdtsc calls would probably show what has slowed down. 

Regardless, as you've demonstrated, it seems 486s just can't keep up w/
HZ=1000. Maybe we need to look into some sort of processor specific HZ
config option?
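
A processor-specific HZ could look roughly like the sketch below. This
is only an illustration of the idea, not a proposed patch: it assumes
the CONFIG_M386 and CONFIG_M486 processor-family symbols would be used
to pick the tick rate, and the USEC_PER_TICK macro is made up for the
example.

```c
/* Sketch only: choose HZ from the processor family selected at
 * configuration time. Keying this off CONFIG_M386/CONFIG_M486 is an
 * assumption, not what include/asm-i386/param.h does today. */
#if defined(CONFIG_M386) || defined(CONFIG_M486)
# define HZ             100     /* slow CPUs: 10 ms per tick */
#else
# define HZ             1000    /* everything else: 1 ms per tick */
#endif

/* Hypothetical helper: length of one timer tick in microseconds. */
#define USEC_PER_TICK   (1000000 / HZ)
```

Built without either symbol defined, this gives HZ=1000 and a 1000
microsecond tick; building with -DCONFIG_M486 gives HZ=100 and a 10000
microsecond tick.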

thanks
-john



* Re: [BUG] 2.6.0-test2 loses time on 486
  2003-07-30 23:16 ` john stultz
@ 2003-07-31  6:06   ` Jan-Benedict Glaw
  0 siblings, 0 replies; 9+ messages in thread
From: Jan-Benedict Glaw @ 2003-07-31  6:06 UTC (permalink / raw)
  To: lkml

On Wed, 2003-07-30 16:16:59 -0700, john stultz <johnstul@us.ibm.com>
wrote in message <1059607019.14771.117.camel@w-jstultz2.beaverton.ibm.com>:
> On Wed, 2003-07-30 at 15:52, Mikael Pettersson wrote:
> > On 30 Jul 2003 13:08:44 -0700, john stultz <johnstul@us.ibm.com> wrote:

> Regardless, as you've demonstrated, it seems 486s just can't keep up w/
> HZ=1000. Maybe we need to look into some sort of processor specific HZ
> config option?

I'd like to see that. Eventually, I'll post a patch to do that, but
first, I need to make Debian support i386 again (since libstdc++5 in
unstable is now compiled for i486, some apps (apt-get is one of
those...) get SIGILLed to death). I do have some of those boxes (Am386
with SIMM RAM, i386SX-16 with SIPP modules + single ICs) :)

MfG, JBG

-- 
   Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481
   "Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg
    fuer einen Freien Staat voll Freier Bürger" | im Internet! |   im Irak!
      ret = do_actions((curr | FREE_SPEECH) & ~(IRAQ_WAR_2 | DRM | TCPA));

* Re: [BUG] 2.6.0-test2 loses time on 486
  2003-07-30 19:19 Mikael Pettersson
@ 2003-07-30 20:08 ` john stultz
  0 siblings, 0 replies; 9+ messages in thread
From: john stultz @ 2003-07-30 20:08 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: lkml

On Wed, 2003-07-30 at 12:19, Mikael Pettersson wrote:
> On 29 Jul 2003 11:59:06 -0700, john stultz wrote:
> >Hmm.  Sounds like you're losing interrupts. This can happen due to
> >poorly behaving drivers (disabling interrupts for too long), or odd
> >hardware. The change from HZ=100 to HZ=1000 probably made this more
> >visible on your box, so could you try setting HZ back to 100 and see if
> >that helps (you may still lose time, but at a much slower rate). 
> 
> Yep, reducing HZ to 100 in param.h eliminated the time losses.

Ok, that's what I figured. 

> >Also what drivers are you running with?
> 
> IDE, no chipset driver, NE2000 ISA NIC (no traffic during the
> tests), AT keyboard + PS/2 mouse (unused during the tests).
> 
> The only things I can think of are:
> - a 486 simply cannot keep up with HZ=1000
> - the plain IDE driver w/o chipset & DMA support somehow
>   is much worse in 2.5/2.6 than in 2.4
> - the "no TSC" time-keeping code is broken

Well, I suspect it's just the first. If you're not generating interrupts
then I'm doubtful the IDE driver is at fault (although I'd believe it if
you were losing time under load). Also the PIT based time source is
pretty simple and hasn't functionally changed much (well, it has been
moved around a bit). 

It may be the timer interrupt has grown in cost since the argument to
change HZ to 1000 was made. Although using the PIT there isn't much we
do from a time of day perspective. If I can find a second, I'll see if I
can compare interrupt overhead between 2.4 and 2.5. But I'd imagine the
box would barely be usable if we're wasting all our time handling timer
interrupts (is it usable??).

thanks
-john





* Re: [BUG] 2.6.0-test2 loses time on 486
@ 2003-07-30 19:19 Mikael Pettersson
  2003-07-30 20:08 ` john stultz
  0 siblings, 1 reply; 9+ messages in thread
From: Mikael Pettersson @ 2003-07-30 19:19 UTC (permalink / raw)
  To: johnstul; +Cc: linux-kernel

On 29 Jul 2003 11:59:06 -0700, john stultz wrote:
>On Tue, 2003-07-29 at 10:34, Mikael Pettersson wrote:
>> My old 486 test box is losing time at an alarming rate
>> when running 2.6.0-test kernels. It loses almost 2 minutes
>> per hour, less if it sits idle. This problem does not
>> occur when it's running a 2.4 kernel.
>> 
>> There's nothing noteworthy in dmesg.
>> 
>> This has been going on since at least the 2.5.7x kernels,
>> and possibly also the 2.5.6x kernels. I strongly suspect
>> a bug in the time-keeping changes in late 2.5 kernels.
>> The 486 has no TSC, and I don't have an NTP server to
>> keep my machines' times in sync.
>
>Hmm.  Sounds like you're losing interrupts. This can happen due to
>poorly behaving drivers (disabling interrupts for too long), or odd
>hardware. The change from HZ=100 to HZ=1000 probably made this more
>visible on your box, so could you try setting HZ back to 100 and see if
>that helps (you may still lose time, but at a much slower rate). 

Yep, reducing HZ to 100 in param.h eliminated the time losses.

>Also what drivers are you running with?

IDE, no chipset driver, NE2000 ISA NIC (no traffic during the
tests), AT keyboard + PS/2 mouse (unused during the tests).

The only things I can think of are:
- a 486 simply cannot keep up with HZ=1000
- the plain IDE driver w/o chipset & DMA support somehow
  is much worse in 2.5/2.6 than in 2.4
- the "no TSC" time-keeping code is broken

/Mikael

* Re: [BUG] 2.6.0-test2 loses time on 486
  2003-07-29 17:34 Mikael Pettersson
  2003-07-29 18:59 ` john stultz
  2003-07-29 19:15 ` Sean Estabrooks
@ 2003-07-29 21:47 ` Frank van Maarseveen
  2 siblings, 0 replies; 9+ messages in thread
From: Frank van Maarseveen @ 2003-07-29 21:47 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel

On Tue, Jul 29, 2003 at 07:34:43PM +0200, Mikael Pettersson wrote:
> My old 486 test box is losing time at an alarming rate
> when running 2.6.0-test kernels. It loses almost 2 minutes
> per hour, less if it sits idle. This problem does not
> occur when it's running a 2.4 kernel.

I recently saw a patch to compensate for the 1000/1024 ratio,
HZ being 1000 nowadays. I'm not sure if it is already in test2.
It wasn't in test1. There is a similar compensation for 100/128
in case HZ == 100. Search for HZ == 100 in kernel/timer.c in
second_overflow()

I'm not sure what this fix does, but 2 minutes per hour is close to
the 1000/1024 ratio.
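
As a rough check on that ratio, the sketch below works out how many
seconds per hour a clock would lose if it ran slow by a factor of
1000/1024. It is plain arithmetic, not code from kernel/timer.c, and
the function name is made up.

```c
/* Seconds lost per hour by a clock that advances only num/den of
 * real time (hypothetical helper, not taken from kernel/timer.c). */
double ratio_loss_sec_per_hour(double num, double den)
{
    return 3600.0 * (1.0 - num / den);
}
```

For num=1000 and den=1024 this comes out at roughly 84.4 seconds per
hour (24/1024 is about 2.3%), which is in the same range as the
reported loss of almost 2 minutes per hour but not an exact match.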

-- 
Frank

* Re: [BUG] 2.6.0-test2 loses time on 486
  2003-07-29 17:34 Mikael Pettersson
  2003-07-29 18:59 ` john stultz
@ 2003-07-29 19:15 ` Sean Estabrooks
  2003-07-29 21:47 ` Frank van Maarseveen
  2 siblings, 0 replies; 9+ messages in thread
From: Sean Estabrooks @ 2003-07-29 19:15 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel

> My old 486 test box is losing time at an alarming rate

Try changing this line in "include/asm-i386/param.h":

# define HZ             1000  

to

# define HZ             100

and see if the problem remains after recompiling.

Regards,
Sean


* Re: [BUG] 2.6.0-test2 loses time on 486
  2003-07-29 17:34 Mikael Pettersson
@ 2003-07-29 18:59 ` john stultz
  2003-07-29 19:15 ` Sean Estabrooks
  2003-07-29 21:47 ` Frank van Maarseveen
  2 siblings, 0 replies; 9+ messages in thread
From: john stultz @ 2003-07-29 18:59 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: lkml

On Tue, 2003-07-29 at 10:34, Mikael Pettersson wrote:
> My old 486 test box is losing time at an alarming rate
> when running 2.6.0-test kernels. It loses almost 2 minutes
> per hour, less if it sits idle. This problem does not
> occur when it's running a 2.4 kernel.
> 
> There's nothing noteworthy in dmesg.
> 
> This has been going on since at least the 2.5.7x kernels,
> and possibly also the 2.5.6x kernels. I strongly suspect
> a bug in the time-keeping changes in late 2.5 kernels.
> The 486 has no TSC, and I don't have an NTP server to
> keep my machines' times in sync.

Hmm.  Sounds like you're losing interrupts. This can happen due to
poorly behaving drivers (disabling interrupts for too long), or odd
hardware. The change from HZ=100 to HZ=1000 probably made this more
visible on your box, so could you try setting HZ back to 100 and see if
that helps (you may still lose time, but at a much slower rate). 
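
To illustrate why a higher HZ makes this more visible, here is a
simplified model in C. It assumes the interrupt controller can hold
only one pending timer interrupt, so every further tick that fires
while interrupts are disabled is lost; the function name and the 2500
microsecond window used below are made up for illustration.

```c
/* Simplified model, not kernel code: while interrupts are disabled,
 * one timer tick can stay pending; any further ticks are lost.
 * Returns a lower bound on the ticks lost during a window of
 * irq_off_usec microseconds with interrupts disabled. */
long min_lost_ticks(long irq_off_usec, long hz)
{
    long tick_usec = 1000000 / hz;                /* length of one tick */
    long ticks_fired = irq_off_usec / tick_usec;  /* full ticks in window */

    /* The first tick stays pending and is delivered later. */
    return ticks_fired > 1 ? ticks_fired - 1 : 0;
}
```

In this model a driver that keeps interrupts off for 2500 microseconds
loses nothing at HZ=100 (the window is shorter than one 10 ms tick) but
at least one tick at HZ=1000.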

Also what drivers are you running with?

thanks
-john



* [BUG] 2.6.0-test2 loses time on 486
@ 2003-07-29 17:34 Mikael Pettersson
  2003-07-29 18:59 ` john stultz
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Mikael Pettersson @ 2003-07-29 17:34 UTC (permalink / raw)
  To: linux-kernel

My old 486 test box is losing time at an alarming rate
when running 2.6.0-test kernels. It loses almost 2 minutes
per hour, less if it sits idle. This problem does not
occur when it's running a 2.4 kernel.

There's nothing noteworthy in dmesg.

This has been going on since at least the 2.5.7x kernels,
and possibly also the 2.5.6x kernels. I strongly suspect
a bug in the time-keeping changes in late 2.5 kernels.
The 486 has no TSC, and I don't have an NTP server to
keep my machines' times in sync.

/Mikael
