linux-kernel.vger.kernel.org archive mirror
* tsc clock issues with dual core and question about irq balancing
@ 2005-12-13  7:26 Adrian Yee
  2005-12-14  1:04 ` john stultz
  2005-12-14 23:47 ` Jeff Carr
  0 siblings, 2 replies; 9+ messages in thread
From: Adrian Yee @ 2005-12-13  7:26 UTC
  To: linux-kernel

Hi,

I've been having tsc issues where it counts back occasionally causing
things like ping to break with errors: "Warning: time of day goes back
(-1451987us), taking countermeasures."  It seems related to
http://bugzilla.kernel.org/show_bug.cgi?id=5105 , but that bug seems to
be closed (and more x86_64 related).  I also get other timing issues
like single clicks registering as double clicks, and at times double
clicks that don't register.  In addition, if I stress the system with
something like prime95, then after about 2 minutes the system clock will
speed up where the clock advances by minutes every second.  As suggested
in bug 5105, I switched to use the pmtimer (clock=pmtmr, my system
doesn't seem to support hpet) and it has fixed the ping and clock issue,
but my system doesn't 'feel' right.  For example, ssh'ing out of the
machine is fine, but when ssh'ing into the system a dmesg is very slow
(spurts out a few pages, then pauses for 10-20 seconds, then repeats).
Also, general desktop usage seems a little sluggish and not what an SMP
system should feel like.

I'm currently running an i386 (ie. not x86_64) 2.6.15-rc5 kernel w/ SMP,
APIC and ACPI enabled (AMD Cool & Quiet disabled), an Athlon 64 X2 3800+
and EVGA nForce4 SLI (NF41) motherboard.  I previously had the processor
running on an Abit AV8 (K8T800 Pro chipset) board and was having similar
issues, so it seems to be a dual core issue.  I'd just like to add that
I'm currently testing the system with "nosmp noapic acpi=off clock=tsc"
(it was losing interrupts and wouldn't boot properly with apic/acpi on)
and so far everything seems to work (this includes ssh and desktop usage
is better).

My other question is about irq balancing - I turned it on, but it
doesn't seem to be working properly:

           CPU0       CPU1       
  0:     109208        975    IO-APIC-edge  timer
  1:       1226         10    IO-APIC-edge  i8042
  8:     275272          1    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 12:       4133          4    IO-APIC-edge  i8042
 14:       5135          8    IO-APIC-edge  ide0
 15:         17          8    IO-APIC-edge  ide1
 16:      25084          1   IO-APIC-level  eth0
 17:      43597          1   IO-APIC-level  eth1
 18:        185          5   IO-APIC-level  libata
 19:          0          0   IO-APIC-level  libata
 20:      11525          1   IO-APIC-level  EMU10K1
 21:      24870          1   IO-APIC-level  nvidia
NMI:          0          0 
LOC:     110119     110118 
ERR:          0
MIS:          0

Are there certain conditions where irq balancing doesn't work properly? 
Thanks.

Adrian
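
The imbalance in the table above can be put into numbers with a short
script. The following is only a sketch in Python, assuming the
/proc/interrupts text layout shown in this message; cpu_shares() and
SAMPLE are names made up for illustration and are not part of any
kernel tool.

```python
# Sketch: per-CPU share of each interrupt line, computed from
# /proc/interrupts-style text. cpu_shares() and SAMPLE are illustrative
# names; SAMPLE holds a few rows copied from the table in this message.

SAMPLE = """\
           CPU0       CPU1
  0:     109208        975    IO-APIC-edge  timer
 16:      25084          1   IO-APIC-level  eth0
 17:      43597          1   IO-APIC-level  eth1
NMI:          0          0
LOC:     110119     110118
ERR:          0
"""


def cpu_shares(text):
    """Return {irq_label: [share_cpu0, share_cpu1, ...]} for per-CPU rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shares = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        counts = rest.split()[:ncpus]
        # Rows such as ERR/MIS carry one global counter; skip them.
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue
        counts = [int(c) for c in counts]
        total = sum(counts)
        if total == 0:
            continue  # nothing delivered yet, no share to compute
        shares[label.strip()] = [c / total for c in counts]
    return shares


if __name__ == "__main__":
    for irq, share in cpu_shares(SAMPLE).items():
        print(irq, " ".join(f"{s:6.1%}" for s in share))
```

With the rows above, every device interrupt lands almost entirely on
CPU0, while the local timer (LOC) is split evenly, which matches what
the raw counts suggest.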


* Re: tsc clock issues with dual core and question about irq balancing
  2005-12-13  7:26 tsc clock issues with dual core and question about irq balancing Adrian Yee
@ 2005-12-14  1:04 ` john stultz
  2005-12-14  9:07   ` Adrian Yee
  2005-12-14 23:47 ` Jeff Carr
  1 sibling, 1 reply; 9+ messages in thread
From: john stultz @ 2005-12-14  1:04 UTC
  To: Adrian Yee; +Cc: linux-kernel

On Mon, 2005-12-12 at 23:26 -0800, Adrian Yee wrote:
> I've been having tsc issues where it counts back occasionally causing
> things like ping to break with errors: "Warning: time of day goes back
> (-1451987us), taking countermeasures."  It seems related to
> http://bugzilla.kernel.org/show_bug.cgi?id=5105 , but that bug seems to
> be closed (and more x86_64 related).  I also get other timing issues
> like single clicks registering as double clicks, and at times double
> clicks that don't register.  In addition, if I stress the system with
> something like prime95, then after about 2 minutes the system clock will
> speed up where the clock advances by minutes every second.  As suggested
> in bug 5105, I switched to use the pmtimer (clock=pmtmr, my system
> doesn't seem to support hpet) and it has fixed the ping and clock issue,
> but my system doesn't 'feel' right.  For example, ssh'ing out of the
> machine is fine, but when ssh'ing into the system a dmesg is very slow
> (spurts out a few pages, then pauses for 10-20 seconds, then repeats).
> Also, general desktop usage seems a little sluggish and not what an SMP
> system should feel like.

I can't speak about the irq routing issue, but I'm interested in your
issues with the ACPI PM timer.

> I'm currently running an i386 (ie. not x86_64) 2.6.15-rc5 kernel w/ SMP,
> APIC and ACPI enabled (AMD Cool & Quiet disabled), an Athlon 64 X2 3800+
> and EVGA nForce4 SLI (NF41) motherboard.  I previously had the processor
> running on an Abit AV8 (K8T800 Pro chipset) board and was having similar
> issues, so it seems to be a dual core issue.  I'd just like to add that
> I'm currently testing the system with "nosmp noapic acpi=off clock=tsc"
> (it was losing interrupts and wouldn't boot properly with apic/acpi on)
> and so far everything seems to work (this includes ssh and desktop usage
> is better).

So keeping the above settings, does removing just the "clock=tsc" cause
the sluggishness to appear?

The TSC is *much* faster than the ACPI PM; however, it is just not usable
for reliable timekeeping on many SMP systems. That said, the ACPI PM
should not cause performance issues unless you are constantly calling
gettimeofday().

Also would you open a bugzilla bug on this and attach your .config and
dmesg?

thanks
-john
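
A rough way to see how much a time read costs under a given clock
setting is to time a large number of calls and compare the result
across boots. This is only a sketch in Python: avg_call_cost_ns() is a
made-up helper name, and the figure includes interpreter overhead, so
only the relative difference between two boots is meaningful.

```python
import time


def avg_call_cost_ns(clock=time.time, calls=200_000):
    """Average elapsed time per call to `clock`, in nanoseconds.

    The loop and call overhead of the interpreter are included, so the
    result is an upper bound rather than the cost of the clock read itself.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        clock()
    return (time.perf_counter_ns() - start) / calls


if __name__ == "__main__":
    print(f"{avg_call_cost_ns():.0f} ns per time.time() call")
```

Running it once per clock= setting on an otherwise idle machine should
show whether the slower timer is noticeable for a workload that reads
the time constantly.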




* Re: tsc clock issues with dual core and question about irq balancing
  2005-12-14  1:04 ` john stultz
@ 2005-12-14  9:07   ` Adrian Yee
  2005-12-14  9:54     ` Jonas Oreland
  2005-12-14 20:14     ` john stultz
  0 siblings, 2 replies; 9+ messages in thread
From: Adrian Yee @ 2005-12-14  9:07 UTC
  To: john stultz; +Cc: Adrian Yee, linux-kernel

Hi John,

>> I'm currently testing the system with "nosmp noapic acpi=off
>> clock=tsc" (it was losing interrupts and wouldn't boot properly
>> with apic/acpi on) and so far everything seems to work (this
>> includes ssh and desktop usage is better).
> 
> So keeping the above settings, does removing just the "clock=tsc"
> cause the sluggishness to appear?

I just tried booting with the pmtmr enabled and incoming ssh is bad
(I had an ls pause for over 20 seconds, while another connection was
somewhat fine).  I wish I had more concrete tests since the problems
I'm seeing are so subjective.  I guess I'll have to ignore this
problem until I get a better test.
 
> Also would you open a bugzilla bug on this and attach your .config
> and dmesg?

Done: http://bugzilla.kernel.org/show_bug.cgi?id=5740

Thanks.

Adrian


* Re: tsc clock issues with dual core and question about irq balancing
  2005-12-14  9:07   ` Adrian Yee
@ 2005-12-14  9:54     ` Jonas Oreland
  2005-12-14 20:14     ` john stultz
  1 sibling, 0 replies; 9+ messages in thread
From: Jonas Oreland @ 2005-12-14  9:54 UTC
  To: Adrian Yee; +Cc: john stultz, linux-kernel

Adrian Yee wrote:
> Hi John,
> 
> 
>>>I'm currently testing the system with "nosmp noapic acpi=off
>>>clock=tsc" (it was losing interrupts and wouldn't boot properly
>>>with apic/acpi on) and so far everything seems to work (this
>>>includes ssh and desktop usage is better).
>>
>>So keeping the above settings, does removing just the "clock=tsc"
>>cause the sluggishness to appear?
> 
> 
> I just tried booting with the pmtmr enabled and incoming ssh is bad
> (I had an ls pause for over 20 seconds, while another connection was
> somewhat fine).  I wish I had more concrete tests since the problems
> I'm seeing are so subjective.  I guess I'll have to ignore this
> problem until I get a better test.
>  
> 
>>Also would you open a bugzilla bug on this and attach your .config
>>and dmesg?
> 
> 
> Done: http://bugzilla.kernel.org/show_bug.cgi?id=5740
> 
> Thanks.
> 
> Adrian

Hi,

I don't know if this helps, but I also had problems with the tsc, and
the ACPI timer wasn't properly detected.

http://bugzilla.kernel.org/show_bug.cgi?id=5283 fixed the ACPI problem.

(idle=poll should fix it as well, I think.)

/Jonas


* Re: tsc clock issues with dual core and question about irq balancing
  2005-12-14  9:07   ` Adrian Yee
  2005-12-14  9:54     ` Jonas Oreland
@ 2005-12-14 20:14     ` john stultz
  2005-12-14 20:27       ` Adrian Yee
  1 sibling, 1 reply; 9+ messages in thread
From: john stultz @ 2005-12-14 20:14 UTC
  To: Adrian Yee; +Cc: linux-kernel

On Wed, 2005-12-14 at 01:07 -0800, Adrian Yee wrote:
> Hi John,
> 
> >> I'm currently testing the system with "nosmp noapic acpi=off
> >> clock=tsc" (it was losing interrupts and wouldn't boot properly
> >> with apic/acpi on) and so far everything seems to work (this
> >> includes ssh and desktop usage is better).
> > 
> > So keeping the above settings, does removing just the "clock=tsc"
> > cause the sluggishness to appear?
> 
> I just tried booting with the pmtmr enabled and incoming ssh is bad
> (I had an ls pause for over 20 seconds, while another connection was
> somewhat fine).  I wish I had more concrete tests since the problems
> I'm seeing are so subjective.  I guess I'll have to ignore this
> problem until I get a better test.

From your dmesg, you're still running w/ smp, apic, acpi as well. I was
curious if you could run just as you had before without issue using 
"nosmp noapic acpi=off clock=tsc", only drop the clock=tsc bit.

I just want to be sure we're only changing one variable at a time. :)


> > Also would you open a bugzilla bug on this and attach your .config
> > and dmesg?
> 
> Done: http://bugzilla.kernel.org/show_bug.cgi?id=5740

Thanks for filling that out! I'll see if I cannot reproduce anything
similar using your config.


thanks again,
-john



* Re: tsc clock issues with dual core and question about irq balancing
  2005-12-14 20:14     ` john stultz
@ 2005-12-14 20:27       ` Adrian Yee
  2005-12-14 20:57         ` john stultz
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian Yee @ 2005-12-14 20:27 UTC
  To: john stultz; +Cc: linux-kernel

Hi John,

> From your dmesg, you're still running w/ smp, apic, acpi as well.
> I was curious if you could run just as you had before without issue
> using "nosmp noapic acpi=off clock=tsc", only drop the clock=tsc bit.
> 
> I just want to be sure we're only changing one variable at a time.
> :)

I also have a dmesg with those options that I can upload, but I'm not
completely sure about the validity of the sluggishness "tests" because
the system felt the same after I booted with the different
configurations this time around.  ssh seems fine right now, so I guess
my Internet just happened to go bad at the same time I started playing with
my hardware and kernel configurations.

I think the only solid problem I've got here is the tsc occasionally
counting back.  Is switching to clock=pmtmr the permanent/proper
solution for this, or is there a bug in the kernel/hardware that should
be fixable?  Thanks.

Adrian
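
For a more concrete test of the clock counting back than waiting for
ping to complain, the wall clock can be sampled in a tight loop and
checked for negative steps. A minimal sketch in Python follows;
backward_steps() and sample_wall_clock_us() are hypothetical helper
names. Note that the wall clock can also step backwards for unrelated
reasons (for example, an administrator or time daemon setting it), so
a hit is a hint rather than proof of a TSC problem.

```python
import time


def backward_steps(samples_us):
    """Return (index, delta_us) for each sample earlier than its predecessor."""
    steps = []
    for i in range(1, len(samples_us)):
        delta = samples_us[i] - samples_us[i - 1]
        if delta < 0:
            steps.append((i, delta))
    return steps


def sample_wall_clock_us(count=100_000):
    """Read the wall clock `count` times, in whole microseconds."""
    return [time.time_ns() // 1000 for _ in range(count)]


if __name__ == "__main__":
    for index, delta in backward_steps(sample_wall_clock_us()):
        print(f"sample {index}: time went back by {-delta} us")
```

On a machine with consistent timekeeping the loop should print
nothing; a jump like the -1451987us reported by ping would show up as
one line per occurrence.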


* Re: tsc clock issues with dual core and question about irq balancing
  2005-12-14 20:27       ` Adrian Yee
@ 2005-12-14 20:57         ` john stultz
  0 siblings, 0 replies; 9+ messages in thread
From: john stultz @ 2005-12-14 20:57 UTC
  To: Adrian Yee; +Cc: linux-kernel

On Wed, 2005-12-14 at 12:27 -0800, Adrian Yee wrote:
> Hi John,
> 
> > From your dmesg, you're still running w/ smp, apic, acpi as well.
> > I was curious if you could run just as you had before without issue
> > using "nosmp noapic acpi=off clock=tsc", only drop the clock=tsc bit.
> > 
> > I just want to be sure we're only changing one variable at a time.
> > :)
> 
> I also have a dmesg with those options that I can upload, but I'm not
> completely sure about the validity of the sluggishness "tests" because
> the system felt the same after I booted with the different
> configurations this time around.  ssh seems fine right now, so I guess
> my Internet just happened to go bad at the same time I started playing with
> my hardware and kernel configurations.

Hmm. Please keep an eye on this. If there is something going funky
either in accessing the PM Timer hardware on your chipset, or some other
quirk (locking issues, timer starvation, etc) it would be good to
discover.

> I think the only solid problem I've got here is the tsc occasionally
> counting back.  Is switching to clock=pmtmr the permanent/proper
> solution for this, or is there a bug in the kernel/hardware that should
> be fixable?  Thanks.

If the ACPI PM timer is enabled it should be used by default (is that
not the case? if you do not use clock= at all, what clocksource gets
selected?). Unfortunately using the TSC on some SMP systems is just not
feasible.

thanks
-john
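
On kernels newer than the 2.6.15-rc5 discussed here, the selected
clocksource can be read from sysfs. The sketch below (Python) assumes
that later interface; as far as I know the path is not present on
2.6.15, where the boot messages in dmesg are the place to look
instead. read_clocksource() is a made-up helper name.

```python
from pathlib import Path

# Assumption: sysfs clocksource directory as found on later kernels.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksource names.

    Returns (None, []) when the sysfs files are missing or unreadable,
    e.g. on a kernel that predates this interface.
    """
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksource()
    print("current:", current)
    print("available:", " ".join(available))
```

Booting once without any clock= option and once with it, and comparing
the reported value, answers the question of what gets picked by
default.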




* Re: tsc clock issues with dual core and question about irq balancing
  2005-12-13  7:26 tsc clock issues with dual core and question about irq balancing Adrian Yee
  2005-12-14  1:04 ` john stultz
@ 2005-12-14 23:47 ` Jeff Carr
  2005-12-15  4:35   ` Adrian Yee
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff Carr @ 2005-12-14 23:47 UTC
  To: Adrian Yee; +Cc: linux-kernel

On 12/12/05 23:26, Adrian Yee wrote:

> My other question is about irq balancing - I turned it on, but it
> doesn't seem to be working properly:
> 
>            CPU0       CPU1       
>   0:     109208        975    IO-APIC-edge  timer
>   1:       1226         10    IO-APIC-edge  i8042
>   8:     275272          1    IO-APIC-edge  rtc
>   9:          0          0   IO-APIC-level  acpi
>  12:       4133          4    IO-APIC-edge  i8042
>  14:       5135          8    IO-APIC-edge  ide0
>  15:         17          8    IO-APIC-edge  ide1
>  16:      25084          1   IO-APIC-level  eth0
>  17:      43597          1   IO-APIC-level  eth1
>  18:        185          5   IO-APIC-level  libata
>  19:          0          0   IO-APIC-level  libata
>  20:      11525          1   IO-APIC-level  EMU10K1
>  21:      24870          1   IO-APIC-level  nvidia
> NMI:          0          0 
> LOC:     110119     110118 
> ERR:          0
> MIS:          0

I think there is an irqbalance userspace daemon.


* Re: tsc clock issues with dual core and question about irq balancing
  2005-12-14 23:47 ` Jeff Carr
@ 2005-12-15  4:35   ` Adrian Yee
  0 siblings, 0 replies; 9+ messages in thread
From: Adrian Yee @ 2005-12-15  4:35 UTC
  To: Jeff Carr; +Cc: linux-kernel

Hi Jeff,

>> My other question is about irq balancing - I turned it on, but it
>> doesn't seem to be working properly:
>>
>>            CPU0       CPU1
>>   0:     109208        975    IO-APIC-edge  timer
>>   1:       1226         10    IO-APIC-edge  i8042
>>   8:     275272          1    IO-APIC-edge  rtc
>>   9:          0          0   IO-APIC-level  acpi
>>  12:       4133          4    IO-APIC-edge  i8042
>>  14:       5135          8    IO-APIC-edge  ide0
>>  15:         17          8    IO-APIC-edge  ide1
>>  16:      25084          1   IO-APIC-level  eth0
>>  17:      43597          1   IO-APIC-level  eth1
>>  18:        185          5   IO-APIC-level  libata
>>  19:          0          0   IO-APIC-level  libata
>>  20:      11525          1   IO-APIC-level  EMU10K1
>>  21:      24870          1   IO-APIC-level  nvidia
>> NMI:          0          0
>> LOC:     110119     110118
>> ERR:          0
>> MIS:          0
>
> I think there is an irqbalance userspace daemon.

According to Debian's package description for the irqbalance package:

"Daemon to balance irq's across multiple CPUs on systems with the 2.4 or
2.6 kernel. This can lead to better performance and IO balance on SMP
systems. Useful mostly just for 2.4 kernels, or 2.6 kernels with
CONFIG_IRQBALANCE turned off."

I have the CONFIG_IRQBALANCE option turned on, so it should be balancing
the irq's itself; is it not?  Would there even be any benefit from
balancing the irq's with a single dual core processor?  Thanks.

Adrian

