linux-kernel.vger.kernel.org archive mirror
* RE: NForce2 pseudoscience stability testing (2.6.0-test11)
@ 2003-12-04 12:17 b
  2003-12-04 15:19 ` Craig Bradney
  2003-12-05 13:28 ` NForce2 pseudoscience stability testing (2.6.0-test11) Pat Erley
  0 siblings, 2 replies; 17+ messages in thread
From: b @ 2003-12-04 12:17 UTC (permalink / raw)
  To: dan; +Cc: linux-kernel

Dan Creswell wrote:

Thanks for the input, I'll pass it to the list.

Most of these Athlon victims are UP users; in fact, I
believe they are exclusively UP. Does MPS 1.1/1.4 ever play a
role in a UP system? I don't think the NForce2 chipset,
where we are seeing these hard hangs (no ping, no screen,
no blinking cursor, no toggling the caps lock, nothing), is
capable of SMP operation.

Now what's interesting is that you finger the IDE as a potential
culprit and think it's very low level. Interesting.

By the way, I've had trouble with SMP on a Tyan board with an
i840 chipset with Linux before - I was never able to resolve
the issue and had to return the board.

I've beaten on an Intel SR1300 and SR2300 dual Xeon (aka
Micron's Netframe 1610/2610 aka Sun 60x / 65x) and never run
into these hangs with kernels up to 2.4.22. The motherboard
is an Intel SE7501WV2 .


 >Hi,
 >Been following this thread silently for a while and thought I'd drop
 >you a line as I have some other data you may find useful.
 >
 >My machine is a dual Xeon with 2Gb, E1000 NIC, MPT LSI SCSI disks and
 >an IDE CDROM.
 >
 >2.6-test9 is only stable on this machine with noapic passed in the
 >kernel parameters - otherwise, it locks up in no time flat. I can also
 >run this kernel in single-processor mode with the APIC enabled and
 >that's stable.
 >
 >2.4.23-rc2 runs fine on the same machine with the APIC enabled
 >in SMP mode.
 >
 >2.4.23-rc5 locks up on this machine if I use the same .config as for
 >-rc2.  However, if I disable ACPI and pass "pci=noacpi" to the kernel,
 >this too runs fine.
 >    - Seems like the ACPI changes in -rc3 are a problem for my machine.
 >
 >All of these behaviors have been observed with MPS 1.4 (I've changed
 >that BIOS setting to 1.1 today in preparation for more testing of the
 >above to see if that makes a difference).
 >
 >I mention all of this because none of my lock-ups have happened whilst
 >accessing the IDE subsystem.  I *have* had lockups with simultaneous
 >network and disk access and I've also seen it with simply
 >mouse-waggling.  I suspect that the problem is *very* low-level and
 >likely related to the interrupt load.  In my case, the problems only
 >seem to occur with SMP configurations which makes me suspect there may
 >be a locking/simultaneous update problem.
 >
 >Oh, forgot to say, my motherboard is a Tyan Thunder S2665 (based on the
 >intel E7505 chipset).
 >
 >Hope that helps,
 >
 >Dan.



* RE: NForce2 pseudoscience stability testing (2.6.0-test11)
  2003-12-04 12:17 NForce2 pseudoscience stability testing (2.6.0-test11) b
@ 2003-12-04 15:19 ` Craig Bradney
  2003-12-04 16:32   ` Josh McKinney
  2003-12-05 13:28 ` NForce2 pseudoscience stability testing (2.6.0-test11) Pat Erley
  1 sibling, 1 reply; 17+ messages in thread
From: Craig Bradney @ 2003-12-04 15:19 UTC (permalink / raw)
  To: b; +Cc: dan, linux-kernel

As reported earlier today I had the first lockup this morning in over 5
days uptime. Having had that, I decided to go for the latest gentoo 2.6
test 11-r1 kernel. This means I was now running the following patches:
http://dev.gentoo.org/~brad_mssw/kernel_patches/2.6.0/genpatches-0.7/
instead of
http://dev.gentoo.org/~brad_mssw/kernel_patches/2.6.0/genpatches-0.6/
on top of vanilla test 11.

Anyway.. my two changes on rebuilding the kernel were initially:
-Add in preempt (because there were questions asked here) 
-Remove generic IDE support (I know I have Nvidia IDE so let's only have
that one).

In that configuration the "multiple hdparm -t /dev/hda" test hung my
system.

Rebuilt kernel without preempt.. no hang on hdparm test. 

So in summary, apart from the patch changes above, the only
difference from my 5-day kernel is that I now don't have generic IDE
support included.

So now, I'll leave it on and see how far it goes.. 13 mins so far :).

regards
Craig


On Thu, 2003-12-04 at 13:17, b@netzentry.com wrote:
> Dan Creswell wrote:
> 
> Thanks for the input, I'll pass it to the list.
> 
> Most of these Athlon victims are UP users; in fact, I
> believe they are exclusively UP. Does MPS 1.1/1.4 ever play a
> role in a UP system? I don't think the NForce2 chipset,
> where we are seeing these hard hangs (no ping, no screen,
> no blinking cursor, no toggling the caps lock, nothing), is
> capable of SMP operation.
> 
> Now what's interesting is that you finger the IDE as a potential
> culprit and think it's very low level. Interesting.
> 
> By the way, I've had trouble with SMP on a Tyan board with an
> i840 chipset with Linux before - I was never able to resolve
> the issue and had to return the board.
> 
> I've beaten on an Intel SR1300 and SR2300 dual Xeon (aka
> Micron's Netframe 1610/2610 aka Sun 60x / 65x) and never run
> into these hangs with kernels up to 2.4.22. The motherboard
> is an Intel SE7501WV2 .
> 
> 
>  >Hi,
>  >Been following this thread silently for a while and thought I'd drop
>  >you a line as I have some other data you may find useful.
>  >
>  >My machine is a dual Xeon with 2Gb, E1000 NIC, MPT LSI SCSI disks and
>  >an IDE CDROM.
>  >
>  >2.6-test9 is only stable on this machine with noapic passed in the
>  >kernel parameters - otherwise, it locks up in no time flat. I can also
>  >run this kernel in single-processor mode with the APIC enabled and
>  >that's stable.
>  >
>  >2.4.23-rc2 runs fine on the same machine with the APIC enabled
>  >in SMP mode.
>  >
>  >2.4.23-rc5 locks up on this machine if I use the same .config as for
>  >-rc2.  However, if I disable ACPI and pass "pci=noacpi" to the kernel,
>  >this too runs fine.
>  >    - Seems like the ACPI changes in -rc3 are a problem for my machine.
>  >
>  >All of these behaviors have been observed with MPS 1.4 (I've changed
>  >that BIOS setting to 1.1 today in preparation for more testing of the
>  >above to see if that makes a difference).
>  >
>  >I mention all of this because none of my lock-ups have happened whilst
>  >accessing the IDE subsystem.  I *have* had lockups with simultaneous
>  >network and disk access and I've also seen it with simply
>  >mouse-waggling.  I suspect that the problem is *very* low-level and
>  >likely related to the interrupt load.  In my case, the problems only
>  >seem to occur with SMP configurations which makes me suspect there may
>  >be a locking/simultaneous update problem.
>  >
>  >Oh, forgot to say, my motherboard is a Tyan Thunder S2665 (based on the
>  >intel E7505 chipset).
>  >
>  >Hope that helps,
>  >
>  >Dan.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 



* Re: NForce2 pseudoscience stability testing (2.6.0-test11)
  2003-12-04 15:19 ` Craig Bradney
@ 2003-12-04 16:32   ` Josh McKinney
  2003-12-04 17:08     ` Julien Oster
  0 siblings, 1 reply; 17+ messages in thread
From: Josh McKinney @ 2003-12-04 16:32 UTC (permalink / raw)
  To: linux-kernel

On approximately Thu, Dec 04, 2003 at 04:19:10PM +0100, Craig Bradney wrote:
> As reported earlier today I had the first lockup this morning in over 5
> days uptime. Having had that, I decided to go for the latest gentoo 2.6
> test 11-r1 kernel. This means I was now running the following patches:
> http://dev.gentoo.org/~brad_mssw/kernel_patches/2.6.0/genpatches-0.7/
> instead of
> http://dev.gentoo.org/~brad_mssw/kernel_patches/2.6.0/genpatches-0.6/
> on top of vanilla test 11.
> 
> Anyway.. my two changes on rebuilding the kernel were initially:
> -Add in preempt (because there were questions asked here) 
> -Remove generic IDE support (I know I have Nvidia IDE so let's only have
> that one).
> 
> In that configuration the "multiple hdparm -t /dev/hda" test hung my
> system.
> 
> Rebuilt kernel without preempt.. no hang on hdparm test. 
> 
> So in summary, apart from the patch changes above, the only
> difference from my 5-day kernel is that I now don't have generic IDE
> support included.
> 
> So now, I'll leave it on and see how far it goes.. 13 mins so far :).
> 
> regards
> Craig
> 
> 

Just to add more inconsistency into the mix, I am running with preempt
enabled, generic ide disabled, and can't make it crash.  Ran netperf for
an hour over a crossover cable on 100mbit, a couple make -j 30 kernel
compiles, dbench, and playing some mp3's all at the same time and
nothing happens despite load average reaching over 100.  Maybe I am just
lucky.


-- 
Josh McKinney		     |	Webmaster: http://joshandangie.org
--------------------------------------------------------------------------
                             | They that can give up essential liberty
Linux, the choice       -o)  | to obtain a little temporary safety deserve 
of the GNU generation    /\  | neither liberty or safety. 
                        _\_v |                          -Benjamin Franklin


* Re: NForce2 pseudoscience stability testing (2.6.0-test11)
  2003-12-04 16:32   ` Josh McKinney
@ 2003-12-04 17:08     ` Julien Oster
  2003-12-04 17:55       ` Josh McKinney
  0 siblings, 1 reply; 17+ messages in thread
From: Julien Oster @ 2003-12-04 17:08 UTC (permalink / raw)
  To: linux-kernel

Josh McKinney <forming@charter.net> writes:

Hello Josh,

> Just to add more inconsistency into the mix, I am running with preempt
> enabled, generic ide disabled, and can't make it crash.  Ran netperf for
> an hour over a crossover cable on 100mbit, a couple make -j 30 kernel
> compiles, dbench, and playing some mp3's all at the same time and
> nothing happens despite load average reaching over 100.  Maybe I am just
> lucky.

Or maybe not.

In the very beginning, 1 or 2 months ago right after I bought the
board, it did crash, but not very often. In fact,
most of the time (not every time, but most!) it crashed while the
system was rather idle. To add even more perplexity: I could
work on the system for hours, then leave the computer alone for half
an hour to take a walk or go jogging or whatever and, after coming
back, find a complete lockup. Normally, the clock applet on my
desktop told me that the box crashed several minutes after I went out,
since the clock of course froze along with the mainboard.

A lot has changed by now, hardware and software, and now I'm hardly able
to run the system with ACPI/APIC enabled at all. If the boot procedure
goes fine, it locks up shortly after. If fsck decides to check the
disks, the mainboard is doomed to lock up right away.

That really is a nasty problem.

Regards,
Julien


* Re: NForce2 pseudoscience stability testing (2.6.0-test11)
  2003-12-04 17:08     ` Julien Oster
@ 2003-12-04 17:55       ` Josh McKinney
  2003-12-04 20:02         ` NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ? cheuche+lkml
  0 siblings, 1 reply; 17+ messages in thread
From: Josh McKinney @ 2003-12-04 17:55 UTC (permalink / raw)
  To: linux-kernel

On approximately Thu, Dec 04, 2003 at 06:08:29PM +0100, Julien Oster wrote:
> Josh McKinney <forming@charter.net> writes:
> 
> Hello Josh,
> 
> > Just to add more inconsistency into the mix, I am running with preempt
> > enabled, generic ide disabled, and can't make it crash.  Ran netperf for
> > an hour over a crossover cable on 100mbit, a couple make -j 30 kernel
> > compiles, dbench, and playing some mp3's all at the same time and
> > nothing happens despite load average reaching over 100.  Maybe I am just
> > lucky.
> 
> Or maybe not.
> 
> In the very beginning, 1 or 2 months ago right after I bought the
> board, it did crash, but not very often. In fact,
> most of the time (not every time, but most!) it crashed while the
> system was rather idle. To add even more perplexity: I could
> work on the system for hours, then leave the computer alone for half
> an hour to take a walk or go jogging or whatever and, after coming
> back, find a complete lockup. Normally, the clock applet on my
> desktop told me that the box crashed several minutes after I went out,
> since the clock of course froze along with the mainboard.
> 
> A lot has changed by now, hardware and software, and now I'm hardly able
> to run the system with ACPI/APIC enabled at all. If the boot procedure
> goes fine, it locks up shortly after. If fsck decides to check the
> disks, the mainboard is doomed to lock up right away.
> 
> That really is a nasty problem.
> 

This issue seems to be funny like that.  When I first received this
mobo I too had crashes like the ones you say you are having now.  Doing
practically anything would make it crash; passing noapic and nolapic
on boot solved the problems.  Now, as I said, it is stable with ACPI,
APIC and LAPIC all enabled.  I haven't changed any hardware or anything
either, except that I am now using the nvidia ethernet with the
forcedeth driver.  I somehow don't think that has anything to do
with it, but maybe it is worth looking into.
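
For anyone comparing notes, it may help to confirm which of the
parameters mentioned in this thread (noapic, nolapic, pci=noacpi) a
running kernel was actually booted with. A minimal Python sketch that
checks a kernel command line such as the contents of /proc/cmdline; the
names apic_related_params and running_kernel_params, and the example
command line, are illustrative assumptions rather than anything posted
in the thread:

```python
# Parameters discussed in this thread that change APIC/ACPI behaviour.
WATCHED = ("noapic", "nolapic", "pci=noacpi")


def apic_related_params(cmdline):
    """Return the watched parameters present in a kernel command line."""
    tokens = cmdline.split()
    return [param for param in WATCHED if param in tokens]


def running_kernel_params(path="/proc/cmdline"):
    """Run the same check against the command line of the running kernel."""
    with open(path) as f:
        return apic_related_params(f.read())


# Hypothetical command line, for illustration only.
print(apic_related_params("root=/dev/hda1 ro noapic nolapic"))
```

Calling running_kernel_params() on the affected box gives the list to
quote alongside a /proc/interrupts dump.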

-- 
Josh McKinney		     |	Webmaster: http://joshandangie.org
--------------------------------------------------------------------------
                             | They that can give up essential liberty
Linux, the choice       -o)  | to obtain a little temporary safety deserve 
of the GNU generation    /\  | neither liberty or safety. 
                        _\_v |                          -Benjamin Franklin


* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-04 17:55       ` Josh McKinney
@ 2003-12-04 20:02         ` cheuche+lkml
  2003-12-04 20:48           ` Bob
                             ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: cheuche+lkml @ 2003-12-04 20:02 UTC (permalink / raw)
  To: linux-kernel

Hello,

Along with the lockups already described here, I've noticed an
unidentified source of interrupts on IRQ7. Several people posted their
/proc/interrupts but it only shows interrupts a driver registered and is
using. I noticed a non-constant stream of interrupts on IRQ7 using sar
and I am also able to see it under /proc/interrupts when loading
parport_pc with options to get the driver using the interrupt. However
it is not related to the parallel port because putting it on IRQ5 or
disabling it in the BIOS does not affect the stream of interrupts on
IRQ7. I wonder if people experiencing lockup problems also have these
noise interrupts, and I don't know if this has something to do with the
lockups or if it is an independent problem.

Of course, booting with noapic nolapic, the system is rock-solid, and
the interrupt counter on IRQ7 stays solidly at 1 (should it be 0 ?),
until I use something on the parallel port of course.

Motherboard : abit nfs7-s v2.0, nforce2 chipset
Kernels : 2.6.0-test9, 2.6.0-test10, 2.6.0-test10-mm1, 2.6.0-test11

/proc/interrupts with apic+lapic, shortly after boot, already 408
interrupts on IRQ7 :
           CPU0
  0:     121148          XT-PIC  timer
  1:        279    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  7:        408    IO-APIC-edge  parport0
  8:          4    IO-APIC-edge  rtc
  9:          0   IO-APIC-level  acpi
 14:       2591    IO-APIC-edge  ide0
 15:       2916    IO-APIC-edge  ide1
 16:         47   IO-APIC-level  eth0
 18:          0   IO-APIC-level  EMU10K1
 19:       4373   IO-APIC-level  mga@PCI:2:0:0
 20:         31   IO-APIC-level  ohci-hcd
 21:          0   IO-APIC-level  ehci_hcd, NVidia nForce2
 22:          0   IO-APIC-level  ohci-hcd
NMI:          0
LOC:     120982
ERR:          0
MIS:          0

/proc/interrupts with noapic and nolapic, with IRQ7 showing only 1
interrupt since boot : 
           CPU0
  0:     803521          XT-PIC  timer
  1:       1446          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  5:      54875          XT-PIC  mga@PCI:2:0:0
  7:          1          XT-PIC  parport0
  8:          4          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 10:      16027          XT-PIC  ohci_hcd,eth0, NVidia nForce2
 11:    1903732          XT-PIC  bttv0, ehci_hcd, EMU10K1
 12:       9416          XT-PIC  ohci_hcd
 14:       8513          XT-PIC  ide0
 15:       3910          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:          0
MIS:          0
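
Listings like the two above can be compared mechanically rather than
by eye. A minimal Python sketch, assuming the layout shown above (one
header line naming the CPUs, then "label: counts description" rows);
parse_interrupts and SAMPLE are illustrative names, and the sample rows
are copied from the APIC listing:

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: (total, description)}."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())           # header row: CPU0 [CPU1 ...]
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue                        # not a "label: ..." row
        fields = rest.split()
        counts = []
        # Take up to one leading numeric field per CPU as the counters.
        while fields and len(counts) < ncpu and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        table[label.strip()] = (sum(counts), " ".join(fields))
    return table


# Rows copied from the APIC listing above (trimmed).
SAMPLE = """\
           CPU0
  0:     121148          XT-PIC  timer
  7:        408    IO-APIC-edge  parport0
 19:       4373   IO-APIC-level  mga@PCI:2:0:0
ERR:          0
"""

table = parse_interrupts(SAMPLE)
print(table["7"])
```

Feeding it the contents of /proc/interrupts at two different times and
subtracting the totals for "7" shows whether the counter is still
moving while the parallel port is unused.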



Mathieu


* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-04 20:02         ` NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ? cheuche+lkml
@ 2003-12-04 20:48           ` Bob
  2003-12-04 23:05           ` Jesse Allen
  2003-12-05  8:16           ` cheuche+lkml
  2 siblings, 0 replies; 17+ messages in thread
From: Bob @ 2003-12-04 20:48 UTC (permalink / raw)
  To: linux-kernel

cheuche+lkml@free.fr wrote:

>Hello,
>
>Along with the lockups already described here, I've noticed an
>unidentified source of interrupts on IRQ7.
>
Have you seen "IRQ7 Disabled" errors?

On my nforce2 mboard(MSI K7N2 MCP2-T) I would
see that err if onboard ethernet was enabled. I haven't gotten
eth to work and I have to disable onboard ethernet. I
had to disable usb for a while, now it just doesn't work
but doesn't cause a problem so I disable it anyway. I
am using a tulip card instead of onboard ethernet. I
disable the parallel port.

This is /proc/interrupts with apic, lapic, onboard eth
disabled, parallel disabled, bios flash, stable with
kernel 2.6.0-test* test11   no irq 7!

 cat /proc/interrupts
           CPU0      
  0:   47830174          XT-PIC  timer
  1:      24026    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  4:        228    IO-APIC-edge  serial
  8:          1    IO-APIC-edge  rtc
  9:          0   IO-APIC-level  acpi
 12:     125368    IO-APIC-edge  i8042
 14:         22    IO-APIC-edge  ide0
 15:         24    IO-APIC-edge  ide1
 16:     592751   IO-APIC-level  3ware Storage Controller, yenta, yenta
 17:    1309695   IO-APIC-level  eth0
 21:          0   IO-APIC-level  NVidia nForce2
NMI:          0
LOC:   47829780
ERR:          0
MIS:          4

-Bob

> Several people posted their
>/proc/interrupts but it only shows interrupts a driver registered and is
>using. I noticed a non-constant stream of interrupts on IRQ7 using sar
>and I am also able to see it under /proc/interrupts when loading
>parport_pc with options to get the driver using the interrupt. However
>it is not related to the parallel port because putting it on IRQ5 or
>disabling it in the BIOS does not affect the stream of interrupts on
>IRQ7. I wonder if people experiencing lockup problems also have these
>noise interrupts, and I don't know if this has something to do with the
>lockups or if it is an independent problem.
>
>Of course, booting with noapic nolapic, the system is rock-solid, and
>the interrupt counter on IRQ7 stays solidly at 1 (should it be 0 ?),
>until I use something on the parallel port of course.
>
>Motherboard : abit nfs7-s v2.0, nforce2 chipset
>Kernels : 2.6.0-test9, 2.6.0-test10, 2.6.0-test10-mm1, 2.6.0-test11
>
>/proc/interrupts with apic+lapic, shortly after boot, already 408
>interrupts on IRQ7 :
>           CPU0
>  0:     121148          XT-PIC  timer
>  1:        279    IO-APIC-edge  i8042
>  2:          0          XT-PIC  cascade
>  7:        408    IO-APIC-edge  parport0
>  8:          4    IO-APIC-edge  rtc
>  9:          0   IO-APIC-level  acpi
> 14:       2591    IO-APIC-edge  ide0
> 15:       2916    IO-APIC-edge  ide1
> 16:         47   IO-APIC-level  eth0
> 18:          0   IO-APIC-level  EMU10K1
> 19:       4373   IO-APIC-level  mga@PCI:2:0:0
> 20:         31   IO-APIC-level  ohci-hcd
> 21:          0   IO-APIC-level  ehci_hcd, NVidia nForce2
> 22:          0   IO-APIC-level  ohci-hcd
>NMI:          0
>LOC:     120982
>ERR:          0
>MIS:          0
>
>/proc/interrupts with noapic and nolapic, with IRQ7 showing only 1
>interrupt since boot : 
>           CPU0
>  0:     803521          XT-PIC  timer
>  1:       1446          XT-PIC  i8042
>  2:          0          XT-PIC  cascade
>  5:      54875          XT-PIC  mga@PCI:2:0:0
>  7:          1          XT-PIC  parport0
>  8:          4          XT-PIC  rtc
>  9:          0          XT-PIC  acpi
> 10:      16027          XT-PIC  ohci_hcd,eth0, NVidia nForce2
> 11:    1903732          XT-PIC  bttv0, ehci_hcd, EMU10K1
> 12:       9416          XT-PIC  ohci_hcd
> 14:       8513          XT-PIC  ide0
> 15:       3910          XT-PIC  ide1
>NMI:          0
>LOC:          0
>ERR:          0
>MIS:          0
>
>
>
>Mathieu




* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-04 20:02         ` NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ? cheuche+lkml
  2003-12-04 20:48           ` Bob
@ 2003-12-04 23:05           ` Jesse Allen
  2003-12-04 23:14             ` Prakash K. Cheemplavam
  2003-12-05  8:16           ` cheuche+lkml
  2 siblings, 1 reply; 17+ messages in thread
From: Jesse Allen @ 2003-12-04 23:05 UTC (permalink / raw)
  To: cheuche+lkml; +Cc: linux-kernel

On Thu, Dec 04, 2003 at 09:02:08PM +0100, cheuche+lkml@free.fr wrote:
> Hello,
> 
> Along with the lockups already described here, I've noticed an
> unidentified source of interrupts on IRQ7.
...
> I wonder if people experiencing lockup problems also have these
> noise interrupts,

I just took a look at this, by setting up parport_pc, and yes I get noise.

This was my first sample with a kernel with APIC:
  7:      29230    IO-APIC-edge  parport0

Then I took a look again about 5 seconds later:
  7:      41560    IO-APIC-edge  parport0

And I looked again, and it was higher.  If you take a look repeatedly, you see 
it increases for 2-3 seconds, then stops for 2-3, then starts increasing again 
and continues like this.  This is pretty much an idle system other than me
cat'ing.  I'm not using the parallel port at all.

Then I looked at the irq with parport_pc setup and with a kernel with APIC all 
disabled:
  7:          0          XT-PIC  parport0

And it is the same on repeated cat's.

These kernels are exactly the same, except one is compiled with UP APIC and the 
other isn't.  I don't know how parport works, but seeing two different behaviors
under this condition does seem suspicious.
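
To put a rough number on the two APIC samples above, here is the
arithmetic as a short Python sketch. It assumes the "about 5 seconds"
between the two reads was exactly 5 seconds, and irq_rate is an
illustrative helper name:

```python
def irq_rate(first, second, seconds):
    """Average interrupts per second between two counter readings."""
    return (second - first) / seconds


# Counter values from the two IRQ 7 samples quoted above.
rate = irq_rate(29230, 41560, 5)
print(rate)  # 2466.0
```

That is an average over the whole interval; since the counter
reportedly climbs in 2-3 second bursts with pauses in between, the
rate inside a burst would be higher than this figure.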

> and I don't know if this has something to do with the
> lockups or if it is an independent problem.
> 

I have no idea, but it is suspicious, as I get lockups and this noise with the 
APIC enabled kernel.

Jesse


* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-04 23:05           ` Jesse Allen
@ 2003-12-04 23:14             ` Prakash K. Cheemplavam
  2003-12-04 23:21               ` Craig Bradney
  0 siblings, 1 reply; 17+ messages in thread
From: Prakash K. Cheemplavam @ 2003-12-04 23:14 UTC (permalink / raw)
  To: Jesse Allen; +Cc: cheuche+lkml, linux-kernel

Jesse Allen wrote:
> On Thu, Dec 04, 2003 at 09:02:08PM +0100, cheuche+lkml@free.fr wrote:
> 
>>Hello,
>>
>>Along with the lockups already described here, I've noticed an
>>unidentified source of interrupts on IRQ7.
> 
> ...
> 
>>I wonder if people experiencing lockup problems also have these
>>noise interrupts,
> 
> 
> I just took a look at this, by setting up parport_pc, and yes I get noise.
> 
> This was my first sample with a kernel with APIC:
>   7:      29230    IO-APIC-edge  parport0

I just did an experiment with a very light kernel, nearly nothing 
compiled inside, except apic acpi, preempt and needed stuff plus 
scsi+libata and no ide. IRQ 7 was not present and every device had its 
own irq. Nevertheless system locked up at second hdparm run...

Prakash



* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-04 23:14             ` Prakash K. Cheemplavam
@ 2003-12-04 23:21               ` Craig Bradney
  2003-12-04 23:36                 ` Prakash K. Cheemplavam
  2003-12-05  5:47                 ` Bob
  0 siblings, 2 replies; 17+ messages in thread
From: Craig Bradney @ 2003-12-04 23:21 UTC (permalink / raw)
  To: Prakash K. Cheemplavam; +Cc: Jesse Allen, cheuche+lkml, linux-kernel

Prakash,

try it without preempt.. just to see. As soon as I removed it today the
crashes went away (for 5 hours).. PC is now up for 2.5 hours and I'm
waiting to see if it will be 5 hrs or 5 days this time around :)

Craig

On Fri, 2003-12-05 at 00:14, Prakash K. Cheemplavam wrote:
> Jesse Allen wrote:
> > On Thu, Dec 04, 2003 at 09:02:08PM +0100, cheuche+lkml@free.fr wrote:
> > 
> >>Hello,
> >>
> >>Along with the lockups already described here, I've noticed an
> >>unidentified source of interrupts on IRQ7.
> > 
> > ...
> > 
> >>I wonder if people experiencing lockup problems also have these
> >>noise interrupts,
> > 
> > 
> > I just took a look at this, by setting up parport_pc, and yes I get noise.
> > 
> > This was my first sample with a kernel with APIC:
> >   7:      29230    IO-APIC-edge  parport0
> 
> I just did an experiment with a very light kernel, nearly nothing 
> compiled inside, except apic acpi, preempt and needed stuff plus 
> scsi+libata and no ide. IRQ 7 was not present and every device had its 
> own irq. Nevertheless system locked up at second hdparm run...
> 
> Prakash
> 



* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-04 23:21               ` Craig Bradney
@ 2003-12-04 23:36                 ` Prakash K. Cheemplavam
  2003-12-05  5:47                 ` Bob
  1 sibling, 0 replies; 17+ messages in thread
From: Prakash K. Cheemplavam @ 2003-12-04 23:36 UTC (permalink / raw)
  To: Craig Bradney
  Cc: Prakash K. Cheemplavam, Jesse Allen, cheuche+lkml, linux-kernel

Craig Bradney wrote:
> Prakash,
> 
> try it without preempt.. just to see. As soon as I removed it today the
> crashes went away (for 5 hours).. PC is now up for 2.5 hours and I'm
> waiting to see if it will be 5 hrs or 5 days this time around :)

Oh OK, I made a mistake: checking my kernel config again, I noticed my 
last experiment was in fact with preempt OFF, so turning it off didn't help.

Prakash


> On Fri, 2003-12-05 at 00:14, Prakash K. Cheemplavam wrote:
> 
>>Jesse Allen wrote:
>>
>>>On Thu, Dec 04, 2003 at 09:02:08PM +0100, cheuche+lkml@free.fr wrote:
>>>
>>>
>>>>Hello,
>>>>
>>>>Along with the lockups already described here, I've noticed an
>>>>unidentified source of interrupts on IRQ7.
>>>
>>>...
>>>
>>>
>>>>I wonder if people experiencing lockup problems also have these
>>>>noise interrupts,
>>>
>>>
>>>I just took a look at this, by setting up parport_pc, and yes I get noise.
>>>
>>>This was my first sample with a kernel with APIC:
>>>  7:      29230    IO-APIC-edge  parport0
>>
>>I just did an experiment with a very light kernel, nearly nothing 
>>compiled inside, except apic acpi, preempt and needed stuff plus 
>>scsi+libata and no ide. IRQ 7 was not present and every device had its 
>>own irq. Nevertheless system locked up at second hdparm run...
>>
>>Prakash




* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-04 23:21               ` Craig Bradney
  2003-12-04 23:36                 ` Prakash K. Cheemplavam
@ 2003-12-05  5:47                 ` Bob
  2003-12-05  7:01                   ` Craig Bradney
  2003-12-05 12:33                   ` Prakash K. Cheemplavam
  1 sibling, 2 replies; 17+ messages in thread
From: Bob @ 2003-12-05  5:47 UTC (permalink / raw)
  To: linux-kernel

Do you have onboard ethernet enabled with nforce2
mboard? I am fine with pre-emptive kernel but have
to disable onboard ethernet in cmos setup or I see
"Disabling IRQ7" and problems develop.

-Bob

Craig Bradney wrote:

>Prakash,
>
>try it without preempt.. just to see. As soon as I removed it today the
>crashes went away (for 5 hours).. PC is now up for 2.5 hours and I'm
>waiting to see if it will be 5 hrs or 5 days this time around :)
>
>Craig
>
>On Fri, 2003-12-05 at 00:14, Prakash K. Cheemplavam wrote:
>  
>
>>Jesse Allen wrote:
>>    
>>
>>>On Thu, Dec 04, 2003 at 09:02:08PM +0100, cheuche+lkml@free.fr wrote:
>>>
>>>      
>>>
>>>>Hello,
>>>>
>>>>Along with the lockups already described here, I've noticed an
>>>>unidentified source of interrupts on IRQ7.
>>>>        
>>>>
>>>...
>>>
>>>      
>>>
>>>>I wonder if people experiencing lockup problems also have these
>>>>noise interrupts,
>>>>        
>>>>
>>>I just took a look at this, by setting up parport_pc, and yes I get noise.
>>>
>>>This was my first sample with a kernel with APIC:
>>>  7:      29230    IO-APIC-edge  parport0
>>>      
>>>
>>I just did an experiment with a very light kernel, nearly nothing 
>>compiled inside, except apic acpi, preempt and needed stuff plus 
>>scsi+libata and no ide. IRQ 7 was not present and every device had its 
>>own irq. Nevertheless system locked up at second hdparm run...
>>
>>Prakash
>>




* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-05  5:47                 ` Bob
@ 2003-12-05  7:01                   ` Craig Bradney
  2003-12-05 12:33                   ` Prakash K. Cheemplavam
  1 sibling, 0 replies; 17+ messages in thread
From: Craig Bradney @ 2003-12-05  7:01 UTC (permalink / raw)
  To: Bob; +Cc: linux-kernel

nforce net is off. 3com is on (in the BIOS).

Craig

On Fri, 2003-12-05 at 06:47, Bob wrote:
> Do you have onboard ethernet enabled with nforce2
> mboard? I am fine with pre-emptive kernel but have
> to disable onboard ethernet in cmos setup or I see
> "Disabling IRQ7" and problems develop.
> 
> -Bob
> 
> Craig Bradney wrote:
> 
> >Prakash,
> >
> >try it without preempt.. just to see. As soon as I removed it today the
> >crashes went away (for 5 hours).. PC is now up for 2.5 hours and I'm
> >waiting to see if it will be 5 hrs or 5 days this time around :)
> >
> >Craig
> >
> >On Fri, 2003-12-05 at 00:14, Prakash K. Cheemplavam wrote:
> >  
> >
> >>Jesse Allen wrote:
> >>    
> >>
> >>>On Thu, Dec 04, 2003 at 09:02:08PM +0100, cheuche+lkml@free.fr wrote:
> >>>
> >>>      
> >>>
> >>>>Hello,
> >>>>
> >>>>Along with the lockups already described here, I've noticed an
> >>>>unidentified source of interrupts on IRQ7.
> >>>>        
> >>>>
> >>>...
> >>>
> >>>      
> >>>
> >>>>I wonder if people experiencing lockup problems also have these
> >>>>noise interrupts,
> >>>>        
> >>>>
> >>>I just took a look at this, by setting up parport_pc, and yes I get noise.
> >>>
> >>>This was my first sample with a kernel with APIC:
> >>>  7:      29230    IO-APIC-edge  parport0
> >>>      
> >>>
> >>I just did an experiment with a very light kernel, nearly nothing 
> >>compiled inside, except apic acpi, preempt and needed stuff plus 
> >>scsi+libata and no ide. IRQ 7 was not present and every device had its 
> >>own irq. Nevertheless system locked up at second hdparm run...
> >>
> >>Prakash
> >>
> >>
> >>    
> >>
> >
> >
> >  
> >
> 
> 
> 



* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-04 20:02         ` NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ? cheuche+lkml
  2003-12-04 20:48           ` Bob
  2003-12-04 23:05           ` Jesse Allen
@ 2003-12-05  8:16           ` cheuche+lkml
  2 siblings, 0 replies; 17+ messages in thread
From: cheuche+lkml @ 2003-12-05  8:16 UTC (permalink / raw)
  To: linux-kernel

I've just seen something strange about IRQ7. During one test, the
interrupt counter on IRQ7 was in sync with the timer counter, and the
difference was about 21400. I rebooted to see what happens 21.4
seconds after boot, and that is more or less when some modules get
loaded by hardware auto-detection. Unfortunately I now cannot
reproduce it; I only get bursts of IRQ7 as initially reported.
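
One rough way to check whether the IRQ7 count tracks the timer is to compare two snapshots of /proc/interrupts. The sketch below is only an illustration, not something posted in this thread: the helper names (`parse_counts`, `delta`) and both snapshots are invented, and it assumes a single-CPU layout (label, one count column, controller, device names).

```python
def parse_counts(text):
    """Map each IRQ label (e.g. '0', '7', 'NMI') to its first-CPU count."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip the 'CPU0' header and any line without a numeric count.
        if not sep or not fields or not fields[0].isdigit():
            continue
        counts[label.strip()] = int(fields[0])
    return counts


def delta(before, after):
    """Per-IRQ increase between two snapshots (IRQs present in both)."""
    return {irq: after[irq] - before[irq] for irq in after if irq in before}


# Hypothetical snapshots, invented for illustration only.
SNAP_A = """\
           CPU0
  0:      30000          XT-PIC  timer
  7:       8600    IO-APIC-edge  parport0
"""
SNAP_B = """\
           CPU0
  0:      40000          XT-PIC  timer
  7:      18600    IO-APIC-edge  parport0
"""

a, b = parse_counts(SNAP_A), parse_counts(SNAP_B)
d = delta(a, b)
# Equal deltas with a constant offset would match the "in sync" observation.
print("timer delta:", d["0"], "IRQ7 delta:", d["7"])
print("offset:", b["0"] - b["7"])
```

On a live system the two snapshots would come from reading /proc/interrupts twice, a few seconds apart. Assuming the 1000 Hz tick that 2.6 uses on x86, an offset of 21400 timer interrupts corresponds to the 21.4 seconds mentioned above.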

I also noted this part about the timer interrupt in the dmesg of 2.6.0-test:

..TIMER: vector=0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...  failed.
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ... works.

This is interesting because 2.4.22 and 2.4.23-pre9 show:

..TIMER: vector=0x31 pin1=2 pin2=0
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found pin 0) ...works.

and then the timer is not in XT-PIC mode but in IO-APIC-edge mode. I
also noticed there is no flood of IRQ7 with these 2.4 kernels. Whether
the timer mode is related to the IRQ flood or the lockups, I don't know.

By the way, 2.4.23 shows the same thing as 2.6.0-test: timer in XT-PIC
mode and some IRQ7, but far fewer than 2.6.0-test.
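
The quoted boot logs differ in the `pin2` value on the `..TIMER:` line. As a small sketch (the function name `timer_pins` is invented here, and the regex assumes exactly the line format quoted above), the values can be pulled out of saved dmesg text for a side-by-side comparison:

```python
import re

# Matches lines such as "..TIMER: vector=0x31 pin1=2 pin2=-1".
TIMER_RE = re.compile(
    r"\.\.TIMER: vector=(0x[0-9a-fA-F]+) pin1=(-?\d+) pin2=(-?\d+)"
)


def timer_pins(dmesg_text):
    """Return (vector, pin1, pin2) from the first ..TIMER: line, or None."""
    m = TIMER_RE.search(dmesg_text)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2)), int(m.group(3))


# The two lines quoted in this message.
LOG_26 = "..TIMER: vector=0x31 pin1=2 pin2=-1\n"
LOG_24 = "..TIMER: vector=0x31 pin1=2 pin2=0\n"

for name, log in (("2.6.0-test", LOG_26), ("2.4.22", LOG_24)):
    vector, pin1, pin2 = timer_pins(log)
    print(f"{name}: vector={vector:#x} pin1={pin1} pin2={pin2}")
```

This only extracts the numbers; it makes no claim about why the two kernels report different pins.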

Mathieu


* Re: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
  2003-12-05  5:47                 ` Bob
  2003-12-05  7:01                   ` Craig Bradney
@ 2003-12-05 12:33                   ` Prakash K. Cheemplavam
  1 sibling, 0 replies; 17+ messages in thread
From: Prakash K. Cheemplavam @ 2003-12-05 12:33 UTC (permalink / raw)
  To: Bob; +Cc: linux-kernel

Bob wrote:
> Do you have onboard ethernet enabled with nforce2
> mboard? I am fine with pre-emptive kernel but have
> to disable onboard ethernet in cmos setup or I see
> "Disabling IRQ7" and problems develop.


No, it is enabled in the BIOS, but in my test run I just didn't compile the 
forcedeth driver, so IRQ 7 didn't show up. Even with my "normal" 
kernel the eth0 interface is mapped to IRQ 10:

            CPU0
   0:    1634197          XT-PIC  timer
   1:       2653          XT-PIC  i8042
   2:          0          XT-PIC  cascade
   5:       1620          XT-PIC  Skystar2, ohci_hcd, NVidia nForce2
   8:          3          XT-PIC  rtc
   9:          0          XT-PIC  acpi
  10:       9412          XT-PIC  eth0
  11:     171083          XT-PIC  libata, ohci_hcd, nvidia
  12:     110080          XT-PIC  i8042
  14:         10          XT-PIC  ide0
  15:         15          XT-PIC  ide1
NMI:          0
ERR:          2
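
For reference, a table like the one above can be checked mechanically for shared interrupt lines. This is a hedged sketch: `shared_irqs` is a made-up helper, the sample is an excerpt of the table, and it assumes the single-CPU layout shown above with device names separated by commas.

```python
def shared_irqs(text):
    """Return {irq: [devices]} for numeric IRQs with more than one device."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # header line, NMI, ERR
        fields = rest.split(None, 2)  # count, controller, device list
        if len(fields) < 3:
            continue
        devices = [d.strip() for d in fields[2].split(",")]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


# Excerpt of the table posted above.
SAMPLE = """\
           CPU0
  0:    1634197          XT-PIC  timer
  5:       1620          XT-PIC  Skystar2, ohci_hcd, NVidia nForce2
 10:       9412          XT-PIC  eth0
 11:     171083          XT-PIC  libata, ohci_hcd, nvidia
NMI:          0
ERR:          2
"""

for irq, devs in shared_irqs(SAMPLE).items():
    print(irq, devs)
```

In this excerpt only IRQ 5 and IRQ 11 carry more than one device; eth0 has IRQ 10 to itself.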



* Re: NForce2 pseudoscience stability testing (2.6.0-test11)
  2003-12-04 12:17 NForce2 pseudoscience stability testing (2.6.0-test11) b
  2003-12-04 15:19 ` Craig Bradney
@ 2003-12-05 13:28 ` Pat Erley
  1 sibling, 0 replies; 17+ messages in thread
From: Pat Erley @ 2003-12-05 13:28 UTC (permalink / raw)
  To: linux-kernel

I'm going to add my AMD/Nvidia IDE experiences as well as my current nforce2 experience.

Firstly, aside from the forcedeth module vs. the hacked nvnet, I have never even known of problems with nforce2 systems.  I have a Shuttle mn31/n (micro ATX) and I can use firewire, an IDE hd running udma5, and an IDE cd running udma2, and the only thing I can do to crash/hang the system is to force unload a module.  It's running apic, lapic, and acpi quite happily, with no preempt, and it has run every test release since around 2.5.75.

Note, though, that I have to run my FSB underclocked by 1 MHz.

My cpu claims to be an xp2400 (133/266 fsb) but I run it at 132/264.  It was hanging/rebooting due to heat.

My other system (a little off topic here) is a dual Athlon MP Tyan Thunder K7 system.  It will NOT run with apic and the amd ide driver.

pat erley


* RE: NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ?
@ 2003-12-05  5:56 b
  0 siblings, 0 replies; 17+ messages in thread
From: b @ 2003-12-05  5:56 UTC (permalink / raw)
  To: recbo, linux-kernel

I have almost everything disabled and the error occurs.

USB1 OFF
USB2 OFF
FireWire OFF
SATA - Jumpered OFF
AUDIO OFF
NVIDIA LAN OFF.

lspci
00:00.0 Host bridge: nVidia Corporation nForce2 AGP (different version?) (rev c1)
00:00.1 RAM memory: nVidia Corporation nForce2 Memory Controller 1 (rev c1)
00:00.2 RAM memory: nVidia Corporation nForce2 Memory Controller 4 (rev c1)
00:00.3 RAM memory: nVidia Corporation nForce2 Memory Controller 3 (rev c1)
00:00.4 RAM memory: nVidia Corporation nForce2 Memory Controller 2 (rev c1)
00:00.5 RAM memory: nVidia Corporation nForce2 Memory Controller 5 (rev c1)
00:01.0 ISA bridge: nVidia Corporation nForce2 ISA Bridge (rev a4)
00:01.1 SMBus: nVidia Corporation nForce2 SMBus (MCP) (rev a2)
00:08.0 PCI bridge: nVidia Corporation nForce2 External PCI Bridge (rev a3)
00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2)
00:0c.0 PCI bridge: nVidia Corporation nForce2 PCI Bridge (rev a3)
00:1e.0 PCI bridge: nVidia Corporation nForce2 AGP (rev c1)
01:07.0 Ethernet controller: Digital Equipment Corporation Farallon PN9000SX (rev 01)
01:08.0 Ethernet controller: Digital Equipment Corporation Farallon PN9000SX (rev 01)
02:01.0 Ethernet controller: 3Com Corporation 3C920B-EMB Integrated Fast Ethernet Controller (rev 40)
03:00.0 VGA compatible controller: nVidia Corporation NV18 [GeForce4 MX 440 AGP 8x] (rev a2)

That's it. And this thing locks up any time APIC is enabled. It's just a
matter of time.
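
As a cross-check that the devices switched off in the BIOS really are gone from the bus, the lspci listing can be scanned by device class. This is a sketch under assumptions: the helper names are invented, the sample is an excerpt of the listing with mail line wraps removed, and the substrings in `DISABLED_HINTS` are guesses at how lspci labels USB, FireWire and audio devices.

```python
# Assumed substrings of lspci class names for the disabled devices.
DISABLED_HINTS = ("USB", "FireWire", "audio")


def parse_lspci(text):
    """Split 'slot class: description' lines into (slot, class, description)."""
    devices = []
    for line in text.splitlines():
        slot, _, rest = line.partition(" ")
        dev_class, sep, desc = rest.partition(": ")
        if sep:
            devices.append((slot, dev_class, desc))
    return devices


def still_visible(devices, hints=DISABLED_HINTS):
    """Devices whose class name contains one of the hint substrings."""
    return [d for d in devices
            if any(h.lower() in d[1].lower() for h in hints)]


# Excerpt of the listing above, one device per line.
SAMPLE = """\
00:01.1 SMBus: nVidia Corporation nForce2 SMBus (MCP) (rev a2)
00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2)
01:07.0 Ethernet controller: Digital Equipment Corporation Farallon PN9000SX (rev 01)
02:01.0 Ethernet controller: 3Com Corporation 3C920B-EMB Integrated Fast Ethernet Controller (rev 40)
03:00.0 VGA compatible controller: nVidia Corporation NV18 [GeForce4 MX 440 AGP 8x] (rev a2)
"""

devices = parse_lspci(SAMPLE)
nvidia_lan = [d for d in devices
              if d[1] == "Ethernet controller" and "nVidia" in d[2]]
print("disabled devices still visible:", still_visible(devices))
print("nVidia Ethernet controllers:", nvidia_lan)
```

Both lists come back empty for this excerpt, which agrees with the settings listed at the top of this message.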


Bob wrote
 >(2.6.0-test11) - IRQ flood related ?
 >
 >
 >Do you have onboard ethernet enabled with nforce2
 >mboard? I am fine with pre-emptive kernel but have
 >to disable onboard ethernet in cmos setup or I see
 >"Disabling IRQ7" and problems develop.
 >
 >-Bob
 >
 >Craig Bradney wrote:
 >
 >>Prakash,
 >>
 >>try it without preempt.. just to see. As soon as I removed it today the
 >>crashes went away (for 5 hours).. PC is now up for 2.5 hours and I'm
 >>waiting to see if it will be 5 hrs or 5 days this time around :)
 >>
 >>Craig
 >>
 >>On Fri, 2003-12-05 at 00:14, Prakash K. Cheemplavam wrote:
 >>
 >>
 >>>Jesse Allen wrote:
 >>>
 >>>
 >>>>On Thu, Dec 04, 2003 at 09:02:08PM +0100,

 >>>>
 >>>>
 >>>>
 >>>>>Hello,
 >>>>>
 >>>>>Along with the lockups already described here, I've noticed an
 >>>>>unidentified source of interrupts on IRQ7.
 >>>>>
 >>>>>
 >>>>...
 >>>>
 >>>>
 >>>>
 >>>>>I wonder if people experiencing lockup problems also have these
 >>>>>noise interrupts,
 >>>>>
 >>>>>
 >>>>I just took a look at this, by setting up parport_pc, and yes I get noise.
 >>>>
 >>>>This was my first sample with a kernel with APIC:
 >>>>  7:      29230    IO-APIC-edge  parport0
 >>>>
 >>>>
 >>>I just did an experiment with a very light kernel, nearly nothing
 >>>compiled inside, except apic acpi, preempt and needed stuff plus
 >>>scsi+libata and no ide. IRQ 7 was not present and every device had its
 >>>own irq. Nevertheless system locked up at second hdparm run...
 >>>
 >>>Prakash
 >>>
 >>>
 >>
 >
 >




end of thread

Thread overview: 17+ messages
2003-12-04 12:17 NForce2 pseudoscience stability testing (2.6.0-test11) b
2003-12-04 15:19 ` Craig Bradney
2003-12-04 16:32   ` Josh McKinney
2003-12-04 17:08     ` Julien Oster
2003-12-04 17:55       ` Josh McKinney
2003-12-04 20:02         ` NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ? cheuche+lkml
2003-12-04 20:48           ` Bob
2003-12-04 23:05           ` Jesse Allen
2003-12-04 23:14             ` Prakash K. Cheemplavam
2003-12-04 23:21               ` Craig Bradney
2003-12-04 23:36                 ` Prakash K. Cheemplavam
2003-12-05  5:47                 ` Bob
2003-12-05  7:01                   ` Craig Bradney
2003-12-05 12:33                   ` Prakash K. Cheemplavam
2003-12-05  8:16           ` cheuche+lkml
2003-12-05 13:28 ` NForce2 pseudoscience stability testing (2.6.0-test11) Pat Erley
2003-12-05  5:56 NForce2 pseudoscience stability testing (2.6.0-test11) - IRQ flood related ? b
