* Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
@ 2003-03-19 22:16 Wolfram Schlich
  2003-03-20  1:42 ` Alan Cox
  0 siblings, 1 reply; 10+ messages in thread
From: Wolfram Schlich @ 2003-03-19 22:16 UTC (permalink / raw)
  To: Linux-Kernel mailinglist

Hi,

I am experiencing system hardlocks under the following conditions:
- Hardware:
  - Tyan Thunder K7 w/ 2x Athlon MP 1.2GHz (5x PCI)
  - 2x Onboard Adaptec 7899P SCSI adapter
	IRQ 16, IRQ 17
  - 2x Onboard 3Com 3C982 100Mb 32bit PCI NIC
	IRQ 18, IRQ 19
  - 1x National Semiconductor DP83820 1000Mb 64bit PCI NIC
	IRQ 16
  - 2x Promise Ultra 133TX2 PDC20269
    IRQ 16, IRQ 17
- Software:
  - Linux 2.4.21-pre5:
  	CONFIG_IDE=y
	CONFIG_BLK_DEV_IDEDISK=y
	CONFIG_BLK_DEV_IDEPCI=y
	CONFIG_BLK_DEV_GENERIC=y
	CONFIG_IDEPCI_SHARE_IRQ=y
	CONFIG_BLK_DEV_IDEDMA_PCI=y
	CONFIG_IDEDMA_PCI_AUTO=y
	CONFIG_BLK_DEV_IDEDMA=y
	CONFIG_BLK_DEV_ADMA=y
	CONFIG_BLK_DEV_PDC202XX_NEW=y
	CONFIG_IDEPCI_SHARE_IRQ=y
	CONFIG_IDEDMA_IVB=y
	CONFIG_BLK_DEV_PDC202XX=y
	CONFIG_BLK_DEV_IDE_MODES=y

When one of the Promise controllers is sharing the same IRQ with one of
the NICs (doesn't matter which, I tried all) and data is copied *to* the
machine over the network, the system deadlocks. When data is copied
*from* the system over the network, it all works ok. Unfortunately the
system BIOS doesn't give me any possibility of setting the IRQ
channels by hand, so all I can do is put the cards into other slots.
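
One way to confirm which devices actually sit on the same interrupt line is
to parse /proc/interrupts. The following is a minimal Python sketch, assuming
the 2.4-style layout (IRQ number, one counter per CPU, controller type, then a
comma-separated device list). The function name shared_irqs and the sample
text are hypothetical; the sample is invented for illustration and is not
output from the machine described above.

```python
# Sketch only: assumes the 2.4-era /proc/interrupts layout
# (IRQ number, one count per CPU, controller type, device list).

# Invented sample, NOT taken from the system discussed in this thread.
SAMPLE = """\
           CPU0       CPU1
  0:     412345     398765    IO-APIC-edge  timer
 16:      10234      10456   IO-APIC-level  aic7xxx, ide2, eth2
 17:       2231       2198   IO-APIC-level  aic7xxx, ide3
 18:        512        498   IO-APIC-level  eth0
NMI:          0          0
"""


def shared_irqs(text):
    """Return {irq: [devices]} for every IRQ claimed by more than one device."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        irq, _, rest = line.partition(":")
        irq = irq.strip()
        if not irq.isdigit():
            continue  # skip NMI, LOC, ERR and similar summary rows
        # ncpus counters, one controller-type token, then the device list
        fields = rest.split(None, ncpus + 1)
        if len(fields) < ncpus + 2:
            continue
        devices = [d.strip() for d in fields[ncpus + 1].split(",")]
        if len(devices) > 1:
            shared[int(irq)] = devices
    return shared


for irq, devs in sorted(shared_irqs(SAMPLE).items()):
    print(f"IRQ {irq}: {', '.join(devs)}")
```

On a live system the same function could be fed the contents of
/proc/interrupts; newer kernels print the controller columns differently, so
the field split would need adjusting there.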

Ah, at boot time the kernel spits out this message:
--8<--
I/O APIC: AMD Errata #22 may be present. In the event of instability try
        : booting with the "noapic" option.
--8<--
I've not yet tried that, but will do now.
-- 
Wolfram Schlich; Friedhofstr. 8, D-88069 Tettnang; +49-(0)178-SCHLICH


* Re: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-19 22:16 Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs Wolfram Schlich
@ 2003-03-20  1:42 ` Alan Cox
  2003-03-20  7:22   ` Wolfram Schlich
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Cox @ 2003-03-20  1:42 UTC (permalink / raw)
  To: Wolfram Schlich; +Cc: Linux-Kernel mailinglist

On Wed, 2003-03-19 at 22:16, Wolfram Schlich wrote:
> When one of the Promise controllers is sharing the same IRQ with one of
> the NICs (doesn't matter which, I tried all) and data is copied *to* the
> machine over the network, the system deadlocks. When data is copied
> *from* the system over the network, it all works ok. Unfortunately the
> system BIOS doesn't give me any possibility of setting the IRQ
> channels by hand, so all I can do is put the cards into other slots.
> 

That's very useful information. There certainly have been (and it seems
still are) some cases with shared IRQ that are not quite handled right.
The 2.4.21pre5/pre5-ac work has partly been about fixing it. Deadlocks
surprise me however, since the problems I've seen have been I/O
errors.

However there is another known problem that does cause deadlocks with
the AMD76x, especially if the onboard IDE is used. Shove a PS/2 mouse
in the box, reboot and retest - if you dont already have one



* Re: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-20  1:42 ` Alan Cox
@ 2003-03-20  7:22   ` Wolfram Schlich
  2003-03-20  8:51     ` Stephan von Krawczynski
                       ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Wolfram Schlich @ 2003-03-20  7:22 UTC (permalink / raw)
  To: Linux-Kernel mailinglist

* Alan Cox <alan@lxorguk.ukuu.org.uk> [2003-03-20 01:31]:
> On Wed, 2003-03-19 at 22:16, Wolfram Schlich wrote:
> > When one of the Promise controllers is sharing the same IRQ with one of
> > the NICs (doesn't matter which, I tried all) and data is copied *to* the
> > machine over the network, the system deadlocks. When data is copied
> > *from* the system over the network, it all works ok. Unfortunately the
> > system BIOS doesn't give me any possibility of setting the IRQ
> > channels by hand, so all I can do is put the cards into other slots.
> > 
> 
> That's very useful information. There certainly have been (and it seems
> still are) some cases with shared IRQ that are not quite handled right.
> The 2.4.21pre5/pre5-ac work has partly been about fixing it. Deadlocks
> surprise me however, since the problems I've seen have been I/O
> errors.

Well, now I have trashed my array :-)
-> http://marc.theaimsgroup.com/?l=linux-raid&m=104811878405765&w=2

Btw., it spits out *lots* of messages when IRQ sharing is *disabled*
in the kernel config and just dies quietly when it's *enabled*
(having it dying before didn't mess up my array... ;)).

> However there is another known problem that does cause deadlocks with
> the AMD76x, especially if the onboard IDE is used. Shove a PS/2 mouse
> in the box, reboot and retest - if you don't already have one

?! I'm using the onboard IDE for two CDROM drives and one smaller
hard disk which I use rarely... and I didn't use any of these devices
in the cases in which I had the described problems... Anyway, why should I
connect a PS/2 mouse to the machine? Is it gonna solve all my
problems at once? ;-)
-- 
Mit freundlichen Gruessen / Yours sincerely
Wolfram Schlich; Friedhofstr. 8, D-88069 Tettnang; +49-(0)178-SCHLICH


* Re: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-20  7:22   ` Wolfram Schlich
@ 2003-03-20  8:51     ` Stephan von Krawczynski
  2003-03-20 12:08       ` Wolfram Schlich
  2003-03-20 10:23     ` Chris Newland
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Stephan von Krawczynski @ 2003-03-20  8:51 UTC (permalink / raw)
  To: Wolfram Schlich; +Cc: linux-kernel

On Thu, 20 Mar 2003 08:22:59 +0100
Wolfram Schlich <wolfram@schlich.org> wrote:

> [died ide with shared interrupts]

Don't know if it is related, but I experienced the same thing sharing PDC with
3com GBit (Broadcom) and it was indeed solved by latest version of tg3-driver
from Jeff. Maybe there are analogies between the two cases concerning the nic
drivers, too.

-- 
Regards,
Stephan


* RE: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-20  7:22   ` Wolfram Schlich
  2003-03-20  8:51     ` Stephan von Krawczynski
@ 2003-03-20 10:23     ` Chris Newland
  2003-03-20 12:06       ` Wolfram Schlich
  2003-03-20 14:57     ` Alan Cox
  2003-03-25 18:32     ` Jan Kasprzak
  3 siblings, 1 reply; 10+ messages in thread
From: Chris Newland @ 2003-03-20 10:23 UTC (permalink / raw)
  To: Wolfram Schlich, Linux-Kernel mailinglist

Hi Wolfram,

I had the same hardlock problem with dual athlons, MSI K7D Master, Promise
TX2000 (PDC20271) with 2 HDDs on RAID0 on the Promise card and only a CDROM
on the onboard IDE channel.

It used to lock hard (2.4.18 vanilla kernel) on 'tar' when using a USB mouse
but I haven't had a single lockup since plugging in a PS2 mouse :)

PS. Whilst 2.4 kernels run fine for me, I can't get any 2.5 kernel to run
yet.

I get a VFS kernel panic on bootup (can't mount root device).

I've installed Rusty's 2.5 modutils and tried compiling the 20271 driver
both into the kernel and as a module.

I read in Dave Jones' post-halloween notes that the Promise drivers are
broken:

<quote>
- The hptraid/promise RAID drivers are currently non functional, and
  will probably be converted to use device-mapper.
</quote>

Is this still true?

Best Regards,

Chris Newland




* Re: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-20 10:23     ` Chris Newland
@ 2003-03-20 12:06       ` Wolfram Schlich
  0 siblings, 0 replies; 10+ messages in thread
From: Wolfram Schlich @ 2003-03-20 12:06 UTC (permalink / raw)
  To: Linux-Kernel mailinglist

* Chris Newland <chris.newland@emorphia.com> [2003-03-20 11:25]:
> Hi Wolfram,

Hi!

> I had the same hardlock problem with dual athlons, MSI K7D Master, Promise
> TX2000 (PDC20271) with 2 HDDs on RAID0 on the Promise card and only a CDROM
> on the onboard IDE channel.
> 
> It used to lock hard (2.4.18 vanilla kernel) on 'tar' when using a USB mouse
> but I haven't had a single lockup since plugging in a PS2 mouse :)
> 
> PS. Whilst 2.4 kernels run fine for me, I can't get any 2.5 kernel to run
> yet.
> 
> I get a VFS kernel panic on bootup (can't mount root device).
> 
> I've installed Rusty's 2.5 modutils and tried compiling the 20271 driver
> both into the kernel and as a module.
> 
> I read in Dave Jones' post-halloween notes that the Promise drivers are
> broken:
> 
> <quote>
> - The hptraid/promise RAID drivers are currently non functional, and
>   will probably be converted to use device-mapper.
> </quote>
> 
> Is this still true?

I have no idea... I'm not using:
a) A Promise RAID controller (just the dumb ones)
b) Kernel 2.5
:-)
-- 
Mit freundlichen Gruessen / Yours sincerely
Wolfram Schlich; Friedhofstr. 8, D-88069 Tettnang; +49-(0)178-SCHLICH


* Re: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-20  8:51     ` Stephan von Krawczynski
@ 2003-03-20 12:08       ` Wolfram Schlich
  0 siblings, 0 replies; 10+ messages in thread
From: Wolfram Schlich @ 2003-03-20 12:08 UTC (permalink / raw)
  To: linux-kernel

* Stephan von Krawczynski <skraw@ithnet.com> [2003-03-20 09:53]:
> On Thu, 20 Mar 2003 08:22:59 +0100
> Wolfram Schlich <wolfram@schlich.org> wrote:
> 
> > [died ide with shared interrupts]
> 
> Don't know if it is related, but I experienced the same thing sharing PDC with
> 3com GBit (Broadcom) and it was indeed solved by latest version of tg3-driver
> from Jeff. Maybe there are analogies between the two cases concerning the nic
> drivers, too.

Interesting. Well, I have the problems with both the 3c59x (100Mb) and
the ns83820 (1000Mb) drivers...
-- 
Mit freundlichen Gruessen / Yours sincerely
Wolfram Schlich; Friedhofstr. 8, D-88069 Tettnang; +49-(0)178-SCHLICH


* Re: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-20 14:57     ` Alan Cox
@ 2003-03-20 14:35       ` Wolfram Schlich
  0 siblings, 0 replies; 10+ messages in thread
From: Wolfram Schlich @ 2003-03-20 14:35 UTC (permalink / raw)
  To: Linux-Kernel mailinglist

* Alan Cox <alan@lxorguk.ukuu.org.uk> [2003-03-20 14:51]:
> On Thu, 2003-03-20 at 07:22, Wolfram Schlich wrote:
> > Well, now I have trashed my array :-)
> > -> http://marc.theaimsgroup.com/?l=linux-raid&m=104811878405765&w=2
> > 
> > Btw., it spits out *lots* of messages when IRQ sharing is *disabled*
> > in the kernel config and just dies quietly when it's *enabled*
> > (having it dying before didn't mess up my array... ;)).
> 
> I'll take a look. I have no promise docs however so there is little that
> can be done for promise specific bugs if it looks that way.

Should I contact some guy at Promise regarding that issue?

> > ?! I'm using the onboard IDE for two CDROM drives and one smaller
> > hard disk which I use rarely... and I didn't use any of these devices
> > in the cases in which I had the described problems... Anyway, why should I
> > connect a PS/2 mouse to the machine? Is it gonna solve all my
> > problems at once? ;-)
> 
> Probably not, but it will avoid a lockup with IDE DMA in a specific case

This only affects onboard IDE usage?
Argh, I start to hate this AMD-MP stuff.

Btw., I get these messages from time to time (not often):
--8<--
APIC error on CPU1: 00(02)
APIC error on CPU0: 00(02)
--8<--
Should I boot with "noapic" or "disableapic"? But I guess this is
another issue...
-- 
Wolfram Schlich; Friedhofstr. 8, D-88069 Tettnang; +49-(0)178-SCHLICH


* Re: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-20  7:22   ` Wolfram Schlich
  2003-03-20  8:51     ` Stephan von Krawczynski
  2003-03-20 10:23     ` Chris Newland
@ 2003-03-20 14:57     ` Alan Cox
  2003-03-20 14:35       ` Wolfram Schlich
  2003-03-25 18:32     ` Jan Kasprzak
  3 siblings, 1 reply; 10+ messages in thread
From: Alan Cox @ 2003-03-20 14:57 UTC (permalink / raw)
  To: Wolfram Schlich; +Cc: Linux-Kernel mailinglist

On Thu, 2003-03-20 at 07:22, Wolfram Schlich wrote:
> Well, now I have trashed my array :-)
> -> http://marc.theaimsgroup.com/?l=linux-raid&m=104811878405765&w=2
> 
> Btw., it spits out *lots* of messages when IRQ sharing is *disabled*
> in the kernel config and just dies quietly when it's *enabled*
> (having it dying before didn't mess up my array... ;)).

I'll take a look. I have no promise docs however so there is little that
can be done for promise specific bugs if it looks that way.

> ?! I'm using the onboard IDE for two CDROM drives and one smaller
> hard disk which I use rarely... and I didn't use any of these devices
> in the cases in which I had the described problems... Anyway, why should I
> connect a PS/2 mouse to the machine? Is it gonna solve all my
> problems at once? ;-)

Probably not, but it will avoid a lockup with IDE DMA in a specific case



* Re: Hardlocks with 2.4.21-pre5, pdc202xx_new (PDC20269) and shared IRQs
  2003-03-20  7:22   ` Wolfram Schlich
                       ` (2 preceding siblings ...)
  2003-03-20 14:57     ` Alan Cox
@ 2003-03-25 18:32     ` Jan Kasprzak
  3 siblings, 0 replies; 10+ messages in thread
From: Jan Kasprzak @ 2003-03-25 18:32 UTC (permalink / raw)
  To: Linux-Kernel mailinglist

Wolfram Schlich wrote:
: > However there is another known problem that does cause deadlocks with
: > the AMD76x, especially if the onboard IDE is used. Shove a PS/2 mouse
: > in the box, reboot and retest - if you don't already have one
: 
: ?! I'm using the onboard IDE for two CDROM drives and one smaller
: hard disk which I use rarely... and I didn't use any of these devices
: in the cases in which I had the described problems... Anyway, why should I
: connect a PS/2 mouse to the machine? Is it gonna solve all my
: problems at once? ;-)

	I had a similar problem which has been solved by plugging
in a PS/2 mouse. So far I've got about 10 reports from people where
the PS/2 mouse solved the problem. It seems it is limited only to
the revision 04 of AMD 768 southbridge, and especially the MSI K7D-Master
boards. My lspci looks like this:

00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ISA (rev 04)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE (rev 04)
[...]
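
To check another box for the same southbridge revision, the lspci listing can
be scanned mechanically. This is a rough Python sketch, assuming the plain
"slot class: description (rev NN)" line format shown above; the function name
amd768_rev04_devices and the regular expression are hypothetical helpers, not
part of any existing tool. A match only says the chip revision is present, not
that the lock-up will occur.

```python
import re

# The lspci lines quoted in this message, one device per line.
SAMPLE = """\
00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ISA (rev 04)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE (rev 04)
"""

# slot, device class, description, optional "(rev NN)" suffix
LINE_RE = re.compile(r"^(\S+) ([^:]+): (.*?)(?: \(rev ([0-9a-f]+)\))?$")


def amd768_rev04_devices(lspci_text):
    """Return the PCI slots of AMD-768 functions reporting revision 04."""
    hits = []
    for line in lspci_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. an elided "[...]" line
        slot, _dev_class, desc, rev = m.groups()
        if "AMD-768" in desc and rev == "04":
            hits.append(slot)
    return hits


print(amd768_rev04_devices(SAMPLE))
```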

	It is even somewhat documented as an official AMD erratum.

	This is not a dead-lock per se - but rather a hard lock-up
of the box (the system is totally locked up, even pressing NumLock does
not light the NumLock LED on the keyboard).

	However: I also have occasional (less than 1 per week) dead-locks
on this box related probably to NFS or ext3 or NFS-lockd - the system is
OK, only all nfsd and lockd processes are stuck in the "D" state,
sometimes there is also an "exportfs -a" process in the "D" state
(my /etc/exports is generated from database, and I run exportfs
every two hours or so). And I think it is SMP-related, not necessarily
AMD-related. These deadlocks are more frequent in 2.4.21-pre kernels
than in vanilla 2.4.20. See my previous posts to LKML on this topic as well.
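
Processes stuck in the "D" (uninterruptible sleep) state can be listed from
the per-process stat files under /proc. The following Python sketch assumes
the usual "pid (comm) state ..." layout of /proc/<pid>/stat; the function
names parse_stat and uninterruptible are hypothetical, and the sample lines
are invented for illustration rather than captured from the box described
here.

```python
# Invented sample lines in /proc/<pid>/stat format (truncated after a few
# fields); NOT real data from the system discussed in this thread.
SAMPLE_STAT_LINES = [
    "612 (nfsd) D 1 0 0 0",
    "613 (nfsd) S 1 0 0 0",
    "700 (lockd) D 1 0 0 0",
    "901 (exportfs) D 850 0 0 0",
    "950 (my (odd) name) S 1 0 0 0",
]


def parse_stat(line):
    """Split one stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so the last ')' is used as the delimiter.
    """
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def uninterruptible(stat_lines):
    """Return (pid, comm) for every process whose state is 'D'."""
    return [
        (pid, comm)
        for pid, comm, state in map(parse_stat, stat_lines)
        if state == "D"
    ]


for pid, comm in uninterruptible(SAMPLE_STAT_LINES):
    print(pid, comm)
```

On a running system the input lines would come from reading each
/proc/<pid>/stat file in turn.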

-Yenya

-- 
| Jan "Yenya" Kasprzak  <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839      Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/   Czech Linux Homepage: http://www.linux.cz/ |
|-- If you start doing things because you hate others and want to screw  --|
|-- them over the end result is bad.   --Linus Torvalds to the BBC News  --|

