linux-kernel.vger.kernel.org archive mirror
* 2.4.21 IDE problems (lost interrupt, bad DMA status)
@ 2003-06-30 22:15 Marek Michalkiewicz
  2003-06-30 22:16 ` Alan Cox
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Marek Michalkiewicz @ 2003-06-30 22:15 UTC (permalink / raw)
  To: linux-kernel

Hi,

After upgrading the kernel from 2.4.20 to 2.4.21, sometimes I see
the following messages:

hda: dma_timer_expiry: dma status == 0x24
hda: lost interrupt
hda: dma_intr: bad DMA status (dma_stat=30)
hda: dma_intr: status=0x50 { DriveReady SeekComplete }
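
The two status bytes in the log above can be unpacked bit by bit. The
following is only a rough Python sketch: the names for status=0x50
follow the labels the kernel itself prints, while the layout assumed
for "dma status == 0x24" is the standard PCI bus-master IDE status
register, which is an assumption and not something stated in this
thread. The table and function names are made up for illustration.

```python
# ATA status register bits, named the way the 2.4 IDE driver prints them.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Bus-master IDE status register bits (assumed standard layout;
# the names here are descriptive labels, not kernel identifiers).
BM_STATUS_BITS = [
    (0x80, "Simplex"),
    (0x40, "Drive1DmaCapable"),
    (0x20, "Drive0DmaCapable"),
    (0x04, "Interrupt"),
    (0x02, "Error"),
    (0x01, "Active"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


print(decode(0x50, ATA_STATUS_BITS))
print(decode(0x24, BM_STATUS_BITS))
```

If that reading of the bus-master register is right, 0x24 says the
controller had already flagged an interrupt for drive 0 while the
driver was still waiting for it, which would fit a lost interrupt
better than a failing disk.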

It happens especially when there is a lot of disk I/O (which stops
for a few seconds when these messages appear), with three different
disks (very unlikely they all decided to die at the same time...),
one old ATA33 (QUANTUM FIREBALL SE8.4A) and two newer ATA100 disks
(WDC WD300BB-32CCB0, ST340015A).  IDE controller: VIA VT82C686B
on a MSI MS-6368L motherboard.

I don't remember seeing anything like that in any earlier 2.4.x
kernels.  Is this a known problem?  Is this anything dangerous -
should I disable UDMA for now to play it safe?

Thanks,
Marek



* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
  2003-06-30 22:15 2.4.21 IDE problems (lost interrupt, bad DMA status) Marek Michalkiewicz
@ 2003-06-30 22:16 ` Alan Cox
  2003-07-01 19:45   ` Marek Michalkiewicz
  2003-06-30 22:39 ` Danny ter Haar
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Alan Cox @ 2003-06-30 22:16 UTC (permalink / raw)
  To: Marek Michalkiewicz; +Cc: Linux Kernel Mailing List

On Mon, 2003-06-30 at 23:15, Marek Michalkiewicz wrote:
> Hi,
> 
> After upgrading the kernel from 2.4.20 to 2.4.21, sometimes I see
> the following messages:
> 
> hda: dma_timer_expiry: dma status == 0x24
> hda: lost interrupt
> hda: dma_intr: bad DMA status (dma_stat=30)
> hda: dma_intr: status=0x50 { DriveReady SeekComplete }

Does it happen if you disable local APIC support?



* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
  2003-06-30 22:15 2.4.21 IDE problems (lost interrupt, bad DMA status) Marek Michalkiewicz
  2003-06-30 22:16 ` Alan Cox
@ 2003-06-30 22:39 ` Danny ter Haar
  2003-06-30 22:47 ` dmeyer
  2003-07-01 13:13 ` Edward King
  3 siblings, 0 replies; 10+ messages in thread
From: Danny ter Haar @ 2003-06-30 22:39 UTC (permalink / raw)
  To: linux-kernel

Marek Michalkiewicz  <marekm@amelek.gda.pl> wrote:
>I don't remember seeing anything like that in any earlier 2.4.x
>kernels.  Is this a known problem?  Is this anything dangerous -
>should I disable UDMA for now to play it safe?

AFAIK this concerns a "lost" interrupt.
Alan Cox's -ac pre-patches (currently ac4) seem to fix it
for a lot of people. Another approach is to disable IO_APIC on
uniprocessor machines when compiling the kernel.
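
To see whether a given kernel build has the IO-APIC option turned on,
the .config file can be inspected. The Python sketch below assumes the
usual .config line format ("OPTION=y" or "# OPTION is not set") and
uses CONFIG_X86_IO_APIC, the symbol named elsewhere in this thread;
the function name and the sample text are hypothetical.

```python
def config_option_state(config_text, option):
    """Return the value of `option` in kernel .config text.

    Returns the string after '=' (for example 'y'), 'n' when the file
    says '# OPTION is not set', or None when the option is not mentioned.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == "# " + option + " is not set":
            return "n"
    return None


# Hypothetical excerpt from a .config, for illustration only.
sample = """\
CONFIG_X86=y
# CONFIG_X86_IO_APIC is not set
CONFIG_BLK_DEV_IDE=y
"""

print(config_option_state(sample, "CONFIG_X86_IO_APIC"))
```

On a real tree the text would be read from the .config in the kernel
source directory before rebuilding.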

Happy compiling ;-)

Danny

-- 
Miguel   | "I can't tell if I have worked all my life or if
de Icaza |  I have never worked a single day of my life,"



* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
  2003-06-30 22:15 2.4.21 IDE problems (lost interrupt, bad DMA status) Marek Michalkiewicz
  2003-06-30 22:16 ` Alan Cox
  2003-06-30 22:39 ` Danny ter Haar
@ 2003-06-30 22:47 ` dmeyer
  2003-07-02 10:34   ` joe briggs
  2003-07-01 13:13 ` Edward King
  3 siblings, 1 reply; 10+ messages in thread
From: dmeyer @ 2003-06-30 22:47 UTC (permalink / raw)
  To: linux-kernel

In article <20030630221542.GA17416@alf.amelek.gda.pl> you write:
> Hi,
> 
> After upgrading the kernel from 2.4.20 to 2.4.21, sometimes I see
> the following messages:
> 
> hda: dma_timer_expiry: dma status == 0x24
> hda: lost interrupt
> hda: dma_intr: bad DMA status (dma_stat=30)
> hda: dma_intr: status=0x50 { DriveReady SeekComplete }
> 
> It happens especially when there is a lot of disk I/O (which stops
> for a few seconds when these messages appear), with three different
> disks (very unlikely they all decided to die at the same time...),
> one old ATA33 (QUANTUM FIREBALL SE8.4A) and two newer ATA100 disks
> (WDC WD300BB-32CCB0, ST340015A).  IDE controller: VIA VT82C686B
> on a MSI MS-6368L motherboard.
> 
> I don't remember seeing anything like that in any earlier 2.4.x
> kernels.  Is this a known problem?  Is this anything dangerous -
> should I disable UDMA for now to play it safe?

I never saw any corruption when I had it.  I've seen this with stock
kernels since 2.4.18 or so with ACPI and APIC enabled; with -ac kernels
I never get it (I suspect the old ACPI in the stock kernels is
the problem).

So my suggestion is to turn off ACPI and/or APIC, or to try
2.4.21-ac.

-- 
Dave Meyer
dmeyer@dmeyer.net


* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
  2003-06-30 22:15 2.4.21 IDE problems (lost interrupt, bad DMA status) Marek Michalkiewicz
                   ` (2 preceding siblings ...)
  2003-06-30 22:47 ` dmeyer
@ 2003-07-01 13:13 ` Edward King
  2003-07-01 19:53   ` Marek Michalkiewicz
  3 siblings, 1 reply; 10+ messages in thread
From: Edward King @ 2003-07-01 13:13 UTC (permalink / raw)
  To: Marek Michalkiewicz, linux-kernel



Marek Michalkiewicz wrote:

>Hi,
>
>After upgrading the kernel from 2.4.20 to 2.4.21, sometimes I see
>the following messages:
>
>hda: dma_timer_expiry: dma status == 0x24
>hda: lost interrupt
>hda: dma_intr: bad DMA status (dma_stat=30)
>hda: dma_intr: status=0x50 { DriveReady SeekComplete }
>
>It happens especially when there is a lot of disk I/O (which stops
>for a few seconds when these messages appear), with three different
>disks (very unlikely they all decided to die at the same time...),

Are you using software raid or devfs?

I was losing interrupts, and disabling devfs removed the problem (very
reproducible with software RAID 5; I never really tried much heavy disk
use without RAID).

Edward King




* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
  2003-06-30 22:16 ` Alan Cox
@ 2003-07-01 19:45   ` Marek Michalkiewicz
  0 siblings, 0 replies; 10+ messages in thread
From: Marek Michalkiewicz @ 2003-07-01 19:45 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List

On Mon, Jun 30, 2003 at 11:16:40PM +0100, Alan Cox wrote:
> On Mon, 2003-06-30 at 23:15, Marek Michalkiewicz wrote:
> >
> > hda: dma_timer_expiry: dma status == 0x24
> > hda: lost interrupt
> > hda: dma_intr: bad DMA status (dma_stat=30)
> > hda: dma_intr: status=0x50 { DriveReady SeekComplete }
>
> Does it happen if you disable local APIC support?

It seems that booting with "noapic" fixes it, or at least now it
is much more difficult to trigger.  Still testing...
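
A quick way to confirm that a box really booted with "noapic", and
that the IDE interrupt is routed through the legacy PIC, is to look at
/proc/cmdline and /proc/interrupts. The Python sketch below assumes
the usual layout of those files (IRQ number, per-CPU counts, controller
name, device names); the helper names and the sample text are invented
for illustration.

```python
def has_boot_flag(cmdline, flag):
    """True if `flag` appears as a separate word on the kernel command line."""
    return flag in cmdline.split()


def irq_controller(interrupts_text, device):
    """Return the controller column (e.g. 'XT-PIC') for the IRQ used by `device`."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue  # header line such as "CPU0"
        # Drop the per-CPU counters; what remains is controller + device names.
        rest = [f for f in fields[1:] if not f.isdigit()]
        names = [n.rstrip(",") for n in rest[1:]]
        if rest and device in names:
            return rest[0]
    return None


# Invented sample data; on a real system read the two /proc files instead.
sample_cmdline = "root=/dev/hda1 ro noapic"
sample_interrupts = """\
           CPU0
  0:    1234567          XT-PIC  timer
 14:      54321          XT-PIC  ide0
NMI:          0
"""

print(has_boot_flag(sample_cmdline, "noapic"))
print(irq_controller(sample_interrupts, "ide0"))
```

With the IO-APIC in use, the controller column would be expected to
show an IO-APIC name instead of XT-PIC.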

Before upgrading to 2.4.21, I've been running 2.4.20 with APIC
enabled for a few months, and there were no such IDE errors.

BTW, "noapic" fixes the "power button not working if ACPI is alone
on its own IRQ" problem (present in both 2.4.20 and 2.4.21) too.

Thanks,
Marek



* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
  2003-07-01 13:13 ` Edward King
@ 2003-07-01 19:53   ` Marek Michalkiewicz
  0 siblings, 0 replies; 10+ messages in thread
From: Marek Michalkiewicz @ 2003-07-01 19:53 UTC (permalink / raw)
  To: Edward King; +Cc: linux-kernel

On Tue, Jul 01, 2003 at 08:13:34AM -0500, Edward King wrote:
> 
> Are you using software raid or devfs?

No devfs.  As for RAID - I'm running the same kernel image on two
very similar boxes, RAID is compiled in but not used on one box,
and the other box currently has RAID1 in degraded mode (one disk,
waiting for me to install the second one).

Marek



* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
  2003-07-02 10:34   ` joe briggs
@ 2003-07-02 10:00     ` Herbert Xu
  0 siblings, 0 replies; 10+ messages in thread
From: Herbert Xu @ 2003-07-02 10:00 UTC (permalink / raw)
  To: joe briggs, linux-kernel

joe briggs <jbriggs@briggsmedia.com> wrote:
> Can anyone tell me what the -ac patches do with respect to this problem?  
> Also, what functionality is lost when CONFIG_X86_IO_APIC is not set, and 
> should it improve this hd timeout/lost interrupt problem?

It fixes the problem where interrupts are lost when the relevant IRQ line
is disabled.
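
As a rough illustration of that failure mode (a toy model in Python,
not the code from the -ac patches, which are not shown in this
thread): an interrupt that arrives while its line is disabled is
either dropped or remembered and delivered once the line is enabled
again. The class and its replay_pending switch are hypothetical.

```python
class IrqLine:
    """Toy model of one interrupt line that can be masked."""

    def __init__(self, replay_pending):
        self.replay_pending = replay_pending  # remember masked interrupts?
        self.enabled = True
        self.pending = False
        self.delivered = 0

    def raise_irq(self):
        if self.enabled:
            self.delivered += 1          # handler runs immediately
        elif self.replay_pending:
            self.pending = True          # remember it for later
        # otherwise the interrupt is silently lost

    def disable(self):
        self.enabled = False

    def enable(self):
        self.enabled = True
        if self.pending:                 # replay what arrived while masked
            self.pending = False
            self.delivered += 1


def run(replay_pending):
    """Raise one interrupt while the line is disabled; count deliveries."""
    line = IrqLine(replay_pending)
    line.disable()
    line.raise_irq()
    line.enable()
    return line.delivered


print(run(replay_pending=False))  # lossy behaviour
print(run(replay_pending=True))   # interrupt replayed on enable
```

In the lossy case the driver would keep waiting until its timer
expires, which matches the "lost interrupt" message at the start of
the thread.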
-- 
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email:  Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
  2003-06-30 22:47 ` dmeyer
@ 2003-07-02 10:34   ` joe briggs
  2003-07-02 10:00     ` Herbert Xu
  0 siblings, 1 reply; 10+ messages in thread
From: joe briggs @ 2003-07-02 10:34 UTC (permalink / raw)
  To: dmeyer, linux-kernel

Can anyone tell me what the -ac patches do with respect to this problem?  
Also, what functionality is lost when CONFIG_X86_IO_APIC is not set, and 
should it improve this hd timeout/lost interrupt problem?

Thanks!

On Monday 30 June 2003 06:47 pm, dmeyer@dmeyer.net wrote:
> In article <20030630221542.GA17416@alf.amelek.gda.pl> you write:
> > Hi,
> >
> > After upgrading the kernel from 2.4.20 to 2.4.21, sometimes I see
> > the following messages:
> >
> > hda: dma_timer_expiry: dma status == 0x24
> > hda: lost interrupt
> > hda: dma_intr: bad DMA status (dma_stat=30)
> > hda: dma_intr: status=0x50 { DriveReady SeekComplete }
> >
> > It happens especially when there is a lot of disk I/O (which stops
> > for a few seconds when these messages appear), with three different
> > disks (very unlikely they all decided to die at the same time...),
> > one old ATA33 (QUANTUM FIREBALL SE8.4A) and two newer ATA100 disks
> > (WDC WD300BB-32CCB0, ST340015A).  IDE controller: VIA VT82C686B
> > on a MSI MS-6368L motherboard.
> >
> > I don't remember seeing anything like that in any earlier 2.4.x
> > kernels.  Is this a known problem?  Is this anything dangerous -
> > should I disable UDMA for now to play it safe?
>
> I never saw any corruption when I had it.  I've seen this with stock
> kernels since 2.4.18 or so with ACPI and APIC enabled; with ac kernels
> I never get it (I'm suspecting the old ACPI in the stock kernels is
> the problem).
>
> So my suggestion is either turn off ACPI and/or APIC, or try
> 2.4.21-ac.

-- 
Joe Briggs
Briggs Media Systems
105 Burnsen Ave.
Manchester NH 01304 USA
TEL 603-232-3115 FAX 603-625-5809 MOBILE 603-493-2386
www.briggsmedia.com


* Re: 2.4.21 IDE problems (lost interrupt, bad DMA status)
@ 2003-07-21 21:57 Ronald Wahl
  0 siblings, 0 replies; 10+ messages in thread
From: Ronald Wahl @ 2003-07-21 21:57 UTC (permalink / raw)
  To: linux-kernel

Herbert Xu wrote:
> joe briggs <jbriggs@briggsmedia.com> wrote:
> > Can anyone tell me what the -ac patches do with respect to this problem?  
> > Also, what functionality is lost when CONFIG_X86_IO_APIC is not set, and 
> > should it improve this hd timeout/lost interrupt problem?

> It fixes the problem where interrupts are lost when the relevant IRQ line
> is disabled.

I have 3 questions regarding this issue:

1. Can you explain the problem in a little more detail?

2. Is there a dedicated patch solving this issue? (I don't want to
   apply the complete -ac patch.)

3. Will this patch be in 2.4.22?


Thx & regards,
ron

PS: Sorry if this mail is not part of the original thread. I'm not on
    the list and read about the problem in a mailing list archive.


Thread overview: 10+ messages
2003-06-30 22:15 2.4.21 IDE problems (lost interrupt, bad DMA status) Marek Michalkiewicz
2003-06-30 22:16 ` Alan Cox
2003-07-01 19:45   ` Marek Michalkiewicz
2003-06-30 22:39 ` Danny ter Haar
2003-06-30 22:47 ` dmeyer
2003-07-02 10:34   ` joe briggs
2003-07-02 10:00     ` Herbert Xu
2003-07-01 13:13 ` Edward King
2003-07-01 19:53   ` Marek Michalkiewicz
2003-07-21 21:57 Ronald Wahl
