linux-kernel.vger.kernel.org archive mirror
* Crashing with Abit KT7, 2.2.19+ide patches
@ 2001-08-27  8:01 Nicholas Lee
  2001-08-27 23:14 ` Tim Moore
  0 siblings, 1 reply; 6+ messages in thread
From: Nicholas Lee @ 2001-08-27  8:01 UTC (permalink / raw)
  To: linux-kernel



Are there any known issues with 2.2.19+ide patches and the Abit KT7?


[nic@hoppa:~] cat /proc/ide/drivers
ide-cdrom version 4.58
ide-disk version 1.09
[nic@hoppa:~] uname -a
Linux hoppa 2.2.19 #2 Thu Aug 16 16:28:31 NZST 2001 i686 unknown
[nic@hoppa:~] cat /proc/ide/hda/model
ST320420A
[nic@hoppa:~] dmesg | grep -i DMA
VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci00:07.1
    ide0: BM-DMA at 0xec00-0xec07, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xec08-0xec0f, BIOS settings: hdc:DMA, hdd:pio
hda: ST320420A, 19458MB w/2048kB Cache, CHS=2480/255/63, UDMA(66)
hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)

Aug 26 13:59:05 hoppa kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 26 13:59:05 hoppa kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

[nic@hoppa:~] sudo hdparm -v /dev/hda

/dev/hda:
 multcount    =  0 (off)
 I/O support  =  1 (32-bit)
 unmaskirq    =  1 (on)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 nowerr       =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 2480/255/63, sectors = 39851760, start = 0
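The geometry and capacity figures above can be cross-checked with a little arithmetic. A minimal sketch, assuming 512-byte sectors (not stated in the output) and using helper names that are purely illustrative:

```python
SECTOR_BYTES = 512  # assumption: classic ATA sector size, not shown by hdparm


def capacity_mib(sectors, sector_bytes=SECTOR_BYTES):
    """Capacity in whole MiB, rounded down."""
    return sectors * sector_bytes // (1024 * 1024)


def cylinders_for(sectors, heads, sectors_per_track):
    """Whole cylinders that fit in the given sector count."""
    return sectors // (heads * sectors_per_track)


if __name__ == "__main__":
    total = 39851760                      # "sectors =" from hdparm -v
    print(capacity_mib(total))            # compare with 19458MB in dmesg
    print(cylinders_for(total, 255, 63))  # compare with CHS=2480/255/63
    print(total - 2480 * 255 * 63)        # sectors beyond the last whole cylinder
```

The sector count and the CHS product differ slightly because the reported geometry is rounded down to whole cylinders.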


I was running 2.2.18pre21 before and was already getting these errors,
but now I've had two HDD crashes in the last week.


I just upgraded this system from 2.2.18pre21 to 2.2.19+ide patches in
the last two weeks and it's become rather flaky.


I've had two HDD crashes in the last week, neither while the system was
under any load at all.

This is rather disquieting as Linux is usually rock stable in those
situations.

The system basically freezes and the console displays bus reset
messages, which aren't saved to the syslog.  After a button reset the
Primary Master IDE HDD (hda) just doesn't exist as far as the BIOS is
concerned, so the system won't boot.  A power-cycle is required to get
the system back again.


I didn't have a crash issue with this system previously under
2.2.18pre21, although I have also added a Dlink 4-port NIC and the
system is doing some IPSec that it wasn't doing before.

Is this hardware that's just gone flaky?  I'm certainly not trusting VIA
much any more, and Seagate are definitely way off my purchase list.


PS: I'm not subscribed, so please CC me.


Thanks,

-- 
Nicholas Lee - nj.lee at plumtree.co dot nz, somewhere on the fish Maui caught.

                         Quixotic Eccentricity



* Re: Crashing with Abit KT7, 2.2.19+ide patches
  2001-08-27  8:01 Crashing with Abit KT7, 2.2.19+ide patches Nicholas Lee
@ 2001-08-27 23:14 ` Tim Moore
  2001-08-27 23:48   ` Nicholas Lee
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Tim Moore @ 2001-08-27 23:14 UTC (permalink / raw)
  To: Nicholas Lee; +Cc: linux-kernel

> [nic@hoppa:~] dmesg | grep -i DMA
> VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci00:07.1
>     ide0: BM-DMA at 0xec00-0xec07, BIOS settings: hda:DMA, hdb:pio
>     ide1: BM-DMA at 0xec08-0xec0f, BIOS settings: hdc:DMA, hdd:pio
> hda: ST320420A, 19458MB w/2048kB Cache, CHS=2480/255/63, UDMA(66)
> hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)
> 
> Aug 26 13:59:05 hoppa kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> Aug 26 13:59:05 hoppa kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

I've a similar machine

[15:45] abit:~ > dmesg | grep -i DMA
VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci00:07.1
    ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:DMA, hdd:DMA
hda: IBM-DTLA-307020, 19623MB w/1916kB Cache, CHS=2501/255/63, UDMA(66)
hdc: Maxtor 32049H2, 19541MB w/2048kB Cache, CHS=39704/16/63, UDMA(66)

and have had this problem in the past.  Make sure you are using the
latest 2.2.19 ide patch (ide.2.2.19.05042001).  My problem was marginal
ATA/66 IDE cables that came with my motherboard (Abit KA7).  Other than
upgrading cables I also use kernel param 'ide0=ata66' at boot and these
.config entries:

CONFIG_M686=y
CONFIG_MTRR=y
CONFIG_IDEDMA_AUTO=y
CONFIG_BLK_DEV_VIA82CXXX=y

Also it could be EIDE cables too long or not fully inserted or damaged
(pinched), or an actual disk failing, or excessive heat in the box if
the disk is very hot to the touch.

rgds,
tim.

--


* Re: Crashing with Abit KT7, 2.2.19+ide patches
  2001-08-27 23:14 ` Tim Moore
@ 2001-08-27 23:48   ` Nicholas Lee
  2001-08-28  0:15   ` Nicholas Lee
  2001-08-28  2:16   ` Nicholas Lee
  2 siblings, 0 replies; 6+ messages in thread
From: Nicholas Lee @ 2001-08-27 23:48 UTC (permalink / raw)
  To: Tim Moore; +Cc: linux-kernel

On Mon, Aug 27, 2001 at 04:14:43PM -0700, Tim Moore wrote:
> > [nic@hoppa:~] dmesg | grep -i DMA
> > VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci00:07.1
> >     ide0: BM-DMA at 0xec00-0xec07, BIOS settings: hda:DMA, hdb:pio
> >     ide1: BM-DMA at 0xec08-0xec0f, BIOS settings: hdc:DMA, hdd:pio
> > hda: ST320420A, 19458MB w/2048kB Cache, CHS=2480/255/63, UDMA(66)
> > hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)
> > 
> > Aug 26 13:59:05 hoppa kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> > Aug 26 13:59:05 hoppa kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
> 
> I've a similar machine
> 
> [15:45] abit:~ > dmesg | grep -i DMA
> VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci00:07.1
>     ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:DMA
>     ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:DMA, hdd:DMA
> hda: IBM-DTLA-307020, 19623MB w/1916kB Cache, CHS=2501/255/63, UDMA(66)
> hdc: Maxtor 32049H2, 19541MB w/2048kB Cache, CHS=39704/16/63, UDMA(66)
> 
> and have had this problem in the past.  Make sure you are using the
> latest 2.2.19 ide patch (ide.2.2.19.05042001).  My problem was marginal


[nic@hoppa:~] ls -l ide.2.2.19.05042001.patch 
-rw-r--r--    1 nic      nic        240676 Aug 15 18:22 ide.2.2.19.05042001.patch


Hmm, I should have looked "in" this patch before:


+Use multi-mode by default
+CONFIG_IDEDISK_MULTI_MODE
+  If you get this error, try to enable this option.
+
+  hda: set_multmode: status=0x51 { DriveReady SeekComplete Error }
+  hda: set_multmode: error=0x04 { DriveStatusError }
+
+  If in doubt, say N.
+

Not sure if that will help though, as the error messages I've been
seeing are different:

Aug 28 08:55:00 hoppa kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 28 08:55:00 hoppa kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

No mention of set_multmode.  Looks more like a DMA interrupt (dma_intr) error.
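
For reference, the hex values in those messages are bit fields from the drive's status and error registers. Below is a minimal sketch of decoding them, assuming the standard ATA bit positions; the names are chosen to match what the driver prints in this thread, and the ones not seen in the logs above are my assumption rather than a copy of the driver's tables. The output order may also differ from the kernel's.

```python
# ATA status register bits (names for bits not seen in this thread are assumed).
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# ATA error register bits. Bit 0x80 is reported as BadCRC on UDMA transfers;
# older non-UDMA usage treats the same bit as a bad-sector flag.
ERROR_BITS = [
    (0x80, "BadCRC"),
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


if __name__ == "__main__":
    print(decode(0x51, STATUS_BITS))  # status from the dma_intr message
    print(decode(0x84, ERROR_BITS))   # error from the dma_intr message
    print(decode(0x04, ERROR_BITS))   # error from the set_multmode help text
```

The set_multmode example in the patch's help text (error=0x04) has only the abort bit set, while the dma_intr errors here (0x84) also have the CRC bit set, which is why the two look like different problems.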



> ATA/66 IDE cables that came with my motherboard (Abit KA7).  Other than

It has an ATA100 cable on the HDD in question, though of course that is
a cable that came with the motherboard.

> upgrading cables I also use kernel param 'ide0=ata66' at boot and these
> .config entries:
> 
> CONFIG_M686=y
> CONFIG_MTRR=y
> CONFIG_IDEDMA_AUTO=y
> CONFIG_BLK_DEV_VIA82CXXX=y

Yep they are all set.   
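
A quick way to double-check this kind of thing without eyeballing the file is a small script. This is only a sketch: the option names are the four quoted above, while the function names and the .config path in the usage note are hypothetical.

```python
REQUIRED = (
    "CONFIG_M686",
    "CONFIG_MTRR",
    "CONFIG_IDEDMA_AUTO",
    "CONFIG_BLK_DEV_VIA82CXXX",
)


def parse_config(text):
    """Parse kernel .config text into a {name: value} dict, skipping comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # "# CONFIG_FOO is not set" lines are treated as unset
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def missing_options(text, required=REQUIRED):
    """Return the required options that are not built in (=y)."""
    options = parse_config(text)
    return [name for name in required if options.get(name) != "y"]


if __name__ == "__main__":
    sample = "CONFIG_M686=y\nCONFIG_MTRR=y\n# CONFIG_IDEDMA_AUTO is not set\n"
    print(missing_options(sample))
```

Against a real tree this would be called as missing_options(open("/usr/src/linux/.config").read()), with the path adjusted to wherever the kernel source actually lives.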



> Also it could be EIDE cables too long or not fully inserted or damaged
> (pinched), or an actual disk failing, or excessive heat in the box if

I'll look at getting some new cables I guess.  I hate these sorts of
bugs, as the only way to test fixes is to risk further corruption.


> the disk is very hot to the touch.

I doubt heat would be the problem as the system is in a new case (6
months) with an extra case fan and a 300W PSU.  Last time I looked in
the BIOS it said the Duron 700 was running at 30C, though I'm not sure
I believe that.  It's certainly not like some of the old P5 troopers
I've seen around.  One friend had the PSU and CPU fan fail on him and it
almost burnt the house down before he found out; the system kept
happily going almost to the end. 8)



Do you think that putting something like a 3ware IDE controller in the
machine will fix this?  I.e., ignoring a possible cable fault, is the
VIA 686 southbridge really that crap?


Thanks for the help.


-- 
Nicholas Lee - nj.lee at plumtree.co dot nz, somewhere on the fish Maui caught.

                         Quixotic Eccentricity



* Re: Crashing with Abit KT7, 2.2.19+ide patches
  2001-08-27 23:14 ` Tim Moore
  2001-08-27 23:48   ` Nicholas Lee
@ 2001-08-28  0:15   ` Nicholas Lee
  2001-08-28  2:16   ` Nicholas Lee
  2 siblings, 0 replies; 6+ messages in thread
From: Nicholas Lee @ 2001-08-28  0:15 UTC (permalink / raw)
  To: Tim Moore; +Cc: linux-kernel

On Mon, Aug 27, 2001 at 04:14:43PM -0700, Tim Moore wrote:
> > [nic@hoppa:~] dmesg | grep -i DMA
> > VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci00:07.1
> >     ide0: BM-DMA at 0xec00-0xec07, BIOS settings: hda:DMA, hdb:pio
> >     ide1: BM-DMA at 0xec08-0xec0f, BIOS settings: hdc:DMA, hdd:pio
> > hda: ST320420A, 19458MB w/2048kB Cache, CHS=2480/255/63, UDMA(66)
> > hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)
> > 
> > Aug 26 13:59:05 hoppa kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> > Aug 26 13:59:05 hoppa kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
> 

I've discovered this comment by Andre Hedrick and another by Alan Cox:

http://marc.theaimsgroup.com/?l=linux-kernel&m=97528796025605&w=2

"This is what it tells you directly.  You have dirty crosstalk on your
ribbon.  Basically nothing is wrong, except you can not safely support
that transfer rate."

http://marc.theaimsgroup.com/?l=linux-kernel&m=99633759016613&w=2

"BadCRC is normally a cable error, but I'm suspicious that its also one
of the things caused by PCI bus problems on the VIA stuff"


The thing is, though, how can it be such a short step from a few CRC
errors to the IDE bus going into Autistic mode?



Nicholas




* Re: Crashing with Abit KT7, 2.2.19+ide patches
  2001-08-27 23:14 ` Tim Moore
  2001-08-27 23:48   ` Nicholas Lee
  2001-08-28  0:15   ` Nicholas Lee
@ 2001-08-28  2:16   ` Nicholas Lee
  2001-08-28 16:56     ` Tim Moore
  2 siblings, 1 reply; 6+ messages in thread
From: Nicholas Lee @ 2001-08-28  2:16 UTC (permalink / raw)
  Cc: linux-kernel



I managed to catch one of these crash-the-system messages:

Aug 28 14:07:51 hoppa kernel: hda: timeout waiting for DMA
Aug 28 14:07:51 hoppa kernel: hda: ide_dma_timeout: Lets do it again!stat = 0xd0, dma_stat = 0x20
Aug 28 14:07:51 hoppa kernel: hda: DMA disabled
Aug 28 14:07:51 hoppa kernel: hda: irq timeout: status=0x80 { Busy }
Aug 28 14:07:51 hoppa kernel: hda: DMA disabled
Aug 28 14:07:51 hoppa kernel: hda: ide_set_handler: handler not null; old=c018f67c, new=c018f67c
Aug 28 14:07:51 hoppa kernel: bug: kernel timer added twice at c018f526.
Aug 28 14:07:53 hoppa kernel: ide0: reset: success


Note the second-to-last line: "bug: kernel timer added twice at
c018f526"
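
As an aside, the dma_stat value in the ide_dma_timeout line can be picked apart the same way as the drive registers. A rough sketch, assuming dma_stat is the bus-master IDE status register with the usual bit layout; the bit descriptions are mine, not taken from the driver:

```python
# Assumed layout of the bus-master IDE status register (one per channel).
BMIDE_STATUS_BITS = [
    (0x80, "simplex only"),
    (0x40, "drive 1 DMA capable"),
    (0x20, "drive 0 DMA capable"),
    (0x04, "interrupt"),
    (0x02, "error"),
    (0x01, "active"),
]


def decode_dma_stat(value):
    """Return descriptions of the bits set in a bus-master status byte."""
    return [name for mask, name in BMIDE_STATUS_BITS if value & mask]


if __name__ == "__main__":
    print(decode_dma_stat(0x20))  # dma_stat from the ide_dma_timeout message
```

If that reading is right, 0x20 means neither the interrupt nor the error bit was latched when the driver gave up, which would fit a transfer that simply never completed. I wouldn't read more into it than that.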


This occurred 4 minutes after a system reboot and some rsync activity.
It occurred this time when I was in the shell doing a cd [tab].

Other times the HDD might have just crashed again.


Note: this box is also acting as a router between three interfaces at
the same time with IPSec on one of these interfaces.   


Is this likely to make the situation worse?


Nicholas


* Re: Crashing with Abit KT7, 2.2.19+ide patches
  2001-08-28  2:16   ` Nicholas Lee
@ 2001-08-28 16:56     ` Tim Moore
  0 siblings, 0 replies; 6+ messages in thread
From: Tim Moore @ 2001-08-28 16:56 UTC (permalink / raw)
  To: Nicholas Lee; +Cc: linux-kernel

Nicholas Lee wrote:
> 
> I managed to catch one of these crash the system messages:
> 
> Aug 28 14:07:51 hoppa kernel: hda: timeout waiting for DMA
> Aug 28 14:07:51 hoppa kernel: hda: ide_dma_timeout: Lets do it again!stat = 0xd0, dma_stat = 0x20
> Aug 28 14:07:51 hoppa kernel: hda: DMA disabled
> Aug 28 14:07:51 hoppa kernel: hda: irq timeout: status=0x80 { Busy }
> Aug 28 14:07:51 hoppa kernel: hda: DMA disabled
> Aug 28 14:07:51 hoppa kernel: hda: ide_set_handler: handler not null; old=c018f67c, new=c018f67c
> Aug 28 14:07:51 hoppa kernel: bug: kernel timer added twice at c018f526.
> Aug 28 14:07:53 hoppa kernel: ide0: reset: success

It might be a side effect.  Never try to resolve more than one issue at
a time.

Remove all non-critical PCI cards and drives except the system disk,
then play with swapping IDE cables and generating disk I/O until there
are no more timeouts.
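
To compare one cable or card configuration against another, it helps to count the errors rather than eyeball the log. A minimal sketch that tallies the message types seen in this thread per drive; the regular expressions are written against the syslog lines quoted above and the log path in the usage note is an assumption:

```python
import re
from collections import Counter

# Patterns modelled on the kernel messages quoted in this thread.
PATTERNS = {
    "crc_error": re.compile(
        r"kernel: (hd[a-z]): dma_intr: error=0x[0-9a-f]+ \{[^}]*BadCRC"),
    "dma_timeout": re.compile(r"kernel: (hd[a-z]): timeout waiting for DMA"),
    "irq_timeout": re.compile(r"kernel: (hd[a-z]): irq timeout"),
}


def count_ide_events(lines):
    """Count IDE error events per (drive, kind) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), kind)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Aug 26 13:59:05 hoppa kernel: hda: dma_intr: error=0x84 "
        "{ DriveStatusError BadCRC }",
        "Aug 28 14:07:51 hoppa kernel: hda: timeout waiting for DMA",
        "Aug 28 14:07:51 hoppa kernel: hda: irq timeout: status=0x80 { Busy }",
    ]
    for (drive, kind), n in sorted(count_ide_events(sample).items()):
        print(drive, kind, n)
```

Run it over the real log after each change (for example count_ide_events(open("/var/log/syslog")), if that is where kernel messages end up) and change only one thing between runs.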
--

