SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14

* SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14
@ 2006-12-07 16:08 Steve Graham
  2006-12-12  1:21 ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Steve Graham @ 2006-12-07 16:08 UTC (permalink / raw)
  To: jgarzik; +Cc: linux-ide

Hi Jeff,

My name is Steve Graham and I work for a small
startup.  Our company is developing a server board
with the Silicon Image 3512 and we are getting some
strange lockups during high levels of disk activity. 
The test I'm currently running to cause the problem is
to run the following concurrently: 'nbench',
'tiobench', and an 'scp' of a 200Meg file to the sata
drive.  Every so often I will get the following
message:

ata1: status=0x51 { DriveReady SeekComplete Error }
  ata1: error=0x04 { DriveStatusError }

This doesn't mean the drive is locked up and doesn't
appear to have any side effects on its own but
eventually I will get the above message that is
immediately followed by the next block of messages
that do result in a lockup:

ata1: command 0x35 timeout, stat 0xd1 host_stat 0x1
  ata1: status=0xd1 { Busy }
  sd 0:0:0:0: SCSI error: return code = 0x8000002
  sda: Current: sense key=0xb
      ASC=0x47 ASCQ=0x0
  end_request: I/O error, dev sda, sector 17033103
  ata1: Abnormal status 0xD1 on port 0xC001E087
  ata1: Alternate status 0xD1 on port 0xC001E08A
  ata1: Error 0xd1
  ata1: Abnormal status 0xD1 on port 0xC001E087
  ata1: Alternate status 0xD1 on port 0xC001E08A
  ata1: Error 0xd1
  ata1: Abnormal status 0xD1 on port 0xC001E087
  ata1: Alternate status 0xD1 on port 0xC001E08A

These messages will repeat every 30 seconds and at
this point all disk accesses are locked.

This problem has occurred with both a Seagate and
Western Digital drive so I don't think I've got drive
issues.  Unfortunately I'm not at my desk or I would
tell you the model numbers on the above drives.

It looks like after the DriveStatusError the drive
will sometimes report that it is in the 'Busy' state
and it never returns from that state.  That causes the
WRITE_REQ ATA command to always time out waiting for
the drive to become ready.
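
[Note for readers of the archive: the two status bytes in the messages
above can be decoded by hand.  The following is a minimal sketch, not
code from the driver; the function name decode_status and the short bit
labels are illustrative assumptions, while the bit positions follow the
ATA status register layout.  The comments give the names the old libata
messages print in braces.  When BSY is set, the remaining bits are
generally not considered valid, so 0xd1 mostly says "still busy".]

```python
# Sketch: decode an ATA status register byte into bit names.
# decode_status and the short labels are hypothetical names chosen for
# this example; only the bit positions come from the ATA register layout.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # Busy
    (0x40, "DRDY"),  # DriveReady
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # SeekComplete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete)
    (0x02, "IDX"),   # index (obsolete)
    (0x01, "ERR"),   # Error
]


def decode_status(value):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if value & mask]


if __name__ == "__main__":
    print(hex(0x51), decode_status(0x51))  # first message: ready, error flagged
    print(hex(0xD1), decode_status(0xD1))  # lockup: same bits plus BSY
```

[The only difference between the harmless 0x51 and the 0xd1 seen during
the lockup is the BSY bit.]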

From reading the forums, it looks like this isn't a
new problem but it also looks like it was a problem
that should have been resolved in earlier versions of
the driver.

Do you know of any other fixes that have been made to
this driver that I can try?  If not, do you have any
suggestions on where I may start to look to try to
debug this problem?  Also, is there a safe way to
'force' the drive out of this 'busy' state that will
not result in us losing data?

Thanks,

Steve Graham... 


* Re: SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14
  2006-12-07 16:08 SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14 Steve Graham
@ 2006-12-12  1:21 ` Tejun Heo
  2006-12-30 18:53   ` Steve Graham
  2007-02-03 20:14   ` Fredrik Rinnestam
  0 siblings, 2 replies; 8+ messages in thread
From: Tejun Heo @ 2006-12-12  1:21 UTC (permalink / raw)
  To: Steve Graham; +Cc: jgarzik, linux-ide

Hello,

Steve Graham wrote:
> My name is Steve Graham and I work for a small
> startup.  Our company is developing a server board
> with the Silicon Image 3512 and we are getting some
> strange lockups during high levels of disk activity. 
> The test I'm currently running to cause the problem is
> to run the following concurrently: 'nbench',
> 'tiobench', and an 'scp' of a 200Meg file to the sata
> drive.  Every so often I will get the following
> message:
> 
> ata1: status=0x51 { DriveReady SeekComplete Error }
>   ata1: error=0x04 { DriveStatusError }

Which kernel version are you running?

> This doesn't mean the drive is locked up and doesn't
> appear to have any side effects on its own but
> eventually I will get the above message that is
> immediately followed by the next block of messages
> that do result in a lockup:
> 
> ata1: command 0x35 timeout, stat 0xd1 host_stat 0x1
>   ata1: status=0xd1 { Busy }
>   sd 0:0:0:0: SCSI error: return code = 0x8000002
>   sda: Current: sense key=0xb
>       ASC=0x47 ASCQ=0x0
>   end_request: I/O error, dev sda, sector 17033103
>   ata1: Abnormal status 0xD1 on port 0xC001E087
>   ata1: Alternate status 0xD1 on port 0xC001E08A
>   ata1: Error 0xd1
>   ata1: Abnormal status 0xD1 on port 0xC001E087
>   ata1: Alternate status 0xD1 on port 0xC001E08A
>   ata1: Error 0xd1
>   ata1: Abnormal status 0xD1 on port 0xC001E087
>   ata1: Alternate status 0xD1 on port 0xC001E08A

This is a message from the old error handling and doesn't really contain
much useful info.  Even if you have to use a previous kernel in the
production system, providing error messages from 2.6.19 will help chase
down the cause.

-- 
tejun


* Re: SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14
  2006-12-12  1:21 ` Tejun Heo
@ 2006-12-30 18:53   ` Steve Graham
  2007-02-03 20:14   ` Fredrik Rinnestam
  1 sibling, 0 replies; 8+ messages in thread
From: Steve Graham @ 2006-12-30 18:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: jgarzik, linux-ide

Hi Tejun,

Sorry it took some time to respond.  I went away for
the holidays and just returned yesterday.

We were using linux 2.6.14.  I saw a message on one of
the forums that suggested moving to linux 2.6.18
because of improved error handling to fix this
problem.  I tried that by moving the entire SCSI
framework from 2.6.18 into 2.6.14 (The full 2.6.18 is
not stable on our platform).

Anyhow, after doing this difficult task I managed to
get rid of the lockups but I still get error messages
and drive 'stalls'.  Unfortunately, I don't have them
recorded anywhere because the error messages don't
seem to hurt anything.  The drive locks for about 30
seconds, the driver does a 'soft reset' and then the
drive comes back alive.  It's far from optimal but at
least the system is usable.

I will try to repeat the test and get the error
messages again so I can send them to you but if you
have any ideas before then please let me know.

Cheers,

Steve...


* Re: SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14
  2006-12-12  1:21 ` Tejun Heo
  2006-12-30 18:53   ` Steve Graham
@ 2007-02-03 20:14   ` Fredrik Rinnestam
  2007-02-03 23:17     ` Fredrik Rinnestam
  1 sibling, 1 reply; 8+ messages in thread
From: Fredrik Rinnestam @ 2007-02-03 20:14 UTC (permalink / raw)
  To: linux-ide

> > ata1: command 0x35 timeout, stat 0xd1 host_stat 0x1
> >   ata1: status=0xd1 { Busy }
> >   sd 0:0:0:0: SCSI error: return code = 0x8000002
> >   sda: Current: sense key=0xb
> >       ASC=0x47 ASCQ=0x0
> >   end_request: I/O error, dev sda, sector 17033103
> >   ata1: Abnormal status 0xD1 on port 0xC001E087
> >   ata1: Alternate status 0xD1 on port 0xC001E08A
> >   ata1: Error 0xd1
> >   ata1: Abnormal status 0xD1 on port 0xC001E087
> >   ata1: Alternate status 0xD1 on port 0xC001E08A
> >   ata1: Error 0xd1
> >   ata1: Abnormal status 0xD1 on port 0xC001E087
> >   ata1: Alternate status 0xD1 on port 0xC001E08A
> 
> This is a message from the old error handling and doesn't really contain
> much useful info.  Even if you have to use a previous kernel in the
> production system, providing error messages from 2.6.19 will help chase
> down the cause.
> 
> -- 
> tejun

I get very similar error-messages with my SiI 3512 controller. I just
noticed this while connecting a second drive and moving data from sdb to
sda. Copying stuff from the internal ide-controller (nvidia) to sda does
not produce any errors. I've also tried different cables without any
luck. Kernel is 2.6.19.2

This is what dmesg pukes out:

EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x0
ata1.00: (BMDMA stat 0x60)
ata1.00: tag 0 cmd 0x35 Emask 0x1 stat 0x51 err 0x4 (device error)
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x0
ata1.00: (BMDMA stat 0x60)
ata1.00: tag 0 cmd 0x35 Emask 0x1 stat 0x51 err 0x4 (device error)
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x2 frozen
ata1.00: (BMDMA stat 0x61)
ata1.00: tag 0 cmd 0x35 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: port is slow to respond, please be patient (Status 0xd1)
ata1: port failed to respond (30 secs, Status 0xd1)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x2 frozen
ata1.00: (BMDMA stat 0x61)
ata1.00: tag 0 cmd 0x35 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: port is slow to respond, please be patient (Status 0xd1)
ata1: port failed to respond (30 secs, Status 0xd1)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x2 frozen
ata1.00: (BMDMA stat 0x61)
ata1.00: tag 0 cmd 0x35 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: port is slow to respond, please be patient (Status 0xd1)
ata1: port failed to respond (30 secs, Status 0xd1)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/66
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x2 frozen
ata1.00: (BMDMA stat 0x61)
ata1.00: tag 0 cmd 0x35 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: port is slow to respond, please be patient (Status 0xd1)
ata1: port failed to respond (30 secs, Status 0xd1)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: configured for UDMA/66
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x0
ata1.00: (BMDMA stat 0x60)
ata1.00: tag 0 cmd 0x35 Emask 0x1 stat 0x51 err 0x4 (device error)
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/44
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x2 frozen
ata1.00: (BMDMA stat 0x61)
ata1.00: tag 0 cmd 0x35 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: port is slow to respond, please be patient (Status 0xd1)
ata1: port failed to respond (30 secs, Status 0xd1)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: configured for UDMA/44
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/33
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x2 frozen
ata1.00: (BMDMA stat 0x61)
ata1.00: tag 0 cmd 0x35 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: port is slow to respond, please be patient (Status 0xd1)
ata1: port failed to respond (30 secs, Status 0xd1)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: configured for UDMA/33
ata1: EH complete
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
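
[Note for readers of the archive: the SErr value in the exception lines
above is a bitmask of the SATA SError register.  The following is a
minimal sketch for decoding it; the function name decode_serr and the
short labels are illustrative assumptions, while the bit positions
follow the SError register definition.  It is a reading aid, not a
diagnosis.]

```python
# Sketch: decode a SATA SError value as printed in libata "SErr 0x..." lines.
# decode_serr and the labels are hypothetical names for this example; the
# bit numbers are those of the SError register (ERR field in bits 0-15,
# DIAG field in bits 16-31).
SERR_BITS = {
    0: "RecovData",    # recovered data integrity error
    1: "RecovComm",    # recovered communications error
    8: "UnrecovData",  # non-recovered transient data integrity error
    9: "Persist",      # non-recovered persistent communication error
    10: "Proto",       # protocol error
    11: "HostInt",     # internal error
    16: "PhyRdyChg",   # PhyRdy change
    17: "PhyInt",      # Phy internal error
    18: "CommWake",    # Comm wake detected
    19: "10B8BErr",    # 10b to 8b decode error
    20: "Dispar",      # disparity error
    21: "BadCRC",      # CRC error
    22: "Handshk",     # handshake error
    23: "LinkSeq",     # link sequence error
    24: "TrStaTrns",   # transport state transition error
    25: "UnrecFIS",    # unrecognized FIS type
    26: "DevExch",     # device exchanged
}


def decode_serr(value):
    """Return labels for every known SError bit set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    print(hex(0x2400000), decode_serr(0x2400000))
```

[Under this reading, 0x2400000 has the handshake-error and
unrecognized-FIS bits set, both of which are link-level diagnostics
rather than errors reported by the drive itself.]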

01:09.0 Mass storage controller: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)
	Subsystem: Silicon Image, Inc. SiI 3512 SATALink Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32, Cache Line Size: 32 bytes
	Interrupt: pin A routed to IRQ 11
	Region 0: I/O ports at c800 [size=8]
	Region 1: I/O ports at cc00 [size=4]
	Region 2: I/O ports at d000 [size=8]
	Region 3: I/O ports at d400 [size=4]
	Region 4: I/O ports at d800 [size=16]
	Region 5: Memory at e3041000 (32-bit, non-prefetchable) [size=512]
	[virtual] Expansion ROM at 50000000 [disabled] [size=512K]
	Capabilities: [60] Power Management version 2
		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=2 PME-
-- 
Fredrik Rinnestam


* Re: SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14
  2007-02-03 20:14   ` Fredrik Rinnestam
@ 2007-02-03 23:17     ` Fredrik Rinnestam
  2007-02-06  7:34       ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Fredrik Rinnestam @ 2007-02-03 23:17 UTC (permalink / raw)
  To: linux-ide

> 
> I get very similar error-messages with my SiI 3512 controller. I just
> noticed this while connecting a second drive and moving data from sdb to
> sda. Copying stuff from the internal ide-controller (nvidia) to sda does
> not produce any errors. I've also tried different cables without any
> luck. Kernel is 2.6.19.2
> 
uhm, scratch that:

this is from normal operation on the second drive (fileserver):

ata2: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: failed to recover some devices, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: failed to recover some devices, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2.00: disabled
ata2: soft resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH pending after completion, repeating EH (cnt=4)
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x1
ata2: soft resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2.00: detaching (SCSI 1:0:0:0)
Synchronizing SCSI cache for disk sdb: 
FAILED
  status = 0, message = 00, host = 4, driver = 00
  <3>ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 310)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0x2 frozen
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: hard resetting port
ata2: COMRESET failed (device not ready)
ata2: reset failed, giving up
ata2: EH complete
scsi 1:0:0:0: rejecting I/O to dead device
scsi 1:0:0:0: rejecting I/O to dead device
scsi 1:0:0:0: rejecting I/O to dead device
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
scsi 1:0:0:0: rejecting I/O to dead device
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
scsi 1:0:0:0: rejecting I/O to dead device
scsi 1:0:0:0: rejecting I/O to dead device
scsi 1:0:0:0: rejecting I/O to dead device
scsi 1:0:0:0: rejecting I/O to dead device
scsi 1:0:0:0: rejecting I/O to dead device
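
[Note for readers of the archive: the SStatus and SControl values in the
"SATA link up/down" lines are hexadecimal and pack three 4-bit fields
each.  The following is a minimal sketch for splitting them; the
function name split_scr is an illustrative assumption, while the field
layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) follows the
SATA status and control register definitions.]

```python
# Sketch: split a SATA SStatus or SControl value into its DET/SPD/IPM fields.
# split_scr is a hypothetical helper name for this example. libata prints
# these registers in hex without a 0x prefix, so "SStatus 113" is 0x113.


def split_scr(value):
    """Return the three low 4-bit fields of an SStatus/SControl value."""
    return {
        "DET": value & 0xF,         # device detection
        "SPD": (value >> 4) & 0xF,  # negotiated speed (SStatus) or speed limit (SControl)
        "IPM": (value >> 8) & 0xF,  # interface power management state or restrictions
    }


if __name__ == "__main__":
    for label, text in (("SStatus", "113"), ("SStatus", "0"), ("SControl", "310")):
        print(label, text, split_scr(int(text, 16)))
```

[Read this way, SStatus 113 is DET=3 (device present, Phy communication
established), SPD=1 (1.5 Gbps) and IPM=1 (active), whereas SStatus 0
means the controller detects no device at all on the port.  SControl 310
limits the speed to generation 1 and disables the Partial and Slumber
power states.]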

-- 
Fredrik Rinnestam


* Re: SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14
  2007-02-03 23:17     ` Fredrik Rinnestam
@ 2007-02-06  7:34       ` Tejun Heo
  2007-02-06 17:55         ` Fredrik Rinnestam
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2007-02-06  7:34 UTC (permalink / raw)
  To: Fredrik Rinnestam; +Cc: linux-ide

Fredrik Rinnestam wrote:
>> I get very similar error-messages with my SiI 3512 controller. I just
>> noticed this while connecting a second drive and moving data from sdb to
>> sda. Copying stuff from the internal ide-controller (nvidia) to sda does
>> not produce any errors. I've also tried different cables without any
>> luck. Kernel is 2.6.19.2
>>
> uhm, scratch that:
> 
> this is from normal operation on the second drive (fileserver):

Sorry but can't really find anything specific in your log.  Please...

1. give 2.6.20 a shot.  It will give us yet more info about errors.

2. turn on "Kernel hacking -> Show timing information on printks."

3. report the result of "lspci -nn"

4. try to connect hard drives to a separate power supply or use other
hardware debugging tactics.  There have been a lot of SATA bug reports
which turned out to be hardware problems.  SATA seems to be the first to
get hit when there is a generic hardware problem (e.g. insufficient / bad
power supply).

Thanks.

-- 
tejun


* Re: SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14
  2007-02-06  7:34       ` Tejun Heo
@ 2007-02-06 17:55         ` Fredrik Rinnestam
  2007-02-12  1:03           ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Fredrik Rinnestam @ 2007-02-06 17:55 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide

> Sorry but can't really find anything specific in your log.  Please...
> 
> 1. give 2.6.20 a shot.  It will give us yet more info about errors.
> 
> 2. turn on "Kernel hacking -> Show timing information on printks."
> 
> 3. report the result of "lspci -nn"
> 
> 4. try to connect hard drives to a separate power supply or use other
> hardware debugging tactics.  There have been a lot of SATA bug reports
> which turned out to be hardware problems.  SATA seems to be the first to
> get hit when there is a generic hardware problem (e.g. insufficient / bad
> power supply).
> 

I replaced the SiL card with a Promise SATA 300 TX4 - no errors.

I plugged the SiL card in another computer and managed to produce the
same errors.

lspci -nn

00:00.0 Host bridge [0600]: Intel Corporation 82P965/G965 Memory Controller Hub [8086:29a0] (rev 02)
00:01.0 PCI bridge [0604]: Intel Corporation 82P965/G965 PCI Express Root Port [8086:29a1] (rev 02)
00:1a.0 USB Controller [0c03]: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 [8086:2834] (rev 02)
00:1a.1 USB Controller [0c03]: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 [8086:2835] (rev 02)
00:1a.7 USB Controller [0c03]: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 [8086:283a] (rev 02)
00:1b.0 Audio device [0403]: Intel Corporation 82801H (ICH8 Family) HD Audio Controller [8086:284b] (rev 02)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 [8086:283f] (rev 02)
00:1c.5 PCI bridge [0604]: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 [8086:2849] (rev 02)
00:1d.0 USB Controller [0c03]: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 [8086:2830] (rev 02)
00:1d.1 USB Controller [0c03]: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 [8086:2831] (rev 02)
00:1d.2 USB Controller [0c03]: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 [8086:2832] (rev 02)
00:1d.7 USB Controller [0c03]: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 [8086:2836] (rev 02)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev f2)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller [8086:2810] (rev 02)
00:1f.2 SATA controller [0106]: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA AHCI Controller [8086:2821] (rev 02)
00:1f.3 SMBus [0c05]: Intel Corporation 82801H (ICH8 Family) SMBus Controller [8086:283e] (rev 02)
01:00.0 VGA compatible controller [0300]: nVidia Corporation G70 [GeForce 7300 GT] [10de:0393] (rev a1)
02:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller [11ab:4364] (rev 12)
04:02.0 Mass storage controller [0180]: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller [1095:3512] (rev 01)
04:03.0 FireWire (IEEE 1394) [0c00]: Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link) [104c:8023]
04:04.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller [11ab:4320] (rev 14)

dmesg can be found at http://fredrik.obra.se/dmesg-fubar

-- 
Fredrik Rinnestam


* Re: SIL3512 lockup problem using driver verion 0.9 and Linux 2.6.14
  2007-02-06 17:55         ` Fredrik Rinnestam
@ 2007-02-12  1:03           ` Tejun Heo
  0 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2007-02-12  1:03 UTC (permalink / raw)
  To: Fredrik Rinnestam; +Cc: linux-ide

Fredrik Rinnestam wrote:
>> Sorry but can't really find anything specific in your log.  Please...
>>
>> 1. give 2.6.20 a shot.  It will give us yet more info about errors.
>>
>> 2. turn on "Kernel hacking -> Show timing information on printks."
>>
>> 3. report the result of "lspci -nn"
>>
>> 4. try to connect hard drives to a separate power supply or use other
>> hardware debugging tactics.  There have been a lot of SATA bug reports
>> which turned out to be hardware problems.  SATA seems to be the first to
>> get hit when there is a generic hardware problem (e.g. insufficient / bad
>> power supply).
>>
> 
> I replaced the SiL card with a Promise SATA 300 TX4 - no errors.
> 
> I plugged the SiL card in another computer and managed to produce the
> same errors.

It just seems you have a bad controller.  Time for RMA?

-- 
tejun


