* 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: Emmeran Seehuber @ 2007-02-07 17:17 UTC
To: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2350 bytes --]
Hi there,
we've got a database server machine running a 2.6.18.2 vanilla kernel on
Debian Etch. The database is MySQL 5. Everything works fine, but sometimes
the server "lags", i.e. it doesn't respond for 30 seconds. We've now
investigated the problem and found these messages in syslog (and dmesg):
15:55:44 omega11 kernel: ata1: port is slow to respond, please be patient
15:55:44 omega11 kernel: ata1: soft resetting port
15:55:44 omega11 kernel: ata1: port is slow to respond, please be patient
15:55:44 omega11 kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
15:55:44 omega11 kernel: ATA: abnormal status 0xD0 on port 0xFFFFC2000000401C
15:55:44 omega11 last message repeated 5 times
15:55:44 omega11 kernel: ata1.00: qc timeout (cmd 0xec)
15:55:44 omega11 kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
15:55:44 omega11 kernel: ata1: failed to recover some devices, retrying in 5 secs
15:55:44 omega11 kernel: ata1: hard resetting port
15:55:44 omega11 kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
15:55:44 omega11 kernel: ata1.00: configured for UDMA/133
15:55:44 omega11 kernel: ata1: EH complete
15:55:44 omega11 kernel: SCSI device sda: 293046768 512-byte hdwr sectors (150040 MB)
15:55:44 omega11 kernel: sda: Write Protect is off
15:55:44 omega11 kernel: SCSI device sda: drive cache: write back
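Editorial aside: the "SStatus 113 SControl 300" values in the log can be split into their fields. The following is a minimal Python sketch, assuming that libata prints both registers in hexadecimal and that the field layout follows the Serial ATA specification (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The helper name `decode_scr` and the description tables are illustrative, not taken from the thread.

```python
# Sketch: split a SATA SStatus/SControl value into DET/SPD/IPM fields.
# Assumption: the value in the kernel log is hexadecimal without a 0x prefix.

SSTATUS_DET = {
    0x0: "no device detected, phy communication not established",
    0x1: "device detected, phy communication not established",
    0x3: "device detected, phy communication established",
    0x4: "phy offline",
}
SSTATUS_SPD = {0x0: "no negotiated speed", 0x1: "Gen1 (1.5 Gbps)", 0x2: "Gen2 (3.0 Gbps)"}
SSTATUS_IPM = {0x0: "device not present", 0x1: "active", 0x2: "partial", 0x6: "slumber"}


def decode_scr(text):
    """Return the DET, SPD and IPM fields of a register printed as hex text."""
    value = int(text, 16)
    return {
        "det": value & 0xF,         # bits 3:0
        "spd": (value >> 4) & 0xF,  # bits 7:4
        "ipm": (value >> 8) & 0xF,  # bits 11:8
    }


def describe_sstatus(text):
    """Human-readable description of an SStatus value."""
    f = decode_scr(text)
    return (
        SSTATUS_DET.get(f["det"], "reserved"),
        SSTATUS_SPD.get(f["spd"], "reserved"),
        SSTATUS_IPM.get(f["ipm"], "reserved"),
    )


if __name__ == "__main__":
    print(decode_scr("113"), describe_sstatus("113"))
    print(decode_scr("300"))
```

Under these assumptions, SStatus 113 reads as DET=3, SPD=1, IPM=1, which matches the "SATA link up 1.5 Gbps" text on the same line. SControl 300 reads as IPM=3 with no speed restriction, i.e. transitions to the Partial and Slumber power states are disabled.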
We've got these messages up to 5 times a day for as far back as our syslogs reach.
It seems no kind of queuing is used:
# cat /sys/block/sda/device/queue_type
none
# cat /sys/block/sda/device/queue_depth
1
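Editorial aside: the same two sysfs attributes can be read programmatically, which is handy when comparing several machines. This is a minimal Python sketch; the function names and the `sysfs_root` parameter are hypothetical and exist only so the code can be pointed at a test directory.

```python
# Sketch: read queue_type and queue_depth for a block device from sysfs.
from pathlib import Path


def read_queue_settings(device, sysfs_root="/sys/block"):
    """Return queue_type/queue_depth as strings, or None if an attribute is missing."""
    dev_dir = Path(sysfs_root) / device / "device"
    settings = {}
    for name in ("queue_type", "queue_depth"):
        attr = dev_dir / name
        settings[name] = attr.read_text().strip() if attr.is_file() else None
    return settings


def uses_queuing(settings):
    """True only if a queue type other than 'none' is reported with depth > 1."""
    qtype = settings.get("queue_type")
    depth = settings.get("queue_depth")
    if qtype in (None, "none") or depth is None:
        return False
    return int(depth) > 1


if __name__ == "__main__":
    s = read_queue_settings("sda")
    print(s, "queuing in use:", uses_queuing(s))
```

With the values shown above (`none` and `1`), `uses_queuing` returns False.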
The server has been up for 91 days now and has low to medium load (depending
on the time of day). Since it's a production server located in a datacenter,
we can't just test some random kernel on it :(
Does somebody have a clue what's going on here? Could it be a hardware failure?
We have an identical machine using the same kernel. It's used as a webserver.
These messages show up there too, but not as often (10 times in 91 days of
uptime). If it is a hardware failure, then both machines would have been
affected by the same hardware problem.
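Editorial aside: to compare how often the two machines hit this, the recovery episodes can be tallied per host and day from syslog. This is a minimal Python sketch with two assumptions: the syslog lines carry the usual "Mon DD HH:MM:SS host kernel:" prefix (the excerpt above shows only the time), and one "EH complete" line is treated as a proxy for one recovery episode. The function name is hypothetical.

```python
# Sketch: count libata error-handling episodes per (host, day) in syslog lines.
import re
from collections import Counter

EH_DONE_RE = re.compile(
    r"^(?P<day>[A-Z][a-z]{2}\s+\d{1,2})\s+\d{2}:\d{2}:\d{2}\s+"
    r"(?P<host>\S+)\s+kernel:\s+(?P<port>ata\d+): EH complete"
)


def episodes_per_day(lines):
    """Return a Counter keyed by (host, 'Mon D') with the number of episodes."""
    counts = Counter()
    for line in lines:
        match = EH_DONE_RE.match(line)
        if match:
            # Collapse the padding syslog uses for single-digit days.
            day = " ".join(match.group("day").split())
            counts[(match.group("host"), day)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (path is an example): python count_eh.py /var/log/syslog
    with open(sys.argv[1], errors="replace") as fh:
        for (host, day), n in sorted(episodes_per_day(fh).items()):
            print(host, day, n)
```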
What can we do to fix this problem? Is it known?
I've found many posts related to SATA problems, but none seemed to be about
this problem.
Do you need additional information?
Thanks
cu,
Emmy
P.S.: Please CC me, since I'm not subscribed.
[-- Attachment #2: lspci.txt --]
[-- Type: text/plain, Size: 1565 bytes --]
00:01.0 PCI bridge: Broadcom HT1000 PCI/PCI-X bridge
00:02.0 Host bridge: Broadcom HT1000 Legacy South Bridge
00:02.1 IDE interface: Broadcom HT1000 Legacy IDE controller
00:02.2 ISA bridge: Broadcom HT1000 LPC Bridge
00:03.0 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:03.1 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:03.2 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:04.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
00:05.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
00:06.0 VGA compatible controller: XGI - Xabre Graphics Inc Volari Z7
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:0d.0 PCI bridge: Broadcom HT1000 PCI/PCI-X bridge (rev c0)
01:0e.0 RAID bus controller: Broadcom BCM5785 (HT1000) SATA Native SATA Mode
[-- Attachment #3: config.gz --]
[-- Type: application/x-gzip, Size: 11411 bytes --]
[-- Attachment #4: cpuinfo.txt --]
[-- Type: text/plain, Size: 3052 bytes --]
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
stepping : 2
cpu MHz : 2194.616
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 4390.69
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
stepping : 2
cpu MHz : 2194.616
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 4390.11
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
processor : 2
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
stepping : 2
cpu MHz : 2194.616
cache size : 1024 KB
physical id : 1
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 4393.11
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
processor : 3
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
stepping : 2
cpu MHz : 2194.616
cache size : 1024 KB
physical id : 1
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 4393.51
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
* Re: 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: Tejun Heo @ 2007-02-09 9:19 UTC
To: Emmeran Seehuber; +Cc: linux-kernel
Hi,
Emmeran Seehuber wrote:
> we've got a database server machine running a 2.6.18.2 vanilla kernel on
> Debian Etch. The database is MySQL 5. Everything works fine, but sometimes
> the server "lags", i.e. it doesn't respond for 30 seconds. We've now
> investigated the problem and found these messages in syslog (and dmesg):
>
> 15:55:44 omega11 kernel: ata1: port is slow to respond, please be patient
> 15:55:44 omega11 kernel: ata1: soft resetting port
> 15:55:44 omega11 kernel: ata1: port is slow to respond, please be patient
> 15:55:44 omega11 kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> 15:55:44 omega11 kernel: ATA: abnormal status 0xD0 on port 0xFFFFC2000000401C
> 15:55:44 omega11 last message repeated 5 times
> 15:55:44 omega11 kernel: ata1.00: qc timeout (cmd 0xec)
> 15:55:44 omega11 kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> 15:55:44 omega11 kernel: ata1: failed to recover some devices, retrying in 5 secs
> 15:55:44 omega11 kernel: ata1: hard resetting port
> 15:55:44 omega11 kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> 15:55:44 omega11 kernel: ata1.00: configured for UDMA/133
> 15:55:44 omega11 kernel: ata1: EH complete
> 15:55:44 omega11 kernel: SCSI device sda: 293046768 512-byte hdwr sectors (150040 MB)
> 15:55:44 omega11 kernel: sda: Write Protect is off
> 15:55:44 omega11 kernel: SCSI device sda: drive cache: write back
This is just the recovery part. I need more of the log. If possible, please
give 2.6.20 a shot. It might have fixed your problem, or at least
allow better diagnosis.
> We've got these messages up to 5 times a day for as far back as our syslogs reach.
>
> It seems no kind of queuing is used:
> # cat /sys/block/sda/device/queue_type
> none
> # cat /sys/block/sda/device/queue_depth
> 1
>
> The server has been up for 91 days now and has low to medium load (depending
> on the time of day). Since it's a production server located in a datacenter,
> we can't just test some random kernel on it :(
I see.
> Does somebody have a clue what's going on here? Could it be a hardware failure?
It might be. Quite a few SATA bug reports turn out to be hardware
problems, most commonly PSU issues.
> We have an identical machine using the same kernel. It's used as a webserver.
> These messages show up there too, but not as often (10 times in 91 days of
> uptime). If it is a hardware failure, then both machines would have been
> affected by the same hardware problem.
Hmmm...
> What can we do to fix this problem? Is it known?
>
> I've found many posts related to SATA problems, but none seemed to be about
> this problem.
>
> Do you need additional information?
Yeah, please post the content of /var/log/boot.msg if available and the
result of dmesg and lspci -nn.
--
tejun
* Re: 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: Emmeran Seehuber @ 2007-02-09 11:37 UTC
To: Tejun Heo; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1003 bytes --]
On Friday 09 February 2007, Tejun Heo wrote:
> Hi,
>
> This is just the recovery part. I need more of the log. If possible, please
> give 2.6.20 a shot. It might have fixed your problem, or at least
> allow better diagnosis.
>
I'll look into getting 2.6.20 onto the machine, but it might take some time
until we can do this.
> > Does somebody have a clue what's going on here? Could it be a hardware
> > failure?
>
> It might be. Quite a few SATA bug reports turn out to be hardware
> problems, most commonly PSU issues.
The power supply unit (that's what you meant by PSU, isn't it?) is rated at
800 watts, so it should be powerful enough for one hard disk and no graphics board.
> > Do you need additional information?
>
> Yeah, please post the content of /var/log/boot.msg if available and the
> result of dmesg and lspci -nn.
We don't have a /var/log/boot.msg, but it seems the boot messages were saved
in /var/log/dmesg, so I attached it.
Thanks for your effort.
cu,
Emmy
[-- Attachment #2: lspci-nn.txt --]
[-- Type: text/plain, Size: 1945 bytes --]
00:01.0 PCI bridge [0604]: Broadcom HT1000 PCI/PCI-X bridge [1166:0036]
00:02.0 Host bridge [0600]: Broadcom HT1000 Legacy South Bridge [1166:0205]
00:02.1 IDE interface [0101]: Broadcom HT1000 Legacy IDE controller [1166:0214]
00:02.2 ISA bridge [0601]: Broadcom HT1000 LPC Bridge [1166:0234]
00:03.0 USB Controller [0c03]: Broadcom HT1000 USB Controller [1166:0223] (rev 01)
00:03.1 USB Controller [0c03]: Broadcom HT1000 USB Controller [1166:0223] (rev 01)
00:03.2 USB Controller [0c03]: Broadcom HT1000 USB Controller [1166:0223] (rev 01)
00:04.0 Ethernet controller [0200]: Intel Corporation 82541GI/PI Gigabit Ethernet Controller [8086:1076] (rev 05)
00:05.0 Ethernet controller [0200]: Intel Corporation 82541GI/PI Gigabit Ethernet Controller [8086:1076] (rev 05)
00:06.0 VGA compatible controller [0300]: XGI - Xabre Graphics Inc Volari Z7 [18ca:0020]
00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration [1022:1100]
00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map [1022:1101]
00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller [1022:1102]
00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control [1022:1103]
00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration [1022:1100]
00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map [1022:1101]
00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller [1022:1102]
00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control [1022:1103]
01:0d.0 PCI bridge [0604]: Broadcom HT1000 PCI/PCI-X bridge [1166:0104] (rev c0)
01:0e.0 RAID bus controller [0104]: Broadcom BCM5785 (HT1000) SATA Native SATA Mode [1166:024a]
[-- Attachment #3: dmesg.txt.gz --]
[-- Type: application/x-gzip, Size: 1191 bytes --]
[-- Attachment #4: boot.dmesg.txt.gz --]
[-- Type: application/x-gzip, Size: 7357 bytes --]
* Re: 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: Tejun Heo @ 2007-02-09 13:54 UTC
To: Emmeran Seehuber; +Cc: linux-kernel
Emmeran Seehuber wrote:
>>> Does somebody have a clue what's going on here? Could it be a hardware
>>> failure?
>> It might be. Quite a few SATA bug reports turn out to be hardware
>> problems, most commonly PSU issues.
>
> The power supply unit (that's what you meant by PSU, isn't it?) is rated at
> 800 watts, so it should be powerful enough for one hard disk and no graphics board.
I see.
>>> Do you need additional information?
>> Yeah, please post the content of /var/log/boot.msg if available and the
>> result of dmesg and lspci -nn.
>
> We don't have a /var/log/boot.msg, but it seems the boot messages were saved
> in /var/log/dmesg, so I attached it.
Yep, that's exactly what I wanted. So, the driver is sata_svw and the
errors are timeouts for both reads and writes with the BMDMA engine still
running. It looks like transmission errors to me. Can you post the
result of 'smartctl -a /dev/sdX'?
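Editorial aside: the `err_mask=0x4` and `abnormal status 0xD0` values from the log earlier in the thread can be decoded bit by bit. This is a minimal Python sketch. The `AC_ERR_*` bit values are given as recalled from `include/linux/libata.h` and should be checked against the kernel tree actually in use; the status bit names follow the ATA status register layout. The helper `decode_bits` is hypothetical.

```python
# Sketch: decode a libata err_mask and an ATA status register value.
# Assumption: AC_ERR_* values match include/linux/libata.h of the running kernel.

AC_ERR_BITS = {
    0x001: "AC_ERR_DEV",       # device reported an error
    0x002: "AC_ERR_HSM",       # host state machine violation
    0x004: "AC_ERR_TIMEOUT",   # command timed out
    0x008: "AC_ERR_MEDIA",     # media error
    0x010: "AC_ERR_ATA_BUS",   # ATA bus error
    0x020: "AC_ERR_HOST_BUS",  # host bus error
    0x040: "AC_ERR_SYSTEM",    # system error
    0x080: "AC_ERR_INVALID",   # invalid argument
    0x100: "AC_ERR_OTHER",     # unknown error
}

ATA_STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",
    0x08: "DRQ",
    0x04: "CORR",
    0x02: "IDX",
    0x01: "ERR",
}


def decode_bits(value, table):
    """List the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True) if value & bit]


if __name__ == "__main__":
    print(decode_bits(0x4, AC_ERR_BITS))
    print(decode_bits(0xD0, ATA_STATUS_BITS))
```

Under these assumptions, `err_mask=0x4` is a plain timeout, consistent with the "qc timeout" line, and 0xD0 has BSY set. While BSY is set the remaining status bits are not meaningful, so the value mainly says that the device was still busy.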
--
tejun
* Re: 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: Emmeran Seehuber @ 2007-02-09 17:09 UTC
To: Tejun Heo; +Cc: linux-kernel
On Friday 09 February 2007, Tejun Heo wrote:
> Yep, that's exactly what I wanted. So, the driver is sata_svw and the
> errors are timeouts for both reads and writes with the BMDMA engine still
> running. It looks like transmission errors to me. Can you post the
> result of 'smartctl -a /dev/sdX'?
Here it is:
-->
# smartctl -a /dev/sda
smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce
Allen
Home page is http://smartmontools.sourceforge.net/
Device: ATA WDC WD1500ADFD-0 Version: 20.0
Serial number: WD-WMAP41246348
Device type: disk
Local Time is: Fri Feb 9 18:06:23 2007 CET
Device does not support SMART
Error Counter logging not supported
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
Device does not support Self Test logging
<--
cu,
Emmy
* Re: 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: Tejun Heo @ 2007-02-10 6:49 UTC
To: Emmeran Seehuber; +Cc: linux-kernel
Emmeran Seehuber wrote:
> # smartctl -a /dev/sda
> smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce
> Allen
> Home page is http://smartmontools.sourceforge.net/
>
> Device: ATA WDC WD1500ADFD-0 Version: 20.0
> Serial number: WD-WMAP41246348
> Device type: disk
> Local Time is: Fri Feb 9 18:06:23 2007 CET
> Device does not support SMART
Hmmm... Raptor not supporting SMART. That's weird. Please try
'smartctl -d ata -a /dev/sda'.
Thanks.
--
tejun
* Re: 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: Emmeran Seehuber @ 2007-02-10 8:42 UTC
To: Tejun Heo; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 192 bytes --]
On Saturday 10 February 2007, Tejun Heo wrote:
> Hmmm... Raptor not supporting SMART. That's weird. Please try
> 'smartctl -d ata -a /dev/sda'.
The output is attached.
cu,
Emmy
[-- Attachment #2: smartctl.txt --]
[-- Type: text/plain, Size: 4145 bytes --]
smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: WDC WD1500ADFD-00NLR1
Serial Number: WD-WMAP41246348
Firmware Version: 20.07P20
User Capacity: 150.039.945.216 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 7
ATA Standard is: Not recognized. Minor revision code: 0x1d
Local Time is: Sat Feb 10 09:31:27 2007 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (4783) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 72) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0007 170 170 021 Pre-fail Always - 4533
4 Start_Stop_Count 0x0032 100 100 040 Old_age Always - 59
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x000a 200 200 051 Old_age Always - 0
9 Power_On_Hours 0x0032 097 097 000 Old_age Always - 2232
10 Spin_Retry_Count 0x0012 100 253 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0012 100 253 051 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 59
194 Temperature_Celsius 0x0022 118 116 000 Old_age Always - 29
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0012 200 200 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x000a 200 253 000 Old_age Always - 5
200 Multi_Zone_Error_Rate 0x0008 200 200 051 Old_age Offline - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
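Editorial aside: the attribute table in output like the above can be parsed to pick out non-zero raw counters. This is a minimal Python sketch, assuming the ten-column layout printed by this smartctl version; the function names and the default watch list are hypothetical. In the attached output the only watched counter with a non-zero raw value is UDMA_CRC_Error_Count (5). That attribute is conventionally an interface CRC error count, but whether it relates to the resets is not established in this thread.

```python
# Sketch: parse the "Vendor Specific SMART Attributes" table of `smartctl -a`.
# Assumption: each row has ten whitespace-separated columns, as in smartctl 5.36.


def parse_smart_attributes(text):
    """Return {attribute_name: {"id", "value", "worst", "thresh", "raw"}}."""
    attrs = {}
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["ID#", "ATTRIBUTE_NAME"]:
            in_table = True
            continue
        if not in_table:
            continue
        if len(fields) < 10 or not fields[0].isdigit():
            break  # first line that is not an attribute row ends the table
        raw = fields[9]
        attrs[fields[1]] = {
            "id": int(fields[0]),
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": int(raw) if raw.isdigit() else raw,
        }
    return attrs


def nonzero_counters(attrs, names=("Reallocated_Sector_Ct", "Current_Pending_Sector",
                                   "Offline_Uncorrectable", "UDMA_CRC_Error_Count")):
    """Return the watched attributes whose raw value is a non-zero integer."""
    return {n: attrs[n]["raw"] for n in names
            if n in attrs and isinstance(attrs[n]["raw"], int) and attrs[n]["raw"] > 0}


if __name__ == "__main__":
    import sys

    # Usage: smartctl -d ata -a /dev/sda | python smart_table.py
    print(nonzero_counters(parse_smart_attributes(sys.stdin.read())))
```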
* Re: 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: Tejun Heo @ 2007-02-11 22:19 UTC
To: Emmeran Seehuber; +Cc: linux-kernel
Hello, Emmeran.
There is no logged error on the drive's side, only timeouts on the host's
side with the BMDMA engine running. I don't know the specifics of the
ServerWorks controller, but many controllers with a BMDMA interface time out
the command if a transmission failure occurs, so my primary suspect is still
a hardware transmission problem, which seems quite common in the SATA world.
Can you try the following?
1. Use a different cable and connect the hdd to a different port.
2. If possible, connect the hard disk to a different power supply. (I know
you have a juicy PSU, but just in case.)
Applying this to only one machine and leaving the other alone as a control
is probably a good idea.
Thanks.
--
tejun
* Re: 2.6.18.2: sporadic SATA port resets (Broadcom BCM5785 (HT1000))
From: koan @ 2007-02-09 17:56 UTC
To: linux-kernel
I believe you need to add the flag '-d ata' to the smartctl command in
order to see the SMART status of a SATA device.
-Jesse
-------------------------------------------------------------
here it is:
-->
# smartctl -a /dev/sda
smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce
Allen
Home page is http://smartmontools.sourceforge.net/
Device: ATA WDC WD1500ADFD-0 Version: 20.0
Serial number: WD-WMAP41246348
Device type: disk
Local Time is: Fri Feb 9 18:06:23 2007 CET
Device does not support SMART
Error Counter logging not supported
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
Device does not support Self Test logging
<--
cu,
Emmy