* Scsi errors with Megaraid 300-8x
@ 2006-08-22 14:45 Johan Groth
  2006-08-23 15:27 ` Mark Lord
  0 siblings, 1 reply; 11+ messages in thread
From: Johan Groth @ 2006-08-22 14:45 UTC (permalink / raw)
  To: linux-kernel

Hi,
Ever since I upgraded my server from a dual Opteron 244 (mobo Tyan 2885) 
system to a dual dual-core Opteron 285 (mobo Tyan 2895) system, I've been 
getting read errors that freeze the system, which has stopped my disk-based 
backup software (faubackup) from working. I think it is faubackup 
that triggers the bug.

I get these errors in the log:
Aug 20 06:35:08 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
Aug 20 06:35:56 jaguar kernel: end_request: I/O error, dev sda, sector 616924530
Aug 20 06:36:03 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
Aug 20 06:36:03 jaguar kernel: end_request: I/O error, dev sda, sector 616924538
Aug 20 06:36:03 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
Aug 20 06:36:03 jaguar kernel: end_request: I/O error, dev sda, sector 616924546
Aug 20 06:36:03 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
Aug 20 06:36:03 jaguar kernel: end_request: I/O error, dev sda, sector 616924554
Aug 20 06:36:07 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
Aug 20 06:36:07 jaguar kernel: end_request: I/O error, dev sda, sector 616924562
Aug 20 06:36:07 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
Aug 20 06:36:07 jaguar kernel: end_request: I/O error, dev sda, sector 616924570
Aug 20 06:36:07 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
Aug 20 06:36:07 jaguar kernel: end_request: I/O error, dev sda, sector 616924578
Aug 20 06:36:07 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
Aug 20 06:36:07 jaguar kernel: end_request: I/O error, dev sda, sector 616924538
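
For reference, the return code in these lines can be split into its fields. What follows is only a small sketch, assuming the usual Linux SCSI midlayer layout of the 32-bit result word (status byte in bits 0-7, message byte in bits 8-15, host byte in bits 16-23, driver byte in bits 24-31) and the DID_* host byte values from the kernel's SCSI headers. The function and table names are made up for illustration, and what the megaraid driver means by the low byte 0x01 is not something this sketch can tell you.

```python
# Host byte values as defined by the Linux SCSI midlayer (DID_* constants).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four byte-wide fields.

    Hypothetical helper, not a kernel or library function.
    """
    return {
        "status_byte": result & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "host_byte": (result >> 16) & 0xFF,
        "driver_byte": (result >> 24) & 0xFF,
    }


fields = decode_scsi_result(0x40001)
host_name = HOST_BYTE_NAMES.get(fields["host_byte"], "unknown")
print(fields, host_name)
```

For 0x40001 the host byte comes out as 0x04 (DID_BAD_TARGET) and the low byte as 0x01, with the message and driver bytes both zero.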

The last sector is repeated until I reboot the machine. The only 
difference I've made to the raid configuration is that sdc is now 2x250 
GB instead of 4x120 GB, but that array is the target, not the source (sda).
The raid HW is an LSI Megaraid 300-8x with the following configuration:

Host: scsi0 Channel: 01 Id: 00 Lun: 00
   Vendor: MegaRAID Model: LD 0 RAID0  312G Rev: 814D
   Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 01 Id: 01 Lun: 00
   Vendor: MegaRAID Model: LD 1 RAID0  312G Rev: 814D
   Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 01 Id: 02 Lun: 00
   Vendor: MegaRAID Model: LD 2 RAID0  474G Rev: 814D
   Type:   Direct-Access                    ANSI SCSI revision: 02

I'm running debian sid stock kernel 2.6.17.

Other hardware changes that may or may not affect the kernel:
CPU: dual core, so the kernel now sees 4 CPUs instead of 2.
Removed a sound card (a CS46xx card) and am using internal sound instead.
Installed a DVB-T Freeview card (digital TV).
More RAM: 2 GB instead of 1.

I should mention that I used 2.6.16 on the old system, but I've 
checked the kernel tree and no changes have been made to the megaraid 
drivers since April, I think.

Can anyone help me?
Please CC me, as I'm not subscribed.

Regards,
Johan


* Re: Scsi errors with Megaraid 300-8x
  2006-08-22 14:45 Scsi errors with Megaraid 300-8x Johan Groth
@ 2006-08-23 15:27 ` Mark Lord
  2006-08-23 15:42   ` Johan Groth
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Lord @ 2006-08-23 15:27 UTC (permalink / raw)
  To: Johan Groth; +Cc: linux-kernel

Johan Groth wrote:
> Hi,
> Ever since I upgraded my server from a dual Opteron 244 (mobo Tyan 2885) 
> system to a dual dual-core Opteron 285 (mobo Tyan 2895) system, I've been 
> getting read errors that freeze the system, which has stopped my disk-based 
> backup software (faubackup) from working. I think it is faubackup 
> that triggers the bug.
> 
> I get these errors in the log:
> Aug 20 06:35:08 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
> Aug 20 06:35:56 jaguar kernel: end_request: I/O error, dev sda, sector 616924530
> Aug 20 06:36:03 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
> Aug 20 06:36:03 jaguar kernel: end_request: I/O error, dev sda, sector 616924538
..
> Aug 20 06:36:07 jaguar kernel: sd 2:1:0:0: SCSI error: return code = 0x40001
> Aug 20 06:36:07 jaguar kernel: end_request: I/O error, dev sda, sector 616924538
> 
> The last sector is repeated until I reboot the machine. The only 
> difference I've made to the raid configuration is that sdc is now 2x250 
> GB instead of 4x120 GB, but that array is the target, not the source (sda).
> The raid HW is an LSI Megaraid 300-8x with the following configuration:
..

That looks like the classic SCSI bad-sector non-recovery bug.
The code in scsi_lib.c, scsi_error.c, and sd.c is currently a
bit of a mess here.  

Basically, given an I/O request for 200 sectors, with a bad sector
in the middle at number 100, what SCSI will often do is fail sectors
number 1 through 100, one at a time, retrying the entire remainder of
the request after each attempt.  This takes hours, and results in no
data for the first 99 good sectors.

What it needs to do *instead*, is retry each sector individually,
rather than the entire request.  This would result in sectors 1..99
and 101..200 succeeding, and retries/failure only for sector 100.

A slight optimization would be to fail the bio size around sector 100,
rather than just the one sector.

I've got patches that do exactly this, and they work quite well.
But they're probably not "pretty enough" for inclusion.
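
The difference between the two retry strategies described above can be shown with a toy model. This is neither the kernel code nor the patches mentioned above; it is a sketch in Python in which a read of a sector range fails whenever the range covers a bad sector. All function names are invented for the illustration, and "failed reads" stands in for the slow, timeout-bound attempts that make the real thing take hours.

```python
def read_ok(first, last, bad):
    """Toy device: a read of [first, last] fails if it covers any bad sector."""
    return not any(first <= s <= last for s in bad)


def retry_remainder(first, last, bad):
    """Behaviour described above: fail the leading sector, retry the rest."""
    good, failed, failed_reads = [], [], 0
    start = first
    while start <= last:
        if read_ok(start, last, bad):
            good.extend(range(start, last + 1))
            break
        failed_reads += 1
        failed.append(start)  # the leading sector is failed, good or not
        start += 1
    return good, failed, failed_reads


def retry_per_sector(first, last, bad):
    """Proposed behaviour: after a failure, retry each sector on its own."""
    if read_ok(first, last, bad):
        return list(range(first, last + 1)), [], 0
    good, failed, failed_reads = [], [], 1  # the initial full-size read failed
    for s in range(first, last + 1):
        if read_ok(s, s, bad):
            good.append(s)
        else:
            failed_reads += 1
            failed.append(s)
    return good, failed, failed_reads


# The 200-sector request with one bad sector at number 100.
good, failed, slow = retry_remainder(1, 200, {100})
print(len(good), len(failed), slow)
good, failed, slow = retry_per_sector(1, 200, {100})
print(len(good), failed, slow)
```

In this model the first strategy returns data for sectors 101..200 only, fails sectors 1..100, and needs 100 failing reads. The second returns 199 good sectors, fails only sector 100, and needs two failing reads (the original request plus the retry of sector 100).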

Cheers




* Re: Scsi errors with Megaraid 300-8x
  2006-08-23 15:27 ` Mark Lord
@ 2006-08-23 15:42   ` Johan Groth
  2006-08-23 15:45     ` Justin Piszcz
  0 siblings, 1 reply; 11+ messages in thread
From: Johan Groth @ 2006-08-23 15:42 UTC (permalink / raw)
  To: Mark Lord; +Cc: linux-kernel

Mark Lord wrote:
> Johan Groth wrote:

[snip]

> Basically, given an I/O request for 200 sectors, with a bad sector
> in the middle at number 100, what SCSI will often do is fail sectors
> number 1 through 100, one at a time, retrying the entire remainder of
> the request after each attempt.  This takes hours, and results in no
> data for the first 99 good sectors.

So what you are saying is that after the move to a new box and a new 
mobo a sector has gone bad on that raid slice? Weird, as I was very 
careful with those drives when I moved them.

I mean, the raid controller is the same, the CPUs are the same (just 
more of them), and the PCI-X bus is the same, so I didn't expect any 
problems at all.

I was also under the impression that SATA raid controllers work like 
SCSI raid controllers, in that if a bad sector is encountered the 
controller moves what it can and then marks the sector as bad. I might be 
very wrong about that, though.

However, if I have a bad sector I would like to have it marked as 
bad so the kernel never tries to read it again. Any suggestions on how 
to do that? I assume I have to boot something like Knoppix, as sda is my 
system disk.

Regards,
Johan



* Re: Scsi errors with Megaraid 300-8x
  2006-08-23 15:42   ` Johan Groth
@ 2006-08-23 15:45     ` Justin Piszcz
  2006-08-23 15:48       ` Johan Groth
  0 siblings, 1 reply; 11+ messages in thread
From: Justin Piszcz @ 2006-08-23 15:45 UTC (permalink / raw)
  To: Johan Groth; +Cc: Mark Lord, linux-kernel



On Wed, 23 Aug 2006, Johan Groth wrote:

> Mark Lord wrote:
>> Johan Groth wrote:
>
> [snip]
>
>> Basically, given an I/O request for 200 sectors, with a bad sector
>> in the middle at number 100, what SCSI will often do is fail sectors
>> number 1 through 100, one at a time, retrying the entire remainder of
>> the request after each attempt.  This takes hours, and results in no
>> data for the first 99 good sectors.
>
> So what you are saying is that after the move to a new box and a new mobo a 
> sector has gone bad on that raid slice? Weird, as I was very careful with 
> those drives when I moved them.
>
> I mean, the raid controller is the same, the cpus are the same, just more of 
> them, the pci-x bus the same so I didn't expect any problems at all.
>
> I was also under the impression that SATA raid controllers work like SCSI 
> raid controllers, in that if a bad sector is encountered the 
> controller moves what it can and then marks the sector as bad. I might be 
> very wrong about that, though.
>
> However, if I have a bad sector I would like to have it marked as bad 
> so the kernel never tries to read it again. Any suggestions on how to do 
> that? I assume I have to boot something like Knoppix, as sda is my system 
> disk.
>
> Regards,
> Johan
>

Run badblocks in r+w mode on the bad disk and it will force the disk to 
re-allocate the bad sector if it can.

Justin.


* Re: Scsi errors with Megaraid 300-8x
  2006-08-23 15:45     ` Justin Piszcz
@ 2006-08-23 15:48       ` Johan Groth
  2006-08-23 15:53         ` Justin Piszcz
  2006-08-24 14:48         ` Mark Lord
  0 siblings, 2 replies; 11+ messages in thread
From: Johan Groth @ 2006-08-23 15:48 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: Mark Lord, linux-kernel

Justin Piszcz wrote:
> Run badblocks in r+w mode on the bad disk and it will force the disk to 
> re-allocate the bad sector if it can.
> 
> Justin.

Is that possible to do in a non-destructive way? I don't want to lose 
all data and apparently I can't back it up either :(.

Regards,
Johan


* Re: Scsi errors with Megaraid 300-8x
  2006-08-23 15:48       ` Johan Groth
@ 2006-08-23 15:53         ` Justin Piszcz
  2006-08-23 15:57           ` Johan Groth
  2006-08-24 14:48         ` Mark Lord
  1 sibling, 1 reply; 11+ messages in thread
From: Justin Piszcz @ 2006-08-23 15:53 UTC (permalink / raw)
  To: Johan Groth; +Cc: Mark Lord, linux-kernel



On Wed, 23 Aug 2006, Johan Groth wrote:

> Justin Piszcz wrote:
>> Run badblocks in r+w mode on the bad disk and it will force the disk to 
>> re-allocate the bad sector if it can.
>> 
>> Justin.
>
> Is that possible to do in a non-destructive way? I don't want to lose all 
> data and apparently I can't back it up either :(.
>
> Regards,
> Johan
>

Nope, r+w will write over everything on the disk, but I have found it -the- 
most effective way to see if a disk is good or not.  I'd rather have the 
disk die during that test than use it in production and find that it 
dies with my data on it.

Justin.


* Re: Scsi errors with Megaraid 300-8x
  2006-08-23 15:53         ` Justin Piszcz
@ 2006-08-23 15:57           ` Johan Groth
  2006-08-23 15:59             ` Justin Piszcz
  0 siblings, 1 reply; 11+ messages in thread
From: Johan Groth @ 2006-08-23 15:57 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel

Justin Piszcz wrote:
> 
> Nope, r+w will write over everything on the disk, but I have found it -the- 
> most effective way to see if a disk is good or not.  I'd rather have the 
> disk die during that test than use it in production and find that it 
> dies with my data on it.
> 

Hmm, we both should read the man page of badblocks a bit better :).
I found this:

-n     Use non-destructive read-write mode.  By default only a 
non-destructive read-only test is done. This option must not be combined 
with the -w option, as they are mutually exclusive.


Cheers,
Johan


* Re: Scsi errors with Megaraid 300-8x
  2006-08-23 15:57           ` Johan Groth
@ 2006-08-23 15:59             ` Justin Piszcz
  0 siblings, 0 replies; 11+ messages in thread
From: Justin Piszcz @ 2006-08-23 15:59 UTC (permalink / raw)
  To: Johan Groth; +Cc: linux-kernel



On Wed, 23 Aug 2006, Johan Groth wrote:

> Justin Piszcz wrote:
>> 
>> Nope, r+w will write over everything on the disk, but I have found it -the- 
>> most effective way to see if a disk is good or not.  I'd rather have the 
>> disk die during that test than use it in production and find that it 
>> dies with my data on it.
>> 
>
> Hmm, we both should read the man page of badblocks a bit better :).
> I found this:
>
> -n     Use non-destructive read-write mode.  By default only a 
> non-destructive read-only test is done. This option must not be combined with 
> the -w option, as they are mutually exclusive.
>
>
> Cheers,
> Johan
>

I have not tested that option.  I wonder if it is as good as a real R+W 
mode.  What does smartctl -a /dev/sda say on the disk that you are having 
problems with?


* Re: Scsi errors with Megaraid 300-8x
  2006-08-23 15:48       ` Johan Groth
  2006-08-23 15:53         ` Justin Piszcz
@ 2006-08-24 14:48         ` Mark Lord
  2006-08-24 15:09           ` Johan Groth
  1 sibling, 1 reply; 11+ messages in thread
From: Mark Lord @ 2006-08-24 14:48 UTC (permalink / raw)
  To: Johan Groth; +Cc: Justin Piszcz, linux-kernel

Johan Groth wrote:
> Justin Piszcz wrote:
>> Run badblocks in r+w mode on the bad disk and it will force the disk 
>> to re-allocate the bad sector if it can.
>>
>> Justin.
> 
> Is that possible to do in a non-destructive way? I don't want to lose 
> all data and apparently I can't back it up either :(.

Yes, it is perfectly doable, but I don't think anyone has yet bothered
to release a utility that actually does it.

OPPORTUNITY FOR FAME AND FORTUNE! (okay, maybe just some fame):
=================================
Hack the existing smartctl code to read out the failed sector numbers,
and then issue single-sector read-overwrite to each of those bad sectors.

Very simple code.  I'll do it myself eventually, but please beat me to it!

Cheers


* Re: Scsi errors with Megaraid 300-8x
  2006-08-24 14:48         ` Mark Lord
@ 2006-08-24 15:09           ` Johan Groth
  2006-08-24 16:57             ` Mark Lord
  0 siblings, 1 reply; 11+ messages in thread
From: Johan Groth @ 2006-08-24 15:09 UTC (permalink / raw)
  To: Mark Lord; +Cc: linux-kernel

Mark Lord wrote:
> Johan Groth wrote:
>> Justin Piszcz wrote:
>>> Run badblocks in r+w mode on the bad disk and it will force the disk 
>>> to re-allocate the bad sector if it can.
>>>
>>> Justin.
>>
>> Is that possible to do in a non-destructive way? I don't want to lose 
>> all data and apparently I can't back it up either :(.
> 
> Yes, it is perfectly doable, but I don't think anyone has yet bothered
> to release a utility that actually does it.
> 
> OPPORTUNITY FOR FAME AND FORTUNE! (okay, maybe just some fame):
> =================================
> Hack the existing smartctl code to read out the failed sector numbers,
> and then issue single-sector read-overwrite to each of those bad sectors.
> 
> Very simple code.  I'll do it myself eventually, but please beat me to it!

The authors of badblocks already have, with the -n option :). When I ran 
badblocks on the entire partition it wanted to check over 210 million 
blocks, and when it finally reached the bad sectors the controller 
lost the drive and the kernel started to spit out SCSI errors! Buggy 
driver or hardware error? God knows. Unfortunately I don't have a log to 
show you, as I was in single-user mode.

I would like to run badblocks again, but only around the damaged part. 
The thing is that I know which sector the kernel thinks is bad, but 
badblocks wants to know which block to start and end at. How do I convert 
a sector number to a block number? The partition is a standard XFS 
partition, i.e. I didn't change any block sizes when I formatted it.

Got this from xfs_info:
meta-data=/dev/sda7              isize=256    agcount=16, agsize=3433014 blks
          =                       sectsz=512   attr=1
data     =                       bsize=4096   blocks=54928224, imaxpct=25
          =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=26820, version=1
          =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
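
The sector-to-block conversion asked about above is plain arithmetic once two things are fixed: the block size badblocks is told to use (-b, 1024 bytes by default) and where the tested device starts on the disk. A sketch, assuming the sector number in the end_request line counts 512-byte sectors from the start of sda rather than from the start of sda7, and that the start sector of sda7 would be read from the partition table if badblocks is run on the partition. The helper names are hypothetical, and no partition start value from this system is known, so the example runs against the whole disk.

```python
SECTOR_SIZE = 512  # bytes per sector, matching sectsz=512 in the xfs_info output


def sector_to_block(disk_sector, block_size=1024, partition_start=0):
    """Map an absolute 512-byte disk sector to a badblocks block number.

    partition_start is the first sector of the partition being tested;
    leave it at 0 when badblocks is run on the whole disk.
    """
    if block_size % SECTOR_SIZE:
        raise ValueError("block size must be a multiple of the sector size")
    offset_bytes = (disk_sector - partition_start) * SECTOR_SIZE
    if offset_bytes < 0:
        raise ValueError("sector lies before the start of the partition")
    return offset_bytes // block_size


def block_window(sectors, margin_blocks=16, block_size=1024, partition_start=0):
    """Return (first_block, last_block) covering all sectors plus a margin."""
    blocks = [sector_to_block(s, block_size, partition_start) for s in sectors]
    first = max(min(blocks) - margin_blocks, 0)
    last = max(blocks) + margin_blocks
    return first, last


# Lowest and highest failing sectors from the kernel log in this thread.
first, last = block_window([616924530, 616924578])
# badblocks takes the device, then the last block, then the first block.
print("badblocks -n -b 1024 /dev/sda %d %d" % (last, first))
```

With -b 512 on the whole disk the block number is simply the sector number, which avoids the conversion altogether. The argument order (last block before first block) follows the badblocks man page; it is worth checking against the installed version before running anything.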

There's another thing I would like to know. How do I find out what file a 
sector belongs to?

Regards,
Johan


* Re: Scsi errors with Megaraid 300-8x
  2006-08-24 15:09           ` Johan Groth
@ 2006-08-24 16:57             ` Mark Lord
  0 siblings, 0 replies; 11+ messages in thread
From: Mark Lord @ 2006-08-24 16:57 UTC (permalink / raw)
  To: Johan Groth; +Cc: linux-kernel

Johan Groth wrote:
> Mark Lord wrote:
>
>> OPPORTUNITY FOR FAME AND FORTUNE! (okay, maybe just some fame):
>> =================================
>> Hack the existing smartctl code to read out the failed sector numbers,
>> and then issue single-sector read-overwrite to each of those bad sectors.
>>
>> Very simple code.  I'll do it myself eventually, but please beat me to 
>> it!
> 
> The authors of badblocks already have, with the -n option :)

Not quite.  As you pointed out:

> I would like to run badblocks again but only around the damaged part. 

The drive *knows* which sectors are bad -- it's mostly all in the S.M.A.R.T.
logs and such.  Which smartctl already knows how to read.

So now we just need a script-kiddie to do a nice little awk script for you,
which extracts the bad sectors info from the output of smartctl, and then
feeds this as input to badblocks.
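
A rough sketch of that idea follows, in Python rather than awk. It assumes the ATA error log printed by smartctl contains lines of the form "... at LBA = 0x<hex> = <decimal>"; that pattern should be checked against real output, and a drive behind a MegaRAID logical disk may not expose its S.M.A.R.T. logs through /dev/sda at all. The SAMPLE text is made up for illustration (it reuses sector numbers from the kernel log earlier in the thread), and all function names are hypothetical. Nothing is executed; the sketch only builds the badblocks command lines.

```python
import re

# Assumed shape of an error-log line; verify against your smartctl output.
LBA_RE = re.compile(r"at LBA = 0x[0-9a-fA-F]+ = (\d+)")

# Illustrative input only, not real smartctl output from this system.
SAMPLE = """\
Error: UNC at LBA = 0x24c5857a = 616924538
Error: UNC at LBA = 0x24c58572 = 616924530
"""


def bad_lbas(smartctl_text):
    """Return the sorted, de-duplicated decimal LBAs found in the text."""
    return sorted({int(m.group(1)) for m in LBA_RE.finditer(smartctl_text)})


def merge_windows(lbas, margin=8):
    """Turn LBAs into (first, last) sector windows, merging any that touch."""
    windows = []
    for lba in sorted(set(lbas)):
        first, last = max(lba - margin, 0), lba + margin
        if windows and first <= windows[-1][1] + 1:
            windows[-1][1] = max(windows[-1][1], last)
        else:
            windows.append([first, last])
    return [tuple(w) for w in windows]


def badblocks_commands(windows, device):
    """Build one non-destructive badblocks invocation per window.

    -b 512 makes badblocks block numbers equal to 512-byte sector numbers;
    the positional arguments are device, last block, first block.
    """
    return [
        ["badblocks", "-n", "-b", "512", device, str(last), str(first)]
        for first, last in windows
    ]


for cmd in badblocks_commands(merge_windows(bad_lbas(SAMPLE)), "/dev/sda"):
    print(" ".join(cmd))
```

The margin around each LBA is arbitrary; a window of a few sectors either side keeps the run short while still covering neighbours that may also be marginal.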

Cheers


