From: Vincent Schut <schut@sarvision.nl>
Cc: Andrew Dunn <andrew.g.dunn@gmail.com>,
	Gabor Gombas <gombasg@sztaki.hu>,
	Ryan Wagoner <rswagoner@gmail.com>,
	Richard Scobie <richard@sauce.co.nz>,
	Linux RAID Mailing List <linux-raid@vger.kernel.org>
Subject: Re: RAID 6 Failure follow up
Date: Tue, 17 Nov 2009 09:40:13 +0100
Message-ID: <4B02616D.8060405@sarvision.nl>
In-Reply-To: <4AF94FCA.6040303@sarvision.nl>

Vincent Schut wrote:
> Andrew Dunn wrote:
>> I am able to reproduce this smart error now. I have done it twice, so
>> maybe other things are causing this also.
>>
>> When I scanned the devices this morning with smartctl via webmin I lost
>> 8 of the 9 drives. They are, however, still in my /dev folder.
>>
>> Now I sent out my logs from the first failure last night, smartctl was
>> on the system... I don't know if Ubuntu Server's default smartd
>> configuration makes it do periodic scans, because I didn't change anything.
>>
>> I would hate to move back to 9.10 and see this problem again.
>>
>> Should I just not install smartmontools? This seems like a bad solution
>> because now I won't be able to check the drives in advance for failures.
>>
>> Have you installed LSI's Linux drivers? Some people say this solves
>> their issue.
>>
>> From the logs sent out last night do you think it could be something 
>> else?
>>
>> Thanks a ton,
> 
> FWIW, I encountered the same issue, and seem to have found a viable
> workaround by accessing the SATA disks on that LSI backplane as SCSI
> devices, e.g. by adding '-d scsi' to my smartctl/smartd.conf lines. No
> more errors in the logs, no more drives being kicked out.
> Less info is available that way than with the SATA driver ('-d sat',
> or the automatic default); temperature, for example, is unavailable. But
> it does allow me to initiate the self-tests and get their results, and to
> monitor the generic SMART status of the drives. Quite enough for me.
> 
> YMMV, though.

Folks, I need to retract this. Though I've had far fewer problems with
'-d scsi' instead of '-d sat' when running the LSI SAS / smartmontools /
mdadm combo, I got bitten again last night by a drive being kicked out
for no apparent reason. For now my only possible advice is: don't use
smartmontools on drives that are on this LSI SAS backplane.
I dearly hope this will improve soon; I hate having my drives go
unmonitored for too long...
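
For anyone who wants to compare the two device types mentioned above, here
is a minimal sketch that only assembles the smartctl argument lists and
never runs them. The helper name build_smartctl_cmd and the device path
/dev/sdb are assumptions made for illustration and do not come from this
thread; the flags (-d, -H, -t short, -l selftest) are standard smartctl
options. Given the retraction above, actually running these against drives
on the LSI SAS backplane may still get a drive kicked out.

```python
# Hypothetical helper: builds smartctl argument lists, does NOT execute them.
ACTIONS = {
    "health": ["-H"],                 # overall SMART health status
    "selftest": ["-t", "short"],      # start a short self-test
    "selftest-log": ["-l", "selftest"],  # read the self-test results
}


def build_smartctl_cmd(device, dev_type="scsi", action="health"):
    """Return the argv list for one smartctl call with an explicit -d type."""
    if dev_type not in ("scsi", "sat"):
        raise ValueError("this sketch only covers the 'scsi' and 'sat' types")
    if action not in ACTIONS:
        raise ValueError("unknown action: %r" % (action,))
    return ["smartctl", "-d", dev_type] + ACTIONS[action] + [device]


if __name__ == "__main__":
    # /dev/sdb is a placeholder device path.
    for dev_type in ("scsi", "sat"):
        print(" ".join(build_smartctl_cmd("/dev/sdb", dev_type)))
```

Printing the commands rather than executing them keeps the comparison
harmless on a live array; whether to run them at all is a separate decision.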

Vincent.

> 
> Vincent.
>>
>> Gabor Gombas wrote:
>>> On Mon, Nov 09, 2009 at 05:08:23AM -0500, Andrew Dunn wrote:
>>>
>>>  
>>>> does it momentarily offline the disks? like they re-appear in /dev
>>>> within moments? That would be similar behavior to what I am
>>>> experiencing, the disks drop from the array, but they are in /dev by
>>>> the time I get a chance to see them.
>>>>     
>>> No, either the disks need to be physically removed and re-inserted, or
>>> the machine needs to be rebooted.
>>>
>>> Gabor
>>>
>>>   
>>
> 

