From: Ric Wheeler
Date: Wed, 26 Aug 2009 09:11:58 -0400
To: Theodore Tso, Pavel Machek, david@lang.hm, Florian Weimer,
 Goswin von Brederlow, Rob Landley, kernel list, Andrew Morton,
 mtk.manpages@gmail.com, rdunlap@xenotime.net, linux-doc@vger.kernel.org,
 linux-ext4@vger.kernel.org, corbet@lwn.net
Subject: Re: [patch] document flash/RAID dangers
In-Reply-To: <20090826124058.GK32712@mit.edu>

On 08/26/2009 08:40 AM, Theodore Tso wrote:
> On Wed, Aug 26, 2009 at 07:58:40AM -0400, Ric Wheeler wrote:
>>> Drive in raid 5 failed; hot spare was available (no idea about
>>> UPS). System apparently locked up trying to talk to the failed drive,
>>> or maybe the admin just was not patient enough, so he just power-cycled
>>> the array. He lost the array.
>>>
>>> So while most people will not aggressively power-cycle the RAID array,
>>> drive failure still provokes little-tested error paths, and getting an
>>> unclean shutdown is quite easy in such a case.
>>
>> Then what we need to document is: do not power cycle an array during a
>> rebuild, right?
>
> Well, the software raid layer could be improved so that it implements
> scrubbing by default (i.e., have the md package install a cron job to
> run a periodic scrub pass automatically). The MD code could also
> regularly check to make sure the hot spare is OK; the other possibility
> is that the hot spare, which hadn't been used in a long time, had
> silently failed.

Actually, MD does this scan already (not automatically, but you can set up
a simple cron job to kick off a periodic "check"); a rough sketch of such a
job is appended at the end of this mail.

It is a delicate balance to get the frequency of the scrubbing right. On
one hand, you want to detect errors in a timely fashion, certainly catching
single-sector errors before a second sector-level error develops on another
drive. On the other hand, running scans/scrubs continually impacts the
performance of your real workload and can potentially shorten your
components' life span by subjecting them to a heavy workload. The rule of
thumb, from my experience, is that most people settle on a scan once every
week or two (done at a throttled rate).

>
>> In the end, there are cascading failures that will defeat any data
>> protection scheme, but that does not mean that the value of that scheme
>> is zero.
>> We need to get more people to use RAID (including MD RAID5) and
>> try to enhance it as we go. Just using a single disk is not a good
>> thing...
>
> Yep; the solution is to improve the storage devices. It is *not* to
> encourage people to think RAID is not worth it, or that somehow ext2
> is better than ext3 because it runs fscks all the time at boot up.
> That's just crazy talk.
>
> 	- Ted

Agreed....

ric
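
As a reference for the cron-driven "check" mentioned above, here is a
minimal sketch. It assumes the usual md sysfs knobs (writing "check" to
md/sync_action starts a read-only scrub, and md/sync_speed_max throttles
it); the throttle value is only a placeholder to illustrate the idea, not a
finished tool:

#!/usr/bin/env python3
# Hypothetical weekly scrub helper (run from cron, as root). Assumes the
# standard md sysfs interface: writing "check" to md/sync_action starts a
# read-only scrub, and md/sync_speed_max caps its speed. The throttle
# value below is only illustrative; tune it for your own hardware.
import glob
import os
import sys

THROTTLE_KB_PER_SEC = "20000"   # illustrative cap so the scrub stays gentle

def start_check(md_dev):
    md_dir = os.path.join(md_dev, "md")
    try:
        # Throttle the scrub so it does not starve the real workload.
        with open(os.path.join(md_dir, "sync_speed_max"), "w") as f:
            f.write(THROTTLE_KB_PER_SEC + "\n")
        # Kick off the consistency check on this array.
        with open(os.path.join(md_dir, "sync_action"), "w") as f:
            f.write("check\n")
        print("started check on", os.path.basename(md_dev))
    except OSError as err:
        print("could not start check on", md_dev, ":", err, file=sys.stderr)

if __name__ == "__main__":
    # Walk every md array exposed under /sys/block and start a scrub.
    for dev in glob.glob("/sys/block/md*"):
        if os.path.isdir(os.path.join(dev, "md")):
            start_check(dev)

Run it weekly from cron; for example, an entry like
"0 3 * * 0 /usr/local/sbin/md-weekly-check" (the path is just an example
name for the script above) would scrub every array early Sunday morning at
a throttled rate.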