linux-kernel.vger.kernel.org archive mirror
* Are linux-fs's drive-fault-tolerant by concept?
@ 2003-04-19 16:04 Stephan von Krawczynski
  2003-04-19 15:29 ` Alan Cox
                   ` (2 more replies)
  0 siblings, 3 replies; 74+ messages in thread
From: Stephan von Krawczynski @ 2003-04-19 16:04 UTC (permalink / raw)
  To: linux-kernel

Hello all,

after shooting down one of these bloody cute new very-big-and-poor IDE drives
today I wonder whether it would be a good idea to give the linux-fs (namely my
preferred reiser and ext2 :-) some fault-tolerance. I remember there have been
some discussions about this issue some time ago, and I seem to remember that it
was decided against because it should be the driver's job to give the fs a
clean space to live, right?
Unfortunately today's reality seems to have gotten a lot worse compared to one
year ago. I cannot remember a lot of failed drives back then, but today about
20% seem to be shipped DOA already. Most I came across have only small
problems (a few dead sectors), but they seem to produce quite a lot of trouble
- at least on my 3ware in a non-RAID setup the box partly dies away because
reiser feels quite unhappy about the non-recoverable disk errors.
I know this question can get religious, but to name my only point: wouldn't it
be good defensive programming style _not_ to rely on proven-to-be-unreliable
hardware manufacturers? Thing is: you cannot avoid buying bad hardware these
days, because just about every manufacturer has already sold bad apples ...

Regards,
Stephan

* Re: Are linux-fs's drive-fault-tolerant by concept?
@ 2003-04-20 15:06 Chuck Ebbert
  2003-04-20 15:19 ` John Bradford
  0 siblings, 1 reply; 74+ messages in thread
From: Chuck Ebbert @ 2003-04-20 15:06 UTC (permalink / raw)
  To: linux-kernel

Alan Cox wrote:


> Buy IDE disks in pairs, use md1, and remember to continually send the
> hosed ones back to the vendor/shop (and if they keep appearing DOA to
> your local trading standards/fair trading type bodies).


  I buy three drives at a time so I have a matching spare, because AFAIC
you shouldn't be doing RAID on unmatched drives.

  Using RAID1 is especially important when using software instead
of hardware for fault-tolerance because the software is more likely to
have bugs just because of the 'culture' of hardware vs. software
developers, and the RAID5 algorithm is very hard to get right anyway,
especially in failure/rebuild mode.  Even on a hardware controller
RAID5 is still inherently less reliable.

 (...and what's all this about unreliable drives, anyway?  Every drive
I have bought since 1987 still works.)
------
 Chuck

* Re: Are linux-fs's drive-fault-tolerant by concept?
@ 2003-04-20 17:03 Chuck Ebbert
  2003-04-20 17:25 ` John Bradford
  0 siblings, 1 reply; 74+ messages in thread
From: Chuck Ebbert @ 2003-04-20 17:03 UTC (permalink / raw)
  To: arjanv; +Cc: linux-kernel

arjan wrote:


>> You will if it writes and fails to read back. The disk can't invent a
>> sector that is gone. 
>
> but linux can if you use a raid1 mirror... maybe we should teach the md
> layer to write back the data from the other disk on a "bad sector"
> error.


  I have some ugly code that forces all reads from a mirror set to
a specific copy, set via a global sysctl.  This lets you do things
like make a backup from disk 0, then verify against disk 1 and take
action if something is wrong.


------
 Chuck

* Re: Are linux-fs's drive-fault-tolerant by concept?
@ 2003-04-20 17:28 Chuck Ebbert
  2003-04-21  9:36 ` Stephan von Krawczynski
  0 siblings, 1 reply; 74+ messages in thread
From: Chuck Ebbert @ 2003-04-20 17:28 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

Stephan wrote:


> Maybe I have something in common with google, I am re-writing large parts (well
> over 50%) of the hard drive's capacity on a daily basis (in the discussed setup).
> How many people really do that?


  I'll bet the people who do are using SCSI disks...


------
 Chuck

* Re: Are linux-fs's drive-fault-tolerant by concept?
@ 2003-04-20 17:28 Chuck Ebbert
  0 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2003-04-20 17:28 UTC (permalink / raw)
  To: John Bradford; +Cc: linux-kernel


>>   I buy three drives at a time so I have a matching spare, because AFAIC
>> you shouldn't be doing RAID on unmatched drives.
>
> Err, yes you should :-).
>
> Unless they are spindle synchronised, the advantage of identical
> physical layout diminishes, and the disadvantage of quite possibly
> getting components from the same, (faulty), batch increases :-).


 Yeah, I know, and some of my serial numbers are too close together
for comfort but I still like everything matched up:


hde: MAXTOR 4K060H3, ATA DISK drive
hdg: MAXTOR 4K060H3, ATA DISK drive
hdi: MAXTOR 4K060H3, ATA DISK drive
 hde: hde1 hde2 hde3 hde4 < hde5 hde6 hde7 hde8 hde9 >
 hdg: hdg1 hdg2 hdg3 hdg4 < hdg5 hdg6 hdg7 hdg8 hdg9 >
 hdi: hdi1 hdi2 hdi3 hdi4 < hdi5 hdi6 hdi7 hdi8 hdi9 >



------
 Chuck

* Re: Are linux-fs's drive-fault-tolerant by concept?
@ 2003-04-20 17:44 Chuck Ebbert
  0 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2003-04-20 17:44 UTC (permalink / raw)
  To: John Bradford; +Cc: linux-kernel


>>   I have some ugly code that forces all reads from a mirror set to
>> a specific copy, set via a global sysctl.  This lets you do things
>> like make a backup from disk 0, then verify against disk 1 and take
>> action if something is wrong.
>
> That's interesting.  Have you thought of making it read from _both_
> disks and check that the data matches, before passing it back?


  It didn't seem to be worth doing, since a userspace program could
be written to do the same thing using my small patch.  Only problem
is, it uses a global sysctl that affects every mirror set in the machine,
so it could affect performance of every mirror if used during load.
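
A minimal sketch of what such a userspace check might look like, in Python.
It only shows the comparison step: it reads two binary streams chunk by chunk
and reports the offsets where they differ. The function name and chunk size
are made up for illustration, and the in-memory streams stand in for the two
mirror copies. On a real mirror set each pass would have to be pinned to one
copy (for example with the sysctl described above, whose name is not given in
this thread).

```python
import io


def find_mismatches(copy_a, copy_b, chunk_size=4096):
    """Return the byte offsets of chunks that differ between two binary streams."""
    mismatches = []
    offset = 0
    while True:
        a = copy_a.read(chunk_size)
        b = copy_b.read(chunk_size)
        if not a and not b:
            # Both streams are exhausted.
            break
        if a != b:
            # Also catches the case where one stream is shorter than the other.
            mismatches.append(offset)
        offset += chunk_size
    return mismatches


# Stand-ins for "disk 0" and "disk 1"; the second chunk differs.
disk0 = io.BytesIO(b"A" * 8192 + b"B" * 4096)
disk1 = io.BytesIO(b"A" * 4096 + b"X" * 4096 + b"B" * 4096)
print(find_mismatches(disk0, disk1))  # [4096]
```

What to do about a mismatch is left open here, since a plain comparison cannot
tell which of the two copies is the correct one.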


------
 Chuck

* Re: Are linux-fs's drive-fault-tolerant by concept?
@ 2003-04-20 17:44 Chuck Ebbert
  0 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2003-04-20 17:44 UTC (permalink / raw)
  To: arjanv; +Cc: linux-kernel


>> You will if it writes and fails to read back. The disk can't invent a
>> sector that is gone. 
>
> but linux can if you use a raid1 mirror... maybe we should teach the md
> layer to write back the data from the other disk on a "bad sector"
> error.


  NTFS does this in the filesystem by moving the affected cluster
somewhere else, then marking it bad in its allocation map.  Of course
in order to do that it has to get notifications from the ft disk
driver...
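
For comparison, the mirror-level write-back suggested in the quote above can
be sketched as a toy model in Python. This is not md code: `FakeDisk`,
`BadSector` and `mirror_read` are hypothetical names. The model also assumes
that rewriting a failed sector makes it readable again (for instance because
the drive remaps it), which real hardware does not guarantee.

```python
class BadSector(Exception):
    """Raised when a simulated disk cannot read a sector."""


class FakeDisk:
    """In-memory stand-in for one half of a mirror set."""

    def __init__(self, sectors):
        self.sectors = list(sectors)
        self.bad = set()  # sector numbers that currently fail on read

    def read(self, n):
        if n in self.bad:
            raise BadSector(n)
        return self.sectors[n]

    def write(self, n, data):
        # Assumption: a rewrite lets the drive remap the sector.
        self.sectors[n] = data
        self.bad.discard(n)


def mirror_read(disks, n):
    """Return sector n from the first copy that reads, rewriting copies that failed."""
    failed = []
    for disk in disks:
        try:
            data = disk.read(n)
        except BadSector:
            failed.append(disk)
            continue
        for bad_disk in failed:
            bad_disk.write(n, data)  # write the good data back
        return data
    # Every copy failed; nothing can reconstruct the sector.
    raise BadSector(n)


d0 = FakeDisk([b"s0", b"s1"])
d1 = FakeDisk([b"s0", b"s1"])
d0.bad.add(1)
print(mirror_read([d0, d1], 1))  # served from d1
print(d0.read(1))                # d0 reads again after the write-back
```

The filesystem-level approach described above differs in that the data moves
to a new location and the old one is marked bad, instead of being rewritten in
place.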


------
 Chuck

[parent not found: <mail.linux.kernel/20030420185512.763df745.skraw@ithnet.com>]

end of thread, other threads:[~2003-05-06  6:51 UTC | newest]

Thread overview: 74+ messages
2003-04-19 16:04 Are linux-fs's drive-fault-tolerant by concept? Stephan von Krawczynski
2003-04-19 15:29 ` Alan Cox
2003-04-19 17:00   ` Stephan von Krawczynski
2003-04-19 22:04     ` Alan Cox
2003-04-20 16:24       ` Stephan von Krawczynski
2003-04-20 13:59     ` John Bradford
2003-04-20 16:55       ` Stephan von Krawczynski
2003-04-20 17:12         ` John Bradford
2003-04-20 17:21           ` Stephan von Krawczynski
2003-04-20 18:48             ` Alan Cox
2003-04-20 20:00               ` John Bradford
2003-04-21  1:51                 ` jw schultz
2003-04-19 21:13   ` Jos Hulzink
2003-04-20 16:07     ` Stephan von Krawczynski
2003-04-20 16:40       ` John Bradford
2003-04-20 17:01         ` Stephan von Krawczynski
2003-04-20 17:20           ` John Bradford
2003-04-21  9:32             ` Stephan von Krawczynski
2003-04-21  9:55               ` John Bradford
2003-04-21 11:24                 ` Stephan von Krawczynski
2003-04-21 11:50                   ` Alan Cox
2003-04-21 12:14                   ` John Bradford
2003-04-19 16:22 ` John Bradford
2003-04-19 16:36   ` Russell King
2003-04-19 16:45     ` John Bradford
2003-04-19 16:52   ` Stephan von Krawczynski
2003-04-19 20:04     ` John Bradford
2003-04-19 20:33       ` Andreas Dilger
2003-04-21  9:25         ` Denis Vlasenko
2003-04-21  9:42           ` John Bradford
2003-04-21 10:25             ` Stephan von Krawczynski
2003-04-21 10:50               ` John Bradford
2003-04-19 20:38       ` Stephan von Krawczynski
2003-04-20 14:21         ` John Bradford
2003-04-21  9:09           ` Denis Vlasenko
2003-04-21  9:35             ` John Bradford
2003-04-21 11:03               ` Stephan von Krawczynski
2003-04-21 12:04                 ` John Bradford
2003-04-21 11:22               ` Denis Vlasenko
2003-04-21 11:46                 ` Stephan von Krawczynski
2003-04-21 12:13                 ` John Bradford
2003-04-19 20:05     ` John Bradford
2003-04-19 23:13     ` Arnaldo Carvalho de Melo
2003-04-19 17:54   ` Felipe Alfaro Solana
2003-04-25  0:07   ` Stewart Smith
2003-04-25  0:52     ` Richard B. Johnson
2003-04-25  7:13       ` John Bradford
     [not found] ` <20030419161011$0136@gated-at.bofh.it>
2003-04-19 17:18   ` Florian Weimer
2003-04-19 18:07     ` Stephan von Krawczynski
2003-04-19 18:41       ` Dr. David Alan Gilbert
2003-04-19 20:56         ` Helge Hafting
2003-04-19 21:15           ` Valdis.Kletnieks
2003-04-20 10:51             ` Helge Hafting
2003-04-20 19:04               ` Valdis.Kletnieks
2003-04-19 21:57         ` Alan Cox
2003-04-20 10:09         ` Geert Uytterhoeven
2003-04-21  8:37         ` Denis Vlasenko
2003-05-05 12:38         ` Pavel Machek
2003-04-19 22:02     ` Alan Cox
2003-04-20  8:41       ` Arjan van de Ven
2003-04-25  0:11     ` Stewart Smith
2003-04-20 15:06 Chuck Ebbert
2003-04-20 15:19 ` John Bradford
2003-04-20 17:03 Chuck Ebbert
2003-04-20 17:25 ` John Bradford
2003-04-20 17:28 Chuck Ebbert
2003-04-21  9:36 ` Stephan von Krawczynski
2003-04-20 17:28 Chuck Ebbert
2003-04-20 17:44 Chuck Ebbert
2003-04-20 17:44 Chuck Ebbert
     [not found] <mail.linux.kernel/20030420185512.763df745.skraw@ithnet.com>
     [not found] ` <03Apr21.020150edt.41463@gpu.utcc.utoronto.ca>
2003-04-21 11:19   ` Stephan von Krawczynski
2003-04-21 11:52     ` Alan Cox
2003-04-21 14:14     ` Valdis.Kletnieks
2003-05-06  7:03       ` Mike Fedyk
