linux-kernel.vger.kernel.org archive mirror
From: John Bradford <john@grabjohn.com>
To: root@chaos.analogic.com
Cc: stewartsmith@mac.com (Stewart Smith),
	john@grabjohn.com (John Bradford),
	skraw@ithnet.com (Stephan von Krawczynski),
	linux-kernel@vger.kernel.org (linux-kernel)
Subject: Re: Are linux-fs's drive-fault-tolerant by concept?
Date: Fri, 25 Apr 2003 08:13:39 +0100 (BST)
Message-ID: <200304250713.h3P7Ddp6000359@81-2-122-30.bradfords.org.uk>
In-Reply-To: <Pine.LNX.4.53.0304242027350.3180@chaos> from "Richard B. Johnson" at Apr 24, 2003 08:52:59 PM

> > >> I wonder whether it would be a good idea to give the linux-fs
> > >> (namely my preferred reiser and ext2 :-) some fault-tolerance.
> > >
> > > Fault tolerance should be done at a lower level than the filesystem.
> >
> > I would (partly) disagree. On the FS level, you would still have to
> > deal with the data having gone away (or become corrupted). Simply
> > passing a (known) corrupted block to a FS isn't going to do anything
> > useful. Having the FS know that "this data is known crap" could tell it to
> > a) go look at a backup structure (e.g. one of the many superblock copies)
> > b) guess (e.g. in disk allocation bitmap, just think of them all as used)
> > c) fail with error (e.g. "cannot read directory due to a physical
> >    problem with the disk")
> > d) try to reconstruct the data (e.g. search around the disk for magic
> >    numbers)
> >
> > <snip>
> > > The filesystem doesn't know or care what device it is stored on, and
> > > therefore shouldn't try to predict likely failures.
> >
> > but it should be tolerant of them and able to recover to some extent.
> > Generally, the first sign that a disk is dying (to an end user) is when
> > really-weird-stuff(tm) starts happening. A nice error message from the
> > file system when they try to go into the directory (or whatever) would
> > be a lot nicer.
> >
> > You could generalize the failure down to an extents type record (i.e.
> > offset and length) which would suit 99.9% of cases (i think :). In the
> > case of post-detection of error, the extra effort is probably worth it.
> >
> > these kinda issues are coming up in my honors thesis too, so there
> > might even be the (dreaded) code and discussion sometime near the end
> > of the year :)
> > ------------------------------
> > Stewart Smith
> > stewartsmith@mac.com
> > Ph: +61 4 3884 4332
> > ICQ: 6734154
> 
> With most devices used for file-systems, almost all writes succeed.
> So the file-system doesn't even know that there was some error
> until it tries to read the data, probably next week. Through the
> ages, attempts to fix this have destroyed any real I/O capability.

The fix is to dispense with the disk device altogether, and have a
huge battery-backed RAM.  It's practical already - two gigs of ECC RAM
and some logic to make it appear as an IDE or SCSI device would cost
very little to build.

In fact, you don't even need to do that.

Just put three gigs of RAM in an existing machine, and set it to boot
from CD with the root filesystem on a RAM disk, and use further RAM
disks for all of your partitions.  Copy the contents of the RAM disk
containing user data over the LAN to another box every 30 minutes.
Patch the kernel to dump the contents of the RAM disks to another box
over the LAN if it oopses.
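
As a rough user-space sketch of the "copy every 30 minutes" part only
(not the kernel patch), something like the following would do.  It
assumes the other box is reachable as an ordinary mounted directory,
e.g. over NFS; the function names and any paths passed in are
hypothetical, not taken from a real setup.

```python
import shutil
import time
from pathlib import Path

INTERVAL_SECONDS = 30 * 60  # "every 30 minutes"


def snapshot(src: Path, dest_root: Path, stamp: str) -> Path:
    """Copy the tree at src into dest_root/<stamp> and return that path.

    The copy goes to a ".partial" directory first and is renamed once it
    is complete, so a crash mid-copy never leaves a half-written
    snapshot under the final name.
    """
    target = dest_root / stamp
    partial = dest_root / (stamp + ".partial")
    shutil.copytree(src, partial)  # also creates dest_root if missing
    partial.rename(target)
    return target


def run_forever(src: Path, dest_root: Path) -> None:
    """Take a timestamped snapshot of src, sleep, and repeat."""
    while True:
        snapshot(src, dest_root, time.strftime("%Y%m%d-%H%M%S"))
        time.sleep(INTERVAL_SECONDS)
```

Calling run_forever() with the RAM disk mount point and the remote
mount as arguments gives one full copy per interval.  Anything written
since the last snapshot is still lost on power failure, which is the
trade-off of the scheme.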

I've actually thought of co-locating a machine and running a webserver
entirely from RAM this way.

John.
