linux-lvm.redhat.com archive mirror
From: Jos Visser <josv@osp.nl>
To: Andi Kleen <ak@suse.de>
Cc: linux-lvm@msede.com
Subject: Re: [linux-lvm] LVM 0.8 and reiser filesystem
Date: Wed, 7 Jun 2000 18:04:55 +0200
Message-ID: <20000607180455.Y3279@jadzia.josv.com>
In-Reply-To: <20000607145954.A22712@gruyere.muc.suse.de>; from ak@suse.de on Wed, Jun 07, 2000 at 02:59:54PM +0200

And thus it came to pass that Andi Kleen wrote:
(on Wed, Jun 07, 2000 at 02:59:54PM +0200 to be exact)

> On Wed, Jun 07, 2000 at 02:00:43PM +0200, Luca Berra wrote:
> > On Tue, Jun 06, 2000 at 06:41:38PM +0200, Andi Kleen wrote:
> > > On a real production system you probably should not use software RAID1
> > > or RAID5, though. It is unreliable in the crash case because
> > > it does not support data logging. In this case a hardware RAID controller
> > > is the better alternative. Of course you can run LVM on top of it.
> > I fail to get your point: what makes hw raid more reliable than sw raid?
> > Why are you saying that sw raid is unreliable?
> 
> RAID1 and RAID5 require atomic update of several blocks (parity or mirror
> blocks). If the machine crashes in between the writes of such an atomic
> update, the array becomes inconsistent.
> 
> In RAID5 that is very bad: e.g. when the parity block is not up to date
> and another block is unreadable, you get silent data corruption. In
> RAID1 with a slave device you at worst get outdated data (which may cause
> problems with journaled file systems or programs that rely on fsync/O_SYNC
> to really guarantee stable on-disk storage). raidcheck can fix that in
> a lot of cases, but not in all: sometimes it cannot decide if a
> block contains old or new data.
> 
> Hardware RAID usually avoids the problem by using a battery-backed
> log device for atomic updates. Software RAID could do the same
> by logging block updates in a log (e.g. together with the journaled
> file system), but that is not implemented in Linux ATM. It would
> also be a severe performance hit.

The way HP's logical volume manager does it is by maintaining a kind of
data log in the volume metadata.  This log (let's call it the
Mirror Write Cache, or MWC) is effectively a bitmap that keeps track of
which regions of the logical volume have writes in flight.  The unit of
granularity here is not an individual block, but something
called a Logical Track Group (LTG, let's say a couple of MB).  Once
all parallel writes to an LTG have finished, the corresponding LTG bit in
the MWC is cleared and the MWC on disk is (eventually) updated.
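
To make the bookkeeping concrete, here is a rough sketch in Python. It is
not HP's implementation; the class name, the method names, the 4 MB LTG
size and the per-LTG in-flight counter are all assumptions made for
illustration. It also assumes the dirty bit has to be recorded before the
data writes are issued, which is implied by the scheme but not spelled out
above.

```python
LTG_SIZE = 4 * 1024 * 1024  # assumed LTG size in bytes ("a couple of MB")


class MirrorWriteCache:
    """Toy in-memory model of an MWC: one dirty bit per LTG (hypothetical)."""

    def __init__(self, volume_size):
        n_ltgs = (volume_size + LTG_SIZE - 1) // LTG_SIZE
        self.dirty = [False] * n_ltgs   # the bitmap itself
        self.in_flight = [0] * n_ltgs   # outstanding writes per LTG

    def _ltgs(self, offset, length):
        """Indices of all LTGs touched by a write of `length` bytes at `offset`."""
        first = offset // LTG_SIZE
        last = (offset + length - 1) // LTG_SIZE
        return range(first, last + 1)

    def begin_write(self, offset, length):
        # Assumption: a real MWC would persist the set bit before the
        # mirrored data writes are started.
        for i in self._ltgs(offset, length):
            self.in_flight[i] += 1
            self.dirty[i] = True

    def end_write(self, offset, length):
        # Clear the bit only when no other write to the same LTG is pending;
        # the on-disk copy can then be updated lazily.
        for i in self._ltgs(offset, length):
            self.in_flight[i] -= 1
            if self.in_flight[i] == 0:
                self.dirty[i] = False

    def dirty_ltgs(self):
        return [i for i, d in enumerate(self.dirty) if d]


mwc = MirrorWriteCache(16 * 1024 * 1024)
mwc.begin_write(LTG_SIZE - 10, 20)   # straddles LTG 0 and LTG 1
print(mwc.dirty_ltgs())
mwc.end_write(LTG_SIZE - 10, 20)
print(mwc.dirty_ltgs())
```

The coarse granularity is the point of the design: one bit covers many
blocks, so the bitmap stays small and rarely needs to be rewritten.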

When the Volume Group is activated after a crash, all copies (plexes)
of a volume must be synchronized. The volume manager software inspects
the MWC, and then knows which LTGs might be out of sync across the plexes.
Only these are then synchronized, using a read from the preferred plex
and a write to all other plexes. The MWC thus avoids a full sync
after a crash.
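
The recovery step could look roughly like the Python sketch below. Again
this is only an illustration under assumptions: plexes are modelled as
in-memory bytearrays, the LTG size is shrunk to 4 bytes so the example
stays readable, and the function name `resync` is made up.

```python
LTG_SIZE = 4  # tiny assumed LTG size in bytes, for illustration only


def resync(plexes, dirty_ltgs, preferred=0):
    """Copy each dirty LTG from the preferred plex to all other plexes.

    Returns the number of bytes written, to show how little is copied
    compared with a full sync.
    """
    source = plexes[preferred]
    copied = 0
    for ltg in dirty_ltgs:
        start, end = ltg * LTG_SIZE, (ltg + 1) * LTG_SIZE
        chunk = source[start:end]
        for i, plex in enumerate(plexes):
            if i != preferred:
                plex[start:end] = chunk
                copied += len(chunk)
    return copied


plex_a = bytearray(b"AAAABBBBCCCC")  # preferred plex
plex_b = bytearray(b"AAAAbbbbCCCC")  # LTG 1 differs after the "crash"
written = resync([plex_a, plex_b], dirty_ltgs=[1])
print(written, plex_b == plex_a)
```

Note that the sync does not decide which plex holds the newer data; it
just makes the plexes identical again, using whatever the preferred plex
contains.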

++Jos


-- 
The InSANE quiz master is always right!
(or was it the other way round? :-)
