From: Neil Brown <neilb@cse.unsw.edu.au>
To: "Otto Meier" <gf435@gmx.net>, Holger Kiehl <Holger.Kiehl@dwd.de>,
	Hans Reiser <reiser@namesys.com>,
	edward@namesys.com, Ed Tomlinson <tomlins@cam.org>,
	Nils Rennebarth <nils@ipe.uni-stuttgart.de>,
	Manfred Spraul <manfred@colorfullife.com>,
	David Willmore <n0ymv@callsign.net>,
	Linus Torvalds <torvalds@transmeta.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Subject: [PATCH] - filesystem corruption on soft RAID5 in 2.4.0+
Date: Mon, 22 Jan 2001 07:47:42 +1100 (EST)
Message-ID: <14955.19182.663691.194031@notabene.cse.unsw.edu.au>


There have been assorted reports of filesystem corruption on raid5 in
2.4.0, and I have finally got a patch - see below.
I don't know if it addresses everybody's problems, but it fixed a very
real problem that is very reproducible.

The problem is that parity can be calculated wrongly when doing a
read-modify-write update cycle.  If you have a fully functional array,
you won't notice this problem as the parity block is never used to return
data.  But if you have a degraded array, you will get corruption very
quickly.
So I think this will solve the reported corruption with ext2fs, as I
think they were mostly on degraded arrays.  I have no idea whether it
will address the reiserfs problems as I don't think anybody reporting
those problems described their array.

In any case, please apply, and let me know of any further problems.


--- ./drivers/md/raid5.c	2001/01/21 04:01:57	1.1
+++ ./drivers/md/raid5.c	2001/01/21 20:36:05	1.2
@@ -714,6 +714,11 @@
 		break;
 	}
 	spin_unlock_irq(&conf->device_lock);
+	if (count>1) {
+		xor_block(count, bh_ptr);
+		count = 1;
+	}
+	
 	for (i = disks; i--;)
 		if (chosen[i]) {
 			struct buffer_head *bh = sh->bh_cache[i];


From my notes for this patch:

   For the read-modify-write cycle, we need to calculate the xor of a
   bunch of old blocks and bunch of new versions of those blocks.  The
   old and new blocks occupy the same buffer space, and because xoring
   is delayed until we have lots of buffers, it could get delayed too
   much and parity doesn't get calculated until after data had been
   over-written.

   This patch flushes any pending xor's before copying over old buffers.
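
The ordering problem described in these notes can be illustrated with a
small user-space toy model.  This is only a sketch and not the kernel
code: the names xor_into(), rmw_update(), rebuild_ok() and BLK are
hypothetical, the stripe is assumed to have just two data blocks plus
parity, and the flush_first flag stands in for the early xor_block()
call that the patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLK 8	/* toy block size in bytes (assumption) */

/* dst ^= src, byte by byte */
static void xor_into(unsigned char *dst, const unsigned char *src, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dst[i] ^= src[i];
}

/*
 * Read-modify-write of one data block.  'buf' first holds the old
 * data and is then overwritten with the new data (shared buffer).
 * flush_first != 0: fold the old data into parity before the copy.
 * flush_first == 0: the xor is delayed until after the copy, so it
 *                   sees the new data where the old data should be.
 */
static void rmw_update(unsigned char *parity, unsigned char *buf,
		       const unsigned char *new_data, int flush_first)
{
	assert(parity && buf && new_data);
	if (flush_first)
		xor_into(parity, buf, BLK);	/* parity ^= old data */
	memcpy(buf, new_data, BLK);		/* old data is gone now */
	if (!flush_first)
		xor_into(parity, buf, BLK);	/* too late: this is new data */
	xor_into(parity, buf, BLK);		/* parity ^= new data */
}

/*
 * Returns 1 if the block on the "failed" disk can still be rebuilt
 * from parity after the update, 0 if the rebuilt block is corrupt.
 */
static int rebuild_ok(int flush_first)
{
	unsigned char d0[BLK] = "old-dat";
	unsigned char d1[BLK] = "other..";
	const unsigned char new0[BLK] = "new-dat";
	unsigned char parity[BLK], rebuilt[BLK];

	memcpy(parity, d0, BLK);
	xor_into(parity, d1, BLK);		/* parity = d0 ^ d1 */

	rmw_update(parity, d0, new0, flush_first);

	/* Degraded array: the disk holding d1 is lost, rebuild it. */
	memcpy(rebuilt, parity, BLK);
	xor_into(rebuilt, d0, BLK);
	return memcmp(rebuilt, d1, BLK) == 0;
}
```

In this model rebuild_ok(1) returns 1 and rebuild_ok(0) returns 0: with
the late xor the parity is left stale, which goes unnoticed until the
array is degraded and the parity is actually used to return data.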


Everybody running raid5 on 2.4.0 or 2.4.1-pre really should apply this
patch, and then arrange to get parity checked and corrected on their
array.
There currently isn't a clean way to correct parity.
One way would be to shut down to single user, remount all filesystems
readonly, or unmount them, and then pull the plug.
On reboot, raid will rebuild parity, but the filesystems should be
clean.
An alternative is to rerun mkraid giving exactly the right configuration.
This doesn't require pulling the plug, but if you get the config file
wrong, you could lose your data.

NeilBrown
