linux-kernel.vger.kernel.org archive mirror
From: Ville Herva <vherva@niksula.hut.fi>
To: linux-kernel@vger.kernel.org, Willy Tarreau <willy@w.ods.org>
Subject: Re: Something corrupts raid5 disks slightly during reboot
Date: Wed, 14 Jan 2004 16:46:46 +0200
Message-ID: <20040114144646.GS11115091@niksula.cs.hut.fi>
In-Reply-To: <20040102194200.GA11115091@niksula.cs.hut.fi>

On Fri, Jan 02, 2004 at 09:42:00PM +0200, you [Ville Herva] wrote:
> Summary:                                                                   
> 
> I've been experiencing strange corruption on a raid5 volume for some time. 
> The kernel is 2.2.x + RAID-0.90 patch. Fs is ext2 (+e2compr). After        
> unmounting the filesystem, I can mount it again without problems. I can also
> raidstop the raid device in between and all is still fine:
> 
> > umount /dev/md4; mount /dev/md4
>     - no corruption              
> > umount /dev/md4; raidstop /dev/md4; raidstart /dev/md4; mount /dev/md4
>     - no corruption                                                     
> 
> But after a reboot, the filesystem is corrupted - a few bytes differ near
> the beginning of /dev/md4, between 1k and 5k.
> 
> See the threads
>   http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=MMYt.4B2.1%40gated-at.bofh.it&rnum=1&prev=/groups%3Fnum%3D50%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26q%3DSomething%2Bcorrupts%2Braid5%2Bdisks%2Bslightly%2Bduring%2Breboot%26sa%3DN%26tab%3Dwg
>   http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=MZsH.72R.5%40gated-at.bofh.it&rnum=4&prev=/groups%3Fnum%3D50%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26q%3DSomething%2Bcorrupts%2Braid5%2Bdisks%2Bslightly%2Bduring%2Breboot%26sa%3DN%26tab%3Dwg
> for details.
(...) 
> I found out that the difference (corruption) is usually on three bytes on
> /dev/hdg, but sometimes on /dev/hdc, too. (/dev/md4 = hdb+hdc+hdg; hdb&hdc
> are on i810, hdg is on hpt370).
> 
> First, I did
>    umount /dev/md4
>    raidstop /dev/md4
>    head -c 50k /dev/hdg > /save/hdg
>    reboot
> 
> To rule out kernel raid autodetect and raid code in general, I
> booted 2.2.25-1-secure with "single init=/bin/bash raid=noautodetect".
>  Did
>    head -c50k /dev/hdg | cmp -l /save/hdg
>  Three bytes differed:
>    bytepos after before
>            boot  boot
>    4641    0     35
>    4642    0     205
>    4643    0     10
> 
>  Wrote the original stuff back:
>    dd if=/save/hdg of=/dev/hdg
>    sync
>    hdparm -W0 /dev/hdg
>    sync
>    reboot
> 
> Booted 2.2.25-1-secure with "single init=/bin/bash raid=noautodetect"
> again.
>  Did
>    head -c50k /dev/hdg | cmp -l /save/hdg
>  The same three bytes differed again.
>  Wrote the stuff back, sync'ed, did hdparm, and powered off. Still, the
> bytes differed on the next boot.
> 
> Then I booted 2.4.21-jam1 with "single init=/bin/bash raid=noautodetect" (I
> happened to have 2.4.21-jam1 compiled with suitable drivers at hand).
>  Wrote the same stuff back with dd, synced, turned ide cache off.
>  Booted 2.4.21-jam1 with "single init=/bin/bash raid=noautodetect" again.
>  Did the diff; the three bytes differed again.
> 
> Note that sometimes a few bytes on hdc differed, too. Usually it was just
> the three hdg bytes.
> 
> So this is not a 2.2 kernel issue. I very much doubt it's a kernel issue at
> all, unless it is a bug in kernel partition detection that is still present
> in 2.4.x.
>          
> I tried to turn off the ide write cache with hdparm -W0, so it shouldn't  
> be a write caching issue.
> 
> If it's a bios issue, it's really a strange one, since it affects both disks
> on i810 ide and on hpt370. The disks have no partition table, though, which
> _could_ confuse the bios.
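
For reading the cmp -l listing in the quoted text: cmp -l prints the byte
number in decimal, counted from 1, and the two byte values in octal. The
sketch below converts such a line to a 0-based offset, a sector number and hex
values. The helper name decode_cmp_line and the 512-byte sector size are
assumptions made for illustration, not something taken from the measurements.

```shell
#!/bin/sh
# decode_cmp_line is a hypothetical helper; 512-byte sectors are assumed.
# Arguments: one line of `cmp -l` output split into its three fields
# (1-based decimal byte number, then two byte values in octal).
decode_cmp_line() {
    pos=$1
    col2=$2
    col3=$3
    off=$((pos - 1))          # cmp counts bytes from 1; offsets start at 0
    sector=$((off / 512))     # assumed sector size
    within=$((off % 512))
    # A leading 0 makes the shell read the value as an octal constant.
    printf 'offset=%d sector=%d byte_in_sector=%d col2=0x%02x col3=0x%02x\n' \
        "$off" "$sector" "$within" "$((0$col2))" "$((0$col3))"
}

decode_cmp_line 4641 0 35
decode_cmp_line 4642 0 205
decode_cmp_line 4643 0 10
```

For the three lines in the listing this gives 0-based offsets 4640-4642, which
is 32 bytes into sector 9 if the sectors are 512 bytes, with the nonzero
column reading 0x1d 0x85 0x08.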

Addition: 

  - I rebooted from 2.6.1 single-user mode into 2.6.1 single-user
    mode (rebooting with sysrq-b to skip the shutdown process):
       ->  The corruption on /dev/hdg happens just as with 2.2 and 2.4

  - I booted from 2.6.1 single-user mode into 2.6.1 single-user
    mode using the kexec patch, so that the BIOS is not entered in between:
       ->  The corruption DOES NOT happen
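
To repeat the same check for each reboot path (normal reboot, sysrq-b, kexec),
the save / compare / write-back steps from the quoted text can be wrapped
roughly as follows. This is only a sketch: the function names are made up for
illustration, 51200 bytes stands in for "head -c 50k", and the functions take
the device and the save file as arguments rather than hard-coding /dev/hdg.

```shell
#!/bin/sh
# Hypothetical helpers; each takes <device-or-file> <savefile>.

# Save the first 50 KiB of the device before rebooting.
snapshot_head() {
    head -c 51200 "$1" > "$2"
}

# Compare the current first 50 KiB against the saved copy.
# Prints one `cmp -l` line per differing byte (byte number, saved value,
# current value; values in octal) and returns 0 only if nothing differs.
compare_head() {
    head -c 51200 "$1" | cmp -l "$2" -
}

# Write the saved copy back. Note the of= operand; conv=notrunc keeps a
# regular file from being truncated and is harmless on a block device.
restore_head() {
    dd if="$2" of="$1" conv=notrunc 2>/dev/null && sync
}

# Example (not run here):
#   snapshot_head /dev/hdg /save/hdg    # before the reboot
#   compare_head  /dev/hdg /save/hdg    # after the reboot
#   restore_head  /dev/hdg /save/hdg    # put the original bytes back
```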

I'm pretty much out of ideas.


-- v --

v@iki.fi

Thread overview: 24+ messages
2003-10-31 19:08 Something corrupts raid5 disks slightly during reboot Ville Herva
2003-11-01  1:41 ` Jeffrey E. Hundstad
2003-11-01  1:57   ` Mike Fedyk
2003-11-01  8:33     ` Ville Herva
2003-11-01  8:27   ` ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot] Ville Herva
2003-11-01 15:56     ` Willy Tarreau
2003-11-01 18:25       ` Ville Herva
2003-11-01 19:01         ` Willy Tarreau
2003-11-01 21:02           ` Ville Herva
2003-11-02  6:05             ` Andre Hedrick
2003-11-02  8:28               ` Ville Herva
2003-11-02 20:57                 ` Matthias Andree
2003-11-03  5:34                 ` Andre Hedrick
2003-11-03  6:38                   ` Ville Herva
2004-01-02 19:42           ` Something corrupts raid5 disks slightly during reboot Ville Herva
2004-01-02 20:02             ` Ville Herva
2004-01-14 14:46             ` Ville Herva [this message]
2004-01-14 22:22               ` Willy Tarreau
2004-01-14 22:46                 ` Ville Herva
2004-01-14 16:39 Samium Gromoff
2004-01-14 22:30 ` Ville Herva
2004-01-15 12:42   ` Samium Gromoff
2004-01-15 19:57     ` Ville Herva
2004-01-16 10:24       ` Samium Gromoff
