From: Theodore Tso <tytso@mit.edu>
To: Sylvain Rochet <gradator-XWGZPxRNpGHk1uMJSBkQmQ@public.gmane.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: Fw: 2.6.28.9: EXT3/NFS inodes corruption
Date: Wed, 22 Apr 2009 20:11:39 -0400	[thread overview]
Message-ID: <20090423001139.GX15541@mit.edu> (raw)
In-Reply-To: <20090422234823.GA24477-XWGZPxRNpGHk1uMJSBkQmQ@public.gmane.org>

On Thu, Apr 23, 2009 at 01:48:23AM +0200, Sylvain Rochet wrote:
> > 
> > This is on the client side; what happens when you look at the same
> > directory from the server side?
> 
> This is on the server side ;)
> 

On the server side, that means an inode table block also looks
corrupted.  I'm pretty sure that if you used debugfs to examine those
blocks you would have seen that the inodes were complete garbage.
Depending on the inode size, and assuming a 4k block size, there are
typically 32 or 16 inodes in a 4k block, so if you were to look at
the inodes by inode number, you would normally find that adjacent
inodes within the same 4k block are corrupted.  Of course, this just
tells us what had gotten damaged; it doesn't tell us whether it was
damaged by a kernel bug, a memory error, or a hard drive or
controller failure (and there are multiple types of storage stack
failures: complete garbage getting written into the right place, and
the right data getting written into the wrong place).
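
As a point of reference, a minimal debugfs inspection along these
lines would show this; the inode number 12345 is only illustrative,
assuming the filesystem sits on /dev/md10 as elsewhere in this
thread:

  # Find which inode table block holds inode 12345
  debugfs -R 'imap <12345>' /dev/md10

  # Hex-dump that block, using the block number imap reported
  debugfs -R 'bd BLOCKNR' /dev/md10

  # Decode the fields of the suspect inode itself
  debugfs -R 'stat <12345>' /dev/md10

If several consecutive inode numbers in the same table block all dump
as garbage, that points to a whole-block overwrite rather than a bug
that corrupts a single inode.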

> Well, these are the inode numbers of directories with entries 
> pointing to nonexistent inodes; of course we cannot delete these 
> directories anymore through a regular recursive deletion (well, not 
> without debugfs ;).  Considering the number of inodes, this is quite 
> a low corruption rate.

Well, sure, but any amount of corruption is extremely troubling....
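
As an aside, removing such a dangling directory entry by hand would
look roughly like this; the pathname is hypothetical, and the
filesystem has to be unmounted first:

  # Open the filesystem read-write and drop the bad entry; note that
  # debugfs's unlink does not adjust inode reference counts
  debugfs -w -R 'unlink /some/dir/bad-entry' /dev/md10

  # Let e2fsck fix up link counts and any remaining inconsistencies
  e2fsck -f /dev/md10

Running e2fsck over the whole filesystem is usually the safer route
in any case.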

> Yes, this is what we thought too, especially because we have used 
> ext3/NFS for a very long time without problems like this. I moved 
> all the data to the backup array, so we can now do read-write tests 
> on the primary one without impacting production much.
> 
> So, let's check the raid6 array; well, this is going to take a few days.
> 
> # badblocks -w -s /dev/md10
> 
> If everything goes well I will check disk by disk.
> 
> By the way, if such corruption doesn't happen on the backup storage 
> array, we can conclude that it is a hardware problem around the 
> primary one, but we are not going to be able to reach a conclusion 
> for a few weeks.
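
For what it's worth, the disk-by-disk pass might look like the sketch
below; the member device names are made up, and badblocks -w is
destructive, so the disks must be out of the array (or the data
already moved off, as here):

  # Destructive write test of each member disk in turn
  for disk in /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
      badblocks -w -s -v "$disk"
  done

  # Optionally, also start a long SMART self-test on each disk
  smartctl -t long /dev/sdb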

Good luck!!

						- Ted
