From: Sylvain Rochet <gradator-XWGZPxRNpGHk1uMJSBkQmQ@public.gmane.org>
To: Theodore Tso <tytso-3s7WtUTddSA@public.gmane.org>
Cc: Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Fw: 2.6.28.9: EXT3/NFS inodes corruption
Date: Fri, 24 Apr 2009 01:14:14 +0200	[thread overview]
Message-ID: <20090423231414.GA32422@gradator.net> (raw)
In-Reply-To: <20090423001139.GX15541-3s7WtUTddSA@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 2379 bytes --]

Hi,


On Wed, Apr 22, 2009 at 08:11:39PM -0400, Theodore Tso wrote:
> 
> On the server side, that means you also had an inode table block that
> looked corrupted.  I'm pretty sure that if you used debugfs to examine
> those blocks you would have seen that the inodes were complete garbage.

Yep, I destroyed all the evidence by using badblocks in read-write mode, 
but if we really need it again we just have to put production back on 
the primary array and wait a few days.


> Depending on the inode size, and assuming a 4k block size, there are
> typically 128 or 64 inodes in a 4k block,

4k block size
128 bytes/inode

so 32 inodes per 4k block in our case?

Since the new default is 256 bytes/inode and values of less than 128 are 
not allowed, how is it possible to store 64 or 128 inodes in a 4k block?
(Maybe I'm missing something :p)
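For reference, the arithmetic is plain integer division, nothing ext3-specific (a minimal sketch):

```python
def inodes_per_block(block_size, inode_size):
    # Each block of the on-disk inode table holds block_size // inode_size inodes.
    return block_size // inode_size

print(inodes_per_block(4096, 128))  # 32 -- our case
print(inodes_per_block(4096, 256))  # 16 -- the newer mkfs default
# 64 or 128 inodes per 4k block would need 64- or 32-byte inodes,
# which ext3 does not allow (128 bytes is the minimum inode size).
```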


> so if you were to look at the inodes by inode number, you normally 
> find that adjacent inodes are corrupted within a 4k block.  Of course, 
> this just tells us what had gotten damaged; whether it was damaged by 
> a kernel bug, a memory bug, a hard drive or controller failure (and 
> there are multiple types of storage stack failures; complete garbage 
> getting written into the right place, and the right data getting 
> written into the wrong place).

Yes, it is not going to be easy to find out what is responsible, but 
based on which kind of hardware tends to fail most easily, let's point 
at one of the hard drives :-)


> Well, sure, but any amount of corruption is extremely troubling....

Yep ;-)


> > By the way, if such corruption doesn't happen on the backup storage 
> > array we can conclude it is a hardware problem around the primary one, 
> > but we are not going to be able to conclude before a few weeks.
> 
> Good luck!!

Thanks, actually this isn't so bad; we are glad to have backup hardware 
(the kind of thing we always consider useless until we -really- need it -- 
"Who said they like backups? I heard it from the back of the room." ;-)

By the way, the badblocks check is going to take 12 days at the current 
rate. However, I ran some data checks on the raid6 array in the past, 
mainly when the filesystem got corrupted, and every check succeeded. 
Maybe the raid6 driver computed the parity stripes from already-corrupted 
data.
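To illustrate that last point (a sketch of the principle, not the md driver's actual code): parity is computed over whatever data the driver is handed, so garbage written through the normal write path gets matching parity, and a later consistency check still passes. In byte-wise XOR terms (the RAID5/6 "P" parity):

```python
from functools import reduce

def xor_parity(blocks):
    # Byte-wise XOR across the data blocks of one stripe (the "P" parity).
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
parity = xor_parity(stripe)

# Garbage written through the normal write path also updates the parity:
stripe[0] = b"\xff\xff"
parity = xor_parity(stripe)

# A scrub only compares the stored parity against recomputed parity, so
# the corrupted stripe still checks out as "consistent":
assert xor_parity(stripe) == parity
```

So a clean check only rules out corruption introduced below the md layer after the last write, not garbage that arrived through it.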


Sylvain

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


Thread overview: 8+ messages
2009-04-22 21:24 Fw: 2.6.28.9: EXT3/NFS inodes corruption Andrew Morton
2009-04-22 22:44 ` Theodore Tso
     [not found]   ` <20090422224455.GV15541-3s7WtUTddSA@public.gmane.org>
2009-04-22 23:48     ` Sylvain Rochet
2009-04-22 23:48       ` Sylvain Rochet
     [not found]       ` <20090422234823.GA24477-XWGZPxRNpGHk1uMJSBkQmQ@public.gmane.org>
2009-04-23  0:11         ` Theodore Tso
2009-04-23  0:11           ` Theodore Tso
     [not found]           ` <20090423001139.GX15541-3s7WtUTddSA@public.gmane.org>
2009-04-23 23:14             ` Sylvain Rochet [this message]
2009-04-23 23:14               ` Sylvain Rochet
