* ext4_ext_check_inode: bad header/extent in inode
@ 2009-04-23 12:15 Christian Kujau
  2009-04-23 12:50 ` Eric Sandeen
  0 siblings, 1 reply; 22+ messages in thread
From: Christian Kujau @ 2009-04-23 12:15 UTC (permalink / raw)
  To: linux-ext4

Hi there,

let's say "something" happened to this ext4 partition - OK, I created a 
ZFS pool on a freshly created (unmounted) ext4 partition (for testing 
purposes) and this might have wiped some ext4 information off the 
partition. But I was able to mount the partition again and it looked
as if all the data was in place.

However, a fsck later on revealed and fixed quite a few errors. Now the 
filesystem can still be mounted, but some files cannot be read:

------------------
# mount -t ext4 /dev/md0 /mnt/md0
# ls -la /mnt/md0 /mnt/md0/lost+found
/mnt/md0:
total 28
drwxr-xr-x  4 root  root   4096 Apr 23 13:43 .
drwxr-xr-x  4 root  root   4096 Apr  2 13:39 ..
drwxr-xr-x 23 dummy users  4096 Apr 18 21:06 linux-2.6-git
drwx------  3 root  root  16384 Apr 23 13:43 lost+found
ls: cannot access /mnt/md0/lost+found/#12042: Input/output error
ls: cannot access /mnt/md0/lost+found/#12207: Input/output error
ls: cannot access /mnt/md0/lost+found/#12249: Input/output error
-------------------


I realize that "creating a filesystem on an ext4 partition" may indeed 
harm ext4 information and I don't expect fsck to get everything fixed - 
but then I think: in the real world this "destruction" could be caused 
by bad memory/cables or just a disk controller gone mad - so yes, some 
ext4 information may have been lost, but:

Shouldn't fsck (1.41.3) complain more when there are errors left
on the filesystem? Even if the errors cannot be fixed, I'd have 
expected fsck to tell me about that. Instead, fsck exits clean on the
second run while a few files are still inaccessible.

Details about the process (mkfs, fsck, etc.), a .config and kernel logs 
are here: http://nerdbynature.de/bits/2.6.30-rc2/

Thanks,
Christian.
-- 
Only Bruce Schneier is allowed to wear the "I read the NSA's e-mail" t-shirt.


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-23 12:15 ext4_ext_check_inode: bad header/extent in inode Christian Kujau
@ 2009-04-23 12:50 ` Eric Sandeen
  2009-04-23 19:04   ` Christian Kujau
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Sandeen @ 2009-04-23 12:50 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-ext4

Christian Kujau wrote:
> Hi there,
> 
> let's say "something" happened to this ext4 partition - OK, I created a 
> ZFS pool on a freshly created (unmounted) ext4 partition (for testing 
> purposes) and this might have wiped some ext4 information off the 
> partition. But I was able to mount the partition again and it looked
> as if all the data was in place.
> 
> However, a fsck later on revealed and fixed quite a few errors. Now the 
> filesystem can still be mounted, but some files cannot be read:
> 
> ------------------
> # mount -t ext4 /dev/md0 /mnt/md0
> # ls -la /mnt/md0 /mnt/md0/lost+found
> /mnt/md0:
> total 28
> drwxr-xr-x  4 root  root   4096 Apr 23 13:43 .
> drwxr-xr-x  4 root  root   4096 Apr  2 13:39 ..
> drwxr-xr-x 23 dummy users  4096 Apr 18 21:06 linux-2.6-git
> drwx------  3 root  root  16384 Apr 23 13:43 lost+found
> ls: cannot access /mnt/md0/lost+found/#12042: Input/output error
> ls: cannot access /mnt/md0/lost+found/#12207: Input/output error
> ls: cannot access /mnt/md0/lost+found/#12249: Input/output error
> -------------------
> 
> 
> I realize that "creating a filesystem on an ext4 partition" may indeed 
> harm ext4 information and I don't expect fsck to get everything fixed - 
> but then I think: in the real world this "destruction" could be caused 
> by bad memory/cables or just a disk controller gone mad - so yes, some 
> ext4 information may have been lost, but:
> 
> Shouldn't fsck (1.41.3) complain more when there are errors left
> on the filesystem? Even if the errors cannot be fixed, I'd have 
> expected fsck to tell me about that. Instead, fsck exits clean on the
> second run while a few files are still inaccessible.

Yep, probably so; based on:

[400026.511081] EXT4-fs error (device md0): ext4_ext_check_inode: bad
header/extent in inode #12120: invalid magic - magic 702, entries 30990,
max 5120(0), depth 55352(55352)
[400026.519200] EXT4-fs error (device md0): ext4_ext_check_inode: bad
header/extent in inode #12272: invalid magic - magic 1c2b, entries 6928,
max 14(0), depth 4116(4116)

I'd have expected fsck to find that, I think.  I'd first suggest using
1.41.4 or 1.41.5 (probably released very soon) and see if that catches
it (I don't remember offhand if there is a relevant change since 1.41.3
but the check should be easy...)

-Eric
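
For readers following along: the log lines above come from the kernel's
sanity check of the extent header stored at the start of an inode's
i_block area. A rough Python sketch of that check follows. It assumes the
on-disk layout of struct ext4_extent_header (four little-endian 16-bit
fields followed by a 32-bit generation) and the magic value 0xF30A. The
function names are made up for illustration, and only a subset of the
kernel's checks is shown.

```python
import struct

EXT4_EXT_MAGIC = 0xF30A  # expected eh_magic of an ext4 extent header


def parse_extent_header(raw):
    """Unpack the first 12 bytes of i_block as an extent header."""
    magic, entries, max_entries, depth, generation = struct.unpack(
        "<HHHHI", raw[:12]
    )
    return {
        "magic": magic,
        "entries": entries,
        "max": max_entries,
        "depth": depth,
        "generation": generation,
    }


def header_problem(hdr):
    """Return a short reason if the header looks wrong, else None."""
    if hdr["magic"] != EXT4_EXT_MAGIC:
        return "invalid magic"
    if hdr["max"] == 0:
        return "invalid eh_max"
    if hdr["entries"] > hdr["max"]:
        return "invalid eh_entries"
    return None


# Header values reported for inode #12120 in the log above.
# The kernel prints the magic in hex, so "magic 702" is 0x702.
garbage = struct.pack("<HHHHI", 0x702, 30990, 5120, 55352, 0)
print(header_problem(parse_extent_header(garbage)))
```

With random data in the inode table, the magic almost never matches, which
is why every affected inode reports "invalid magic" before any other
field is even looked at.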


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-23 12:50 ` Eric Sandeen
@ 2009-04-23 19:04   ` Christian Kujau
  2009-04-23 20:40     ` Theodore Tso
  2009-04-23 20:51     ` Andreas Dilger
  0 siblings, 2 replies; 22+ messages in thread
From: Christian Kujau @ 2009-04-23 19:04 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

On Thu, 23 Apr 2009, Eric Sandeen wrote:
> I'd have expected fsck to find that, I think.  I'd first suggest using
> 1.41.4 or 1.41.5 (probably released very soon) and see if that catches
> it (I don't remember offhand if there is a relevant change since 1.41.3
> but the check should be easy...)

Yes, in fact I _did_ have the latest e2fsprogs.git checkout [0] in 
place, but did not use it. OK, compiled that, e2fsck still presents itself 
as "1.41.4" (which tree do I have to follow to get the 1.41.5 one?) but
was not able to fix the errors either. Again, I do not expect e2fsck to 
actually fix it, because the damage I did to the fs was probably too 
severe. But when fsck exits with code 0, I'd "expect" it to be clean. So, 
I guess what I want is fsck to tell me to get my backups ready, as the fs 
is damaged too heavily...

Christian.

[0] git://git.eu.kernel.org/pub/scm/fs/ext2/e2fsprogs.git
-- 
The Phaistos Disc had a hieroglyph that translates to "Bruce Schneier".
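
As background to the "exit code 0" point: e2fsck(8) documents its exit
status as a sum of condition bits. The Python sketch below decodes such a
status. The bit meanings are taken from the man page; the helper names are
hypothetical.

```python
# Exit status bits as documented in e2fsck(8).
E2FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def describe_e2fsck_exit(status):
    """Translate an e2fsck exit status into a list of conditions."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_EXIT_BITS.items() if status & bit]


def errors_remain(status):
    """True if e2fsck itself reports that it left errors uncorrected."""
    return bool(status & 4)


print(describe_e2fsck_exit(0))
print(describe_e2fsck_exit(1 | 4))
```

A status of 0 only says that e2fsck found nothing it knows how to
complain about. As this thread shows, the kernel can still reject an inode
that a given e2fsck version considers fine.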


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-23 19:04   ` Christian Kujau
@ 2009-04-23 20:40     ` Theodore Tso
  2009-04-23 22:15       ` Christian Kujau
  2009-04-23 20:51     ` Andreas Dilger
  1 sibling, 1 reply; 22+ messages in thread
From: Theodore Tso @ 2009-04-23 20:40 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Eric Sandeen, linux-ext4

On Thu, Apr 23, 2009 at 12:04:38PM -0700, Christian Kujau wrote:
> On Thu, 23 Apr 2009, Eric Sandeen wrote:
> > I'd have expected fsck to find that, I think.  I'd first suggest using
> > 1.41.4 or 1.41.5 (probably released very soon) and see if that catches
> > it (I don't remember offhand if there is a relevant change since 1.41.3
> > but the check should be easy...)
> 
> Yes, in fact I _did_ have the latest e2fsprogs.git checkout [0] in 
> place, but did not use it. OK, compiled that, e2fsck still presents itself 
> as "1.41.4" (which tree do I have to follow to get the 1.41.5 one?) but
> was not able to fix the errors either. Again, I do not expect e2fsck to 
> actually fix it, because the damage I did to the fs was probably too 
> severe. But when fsck exits with code 0, I'd "expect" it to be clean. So, 
> I guess what I want is fsck to tell me to get my backups ready, as the fs 
> is damaged too heavily...

Hmm, it really should have detected it.  OK, if you still have the
filesystem around, can you first start by sending me an e2image file:

e2image /dev/md0 /tmp/md0.e2i

This will dump out the superblock, block group descriptors, and
inode table, and it will allow me to take a look at the inodes in
question.

I tried corrupting the eh_magic field in a test filesystem, and e2fsck
caught it no problem.

Eventually I might need a raw e2image dump, i.e.:


	   e2image -r /dev/md0 - | bzip2 > /var/tmp/md0.e2i.bz2

but such things are very large, and reveal more information, since it
also includes directory names.  But let's see if a simple e2image file
is enough for me to get started.

Thanks,

					- Ted


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-23 19:04   ` Christian Kujau
  2009-04-23 20:40     ` Theodore Tso
@ 2009-04-23 20:51     ` Andreas Dilger
  1 sibling, 0 replies; 22+ messages in thread
From: Andreas Dilger @ 2009-04-23 20:51 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Eric Sandeen, linux-ext4

On Apr 23, 2009  12:04 -0700, Christian Kujau wrote:
> On Thu, 23 Apr 2009, Eric Sandeen wrote:
> > I'd have expected fsck to find that, I think.  I'd first suggest using
> > 1.41.4 or 1.41.5 (probably released very soon) and see if that catches
> > it (I don't remember offhand if there is a relevant change since 1.41.3
> > but the check should be easy...)

There is a function ext2fs_extent_header_verify() that should be
catching the reported corruption.  The extents regression tests
that we have for e2fsck do detect things like bad headers.

Using "debugfs" with "imap" to find and dump the inode blocks for
the corrupted inodes would be a good start for tracking this down.
If you have the space, saving an "e2image" of the filesystem should
also be done for future debugging.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
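
For context, "imap" in debugfs reports where an inode sits inside the
inode table. The arithmetic behind that can be sketched in Python as
below. All geometry values in the example (inodes per group, inode size,
block size, inode table start blocks) are made-up placeholders; on a real
filesystem they come from the superblock and the group descriptors.

```python
def locate_inode(ino, inodes_per_group, inode_size, block_size, inode_table_start):
    """Return (group, block, offset_in_block) for an inode number.

    inode_table_start maps a block group number to the first block of
    that group's inode table.
    """
    index = ino - 1  # inode numbers start at 1
    group = index // inodes_per_group
    byte_offset = (index % inodes_per_group) * inode_size
    block = inode_table_start[group] + byte_offset // block_size
    return group, block, byte_offset % block_size


# Hypothetical geometry, for illustration only.
tables = {0: 1000, 1: 33000}
print(locate_inode(12042, 8192, 256, 4096, tables))
```

Several inodes share one inode-table block, which fits the observation
later in the thread that a single garbage block damaged a run of
neighbouring inodes.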



* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-23 20:40     ` Theodore Tso
@ 2009-04-23 22:15       ` Christian Kujau
  2009-04-24  3:20         ` Theodore Tso
  0 siblings, 1 reply; 22+ messages in thread
From: Christian Kujau @ 2009-04-23 22:15 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Eric Sandeen, linux-ext4

On Thu, 23 Apr 2009, Theodore Tso wrote:
> Eventually I might need a raw e2image dump, i.e.:
> 
> 	   e2image -r /dev/md0 - | bzip2 > /var/tmp/md0.e2i.bz2

I've put the raw e2image dump on http://nerdbynature.de/bits/2.6.30-rc2/
Do you still need the dump without the "-r"?

I could even delete all the intact files on the filesystem, fill the free 
space with zeros and try to dump the whole image if that would be helpful. 

Thanks,
Christian.
-- 
Bruce Schneier only smiles when he finds an unbreakable cryptosystem. Of
course, Bruce Schneier never smiles.


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-23 22:15       ` Christian Kujau
@ 2009-04-24  3:20         ` Theodore Tso
  2009-04-24  7:09           ` Andreas Dilger
  2009-04-24  8:57           ` Christian Kujau
  0 siblings, 2 replies; 22+ messages in thread
From: Theodore Tso @ 2009-04-24  3:20 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Eric Sandeen, linux-ext4

On Thu, Apr 23, 2009 at 03:15:10PM -0700, Christian Kujau wrote:
> On Thu, 23 Apr 2009, Theodore Tso wrote:
> > Eventually I might need a raw e2image dump, i.e.:
> > 
> > 	   e2image -r /dev/md0 - | bzip2 > /var/tmp/md0.e2i.bz2
> 
> I've put the raw e2image dump on http://nerdbynature.de/bits/2.6.30-rc2/
> Do you still need the dump without the "-r"?

Nope, the raw e2image file was perfect.  This was actually a problem I
knew about, and wanted to get fixed before the e2fsprogs 1.41.5
release.  The problem was that i_file_acl_high was set and the kernel
unconditionally checks for it even though the INCOMPAT_64_BITS flag is
not set.

It's been fixed in e2fsprogs 1.41.5, which I've only just released.
It's available on git, sourceforge, and all of the other usual places.
This should fix the problem for you.

					- Ted


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24  3:20         ` Theodore Tso
@ 2009-04-24  7:09           ` Andreas Dilger
  2009-04-24 11:58             ` Theodore Tso
  2009-04-24  8:57           ` Christian Kujau
  1 sibling, 1 reply; 22+ messages in thread
From: Andreas Dilger @ 2009-04-24  7:09 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Christian Kujau, Eric Sandeen, linux-ext4

On Apr 23, 2009  23:20 -0400, Theodore Ts'o wrote:
> On Thu, Apr 23, 2009 at 03:15:10PM -0700, Christian Kujau wrote:
> > On Thu, 23 Apr 2009, Theodore Tso wrote:
> > > Eventually I might need a raw e2image dump, i.e.:
> > > 
> > > 	   e2image -r /dev/md0 - | bzip2 > /var/tmp/md0.e2i.bz2
> > 
> > I've put the raw e2image dump on http://nerdbynature.de/bits/2.6.30-rc2/
> > Do you still need the dump without the "-r"?
> 
> Nope, the raw e2image file was perfect.  This was actually a problem I
> knew about, and wanted to get fixed before the e2fsprogs 1.41.5
> release.  The problem was that i_file_acl_high was set and the kernel
> unconditionally checks for it even though the INCOMPAT_64_BITS flag is
> not set.

Maybe I'm missing something, but how can i_file_acl_high being used
cause a problem with the file extent maps?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24  3:20         ` Theodore Tso
  2009-04-24  7:09           ` Andreas Dilger
@ 2009-04-24  8:57           ` Christian Kujau
  2009-04-24  9:40             ` Christian Kujau
  1 sibling, 1 reply; 22+ messages in thread
From: Christian Kujau @ 2009-04-24  8:57 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Eric Sandeen, linux-ext4

On Thu, 23 Apr 2009, Theodore Tso wrote:
> It's been fixed in e2fsprogs 1.41.5, which I've only just released.
> It's available on git, sourceforge, and all of the other usual places.
> This should fix the problem for you.

I'm pulling from git.eu.kernel.org and thought that it hasn't been sync'ed 
yet. But git.kernel.org still lists 8203fe506a06524587c18940b6cd19a0592a4bd2
as the last commit.

How do I checkout the latest version? I thought "git pull" would just do 
that.

Thanks,
Christian.
-- 
Bruce Schneier can change most random distributions. With his fists.


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24  8:57           ` Christian Kujau
@ 2009-04-24  9:40             ` Christian Kujau
  2009-04-24 12:00               ` Theodore Tso
  0 siblings, 1 reply; 22+ messages in thread
From: Christian Kujau @ 2009-04-24  9:40 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Eric Sandeen, linux-ext4

On Fri, 24 Apr 2009, Christian Kujau wrote:
> On Thu, 23 Apr 2009, Theodore Tso wrote:
> > It's been fixed in e2fsprogs 1.41.5, which I've only just released.
> > It's available on git, sourceforge, and all of the other usual places.
> > This should fix the problem for you.

I got it from sourceforge now and it did indeed "do" something:

sid:~# /opt/e2fsprogs/sbin/fsck.ext4 -v /dev/md0
e2fsck 1.41.5 (23-Apr-2009)
/dev/md0 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
i_file_acl_hi for inode 12042 (/lost+found/#12042) is 64578, should be zero. Clear? yes
i_file_acl_hi for inode 12207 (/lost+found/#12207) is 5, should be zero. Clear? yes
i_file_acl_hi for inode 12249 (/lost+found/#12249) is 5, should be zero. Clear? yes
i_file_acl_hi for inode 12090 (/t/#12090) is 5932, should be zero. Clear? yes

...but the "Input/output error" messages persist :-\

Full scriptlog: http://nerdbynature.de/bits/2.6.30-rc2/scriptlog2.txt

Maybe the repair with 1.41.{3,4} before did something wrong and I should 
use 1.41.5 from the beginning? This issue is somewhat reproducible (given 
the fs holds enough files), so I could start over, corrupt the fs again 
and let e2fsprogs-1.41.5 do the whole job.

Christian.
-- 
Bruce Schneier does not leak information on the EM spectrum: he emits the theme
to The Good, The Bad, and The Ugly.


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24  7:09           ` Andreas Dilger
@ 2009-04-24 11:58             ` Theodore Tso
  2009-04-24 20:09               ` Andreas Dilger
  0 siblings, 1 reply; 22+ messages in thread
From: Theodore Tso @ 2009-04-24 11:58 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Christian Kujau, Eric Sandeen, linux-ext4

On Fri, Apr 24, 2009 at 01:09:43AM -0600, Andreas Dilger wrote:
> On Apr 23, 2009  23:20 -0400, Theodore Ts'o wrote:
> > On Thu, Apr 23, 2009 at 03:15:10PM -0700, Christian Kujau wrote:
> > > On Thu, 23 Apr 2009, Theodore Tso wrote:
> > > > Eventually I might need a raw e2image dump, i.e.:
> > > > 
> > > > 	   e2image -r /dev/md0 - | bzip2 > /var/tmp/md0.e2i.bz2
> > > 
> > > I've put the raw e2image dump on http://nerdbynature.de/bits/2.6.30-rc2/
> > > Do you still need the dump without the "-r"?
> > 
> > Nope, the raw e2image file was perfect.  This was actually a problem I
> > knew about, and wanted to get fixed before the e2fsprogs 1.41.5
> > release.  The problem was that i_file_acl_high was set and the kernel
> > unconditionally checks for it even though the INCOMPAT_64_BITS flag is
> > not set.
> 
> Maybe I'm missing something, but how can i_file_acl_high being used
> cause a problem with the file extent maps?

It didn't; what had happened was that a garbage block had got written
into the inode table.  This caused the kernel to complain about
eh_magic being wrong in the inode table.  E2fsck 1.41.3 fixed those
problems, but it ignored i_file_acl_high because the 64 bit feature
flag was not set.  The kernel always pays attention to
i_file_acl_high, regardless of whether the 64-bit feature flag is set
or not.  Hence, the kernel was complaining and refused to touch those
inodes.   

Actually, on a 2.6.30 kernel it causes the kernel to loop forever with
kernel messages "__find_get_block_slow() failed."

In any case, with e2fsck 1.41.5, I added code to fix i_file_acl_high
getting set, and I was then able to mount the raw e2image file and not
have any problems accessing the files in question, so I'm pretty sure
that was the problem.

						- Ted
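
To illustrate why a stray i_file_acl_high matters: the kernel builds the
extended-attribute block number from a 32-bit low part and a 16-bit high
part. A minimal Python sketch of that combination is below. The function
names and the example block count are assumptions made for illustration.

```python
def file_acl_block(i_file_acl_lo, i_file_acl_high):
    """Combine the low 32 bits and high 16 bits into one block number."""
    return ((i_file_acl_high & 0xFFFF) << 32) | (i_file_acl_lo & 0xFFFFFFFF)


def acl_block_in_range(i_file_acl_lo, i_file_acl_high, blocks_count):
    """True if the combined block number lies inside the filesystem."""
    return file_acl_block(i_file_acl_lo, i_file_acl_high) < blocks_count


# i_file_acl_hi = 5 is the value e2fsck 1.41.5 reported for inode 12207.
# The low part and the block count here are placeholders.
print(file_acl_block(0, 5))
print(acl_block_in_range(0, 5, 2**32))
```

Without the 64-bit feature a filesystem has fewer than 2**32 blocks, so
any non-zero high part points past the end of the device. An fsck that
ignores the field and a kernel that honours it will then disagree about
whether the inode is sane.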


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24  9:40             ` Christian Kujau
@ 2009-04-24 12:00               ` Theodore Tso
  2009-04-24 12:36                 ` Eric Sandeen
  2009-04-24 20:41                 ` Christian Kujau
  0 siblings, 2 replies; 22+ messages in thread
From: Theodore Tso @ 2009-04-24 12:00 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Eric Sandeen, linux-ext4

On Fri, Apr 24, 2009 at 02:40:58AM -0700, Christian Kujau wrote:
> 
> Maybe the repair with 1.41.{3,4} before did something wrong and I should 
> use 1.41.5 from the beginning? This issue is somewhat reproducible (given 
> the fs holds enough files), so I could start over, corrupt the fs again 
> and let e2fsprogs-1.41.5 do the whole job.

I doubt it; the problem was that the repair performed by e2fsck 1.41.3
was simply incomplete.  You can try to reproduce the corrupted
filesystem again, but it should give the same result.

Note that because there was garbage written into the inode table,
there was going to be data loss; there's not much that can be done
about that.  This just simply does a better job cleaning up after the
mess, that's all.

						- Ted



* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 12:00               ` Theodore Tso
@ 2009-04-24 12:36                 ` Eric Sandeen
  2009-04-24 12:42                   ` Eric Sandeen
  2009-04-24 20:21                   ` Christian Kujau
  2009-04-24 20:41                 ` Christian Kujau
  1 sibling, 2 replies; 22+ messages in thread
From: Eric Sandeen @ 2009-04-24 12:36 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Christian Kujau, linux-ext4

Theodore Tso wrote:
> On Fri, Apr 24, 2009 at 02:40:58AM -0700, Christian Kujau wrote:
>> Maybe the repair with 1.41.{3,4} before did something wrong and I should 
>> use 1.41.5 from the beginning? This issue is somewhat reproducible (given 
>> the fs holds enough files), so I could start over, corrupt the fs again 
>> and let e2fsprogs-1.41.5 do the whole job.
> 
> I doubt it; the problem was that the repair performed by e2fsck 1.41.3
> was simply incomplete.  You can try to reproduce the corrupted
> filesystem again, but it should give the same result.
> 
> Note that because there was garbage written into the inode table,
> there was going to be data loss; there's not much that can be done
> about that.  This just simply does a better job cleaning up after the
> mess, that's all.
> 
> 						- Ted
> 

But it's still got errors:

it fixed up inodes 12042, 12207, 12249 in lost+found plus "12090
(/t/#12090)" (?)

but post-mount:

sid:~# mount -t ext4 /dev/md0 /mnt/md0
sid:~# ls -la /mnt/md0/lost*
ls: cannot access /mnt/md0/lost+found/#12042: Input/output error
ls: cannot access /mnt/md0/lost+found/#12207: Input/output error
ls: cannot access /mnt/md0/lost+found/#12249: Input/output error
total 20
c????????? ? ?    ?        ?            ? #12042
s????????? ? ?    ?        ?            ? #12207
s????????? ? ?    ?        ?            ? #12249
drwx------ 2 root root 16384 Apr 23 21:16 .
drwxr-xr-x 5 root root  4096 Apr 23 21:15 ..

Christian, is there anything in dmesg along with it this time?

-Eric


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 12:36                 ` Eric Sandeen
@ 2009-04-24 12:42                   ` Eric Sandeen
  2009-04-24 20:21                   ` Christian Kujau
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Sandeen @ 2009-04-24 12:42 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Christian Kujau, linux-ext4

Eric Sandeen wrote:
> Theodore Tso wrote:
>> On Fri, Apr 24, 2009 at 02:40:58AM -0700, Christian Kujau wrote:
>>> Maybe the repair with 1.41.{3,4} before did something wrong and I should 
>>> use 1.41.5 from the beginning? This issue is somewhat reproducible (given 
>>> the fs holds enough files), so I could start over, corrupt the fs again 
>>> and let e2fsprogs-1.41.5 do the whole job.
>> I doubt it; the problem was that the repair performed by e2fsck 1.41.3
>> was simply incomplete.  You can try to reproduce the corrupted
>> filesystem again, but it should give the same result.
>>
>> Note that because there was garbage written into the inode table,
>> there was going to be data loss; there's not much that can be done
>> about that.  This just simply does a better job cleaning up after the
>> mess, that's all.
>>
>> 						- Ted
>>
> 
> But it's still got errors:
> 
> it fixed up inodes 12042, 12207, 12249 in lost+found plus "12090
> (/t/#12090)" (?)
> 
> but post-mount:
> 
> sid:~# mount -t ext4 /dev/md0 /mnt/md0
> sid:~# ls -la /mnt/md0/lost*
> ls: cannot access /mnt/md0/lost+found/#12042: Input/output error
> ls: cannot access /mnt/md0/lost+found/#12207: Input/output error
> ls: cannot access /mnt/md0/lost+found/#12249: Input/output error
> total 20
> c????????? ? ?    ?        ?            ? #12042
> s????????? ? ?    ?        ?            ? #12207
> s????????? ? ?    ?        ?            ? #12249
> drwx------ 2 root root 16384 Apr 23 21:16 .
> drwxr-xr-x 5 root root  4096 Apr 23 21:15 ..
> 
> Christian, is there anything in dmesg along with it this time?

For what it's worth, it seems to repair it ok for me, or at least I'm
not getting your errors:

[root@inode test]# mount -o loop md0_e2i mnt/
[root@inode test]# ls -la mnt/lost*
total 20
drwx------. 2 root       root       16384 2009-04-23 14:16 .
drwxr-xr-x. 5 root       root        4096 2009-04-23 14:15 ..
c--S--s-w-. 1 4232184840 4232201271  8, 0 1970-01-13 00:41 #12042
sr-xrwSr--. 1 4229628009 4228087921     0 1976-03-19 08:28 #12207
sr-xrwSr--. 1 4229628009 4228087921     0 1976-03-19 08:28 #12249




-Eric


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 11:58             ` Theodore Tso
@ 2009-04-24 20:09               ` Andreas Dilger
  0 siblings, 0 replies; 22+ messages in thread
From: Andreas Dilger @ 2009-04-24 20:09 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Christian Kujau, Eric Sandeen, linux-ext4

On Apr 24, 2009  07:58 -0400, Theodore Ts'o wrote:
> It didn't; what had happened was that a garbage block had got written
> into the inode table.  This caused the kernel to complain about
> eh_magic being wrong in the inode table.  E2fsck 1.41.3 fixed those
> problems, but it ignored i_file_acl_high because the 64 bit feature
> flag was not set.  The kernel always pays attention to
> i_file_acl_high, regardless of whether the 64-bit feature flag is set
> or not.  Hence, the kernel was complaining and refused to touch those
> inodes.   
> 
> In any case, with e2fsck 1.41.5, I added code to fix i_file_acl_high
> getting set, and I was then able to mount the raw e2image file and not
> have any problems accessing the files in question, so I'm pretty sure
> that was the problem.

This sounds like another case for the "inode badness" patch that we've
developed, as it allows e2fsck to detect that the inode is just full of
garbage and clear the whole thing, instead of fixing it piecemeal and
leaving the sanitized garbage behind.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 12:36                 ` Eric Sandeen
  2009-04-24 12:42                   ` Eric Sandeen
@ 2009-04-24 20:21                   ` Christian Kujau
  2009-04-24 20:34                     ` Eric Sandeen
  1 sibling, 1 reply; 22+ messages in thread
From: Christian Kujau @ 2009-04-24 20:21 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Theodore Tso, linux-ext4

On Fri, 24 Apr 2009, Eric Sandeen wrote:
> Christian, is there anything in dmesg along with it this time?

Yes, as soon as the input/output errors occur, I get:

[34989.160273] EXT4-fs error (device md0): ext4_ext_check_inode: bad header/extent in inode #12042: invalid magic - magic 800, entries 8192, max 5696(0), depth 6144(6144)
[34989.162491] EXT4-fs error (device md0): ext4_ext_check_inode: bad header/extent in inode #12207: invalid magic - magic 42fc, entries 17104, max 62268(0), depth 1283(1283)
[34989.166784] EXT4-fs error (device md0): ext4_ext_check_inode: bad header/extent in inode #12249: invalid magic - magic 42fc, entries 17104, max 62268(0), depth 1283(1283)

I'll try (not) to reproduce this and will use e2fsprogs from the 
beginning.

Christian.
-- 
Bruce Schneier cuts the hair of every man who does not cut his own -- and is
not confused by this fact.


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 20:21                   ` Christian Kujau
@ 2009-04-24 20:34                     ` Eric Sandeen
  2009-04-24 20:59                       ` Theodore Tso
  2009-04-24 21:02                       ` ext4_ext_check_inode: bad header/extent in inode Christian Kujau
  0 siblings, 2 replies; 22+ messages in thread
From: Eric Sandeen @ 2009-04-24 20:34 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Theodore Tso, linux-ext4

Christian Kujau wrote:
> On Fri, 24 Apr 2009, Eric Sandeen wrote:
>> Christian, is there anything in dmesg along with it this time?
> 
> Yes, as soon as the input/output errors occur, I get:
> 
> [34989.160273] EXT4-fs error (device md0): ext4_ext_check_inode: bad header/extent in inode #12042: invalid magic - magic 800, entries 8192, max 5696(0), depth 6144(6144)
> [34989.162491] EXT4-fs error (device md0): ext4_ext_check_inode: bad header/extent in inode #12207: invalid magic - magic 42fc, entries 17104, max 62268(0), depth 1283(1283)
> [34989.166784] EXT4-fs error (device md0): ext4_ext_check_inode: bad header/extent in inode #12249: invalid magic - magic 42fc, entries 17104, max 62268(0), depth 1283(1283)

Oh, duh, of course.  Now that I'm thinking about it right ...

So these are funny inodes:

# file mnt/lost+found/*
mnt/lost+found/#12042: setuid setgid character special
mnt/lost+found/#12207: setgid socket
mnt/lost+found/#12249: setgid socket

by virtue of the corruption.

They don't actually have any blocks,

debugfs:  stat <12042>
Inode: 12042   Type: character special    Mode:  0012   Flags: 0xe0406
Generation: 1123828476    Version: 0x21204098
User: -62782456   Group: -62766025   Size: 0
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x00063400 -- Mon Jan  5 10:55:28 1970
atime: 0x01380030 -- Tue Aug 25 10:48:00 1970
mtime: 0x00103001 -- Tue Jan 13 00:41:05 1970
Size of extra inode fields: 4
Device major/minor number: 08:00 (hex 08:00)

so we shouldn't be checking the extent header, I think.

        if (ei->i_flags & EXT4_EXTENTS_FL) {
                /* Validate extent which is part of inode */
                ret = ext4_ext_check_inode(inode);
        } else if ...

Or maybe fsck should be clearing the extents flag on inodes like this?

-Eric


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 12:00               ` Theodore Tso
  2009-04-24 12:36                 ` Eric Sandeen
@ 2009-04-24 20:41                 ` Christian Kujau
  1 sibling, 0 replies; 22+ messages in thread
From: Christian Kujau @ 2009-04-24 20:41 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Eric Sandeen, linux-ext4

On Fri, 24 Apr 2009, Theodore Tso wrote:
> Note that because there was garbage written into the inode table,
> there was going to be data loss; there's not much that can be done
> about that.

Of course, I realize that there will be data loss, but I assumed that fsck 
would repair the filesystem to be intact again. However, even with 
e2fsprogs-1.41.5 the I/O errors persist:

  http://nerdbynature.de/bits/2.6.30-rc3/

Thanks,
Christian.
-- 
Bruce Schneier has built a non-deterministic Turing machine, so he doesn't care
whether P=NP.


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 20:34                     ` Eric Sandeen
@ 2009-04-24 20:59                       ` Theodore Tso
  2009-04-24 22:54                         ` [PATCH] ext4: Do not try to validate extents on special files Theodore Ts'o
  2009-04-24 21:02                       ` ext4_ext_check_inode: bad header/extent in inode Christian Kujau
  1 sibling, 1 reply; 22+ messages in thread
From: Theodore Tso @ 2009-04-24 20:59 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Christian Kujau, linux-ext4

On Fri, Apr 24, 2009 at 03:34:12PM -0500, Eric Sandeen wrote:
> So these are funny inodes:
> 
> # file mnt/lost+found/*
> mnt/lost+found/#12042: setuid setgid character special
> mnt/lost+found/#12207: setgid socket
> mnt/lost+found/#12249: setgid socket
> 
> by virtue of the corruption.
.
> so we shouldn't be checking the extent header, I think.
> 
>         if (ei->i_flags & EXT4_EXTENTS_FL) {
>                 /* Validate extent which is part of inode */
>                 ret = ext4_ext_check_inode(inode);
>         } else if ...
> 
> Or maybe fsck should be clearing the extents flag on inodes like this?
> 

Good catch!  Yeah, probably both.  The kernel should only be validating
the extent header if the file is a regular file, a directory, or a symlink.

And e2fsck should be clearing the extents flag on inodes like this.

I'll create the patch....

							- Ted
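
The rule proposed above can be sketched in Python as follows. This is an
illustration of the idea only, not the actual kernel patch. It assumes
EXT4_EXTENTS_FL is 0x80000, and the helper names are hypothetical. The
example mode and flags are the ones debugfs and ls showed for inode 12042
(character special, setuid and setgid, Flags 0xe0406).

```python
import stat

EXT4_EXTENTS_FL = 0x00080000  # inode flag: inode uses extents


def should_check_extent_header(i_mode, i_flags):
    """Validate the extent header only for file types that map blocks."""
    if not i_flags & EXT4_EXTENTS_FL:
        return False
    return stat.S_ISREG(i_mode) or stat.S_ISDIR(i_mode) or stat.S_ISLNK(i_mode)


def stray_extents_flag(i_mode, i_flags):
    """True if the extents flag is set on a special file (fsck could clear it)."""
    return bool(i_flags & EXT4_EXTENTS_FL) and not should_check_extent_header(
        i_mode, i_flags
    )


corrupt_mode = stat.S_IFCHR | 0o6012  # "setuid setgid character special"
corrupt_flags = 0xE0406               # Flags value from "debugfs: stat <12042>"
print(should_check_extent_header(corrupt_mode, corrupt_flags))
print(stray_extents_flag(corrupt_mode, corrupt_flags))
```

The two helpers mirror the two halves of the suggestion: the kernel skips
the check for device nodes, sockets and FIFOs, and e2fsck treats a set
extents flag on such an inode as something to clear.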


* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 20:34                     ` Eric Sandeen
  2009-04-24 20:59                       ` Theodore Tso
@ 2009-04-24 21:02                       ` Christian Kujau
  2009-04-24 21:23                         ` Eric Sandeen
  1 sibling, 1 reply; 22+ messages in thread
From: Christian Kujau @ 2009-04-24 21:02 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Theodore Tso, linux-ext4

On Fri, 24 Apr 2009, Eric Sandeen wrote:
> They don't actually have any blocks,

So, they could be just deleted anyway? While I cannot do so with "rm", 
debugfs was able to (thanks for the hint!):

http://nerdbynature.de/bits/2.6.30-rc3/screenlog2.txt

Now I have only 2 files left triggering the "Input/output error" - I could 
kill these off with debugfs as well, then the filesystem should be clean 
again. Still, I wish fsck had done this for me :-)

Christian.
-- 
Bruce Schneier has built a non-deterministic Turing machine, so he doesn't care
whether P=NP.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ext4_ext_check_inode: bad header/extent in inode
  2009-04-24 21:02                       ` ext4_ext_check_inode: bad header/extent in inode Christian Kujau
@ 2009-04-24 21:23                         ` Eric Sandeen
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Sandeen @ 2009-04-24 21:23 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Theodore Tso, linux-ext4

Christian Kujau wrote:
> On Fri, 24 Apr 2009, Eric Sandeen wrote:
>> They don't actually have any blocks,
> 
> So, they could be just deleted anyway? While I cannot do so with "rm", 
> debugfs was able to (thanks for the hint!):
> 
> http://nerdbynature.de/bits/2.6.30-rc3/screenlog2.txt
> 
> Now I have only 2 files left triggering the "Input/output error" - I could 
> kill these off with debugfs as well, then the filesystem should be clean 
> again. Still, I wish fsck had done this for me :-)

It will.... soon... :)

-Eric

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH] ext4: Do not try to validate extents on special files
  2009-04-24 20:59                       ` Theodore Tso
@ 2009-04-24 22:54                         ` Theodore Ts'o
  0 siblings, 0 replies; 22+ messages in thread
From: Theodore Ts'o @ 2009-04-24 22:54 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o

The EXTENTS_FL flag should never be set on special files, but if it
is, don't bother trying to validate that the extents tree is valid,
since only files, directories, and non-fast symlinks will ever have an
extent data structure.  We perhaps should flag the filesystem as being
corrupted if we see a special file (named pipes, device nodes, Unix
domain sockets, etc.) with the EXTENTS_FL flag, but e2fsck doesn't
currently check this case, so we'll just ignore this for now, since
it's harmless.

Without this fix, a special device with the extents flag is flagged as
an error by the kernel, so it is impossible to access or delete the
inode, but e2fsck doesn't see it as a problem, leading to
confused/frustrated users.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/ext4/inode.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1146003..e91f978 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4407,6 +4407,7 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 			(__u64)(le32_to_cpu(raw_inode->i_version_hi)) << 32;
 	}
 
+	ret = 0;
 	if (ei->i_file_acl &&
 	    ((ei->i_file_acl < 
 	      (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
@@ -4418,8 +4419,11 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 		ret = -EIO;
 		goto bad_inode;
 	} else if (ei->i_flags & EXT4_EXTENTS_FL) {
-		/* Validate extent which is part of inode */
-		ret = ext4_ext_check_inode(inode);
+		if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
+		    (S_ISLNK(inode->i_mode) &&
+		     !ext4_inode_is_fast_symlink(inode)))
+			/* Validate extent which is part of inode */
+			ret = ext4_ext_check_inode(inode);
  	} else if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
 		   (S_ISLNK(inode->i_mode) &&
 		    !ext4_inode_is_fast_symlink(inode))) {
-- 
1.5.6.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2009-04-24 22:54 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-23 12:15 ext4_ext_check_inode: bad header/extent in inode Christian Kujau
2009-04-23 12:50 ` Eric Sandeen
2009-04-23 19:04   ` Christian Kujau
2009-04-23 20:40     ` Theodore Tso
2009-04-23 22:15       ` Christian Kujau
2009-04-24  3:20         ` Theodore Tso
2009-04-24  7:09           ` Andreas Dilger
2009-04-24 11:58             ` Theodore Tso
2009-04-24 20:09               ` Andreas Dilger
2009-04-24  8:57           ` Christian Kujau
2009-04-24  9:40             ` Christian Kujau
2009-04-24 12:00               ` Theodore Tso
2009-04-24 12:36                 ` Eric Sandeen
2009-04-24 12:42                   ` Eric Sandeen
2009-04-24 20:21                   ` Christian Kujau
2009-04-24 20:34                     ` Eric Sandeen
2009-04-24 20:59                       ` Theodore Tso
2009-04-24 22:54                         ` [PATCH] ext4: Do not try to validate extents on special files Theodore Ts'o
2009-04-24 21:02                       ` ext4_ext_check_inode: bad header/extent in inode Christian Kujau
2009-04-24 21:23                         ` Eric Sandeen
2009-04-24 20:41                 ` Christian Kujau
2009-04-23 20:51     ` Andreas Dilger
