* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
2009-04-28 9:28 [Bug 13201] New: kernel BUG at fs/ext4/extents.c:2737 bugzilla-daemon
@ 2009-04-29 1:06 ` bugzilla-daemon
2009-04-29 3:15 ` bugzilla-daemon
` (8 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2009-04-29 1:06 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
--- Comment #2 from Franco Broi <franco@fugro-fsi.com.au> 2009-04-29 01:06:09 ---
Of the 12 tests, 2 produced errors.
EXT4-fs error (device dm-5): ext4_mb_generate_buddy: EXT4-fs: group 9: 32768 blocks in bitmap, 1023 in gd
The filesystem seems OK, I can ls the test files.
EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 0: 32768 blocks in bitmap, 970 in gd
EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 0: 32768 blocks in bitmap, 32766 in gd
EXT4-fs error (device dm-3): ext4_init_block_bitmap: Checksum bad for group 1
EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 1: 0 blocks in bitmap, 1023 in gd
EXT4-fs error (device dm-3): ext4_dx_find_entry: bad entry in directory #15: directory entry across blocks - offset=28672, inode=0, rec_len=65536, name_len=0
EXT4-fs error (device dm-3): ext4_add_entry: bad entry in directory #15: directory entry across blocks - offset=0, inode=0, rec_len=65536, name_len=0
EXT4-fs error (device dm-3): htree_dirblock_to_tree: bad entry in directory #2: directory entry across blocks - offset=0, inode=0, rec_len=65536, name_len=0
EXT4-fs error (device dm-3): htree_dirblock_to_tree: bad entry in directory #2: directory entry across blocks - offset=0, inode=0, rec_len=65536, name_len=0
EXT4-fs error (device dm-3): htree_dirblock_to_tree: bad entry in directory #2: directory entry across blocks - offset=0, inode=0, rec_len=65536, name_len=0
EXT4-fs error (device dm-3): htree_dirblock_to_tree: bad entry in directory #2: directory entry across blocks - offset=0, inode=0, rec_len=65536, name_len=0
EXT4-fs error (device dm-3): htree_dirblock_to_tree: bad entry in directory #2: directory entry across blocks - offset=0, inode=0, rec_len=65536, name_len=0
Although df looks OK:
Filesystem                        1K-blocks         Used  Available Use% Mounted on
/dev/mapper/vgdata--143-data143
                                13456415384  13232157624  224257760  99% /data143
# ls /data143
Produces no output.
At this point I will need to switch back to ext3 so that I can get this disk
into production but I do have a small window to run some more tests if anyone
has any ideas.
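
For anyone triaging a similar log, the ext4_mb_generate_buddy lines above can be pulled out mechanically. The following is only a minimal sketch: the regular expression is derived from the message text quoted in this comment, and the names buddy_mismatches and SAMPLE are hypothetical helpers, not part of any ext4 tool.

```python
import re

# Pattern derived from the ext4_mb_generate_buddy messages quoted above.
BUDDY_RE = re.compile(
    r"EXT4-fs error \(device (?P<dev>[\w-]+)\): ext4_mb_generate_buddy: "
    r"EXT4-fs: group (?P<group>\d+): (?P<bitmap>\d+) blocks in bitmap, "
    r"(?P<gd>\d+) in gd"
)


def buddy_mismatches(log_text):
    """Return (device, group, in_bitmap, in_gd) for each mismatch message."""
    # Mail clients wrap long kernel messages, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        (m["dev"], int(m["group"]), int(m["bitmap"]), int(m["gd"]))
        for m in BUDDY_RE.finditer(flat)
    ]


# Two of the messages from this comment, wrapped as they arrived by mail.
SAMPLE = """\
EXT4-fs error (device dm-5): ext4_mb_generate_buddy: EXT4-fs: group 9: 32768
blocks in bitmap, 1023 in gd
EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 1: 0 blocks
in bitmap, 1023 in gd
"""

for dev, group, in_bitmap, in_gd in buddy_mismatches(SAMPLE):
    print(f"{dev} group {group}: bitmap={in_bitmap} gd={in_gd}")
```

Collapsing whitespace before matching is a deliberate choice here, since the messages are split across lines at arbitrary points by the mailer.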
--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-04-29 3:15 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
Eric Sandeen <sandeen@redhat.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandeen@redhat.com
--- Comment #3 from Eric Sandeen <sandeen@redhat.com> 2009-04-29 03:15:31 ---
Franco, sorry we haven't gotten back with suggestions on this. It looks like
you have hit a couple of different end results. We've had a few reports of
corruption on larger filesystems, which makes us wonder if there might be a
problem above 8T somewhere...
The current upstream git tree (or the 2.6.30-rc3-git5 prepatch) has more extent
validity checking in it; if you do have the time for another test, running on
that codebase may yield more info, depending on where the problem lies.
-Eric
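
As a rough check of where this filesystem sits relative to the 8T mark, the df figure from comment #2 can be converted into filesystem blocks. This is only a sketch under stated assumptions: the block size is inferred from the 32768 blocks per group in the error messages (assuming one bitmap block covers a whole group), "8T" is read as 8 TiB, and the 2**31 comparison is offered as one possible reason that size is a natural threshold to suspect, not something stated in the thread.

```python
TIB = 2**40

# Figures taken from the df output and error messages in comment #2.
total_1k_blocks = 13456415384
blocks_per_group = 32768

# Assumption: one bitmap block describes a whole group, so the block size
# in bytes is the number of bits in the bitmap divided by 8.
block_size = blocks_per_group // 8

# df reports usable space, so the real block count is slightly higher.
fs_bytes = total_1k_blocks * 1024
fs_blocks = fs_bytes // block_size

# Block count of a filesystem that is exactly 8 TiB.
blocks_at_8tib = 8 * TIB // block_size

print(f"block size: {block_size} bytes")
print(f"filesystem size: {fs_bytes / TIB:.2f} TiB")
print(f"filesystem blocks: {fs_blocks}")
print(f"blocks at 8 TiB: {blocks_at_8tib}")
print(f"above 2**31 blocks: {fs_blocks > 2**31}")
```

Under these assumptions, 8 TiB corresponds to exactly 2**31 blocks, so any block number stored in a signed 32-bit variable would overflow past that point. Whether that is relevant to this bug is speculation.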
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-04-30 3:00 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
--- Comment #4 from Franco Broi <franco@fugro-fsi.com.au> 2009-04-30 03:00:48 ---
I ran a test overnight using 2.6.30-rc3-git5 and it didn't fail. Not sure if
this is a good or bad thing?
I've deleted the files and started the test again.
By the way, deleting files with ext4 is lightning fast; it only takes about 5
minutes to delete 13TB! Again, not sure if this is a good or bad thing, as it
doesn't give you much time to hit Ctrl-C...
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-04-30 9:39 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
--- Comment #5 from Franco Broi <franco@fugro-fsi.com.au> 2009-04-30 09:39:21 ---
I've now got filesystem corruption with 2.6.30-rc3-git5; it looks pretty much
the same as before.
Apr 30 17:30:56 echo20 kernel: EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 0: 32768 blocks in bitmap, 23495 in gd
Apr 30 17:30:56 echo20 kernel: EXT4-fs error (device dm-3): ext4_mb_mark_diskspace_used: Allocating block 1024 in system zone of 0 group
When I do an ls in the test directory I get lots of Input/output errors:
EXT4-fs error (device dm-3): ext4_lookup: deleted inode referenced: 127
EXT4-fs error (device dm-3): ext4_lookup: deleted inode referenced: 358
EXT4-fs error (device dm-3): ext4_lookup: deleted inode referenced: 196
Anything you want me to try?
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-05-19 18:04 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
Theodore Tso <tytso@mit.edu> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu
--- Comment #6 from Theodore Tso <tytso@mit.edu> 2009-05-19 18:04:05 ---
Could you try replicating this problem in 2.6.30-rc6? We fixed a race
condition in i_cached_extents that could very well have caused your problem.
I'm hoping it will close this and a few other mystery bug reports we've had
over the past couple of months. (The bug is an old one, but we had struggled
to find a reliable reproduction case.)
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-05-23 10:10 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
--- Comment #7 from Franco Broi <franco@fugro-fsi.com.au> 2009-05-23 10:10:50 ---
I won't be able to recreate the original test conditions, but I'll run a test
with a single large filesystem within a couple of weeks.
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-06-05 0:51 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
--- Comment #8 from Franco Broi <franco@fugro-fsi.com.au> 2009-06-05 00:51:52 ---
I haven't been able to recreate the problem using 2.6.30-rc8, but the test
conditions aren't identical to before. Would it make a difference that only a
single filesystem is being written to, and not 4 simultaneously as in the
original tests?
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-06-08 16:49 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
--- Comment #9 from Theodore Tso <tytso@mit.edu> 2009-06-08 16:49:28 ---
If this is the same problem as the one which we fixed with identical symptoms,
what matters is multiple processes/threads writing to the same file at the same
time. People using NFS or SAMBA on a backup server seemed to be the most common
scenario for triggering this (admittedly very hard to reproduce) bug. We
finally got lucky in that someone had a setup which allows for reliable
reproduction of the bug, so we could finally sink our teeth into it.
So if what you saw was the same as the bug we fixed in 2.6.30-rc6, then no, it
shouldn't make a difference. If it is a completely different bug, then of
course all bets are off. In general, though, whether you are writing to one
filesystem or 4 filesystems shouldn't make a difference, except that it might
change the timing necessary to hit a race condition. The bug that we found and
fixed was highly timing dependent. In fact, even after we found the problem, we
weren't able to come up with a reliable reproduction case, even though the
problem was obvious on paper and the one user who could reliably reproduce it
reported that it went away once the patch was applied. IIRC, Eric finally put a
delay into the code to widen the race window to the point where he could
replicate it.
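
The workload described here, several threads writing to the same file at the same time, can be approximated from user space. The following is a minimal sketch only: it is not the reproducer used by the ext4 developers, the function name concurrent_write and its parameters are hypothetical, and given how timing dependent the race was, there is no reason to expect it to trigger the bug.

```python
import os
import threading


def concurrent_write(path, n_threads=4, chunk=4096, chunks_per_thread=64):
    """Have several threads write interleaved chunks into one file at once."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        def worker(idx):
            payload = bytes([65 + idx]) * chunk  # 'A', 'B', ... per thread
            for i in range(chunks_per_thread):
                # Thread idx owns every n_threads-th chunk, so the writes
                # from different threads interleave within the file.
                offset = (i * n_threads + idx) * chunk
                os.pwrite(fd, payload, offset)

        threads = [
            threading.Thread(target=worker, args=(t,)) for t in range(n_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        os.fsync(fd)  # push the data to disk before closing
    finally:
        os.close(fd)
    return n_threads * chunks_per_thread * chunk
```

Using os.pwrite with explicit offsets keeps the threads from sharing a file position, so the only contention is inside the filesystem itself.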
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-06-09 5:10 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
--- Comment #10 from Franco Broi <franco@fugro-fsi.com.au> 2009-06-09 05:10:31 ---
(In reply to comment #9)
> If this is the same problem as the one which we fixed with identical symptoms,
> what matters is multiple processes/threads writing to the same file at the same
> time.
Then it doesn't sound like it's the same bug. My tests are very simple: they
just keep writing 8GB files until the disk fills up. There is no concurrent
access to files or even the filesystem, and the machines are completely
standalone.
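
A test of the kind described above could look roughly like the following. This is a sketch, not the reporter's actual test program (which is not shown in the thread); the function name fill_with_files and its parameters are hypothetical.

```python
import errno
import os


def fill_with_files(directory, file_size, chunk=1 << 20, max_files=None):
    """Write sequential files of file_size bytes until ENOSPC or max_files."""
    buf = b"\0" * chunk
    written = []
    n = 0
    while max_files is None or n < max_files:
        path = os.path.join(directory, f"fill_{n:05d}.dat")
        try:
            with open(path, "wb") as f:
                remaining = file_size
                while remaining > 0:
                    step = min(chunk, remaining)
                    f.write(buf[:step])
                    remaining -= step
                f.flush()
                os.fsync(f.fileno())  # make sure the data reaches the disk
        except OSError as e:
            if e.errno == errno.ENOSPC:
                break  # disk full: the intended stopping point
            raise
        written.append(path)
        n += 1
    return written
```

To mirror the description, one would call fill_with_files("/data143", 8 * 2**30) on a scratch filesystem and then check the kernel log; the max_files argument exists only so the function can be tried on a small scale first.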
* [Bug 13201] kernel BUG at fs/ext4/extents.c:2737
From: bugzilla-daemon @ 2009-08-26 18:11 UTC
To: linux-ext4
http://bugzilla.kernel.org/show_bug.cgi?id=13201
Valerie Aurora <vaurora@redhat.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vaurora@redhat.com
--- Comment #11 from Valerie Aurora <vaurora@redhat.com> 2009-08-26 18:11:07 ---
Given that the bug appears to be fixed, and we can't reproduce the original
conditions or get more data on this bug, it seems like we should close this
bug.