* BUG at fs/ext4/mballoc.c:3295
@ 2009-03-12 16:54 Thiemo Nagel
  2009-03-12 17:07 ` Eric Sandeen
  2009-03-12 18:46 ` [PATCH] fix bogus BUG_ONs in mballoc code Eric Sandeen
  0 siblings, 2 replies; 9+ messages in thread
From: Thiemo Nagel @ 2009-03-12 16:54 UTC
  To: Ext4 Developers List

Hello,

the following can be observed reproducibly with 2.6.29-rc7 when filling 
up a very small (2MB) file system:

dd if=/dev/zero of=/file/inside/small/filesystem

hangs, dmesg output is:

[  602.831279] EXT4-fs: barriers enabled
[  602.862751] EXT4-fs warning: mounting fs with errors, running e2fsck 
is recommended
[  602.864165] kjournald2 starting: pid 5414, dev loop0:8, commit 
interval 5 seconds
[  602.864318] EXT4 FS on loop0, internal journal on loop0:8
[  602.864325] EXT4-fs: delayed allocation enabled
[  602.864348] EXT4-fs: file extents enabled
[  602.864669] EXT4-fs: mballoc enabled
[  602.864674] EXT4-fs: recovery complete.
[  602.869466] EXT4-fs: mounted filesystem loop0 with ordered data mode
[  623.000911] JBD: barrier-based sync failed on loop0:8 - disabling 
barriers
[  633.432299] ------------[ cut here ]------------
[  633.432329] kernel BUG at fs/ext4/mballoc.c:3295!
[  633.432353] invalid opcode: 0000 [#1] PREEMPT SMP
[  633.432382] last sysfs file: /sys/power/state
[  633.432388] Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand 
cpufreq_userspace cpufreq_powersave acpi_cpufreq speedstep_lib 
freq_table toshiba loop snd_intel8x0 snd_ac97_codec ac97_bus ipw2200 
snd_pcm libipw snd_timer psmouse snd soundcore lib80211 snd_page_alloc 
toshiba_acpi rfkill backlight input_polldev ac battery button evdev ext3 
jbd mbcache usbhid sd_mod ata_generic ata_piix libata ehci_hcd uhci_hcd 
e100 mii scsi_mod usbcore thermal processor fan thermal_sys
[  633.432669]
[  633.432676] Pid: 5475, comm: dd Not tainted (2.6.29-rc7-dbg #1) TECRA A3X
[  633.432701] EIP: 0060:[<e04486bc>] EFLAGS: 00210202 CPU: 0
[  633.432772] EIP is at ext4_mb_normalize_request+0x712/0x7bd [ext4]
[  633.432796] EAX: 00000001 EBX: 00000200 ECX: 00000001 EDX: cf950738
[  633.432803] ESI: 00000000 EDI: cf94a3f0 EBP: cd07fa98 ESP: cd07fa38
[  633.432827]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  633.432852] Process dd (pid: 5475, ti=cd07e000 task=dd092d00 
task.ti=cd07e000)
[  633.432859] Stack:
[  633.432863]  00000002 00000001 00000000 e044853a 0005e800 00000000 
cd07fb6c cf950738
[  633.432918]  00000200 00000000 cf94a3f0 0000000a 00000200 00000000 
00000000 00000000
[  633.432958]  cf94a0ac cf94a184 cf94a184 000001da 00000002 00000000 
dcdf0c00 cf94a184
[  633.433016] Call Trace:
[  633.433021]  [<e044853a>] ? ext4_mb_normalize_request+0x590/0x7bd [ext4]
[  633.433047]  [<e044d5e8>] ? ext4_mb_new_blocks+0x1c3/0x480 [ext4]
[  633.433047]  [<e04455fa>] ? ext4_ext_get_blocks+0xc06/0xe7c [ext4]
[  633.433047]  [<c0315ccd>] ? _spin_unlock+0x27/0x3c
[  633.433047]  [<e007da49>] ? find_revoke_record+0x94/0xa3 [jbd2]
[  633.433047]  [<e007e10d>] ? jbd2_journal_cancel_revoke+0x11a/0x15c [jbd2]
[  633.433047]  [<c014f580>] ? __lock_acquire+0x475/0x5e0
[  633.433047]  [<e04343a2>] ? ext4_get_blocks_wrap+0xf0/0x214 [ext4]
[  633.433047]  [<e0434bb4>] ? ext4_get_block+0xba/0xf1 [ext4]
[  633.433047]  [<c01c1954>] ? __block_prepare_write+0x15b/0x352
[  633.433047]  [<c0179320>] ? find_lock_page+0x33/0x6b
[  633.433047]  [<c01c1ce7>] ? block_write_begin+0x7b/0xd7
[  633.433047]  [<e0434afa>] ? ext4_get_block+0x0/0xf1 [ext4]
[  633.433047]  [<e042f416>] ? ext4_write_begin+0xde/0x1dd [ext4]
[  633.433047]  [<e0434afa>] ? ext4_get_block+0x0/0xf1 [ext4]
[  633.433047]  [<e042faab>] ? ext4_da_write_begin+0x10f/0x227 [ext4]
[  633.433047]  [<c0178ac1>] ? generic_file_buffered_write+0xd9/0x27c
[  633.433047]  [<c017a503>] ? __generic_file_aio_write_nolock+0x4b3/0x507
[  633.433047]  [<c017a5b2>] ? generic_file_aio_write+0x5b/0xb9
[  633.433047]  [<c012ec77>] ? __do_softirq+0x68/0x163
[  633.433047]  [<c014d207>] ? validate_chain+0x12e/0x1044
[  633.433047]  [<e042cd05>] ? ext4_file_write+0xd0/0x152 [ext4]
[  633.433047]  [<c014fea2>] ? mark_held_locks+0x4f/0x66
[  633.433047]  [<c01a1eb5>] ? do_sync_write+0xc4/0x109
[  633.433047]  [<c013dadd>] ? autoremove_wake_function+0x0/0x41
[  633.433047]  [<c01fa634>] ? security_file_permission+0xf/0x11
[  633.433047]  [<c01a20b1>] ? rw_verify_area+0xb0/0xd3
[  633.433047]  [<c0315ccd>] ? _spin_unlock+0x27/0x3c
[  633.433047]  [<c01a216b>] ? vfs_write+0x97/0x14e
[  633.433047]  [<c0103237>] ? sysenter_exit+0xf/0x1a
[  633.433047]  [<c01a1df1>] ? do_sync_write+0x0/0x109
[  633.433047]  [<c01a2963>] ? sys_write+0x3d/0x61
[  633.433047]  [<c01031fb>] ? sysenter_do_call+0x12/0x3f
[  633.433047] Code: 26 8b 55 bc 8b 42 04 8b 80 7c 02 00 00 8b 40 08 b9 
01 00 00 00 83 fe 00 7f 0b 7c 04 39 c3 73 05 b9 00 00 00 00 89 c8 85 c0 
74 04 <0f> 0b eb fe 8b 7d dc 8b 4d bc 89 79 18 89 59 24 8b 45 b8 8b 50
[  633.433047] EIP: [<e04486bc>] ext4_mb_normalize_request+0x712/0x7bd 
[ext4] SS:ESP 0068:cd07fa38
[  633.443710] ---[ end trace 7b79a7a8035c66f0 ]---


IIRC the small filesystem was created like this:

dd if=/dev/zero of=image.ext4 bs=1M count=2
/sbin/mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
  -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4

Kind regards,

Thiemo

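For reference, the geometry those mkfs options ask for can be worked out
by hand. The sketch below is only an illustration, not output from
mke2fs: fs_blocks and fs_groups are hypothetical helper names, and the
calculation assumes mke2fs honours -b and -g exactly as given and ignores
the reserved first block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not e2fsprogs or kernel APIs. */

/* Number of filesystem blocks in an image of image_bytes bytes. */
static uint64_t fs_blocks(uint64_t image_bytes, uint64_t block_size)
{
	assert(block_size > 0);
	return image_bytes / block_size;
}

/* Number of block groups, rounding up for a partial last group. */
static uint64_t fs_groups(uint64_t blocks, uint64_t blocks_per_group)
{
	assert(blocks_per_group > 0);
	return (blocks + blocks_per_group - 1) / blocks_per_group;
}
```

With a 2 MiB image, fs_blocks(2 * 1024 * 1024, 1024) is 2048 and
fs_groups(2048, 512) is 4, so -G 4 would place all four groups in a
single flex group.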

* Re: BUG at fs/ext4/mballoc.c:3295
  2009-03-12 16:54 BUG at fs/ext4/mballoc.c:3295 Thiemo Nagel
@ 2009-03-12 17:07 ` Eric Sandeen
  2009-03-12 17:13   ` Thiemo Nagel
  2009-03-12 18:46 ` [PATCH] fix bogus BUG_ONs in mballoc code Eric Sandeen
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2009-03-12 17:07 UTC
  To: Thiemo Nagel; +Cc: Ext4 Developers List

Thiemo Nagel wrote:
> Hello,
> 
> the following can be observed reproducibly with 2.6.29-rc7 when filling 
> up a very small (2MB) file system:
> 
> dd if=/dev/zero of=/file/inside/small/filesystem
> 
> hangs, dmesg output is:
> 
> [  602.831279] EXT4-fs: barriers enabled
> [  602.862751] EXT4-fs warning: mounting fs with errors, running e2fsck 
> is recommended
> [  602.864165] kjournald2 starting: pid 5414, dev loop0:8, commit 
> interval 5 seconds
> [  602.864318] EXT4 FS on loop0, internal journal on loop0:8
> [  602.864325] EXT4-fs: delayed allocation enabled
> [  602.864348] EXT4-fs: file extents enabled
> [  602.864669] EXT4-fs: mballoc enabled
> [  602.864674] EXT4-fs: recovery complete.
> [  602.869466] EXT4-fs: mounted filesystem loop0 with ordered data mode
> [  623.000911] JBD: barrier-based sync failed on loop0:8 - disabling 
> barriers
> [  633.432299] ------------[ cut here ]------------
> [  633.432329] kernel BUG at fs/ext4/mballoc.c:3295!

I don't see it:

# dd if=/dev/zero of=2mbfile bs=1M count=2
# mkfs.ext4 -F 2mbfile
# mount -o loop 2mbfile mnt/
# dd if=/dev/zero of=mnt/file
dd: writing to `mnt/file': No space left on device
1917+0 records in
1916+0 records out
980992 bytes (981 kB) copied, 0.0162723 s, 60.3 MB/s

is this more or less what you did?

-Eric


* Re: BUG at fs/ext4/mballoc.c:3295
  2009-03-12 17:07 ` Eric Sandeen
@ 2009-03-12 17:13   ` Thiemo Nagel
  2009-03-12 17:16     ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Thiemo Nagel @ 2009-03-12 17:13 UTC
  To: Eric Sandeen; +Cc: Ext4 Developers List

> I don't see it:
> 
> # dd if=/dev/zero of=2mbfile bs=1M count=2
> # mkfs.ext4 -F 2mbfile
> # mount -o loop 2mbfile mnt/
> # dd if=/dev/zero of=mnt/file
> dd: writing to `mnt/file': No space left on device
> 1917+0 records in
> 1916+0 records out
> 980992 bytes (981 kB) copied, 0.0162723 s, 60.3 MB/s
> 
> is this more or less what you did?

Yes, except that I used the following parameters for mkfs.ext4 (IIRC):

/sbin/mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
  -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4

Kind regards,

Thiemo Nagel


* Re: BUG at fs/ext4/mballoc.c:3295
  2009-03-12 17:13   ` Thiemo Nagel
@ 2009-03-12 17:16     ` Eric Sandeen
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Sandeen @ 2009-03-12 17:16 UTC
  To: Thiemo Nagel; +Cc: Ext4 Developers List

Thiemo Nagel wrote:
>> I don't see it:
>>
>> # dd if=/dev/zero of=2mbfile bs=1M count=2
>> # mkfs.ext4 -F 2mbfile
>> # mount -o loop 2mbfile mnt/
>> # dd if=/dev/zero of=mnt/file
>> dd: writing to `mnt/file': No space left on device
>> 1917+0 records in
>> 1916+0 records out
>> 980992 bytes (981 kB) copied, 0.0162723 s, 60.3 MB/s
>>
>> is this more or less what you did?
> 
> Yes, except that I used the following parameters for mkfs.ext4 (IIRC):
> 
> /sbin/mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 -O 
> large_file,dir_index,flex_bg,extent,sparse_super image.ext4
> 
> Kind regards,
> 
> Thiemo Nagel

Bingo, thanks :)

-Eric


* [PATCH] fix bogus BUG_ONs in mballoc code
  2009-03-12 16:54 BUG at fs/ext4/mballoc.c:3295 Thiemo Nagel
  2009-03-12 17:07 ` Eric Sandeen
@ 2009-03-12 18:46 ` Eric Sandeen
  2009-03-13  0:38   ` Theodore Tso
  2009-03-13  1:09   ` Theodore Tso
  1 sibling, 2 replies; 9+ messages in thread
From: Eric Sandeen @ 2009-03-12 18:46 UTC
  To: Thiemo Nagel; +Cc: Ext4 Developers List

Thiemo Nagel reported that:

# dd if=/dev/zero of=image.ext4 bs=1M count=2
# mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
  -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
# mount -o loop image.ext4 mnt/
# dd if=/dev/zero of=mnt/file

oopsed, with a BUG_ON in ext4_mb_normalize_request because
size == EXT4_BLOCKS_PER_GROUP.

It appears to me (esp. after talking to Andreas) that the BUG_ON
is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
be allowed, though larger sizes do indicate a problem.

Fix that and another (apparently rare) codepath with a similar check.

Reported-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
--

Index: linux-2.6/fs/ext4/mballoc.c
===================================================================
--- linux-2.6.orig/fs/ext4/mballoc.c
+++ linux-2.6/fs/ext4/mballoc.c
@@ -1447,7 +1447,7 @@ static void ext4_mb_measure_extent(struc
 	struct ext4_free_extent *gex = &ac->ac_g_ex;
 
 	BUG_ON(ex->fe_len <= 0);
-	BUG_ON(ex->fe_len >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
+	BUG_ON(ex->fe_len > EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
 	BUG_ON(ex->fe_start >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
 	BUG_ON(ac->ac_status != AC_STATUS_CONTINUE);
 
@@ -3292,7 +3292,7 @@ ext4_mb_normalize_request(struct ext4_al
 	}
 	BUG_ON(start + size <= ac->ac_o_ex.fe_logical &&
 			start > ac->ac_o_ex.fe_logical);
-	BUG_ON(size <= 0 || size >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
+	BUG_ON(size <= 0 || size > EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
 
 	/* now prepare goal request */
 


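The off-by-one that the patch removes can be looked at in isolation. The
following is a userspace model of the two conditions, not the kernel
code: old_check_fires and new_check_fires are hypothetical names, and
the 512-block group size used in the note below is simply the -g value
from the reproducer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the BUG_ON conditions; not the mballoc code. */

/* Pre-patch condition: also fires when size equals the group size. */
static bool old_check_fires(long size, long blocks_per_group)
{
	assert(blocks_per_group > 0);
	return size <= 0 || size >= blocks_per_group;
}

/* Post-patch condition: only sizes beyond one full group fire. */
static bool new_check_fires(long size, long blocks_per_group)
{
	assert(blocks_per_group > 0);
	return size <= 0 || size > blocks_per_group;
}
```

For a request of 512 blocks against a 512-block group,
old_check_fires(512, 512) is true while new_check_fires(512, 512) is
false; a request of 513 blocks trips both.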

* Re: [PATCH] fix bogus BUG_ONs in mballoc code
  2009-03-12 18:46 ` [PATCH] fix bogus BUG_ONs in mballoc code Eric Sandeen
@ 2009-03-13  0:38   ` Theodore Tso
  2009-03-13 11:09     ` Andreas Dilger
  2009-03-13  1:09   ` Theodore Tso
  1 sibling, 1 reply; 9+ messages in thread
From: Theodore Tso @ 2009-03-13  0:38 UTC
  To: Eric Sandeen; +Cc: Thiemo Nagel, Ext4 Developers List

On Thu, Mar 12, 2009 at 01:46:57PM -0500, Eric Sandeen wrote:
> Thiemo Nagel reported that:
> 
> # dd if=/dev/zero of=image.ext4 bs=1M count=2
> # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
>   -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
> # mount -o loop image.ext4 mnt/
> # dd if=/dev/zero of=mnt/file
> 
> oopsed, with a BUG_ON in ext4_mb_normalize_request because
> size == EXT4_BLOCKS_PER_GROUP
> 
> It appears to me (esp. after talking to Andreas) that the BUG_ON
> is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
> be allowed, though larger sizes do indicate a problem.

Clearly we should make this change to avoid the BUG_ON; but stupid
question, why shouldn't we allow sizes larger than
EXT4_BLOCKS_PER_GROUP?  

Especially with flex_bg, it is possible for an allocation size >
EXT4_BLOCKS_PER_GROUP to be satisfied, especially if the filesystem
isn't that full yet, and it might even make sense to request a larger
allocation for video files that are getting preallocated, for
example....

						- Ted


* Re: [PATCH] fix bogus BUG_ONs in mballoc code
  2009-03-12 18:46 ` [PATCH] fix bogus BUG_ONs in mballoc code Eric Sandeen
  2009-03-13  0:38   ` Theodore Tso
@ 2009-03-13  1:09   ` Theodore Tso
  2009-03-13  2:08     ` Eric Sandeen
  1 sibling, 1 reply; 9+ messages in thread
From: Theodore Tso @ 2009-03-13  1:09 UTC
  To: Eric Sandeen; +Cc: Thiemo Nagel, Ext4 Developers List

On Thu, Mar 12, 2009 at 01:46:57PM -0500, Eric Sandeen wrote:
> Thiemo Nagel reported that:
> 
> # dd if=/dev/zero of=image.ext4 bs=1M count=2
> # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
>   -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
> # mount -o loop image.ext4 mnt/
> # dd if=/dev/zero of=mnt/file
> 
> oopsed, with a BUG_ON in ext4_mb_normalize_request because
> size == EXT4_BLOCKS_PER_GROUP
> 
> It appears to me (esp. after talking to Andreas) that the BUG_ON
> is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
> be allowed, though larger sizes do indicate a problem.
> 
> Fix that an another (apparently rare) codepath with a similar check.

Hmm.... is this at all likely to happen with standard ext4
filesystem parameters?  Or was this triggered because of the
artificially set -g 512 parameter?  The question is whether we should
try pushing this to Linus at this point, or let this wait until the
merge window opens.

Opinions?

						- Ted


* Re: [PATCH] fix bogus BUG_ONs in mballoc code
  2009-03-13  1:09   ` Theodore Tso
@ 2009-03-13  2:08     ` Eric Sandeen
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Sandeen @ 2009-03-13  2:08 UTC
  To: Theodore Tso; +Cc: Thiemo Nagel, Ext4 Developers List

Theodore Tso wrote:
> On Thu, Mar 12, 2009 at 01:46:57PM -0500, Eric Sandeen wrote:
>> Thiemo Nagel reported that:
>>
>> # dd if=/dev/zero of=image.ext4 bs=1M count=2
>> # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
>>   -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
>> # mount -o loop image.ext4 mnt/
>> # dd if=/dev/zero of=mnt/file
>>
>> oopsed, with a BUG_ON in ext4_mb_normalize_request because
>> size == EXT4_BLOCKS_PER_GROUP
>>
>> It appears to me (esp. after talking to Andreas) that the BUG_ON
>> is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
>> be allowed, though larger sizes do indicate a problem.
>>
>> Fix that an another (apparently rare) codepath with a similar check.
> 
> Hmm.... is this at all likely to happen with a standard ext4
> filesystem parameters?  Or was this triggered because of the
> artifially set -g 512 parameter?  The question is whether we should
> try pushing this to Linus at this point, or let this wait until the
> merge window opens.
> 
> Opinions?
> 
> 						= Ted
> <

I wondered the same thing, and will admit to probably not digging deep
enough on this one.  I think the fix is ok as is but you are asking the
right questions.  Maybe a clusterfs mballoc expert can chime in and save
us some time? :)

-Eric


* Re: [PATCH] fix bogus BUG_ONs in mballoc code
  2009-03-13  0:38   ` Theodore Tso
@ 2009-03-13 11:09     ` Andreas Dilger
  0 siblings, 0 replies; 9+ messages in thread
From: Andreas Dilger @ 2009-03-13 11:09 UTC
  To: Theodore Tso; +Cc: Eric Sandeen, Thiemo Nagel, Ext4 Developers List

On Mar 12, 2009  20:38 -0400, Theodore Ts'o wrote:
> On Thu, Mar 12, 2009 at 01:46:57PM -0500, Eric Sandeen wrote:
> > Thiemo Nagel reported that:
> > 
> > # dd if=/dev/zero of=image.ext4 bs=1M count=2
> > # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
> >   -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
> > # mount -o loop image.ext4 mnt/
> > # dd if=/dev/zero of=mnt/file
> > 
> > oopsed, with a BUG_ON in ext4_mb_normalize_request because
> > size == EXT4_BLOCKS_PER_GROUP
> > 
> > It appears to me (esp. after talking to Andreas) that the BUG_ON
> > is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
> > be allowed, though larger sizes do indicate a problem.
> 
> Clearly we should make this change to avoid the BUG_ON; but stupid
> question, why shouldn't we allow sizes larger than
> EXT4_BLOCKS_PER_GROUP?  
> 
> Especially with flex_bg, it is possible for an allocation size >
> EXT4_BLOCKS_PER_GROUP to be satisifed, especially if the filesystem
> isn't that full yet, and it might even make sense to request a larger
> allocation for video files that are getting preallocated, for
> example....

There are two reasons that we can't have too-large mballoc allocations:
- mballoc works on a per-group basis, so the most blocks that it can
  allocate at a time is BLOCKS_PER_GROUP.
- the on-disk extent format cannot map more than 128MB at a time, which
  is equal to the group size at 4kB blocksize.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

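The two limits Andreas mentions can be checked with a little arithmetic.
The sketch below rests on two assumptions that are not spelled out in
the thread: that a group's block bitmap fits in a single block (so a
group holds at most 8 * blocksize blocks), and that an initialized
extent's length field tops out at 2^15 blocks. The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not kernel APIs. */

/* Assumes a one-block bitmap per group: 8 tracked blocks per bitmap byte. */
static uint64_t max_blocks_per_group(uint64_t block_size)
{
	assert(block_size > 0);
	return 8 * block_size;
}

/* Largest group size in bytes for a given block size. */
static uint64_t max_group_bytes(uint64_t block_size)
{
	return max_blocks_per_group(block_size) * block_size;
}

/* Assumes an initialized extent can cover at most 2^15 blocks. */
static uint64_t max_extent_bytes(uint64_t block_size)
{
	assert(block_size > 0);
	return (UINT64_C(1) << 15) * block_size;
}
```

At a 4 kB block size both max_group_bytes(4096) and
max_extent_bytes(4096) come to 134217728 bytes (128 MB), which matches
the figure quoted above. At the 1 kB block size used in the reproducer,
max_blocks_per_group(1024) is 8192, so -g 512 is well below the default
ceiling.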

