* ext4_mb_generate_buddy and double-free errors
@ 2009-03-13  0:29 Frank Mayhar
  2009-03-13  2:03 ` Eric Sandeen
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Mayhar @ 2009-03-13  0:29 UTC (permalink / raw)
  To: ext4 development

We're seeing errors like:
  EXT4-fs error (device sda3): ext4_mb_generate_buddy: EXT4-fs: group 3049: 21020 blocks in bitmap, 21529 in gd

Usually after this the system is cleaned and in the process we see many
"mb_free_blocks: double-free of inode x's block y(bit z in group d)".
(In fact, we see exactly as many of these as the difference between the
group and computed count of free blocks.)

It looks like the bitmap itself is getting messed up somehow, at least
enough to make the free count disagree with the map itself.  Has anyone
else seen something like this?  Any pointers as to where to look for
potential culprits?
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.



* Re: ext4_mb_generate_buddy and double-free errors
  2009-03-13  0:29 ext4_mb_generate_buddy and double-free errors Frank Mayhar
@ 2009-03-13  2:03 ` Eric Sandeen
  2009-03-13  3:22   ` Frank Mayhar
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Sandeen @ 2009-03-13  2:03 UTC (permalink / raw)
  To: Frank Mayhar; +Cc: ext4 development

Frank Mayhar wrote:
> We're seeing errors like:
>   EXT4-fs error (device sda3): ext4_mb_generate_buddy: EXT4-fs: group 3049: 21020 blocks in bitmap, 21529 in gd
> 
> Usually after this the system is cleaned and in the process we see many
> "mb_free_blocks: double-free of inode x's block y(bit z in group d)".
> (In fact, we see exactly as many of these as the difference between the
> group and computed count of free blocks.)
> 
> It looks like the bitmap itself is getting messed up somehow, at least
> enough to make the free count disagree with the map itself.  Has anyone
> else seen something like this?  Any pointers as to where to look for
> potential culprits?

Which kernel, for starters?

I saw some of those reports a while back; haven't since, though don't
know offhand what might have been the root cause or fix...

-Eric


* Re: ext4_mb_generate_buddy and double-free errors
  2009-03-13  2:03 ` Eric Sandeen
@ 2009-03-13  3:22   ` Frank Mayhar
  2009-03-13  3:25     ` Eric Sandeen
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Mayhar @ 2009-03-13  3:22 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: ext4 development, Curt Wohlgemuth, mrubin

On Thu, 2009-03-12 at 21:03 -0500, Eric Sandeen wrote:
> Frank Mayhar wrote:
> > We're seeing errors like:
> >   EXT4-fs error (device sda3): ext4_mb_generate_buddy: EXT4-fs: group 3049: 21020 blocks in bitmap, 21529 in gd
> > 
> > Usually after this the system is cleaned and in the process we see many
> > "mb_free_blocks: double-free of inode x's block y(bit z in group d)".
> > (In fact, we see exactly as many of these as the difference between the
> > group and computed count of free blocks.)
> > 
> > It looks like the bitmap itself is getting messed up somehow, at least
> > enough to make the free count disagree with the map itself.  Has anyone
> > else seen something like this?  Any pointers as to where to look for
> > potential culprits?
> 
> Which kernel, for starters?

It's our development kernel, 2.6.26 plus as many of the ext4/jbd2
patches as we can comfortably pull in.
-- 
Frank Mayhar <fmayhar@google.com>
Google, Inc.



* Re: ext4_mb_generate_buddy and double-free errors
  2009-03-13  3:22   ` Frank Mayhar
@ 2009-03-13  3:25     ` Eric Sandeen
  2009-03-13  3:32       ` Curt Wohlgemuth
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Sandeen @ 2009-03-13  3:25 UTC (permalink / raw)
  To: Frank Mayhar; +Cc: ext4 development, Curt Wohlgemuth, mrubin

Frank Mayhar wrote:
> On Thu, 2009-03-12 at 21:03 -0500, Eric Sandeen wrote:
>> Frank Mayhar wrote:
>>> We're seeing errors like:
>>>   EXT4-fs error (device sda3): ext4_mb_generate_buddy: EXT4-fs: group 3049: 21020 blocks in bitmap, 21529 in gd
>>>
>>> Usually after this the system is cleaned and in the process we see many
>>> "mb_free_blocks: double-free of inode x's block y(bit z in group d)".
>>> (In fact, we see exactly as many of these as the difference between the
>>> group and computed count of free blocks.)
>>>
>>> It looks like the bitmap itself is getting messed up somehow, at least
>>> enough to make the free count disagree with the map itself.  Has anyone
>>> else seen something like this?  Any pointers as to where to look for
>>> potential culprits?
>> Which kernel, for starters?
> 
> It's our development kernel, 2.6.26 plus as many of the ext4/jbd2
> patches as we can comfortably pull in.

which makes it a little tough; can you test on upstream too to see if it
persists?

At this point you are becoming your own distribution (but I suppose you
are used to that) ;)

-Eric


* Re: ext4_mb_generate_buddy and double-free errors
  2009-03-13  3:25     ` Eric Sandeen
@ 2009-03-13  3:32       ` Curt Wohlgemuth
  0 siblings, 0 replies; 5+ messages in thread
From: Curt Wohlgemuth @ 2009-03-13  3:32 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Frank Mayhar, ext4 development, mrubin

Hi Eric:

On Thu, Mar 12, 2009 at 8:25 PM, Eric Sandeen <sandeen@redhat.com> wrote:
> Frank Mayhar wrote:
>> On Thu, 2009-03-12 at 21:03 -0500, Eric Sandeen wrote:
>>> Frank Mayhar wrote:
>>>> We're seeing errors like:
>>>>   EXT4-fs error (device sda3): ext4_mb_generate_buddy: EXT4-fs: group 3049: 21020 blocks in bitmap, 21529 in gd
>>>>
>>>> Usually after this the system is cleaned and in the process we see many
>>>> "mb_free_blocks: double-free of inode x's block y(bit z in group d)".
>>>> (In fact, we see exactly as many of these as the difference between the
>>>> group and computed count of free blocks.)
>>>>
>>>> It looks like the bitmap itself is getting messed up somehow, at least
>>>> enough to make the free count disagree with the map itself.  Has anyone
>>>> else seen something like this?  Any pointers as to where to look for
>>>> potential culprits?
>>> Which kernel, for starters?
>>
>> It's our development kernel, 2.6.26 plus as many of the ext4/jbd2
>> patches as we can comfortably pull in.
>
> which makes it a little tough; can you test on upstream too to see if it
> persists?
>
> At this point you are becoming your own distribution (but I suppose you
> are used to that) ;)

(sigh) Yes we are.  I've pulled in nearly all the patches up in the
ext4-stable branch through the beginning of Feb.  I looked through
patches in this branch today, and didn't see anything new that seemed
relevant.

Testing on upstream won't work for us, unfortunately.

We're mostly hoping that if anybody else has seen this problem they
can chime in with their experiences.  The generate_buddy code that
encounters this error just resets the group descriptor bb_free to the
value in the bitmap, so it's likely not fatal, but this is our first
exposure to some interesting workloads, so we'd like to nail down the
cause asap.
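
As a rough illustration of the check described above (not the kernel's actual code), the following userspace sketch counts the free (clear) bits in a block bitmap, compares the result with the free count stored for the group, reports a mismatch in the same shape as the error message, and then resets the stored count to the bitmap's value. The names struct group_info, free_in_gd, count_free_bits and check_group are hypothetical and exist only for this example. For the reported message the discrepancy would be 21529 - 21020 = 509, which by Frank's observation is also the number of double-free messages seen afterwards.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for per-group bookkeeping; not the kernel's structures. */
struct group_info {
    unsigned int free_in_gd;   /* free-block count recorded for the group */
};

/* Count clear bits (free blocks) among the first nbits of the bitmap. */
static unsigned int count_free_bits(const uint8_t *bitmap, size_t nbits)
{
    unsigned int free_blocks = 0;

    for (size_t i = 0; i < nbits; i++) {
        if (!(bitmap[i / 8] & (1u << (i % 8))))
            free_blocks++;
    }
    return free_blocks;
}

/*
 * Compare the bitmap-derived free count with the stored count.
 * On mismatch, report it and reset the stored count to the bitmap's value.
 * Returns the discrepancy (stored count minus counted), 0 if consistent.
 */
static long check_group(struct group_info *grp, unsigned int group_no,
                        const uint8_t *bitmap, size_t nbits)
{
    unsigned int counted = count_free_bits(bitmap, nbits);
    long diff = (long)grp->free_in_gd - (long)counted;

    if (diff != 0) {
        fprintf(stderr, "group %u: %u blocks in bitmap, %u in gd\n",
                group_no, counted, grp->free_in_gd);
        grp->free_in_gd = counted;   /* trust the bitmap, as described above */
    }
    return diff;
}
```

Calling check_group() a second time on the same group returns 0, since the stored count has already been brought into line with the bitmap. This only repairs the count; it says nothing about how the bitmap and the count diverged in the first place, which is the open question in this thread.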

Thanks,
Curt
