From: Curt Wohlgemuth <curtw@google.com>
To: Theodore Tso <tytso@mit.edu>
Cc: Valerie Aurora <vaurora@redhat.com>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: Odd "leak" of extent info into data blocks?
Date: Tue, 8 Sep 2009 14:18:35 -0700
Message-ID: <6601abe90909081418k5de55938mfe411fccfe10a258@mail.gmail.com>
In-Reply-To: <20090908194045.GQ22901@mit.edu>

Hi Ted:

On Tue, Sep 8, 2009 at 12:40 PM, Theodore Tso<tytso@mit.edu> wrote:
> On Tue, Sep 08, 2009 at 11:21:11AM -0700, Curt Wohlgemuth wrote:
>> Hi Valerie:
>>
>> On Tue, Sep 8, 2009 at 10:56 AM, Valerie Aurora<vaurora@redhat.com> wrote:
>> > Hey, did you figure this out?  If not, I want to have a bug open
>> > somewhere.
>>
>> Yes, sorry.  I was going to post a patch for this, but have been
>> waiting to verify that it really fixes the issue.  And see the thread
>> started by Frank Mayhar about fsync issues as well...
>>
>> The problem is a race, between the last write to a to-be-freed
>> metadata block (to update the extent header) and the block being
>> marked free in the on-disk/buddy bitmaps.  Note that this only happens
>> without a journal, since *with* a journal the ordering is done
>> correctly.
>
> Just to clarify, this is a race that shows up even without an unclean
> shutdown, right?

Correct.

>> Without a journal, the block buffer_head is written to, the
>> buffer_head is marked dirty, and the bitmaps are updated via
>> ext4_free_blocks().  In rare cases, the block is re-allocated for
>> another inode and written to -- subsequently, the writeback mechanism
>> will then flush the dirty extent header back to disk.  That's why it
>> looks like "leaked extent data" in the data block.
>
> If this shows up even without an unclean shutdown, then it sounds like
> the problem is a missing bforget() call.

I looked into this, and it may be merely my ignorance, but I don't see
how bforget() would solve the race.

All bforget() does is clear the buffer's dirty bit.  Meanwhile, the
page is still marked dirty, and can be in the middle of writeback.
It's true that __block_write_full_page() checks the dirty bit for
each buffer in the page, but there doesn't seem to be any
synchronization to ensure that the write won't take place at some
point after bforget() is called.  That means the write can still
happen after the bitmap is changed.

This is why I opted to wait for the buffer to be written out before
continuing on to ext4_free_blocks().
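
To make the ordering concrete, here is a minimal userspace sketch of the
two sequences.  It is a toy model, not kernel code and not the patch
itself: every toy_* name is hypothetical, the "disk" is a single block,
and it assumes writeback has already tested and cleared the dirty bit
before the block is forgotten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define BLK 32

struct toy_disk {
    char block[BLK];   /* on-disk contents of the one block we track */
    bool in_use;       /* stand-in for the block's bitmap bit */
};

struct toy_buffer {
    char data[BLK];    /* cached metadata (the extent header) */
    bool dirty;        /* buffer dirty bit */
    bool in_flight;    /* writeback sampled the dirty bit, I/O pending */
};

/* Writeback step 1: test-and-clear the dirty bit and queue the I/O. */
static void toy_writeback_start(struct toy_buffer *b)
{
    if (b->dirty) {
        b->dirty = false;
        b->in_flight = true;
    }
}

/* Writeback step 2: the queued I/O reaches the disk. */
static void toy_writeback_finish(struct toy_buffer *b, struct toy_disk *d)
{
    if (b->in_flight) {
        memcpy(d->block, b->data, BLK);
        b->in_flight = false;
    }
}

/* Models only the effect discussed above: clear the dirty bit. */
static void toy_bforget(struct toy_buffer *b)
{
    b->dirty = false;
}

/* Another inode is handed the freed block and writes its data to it. */
static void toy_realloc_and_write(struct toy_disk *d, const char *payload)
{
    assert(!d->in_use);            /* the block must have been freed */
    d->in_use = true;
    snprintf(d->block, BLK, "%s", payload);
}

/* Returns true if the new owner's data is what ends up on disk. */
bool toy_run(bool wait_before_free)
{
    struct toy_disk d = { .block = "", .in_use = true };
    struct toy_buffer b = { .data = "EXTENT-HEADER", .dirty = true };

    toy_writeback_start(&b);           /* writeback is already under way */

    if (wait_before_free)
        toy_writeback_finish(&b, &d);  /* wait for the write to land */
    else
        toy_bforget(&b);               /* no effect on the queued I/O */

    d.in_use = false;                  /* block marked free in the bitmap */
    toy_realloc_and_write(&d, "FILE-DATA");
    toy_writeback_finish(&b, &d);      /* any still-pending I/O lands now */

    return strcmp(d.block, "FILE-DATA") == 0;
}
```

In this model toy_run(false) returns false, because the stale extent
header lands on top of the new owner's data, while toy_run(true)
returns true, because the write completes before the block is freed.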

Am I missing something?

Thanks,
Curt
