All of lore.kernel.org
 help / color / mirror / Atom feed
From: Theodore Tso <tytso@mit.edu>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Kevin Shanahan <kmshanah@ucwb.org.au>, linux-ext4@vger.kernel.org
Subject: Re: More ext4 acl/xattr corruption - 4th occurence now
Date: Fri, 15 May 2009 06:27:19 -0400	[thread overview]
Message-ID: <20090515102719.GB6816@mit.edu> (raw)
In-Reply-To: <20090515045827.GB1279@skywalker>

On Fri, May 15, 2009 at 10:28:27AM +0530, Aneesh Kumar K.V wrote:
> > commit 8ff799da106e9fc4da9b2a3753b5b86caab27f13
> > Author: Theodore Ts'o <tytso@mit.edu>
> > Date:   Thu May 14 00:39:48 2009 -0400
> > 
> >     ext4: Add a block validity check to ext4_get_blocks_wrap()
> >     
> >     A few users with very large disks have been reporting low block number
> >     filesystem corruptions, potentially zapping the block group
> >     descriptors or inodes in the first inode table block.  It's not clear
> >     what is causing this, but most recently, it appears that whatever is
> >     trashing the filesystem metadata appears to be file data.  So let's
> >     try to set a trap for the corruption in ext4_get_blocks_wrap(), which
> >     is where logical blocks in an inode are mapped to physical blocks.
> 
> 
> We already do block validation in allocator. In
> ext4_mb_mark_diskspace_used we check whether the block allocator is
> wrongly getting blocks from system zone. So i guess we don't need
> this patch. Or may be we need to remove the check in
> ext4_mb_mark_diskspace_used and add the ckeck in get_block so that
> it catches the extent cache corruption as found by this bug.

Yeah, I was debating what to do with this patch long term.  It catches
problems than the sanity checks in ext4_mb_mark_diskspace_used() in
mballoc.c, or ext4_valid_extent() in ext4/extents.c will miss.
However, (a) it's expensive to do these sorts of tests, and (b) it
only checks for problems in the primary block group descriptors and
first block group's inode tables, since this is where we had been
reporting problems.

The other problem is that ext4_mb_mark_diskspace_used()'s tests assume
!flex_bg, and with flex_bg, its checks are largely ineffective.

What I think the right approach probably will be is that we need to
build an rbtree of the "system zone" blocks at mount time.  Especially
with flex_bg, in the normal case the block group descriptors and
bitmap blocks will form a very nice, contiguous set of extents that
should be relatively compactly stored in an rbtree.  This will allow
us to make the tests done by ext4_valid_extent(),
ext4_mb_mark_diskspace_used(), and ext4_get_blocks() to be ***much***
more effective and comprehensive. 

This still doesn't answer the question of where we should be doing the
tests.  If we're trying to track down a bug, we probably want to be
doing these tests everywhere.  For performance reasons, though, my
guess is that *most* of the tests should be default disabled except
unless a mount option is given.  (Why a run-time mount option as
opposed a compile-time tests?  So that if this bug happens in the
field, a Level 3 engineer from RHEL's, SLES's, or IBM's help desk can
tell the customer to mount with this option, and track it down without
requesting that the user install a custom kernel; having done some
field work, I can tell you this will save a huge amount of
time/effort.)  

I would probably keep the tests in ext4_valid_extent() so we can test
for on-disk corruption, but we would probably want to turn off the
tests at allocation and ext4_get_blocks() for performance reasons ---
at least until someone has a chance to do some real benchmarking tests
and confirms whether or not the extra CPU utilization is visible on
say a TPC-H or TPC-C run.

						- Ted

  reply	other threads:[~2009-05-15 10:27 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-13  6:26 More ext4 acl/xattr corruption - 4th occurence now Kevin Shanahan
2009-05-13 23:56 ` Kevin Shanahan
2009-05-14  4:40 ` Theodore Tso
2009-05-14 11:07   ` Kevin Shanahan
2009-05-14 11:17     ` Manish Katiyar
2009-05-14 12:30       ` Theodore Tso
2009-05-14 13:25     ` Kevin Shanahan
2009-05-14 14:07       ` Theodore Tso
2009-05-14 14:30         ` Kevin Shanahan
2009-05-14 15:44           ` Eric Sandeen
2009-05-14 21:07             ` Kevin Shanahan
2009-05-14 21:08               ` Eric Sandeen
2009-05-14 16:12           ` Theodore Tso
2009-05-14 21:02             ` Kevin Shanahan
2009-05-14 21:23               ` Theodore Tso
2009-05-14 21:33                 ` Kevin Shanahan
2009-05-15 23:18                   ` Kevin Shanahan
2009-05-15  1:21                 ` Eric Sandeen
2009-05-15 12:50                   ` Theodore Tso
2009-05-15 12:58                     ` Eric Sandeen
2009-05-15 15:24                       ` Eric Sandeen
2009-05-15 16:27                         ` Eric Sandeen
2009-05-15  4:55                 ` Aneesh Kumar K.V
2009-05-15 10:11                   ` Theodore Tso
2009-05-15 13:07                   ` Theodore Tso
2009-05-19 10:00                 ` Thierry Vignaud
2009-05-19 11:36                   ` Theodore Tso
2009-05-19 12:01                     ` Alex Tomas
2009-05-19 15:04                       ` Theodore Tso
2009-05-19 15:16                         ` Alex Tomas
2009-05-19 15:18                         ` Thierry Vignaud
2009-05-15  3:57             ` Alex Tomas
2009-05-15  4:58   ` Aneesh Kumar K.V
2009-05-15 10:27     ` Theodore Tso [this message]
2009-05-18  2:14       ` [PATCH] ext4: Add a comprehensive block validity check to ext4_get_blocks() (Was: More ext4 acl/xattr corruption - 4th occurence now) Theodore Tso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090515102719.GB6816@mit.edu \
    --to=tytso@mit.edu \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=kmshanah@ucwb.org.au \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.