From: Alex Tomas <bzzz@sun.com>
To: Theodore Tso <tytso@mit.edu>
Cc: Kevin Shanahan <kmshanah@ucwb.org.au>,
	Andreas Dilger <adilger@sun.com>,
	linux-ext4@vger.kernel.org
Subject: Re: More ext4 acl/xattr corruption - 4th occurence now
Date: Fri, 15 May 2009 07:57:14 +0400
Message-ID: <4A0CE81A.1060006@sun.com>
In-Reply-To: <20090514161254.GJ11352@mit.edu>

When the extent cache was introduced, a single exclusive spinlock
protected the whole of ext3_ext_get_blocks, so there was no concurrency
at all.  So I guess your theory is correct.

thanks, Alex

Theodore Tso wrote:
> On Fri, May 15, 2009 at 12:00:15AM +0930, Kevin Shanahan wrote:
>>> debugfs: stat <759>
>> hermes:~# debugfs /dev/dm-0
>> debugfs 1.41.3 (12-Oct-2008)
>> debugfs:  stat <759>
>>
>> Inode: 759   Type: regular    Mode:  0660   Flags: 0x80000
>> Generation: 3979120103    Version: 0x00000000:00000001
>> User:     0   Group: 10140   Size: 14615630848
>> File ACL: 0    Directory ACL: 0
>> Links: 1   Blockcount: 28546168
>> Fragment:  Address: 0    Number: 0    Size: 0
>>  ctime: 0x4a0acdb5:2a88cbec -- Wed May 13 23:10:05 2009
>>  atime: 0x4a0ac45b:10899618 -- Wed May 13 22:30:11 2009
>>  mtime: 0x4a0acdb5:2a88cbec -- Wed May 13 23:10:05 2009
>> crtime: 0x4a0ac45b:10899618 -- Wed May 13 22:30:11 2009
> 
>> Inode   Pathname
>> 759     /local/dumps/exchange/exchange-2000-UCWB-KVM-18.bkf
> 
> Do you know how the system was likely writing into
> /local/dumps/exchange/exchange-2000-UCWB-KVM-18.bkf?  Was this a
> backup via rsync or tar?  Was this some application writing into a
> pre-existing file via NFS, or via local disk access?
> 
> Given the ctime/atime fields, I'm inclined to guess the latter, but it
> would be good to know.
> 
> The stat dump for the inode 759 does *not* show logical block 1741329
> getting mapped to physical block 529.  So the question is how did that
> happen?
> 
> I've started looking, and one thing popped up at me.  I need to check
> in with the Lustre folks who originally donated the code, but I don't
> see any spinlock or mutexes protecting the inode's extent cache.  So
> if you are on an SMP machine, this could potentially have caused the
> problem.  How many CPUs or cores do you have?  What does
> /proc/cpuinfo report?  Also, would it be correct to assume this file
> is getting served up via Samba?  My theory is that we might be running
> into problems when two threads are trying to read and write a single
> file at the same time.
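
[Illustration of the theory above.  This is a minimal user-space sketch,
not kernel code: the struct is a hypothetical stand-in for the cached
extent, the block numbers are made up, and the interleaving is replayed
in a single thread so that it is deterministic.  It only shows how an
unlocked, half-updated cache entry could map a logical block to the
wrong physical block; it does not claim this is what happened here.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode's single cached extent. */
struct ext_cache {
	uint32_t ec_block;	/* first logical block */
	uint32_t ec_len;	/* length in blocks, 0 = empty */
	uint64_t ec_start;	/* first physical block */
};

/* Unlocked lookup: translate a logical block through the cached extent. */
static bool lookup(const struct ext_cache *c, uint32_t lblk, uint64_t *pblk)
{
	if (c->ec_len == 0 || lblk < c->ec_block ||
	    lblk - c->ec_block >= c->ec_len)
		return false;
	*pblk = c->ec_start + (lblk - c->ec_block);
	return true;
}

/*
 * Replay one possible interleaving: a writer is halfway through
 * replacing extent A with extent B when a reader does its lookup.
 */
static uint64_t torn_lookup(void)
{
	/* extent A: logical 0..15 -> physical 500..515 (made-up numbers) */
	struct ext_cache c = { .ec_block = 0, .ec_len = 16, .ec_start = 500 };
	uint64_t pblk = 0;
	bool hit;

	/* writer: start caching extent B (logical 1000.. -> physical 9000..) */
	c.ec_block = 1000;
	c.ec_len = 16;
	/* ... writer is delayed before it stores ec_start ... */

	/* reader on another CPU sees new ec_block with old ec_start */
	hit = lookup(&c, 1004, &pblk);
	assert(hit);

	c.ec_start = 9000;	/* writer finishes, too late for the reader */
	return pblk;		/* based on extent A's start, not extent B's */
}
```

With these illustrative numbers the reader gets a block inside extent A
even though logical block 1004 belongs to extent B.
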
> 
> Hmm, what is accessing your files on this system?  Are you just doing
> backups?  Is it just a backup server?  Or are you serving up files
> using Samba and there are clients which are accessing those files?
> 
> If this is the problem, the following experiment should confirm it:
> see whether the corruption goes away when we short-circuit the inode's
> extent cache.  In fs/ext4/extents.c, try inserting a "return"
> statement in ext4_ext_put_in_cache():
> 
> static void
> ext4_ext_put_in_cache(struct inode *inode, ext4_lblk_t block,
> 			__u32 len, ext4_fsblk_t start, int type)
> {
> 	struct ext4_ext_cache *cex;
> 
> 	return;		      /* <---- insert this line */
> 	BUG_ON(len == 0);
> 	cex = &EXT4_I(inode)->i_cached_extent;
> 	cex->ec_type = type;
> 	cex->ec_block = block;
> 	cex->ec_len = len;
> 	cex->ec_start = start;
> }
> 
> This should short circuit the i_cached_extent cache, and this may be
> enough to make your problem go away.  (If this theory is correct,
> using mount -o nodelalloc probably won't make a difference, although
> it might change the timing enough to make the bug harder to see.)
> 
> If that solves the problem, the right long-term fix will be to drop in
> a spinlock to protect i_cached_extent.
> 
> 						- Ted
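
[Sketch of the long-term fix mentioned above, in user space.  Everything
here is an assumption made for illustration: the struct and function
names are hypothetical, and a toy spinlock built on C11 atomic_flag
stands in for the kernel's spinlock_t.  The point is only that the
writer and the reader take the same lock, so a reader can never see a
half-updated entry.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the cached extent plus its lock. */
struct ext_cache {
	uint64_t ec_start;	/* first physical block */
	uint32_t ec_block;	/* first logical block */
	uint32_t ec_len;	/* length in blocks, 0 = empty */
	int ec_type;
	atomic_flag lock;	/* toy spinlock guarding the fields above */
};

static void ext_cache_init(struct ext_cache *c)
{
	c->ec_start = 0;
	c->ec_block = 0;
	c->ec_len = 0;
	c->ec_type = 0;
	atomic_flag_clear(&c->lock);
}

static void cache_lock(struct ext_cache *c)
{
	/* spin until the flag was previously clear */
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;
}

static void cache_unlock(struct ext_cache *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Store one extent; all four fields change under the lock. */
static void ext_put_in_cache(struct ext_cache *c, uint32_t block,
			     uint32_t len, uint64_t start, int type)
{
	assert(len != 0);	/* plays the role of BUG_ON(len == 0) */
	cache_lock(c);
	c->ec_type = type;
	c->ec_block = block;
	c->ec_len = len;
	c->ec_start = start;
	cache_unlock(c);
}

/* Returns true and fills *pblk if the cached extent covers lblk. */
static bool ext_in_cache(struct ext_cache *c, uint32_t lblk, uint64_t *pblk)
{
	bool hit = false;

	cache_lock(c);
	if (c->ec_len != 0 && lblk >= c->ec_block &&
	    lblk - c->ec_block < c->ec_len) {
		*pblk = c->ec_start + (lblk - c->ec_block);
		hit = true;
	}
	cache_unlock(c);
	return hit;
}
```

A caller would run ext_cache_init() once, then use ext_put_in_cache()
and ext_in_cache() from any thread.  The lock is held only for a few
loads and stores, which is the kind of short critical section a
spinlock suits.
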

