linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: tytso@mit.edu
To: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-ext4@vger.kernel.org, Jens Axboe <jens.axboe@oracle.com>,
	dmitry.torokhov@gmail.com, Jan Kara <jack@suse.cz>
Subject: Re: 2.6.33-rc1: kernel BUG at fs/ext4/inode.c:1063 (sparc)
Date: Sun, 27 Dec 2009 22:51:59 -0500	[thread overview]
Message-ID: <20091228035159.GC4429@thunk.org> (raw)
In-Reply-To: <a4423d670912271502n13957e8boc168fdc75444dc09@mail.gmail.com>

kernel BUG at fs/ext4/inode.c:1063!

> >> Here is output of 2.6.33-rc2 plus Ted's patch
> >>
> >> EXT4-fs (sda1): inode #1387643: mdb_free (1) < mdb_claim (2) BUG
> >

OK, i've been able to reproduce the problem using xfsqa test #74
(fstest) when an ext3 file system is mounted the ext4 file system
driver.  I was then able to bisect it down to commit d21cd8f6, which
was introduced between 2.6.33-rc1 and 2.6.33-rc2, as part of
quota/ext4 patch series pushed by Jan.

I then tested v2.6.33-rc2 with commit d21cd8 reverted, and I was not
able to replicate the BUG.  More investigation is needed, but if we
compare the potential quota deadlock with the apparently
fairly-easy-to-replicate BUG, if we can't find a better fix fairly
quickly we should probably just revert this commit for now.

						- Ted

commit d21cd8f163ac44b15c465aab7306db931c606908
Author:     Dmitry Monakhov <dmonakhov@openvz.org>
AuthorDate: Thu Dec 10 03:31:45 2009 +0000
Commit:     Jan Kara <jack@suse.cz>
CommitDate: Wed Dec 23 13:44:12 2009 +0100

    ext4: Fix potential quota deadlock
    
    We have to delay vfs_dq_claim_space() until allocation context destruction.
    Currently we have following call-trace:
    ext4_mb_new_blocks()
      /* task is already holding ac->alloc_semp */
     ->ext4_mb_mark_diskspace_used
        ->vfs_dq_claim_space()  /*  acquire dqptr_sem here. Possible deadlock */
     ->ext4_mb_release_context() /* drop ac->alloc_semp here */
    
    Let's move quota claiming to ext4_da_update_reserve_space()
    
     =======================================================
     [ INFO: possible circular locking dependency detected ]
     2.6.32-rc7 #18
     -------------------------------------------------------
     write-truncate-/3465 is trying to acquire lock:
      (&s->s_dquot.dqptr_sem){++++..}, at: [<c025e73b>] dquot_claim_space+0x3b/0x1b0
    
     but task is already holding lock:
      (&meta_group_info[i]->alloc_sem){++++..}, at: [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370
    
     which lock already depends on the new lock.
    
     the existing dependency chain (in reverse order) is:
    
     -> #3 (&meta_group_info[i]->alloc_sem){++++..}:
            [<c017d04b>] __lock_acquire+0xd7b/0x1260
            [<c017d5ea>] lock_acquire+0xba/0xd0
            [<c0527191>] down_read+0x51/0x90
            [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370
            [<c02d0c1c>] ext4_mb_free_blocks+0x46c/0x870
            [<c029c9d3>] ext4_free_blocks+0x73/0x130
            [<c02c8cfc>] ext4_ext_truncate+0x76c/0x8d0
            [<c02a8087>] ext4_truncate+0x187/0x5e0
            [<c01e0f7b>] vmtruncate+0x6b/0x70
            [<c022ec02>] inode_setattr+0x62/0x190
            [<c02a2d7a>] ext4_setattr+0x25a/0x370
            [<c022ee81>] notify_change+0x151/0x340
            [<c021349d>] do_truncate+0x6d/0xa0
            [<c0221034>] may_open+0x1d4/0x200
            [<c022412b>] do_filp_open+0x1eb/0x910
            [<c021244d>] do_sys_open+0x6d/0x140
            [<c021258e>] sys_open+0x2e/0x40
            [<c0103100>] sysenter_do_call+0x12/0x32
    
     -> #2 (&ei->i_data_sem){++++..}:
            [<c017d04b>] __lock_acquire+0xd7b/0x1260
            [<c017d5ea>] lock_acquire+0xba/0xd0
            [<c0527191>] down_read+0x51/0x90
            [<c02a5787>] ext4_get_blocks+0x47/0x450
            [<c02a74c1>] ext4_getblk+0x61/0x1d0
            [<c02a7a7f>] ext4_bread+0x1f/0xa0
            [<c02bcddc>] ext4_quota_write+0x12c/0x310
            [<c0262d23>] qtree_write_dquot+0x93/0x120
            [<c0261708>] v2_write_dquot+0x28/0x30
            [<c025d3fb>] dquot_commit+0xab/0xf0
            [<c02be977>] ext4_write_dquot+0x77/0x90
            [<c02be9bf>] ext4_mark_dquot_dirty+0x2f/0x50
            [<c025e321>] dquot_alloc_inode+0x101/0x180
            [<c029fec2>] ext4_new_inode+0x602/0xf00
            [<c02ad789>] ext4_create+0x89/0x150
            [<c0221ff2>] vfs_create+0xa2/0xc0
            [<c02246e7>] do_filp_open+0x7a7/0x910
            [<c021244d>] do_sys_open+0x6d/0x140
            [<c021258e>] sys_open+0x2e/0x40
            [<c0103100>] sysenter_do_call+0x12/0x32
    
     -> #1 (&sb->s_type->i_mutex_key#7/4){+.+...}:
            [<c017d04b>] __lock_acquire+0xd7b/0x1260
            [<c017d5ea>] lock_acquire+0xba/0xd0
            [<c0526505>] mutex_lock_nested+0x65/0x2d0
            [<c0260c9d>] vfs_load_quota_inode+0x4bd/0x5a0
            [<c02610af>] vfs_quota_on_path+0x5f/0x70
            [<c02bc812>] ext4_quota_on+0x112/0x190
            [<c026345a>] sys_quotactl+0x44a/0x8a0
            [<c0103100>] sysenter_do_call+0x12/0x32
    
     -> #0 (&s->s_dquot.dqptr_sem){++++..}:
            [<c017d361>] __lock_acquire+0x1091/0x1260
            [<c017d5ea>] lock_acquire+0xba/0xd0
            [<c0527191>] down_read+0x51/0x90
            [<c025e73b>] dquot_claim_space+0x3b/0x1b0
            [<c02cb95f>] ext4_mb_mark_diskspace_used+0x36f/0x380
            [<c02d210a>] ext4_mb_new_blocks+0x34a/0x530
            [<c02c83fb>] ext4_ext_get_blocks+0x122b/0x13c0
            [<c02a5966>] ext4_get_blocks+0x226/0x450
            [<c02a5ff3>] mpage_da_map_blocks+0xc3/0xaa0
            [<c02a6ed6>] ext4_da_writepages+0x506/0x790
            [<c01de272>] do_writepages+0x22/0x50
            [<c01d766d>] __filemap_fdatawrite_range+0x6d/0x80
            [<c01d7b9b>] filemap_flush+0x2b/0x30
            [<c02a40ac>] ext4_alloc_da_blocks+0x5c/0x60
            [<c029e595>] ext4_release_file+0x75/0xb0
            [<c0216b59>] __fput+0xf9/0x210
            [<c0216c97>] fput+0x27/0x30
            [<c02122dc>] filp_close+0x4c/0x80
            [<c014510e>] put_files_struct+0x6e/0xd0
            [<c01451b7>] exit_files+0x47/0x60
            [<c0146a24>] do_exit+0x144/0x710
            [<c0147028>] do_group_exit+0x38/0xa0
            [<c0159abc>] get_signal_to_deliver+0x2ac/0x410
            [<c0102849>] do_notify_resume+0xb9/0x890
            [<c01032d2>] work_notifysig+0x13/0x21
    
     other info that might help us debug this:
    
     3 locks held by write-truncate-/3465:
      #0:  (jbd2_handle){+.+...}, at: [<c02e1f8f>] start_this_handle+0x38f/0x5c0
      #1:  (&ei->i_data_sem){++++..}, at: [<c02a57f6>] ext4_get_blocks+0xb6/0x450
      #2:  (&meta_group_info[i]->alloc_sem){++++..}, at: [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370
    
     stack backtrace:
     Pid: 3465, comm: write-truncate- Not tainted 2.6.32-rc7 #18
     Call Trace:
      [<c0524cb3>] ? printk+0x1d/0x22
      [<c017ac9a>] print_circular_bug+0xca/0xd0
      [<c017d361>] __lock_acquire+0x1091/0x1260
      [<c016bca2>] ? sched_clock_local+0xd2/0x170
      [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
      [<c017d5ea>] lock_acquire+0xba/0xd0
      [<c025e73b>] ? dquot_claim_space+0x3b/0x1b0
      [<c0527191>] down_read+0x51/0x90
      [<c025e73b>] ? dquot_claim_space+0x3b/0x1b0
      [<c025e73b>] dquot_claim_space+0x3b/0x1b0
      [<c02cb95f>] ext4_mb_mark_diskspace_used+0x36f/0x380
      [<c02d210a>] ext4_mb_new_blocks+0x34a/0x530
      [<c02c601d>] ? ext4_ext_find_extent+0x25d/0x280
      [<c02c83fb>] ext4_ext_get_blocks+0x122b/0x13c0
      [<c016bca2>] ? sched_clock_local+0xd2/0x170
      [<c016be60>] ? sched_clock_cpu+0x120/0x160
      [<c016beef>] ? cpu_clock+0x4f/0x60
      [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
      [<c052712c>] ? down_write+0x8c/0xa0
      [<c02a5966>] ext4_get_blocks+0x226/0x450
      [<c016be60>] ? sched_clock_cpu+0x120/0x160
      [<c016beef>] ? cpu_clock+0x4f/0x60
      [<c017908b>] ? trace_hardirqs_off+0xb/0x10
      [<c02a5ff3>] mpage_da_map_blocks+0xc3/0xaa0
      [<c01d69cc>] ? find_get_pages_tag+0x16c/0x180
      [<c01d6860>] ? find_get_pages_tag+0x0/0x180
      [<c02a73bd>] ? __mpage_da_writepage+0x16d/0x1a0
      [<c01dfc4e>] ? pagevec_lookup_tag+0x2e/0x40
      [<c01ddf1b>] ? write_cache_pages+0xdb/0x3d0
      [<c02a7250>] ? __mpage_da_writepage+0x0/0x1a0
      [<c02a6ed6>] ext4_da_writepages+0x506/0x790
      [<c016beef>] ? cpu_clock+0x4f/0x60
      [<c016bca2>] ? sched_clock_local+0xd2/0x170
      [<c016be60>] ? sched_clock_cpu+0x120/0x160
      [<c016be60>] ? sched_clock_cpu+0x120/0x160
      [<c02a69d0>] ? ext4_da_writepages+0x0/0x790
      [<c01de272>] do_writepages+0x22/0x50
      [<c01d766d>] __filemap_fdatawrite_range+0x6d/0x80
      [<c01d7b9b>] filemap_flush+0x2b/0x30
      [<c02a40ac>] ext4_alloc_da_blocks+0x5c/0x60
      [<c029e595>] ext4_release_file+0x75/0xb0
      [<c0216b59>] __fput+0xf9/0x210
      [<c0216c97>] fput+0x27/0x30
      [<c02122dc>] filp_close+0x4c/0x80
      [<c014510e>] put_files_struct+0x6e/0xd0
      [<c01451b7>] exit_files+0x47/0x60
      [<c0146a24>] do_exit+0x144/0x710
      [<c017b163>] ? lock_release_holdtime+0x33/0x210
      [<c0528137>] ? _spin_unlock_irq+0x27/0x30
      [<c0147028>] do_group_exit+0x38/0xa0
      [<c017babb>] ? trace_hardirqs_on+0xb/0x10
      [<c0159abc>] get_signal_to_deliver+0x2ac/0x410
      [<c0102849>] do_notify_resume+0xb9/0x890
      [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
      [<c017b163>] ? lock_release_holdtime+0x33/0x210
      [<c0165b50>] ? autoremove_wake_function+0x0/0x50
      [<c017ba54>] ? trace_hardirqs_on_caller+0x134/0x190
      [<c017babb>] ? trace_hardirqs_on+0xb/0x10
      [<c0300ba4>] ? security_file_permission+0x14/0x20
      [<c0215761>] ? vfs_write+0x131/0x190
      [<c0214f50>] ? do_sync_write+0x0/0x120
      [<c0103115>] ? sysenter_do_call+0x27/0x32
      [<c01032d2>] work_notifysig+0x13/0x21
    
    CC: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
    Signed-off-by: Jan Kara <jack@suse.cz>


  reply	other threads:[~2009-12-28  3:52 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-24 22:28 2.6.33-rc1: kernel BUG at fs/ext4/inode.c:1063 (sparc) Alexander Beregalov
2009-12-24 22:49 ` Alexander Beregalov
2009-12-25 12:31   ` Dmitry Monakhov
2009-12-25 19:33     ` Alexander Beregalov
2009-12-25 23:47       ` Dmitry Monakhov
2009-12-27 20:32         ` Alexander Beregalov
2009-12-27 21:38           ` Dmitry Torokhov
2009-12-27 22:52           ` tytso
2009-12-27 23:02             ` Alexander Beregalov
2009-12-28  3:51               ` tytso [this message]
2009-12-30  5:37                 ` tytso
2009-12-30 13:18                   ` Dmitry Monakhov
2009-12-30 17:45                     ` tytso
2009-12-30 17:48                     ` tytso
2009-12-24 23:05 ` tytso
2009-12-24 23:15   ` tytso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091228035159.GC4429@thunk.org \
    --to=tytso@mit.edu \
    --cc=a.beregalov@gmail.com \
    --cc=dmitry.torokhov@gmail.com \
    --cc=dmonakhov@openvz.org \
    --cc=jack@suse.cz \
    --cc=jens.axboe@oracle.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).