From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx2.suse.de ([195.135.220.15]:52743 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750900AbcGDRfG (ORCPT ); Mon, 4 Jul 2016 13:35:06 -0400 Date: Mon, 4 Jul 2016 19:35:29 +0200 From: David Sterba To: Wang Xiaoguang Cc: linux-btrfs@vger.kernel.org Subject: Re: [PATCH 1/2] btrfs: fix fsfreeze hang caused by delayed iputs deal Message-ID: <20160704173529.GM13336@twin.jikos.cz> Reply-To: dsterba@suse.cz References: <20160629051510.17222-1-wangxg.fnst@cn.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <20160629051510.17222-1-wangxg.fnst@cn.fujitsu.com> Sender: linux-btrfs-owner@vger.kernel.org List-ID: On Wed, Jun 29, 2016 at 01:15:10PM +0800, Wang Xiaoguang wrote: > When running fstests generic/068, sometimes we got below WARNING: > xfs_io D ffff8800331dbb20 0 6697 6693 0x00000080 > ffff8800331dbb20 ffff88007acfc140 ffff880034d895c0 ffff8800331dc000 > ffff880032d243e8 fffffffeffffffff ffff880032d24400 0000000000000001 > ffff8800331dbb38 ffffffff816a9045 ffff880034d895c0 ffff8800331dbba8 > Call Trace: > [] schedule+0x35/0x80 > [] rwsem_down_read_failed+0xf2/0x140 > [] ? __filemap_fdatawrite_range+0xd1/0x100 > [] call_rwsem_down_read_failed+0x18/0x30 > [] ? btrfs_alloc_block_rsv+0x2c/0xb0 [btrfs] > [] percpu_down_read+0x35/0x50 > [] __sb_start_write+0x2c/0x40 > [] start_transaction+0x2a5/0x4d0 [btrfs] > [] btrfs_join_transaction+0x17/0x20 [btrfs] > [] btrfs_evict_inode+0x3c4/0x5d0 [btrfs] > [] evict+0xba/0x1a0 > [] iput+0x196/0x200 > [] btrfs_run_delayed_iputs+0x70/0xc0 [btrfs] > [] btrfs_commit_transaction+0x928/0xa80 [btrfs] > [] btrfs_freeze+0x30/0x40 [btrfs] > [] freeze_super+0xf0/0x190 > [] do_vfs_ioctl+0x4a5/0x5c0 > [] ? do_audit_syscall_entry+0x66/0x70 > [] ? syscall_trace_enter_phase1+0x11f/0x140 > [] SyS_ioctl+0x79/0x90 > [] do_syscall_64+0x62/0x110 > [] entry_SYSCALL64_slow_path+0x25/0x25 > > >From this warning, freeze_super() already holds SB_FREEZE_FS, but > btrfs_freeze() will call btrfs_commit_transaction() again, if > btrfs_commit_transaction() finds that it has delayed iputs to handle, > it'll start_transaction(), which will try to get SB_FREEZE_FS lock > again, then deadlock occurs. > > The root cause is that in btrfs, sync_filesystem(sb) does not make > sure all metadata is updated. See below race window in freeze_super(): > sync_filesystem(sb); > | > | race window > | In this period, cleaner_kthread() may be scheduled to > | run, and it call btrfs_delete_unused_bgs() which will > | add some delayed iputs. > | > sb->s_writers.frozen = SB_FREEZE_FS; > sb_wait_write(sb, SB_FREEZE_FS); > if (sb->s_op->freeze_fs) { > /* freeze_fs will call btrfs_commit_transaction() */ > ret = sb->s_op->freeze_fs(sb); > > So if btrfs is doing freeze job, we should block > btrfs_delete_unused_bgs(), to avoid add delayed iputs. > > Signed-off-by: Wang Xiaoguang > --- > fs/btrfs/disk-io.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 863bf7a..fdbe0df 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -1846,8 +1846,11 @@ static int cleaner_kthread(void *arg) > * after acquiring fs_info->delete_unused_bgs_mutex. So we > * can't hold, nor need to, fs_info->cleaner_mutex when deleting > * unused block groups. > + * Extra line, but I think you intended to write a comment that explains why the freeze protection is required here :) > */ > + __sb_start_write(root->fs_info->sb, SB_FREEZE_WRITE, true); There's opencoding an existing wrapper sb_start_write, please use it instead. > btrfs_delete_unused_bgs(root->fs_info); > + __sb_end_write(root->fs_info->sb, SB_FREEZE_WRITE); > sleep: > if (!again) { > set_current_state(TASK_INTERRUPTIBLE); > -- > 2.9.0 > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html