From: Masayoshi MIZUMA <m.mizuma@jp.fujitsu.com>
To: Jan Kara <jack@suse.cz>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	"Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Subject: Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock
Date: Fri, 09 Dec 2011 10:56:18 +0900
Message-ID: <20111209105616.5A4F.61FB500B@jp.fujitsu.com>
In-Reply-To: <20110207205325.FB6A.61FB500B@jp.fujitsu.com>


(2011/02/07 20:53), Masayoshi MIZUMA wrote:

> Hi,
> 
> When I checked the freeze feature of the ext4 filesystem using the fsfreeze
> command on 2.6.38-rc3, I got the following messages:

Hi,

I checked the freeze function on 3.2.0-rc4 using the test program below.
I got the following messages, and the test program hung.
I think this bug is still present in 3.2.0-rc4.

The test program:
-----------------------------------------------------------
#!/bin/bash

DEV_1=/dev/sda5
MNT_1=/tmp/sda5
LOOP=500

if [[ ! -d $MNT_1 ]]
then
        mkdir -p $MNT_1
fi

mkfs -t ext4 $DEV_1
mount $DEV_1 $MNT_1

./fsstress -d $MNT_1/tmp -n 10000 -p 100 > /dev/null 2>&1 &
PID=$!

for ((i=0; i<LOOP; i++))
do
        echo LOOP: $i
        fsfreeze -f $MNT_1
        fsfreeze -u $MNT_1
done

kill $PID
-----------------------------------------------------------

The messages I got when I ran the test program are below.
-------------------------------------------------------------
INFO: task flush-8:0:720 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:0       D 0000000100521461     0   720      2 0x00000000
 ffff8800b4c41a40 0000000000000046 0000000000000000 0000000000000000
 0000000000013440 ffff8800b4c41fd8 ffff8800b4c40010 0000000000013440
 ffff8800b4c41fd8 0000000000013440 ffffffff81a0d020 ffff8800b464d4e0
Call Trace:
 [<ffffffff81086b4e>] ? prepare_to_wait+0x5e/0x90
 [<ffffffff814ee3ff>] schedule+0x3f/0x60
 [<ffffffffa041e485>] ext4_journal_start_sb+0x145/0x1b0 [ext4]
 [<ffffffff81086820>] ? wake_up_bit+0x40/0x40
 [<ffffffffa0401bc5>] ? ext4_meta_trans_blocks+0xb5/0xc0 [ext4]
 [<ffffffffa0406c9d>] ext4_da_writepages+0x29d/0x620 [ext4]
 [<ffffffff81227a18>] ? blk_finish_plug+0x18/0x50
 [<ffffffff81112bb1>] do_writepages+0x21/0x40
 [<ffffffff8118e380>] writeback_single_inode+0x180/0x3b0
 [<ffffffff8118e971>] writeback_sb_inodes+0x1a1/0x260
 [<ffffffff8118ec6e>] wb_writeback+0xde/0x2b0
 [<ffffffff810739c6>] ? try_to_del_timer_sync+0x86/0xe0
 [<ffffffff8118eee6>] wb_do_writeback+0xa6/0x260
 [<ffffffff81072ef0>] ? lock_timer_base+0x70/0x70
 [<ffffffff8118f14a>] bdi_writeback_thread+0xaa/0x270
 [<ffffffff8118f0a0>] ? wb_do_writeback+0x260/0x260
 [<ffffffff8118f0a0>] ? wb_do_writeback+0x260/0x260
 [<ffffffff810861a6>] kthread+0x96/0xa0
 [<ffffffff814fa5b4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81086110>] ? kthread_worker_fn+0x1a0/0x1a0
 [<ffffffff814fa5b0>] ? gs_change+0x13/0x13

INFO: task fsstress:4376 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress        D ffff88009b52dda8     0  4376   4364 0x00000080
 ffff88009b52dcb8 0000000000000082 ffffffff810d7e09 ffff88009b52dcc0
 0000000000013440 ffff88009b52dfd8 ffff88009b52c010 0000000000013440
 ffff88009b52dfd8 0000000000013440 ffff88009b4d54e0 ffff8800a1481560
Call Trace:
 [<ffffffff810d7e09>] ? trace_clock_local+0x9/0x10
 [<ffffffff814ee3ff>] schedule+0x3f/0x60
 [<ffffffff814ee89d>] schedule_timeout+0x1fd/0x2e0
 [<ffffffff810e5e43>] ? trace_nowake_buffer_unlock_commit+0x43/0x60
 [<ffffffff810127e4>] ? __switch_to+0x194/0x320
 [<ffffffff8104d623>] ? ftrace_raw_event_sched_switch+0x103/0x110
 [<ffffffff814ee26d>] wait_for_common+0x11d/0x190
 [<ffffffff8105a970>] ? try_to_wake_up+0x2b0/0x2b0
 [<ffffffff814ee3bd>] wait_for_completion+0x1d/0x20
 [<ffffffff8118daef>] writeback_inodes_sb_nr+0x7f/0xa0
 [<ffffffff8118dbdf>] writeback_inodes_sb+0x5f/0x80
 [<ffffffff811938d0>] ? __sync_filesystem+0x90/0x90
 [<ffffffff8119388e>] __sync_filesystem+0x4e/0x90
 [<ffffffff811938ef>] sync_one_sb+0x1f/0x30
 [<ffffffff811695da>] iterate_supers+0x7a/0xd0
 [<ffffffff81193934>] sys_sync+0x34/0x70
 [<ffffffff814f8442>] system_call_fastpath+0x16/0x1b
-------------------------------------------------------------
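
As a rough aid for reading logs like the one above, the reported tasks
could be listed with a small filter. This is only a sketch: the function
name list_hung_tasks is made up here, and it assumes the
"INFO: task NAME:PID blocked for more than N seconds." format shown above.

```shell
#!/bin/bash
# list_hung_tasks (hypothetical helper): read kernel log text on stdin and
# print "NAME PID" for each task reported by the hung-task detector.
# Assumes the message format seen in the log above; a syslog prefix such as
# "Feb  7 15:05:09 host kernel: " in front of the message is tolerated.
list_hung_tasks() {
    # The task name itself may contain ':' (e.g. "flush-8:0"), so the greedy
    # first group takes everything up to the last ":<digits> blocked".
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than [0-9][0-9]* seconds\..*/\1 \2/p'
}
```

For example, it could be fed from the kernel ring buffer with
"dmesg | list_hung_tasks" after the test program has hung.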

The test program for xfstests is below.
-------------------------------------------------------------
#! /bin/bash
# FSQA Test No. 277
#
# Run fsstress and  freeze/unfreeze in parallel
#
#-----------------------------------------------------------------------
# Copyright (c) 2006 Silicon Graphics, Inc.  All Rights Reserved.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
#
#-----------------------------------------------------------------------
#
# creator
owner=m.mizuma@jp.fujitsu.com

seq=`basename $0`
echo "QA output created by $seq"

here=`pwd`
tmp=/tmp/$$
status=0        # success is the default!
trap "rm -f $tmp.*; exit \$status" 0 1 2 3 15

# get standard environment, filters and checks
. ./common.rc
. ./common.filter

_workout()
{
	echo ""
	echo "Run fsstress"
	echo ""
	num_iterations=500
	out=$SCRATCH_MNT/fsstress.$$
	args="-p100 -n10000 -d $out"
	echo "fsstress $args" >> $here/$seq.full
	$FSSTRESS_PROG $args > /dev/null 2>&1 &
	pid=$!
	echo "Run xfs_freeze in parallel"
	for ((i=0; i < num_iterations; i++))
	do
		xfs_freeze -f $SCRATCH_MNT | tee -a $seq.full
		xfs_freeze -u $SCRATCH_MNT | tee -a $seq.full
	done
	kill $pid 2> /dev/null
	wait $pid
}

# real QA test starts here
_supported_fs generic
_supported_os Linux
_need_to_be_root
_require_scratch

_scratch_mkfs >> $seq.full 2>&1
_scratch_mount

if ! _workout; then
	umount $SCRATCH_DEV 2>/dev/null
	exit
fi

if ! _scratch_unmount; then
	echo "failed to umount"
	status=1
	exit
fi
_check_scratch_fs
status=$?
exit
-------------------------------------------------------------

Thanks,
Masayoshi Mizuma

> 
> ---------------------------------------------------------------------
> Feb  7 15:05:09 RX300S6 kernel: INFO: task fsfreeze:2104 blocked for more than 120 seconds.
> Feb  7 15:05:09 RX300S6 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Feb  7 15:05:09 RX300S6 kernel: fsfreeze        D ffff880076d5f040     0  2104   2018 0x00000000
> Feb  7 15:05:09 RX300S6 kernel: ffff88005a9f3d98 0000000000000086 ffff88005a9f3d38 ffffffff00000000
> Feb  7 15:05:09 RX300S6 kernel: 0000000000014d40 ffff880076d5eab0 ffff880076d5f040 ffff88005a9f3fd8
> Feb  7 15:05:09 RX300S6 kernel: ffff880076d5f048 0000000000014d40 ffff88005a9f2010 0000000000014d40
> Feb  7 15:05:09 RX300S6 kernel: Call Trace:
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff814aa5f5>] rwsem_down_failed_common+0xb5/0x140
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff814aa693>] rwsem_down_write_failed+0x13/0x20
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff8122f1a3>] call_rwsem_down_write_failed+0x13/0x20
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff814a9c12>] ? down_write+0x32/0x40
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff81155b48>] thaw_super+0x28/0xd0
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff81164338>] do_vfs_ioctl+0x368/0x560
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff81157c73>] ? sys_newfstat+0x33/0x40
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff811645d1>] sys_ioctl+0xa1/0xb0
> Feb  7 15:05:09 RX300S6 kernel: [<ffffffff8100bf82>] system_call_fastpath+0x16/0x1b
> ...
> Feb  7 15:07:09 RX300S6 kernel: INFO: task flush-8:0:1409 blocked for more than 120 seconds.
> Feb  7 15:07:09 RX300S6 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Feb  7 15:07:09 RX300S6 kernel: flush-8:0       D ffff880037777a30     0  1409      2 0x00000000
> Feb  7 15:07:09 RX300S6 kernel: ffff880037c95a80 0000000000000046 ffff88007c8037a0 0000000000000000
> Feb  7 15:07:09 RX300S6 kernel: 0000000000014d40 ffff8800377774a0 ffff880037777a30 ffff880037c95fd8
> Feb  7 15:07:09 RX300S6 kernel: ffff880037777a38 0000000000014d40 ffff880037c94010 0000000000014d40
> Feb  7 15:07:09 RX300S6 kernel: Call Trace:
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffffa00abb85>] ext4_journal_start_sb+0x75/0x130 [ext4]
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81082fc0>] ? autoremove_wake_function+0x0/0x40
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffffa0097f0a>] ext4_da_writepages+0x27a/0x640 [ext4]
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81102c91>] do_writepages+0x21/0x40
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff811776b8>] writeback_single_inode+0x98/0x240
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81177cfe>] writeback_sb_inodes+0xce/0x170
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81178709>] writeback_inodes_wb+0x99/0x160
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81178a8b>] wb_writeback+0x2bb/0x430
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81178e2c>] wb_do_writeback+0x22c/0x280
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81178f32>] bdi_writeback_thread+0xb2/0x260
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81178e80>] ? bdi_writeback_thread+0x0/0x260
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81178e80>] ? bdi_writeback_thread+0x0/0x260
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff81082936>] kthread+0x96/0xa0
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff8100cdc4>] kernel_thread_helper+0x4/0x10
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff810828a0>] ? kthread+0x0/0xa0
> Feb  7 15:07:09 RX300S6 kernel: [<ffffffff8100cdc0>] ? kernel_thread_helper+0x0/0x10
> ---------------------------------------------------------------------
> 
> I think the following deadlock problem happened:
> 
>               [flush-8:0:1409]              |          [fsfreeze:2104]
> --------------------------------------------+--------------------------------
> writeback_inodes_wb                         |
>  pin_sb_for_writeback                       |
>    down_read_trylock(&sb->s_umount)         |
>  writeback_sb_inodes                        |thaw_super
>    writeback_single_inode                   | down_write(&sb->s_umount)
>      do_writepages                          |  # stop until flush-8:0 releases
>       ext4_da_writepages                    |  # read lock of sb->s_umount...
>        ext4_journal_start_sb                |
>         vfs_check_frozen                    |
>           wait_event((sb)->s_wait_unfrozen, |
>            ((sb)->s_frozen < (level)))      |
>             # stop until being woken up by  |
>             # fsfreeze...                   |
> --------------------------------------------+--------------------------------
> 
> Could anyone check this problem?
> 
> Thanks,
> Masayoshi Mizuma



