All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ext4/023: add regression test for ext4lazyinit_task deadlock V2
@ 2016-08-15 12:22 Dmitry Monakhov
  2016-08-17 10:14 ` Eryu Guan
  0 siblings, 1 reply; 2+ messages in thread
From: Dmitry Monakhov @ 2016-08-15 12:22 UTC (permalink / raw)
  To: fstests; +Cc: eguan, Dmitry Monakhov

global ext4lazyinit task may deadlock on frozen under li_list_mtx lock.
Once that happen all mount/umount tasks also blocked.
Patch proposed: http://marc.info/?l=linux-ext4&m=146893382329138

changes since V1:
 - Fix coding style according to eguan's comments
 - Add deadlock timout, so test failure will not block other tests.


Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 tests/ext4/023     | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/023.out |  2 ++
 tests/ext4/group   |  1 +
 3 files changed, 98 insertions(+)
 create mode 100755 tests/ext4/023
 create mode 100644 tests/ext4/023.out

diff --git a/tests/ext4/023 b/tests/ext4/023
new file mode 100755
index 0000000..0f60490
--- /dev/null
+++ b/tests/ext4/023
@@ -0,0 +1,95 @@
+#! /bin/bash
+# FS QA Test No. ext4/023
+#
+# Regression test for deadlock in ext4lazyinit thread
+# Patch proposed: http://marc.info/?l=linux-ext4&m=146893382329138
+# TODO: Add commit ID
+#
+#
+#-----------------------------------------------------------------------
+# (c) Dmitry Monakhov
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#-----------------------------------------------------------------------
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+status=1
+_cleanup()
+{
+    # Failure cleanup
+    if [ $status -ne 0 ]; then
+	xfs_freeze -u $loop_mnt/1
+	$UMOUNT_PROG $loop_mnt/2
+    fi
+    
+    # Normal cleanup
+    $UMOUNT_PROG $loop_mnt/1
+    _destroy_loop_device $loop_dev1
+    _destroy_loop_device $loop_dev2
+    rm -rf $img_dir $loop_mnt
+    _scratch_unmount
+    cd /
+    #rm -f $tmp.*
+
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs ext4
+_supported_os Linux
+_require_scratch
+_require_loop
+_require_freeze
+_require_xfs_io_command "falloc"
+
+rm -f $seqres.full
+echo "Silence is golden"
+
+img_dir=$SCRATCH_MNT/$seq.fs
+loop_mnt=$SCRATCH_MNT/$seq.mnt
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount
+mkdir -p $img_dir
+mkdir -p $loop_mnt/{1,2}
+
+$XFS_IO_PROG -fc "truncate 1G" $img_dir/1.img
+$XFS_IO_PROG -fc "truncate 1G" $img_dir/2.img
+loop_dev1=$(_create_loop_device $img_dir/1.img)
+loop_dev2=$(_create_loop_device $img_dir/2.img)
+
+# Use nodiscard mode in order to force lazy init mode
+$MKFS_EXT4_PROG -F -E nodiscard,lazy_itable_init=1 $loop_dev1 \
+    >> $seqres.full 2>&1
+$MKFS_EXT4_PROG -F -E nodiscard,lazy_itable_init=1 $loop_dev2 \
+    >> $seqres.full 2>&1
+
+# Freeze first fs and check that we may mount/umount second one.
+# If bug is not fixed mount/umount may deadlock
+_mount $loop_dev1 $loop_mnt/1
+xfs_freeze -f $loop_mnt/1
+for i in $(seq 0 5); do
+    timeout -s SIGKILL 10 $MOUNT_PROG $loop_dev2 $loop_mnt/2 || \
+	_fail "mount: deadlock detected. Fail"
+    sleep 1
+    timeout -s SIGKILL 10 $UMOUNT_PROG $loop_mnt/2 || \
+	_fail "umount: deadlock detected. Fail"
+    sleep 1
+done
+xfs_freeze -u $loop_mnt/1
+status=0
+exit
diff --git a/tests/ext4/023.out b/tests/ext4/023.out
new file mode 100644
index 0000000..5c4197b
--- /dev/null
+++ b/tests/ext4/023.out
@@ -0,0 +1,2 @@
+QA output created by 023
+Silence is golden
diff --git a/tests/ext4/group b/tests/ext4/group
index e0772ab..6ccc5e8 100644
--- a/tests/ext4/group
+++ b/tests/ext4/group
@@ -25,6 +25,7 @@
 020 auto quick ioctl rw
 021 auto quick
 022 auto quick attr dangerous
+023 auto quick
 271 auto rw quick
 301 aio auto ioctl rw stress
 302 aio auto ioctl rw stress
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] ext4/023: add regression test for ext4lazyinit_task deadlock V2
  2016-08-15 12:22 [PATCH] ext4/023: add regression test for ext4lazyinit_task deadlock V2 Dmitry Monakhov
@ 2016-08-17 10:14 ` Eryu Guan
  0 siblings, 0 replies; 2+ messages in thread
From: Eryu Guan @ 2016-08-17 10:14 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: fstests

On Mon, Aug 15, 2016 at 04:22:07PM +0400, Dmitry Monakhov wrote:
> global ext4lazyinit task may deadlock on frozen under li_list_mtx lock.
> Once that happen all mount/umount tasks also blocked.
> Patch proposed: http://marc.info/?l=linux-ext4&m=146893382329138
> 
> changes since V1:
>  - Fix coding style according to eguan's comments
>  - Add deadlock timout, so test failure will not block other tests.
> 
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
>  tests/ext4/023     | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/ext4/023.out |  2 ++
>  tests/ext4/group   |  1 +
>  3 files changed, 98 insertions(+)
>  create mode 100755 tests/ext4/023
>  create mode 100644 tests/ext4/023.out
> 
> diff --git a/tests/ext4/023 b/tests/ext4/023
> new file mode 100755
> index 0000000..0f60490
> --- /dev/null
> +++ b/tests/ext4/023
> @@ -0,0 +1,95 @@
> +#! /bin/bash
> +# FS QA Test No. ext4/023
> +#
> +# Regression test for deadlock in ext4lazyinit thread
> +# Patch proposed: http://marc.info/?l=linux-ext4&m=146893382329138
> +# TODO: Add commit ID
> +#
> +#
> +#-----------------------------------------------------------------------
> +# (c) Dmitry Monakhov
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#-----------------------------------------------------------------------
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +status=1
> +_cleanup()
> +{
> +    # Failure cleanup
> +    if [ $status -ne 0 ]; then
> +	xfs_freeze -u $loop_mnt/1
> +	$UMOUNT_PROG $loop_mnt/2

I hit $loop_mnt/2 not properly unmounted in failure case, I think that's
because test hit "mount: deadlock detected. Fail" and $loop_mnt/2 was
not mounted yet right after unfreezing $loop_mnt/1 in cleanup.

[root@dhcp-66-86-11 xfstests]# diff -u tests/ext4/023.out /root/workspace/xfstests/results//ext4_4k/ext4/023.out.bad
--- tests/ext4/023.out  2016-08-17 17:08:53.702000000 +0800
+++ /root/workspace/xfstests/results//ext4_4k/ext4/023.out.bad  2016-08-17 18:05:30.164000000 +0800
@@ -1,2 +1,10 @@
 QA output created by 023
 Silence is golden
+./tests/ext4/023: line 86: 10983 Killed                  timeout -s SIGKILL 10 $MOUNT_PROG $loop_dev2 $loop_mnt/2
+mount: deadlock detected. Fail
+(see /root/workspace/xfstests/results//ext4_4k/ext4/023.full for details)
+umount: /mnt/testarea/scratch/023.mnt/2: not mounted
+rm: cannot remove '/mnt/testarea/scratch/023.mnt/2': Device or resource busy
+umount: /mnt/testarea/scratch: target is busy.
+        (In some cases useful info about processes that use
+         the device is found by lsof(8) or fuser(1))

And this failure prevents subsequent tests from running, because
$SCRATCH_DEV is busy.

Adding "sleep 1" before "$UMOUNT_PROG $loop_mnt/2" works for me, but
that's more like a workaround.

> +    fi
> +    
> +    # Normal cleanup
> +    $UMOUNT_PROG $loop_mnt/1
> +    _destroy_loop_device $loop_dev1
> +    _destroy_loop_device $loop_dev2
> +    rm -rf $img_dir $loop_mnt
> +    _scratch_unmount
> +    cd /
> +    #rm -f $tmp.*

This line should be uncommented.

And tab and space are mixed, please use tab for indention.

> +
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# real QA test starts here
> +_supported_fs ext4
> +_supported_os Linux
> +_require_scratch
> +_require_loop
> +_require_freeze
> +_require_xfs_io_command "falloc"
> +
> +rm -f $seqres.full
> +echo "Silence is golden"
> +
> +img_dir=$SCRATCH_MNT/$seq.fs
> +loop_mnt=$SCRATCH_MNT/$seq.mnt
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount
> +mkdir -p $img_dir
> +mkdir -p $loop_mnt/{1,2}
> +
> +$XFS_IO_PROG -fc "truncate 1G" $img_dir/1.img
> +$XFS_IO_PROG -fc "truncate 1G" $img_dir/2.img
> +loop_dev1=$(_create_loop_device $img_dir/1.img)
> +loop_dev2=$(_create_loop_device $img_dir/2.img)
> +
> +# Use nodiscard mode in order to force lazy init mode
> +$MKFS_EXT4_PROG -F -E nodiscard,lazy_itable_init=1 $loop_dev1 \
> +    >> $seqres.full 2>&1
> +$MKFS_EXT4_PROG -F -E nodiscard,lazy_itable_init=1 $loop_dev2 \
> +    >> $seqres.full 2>&1
> +
> +# Freeze first fs and check that we may mount/umount second one.
> +# If bug is not fixed mount/umount may deadlock
> +_mount $loop_dev1 $loop_mnt/1
> +xfs_freeze -f $loop_mnt/1
> +for i in $(seq 0 5); do
> +    timeout -s SIGKILL 10 $MOUNT_PROG $loop_dev2 $loop_mnt/2 || \

Need test the existence of command timeout, e.g.

in common/config:
export TIMEOUT_PROG="`set_prog_path timeout`"

and in test:
_require_command "$TIMEOUT_PROG" timeout

Thanks,
Eryu

> +	_fail "mount: deadlock detected. Fail"
> +    sleep 1
> +    timeout -s SIGKILL 10 $UMOUNT_PROG $loop_mnt/2 || \
> +	_fail "umount: deadlock detected. Fail"
> +    sleep 1
> +done
> +xfs_freeze -u $loop_mnt/1
> +status=0
> +exit
> diff --git a/tests/ext4/023.out b/tests/ext4/023.out
> new file mode 100644
> index 0000000..5c4197b
> --- /dev/null
> +++ b/tests/ext4/023.out
> @@ -0,0 +1,2 @@
> +QA output created by 023
> +Silence is golden
> diff --git a/tests/ext4/group b/tests/ext4/group
> index e0772ab..6ccc5e8 100644
> --- a/tests/ext4/group
> +++ b/tests/ext4/group
> @@ -25,6 +25,7 @@
>  020 auto quick ioctl rw
>  021 auto quick
>  022 auto quick attr dangerous
> +023 auto quick
>  271 auto rw quick
>  301 aio auto ioctl rw stress
>  302 aio auto ioctl rw stress
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2016-08-17 10:15 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-15 12:22 [PATCH] ext4/023: add regression test for ext4lazyinit_task deadlock V2 Dmitry Monakhov
2016-08-17 10:14 ` Eryu Guan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.