* [PATCH tests 0/5] tests: add some regression tests
@ 2023-05-23 13:38 Yu Kuai
  2023-05-23 13:38 ` [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10 Yu Kuai
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Yu Kuai @ 2023-05-23 13:38 UTC (permalink / raw)
  To: linux-raid, mariusz.tkaczyk, jes, pmenzel, logang, song
  Cc: yukuai3, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Yu Kuai (5):
  tests: add a new test to check if pluged bio is unlimited for raid10
  tests: add a new test for rdev lifetime
  tests: support to skip checking dmesg
  tests: add a regression test for raid10 deadlock
  tests: add a regression test for raid456 deadlock

 test                                |  8 ++-
 tests/22raid10plug                  | 41 ++++++++++++++
 tests/23rdev-lifetime               | 48 ++++++++++++++++
 tests/24raid10deadlock              | 85 +++++++++++++++++++++++++++++
 tests/24raid10deadlock.inject_error |  0
 tests/24raid456deadlock             | 56 +++++++++++++++++++
 6 files changed, 236 insertions(+), 2 deletions(-)
 create mode 100644 tests/22raid10plug
 create mode 100644 tests/23rdev-lifetime
 create mode 100644 tests/24raid10deadlock
 create mode 100644 tests/24raid10deadlock.inject_error
 create mode 100644 tests/24raid456deadlock

-- 
2.39.2



* [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10
  2023-05-23 13:38 [PATCH tests 0/5] tests: add some regression tests Yu Kuai
@ 2023-05-23 13:38 ` Yu Kuai
  2023-05-24  7:53   ` Mariusz Tkaczyk
  2023-05-23 13:38 ` [PATCH tests 2/5] tests: add a new test for rdev lifetime Yu Kuai
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Yu Kuai @ 2023-05-23 13:38 UTC (permalink / raw)
  To: linux-raid, mariusz.tkaczyk, jes, pmenzel, logang, song
  Cc: yukuai3, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

"Plugged bio is unlimited" means that all submitted bios are plugged, and
none of them are issued to the underlying disks until blk_finish_plug() or
blk_flush_plug(). In this case, a lot of memory is used for raid10_bio
structures and I/O latency becomes very bad.

This test triggers dirty-page writeback for raid10, where the plug is used,
and checks whether the device's inflight counter exceeds a threshold.

This problem is supposed to be fixed by [1].

[1] https://lore.kernel.org/linux-raid/20230420112946.2869956-9-yukuai1@huaweicloud.com/

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 tests/22raid10plug | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)
 create mode 100644 tests/22raid10plug

diff --git a/tests/22raid10plug b/tests/22raid10plug
new file mode 100644
index 00000000..fde4ce80
--- /dev/null
+++ b/tests/22raid10plug
@@ -0,0 +1,41 @@
+devs="$dev0 $dev1 $dev2 $dev3 $dev4 $dev5"
+
+# the test fails if inflight is ever observed to be greater than this
+threshold=4096
+
+# create a simple raid10
+mdadm --create --run --level=raid10 --raid-disks 6 $md0 $devs --bitmap=internal --assume-clean
+if [ $? -ne 0 ]; then
+        die "create raid10 failed"
+fi
+
+old_background=`cat /proc/sys/vm/dirty_background_ratio`
+old=`cat /proc/sys/vm/dirty_ratio`
+
+# trigger background writeback
+echo 0 > /proc/sys/vm/dirty_background_ratio
+echo 60 > /proc/sys/vm/dirty_ratio
+
+# io pressure with buffer write
+fio -filename=$md0 -ioengine=libaio -rw=write -bs=4k -numjobs=1 -iodepth=128 -name=test -runtime=10 &
+
+pid=$!
+
+sleep 2
+
+# check if inflight exceeds the threshold
+while true; do
+        tmp=`cat /sys/block/md0/inflight | awk '{printf("%d\n", $1 + $2);}'`
+        if [ $tmp -gt $threshold ]; then
+                die "inflight is greater than 4096"
+                break
+        elif [ $tmp -eq 0 ]; then
+                break
+        fi
+        sleep 0.1
+done
+
+kill -9 $pid
+mdadm -S $md0
+echo $old_background > /proc/sys/vm/dirty_background_ratio
+echo $old > /proc/sys/vm/dirty_ratio
-- 
2.39.2



* [PATCH tests 2/5] tests: add a new test for rdev lifetime
  2023-05-23 13:38 [PATCH tests 0/5] tests: add some regression tests Yu Kuai
  2023-05-23 13:38 ` [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10 Yu Kuai
@ 2023-05-23 13:38 ` Yu Kuai
  2023-05-24  8:33   ` Mariusz Tkaczyk
  2023-05-23 13:38 ` [PATCH tests 3/5] tests: support to skip checking dmesg Yu Kuai
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Yu Kuai @ 2023-05-23 13:38 UTC (permalink / raw)
  To: linux-raid, mariusz.tkaczyk, jes, pmenzel, logang, song
  Cc: yukuai3, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

This test adds and removes an underlying disk from the array concurrently,
verifying that the following problem is fixed:

run mdadm test 23rdev-lifetime at Fri Apr 28 03:25:30 UTC 2023
md: could not open device unknown-block(1,0).
sysfs: cannot create duplicate filename '/devices/virtual/block/md0/md/dev-ram0'
CPU: 26 PID: 10521 Comm: test Not tainted 6.3.0-rc2-00134-g7b3a8828043c #115
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe7/0x180
 dump_stack+0x18/0x30
 sysfs_warn_dup+0xa2/0xd0
 sysfs_create_dir_ns+0x119/0x140
 kobject_add_internal+0x143/0x4d0
 kobject_add_varg+0x35/0x70
 kobject_add+0x64/0xd0
 bind_rdev_to_array+0x254/0x840 [md_mod]
 new_dev_store+0x14d/0x350 [md_mod]
 md_attr_store+0xc1/0x1a0 [md_mod]
 sysfs_kf_write+0x51/0x70
 kernfs_fop_write_iter+0x188/0x270
 vfs_write+0x27e/0x460
 ksys_write+0x85/0x180
 __x64_sys_write+0x21/0x30
 do_syscall_64+0x6c/0xe0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f26bacf5387
Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 84
RSP: 002b:00007ffe98d79e68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f26bacf5387
RDX: 0000000000000004 RSI: 000055bd10282bf0 RDI: 0000000000000001
RBP: 000055bd10282bf0 R08: 000000000000000a R09: 00007f26bad8b4e0
R10: 00007f26bad8b3e0 R11: 0000000000000246 R12: 0000000000000004
R13: 00007f26badc8520 R14: 0000000000000004 R15: 00007f26badc8700
 </TASK>

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 tests/23rdev-lifetime | 48 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 tests/23rdev-lifetime

diff --git a/tests/23rdev-lifetime b/tests/23rdev-lifetime
new file mode 100644
index 00000000..df0fcdd9
--- /dev/null
+++ b/tests/23rdev-lifetime
@@ -0,0 +1,48 @@
+devname=${dev0##*/}
+devt=`cat /sys/block/$devname/dev`
+dmesg_marker="run mdadm test 23rdev-lifetime at `date`"
+pid1=0
+pid2=0
+
+stop() {
+        if [[ $1 -ne 0 ]]; then
+                kill -9 $1
+        fi
+}
+
+trap 'stop $pid1; stop $pid2;' EXIT
+
+echo "$dmesg_marker" >> /dev/kmsg
+
+check_dmesg() {
+        dmesg | grep -A 9999 "$dmesg_marker" | grep "sysfs: cannot create duplicate filename"
+
+        if [[ $? -eq 0 ]]; then
+                die "sysfs duplicate"
+        fi
+}
+
+add_by_sysfs() {
+        while true; do
+                echo $devt > /sys/block/md0/md/new_dev
+                check_dmesg
+        done
+}
+
+remove_by_sysfs() {
+        while true; do
+                echo remove > /sys/block/md0/md/dev-${devname}/state
+                check_dmesg
+        done
+}
+
+echo md0 > /sys/module/md_mod/parameters/new_array || die "create md0 failed"
+
+add_by_sysfs &
+pid1=$!
+
+remove_by_sysfs &
+pid2=$!
+
+sleep 5
+echo clear > /sys/block/md0/md/array_state
-- 
2.39.2



* [PATCH tests 3/5] tests: support to skip checking dmesg
  2023-05-23 13:38 [PATCH tests 0/5] tests: add some regression tests Yu Kuai
  2023-05-23 13:38 ` [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10 Yu Kuai
  2023-05-23 13:38 ` [PATCH tests 2/5] tests: add a new test for rdev lifetime Yu Kuai
@ 2023-05-23 13:38 ` Yu Kuai
  2023-05-24  8:40   ` Mariusz Tkaczyk
  2023-05-23 13:38 ` [PATCH tests 4/5] tests: add a regression test for raid10 deadlock Yu Kuai
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Yu Kuai @ 2023-05-23 13:38 UTC (permalink / raw)
  To: linux-raid, mariusz.tkaczyk, jes, pmenzel, logang, song
  Cc: yukuai3, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Prepare for adding a regression test for raid10 that requires error
injection to trigger the error path. The kernel will complain about I/O
errors, and checking dmesg for error logs would make it impossible to pass
this test.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 test | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/test b/test
index 61d9ee83..b244453b 100755
--- a/test
+++ b/test
@@ -107,8 +107,12 @@ do_test() {
 		echo -ne "$_script... "
 		if ( set -ex ; . $_script ) &> $targetdir/log
 		then
-			dmesg | grep -iq "error\|call trace\|segfault" &&
-				die "dmesg prints errors when testing $_basename!"
+			if [ -f "${_script}.inject_error" ]; then
+				echo "dmesg checking is skipped because the test injects errors"
+			else
+				dmesg | grep -iq "error\|call trace\|segfault" &&
+					die "dmesg prints errors when testing $_basename!"
+			fi
 			echo "succeeded"
 			_fail=0
 		else
-- 
2.39.2



* [PATCH tests 4/5] tests: add a regression test for raid10 deadlock
  2023-05-23 13:38 [PATCH tests 0/5] tests: add some regression tests Yu Kuai
                   ` (2 preceding siblings ...)
  2023-05-23 13:38 ` [PATCH tests 3/5] tests: support to skip checking dmesg Yu Kuai
@ 2023-05-23 13:38 ` Yu Kuai
  2023-05-24  8:48   ` Mariusz Tkaczyk
  2023-05-23 13:39 ` [PATCH tests 5/5] tests: add a regression test for raid456 deadlock Yu Kuai
  2023-05-24  7:56 ` [PATCH tests 0/5] tests: add some regression tests Paul Menzel
  5 siblings, 1 reply; 19+ messages in thread
From: Yu Kuai @ 2023-05-23 13:38 UTC (permalink / raw)
  To: linux-raid, mariusz.tkaczyk, jes, pmenzel, logang, song
  Cc: yukuai3, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

The deadlock is described in [1]. It was first fixed by [2]; however, it
turns out that commit triggers other problems [3], so it will be reverted
and the deadlock is supposed to be fixed by [1].

[1] https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/
[2] https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/
[3] https://lore.kernel.org/linux-raid/20230322064122.2384589-2-yukuai1@huaweicloud.com/

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 tests/24raid10deadlock              | 85 +++++++++++++++++++++++++++++
 tests/24raid10deadlock.inject_error |  0
 2 files changed, 85 insertions(+)
 create mode 100644 tests/24raid10deadlock
 create mode 100644 tests/24raid10deadlock.inject_error

diff --git a/tests/24raid10deadlock b/tests/24raid10deadlock
new file mode 100644
index 00000000..27869840
--- /dev/null
+++ b/tests/24raid10deadlock
@@ -0,0 +1,85 @@
+devs="$dev0 $dev1 $dev2 $dev3"
+runtime=120
+pid=""
+
+set_up_injection()
+{
+	echo -1 > /sys/kernel/debug/fail_make_request/times
+	echo 1 > /sys/kernel/debug/fail_make_request/probability
+	echo 0 > /sys/kernel/debug/fail_make_request/verbose
+	echo 1 > /sys/block/${1##*/}/make-it-fail
+}
+
+clean_up_injection()
+{
+	echo 0 > /sys/block/${1##*/}/make-it-fail
+	echo 0 > /sys/kernel/debug/fail_make_request/times
+	echo 0 > /sys/kernel/debug/fail_make_request/probability
+	echo 2 > /sys/kernel/debug/fail_make_request/verbose
+}
+
+test_rdev()
+{
+	while true; do
+		mdadm -f $md0 $1 &> /dev/null
+		mdadm -r $md0 $1 &> /dev/null
+		mdadm --zero-superblock $1 &> /dev/null
+		mdadm -a $md0 $1 &> /dev/null
+		sleep $2
+	done
+}
+
+test_write_action()
+{
+	while true; do
+		echo frozen > /sys/block/md0/md/sync_action
+		echo idle > /sys/block/md0/md/sync_action
+		sleep 0.1
+	done
+}
+
+set_up_test()
+{
+	fio -h &> /dev/null || die "fio not found"
+
+	# create a simple raid10
+	mdadm -Cv -R -n 4 -l10 $md0 $devs || die "create raid10 failed"
+}
+
+clean_up_test()
+{
+	clean_up_injection $dev0
+	kill -9 $pid
+	pkill -9 fio
+
+	sleep 1
+
+	if ! mdadm -S $md0; then
+		die "can't stop array, deadlock is probably triggered"
+	fi
+}
+
+cat /sys/kernel/debug/fail_make_request/times || die "fault injection is not enabled"
+
+trap 'clean_up_test' EXIT
+
+set_up_test || die "set up test failed"
+
+# background io pressure
+fio -filename=$md0 -rw=randwrite -direct=1 -name=test -bs=4k -numjobs=16 -iodepth=16 &
+
+# trigger add/remove device by io failure
+set_up_injection $dev0
+test_rdev $dev0 2 &
+pid="$pid $!"
+
+# add/remove device directly
+test_rdev $dev3 10 &
+pid="$pid $!"
+
+test_write_action &
+pid="$pid $!"
+
+sleep $runtime
+
+exit 0
diff --git a/tests/24raid10deadlock.inject_error b/tests/24raid10deadlock.inject_error
new file mode 100644
index 00000000..e69de29b
-- 
2.39.2



* [PATCH tests 5/5] tests: add a regression test for raid456 deadlock
  2023-05-23 13:38 [PATCH tests 0/5] tests: add some regression tests Yu Kuai
                   ` (3 preceding siblings ...)
  2023-05-23 13:38 ` [PATCH tests 4/5] tests: add a regression test for raid10 deadlock Yu Kuai
@ 2023-05-23 13:39 ` Yu Kuai
  2023-05-24  9:09   ` Mariusz Tkaczyk
  2023-05-24  7:56 ` [PATCH tests 0/5] tests: add some regression tests Paul Menzel
  5 siblings, 1 reply; 19+ messages in thread
From: Yu Kuai @ 2023-05-23 13:39 UTC (permalink / raw)
  To: linux-raid, mariusz.tkaczyk, jes, pmenzel, logang, song
  Cc: yukuai3, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

The deadlock is described in [1]. As the last patch explained, it was first
fixed by [2]; however, that fix will be reverted and the deadlock is
supposed to be fixed by [3].

[1] https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t
[2] https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/
[3] https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 tests/24raid456deadlock | 56 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)
 create mode 100644 tests/24raid456deadlock

diff --git a/tests/24raid456deadlock b/tests/24raid456deadlock
new file mode 100644
index 00000000..161c3ab8
--- /dev/null
+++ b/tests/24raid456deadlock
@@ -0,0 +1,56 @@
+devs="$dev0 $dev1 $dev2 $dev3 $dev4 $dev5"
+runtime=120
+pid=""
+old=`cat /proc/sys/vm/dirty_background_ratio`
+
+test_write_action()
+{
+	while true; do
+		echo check > /sys/block/md0/md/sync_action &> /dev/null
+		sleep 0.1
+		echo idle > /sys/block/md0/md/sync_action &> /dev/null
+	done
+}
+
+test_write_back()
+{
+	fio -filename=$md0 -bs=4k -rw=write -numjobs=1 -name=test \
+		-time_based -runtime=$runtime &> /dev/null
+}
+
+set_up_test()
+{
+	fio -h &> /dev/null || die "fio not found"
+
+	# create a simple raid6
+	mdadm -Cv -R -n 6 -l6 $md0 $devs --assume-clean || die "create raid6 failed"
+
+	# trigger dirty pages write back
+	echo 0 > /proc/sys/vm/dirty_background_ratio
+}
+
+clean_up_test()
+{
+	echo $old > /proc/sys/vm/dirty_background_ratio
+
+	kill -9 $pid
+	sync $md0
+
+	if ! mdadm -S $md0; then
+		die "can't stop array, deadlock is probably triggered"
+	fi
+}
+
+trap 'clean_up_test' EXIT
+
+set_up_test || die "set up test failed"
+
+test_write_back &
+pid="$pid $!"
+
+test_write_action &
+pid="$pid $!"
+
+sleep $runtime
+
+exit 0
-- 
2.39.2



* Re: [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10
  2023-05-23 13:38 ` [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10 Yu Kuai
@ 2023-05-24  7:53   ` Mariusz Tkaczyk
  2023-05-24  8:26     ` Yu Kuai
  0 siblings, 1 reply; 19+ messages in thread
From: Mariusz Tkaczyk @ 2023-05-24  7:53 UTC (permalink / raw)
  To: Yu Kuai; +Cc: linux-raid, jes, pmenzel, logang, song, yukuai3, yangerkun

On Tue, 23 May 2023 21:38:56 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> From: Yu Kuai <yukuai3@huawei.com>
> 
> "Plugged bio is unlimited" means that all submitted bios are plugged, and
> none of them are issued to the underlying disks until blk_finish_plug() or
> blk_flush_plug(). In this case, a lot of memory is used for raid10_bio
> structures and I/O latency becomes very bad.
> 
> This test triggers dirty-page writeback for raid10, where the plug is used,
> and checks whether the device's inflight counter exceeds a threshold.
> 
> This problem is supposed to be fixed by [1].

The test here is for md; mdadm has nothing to do here. I'm not against it,
but please extract it to a separate directory, something like "md_tests".
We need to start grouping tests.
> 
> [1]
> https://lore.kernel.org/linux-raid/20230420112946.2869956-9-yukuai1@huaweicloud.com/
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  tests/22raid10plug | 41 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>  create mode 100644 tests/22raid10plug
> 
> diff --git a/tests/22raid10plug b/tests/22raid10plug
> new file mode 100644
> index 00000000..fde4ce80
> --- /dev/null
> +++ b/tests/22raid10plug
> @@ -0,0 +1,41 @@
> +devs="$dev0 $dev1 $dev2 $dev3 $dev4 $dev5"
> +
> +# the test fails if inflight is ever observed to be greater than this
> +threshold=4096
> +
> +# create a simple raid10
> +mdadm --create --run --level=raid10 --raid-disks 6 $md0 $devs --bitmap=internal --assume-clean
You don't need 6 drives; 4 are enough (unless I'm missing something).
 
> +if [ $? -ne 0 ]; then
> +        die "create raid10 failed"
> +fi
> +
> +old_background=`cat /proc/sys/vm/dirty_background_ratio`
> +old=`cat /proc/sys/vm/dirty_ratio`
> +
> +# trigger background writeback
> +echo 0 > /proc/sys/vm/dirty_background_ratio
> +echo 60 > /proc/sys/vm/dirty_ratio
> +
> +# io pressure with buffer write
> +fio -filename=$md0 -ioengine=libaio -rw=write -bs=4k -numjobs=1 -iodepth=128 -name=test -runtime=10 &
> +
> +pid=$!
> +
> +sleep 2
> +
> +# check if inflight exceeds the threshold
> +while true; do
> +        tmp=`cat /sys/block/md0/inflight | awk '{printf("%d\n", $1 + $2);}'`
> +        if [ $tmp -gt $threshold ]; then
> +                die "inflight is greater than 4096"

The message here is not meaningful; what is 4096? Please add a comment
describing why a value above 4096 causes an error. We need to understand how
future changes in md may affect this setting (I think there is a correlation
between this value and MAX_PLUG_BIO).
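
Something along these lines would be enough (just a sketch; the exact margin
and its relation to MAX_PLUG_BIO are for you to confirm):

# A single task can plug at most MAX_PLUG_BIO bios, but several tasks may
# submit io concurrently, so leave a generous safety margin; if inflight
# ever exceeds this, plugged bio must be unlimited.
threshold=4096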

> +                break

The break is dead code, because die has `exit` inside.

> +        elif [ $tmp -eq 0 ]; then
> +                break
> +        fi

I would prefer to make the verification independent of the user's
environment and of the md device's inflight state. Simply put, we should
rely on fio: while fio runs in the background, check that inflight does not
exceed the expected value, and finish when fio ends. You set the runtime to
10; please consider whether we can make this shorter.
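
A minimal sketch of what I have in mind (assuming $pid still holds the pid
of the background fio process, and die() comes from the test harness):

# poll only while the background fio process is still alive
while kill -0 $pid 2>/dev/null; do
        tmp=`cat /sys/block/md0/inflight | awk '{print $1 + $2}'`
        if [ $tmp -gt $threshold ]; then
                die "inflight $tmp exceeds threshold $threshold"
        fi
        sleep 0.1
done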

Thanks,
Mariusz

> +        sleep 0.1
> +done
> +
> +kill -9 $pid
> +mdadm -S $md0
> +echo $old_background > /proc/sys/vm/dirty_background_ratio
> +echo $old > /proc/sys/vm/dirty_ratio



* Re: [PATCH tests 0/5] tests: add some regression tests
  2023-05-23 13:38 [PATCH tests 0/5] tests: add some regression tests Yu Kuai
                   ` (4 preceding siblings ...)
  2023-05-23 13:39 ` [PATCH tests 5/5] tests: add a regression test for raid456 deadlock Yu Kuai
@ 2023-05-24  7:56 ` Paul Menzel
  2023-05-24  9:11   ` Yu Kuai
  5 siblings, 1 reply; 19+ messages in thread
From: Paul Menzel @ 2023-05-24  7:56 UTC (permalink / raw)
  To: Yu Kuai
  Cc: yukuai3, yangerkun, linux-raid, mariusz.tkaczyk, jes, logang, song

Dear Yu,


Thank you so very much for adding tests. For those unfamiliar with the
test framework, could you add a line on how to run the newly added tests?


Kind regards,

Paul


PS: Your participation in and contributions to the Linux kernel are very 
much appreciated. Great work, and good to have you.


* Re: [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10
  2023-05-24  7:53   ` Mariusz Tkaczyk
@ 2023-05-24  8:26     ` Yu Kuai
  2023-05-24  9:00       ` Mariusz Tkaczyk
  0 siblings, 1 reply; 19+ messages in thread
From: Yu Kuai @ 2023-05-24  8:26 UTC (permalink / raw)
  To: Mariusz Tkaczyk, Yu Kuai
  Cc: linux-raid, jes, pmenzel, logang, song, yangerkun, yukuai (C)

Hi,

On 2023/05/24 15:53, Mariusz Tkaczyk wrote:
> On Tue, 23 May 2023 21:38:56 +0800
> Yu Kuai <yukuai1@huaweicloud.com> wrote:
> 
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> "Plugged bio is unlimited" means that all submitted bios are plugged, and
>> none of them are issued to the underlying disks until blk_finish_plug() or
>> blk_flush_plug(). In this case, a lot of memory is used for raid10_bio
>> structures and I/O latency becomes very bad.
>>
>> This test triggers dirty-page writeback for raid10, where the plug is used,
>> and checks whether the device's inflight counter exceeds a threshold.
>>
>> This problem is supposed to be fixed by [1].
> 
> The test here is for md; mdadm has nothing to do here. I'm not against it,
> but please extract it to a separate directory, something like "md_tests".
> We need to start grouping tests.

Sorry, I don't understand this. Currently the tests for md are here; I
don't see why a separate directory is needed. Is there a plan to move all
these tests into a new directory?

>>
>> [1]
>> https://lore.kernel.org/linux-raid/20230420112946.2869956-9-yukuai1@huaweicloud.com/
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   tests/22raid10plug | 41 +++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 41 insertions(+)
>>   create mode 100644 tests/22raid10plug
>>
>> diff --git a/tests/22raid10plug b/tests/22raid10plug
>> new file mode 100644
>> index 00000000..fde4ce80
>> --- /dev/null
>> +++ b/tests/22raid10plug
>> @@ -0,0 +1,41 @@
>> +devs="$dev0 $dev1 $dev2 $dev3 $dev4 $dev5"
>> +
>> +# the test fails if inflight is ever observed to be greater than this
>> +threshold=4096
>> +
>> +# create a simple raid10
>> +mdadm --create --run --level=raid10 --raid-disks 6 $md0 $devs --bitmap=internal --assume-clean
> You don't need 6 drives; 4 are enough (unless I'm missing something).

Yes, 4 is enough.

>   
>> +if [ $? -ne 0 ]; then
>> +        die "create raid10 failed"
>> +fi
>> +
>> +old_background=`cat /proc/sys/vm/dirty_background_ratio`
>> +old=`cat /proc/sys/vm/dirty_ratio`
>> +
>> +# trigger background writeback
>> +echo 0 > /proc/sys/vm/dirty_background_ratio
>> +echo 60 > /proc/sys/vm/dirty_ratio
>> +
>> +# io pressure with buffer write
>> +fio -filename=$md0 -ioengine=libaio -rw=write -bs=4k -numjobs=1 -iodepth=128 -name=test -runtime=10 &
>> +
>> +pid=$!
>> +
>> +sleep 2
>> +
>> +# check if inflight exceeds the threshold
>> +while true; do
>> +        tmp=`cat /sys/block/md0/inflight | awk '{printf("%d\n", $1 + $2);}'`
>> +        if [ $tmp -gt $threshold ]; then
>> +                die "inflight is greater than 4096"
> 
> The message here is not meaningful; what is 4096? Please add a comment
> describing why a value above 4096 causes an error. We need to understand how
> future changes in md may affect this setting (I think there is a correlation
> between this value and MAX_PLUG_BIO).

MAX_PLUG_BIO is just the limit for one task, and I'm not sure there will
only be one task issuing I/O; that is why I chose the much larger value 4096.
> 
>> +                break
> 
> The break is dead code, because die has `exit` inside.

Yes, I found that while writing the other tests. I'll use trap to make sure
everything is cleaned up.
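
Roughly like this (a sketch of what I plan, using the variables the test
already defines):

clean_up() {
        kill -9 $pid &> /dev/null
        mdadm -S $md0
        echo $old_background > /proc/sys/vm/dirty_background_ratio
        echo $old > /proc/sys/vm/dirty_ratio
}
trap 'clean_up' EXIT

so the sysctl values are restored even if die() exits the test early.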

> 
>> +        elif [ $tmp -eq 0 ]; then
>> +                break
>> +        fi
> 
> I would prefer to make the verification independent of the user's
> environment and of the md device's inflight state. Simply put, we should
> rely on fio: while fio runs in the background, check that inflight does not
> exceed the expected value, and finish when fio ends. You set the runtime to
> 10; please consider whether we can make this shorter.

I'll change that to check inflight while fio is in the background. But I
don't think 10s can be shortened; in my VM it can take 1-5s to trigger the
problem, and I was worried that 10s might not be enough for someone...

Thanks,
Kuai

> 
> Thanks,
> Mariusz
> 
>> +        sleep 0.1
>> +done
>> +
>> +kill -9 $pid
>> +mdadm -S $md0
>> +echo $old_background > /proc/sys/vm/dirty_background_ratio
>> +echo $old > /proc/sys/vm/dirty_ratio
> 
> .
> 



* Re: [PATCH tests 2/5] tests: add a new test for rdev lifetime
  2023-05-23 13:38 ` [PATCH tests 2/5] tests: add a new test for rdev lifetime Yu Kuai
@ 2023-05-24  8:33   ` Mariusz Tkaczyk
  2023-05-24  9:05     ` Yu Kuai
  0 siblings, 1 reply; 19+ messages in thread
From: Mariusz Tkaczyk @ 2023-05-24  8:33 UTC (permalink / raw)
  To: Yu Kuai; +Cc: linux-raid, jes, pmenzel, logang, song, yukuai3, yangerkun

On Tue, 23 May 2023 21:38:57 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> From: Yu Kuai <yukuai3@huawei.com>
> 
> This test adds and removes an underlying disk from the array concurrently,
> verifying that the following problem is fixed:

As in the previous patch, feel free to move it into a separate directory.

This test is limited to the particular problem you resolved, because you
are verifying a specific error message in dmesg. It has no additional value,
because the probability that this issue will ever occur again in the same
shape is minimal.

IMO you should check how "remove" and "add" are handled: whether errors are
returned, whether there is no trace in dmesg, and whether processes are not
blocked in the kernel.
You can check for this error message as an additional step at the end of the
test, but not as a mandatory pass criterion.

In its current form it only tells us that a particular kernel doesn't have
your fix, and that is all. Because it is a race, it probably does not impact
real-life scenarios, which gives weak motivation to backport the fix (only
security reasons would matter).

I don't see that this particular scenario requires a test. You need to make
it more valuable for the future.
 
Thanks,
Mariusz


* Re: [PATCH tests 3/5] tests: support to skip checking dmesg
  2023-05-23 13:38 ` [PATCH tests 3/5] tests: support to skip checking dmesg Yu Kuai
@ 2023-05-24  8:40   ` Mariusz Tkaczyk
  0 siblings, 0 replies; 19+ messages in thread
From: Mariusz Tkaczyk @ 2023-05-24  8:40 UTC (permalink / raw)
  To: Yu Kuai; +Cc: linux-raid, jes, pmenzel, logang, song, yukuai3, yangerkun

On Tue, 23 May 2023 21:38:58 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> From: Yu Kuai <yukuai3@huawei.com>
> 
> Prepare for adding a regression test for raid10 that requires error
> injection to trigger the error path. The kernel will complain about I/O
> errors, and checking dmesg for error logs would make it impossible to pass
> this test.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  test | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/test b/test
> index 61d9ee83..b244453b 100755
> --- a/test
> +++ b/test
> @@ -107,8 +107,12 @@ do_test() {
>  		echo -ne "$_script... "
>  		if ( set -ex ; . $_script ) &> $targetdir/log
>  		then
> -			dmesg | grep -iq "error\|call trace\|segfault" &&
> -				die "dmesg prints errors when testing $_basename!"
> +			if [ -f "${_script}.inject_error" ]; then
> +				echo "dmesg checking is skipped because the test injects errors"

Following the convention, I would say that it is a "negative test", or just
"negative"/"md_negative", so errors are expected.

Why not just add some sort of "negative" marker to the test name?

Anyway:
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Thanks,
Mariusz


* Re: [PATCH tests 4/5] tests: add a regression test for raid10 deadlock
  2023-05-23 13:38 ` [PATCH tests 4/5] tests: add a regression test for raid10 deadlock Yu Kuai
@ 2023-05-24  8:48   ` Mariusz Tkaczyk
  2023-05-24  9:07     ` Yu Kuai
  0 siblings, 1 reply; 19+ messages in thread
From: Mariusz Tkaczyk @ 2023-05-24  8:48 UTC (permalink / raw)
  To: Yu Kuai; +Cc: linux-raid, jes, pmenzel, logang, song, yukuai3, yangerkun

On Tue, 23 May 2023 21:38:59 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> From: Yu Kuai <yukuai3@huawei.com>
> 
> The deadlock is described in [1]. It was first fixed by [2]; however, it
> turns out that commit triggers other problems [3], so it will be reverted
> and the deadlock is supposed to be fixed by [1].
> 
> [1]
> https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/
> [2]
> https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/
> [3]
> https://lore.kernel.org/linux-raid/20230322064122.2384589-2-yukuai1@huaweicloud.com/
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  tests/24raid10deadlock              | 85 +++++++++++++++++++++++++++++
>  tests/24raid10deadlock.inject_error |  0
>  2 files changed, 85 insertions(+)
>  create mode 100644 tests/24raid10deadlock
>  create mode 100644 tests/24raid10deadlock.inject_error
> 
> diff --git a/tests/24raid10deadlock b/tests/24raid10deadlock
> new file mode 100644
> index 00000000..27869840
> --- /dev/null
> +++ b/tests/24raid10deadlock
> @@ -0,0 +1,85 @@
> +devs="$dev0 $dev1 $dev2 $dev3"
> +runtime=120
> +pid=""
> +
> +set_up_injection()
> +{
> +	echo -1 > /sys/kernel/debug/fail_make_request/times
> +	echo 1 > /sys/kernel/debug/fail_make_request/probability
> +	echo 0 > /sys/kernel/debug/fail_make_request/verbose
> +	echo 1 > /sys/block/${1##*/}/make-it-fail
> +}
> +
> +clean_up_injection()
> +{
> +	echo 0 > /sys/block/${1##*/}/make-it-fail
> +	echo 0 > /sys/kernel/debug/fail_make_request/times
> +	echo 0 > /sys/kernel/debug/fail_make_request/probability
> +	echo 2 > /sys/kernel/debug/fail_make_request/verbose
> +}
> +
> +test_rdev()
> +{
> +	while true; do
> +		mdadm -f $md0 $1 &> /dev/null
> +		mdadm -r $md0 $1 &> /dev/null
> +		mdadm --zero-superblock $1 &> /dev/null
> +		mdadm -a $md0 $1 &> /dev/null
> +		sleep $2
> +	done
> +}
> +
> +test_write_action()
> +{
> +	while true; do
> +		echo frozen > /sys/block/md0/md/sync_action
> +		echo idle > /sys/block/md0/md/sync_action
> +		sleep 0.1
> +	done
> +}
> +
> +set_up_test()
> +{
> +	fio -h &> /dev/null || die "fio not found"
> +
> +	# create a simple raid10
> +	mdadm -Cv -R -n 4 -l10 $md0 $devs || die "create raid10 failed"
> +}
> +
> +clean_up_test()
> +{
> +	clean_up_injection $dev0
> +	kill -9 $pid
> +	pkill -9 fio
> +
> +	sleep 1
> +
> +	if ! mdadm -S $md0; then
> +		die "can't stop array, deadlock is probably triggered"
> +	fi

Stop may fail for different reasons, so I see it as too coarse a marker of
"deadlock". I know that --stop still fails when md is unable to clear sysfs
attrs in the expected time (or at least it was a problem a few years ago).
Is there a better way to check that? I would prefer a different, less
complicated action to exclude false positives.

In my IMSM environment I still see the md stop stress test failing
sporadically (1/100).

Thanks,
Mariusz


* Re: [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10
  2023-05-24  8:26     ` Yu Kuai
@ 2023-05-24  9:00       ` Mariusz Tkaczyk
  2023-05-24  9:09         ` Yu Kuai
  0 siblings, 1 reply; 19+ messages in thread
From: Mariusz Tkaczyk @ 2023-05-24  9:00 UTC (permalink / raw)
  To: Yu Kuai, song; +Cc: linux-raid, jes, pmenzel, logang, yangerkun, yukuai (C)

On Wed, 24 May 2023 16:26:45 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> Hi,
> 
> On 2023/05/24 15:53, Mariusz Tkaczyk wrote:
> > On Tue, 23 May 2023 21:38:56 +0800
> > Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >   
> >> From: Yu Kuai <yukuai3@huawei.com>
> >>
> >> "Plugged bio is unlimited" means that all submitted bios are plugged, and
> >> none of them are issued to the underlying disks until blk_finish_plug() or
> >> blk_flush_plug(). In this case, a lot of memory is used for raid10_bio
> >> structures and I/O latency becomes very bad.
> >>
> >> This test triggers dirty-page writeback for raid10, where the plug is used,
> >> and checks whether the device's inflight counter exceeds a threshold.
> >>
> >> This problem is supposed to be fixed by [1].  
> > 
> > The test here is for md; mdadm has nothing to do here. I'm not against it,
> > but please extract it to a separate directory, something like "md_tests".
> > We need to start grouping tests.
> 
> Sorry, I don't understand this. Currently the tests for md are here; I
> don't see why a separate directory is needed. Is there a plan to move all
> these tests into a new directory?

Yes, that is how it was in the past, and I would like to change it because
the tests directory is too big and we are still adding new tests.

Ideally, tests like these should stay with the kernel, because the repository
you are contributing to now is "mdadm" - a userspace application to manage md
devices, not the driver itself.
That is why I wanted to highlight that the placement is controversial, but
I'm fine with it. As you said, we did that in the past.

Song, what do you think? Where should we put tests like this in the future?


> >> +
> >> +# check if inflight exceeds the threshold
> >> +while true; do
> >> +        tmp=`cat /sys/block/md0/inflight | awk '{printf("%d\n", $1 + $2);}'`
> >> +        if [ $tmp -gt $threshold ]; then
> >> +                die "inflight is greater than 4096"  
> > 
> > The message here is not meaningful; what is 4096? Please add a comment
> > describing why a value above 4096 causes an error. We need to understand
> > how future changes in md may affect this setting (I think there is a
> > correlation between this value and MAX_PLUG_BIO).
> 
> MAX_PLUG_BIO is just the limit for one task, and I'm not sure there will
> only be one task issuing I/O; that is why I chose the much larger value 4096.

Please add that as a comment above the value.

Thanks,
Mariusz


* Re: [PATCH tests 2/5] tests: add a new test for rdev lifetime
  2023-05-24  8:33   ` Mariusz Tkaczyk
@ 2023-05-24  9:05     ` Yu Kuai
  2023-05-24 10:38       ` Mariusz Tkaczyk
  0 siblings, 1 reply; 19+ messages in thread
From: Yu Kuai @ 2023-05-24  9:05 UTC (permalink / raw)
  To: Mariusz Tkaczyk, Yu Kuai
  Cc: linux-raid, jes, pmenzel, logang, song, yangerkun, yukuai (C)

Hi,

On 2023/05/24 16:33, Mariusz Tkaczyk wrote:
> On Tue, 23 May 2023 21:38:57 +0800
> Yu Kuai <yukuai1@huaweicloud.com> wrote:
> 
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> This test adds and removes an underlying disk from the array concurrently,
>> verifying that the following problem is fixed:
> 
> As in the previous patch, feel free to move it into a separate directory.
> 
> This test is limited to the particular problem you resolved, because you
> are verifying a specific error message in dmesg. It has no additional value,
> because the probability that this issue will ever occur again in the same
> shape is minimal.
> 
> IMO you should check how "remove" and "add" are handled: whether errors are
> returned, whether there is no trace in dmesg, and whether processes are not
> blocked in the kernel.

It's a little hard to do that. The problem is that after removing a disk
from the array, adding it back might fail. But if I follow this order,
it'll be hard to trigger the race, simply depending on how quickly the
kernel finishes the queued work. So I just remove and add the disk
concurrently, and returned errors are not a concern as long as the kernel
doesn't WARN.

> You can check for this error message as an additional step at the end of the
> test, but not as a mandatory pass criterion.
> 
> In its current form it only tells us that a particular kernel doesn't have
> your fix, and that is all. Because it is a race, it probably does not impact
> real-life scenarios, which gives weak motivation to backport the fix (only
> security reasons would matter).
> 
> I don't see that this particular scenario requires a test. You need to make
> it more valuable for the future.

This is just a regression test for a kernel problem (as are all the tests
in this patchset) that was solved recently, and I wrote this test from the
kernel perspective, not the user's. I think this is the main difference,
because I'm not very familiar with how mdadm works internally; I just know
how to use it. (I still wonder why these kernel regression tests are not
landed in blktests.)

There are more regression tests that I'm planning to add; is this the wrong
place to add such tests?

Thanks,
Kuai
>   
> Thanks,
> Mariusz
> 
> .
> 



* Re: [PATCH tests 4/5] tests: add a regression test for raid10 deadlock
  2023-05-24  8:48   ` Mariusz Tkaczyk
@ 2023-05-24  9:07     ` Yu Kuai
  0 siblings, 0 replies; 19+ messages in thread
From: Yu Kuai @ 2023-05-24  9:07 UTC (permalink / raw)
  To: Mariusz Tkaczyk, Yu Kuai
  Cc: linux-raid, jes, pmenzel, logang, song, yangerkun, yukuai (C)

Hi,

On 2023/05/24 16:48, Mariusz Tkaczyk wrote:
> On Tue, 23 May 2023 21:38:59 +0800
> Yu Kuai <yukuai1@huaweicloud.com> wrote:
> 
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> The deadlock is described in [1]. It was first fixed by [2]; however, it
>> turns out that commit triggers other problems [3], so it will be reverted
>> and the deadlock is supposed to be fixed by [1].
>>
>> [1]
>> https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/
>> [2]
>> https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/
>> [3]
>> https://lore.kernel.org/linux-raid/20230322064122.2384589-2-yukuai1@huaweicloud.com/
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   tests/24raid10deadlock              | 85 +++++++++++++++++++++++++++++
>>   tests/24raid10deadlock.inject_error |  0
>>   2 files changed, 85 insertions(+)
>>   create mode 100644 tests/24raid10deadlock
>>   create mode 100644 tests/24raid10deadlock.inject_error
>>
>> diff --git a/tests/24raid10deadlock b/tests/24raid10deadlock
>> new file mode 100644
>> index 00000000..27869840
>> --- /dev/null
>> +++ b/tests/24raid10deadlock
>> @@ -0,0 +1,85 @@
>> +devs="$dev0 $dev1 $dev2 $dev3"
>> +runtime=120
>> +pid=""
>> +
>> +set_up_injection()
>> +{
>> +	echo -1 > /sys/kernel/debug/fail_make_request/times
>> +	echo 1 > /sys/kernel/debug/fail_make_request/probability
>> +	echo 0 > /sys/kernel/debug/fail_make_request/verbose
>> +	echo 1 > /sys/block/${1##*/}/make-it-fail
>> +}
>> +
>> +clean_up_injection()
>> +{
>> +	echo 0 > /sys/block/${1##*/}/make-it-fail
>> +	echo 0 > /sys/kernel/debug/fail_make_request/times
>> +	echo 0 > /sys/kernel/debug/fail_make_request/probability
>> +	echo 2 > /sys/kernel/debug/fail_make_request/verbose
>> +}
>> +
>> +test_rdev()
>> +{
>> +	while true; do
>> +		mdadm -f $md0 $1 &> /dev/null
>> +		mdadm -r $md0 $1 &> /dev/null
>> +		mdadm --zero-superblock $1 &> /dev/null
>> +		mdadm -a $md0 $1 &> /dev/null
>> +		sleep $2
>> +	done
>> +}
>> +
>> +test_write_action()
>> +{
>> +	while true; do
>> +		echo frozen > /sys/block/md0/md/sync_action
>> +		echo idle > /sys/block/md0/md/sync_action
>> +		sleep 0.1
>> +	done
>> +}
>> +
>> +set_up_test()
>> +{
>> +	fio -h &> /dev/null || die "fio not found"
>> +
>> +	# create a simple raid10
>> +	mdadm -Cv -R -n 4 -l10 $md0 $devs || die "create raid10 failed"
>> +}
>> +
>> +clean_up_test()
>> +{
>> +	clean_up_injection $dev0
>> +	kill -9 $pid
>> +	pkill -9 fio
>> +
>> +	sleep 1
>> +
>> +	if ! mdadm -S $md0; then
>> +		die "can't stop array, deadlock is probably triggered"
>> +	fi
> 
> Stop may fail for different reasons, so I see it as too coarse a marker of
> "deadlock". I know that --stop still fails when md is unable to clear sysfs
> attrs in the expected time (or at least it was a problem a few years ago).
> Is there a better way to check that? I would prefer a different, less
> complicated action to exclude false positives.
> 
> In my IMSM environment I still see the md stop stress test failing
> sporadically (1/100).

Yes, this makes sense. Perhaps I'll try to detect whether a related task is
stuck in the D state.
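
Something like the following might work (just a sketch, untested):

# fail if any task involved in the test is stuck in uninterruptible sleep
if ps -eLo stat,comm | grep '^D' | grep -q 'md\|fio\|mdadm'; then
        die "task stuck in D state, deadlock is probably triggered"
fi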

Thanks,
Kuai
> 
> Thanks,
> Mariusz
> 
> .
> 



* Re: [PATCH tests 1/5] tests: add a new test to check if pluged bio is unlimited for raid10
  2023-05-24  9:00       ` Mariusz Tkaczyk
@ 2023-05-24  9:09         ` Yu Kuai
  0 siblings, 0 replies; 19+ messages in thread
From: Yu Kuai @ 2023-05-24  9:09 UTC (permalink / raw)
  To: Mariusz Tkaczyk, Yu Kuai, song
  Cc: linux-raid, jes, pmenzel, logang, yangerkun, yukuai (C)

Hi,

On 2023/05/24 17:00, Mariusz Tkaczyk wrote:
> On Wed, 24 May 2023 16:26:45 +0800
> Yu Kuai <yukuai1@huaweicloud.com> wrote:
> 
>> Hi,
>>
>> On 2023/05/24 15:53, Mariusz Tkaczyk wrote:
>>> On Tue, 23 May 2023 21:38:56 +0800
>>> Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>>    
>>>> From: Yu Kuai <yukuai3@huawei.com>
>>>>
>>>> "Plugged bio is unlimited" means that all submitted bios are plugged, and
>>>> none of them are issued to the underlying disks until blk_finish_plug() or
>>>> blk_flush_plug(). In this case, a lot of memory is used for raid10_bio
>>>> structures and I/O latency becomes very bad.
>>>>
>>>> This test triggers dirty-page writeback for raid10, where the plug is used,
>>>> and checks whether the device's inflight counter exceeds a threshold.
>>>>
>>>> This problem is supposed to be fixed by [1].
>>>
>>> The test here is for md; mdadm has nothing to do here. I'm not against it,
>>> but please extract it to a separate directory, something like "md_tests".
>>> We need to start grouping tests.
>>
>> Sorry, I don't understand this. Currently the tests for md are here; I
>> don't see why a separate directory is needed. Is there a plan to move all
>> these tests into a new directory?
> 
> Yes, that is how it was in the past, and I would like to change it because
> the tests directory is too big and we are still adding new tests.
> 
> Ideally, tests like these should stay with the kernel, because the repository
> you are contributing to now is "mdadm" - a userspace application to manage md
> devices, not the driver itself.
> That is why I wanted to highlight that the placement is controversial, but
> I'm fine with it. As you said, we did that in the past.
> 
> Song, what do you think? Where should we put tests like this in the future?

Thanks for the explanation. I know this might require a lot of work, but
maybe it's a good time to migrate these tests to blktests.

Thanks,
Kuai
> 
> 
>>>> +
>>>> +# check if inflight exceeds the threshold
>>>> +while true; do
>>>> +        tmp=`cat /sys/block/md0/inflight | awk '{printf("%d\n", $1 + $2);}'`
>>>> +        if [ $tmp -gt $threshold ]; then
>>>> +                die "inflight is greater than 4096"
>>>
>>> The message here is not meaningful; what is 4096? Please add a comment
>>> describing why a value above 4096 causes an error. We need to understand
>>> how future changes in md may affect this setting (I think there is a
>>> correlation between this value and MAX_PLUG_BIO).
>>
>> MAX_PLUG_BIO is just the limit for one task, and I'm not sure there will
>> only be one task issuing I/O; that is why I chose the much larger value 4096.
> 
> Please add that as a comment above the value.
> 
> Thanks,
> Mariusz
> .
> 



* Re: [PATCH tests 5/5] tests: add a regression test for raid456 deadlock
  2023-05-23 13:39 ` [PATCH tests 5/5] tests: add a regression test for raid456 deadlock Yu Kuai
@ 2023-05-24  9:09   ` Mariusz Tkaczyk
  0 siblings, 0 replies; 19+ messages in thread
From: Mariusz Tkaczyk @ 2023-05-24  9:09 UTC (permalink / raw)
  To: Yu Kuai; +Cc: linux-raid, jes, pmenzel, logang, song, yukuai3, yangerkun

On Tue, 23 May 2023 21:39:00 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> From: Yu Kuai <yukuai3@huawei.com>
> 
> The deadlock is described in [1]. As the last patch explained, it was first
> fixed by [2]; however, that fix will be reverted and the deadlock is
> supposed to be fixed by [3].
> 
> [1]
> https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t
> [2]
> https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/
> [3]
> https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  tests/24raid456deadlock | 56 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
>  create mode 100644 tests/24raid456deadlock
> 
> diff --git a/tests/24raid456deadlock b/tests/24raid456deadlock
> new file mode 100644
> index 00000000..161c3ab8
> --- /dev/null
> +++ b/tests/24raid456deadlock
> @@ -0,0 +1,56 @@
> +devs="$dev0 $dev1 $dev2 $dev3 $dev4 $dev5"
> +runtime=120
> +pid=""
> +old=`cat /proc/sys/vm/dirty_background_ratio`
> +
> +test_write_action()
> +{
> +	while true; do
> +		echo check > /sys/block/md0/md/sync_action &> /dev/null
> +		sleep 0.1
> +		echo idle > /sys/block/md0/md/sync_action &> /dev/null
> +	done
> +}
> +
> +test_write_back()
> +{
> +	fio -filename=$md0 -bs=4k -rw=write -numjobs=1 -name=test \
> +		-time_based -runtime=$runtime &> /dev/null
> +}
> +
> +set_up_test()
> +{
> +	fio -h &> /dev/null || die "fio not found"
> +
> +	# create a simple raid6
> +	mdadm -Cv -R -n 6 -l6 $md0 $devs --assume-clean || die "create raid6
> failed" +
> +	# trigger dirty pages write back
> +	echo 0 > /proc/sys/vm/dirty_background_ratio
> +}
> +
> +clean_up_test()
> +{
> +	echo $old > /proc/sys/vm/dirty_background_ratio
> +
> +	kill -9 $pid
> +	sync $md0
> +
> +	if ! mdadm -S $md0; then
> +		die "can't stop array, deadlock is probably triggered"
> +	fi

Stop is problematic; I described why in the previous patch.
You can clean up the array manually (I think you should, to limit complexity):

echo inactive > /sys/block/mdX/md/array_state
echo clear > /sys/block/mdX/md/array_state

Probably one of those actions will hang, right? The question is how we can
catch it.
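
One way to catch it might be to run each write in the background and check
whether it completes in time (a hypothetical, untested sketch):

echo inactive > /sys/block/md0/md/array_state &
wpid=$!
sleep 5
if kill -0 $wpid 2>/dev/null; then
        die "array_state write hung, deadlock is probably triggered"
fi
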
I'm fine with the current approach too:

Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Thanks,
Mariusz



* Re: [PATCH tests 0/5] tests: add some regression tests
  2023-05-24  7:56 ` [PATCH tests 0/5] tests: add some regression tests Paul Menzel
@ 2023-05-24  9:11   ` Yu Kuai
  0 siblings, 0 replies; 19+ messages in thread
From: Yu Kuai @ 2023-05-24  9:11 UTC (permalink / raw)
  To: Paul Menzel, Yu Kuai
  Cc: yangerkun, linux-raid, mariusz.tkaczyk, jes, logang, song, yukuai (C)

Hi,

On 2023/05/24 15:56, Paul Menzel wrote:
> Dear Yu,
> 
> 
> Thank you so very much for adding tests. For those unfamiliar with the
> test framework, could you add a line on how to run the newly added tests?

OK, ./test -h shows how to run these tests, and I'm using, for
example:

./test --dev=disk --disks=/dev/sd[abcd] --tests=24raid10deadlock

I'll add such a line to the commit message.

Thanks,
Kuai
> 
> 
> Kind regards,
> 
> Paul
> 
> 
> PS: Your participation in and contributions to the Linux kernel are very 
> much appreciated. Great work, and good to have you.
> 
> .
> 



* Re: [PATCH tests 2/5] tests: add a new test for rdev lifetime
  2023-05-24  9:05     ` Yu Kuai
@ 2023-05-24 10:38       ` Mariusz Tkaczyk
  0 siblings, 0 replies; 19+ messages in thread
From: Mariusz Tkaczyk @ 2023-05-24 10:38 UTC (permalink / raw)
  To: Yu Kuai; +Cc: linux-raid, jes, pmenzel, logang, song, yangerkun, yukuai (C)

On Wed, 24 May 2023 17:05:43 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> Hi,
> 
> On 2023/05/24 16:33, Mariusz Tkaczyk wrote:
> > On Tue, 23 May 2023 21:38:57 +0800
> > Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >   
> >> From: Yu Kuai <yukuai3@huawei.com>
> >>
> >> This test adds and removes an underlying disk from the array concurrently,
> >> verifying that the following problem is fixed:
> > 
> > As in the previous patch, feel free to move it into a separate directory.
> > 
> > This test is limited to the particular problem you resolved, because you
> > are verifying a specific error message in dmesg. It has no additional
> > value, because the probability that this issue will ever occur again in
> > the same shape is minimal.
> > 
> > IMO you should check how "remove" and "add" are handled: whether errors
> > are returned, whether there is no trace in dmesg, and whether processes
> > are not blocked in the kernel.
> 
> It's a little hard to do that. The problem is that after removing a disk
> from the array, adding it back might fail. But if I follow this order,
> it'll be hard to trigger the race, simply depending on how quickly the
> kernel finishes the queued work. So I just remove and add the disk
> concurrently, and returned errors are not a concern as long as the kernel
> doesn't WARN.

OK, makes sense. All I really care about is not grepping dmesg for one
particular error message. We should make the verification flexible, because
in the future the error may come from a different place in the same test.
That is the main objection here.
Maybe we can do a check like the one in the generic mdadm do_test():
dmesg | grep -iq "error\|call trace\|segfault"

I know that this is done anyway because do_test() is involved, but I would
prefer to have this verification in the test itself, to make the reported
error message as meaningful as possible.
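
For example (just a sketch, reusing the $dmesg_marker the test already sets):

check_dmesg() {
        if dmesg | grep -A 9999 "$dmesg_marker" |
                grep -iq "error\|call trace\|segfault"; then
                die "dmesg prints errors after the test marker"
        fi
}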

But, as we discussed in the other patch, mdadm is probably not the perfect
place for tests like this, so please be aware of that if you decide to add
this test to the kernel.

> > You can check for this error message as an additional step at the end of
> > the test, but not as a mandatory pass criterion.
> > 
> > In its current form it only tells us that a particular kernel doesn't have
> > your fix, and that is all. Because it is a race, it probably does not
> > impact real-life scenarios, which gives weak motivation to backport the
> > fix (only security reasons would matter).
> > 
> > I don't see that this particular scenario requires a test. You need to
> > make it more valuable for the future.
> 
> This is just a regression test for a kernel problem (as are all the tests
> in this patchset) that was solved recently, and I wrote this test from the
> kernel perspective, not the user's. I think this is the main difference,
> because I'm not very familiar with how mdadm works internally; I just know
> how to use it. (I still wonder why these kernel regression tests are not
> landed in blktests.)
> 
> There are more regression tests that I'm planning to add; is this the wrong
> place to add such tests?

I think we cleared that up in patch #4.

Thanks for the quick feedback,
Mariusz

