* [PATCH] fstests: delete btrfs/064 it makes no sense
@ 2020-09-29 15:50 Anand Jain
2020-09-29 15:55 ` Filipe Manana
2020-09-30 4:44 ` [PATCH v2] fstests: btrfs/064 add a comment to the test case header Anand Jain
0 siblings, 2 replies; 10+ messages in thread
From: Anand Jain @ 2020-09-29 15:50 UTC (permalink / raw)
To: fstests; +Cc: linux-btrfs, josef, guaneryu, johannes.thumshirn
btrfs/064 aimed to test balance and replace concurrency while the stress
test is running in the background.
However, as the balance and the replace operation are mutually
exclusive, so they can never run concurrently.
So long this test case is showed successful because the btrfs replace's
return error was captured only into seqfull.out.
ERROR: ioctl(DEV_REPLACE_START) '/mnt/scratch': add/delete/balance/replace/resize operation in progress
Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
tests/btrfs/064 | 105 --------------------------------------------
tests/btrfs/064.out | 2 -
2 files changed, 107 deletions(-)
delete mode 100755 tests/btrfs/064
delete mode 100644 tests/btrfs/064.out
diff --git a/tests/btrfs/064 b/tests/btrfs/064
deleted file mode 100755
index 683a69f113bf..000000000000
--- a/tests/btrfs/064
+++ /dev/null
@@ -1,105 +0,0 @@
-#! /bin/bash
-# SPDX-License-Identifier: GPL-2.0
-# Copyright (C) 2014 Red Hat Inc. All rights reserved.
-#
-# FSQA Test No. btrfs/064
-#
-# Run btrfs balance and replace operations simultaneously with fsstress
-# running in background.
-#
-seq=`basename $0`
-seqres=$RESULT_DIR/$seq
-echo "QA output created by $seq"
-
-here=`pwd`
-tmp=/tmp/$$
-status=1
-trap "_cleanup; exit \$status" 0 1 2 3 15
-
-_cleanup()
-{
- cd /
- rm -f $tmp.*
-}
-
-# get standard environment, filters and checks
-. ./common/rc
-. ./common/filter
-
-# real QA test starts here
-_supported_fs btrfs
-# we check scratch dev after each loop
-_require_scratch_nocheck
-_require_scratch_dev_pool 5
-_require_scratch_dev_pool_equal_size
-_btrfs_get_profile_configs replace
-
-rm -f $seqres.full
-
-run_test()
-{
- local mkfs_opts=$1
- local saved_scratch_dev_pool=$SCRATCH_DEV_POOL
-
- echo "Test $mkfs_opts" >>$seqres.full
-
- # remove the last device from the SCRATCH_DEV_POOL list so
- # _scratch_pool_mkfs won't use all devices in pool
- local last_dev="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`"
- SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$last_dev *##"`
- _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
- # make sure we created btrfs with desired options
- if [ $? -ne 0 ]; then
- echo "mkfs $mkfs_opts failed"
- SCRATCH_DEV_POOL=$saved_scratch_dev_pool
- return
- fi
- _scratch_mount >>$seqres.full 2>&1
- SCRATCH_DEV_POOL=$saved_scratch_dev_pool
-
- args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
- echo "Run fsstress $args" >>$seqres.full
- $FSSTRESS_PROG $args >/dev/null 2>&1 &
- fsstress_pid=$!
-
- echo -n "Start balance worker: " >>$seqres.full
- _btrfs_stress_balance $SCRATCH_MNT >/dev/null 2>&1 &
- balance_pid=$!
- echo "$balance_pid" >>$seqres.full
-
- echo -n "Start replace worker: " >>$seqres.full
- _btrfs_stress_replace $SCRATCH_MNT >>$seqres.full 2>&1 &
- replace_pid=$!
- echo "$replace_pid" >>$seqres.full
-
- echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
- wait $fsstress_pid
- kill $balance_pid $replace_pid
- wait
- # wait for the balance and replace operations to finish
- while ps aux | grep "balance start" | grep -qv grep; do
- sleep 1
- done
- while ps aux | grep "replace start" | grep -qv grep; do
- sleep 1
- done
-
- echo "Scrub the filesystem" >>$seqres.full
- $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
- if [ $? -ne 0 ]; then
- echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
- fi
-
- _scratch_unmount
- # we called _require_scratch_nocheck instead of _require_scratch
- # do check after test for each profile config
- _check_scratch_fs
-}
-
-echo "Silence is golden"
-for t in "${_btrfs_profile_configs[@]}"; do
- run_test "$t"
-done
-
-status=0
-exit
diff --git a/tests/btrfs/064.out b/tests/btrfs/064.out
deleted file mode 100644
index d90765460090..000000000000
--- a/tests/btrfs/064.out
+++ /dev/null
@@ -1,2 +0,0 @@
-QA output created by 064
-Silence is golden
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] fstests: delete btrfs/064 it makes no sense
2020-09-29 15:50 [PATCH] fstests: delete btrfs/064 it makes no sense Anand Jain
@ 2020-09-29 15:55 ` Filipe Manana
2020-09-29 16:02 ` Josef Bacik
2020-09-30 4:44 ` [PATCH v2] fstests: btrfs/064 add a comment to the test case header Anand Jain
1 sibling, 1 reply; 10+ messages in thread
From: Filipe Manana @ 2020-09-29 15:55 UTC (permalink / raw)
To: Anand Jain
Cc: fstests, linux-btrfs, Josef Bacik, Eryu Guan, Johannes Thumshirn
On Tue, Sep 29, 2020 at 4:50 PM Anand Jain <anand.jain@oracle.com> wrote:
>
> btrfs/064 aimed to test balance and replace concurrency while the stress
> test is running in the background.
>
> However, as the balance and the replace operation are mutually
> exclusive, so they can never run concurrently.
And it's good to have a test that verifies that attempting to run them
concurrently doesn't cause any problems, like crashes, memory leaks or
some sort of filesystem corruption.
For example btrfs/187, which I wrote sometime ago, tests that running
send, balance and deduplication in parallel doesn't result in crashes,
since in the past they were allowed to run concurrently.
I see no point in removing the test, it's useful.
Thanks.
>
> So long this test case is showed successful because the btrfs replace's
> return error was captured only into seqfull.out.
>
> ERROR: ioctl(DEV_REPLACE_START) '/mnt/scratch': add/delete/balance/replace/resize operation in progress
>
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> tests/btrfs/064 | 105 --------------------------------------------
> tests/btrfs/064.out | 2 -
> 2 files changed, 107 deletions(-)
> delete mode 100755 tests/btrfs/064
> delete mode 100644 tests/btrfs/064.out
>
> diff --git a/tests/btrfs/064 b/tests/btrfs/064
> deleted file mode 100755
> index 683a69f113bf..000000000000
> --- a/tests/btrfs/064
> +++ /dev/null
> @@ -1,105 +0,0 @@
> -#! /bin/bash
> -# SPDX-License-Identifier: GPL-2.0
> -# Copyright (C) 2014 Red Hat Inc. All rights reserved.
> -#
> -# FSQA Test No. btrfs/064
> -#
> -# Run btrfs balance and replace operations simultaneously with fsstress
> -# running in background.
> -#
> -seq=`basename $0`
> -seqres=$RESULT_DIR/$seq
> -echo "QA output created by $seq"
> -
> -here=`pwd`
> -tmp=/tmp/$$
> -status=1
> -trap "_cleanup; exit \$status" 0 1 2 3 15
> -
> -_cleanup()
> -{
> - cd /
> - rm -f $tmp.*
> -}
> -
> -# get standard environment, filters and checks
> -. ./common/rc
> -. ./common/filter
> -
> -# real QA test starts here
> -_supported_fs btrfs
> -# we check scratch dev after each loop
> -_require_scratch_nocheck
> -_require_scratch_dev_pool 5
> -_require_scratch_dev_pool_equal_size
> -_btrfs_get_profile_configs replace
> -
> -rm -f $seqres.full
> -
> -run_test()
> -{
> - local mkfs_opts=$1
> - local saved_scratch_dev_pool=$SCRATCH_DEV_POOL
> -
> - echo "Test $mkfs_opts" >>$seqres.full
> -
> - # remove the last device from the SCRATCH_DEV_POOL list so
> - # _scratch_pool_mkfs won't use all devices in pool
> - local last_dev="`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`"
> - SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$last_dev *##"`
> - _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
> - # make sure we created btrfs with desired options
> - if [ $? -ne 0 ]; then
> - echo "mkfs $mkfs_opts failed"
> - SCRATCH_DEV_POOL=$saved_scratch_dev_pool
> - return
> - fi
> - _scratch_mount >>$seqres.full 2>&1
> - SCRATCH_DEV_POOL=$saved_scratch_dev_pool
> -
> - args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
> - echo "Run fsstress $args" >>$seqres.full
> - $FSSTRESS_PROG $args >/dev/null 2>&1 &
> - fsstress_pid=$!
> -
> - echo -n "Start balance worker: " >>$seqres.full
> - _btrfs_stress_balance $SCRATCH_MNT >/dev/null 2>&1 &
> - balance_pid=$!
> - echo "$balance_pid" >>$seqres.full
> -
> - echo -n "Start replace worker: " >>$seqres.full
> - _btrfs_stress_replace $SCRATCH_MNT >>$seqres.full 2>&1 &
> - replace_pid=$!
> - echo "$replace_pid" >>$seqres.full
> -
> - echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
> - wait $fsstress_pid
> - kill $balance_pid $replace_pid
> - wait
> - # wait for the balance and replace operations to finish
> - while ps aux | grep "balance start" | grep -qv grep; do
> - sleep 1
> - done
> - while ps aux | grep "replace start" | grep -qv grep; do
> - sleep 1
> - done
> -
> - echo "Scrub the filesystem" >>$seqres.full
> - $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
> - if [ $? -ne 0 ]; then
> - echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
> - fi
> -
> - _scratch_unmount
> - # we called _require_scratch_nocheck instead of _require_scratch
> - # do check after test for each profile config
> - _check_scratch_fs
> -}
> -
> -echo "Silence is golden"
> -for t in "${_btrfs_profile_configs[@]}"; do
> - run_test "$t"
> -done
> -
> -status=0
> -exit
> diff --git a/tests/btrfs/064.out b/tests/btrfs/064.out
> deleted file mode 100644
> index d90765460090..000000000000
> --- a/tests/btrfs/064.out
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -QA output created by 064
> -Silence is golden
> --
> 2.25.1
>
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] fstests: delete btrfs/064 it makes no sense
2020-09-29 15:55 ` Filipe Manana
@ 2020-09-29 16:02 ` Josef Bacik
2020-09-29 16:13 ` Filipe Manana
0 siblings, 1 reply; 10+ messages in thread
From: Josef Bacik @ 2020-09-29 16:02 UTC (permalink / raw)
To: fdmanana, Anand Jain; +Cc: fstests, linux-btrfs, Eryu Guan, Johannes Thumshirn
On 9/29/20 11:55 AM, Filipe Manana wrote:
> On Tue, Sep 29, 2020 at 4:50 PM Anand Jain <anand.jain@oracle.com> wrote:
>>
>> btrfs/064 aimed to test balance and replace concurrency while the stress
>> test is running in the background.
>>
>> However, as the balance and the replace operation are mutually
>> exclusive, so they can never run concurrently.
>
> And it's good to have a test that verifies that attempting to run them
> concurrently doesn't cause any problems, like crashes, memory leaks or
> some sort of filesystem corruption.
>
> For example btrfs/187, which I wrote sometime ago, tests that running
> send, balance and deduplication in parallel doesn't result in crashes,
> since in the past they were allowed to run concurrently.
>
> I see no point in removing the test, it's useful.
My confusion was around whether this test was actually testing what we
think it should be testing. If this test was meant to make sure that
replace works while we've got load on the fs, then clearly it's not
doing what we think it's doing.
In this case if we're ok with it exercising the exclusion path then I
think we at least need to update the comment at the beginning of the
test so it's clear what the purpose of the test is. And then we need to
make sure we do actually have a test where device replace is properly
exercised. I _think_ it is with btrfs/065 and some others, so just
updating the comment here would be enough. Thanks,
Josef
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] fstests: delete btrfs/064 it makes no sense
2020-09-29 16:02 ` Josef Bacik
@ 2020-09-29 16:13 ` Filipe Manana
2020-09-29 17:26 ` Josef Bacik
0 siblings, 1 reply; 10+ messages in thread
From: Filipe Manana @ 2020-09-29 16:13 UTC (permalink / raw)
To: Josef Bacik
Cc: Anand Jain, fstests, linux-btrfs, Eryu Guan, Johannes Thumshirn
On Tue, Sep 29, 2020 at 5:02 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> On 9/29/20 11:55 AM, Filipe Manana wrote:
> > On Tue, Sep 29, 2020 at 4:50 PM Anand Jain <anand.jain@oracle.com> wrote:
> >>
> >> btrfs/064 aimed to test balance and replace concurrency while the stress
> >> test is running in the background.
> >>
> >> However, as the balance and the replace operation are mutually
> >> exclusive, so they can never run concurrently.
> >
> > And it's good to have a test that verifies that attempting to run them
> > concurrently doesn't cause any problems, like crashes, memory leaks or
> > some sort of filesystem corruption.
> >
> > For example btrfs/187, which I wrote sometime ago, tests that running
> > send, balance and deduplication in parallel doesn't result in crashes,
> > since in the past they were allowed to run concurrently.
> >
> > I see no point in removing the test, it's useful.
>
> My confusion was around whether this test was actually testing what we
> think it should be testing. If this test was meant to make sure that
> replace works while we've got load on the fs, then clearly it's not
> doing what we think it's doing.
Given that neither the test's description nor the changelog mention
that it expects device replace and balance to be able to run
concurrently,
that errors are explicitly ignored and redirected to $seqres.full, and
we don't do any sort of validation after device replace operations, it
makes it clear to me it's a stress test.
>
> In this case if we're ok with it exercising the exclusion path then I
> think we at least need to update the comment at the beginning of the
> test so it's clear what the purpose of the test is. And then we need to
> make sure we do actually have a test where device replace is properly
> exercised. I _think_ it is with btrfs/065 and some others, so just
> updating the comment here would be enough. Thanks,
>
> Josef
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] fstests: delete btrfs/064 it makes no sense
2020-09-29 16:13 ` Filipe Manana
@ 2020-09-29 17:26 ` Josef Bacik
2020-09-30 4:14 ` Anand Jain
0 siblings, 1 reply; 10+ messages in thread
From: Josef Bacik @ 2020-09-29 17:26 UTC (permalink / raw)
To: fdmanana; +Cc: Anand Jain, fstests, linux-btrfs, Eryu Guan, Johannes Thumshirn
On 9/29/20 12:13 PM, Filipe Manana wrote:
> On Tue, Sep 29, 2020 at 5:02 PM Josef Bacik <josef@toxicpanda.com> wrote:
>>
>> On 9/29/20 11:55 AM, Filipe Manana wrote:
>>> On Tue, Sep 29, 2020 at 4:50 PM Anand Jain <anand.jain@oracle.com> wrote:
>>>>
>>>> btrfs/064 aimed to test balance and replace concurrency while the stress
>>>> test is running in the background.
>>>>
>>>> However, as the balance and the replace operation are mutually
>>>> exclusive, so they can never run concurrently.
>>>
>>> And it's good to have a test that verifies that attempting to run them
>>> concurrently doesn't cause any problems, like crashes, memory leaks or
>>> some sort of filesystem corruption.
>>>
>>> For example btrfs/187, which I wrote sometime ago, tests that running
>>> send, balance and deduplication in parallel doesn't result in crashes,
>>> since in the past they were allowed to run concurrently.
>>>
>>> I see no point in removing the test, it's useful.
>>
>> My confusion was around whether this test was actually testing what we
>> think it should be testing. If this test was meant to make sure that
>> replace works while we've got load on the fs, then clearly it's not
>> doing what we think it's doing.
>
> Given that neither the test's description nor the changelog mention
> that it expects device replace and balance to be able to run
> concurrently,
> that errors are explicitly ignored and redirected to $seqres.full, and
> we don't do any sort of validation after device replace operations, it
> makes it clear to me it's a stress test.
>
Sure but I spent a while looking at it when it was failing being very
confused. In my mind my snapshot-stress.sh is a stress test, because
its meant to run without errors. The changelog and description are
sufficiently vague enough that it appeared that Eryu meant to write a
test that actually did a replace and balance at the same time. The test
clearly isn't doing that, so we need to update the description so it's
clear that's what's going on. And then I wanted to make sure that we do
in fact have a test that stresses replace in these scenarios, because I
want to make sure we actually test replace as well.
Not ripping it out is fine, but updating the description so I'm not
confused in a couple years when I trip over this again would be nice.
Thanks,
Josef
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] fstests: delete btrfs/064 it makes no sense
2020-09-29 17:26 ` Josef Bacik
@ 2020-09-30 4:14 ` Anand Jain
2020-09-30 9:16 ` Filipe Manana
0 siblings, 1 reply; 10+ messages in thread
From: Anand Jain @ 2020-09-30 4:14 UTC (permalink / raw)
To: Josef Bacik, fdmanana; +Cc: fstests, linux-btrfs, Eryu Guan, Johannes Thumshirn
On 30/9/20 1:26 am, Josef Bacik wrote:
> On 9/29/20 12:13 PM, Filipe Manana wrote:
>> On Tue, Sep 29, 2020 at 5:02 PM Josef Bacik <josef@toxicpanda.com> wrote:
>>>
>>> On 9/29/20 11:55 AM, Filipe Manana wrote:
>>>> On Tue, Sep 29, 2020 at 4:50 PM Anand Jain <anand.jain@oracle.com>
>>>> wrote:
>>>>>
>>>>> btrfs/064 aimed to test balance and replace concurrency while the
>>>>> stress
>>>>> test is running in the background.
>>>>>
>>>>> However, as the balance and the replace operation are mutually
>>>>> exclusive, so they can never run concurrently.
>>>>
>>>> And it's good to have a test that verifies that attempting to run them
>>>> concurrently doesn't cause any problems, like crashes, memory leaks or
>>>> some sort of filesystem corruption.
>>>>
>>>> For example btrfs/187, which I wrote sometime ago, tests that running
>>>> send, balance and deduplication in parallel doesn't result in crashes,
>>>> since in the past they were allowed to run concurrently.
>>>>
>>>> I see no point in removing the test, it's useful.
>>>
>>> My confusion was around whether this test was actually testing what we
>>> think it should be testing. If this test was meant to make sure that
>>> replace works while we've got load on the fs, then clearly it's not
>>> doing what we think it's doing.
>>
>> Given that neither the test's description nor the changelog mention
>> that it expects device replace and balance to be able to run
>> concurrently,
>> that errors are explicitly ignored and redirected to $seqres.full, and
>> we don't do any sort of validation after device replace operations, it
>> makes it clear to me it's a stress test.
>>
>
> Sure but I spent a while looking at it when it was failing being very
> confused. In my mind my snapshot-stress.sh is a stress test, because
> its meant to run without errors. The changelog and description are
> sufficiently vague enough that it appeared that Eryu meant to write a
> test that actually did a replace and balance at the same time. The test
> clearly isn't doing that, so we need to update the description so it's
> clear that's what's going on. And then I wanted to make sure that we do
> in fact have a test that stresses replace in these scenarios, because I
> want to make sure we actually test replace as well.
>
> Not ripping it out is fine, but updating the description so I'm not
> confused in a couple years when I trip over this again would be nice.
> Thanks,
>
As of now, we have the following balance concurrency tests.
-----
028 balance and unlink fsstress concurrency [1]
060 balance and subvol ops concurrency with fsstress [2]
061 balance and scrub concurrency with fsstress [2]
062 balance and defrag concurrency with fsstress [2]
063 balance and remount concurrency with fsstress [2]
064 balance and replace concurrency with fsstress [2]
177 balance and resize concurrency
187 balance, send and dedupe concurrency
190 balance with qgroup
[1]
args=`_scale_fsstress_args -z \
-f write=10 -f unlink=10 \
-f creat=10 -f fsync=10 \
-f fsync=10 -n 100000 -p 2 \
-d $SCRATCH_MNT/stress_dir`
[2]
args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d
$SCRATCH_MNT/stressdir`
-----
064 shall test balance with fsstress in the background. The replace
thread is kept out with the early check of BTRFS_FS_EXCL_OP in the kernel.
I am ok with the 064 headers updated, will send v2.
Also, it turns out that this test case helped to find a btrfs-progs bug.
Its patch [1] is sent to the ML.
[1] btrfs-progs: fix return code for failed replace start
Thanks, Anand
> Josef
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2] fstests: btrfs/064 add a comment to the test case header
2020-09-29 15:50 [PATCH] fstests: delete btrfs/064 it makes no sense Anand Jain
2020-09-29 15:55 ` Filipe Manana
@ 2020-09-30 4:44 ` Anand Jain
2020-09-30 12:42 ` Josef Bacik
1 sibling, 1 reply; 10+ messages in thread
From: Anand Jain @ 2020-09-30 4:44 UTC (permalink / raw)
To: fstests; +Cc: linux-btrfs, josef, guaneryu, johannes.thumshirn
It appears that the goal of this case was to test balance and
replace concurrency. But these two operations aren't meant to run
concurrently and the replace failing errors are captured in the
seqres.full output. Which are expected errors. To avoid further
confusion, this patch adds comments. The reason to keep this
test case is at the Link.
Link: https://patchwork.kernel.org/patch/11806307/
Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v2:
Instead of removing the test case, add comments to clarify.
Title is changed
old: fstests: delete btrfs/064 it makes no sens
tests/btrfs/064 | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/tests/btrfs/064 b/tests/btrfs/064
index 683a69f113bf..ce73acf7872a 100755
--- a/tests/btrfs/064
+++ b/tests/btrfs/064
@@ -4,9 +4,11 @@
#
# FSQA Test No. btrfs/064
#
-# Run btrfs balance and replace operations simultaneously with fsstress
-# running in background.
-#
+# Run btrfs balance and replace operations simultaneously with fsstress running
+# in the background, check with the scrub if all the blocks are ok.
+# Balance and replace operations are mutually exclusive operations they can't
+# run simultaneously. One of them is expected to fail when the other is running.
+
seq=`basename $0`
seqres=$RESULT_DIR/$seq
echo "QA output created by $seq"
@@ -62,6 +64,8 @@ run_test()
$FSSTRESS_PROG $args >/dev/null 2>&1 &
fsstress_pid=$!
+ # Start both balance and replace in the background.
+ # Either balance or replace shall run, the other fails.
echo -n "Start balance worker: " >>$seqres.full
_btrfs_stress_balance $SCRATCH_MNT >/dev/null 2>&1 &
balance_pid=$!
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] fstests: delete btrfs/064 it makes no sense
2020-09-30 4:14 ` Anand Jain
@ 2020-09-30 9:16 ` Filipe Manana
2020-09-30 10:01 ` Anand Jain
0 siblings, 1 reply; 10+ messages in thread
From: Filipe Manana @ 2020-09-30 9:16 UTC (permalink / raw)
To: Anand Jain
Cc: Josef Bacik, fstests, linux-btrfs, Eryu Guan, Johannes Thumshirn
On Wed, Sep 30, 2020 at 5:14 AM Anand Jain <anand.jain@oracle.com> wrote:
>
> On 30/9/20 1:26 am, Josef Bacik wrote:
> > On 9/29/20 12:13 PM, Filipe Manana wrote:
> >> On Tue, Sep 29, 2020 at 5:02 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >>>
> >>> On 9/29/20 11:55 AM, Filipe Manana wrote:
> >>>> On Tue, Sep 29, 2020 at 4:50 PM Anand Jain <anand.jain@oracle.com>
> >>>> wrote:
> >>>>>
> >>>>> btrfs/064 aimed to test balance and replace concurrency while the
> >>>>> stress
> >>>>> test is running in the background.
> >>>>>
> >>>>> However, as the balance and the replace operation are mutually
> >>>>> exclusive, so they can never run concurrently.
> >>>>
> >>>> And it's good to have a test that verifies that attempting to run them
> >>>> concurrently doesn't cause any problems, like crashes, memory leaks or
> >>>> some sort of filesystem corruption.
> >>>>
> >>>> For example btrfs/187, which I wrote sometime ago, tests that running
> >>>> send, balance and deduplication in parallel doesn't result in crashes,
> >>>> since in the past they were allowed to run concurrently.
> >>>>
> >>>> I see no point in removing the test, it's useful.
> >>>
> >>> My confusion was around whether this test was actually testing what we
> >>> think it should be testing. If this test was meant to make sure that
> >>> replace works while we've got load on the fs, then clearly it's not
> >>> doing what we think it's doing.
> >>
> >> Given that neither the test's description nor the changelog mention
> >> that it expects device replace and balance to be able to run
> >> concurrently,
> >> that errors are explicitly ignored and redirected to $seqres.full, and
> >> we don't do any sort of validation after device replace operations, it
> >> makes it clear to me it's a stress test.
> >>
> >
> > Sure but I spent a while looking at it when it was failing being very
> > confused. In my mind my snapshot-stress.sh is a stress test, because
> > its meant to run without errors. The changelog and description are
> > sufficiently vague enough that it appeared that Eryu meant to write a
> > test that actually did a replace and balance at the same time. The test
> > clearly isn't doing that, so we need to update the description so it's
> > clear that's what's going on. And then I wanted to make sure that we do
> > in fact have a test that stresses replace in these scenarios, because I
> > want to make sure we actually test replace as well.
> >
> > Not ripping it out is fine, but updating the description so I'm not
> > confused in a couple years when I trip over this again would be nice.
> > Thanks,
> >
>
> As of now, we have the following balance concurrency tests.
> -----
> 028 balance and unlink fsstress concurrency [1]
> 060 balance and subvol ops concurrency with fsstress [2]
> 061 balance and scrub concurrency with fsstress [2]
> 062 balance and defrag concurrency with fsstress [2]
> 063 balance and remount concurrency with fsstress [2]
> 064 balance and replace concurrency with fsstress [2]
> 177 balance and resize concurrency
No, 177 does not test balance and resize concurrency.
It tests balance when a swap file exists. And the resize happens
(starts and ends) before setting the swap file and before doing the
balance.
Thanks.
> 187 balance, send and dedupe concurrency
> 190 balance with qgroup
>
> [1]
> args=`_scale_fsstress_args -z \
> -f write=10 -f unlink=10 \
> -f creat=10 -f fsync=10 \
> -f fsync=10 -n 100000 -p 2 \
> -d $SCRATCH_MNT/stress_dir`
>
> [2]
> args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d
> $SCRATCH_MNT/stressdir`
> -----
>
> 064 shall test balance with fsstress in the background. The replace
> thread is kept out with the early check of BTRFS_FS_EXCL_OP in the kernel.
> I am ok with the 064 headers updated, will send v2.
>
>
> Also, it turns out that this test case helped to find a btrfs-progs bug.
> Its patch [1] is sent to the ML.
> [1] btrfs-progs: fix return code for failed replace start
>
> Thanks, Anand
>
>
> > Josef
>
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] fstests: delete btrfs/064 it makes no sense
2020-09-30 9:16 ` Filipe Manana
@ 2020-09-30 10:01 ` Anand Jain
0 siblings, 0 replies; 10+ messages in thread
From: Anand Jain @ 2020-09-30 10:01 UTC (permalink / raw)
To: fdmanana; +Cc: Josef Bacik, fstests, linux-btrfs, Eryu Guan, Johannes Thumshirn
On 30/9/20 5:16 pm, Filipe Manana wrote:
> On Wed, Sep 30, 2020 at 5:14 AM Anand Jain <anand.jain@oracle.com> wrote:
>>
>> On 30/9/20 1:26 am, Josef Bacik wrote:
>>> On 9/29/20 12:13 PM, Filipe Manana wrote:
>>>> On Tue, Sep 29, 2020 at 5:02 PM Josef Bacik <josef@toxicpanda.com> wrote:
>>>>>
>>>>> On 9/29/20 11:55 AM, Filipe Manana wrote:
>>>>>> On Tue, Sep 29, 2020 at 4:50 PM Anand Jain <anand.jain@oracle.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> btrfs/064 aimed to test balance and replace concurrency while the
>>>>>>> stress
>>>>>>> test is running in the background.
>>>>>>>
>>>>>>> However, as the balance and the replace operation are mutually
>>>>>>> exclusive, so they can never run concurrently.
>>>>>>
>>>>>> And it's good to have a test that verifies that attempting to run them
>>>>>> concurrently doesn't cause any problems, like crashes, memory leaks or
>>>>>> some sort of filesystem corruption.
>>>>>>
>>>>>> For example btrfs/187, which I wrote sometime ago, tests that running
>>>>>> send, balance and deduplication in parallel doesn't result in crashes,
>>>>>> since in the past they were allowed to run concurrently.
>>>>>>
>>>>>> I see no point in removing the test, it's useful.
>>>>>
>>>>> My confusion was around whether this test was actually testing what we
>>>>> think it should be testing. If this test was meant to make sure that
>>>>> replace works while we've got load on the fs, then clearly it's not
>>>>> doing what we think it's doing.
>>>>
>>>> Given that neither the test's description nor the changelog mention
>>>> that it expects device replace and balance to be able to run
>>>> concurrently,
>>>> that errors are explicitly ignored and redirected to $seqres.full, and
>>>> we don't do any sort of validation after device replace operations, it
>>>> makes it clear to me it's a stress test.
>>>>
>>>
>>> Sure but I spent a while looking at it when it was failing being very
>>> confused. In my mind my snapshot-stress.sh is a stress test, because
>>> its meant to run without errors. The changelog and description are
>>> sufficiently vague enough that it appeared that Eryu meant to write a
>>> test that actually did a replace and balance at the same time. The test
>>> clearly isn't doing that, so we need to update the description so it's
>>> clear that's what's going on. And then I wanted to make sure that we do
>>> in fact have a test that stresses replace in these scenarios, because I
>>> want to make sure we actually test replace as well.
>>>
>>> Not ripping it out is fine, but updating the description so I'm not
>>> confused in a couple years when I trip over this again would be nice.
>>> Thanks,
>>>
>>
>> As of now, we have the following balance concurrency tests.
>> -----
>> 028 balance and unlink fsstress concurrency [1]
>> 060 balance and subvol ops concurrency with fsstress [2]
>> 061 balance and scrub concurrency with fsstress [2]
>> 062 balance and defrag concurrency with fsstress [2]
>> 063 balance and remount concurrency with fsstress [2]
>> 064 balance and replace concurrency with fsstress [2]
>> 177 balance and resize concurrency
>
> No, 177 does not test balance and resize concurrency.
> It tests balance when a swap file exists. And the resize happens
> (starts and ends) before setting the swap file and before doing the
> balance.
You are right. Thanks for correcting.
-Anand
>
> Thanks.
>
>
>> 187 balance, send and dedupe concurrency
>> 190 balance with qgroup
>>
>> [1]
>> args=`_scale_fsstress_args -z \
>> -f write=10 -f unlink=10 \
>> -f creat=10 -f fsync=10 \
>> -f fsync=10 -n 100000 -p 2 \
>> -d $SCRATCH_MNT/stress_dir`
>>
>> [2]
>> args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d
>> $SCRATCH_MNT/stressdir`
>> -----
>>
>> 064 shall test balance with fsstress in the background. The replace
>> thread is kept out with the early check of BTRFS_FS_EXCL_OP in the kernel.
>> I am ok with the 064 headers updated, will send v2.
>>
>>
>> Also, it turns out that this test case helped to find a btrfs-progs bug.
>> Its patch [1] is sent to the ML.
>> [1] btrfs-progs: fix return code for failed replace start
>>
>> Thanks, Anand
>>
>>
>>> Josef
>>
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] fstests: btrfs/064 add a comment to the test case header
2020-09-30 4:44 ` [PATCH v2] fstests: btrfs/064 add a comment to the test case header Anand Jain
@ 2020-09-30 12:42 ` Josef Bacik
0 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2020-09-30 12:42 UTC (permalink / raw)
To: Anand Jain, fstests; +Cc: linux-btrfs, guaneryu, johannes.thumshirn
On 9/30/20 12:44 AM, Anand Jain wrote:
> It appears that the goal of this case was to test balance and
> replace concurrency. But these two operations aren't meant to run
> concurrently and the replace failing errors are captured in the
> seqres.full output. Which are expected errors. To avoid further
> confusion, this patch adds comments. The reason to keep this
> test case is at the Link.
>
> Link: https://patchwork.kernel.org/patch/11806307/
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Thanks,
Josef
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2020-09-30 12:42 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-29 15:50 [PATCH] fstests: delete btrfs/064 it makes no sense Anand Jain
2020-09-29 15:55 ` Filipe Manana
2020-09-29 16:02 ` Josef Bacik
2020-09-29 16:13 ` Filipe Manana
2020-09-29 17:26 ` Josef Bacik
2020-09-30 4:14 ` Anand Jain
2020-09-30 9:16 ` Filipe Manana
2020-09-30 10:01 ` Anand Jain
2020-09-30 4:44 ` [PATCH v2] fstests: btrfs/064 add a comment to the test case header Anand Jain
2020-09-30 12:42 ` Josef Bacik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).