fstests.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel
@ 2020-05-20  7:43 Qu Wenruo
  2020-05-20  7:48 ` Nikolay Borisov
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2020-05-20  7:43 UTC (permalink / raw)
  To: linux-btrfs, fstests

Test if canceling a running balance can cause later balance to dead
loop.

The ifx is titled "btrfs: relocation: Clear the DEAD_RELOC_TREE bit for
 orphan roots to prevent runaway balance".

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Remove lsof debug output
v3:
- Remove ps debug output
---
 tests/btrfs/213     | 65 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/213.out |  2 ++
 tests/btrfs/group   |  1 +
 3 files changed, 68 insertions(+)
 create mode 100755 tests/btrfs/213
 create mode 100644 tests/btrfs/213.out

diff --git a/tests/btrfs/213 b/tests/btrfs/213
new file mode 100755
index 00000000..145378cb
--- /dev/null
+++ b/tests/btrfs/213
@@ -0,0 +1,65 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2020 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test 213
+#
+# Test if canceling a running balance can lead to dead looping balance
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# Modify as appropriate.
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_command "$KILLALL_PROG" killall
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+
+runtime=4
+
+# Create enough IO so that we need around $runtime seconds to relocate it
+dd if=/dev/zero bs=1M of="$SCRATCH_MNT/file" oflag=sync status=none \
+	&> /dev/null &
+dd_pid=$!
+sleep $runtime
+"$KILLALL_PROG" -q dd &> /dev/null
+wait $dd_pid
+
+ps aux >> $seqres.full
+# Now balance should take at least $runtime seconds, we can cancel it at
+# $runtime/2 to ensure a success cancel.
+$BTRFS_UTIL_PROG balance start -d --bg "$SCRATCH_MNT"
+sleep $(($runtime / 2))
+$BTRFS_UTIL_PROG balance cancel "$SCRATCH_MNT"
+
+# Now check if we can finish relocating metadata, which should finish very
+# quickly
+$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" >> $seqres.full
+
+echo "Silence is golden"
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/213.out b/tests/btrfs/213.out
new file mode 100644
index 00000000..bd8f2430
--- /dev/null
+++ b/tests/btrfs/213.out
@@ -0,0 +1,2 @@
+QA output created by 213
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 8d65bddd..fe4d5bb3 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -215,3 +215,4 @@
 210 auto quick qgroup snapshot
 211 auto quick log prealloc
 212 auto balance dangerous
+213 auto fast balance dangerous
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel
  2020-05-20  7:43 [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel Qu Wenruo
@ 2020-05-20  7:48 ` Nikolay Borisov
  0 siblings, 0 replies; 8+ messages in thread
From: Nikolay Borisov @ 2020-05-20  7:48 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, fstests



On 20.05.20 г. 10:43 ч., Qu Wenruo wrote:
> Test if canceling a running balance can cause later balance to dead
> loop.
> 
> The ifx is titled "btrfs: relocation: Clear the DEAD_RELOC_TREE bit for
>  orphan roots to prevent runaway balance".
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Remove lsof debug output
> v3:
> - Remove ps debug output
> ---
>  tests/btrfs/213     | 65 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/213.out |  2 ++
>  tests/btrfs/group   |  1 +
>  3 files changed, 68 insertions(+)
>  create mode 100755 tests/btrfs/213
>  create mode 100644 tests/btrfs/213.out
> 
> diff --git a/tests/btrfs/213 b/tests/btrfs/213
> new file mode 100755
> index 00000000..145378cb
> --- /dev/null
> +++ b/tests/btrfs/213
> @@ -0,0 +1,65 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2020 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 213
> +#
> +# Test if canceling a running balance can lead to dead looping balance
> +#
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# remove previous $seqres.full before test
> +rm -f $seqres.full
> +
> +# Modify as appropriate.
> +_supported_fs btrfs
> +_supported_os Linux
> +_require_scratch
> +_require_command "$KILLALL_PROG" killall
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount
> +
> +runtime=4
> +
> +# Create enough IO so that we need around $runtime seconds to relocate it
> +dd if=/dev/zero bs=1M of="$SCRATCH_MNT/file" oflag=sync status=none \
> +	&> /dev/null &
> +dd_pid=$!
> +sleep $runtime
> +"$KILLALL_PROG" -q dd &> /dev/null
> +wait $dd_pid
> +
> +ps aux >> $seqres.full

Nope, you didn't

> +# Now balance should take at least $runtime seconds, we can cancel it at
> +# $runtime/2 to ensure a success cancel.
> +$BTRFS_UTIL_PROG balance start -d --bg "$SCRATCH_MNT"
> +sleep $(($runtime / 2))
> +$BTRFS_UTIL_PROG balance cancel "$SCRATCH_MNT"
> +
> +# Now check if we can finish relocating metadata, which should finish very
> +# quickly
> +$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" >> $seqres.full
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/213.out b/tests/btrfs/213.out
> new file mode 100644
> index 00000000..bd8f2430
> --- /dev/null
> +++ b/tests/btrfs/213.out
> @@ -0,0 +1,2 @@
> +QA output created by 213
> +Silence is golden
> diff --git a/tests/btrfs/group b/tests/btrfs/group
> index 8d65bddd..fe4d5bb3 100644
> --- a/tests/btrfs/group
> +++ b/tests/btrfs/group
> @@ -215,3 +215,4 @@
>  210 auto quick qgroup snapshot
>  211 auto quick log prealloc
>  212 auto balance dangerous
> +213 auto fast balance dangerous
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel
  2020-05-20 11:32     ` Qu Wenruo
@ 2020-05-20 12:38       ` Filipe Manana
  0 siblings, 0 replies; 8+ messages in thread
From: Filipe Manana @ 2020-05-20 12:38 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Qu Wenruo, linux-btrfs, fstests

On Wed, May 20, 2020 at 12:32 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2020/5/20 下午6:02, Qu Wenruo wrote:
> >
> >
> > On 2020/5/20 下午5:29, Filipe Manana wrote:
> >> On Wed, May 20, 2020 at 8:59 AM Qu Wenruo <wqu@suse.com> wrote:
> >>>
> >>> Test if canceling a running balance can cause later balance to dead
> >>> loop.
> >>>
> >>> The ifx is titled "btrfs: relocation: Clear the DEAD_RELOC_TREE bit for
> >>>  orphan roots to prevent runaway balance".
> >>>
> >>> Signed-off-by: Qu Wenruo <wqu@suse.com>
> >>> ---
> >>> Changelog:
> >>> v2:
> >>> - Remove lsof debug output
> >>> v3:
> >>> - Remove ps debug output
> >>> ---
> >>>  tests/btrfs/213     | 64 +++++++++++++++++++++++++++++++++++++++++++++
> >>>  tests/btrfs/213.out |  2 ++
> >>>  tests/btrfs/group   |  1 +
> >>>  3 files changed, 67 insertions(+)
> >>>  create mode 100755 tests/btrfs/213
> >>>  create mode 100644 tests/btrfs/213.out
> >>>
> >>> diff --git a/tests/btrfs/213 b/tests/btrfs/213
> >>> new file mode 100755
> >>> index 00000000..f56b0279
> >>> --- /dev/null
> >>> +++ b/tests/btrfs/213
> >>> @@ -0,0 +1,64 @@
> >>> +#! /bin/bash
> >>> +# SPDX-License-Identifier: GPL-2.0
> >>> +# Copyright (C) 2020 SUSE Linux Products GmbH. All Rights Reserved.
> >>> +#
> >>> +# FS QA Test 213
> >>> +#
> >>> +# Test if canceling a running balance can lead to dead looping balance
> >>> +#
> >>> +seq=`basename $0`
> >>> +seqres=$RESULT_DIR/$seq
> >>> +echo "QA output created by $seq"
> >>> +
> >>> +here=`pwd`
> >>> +tmp=/tmp/$$
> >>> +status=1       # failure is the default!
> >>> +trap "_cleanup; exit \$status" 0 1 2 3 15
> >>> +
> >>> +_cleanup()
> >>> +{
> >>> +       cd /
> >>> +       rm -f $tmp.*
> >>> +}
> >>> +
> >>> +# get standard environment, filters and checks
> >>> +. ./common/rc
> >>> +. ./common/filter
> >>> +
> >>> +# remove previous $seqres.full before test
> >>> +rm -f $seqres.full
> >>> +
> >>> +# Modify as appropriate.
> >>> +_supported_fs btrfs
> >>> +_supported_os Linux
> >>> +_require_scratch
> >>> +_require_command "$KILLALL_PROG" killall
> >>> +
> >>> +_scratch_mkfs >> $seqres.full
> >>> +_scratch_mount
> >>> +
> >>> +runtime=4
> >>> +
> >>> +# Create enough IO so that we need around $runtime seconds to relocate it
> >>> +dd if=/dev/zero bs=1M of="$SCRATCH_MNT/file" oflag=sync status=none \
> >>> +       &> /dev/null &
> >>> +dd_pid=$!
> >>> +sleep $runtime
> >>> +"$KILLALL_PROG" -q dd &> /dev/null
> >>
> >> Do we really need the killall program? There's only one dd process.
> >>
> >> We should also kill the dd process at _cleanup(), as killing the test
> >> during the sleep above will result in the dd process not being killed.
> >
> > The main problem here is, I can't find a good way to kill dd.
> > plain 'kill $dd_pid' doesn't seem to kill it properly, as my debug
> > lsof/ps still shows dd running and failed to unmount the fs.
> >
> > 'kill -KILL $dd_pid' kills it well, but causes extra output for the
> > terminated dd.
> >
> > Only 'killall -q dd' works as expected.
> >
> > Any good advice on this?
>
> Oh, the fstests "kindly" overrides dd, "kill $dd_pid" only kills the
> child bash running that dd function, not the real dd command.
> I'll use xfs_io to ensure we get no extra wrapper.

Indeed, I was doing this in the test:

ps -eo pid,ppid,pgid,comm > /tmp/ps.txt
my_pid=$$
echo "dd_pid = $dd_pid mypid = $my_pid" >> /tmp/ps.txt
kill -9 $dd_pid
wait $dd_pid

And was noticing that the ps.txt file had:

23467   937 23467 check
...
23840 23467 23467 213
...
24034 23840 23467 213
24039 24034 23467 dd
...
dd_pid = 24034 mypid = 23840

So dd_pid matched a bash process running the test, which is a child of
the bash process that invoked dd.

Before digging further I saw your reply and checked that common/rc
provides a function named dd() which is a wrapper to dd and it invokes
dd twice, whence you needed killall.
Interesting, wasn't aware of that wrapper.

Thanks.



>
> Thanks,
> Qu
>
> >
> >>
> >>> +wait $dd_pid
> >>> +
> >>> +# Now balance should take at least $runtime seconds, we can cancel it at
> >>> +# $runtime/2 to ensure a success cancel.
> >>> +$BTRFS_UTIL_PROG balance start -d --bg "$SCRATCH_MNT"
> >>> +sleep $(($runtime / 2))
> >>> +$BTRFS_UTIL_PROG balance cancel "$SCRATCH_MNT"
> >>> +
> >>> +# Now check if we can finish relocating metadata, which should finish very
> >>> +# quickly
> >>> +$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" >> $seqres.full
> >>
> >> Why redirect this one to $seqres.full and not the other balance? What
> >> kind of useful information it provides?
> >
> > Not really much, just an indicator to show that the balance finishes as
> > expected.
> > And we don't want to output it golden output, as mkfs change may create
> > more metadata chunks and cause difference.
> >
> > For other ones, like the one to be canceled, we don't really care that much.
> >
> > Thanks,
> > Qu
> >>
> >>> +
> >>> +echo "Silence is golden"
> >>> +
> >>> +# success, all done
> >>> +status=0
> >>> +exit
> >>> diff --git a/tests/btrfs/213.out b/tests/btrfs/213.out
> >>> new file mode 100644
> >>> index 00000000..bd8f2430
> >>> --- /dev/null
> >>> +++ b/tests/btrfs/213.out
> >>> @@ -0,0 +1,2 @@
> >>> +QA output created by 213
> >>> +Silence is golden
> >>> diff --git a/tests/btrfs/group b/tests/btrfs/group
> >>> index 8d65bddd..fe4d5bb3 100644
> >>> --- a/tests/btrfs/group
> >>> +++ b/tests/btrfs/group
> >>> @@ -215,3 +215,4 @@
> >>>  210 auto quick qgroup snapshot
> >>>  211 auto quick log prealloc
> >>>  212 auto balance dangerous
> >>> +213 auto fast balance dangerous
> >>
> >> fast -> quick
> >>
> >> Thanks.
> >>
> >>> --
> >>> 2.26.2
> >>>
> >>
> >>
> >
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel
  2020-05-20 10:02   ` Qu Wenruo
@ 2020-05-20 11:32     ` Qu Wenruo
  2020-05-20 12:38       ` Filipe Manana
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2020-05-20 11:32 UTC (permalink / raw)
  To: fdmanana, Qu Wenruo; +Cc: linux-btrfs, fstests


[-- Attachment #1.1: Type: text/plain, Size: 4747 bytes --]



On 2020/5/20 下午6:02, Qu Wenruo wrote:
> 
> 
> On 2020/5/20 下午5:29, Filipe Manana wrote:
>> On Wed, May 20, 2020 at 8:59 AM Qu Wenruo <wqu@suse.com> wrote:
>>>
>>> Test if canceling a running balance can cause later balance to dead
>>> loop.
>>>
>>> The ifx is titled "btrfs: relocation: Clear the DEAD_RELOC_TREE bit for
>>>  orphan roots to prevent runaway balance".
>>>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>> Changelog:
>>> v2:
>>> - Remove lsof debug output
>>> v3:
>>> - Remove ps debug output
>>> ---
>>>  tests/btrfs/213     | 64 +++++++++++++++++++++++++++++++++++++++++++++
>>>  tests/btrfs/213.out |  2 ++
>>>  tests/btrfs/group   |  1 +
>>>  3 files changed, 67 insertions(+)
>>>  create mode 100755 tests/btrfs/213
>>>  create mode 100644 tests/btrfs/213.out
>>>
>>> diff --git a/tests/btrfs/213 b/tests/btrfs/213
>>> new file mode 100755
>>> index 00000000..f56b0279
>>> --- /dev/null
>>> +++ b/tests/btrfs/213
>>> @@ -0,0 +1,64 @@
>>> +#! /bin/bash
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Copyright (C) 2020 SUSE Linux Products GmbH. All Rights Reserved.
>>> +#
>>> +# FS QA Test 213
>>> +#
>>> +# Test if canceling a running balance can lead to dead looping balance
>>> +#
>>> +seq=`basename $0`
>>> +seqres=$RESULT_DIR/$seq
>>> +echo "QA output created by $seq"
>>> +
>>> +here=`pwd`
>>> +tmp=/tmp/$$
>>> +status=1       # failure is the default!
>>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>>> +
>>> +_cleanup()
>>> +{
>>> +       cd /
>>> +       rm -f $tmp.*
>>> +}
>>> +
>>> +# get standard environment, filters and checks
>>> +. ./common/rc
>>> +. ./common/filter
>>> +
>>> +# remove previous $seqres.full before test
>>> +rm -f $seqres.full
>>> +
>>> +# Modify as appropriate.
>>> +_supported_fs btrfs
>>> +_supported_os Linux
>>> +_require_scratch
>>> +_require_command "$KILLALL_PROG" killall
>>> +
>>> +_scratch_mkfs >> $seqres.full
>>> +_scratch_mount
>>> +
>>> +runtime=4
>>> +
>>> +# Create enough IO so that we need around $runtime seconds to relocate it
>>> +dd if=/dev/zero bs=1M of="$SCRATCH_MNT/file" oflag=sync status=none \
>>> +       &> /dev/null &
>>> +dd_pid=$!
>>> +sleep $runtime
>>> +"$KILLALL_PROG" -q dd &> /dev/null
>>
>> Do we really need the killall program? There's only one dd process.
>>
>> We should also kill the dd process at _cleanup(), as killing the test
>> during the sleep above will result in the dd process not being killed.
> 
> The main problem here is, I can't find a good way to kill dd.
> plain 'kill $dd_pid' doesn't seem to kill it properly, as my debug
> lsof/ps still shows dd running and failed to unmount the fs.
> 
> 'kill -KILL $dd_pid' kills it well, but causes extra output for the
> terminated dd.
> 
> Only 'killall -q dd' works as expected.
> 
> Any good advice on this?

Oh, the fstests "kindly" overrides dd, "kill $dd_pid" only kills the
child bash running that dd function, not the real dd command.
I'll use xfs_io to ensure we get no extra wrapper.

Thanks,
Qu

> 
>>
>>> +wait $dd_pid
>>> +
>>> +# Now balance should take at least $runtime seconds, we can cancel it at
>>> +# $runtime/2 to ensure a success cancel.
>>> +$BTRFS_UTIL_PROG balance start -d --bg "$SCRATCH_MNT"
>>> +sleep $(($runtime / 2))
>>> +$BTRFS_UTIL_PROG balance cancel "$SCRATCH_MNT"
>>> +
>>> +# Now check if we can finish relocating metadata, which should finish very
>>> +# quickly
>>> +$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" >> $seqres.full
>>
>> Why redirect this one to $seqres.full and not the other balance? What
>> kind of useful information it provides?
> 
> Not really much, just an indicator to show that the balance finishes as
> expected.
> And we don't want to output it golden output, as mkfs change may create
> more metadata chunks and cause difference.
> 
> For other ones, like the one to be canceled, we don't really care that much.
> 
> Thanks,
> Qu
>>
>>> +
>>> +echo "Silence is golden"
>>> +
>>> +# success, all done
>>> +status=0
>>> +exit
>>> diff --git a/tests/btrfs/213.out b/tests/btrfs/213.out
>>> new file mode 100644
>>> index 00000000..bd8f2430
>>> --- /dev/null
>>> +++ b/tests/btrfs/213.out
>>> @@ -0,0 +1,2 @@
>>> +QA output created by 213
>>> +Silence is golden
>>> diff --git a/tests/btrfs/group b/tests/btrfs/group
>>> index 8d65bddd..fe4d5bb3 100644
>>> --- a/tests/btrfs/group
>>> +++ b/tests/btrfs/group
>>> @@ -215,3 +215,4 @@
>>>  210 auto quick qgroup snapshot
>>>  211 auto quick log prealloc
>>>  212 auto balance dangerous
>>> +213 auto fast balance dangerous
>>
>> fast -> quick
>>
>> Thanks.
>>
>>> --
>>> 2.26.2
>>>
>>
>>
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel
  2020-05-20  9:29 ` Filipe Manana
@ 2020-05-20 10:02   ` Qu Wenruo
  2020-05-20 11:32     ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2020-05-20 10:02 UTC (permalink / raw)
  To: fdmanana, Qu Wenruo; +Cc: linux-btrfs, fstests


[-- Attachment #1.1: Type: text/plain, Size: 4322 bytes --]



On 2020/5/20 下午5:29, Filipe Manana wrote:
> On Wed, May 20, 2020 at 8:59 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> Test if canceling a running balance can cause later balance to dead
>> loop.
>>
>> The ifx is titled "btrfs: relocation: Clear the DEAD_RELOC_TREE bit for
>>  orphan roots to prevent runaway balance".
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> Changelog:
>> v2:
>> - Remove lsof debug output
>> v3:
>> - Remove ps debug output
>> ---
>>  tests/btrfs/213     | 64 +++++++++++++++++++++++++++++++++++++++++++++
>>  tests/btrfs/213.out |  2 ++
>>  tests/btrfs/group   |  1 +
>>  3 files changed, 67 insertions(+)
>>  create mode 100755 tests/btrfs/213
>>  create mode 100644 tests/btrfs/213.out
>>
>> diff --git a/tests/btrfs/213 b/tests/btrfs/213
>> new file mode 100755
>> index 00000000..f56b0279
>> --- /dev/null
>> +++ b/tests/btrfs/213
>> @@ -0,0 +1,64 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Copyright (C) 2020 SUSE Linux Products GmbH. All Rights Reserved.
>> +#
>> +# FS QA Test 213
>> +#
>> +# Test if canceling a running balance can lead to dead looping balance
>> +#
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +tmp=/tmp/$$
>> +status=1       # failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +_cleanup()
>> +{
>> +       cd /
>> +       rm -f $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +
>> +# remove previous $seqres.full before test
>> +rm -f $seqres.full
>> +
>> +# Modify as appropriate.
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +_require_command "$KILLALL_PROG" killall
>> +
>> +_scratch_mkfs >> $seqres.full
>> +_scratch_mount
>> +
>> +runtime=4
>> +
>> +# Create enough IO so that we need around $runtime seconds to relocate it
>> +dd if=/dev/zero bs=1M of="$SCRATCH_MNT/file" oflag=sync status=none \
>> +       &> /dev/null &
>> +dd_pid=$!
>> +sleep $runtime
>> +"$KILLALL_PROG" -q dd &> /dev/null
> 
> Do we really need the killall program? There's only one dd process.
> 
> We should also kill the dd process at _cleanup(), as killing the test
> during the sleep above will result in the dd process not being killed.

The main problem here is, I can't find a good way to kill dd.
plain 'kill $dd_pid' doesn't seem to kill it properly, as my debug
lsof/ps still shows dd running and failed to unmount the fs.

'kill -KILL $dd_pid' kills it well, but causes extra output for the
terminated dd.

Only 'killall -q dd' works as expected.

Any good advice on this?

> 
>> +wait $dd_pid
>> +
>> +# Now balance should take at least $runtime seconds, we can cancel it at
>> +# $runtime/2 to ensure a success cancel.
>> +$BTRFS_UTIL_PROG balance start -d --bg "$SCRATCH_MNT"
>> +sleep $(($runtime / 2))
>> +$BTRFS_UTIL_PROG balance cancel "$SCRATCH_MNT"
>> +
>> +# Now check if we can finish relocating metadata, which should finish very
>> +# quickly
>> +$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" >> $seqres.full
> 
> Why redirect this one to $seqres.full and not the other balance? What
> kind of useful information it provides?

Not really much, just an indicator to show that the balance finishes as
expected.
And we don't want to output it golden output, as mkfs change may create
more metadata chunks and cause difference.

For other ones, like the one to be canceled, we don't really care that much.

Thanks,
Qu
> 
>> +
>> +echo "Silence is golden"
>> +
>> +# success, all done
>> +status=0
>> +exit
>> diff --git a/tests/btrfs/213.out b/tests/btrfs/213.out
>> new file mode 100644
>> index 00000000..bd8f2430
>> --- /dev/null
>> +++ b/tests/btrfs/213.out
>> @@ -0,0 +1,2 @@
>> +QA output created by 213
>> +Silence is golden
>> diff --git a/tests/btrfs/group b/tests/btrfs/group
>> index 8d65bddd..fe4d5bb3 100644
>> --- a/tests/btrfs/group
>> +++ b/tests/btrfs/group
>> @@ -215,3 +215,4 @@
>>  210 auto quick qgroup snapshot
>>  211 auto quick log prealloc
>>  212 auto balance dangerous
>> +213 auto fast balance dangerous
> 
> fast -> quick
> 
> Thanks.
> 
>> --
>> 2.26.2
>>
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel
  2020-05-20  7:57 Qu Wenruo
  2020-05-20  8:23 ` Johannes Thumshirn
@ 2020-05-20  9:29 ` Filipe Manana
  2020-05-20 10:02   ` Qu Wenruo
  1 sibling, 1 reply; 8+ messages in thread
From: Filipe Manana @ 2020-05-20  9:29 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, fstests

On Wed, May 20, 2020 at 8:59 AM Qu Wenruo <wqu@suse.com> wrote:
>
> Test if canceling a running balance can cause later balance to dead
> loop.
>
> The ifx is titled "btrfs: relocation: Clear the DEAD_RELOC_TREE bit for
>  orphan roots to prevent runaway balance".
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Remove lsof debug output
> v3:
> - Remove ps debug output
> ---
>  tests/btrfs/213     | 64 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/213.out |  2 ++
>  tests/btrfs/group   |  1 +
>  3 files changed, 67 insertions(+)
>  create mode 100755 tests/btrfs/213
>  create mode 100644 tests/btrfs/213.out
>
> diff --git a/tests/btrfs/213 b/tests/btrfs/213
> new file mode 100755
> index 00000000..f56b0279
> --- /dev/null
> +++ b/tests/btrfs/213
> @@ -0,0 +1,64 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2020 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 213
> +#
> +# Test if canceling a running balance can lead to dead looping balance
> +#
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1       # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +       cd /
> +       rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# remove previous $seqres.full before test
> +rm -f $seqres.full
> +
> +# Modify as appropriate.
> +_supported_fs btrfs
> +_supported_os Linux
> +_require_scratch
> +_require_command "$KILLALL_PROG" killall
> +
> +_scratch_mkfs >> $seqres.full
> +_scratch_mount
> +
> +runtime=4
> +
> +# Create enough IO so that we need around $runtime seconds to relocate it
> +dd if=/dev/zero bs=1M of="$SCRATCH_MNT/file" oflag=sync status=none \
> +       &> /dev/null &
> +dd_pid=$!
> +sleep $runtime
> +"$KILLALL_PROG" -q dd &> /dev/null

Do we really need the killall program? There's only one dd process.

We should also kill the dd process at _cleanup(), as killing the test
during the sleep above will result in the dd process not being killed.

> +wait $dd_pid
> +
> +# Now balance should take at least $runtime seconds, we can cancel it at
> +# $runtime/2 to ensure a success cancel.
> +$BTRFS_UTIL_PROG balance start -d --bg "$SCRATCH_MNT"
> +sleep $(($runtime / 2))
> +$BTRFS_UTIL_PROG balance cancel "$SCRATCH_MNT"
> +
> +# Now check if we can finish relocating metadata, which should finish very
> +# quickly
> +$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" >> $seqres.full

Why redirect this one to $seqres.full and not the other balance? What
kind of useful information it provides?

> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/213.out b/tests/btrfs/213.out
> new file mode 100644
> index 00000000..bd8f2430
> --- /dev/null
> +++ b/tests/btrfs/213.out
> @@ -0,0 +1,2 @@
> +QA output created by 213
> +Silence is golden
> diff --git a/tests/btrfs/group b/tests/btrfs/group
> index 8d65bddd..fe4d5bb3 100644
> --- a/tests/btrfs/group
> +++ b/tests/btrfs/group
> @@ -215,3 +215,4 @@
>  210 auto quick qgroup snapshot
>  211 auto quick log prealloc
>  212 auto balance dangerous
> +213 auto fast balance dangerous

fast -> quick

Thanks.

> --
> 2.26.2
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel
  2020-05-20  7:57 Qu Wenruo
@ 2020-05-20  8:23 ` Johannes Thumshirn
  2020-05-20  9:29 ` Filipe Manana
  1 sibling, 0 replies; 8+ messages in thread
From: Johannes Thumshirn @ 2020-05-20  8:23 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, fstests

On 20/05/2020 09:57, Qu Wenruo wrote:
> The ifx is titled "btrfs: relocation: Clear the DEAD_RELOC_TREE bit for

Nit: s/ifx/fix/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel
@ 2020-05-20  7:57 Qu Wenruo
  2020-05-20  8:23 ` Johannes Thumshirn
  2020-05-20  9:29 ` Filipe Manana
  0 siblings, 2 replies; 8+ messages in thread
From: Qu Wenruo @ 2020-05-20  7:57 UTC (permalink / raw)
  To: linux-btrfs, fstests

Test if canceling a running balance can cause later balance to dead
loop.

The ifx is titled "btrfs: relocation: Clear the DEAD_RELOC_TREE bit for
 orphan roots to prevent runaway balance".

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Remove lsof debug output
v3:
- Remove ps debug output
---
 tests/btrfs/213     | 64 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/213.out |  2 ++
 tests/btrfs/group   |  1 +
 3 files changed, 67 insertions(+)
 create mode 100755 tests/btrfs/213
 create mode 100644 tests/btrfs/213.out

diff --git a/tests/btrfs/213 b/tests/btrfs/213
new file mode 100755
index 00000000..f56b0279
--- /dev/null
+++ b/tests/btrfs/213
@@ -0,0 +1,64 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2020 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test 213
+#
+# Test if canceling a running balance can lead to dead looping balance
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# Modify as appropriate.
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_command "$KILLALL_PROG" killall
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+
+runtime=4
+
+# Create enough IO so that we need around $runtime seconds to relocate it
+dd if=/dev/zero bs=1M of="$SCRATCH_MNT/file" oflag=sync status=none \
+	&> /dev/null &
+dd_pid=$!
+sleep $runtime
+"$KILLALL_PROG" -q dd &> /dev/null
+wait $dd_pid
+
+# Now balance should take at least $runtime seconds, we can cancel it at
+# $runtime/2 to ensure a success cancel.
+$BTRFS_UTIL_PROG balance start -d --bg "$SCRATCH_MNT"
+sleep $(($runtime / 2))
+$BTRFS_UTIL_PROG balance cancel "$SCRATCH_MNT"
+
+# Now check if we can finish relocating metadata, which should finish very
+# quickly
+$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" >> $seqres.full
+
+echo "Silence is golden"
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/213.out b/tests/btrfs/213.out
new file mode 100644
index 00000000..bd8f2430
--- /dev/null
+++ b/tests/btrfs/213.out
@@ -0,0 +1,2 @@
+QA output created by 213
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 8d65bddd..fe4d5bb3 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -215,3 +215,4 @@
 210 auto quick qgroup snapshot
 211 auto quick log prealloc
 212 auto balance dangerous
+213 auto fast balance dangerous
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-05-20 12:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-20  7:43 [PATCH v3] btrfs: Add a test for dead looping balance after balance cancel Qu Wenruo
2020-05-20  7:48 ` Nikolay Borisov
2020-05-20  7:57 Qu Wenruo
2020-05-20  8:23 ` Johannes Thumshirn
2020-05-20  9:29 ` Filipe Manana
2020-05-20 10:02   ` Qu Wenruo
2020-05-20 11:32     ` Qu Wenruo
2020-05-20 12:38       ` Filipe Manana

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).