All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC] fstests: btrfs: add a tests case to make sure btrfs can handle certain interleaved free space correctly
@ 2022-07-18  6:18 Qu Wenruo
  2022-07-18 17:59 ` Zorro Lang
  0 siblings, 1 reply; 4+ messages in thread
From: Qu Wenruo @ 2022-07-18  6:18 UTC (permalink / raw)
  To: linux-btrfs, fstests; +Cc: Johannes Thumshirn

This is a future-proof test mostly for future zoned raid-stripe-tree
(RST) and P/Q COW based RAID56 implementation.

Unlike regular devices, zoned device can not do overwrite without
resetting (reclaim) a whole zone.

And for the RST and P/Q COW based RAID56, the idea is to CoW the parity
stripe to other location.

But all above behaviors introduce some limitation, if we fill the fs,
then free half of the space interleaved.

- For basic zoned btrfs (aka SINGLE profile for now)
  Normally this means we have no free space at all.

  Thankfully zoned btrfs has GC and reserved zones to reclaim those
  half filled zones.
  In theory we should be able to do new writes.

- For future RST with P/Q CoW for RAID56, on non-zoned device.
  This is more complex, in this case, we should have the following
  full stripe layout for every full stripe:
          0                           64K
  Disk A  |XXXXXXXXXXXXXXXXXXXXXXXXXXX| (Data 1)
  Disk B  |                           | (Data 2)
  Disk C  |XXXXXXXXXXXXXXXXXXXXXXXXXXX| (P stripe)

  Although in theory we can write into Disk B, but we have to find
  a free space for the new Parity.

  But all other full stripe are like this, which means we're deadlocking
  to find a pure free space without sub-stripe writing.

  This means, even for non-zoned btrfs, we still need GC and reserved
  space to handle P/Q CoW properly.

Another thing specific to this test case is, to reduce the runtime, I
use 256M as the mkfs size for each device.
(A full run with KASAN enabled kernel already takes over 700 seconds)

So far this can only works for non-zoned disks, as 256M is too small for
zoned devices to have enough zones.

Thus need extra advice from zoned device guys.

Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/btrfs/261     | 129 ++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/261.out |   2 +
 2 files changed, 131 insertions(+)
 create mode 100755 tests/btrfs/261
 create mode 100644 tests/btrfs/261.out

diff --git a/tests/btrfs/261 b/tests/btrfs/261
new file mode 100755
index 00000000..01da4759
--- /dev/null
+++ b/tests/btrfs/261
@@ -0,0 +1,129 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test 261
+#
+# Make sure all supported profiles (including future zoned RAID56) have proper
+# way to handle fs with interleaved filled space, and can still write data
+# into the fs.
+#
+# This is mostly inspired by some discussion on P/Q COW for RAID56, even for
+# regular devices, this can be problematic if we fill the fs then delete
+# half of the extents interleavedly. Without proper GC and extra reserved
+# space, such CoW P/Q way should run out of space (even one data stripe is
+# free, there is no place to CoW its P/Q).
+#
+. ./common/preamble
+_begin_fstest auto enospc raid
+
+# Override the default cleanup function.
+# _cleanup()
+# {
+# 	cd /
+# 	rm -r -f $tmp.*
+# }
+
+# Import common functions.
+. ./common/filter
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs btrfs
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+
+fill_fs()
+{
+	for (( i = 0;; i += 2 )); do
+		$XFS_IO_PROG -f -c "pwrite 0 64K" $SCRATCH_MNT/file_$i \
+			&> /dev/null
+		if [ $? -ne 0 ]; then
+			break
+		fi
+		$XFS_IO_PROG -f -c "pwrite 0 64K" $SCRATCH_MNT/file_$(($i + 1)) \
+			&> /dev/null
+		if [ $? -ne 0 ]; then
+			break
+		fi
+
+		# Only sync after data 1M writes.
+		if [ $(( $i % 8)) -eq 0 ]; then
+			sync
+		fi
+	done
+
+	# Sync what hasn't yet synced.
+	sync
+	
+	echo "fs filled with $i full stripe write" >> $seqres.full
+
+	# Delete half of the files created above, which should leave
+	# the fs half empty. For RAID56 this would leave all of its full
+	# stripes to be have one full data stripe, one free data stripe,
+	# and one P/Q stripe still in use.
+	rm -rf -- $SCRATCH_MNT/file_*[02468]
+	
+	# Sync to make sure above deleted files really got freed.
+	sync
+}
+
+run_test()
+{
+	local profile=$1
+	local nr_dev=$2
+
+	echo "=== profile=$profile nr_dev=$nr_dev ===" >> $seqres.full
+	_scratch_dev_pool_get $nr_dev
+	# -b is for each device.
+	# Here we use 256M to reduce the runtime.
+	_scratch_pool_mkfs -b 256M -m$profile -d$profile >>$seqres.full 2>&1
+	# make sure we created btrfs with desired options
+	if [ $? -ne 0 ]; then
+		echo "mkfs $mkfs_opts failed"
+		return
+	fi
+	_scratch_mount >>$seqres.full 2>&1
+
+	fill_fs
+
+	# Now try to write 4M data, with the fs half empty we should be
+	# able to do that.
+	# For zoned devices, this will test if the GC and reserved zones
+	# can handle such cases properly.
+	$XFS_IO_PROG -f -c "pwrite 0 4M" -c sync $SCRATCH_MNT/final_write \
+		>> $seqres.full 2>&1
+	if [ $? -ne 0 ]; then
+		echo "The final write failed"
+	fi
+
+	_scratch_unmount
+	# we called _require_scratch_nocheck instead of _require_scratch
+	# do check after test for each profile config
+	_check_scratch_fs
+	echo  >> $seqres.full
+	_scratch_dev_pool_put
+}
+
+# Here we don't use _btrfs_profile_configs as that doesn't include
+# the number of devices, but for full stripe writes for RAID56, we
+# need to ensure nr_data must be 2, so here we manually specify
+# the profile and number of devices.
+run_test "single" "1"
+
+# Zoned only support
+if _scratch_btrfs_is_zoned; then
+	exit
+fi
+
+run_test "raid0" "2"
+run_test "raid1" "2"
+run_test "raid10" "4"
+run_test "raid5" "3"
+run_test "raid6" "4"
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/btrfs/261.out b/tests/btrfs/261.out
new file mode 100644
index 00000000..679ddc0f
--- /dev/null
+++ b/tests/btrfs/261.out
@@ -0,0 +1,2 @@
+QA output created by 261
+Silence is golden
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH RFC] fstests: btrfs: add a tests case to make sure btrfs can handle certain interleaved free space correctly
  2022-07-18  6:18 [PATCH RFC] fstests: btrfs: add a tests case to make sure btrfs can handle certain interleaved free space correctly Qu Wenruo
@ 2022-07-18 17:59 ` Zorro Lang
  2022-07-19  5:16   ` Qu Wenruo
  0 siblings, 1 reply; 4+ messages in thread
From: Zorro Lang @ 2022-07-18 17:59 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, fstests, Johannes Thumshirn

On Mon, Jul 18, 2022 at 02:18:23PM +0800, Qu Wenruo wrote:
> This is a future-proof test mostly for future zoned raid-stripe-tree
> (RST) and P/Q COW based RAID56 implementation.
> 
> Unlike regular devices, zoned device can not do overwrite without
> resetting (reclaim) a whole zone.
> 
> And for the RST and P/Q COW based RAID56, the idea is to CoW the parity
> stripe to other location.
> 
> But all above behaviors introduce some limitation, if we fill the fs,
> then free half of the space interleaved.
> 
> - For basic zoned btrfs (aka SINGLE profile for now)
>   Normally this means we have no free space at all.
> 
>   Thankfully zoned btrfs has GC and reserved zones to reclaim those
>   half filled zones.
>   In theory we should be able to do new writes.
> 
> - For future RST with P/Q CoW for RAID56, on non-zoned device.
>   This is more complex, in this case, we should have the following
>   full stripe layout for every full stripe:
>           0                           64K
>   Disk A  |XXXXXXXXXXXXXXXXXXXXXXXXXXX| (Data 1)
>   Disk B  |                           | (Data 2)
>   Disk C  |XXXXXXXXXXXXXXXXXXXXXXXXXXX| (P stripe)
> 
>   Although in theory we can write into Disk B, but we have to find
>   a free space for the new Parity.
> 
>   But all other full stripe are like this, which means we're deadlocking
>   to find a pure free space without sub-stripe writing.
> 
>   This means, even for non-zoned btrfs, we still need GC and reserved
>   space to handle P/Q CoW properly.
> 
> Another thing specific to this test case is, to reduce the runtime, I
> use 256M as the mkfs size for each device.
> (A full run with KASAN enabled kernel already takes over 700 seconds)
> 
> So far this can only works for non-zoned disks, as 256M is too small for
> zoned devices to have enough zones.
> 
> Thus need extra advice from zoned device guys.
> 
> Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---

I think this patch need more review from btrfs list. I just review this patch
from fstests side as below ...

>  tests/btrfs/261     | 129 ++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/261.out |   2 +
>  2 files changed, 131 insertions(+)
>  create mode 100755 tests/btrfs/261
>  create mode 100644 tests/btrfs/261.out
> 
> diff --git a/tests/btrfs/261 b/tests/btrfs/261
> new file mode 100755
> index 00000000..01da4759
> --- /dev/null
> +++ b/tests/btrfs/261
> @@ -0,0 +1,129 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 261
> +#
> +# Make sure all supported profiles (including future zoned RAID56) have proper
> +# way to handle fs with interleaved filled space, and can still write data
> +# into the fs.
> +#
> +# This is mostly inspired by some discussion on P/Q COW for RAID56, even for
> +# regular devices, this can be problematic if we fill the fs then delete
> +# half of the extents interleavedly. Without proper GC and extra reserved
> +# space, such CoW P/Q way should run out of space (even one data stripe is
> +# free, there is no place to CoW its P/Q).
> +#
> +. ./common/preamble
> +_begin_fstest auto enospc raid
> +
> +# Override the default cleanup function.
> +# _cleanup()
> +# {
> +# 	cd /
> +# 	rm -r -f $tmp.*
> +# }

This _cleanup looks like nothing special, you can remove it, to use the default
one.

> +
> +# Import common functions.
> +. ./common/filter
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
     ^^^
   Remove this line please.

> +_supported_fs btrfs
> +# we check scratch dev after each loop
> +_require_scratch_nocheck
> +_require_scratch_dev_pool 4
> +
> +fill_fs()

There's a help named _fill_fs() in common/populate file. I'm not sure if there
are special things in your fill_fs function, better to check if our common
helper can help you?

> +{
> +	for (( i = 0;; i += 2 )); do
> +		$XFS_IO_PROG -f -c "pwrite 0 64K" $SCRATCH_MNT/file_$i \
> +			&> /dev/null
> +		if [ $? -ne 0 ]; then
> +			break
> +		fi
> +		$XFS_IO_PROG -f -c "pwrite 0 64K" $SCRATCH_MNT/file_$(($i + 1)) \
> +			&> /dev/null
> +		if [ $? -ne 0 ]; then
> +			break
> +		fi
> +
> +		# Only sync after data 1M writes.
> +		if [ $(( $i % 8)) -eq 0 ]; then
> +			sync
> +		fi
> +	done
> +
> +	# Sync what hasn't yet synced.
> +	sync
> +	
> +	echo "fs filled with $i full stripe write" >> $seqres.full
> +
> +	# Delete half of the files created above, which should leave
> +	# the fs half empty. For RAID56 this would leave all of its full
> +	# stripes to be have one full data stripe, one free data stripe,
> +	# and one P/Q stripe still in use.
> +	rm -rf -- $SCRATCH_MNT/file_*[02468]
> +	
> +	# Sync to make sure above deleted files really got freed.
> +	sync
> +}
> +
> +run_test()
> +{
> +	local profile=$1
> +	local nr_dev=$2
> +
> +	echo "=== profile=$profile nr_dev=$nr_dev ===" >> $seqres.full
> +	_scratch_dev_pool_get $nr_dev
> +	# -b is for each device.
> +	# Here we use 256M to reduce the runtime.
> +	_scratch_pool_mkfs -b 256M -m$profile -d$profile >>$seqres.full 2>&1

Do you need to make sure this mkfs successed at here?

> +	# make sure we created btrfs with desired options
> +	if [ $? -ne 0 ]; then
> +		echo "mkfs $mkfs_opts failed"
> +		return
> +	fi
> +	_scratch_mount >>$seqres.full 2>&1

If _scratch_mount fails, the testing will exit directly. So generally we don't
need to fill out stdout/stderr. Or you actually want to use _try_scratch_mount
at here?

> +
> +	fill_fs
> +
> +	# Now try to write 4M data, with the fs half empty we should be
> +	# able to do that.
> +	# For zoned devices, this will test if the GC and reserved zones
> +	# can handle such cases properly.
> +	$XFS_IO_PROG -f -c "pwrite 0 4M" -c sync $SCRATCH_MNT/final_write \
> +		>> $seqres.full 2>&1
> +	if [ $? -ne 0 ]; then
> +		echo "The final write failed"
> +	fi
> +
> +	_scratch_unmount
> +	# we called _require_scratch_nocheck instead of _require_scratch
> +	# do check after test for each profile config
> +	_check_scratch_fs
> +	echo  >> $seqres.full
> +	_scratch_dev_pool_put
> +}
> +
> +# Here we don't use _btrfs_profile_configs as that doesn't include
> +# the number of devices, but for full stripe writes for RAID56, we
> +# need to ensure nr_data must be 2, so here we manually specify
> +# the profile and number of devices.
> +run_test "single" "1"
> +
> +# Zoned only support
> +if _scratch_btrfs_is_zoned; then
> +	exit

I think this "exit" will fail this test directly, due to status=1 currectly.
You can use _require_non_zoned_device() to run this case for non-zoned device
only. Or

  if ! _scratch_btrfs_is_zoned;then
	run_test "raid0" "2"
	run_test "raid1" "2"
	run_test "raid10" "4"
	run_test "raid5" "3"
	run_test "raid6" "4"
  fi

As this case is "Silence is golden".

I'm not sure what do you really need at here, can these help?

Thanks,
Zorro

> +fi
> +
> +run_test "raid0" "2"
> +run_test "raid1" "2"
> +run_test "raid10" "4"
> +run_test "raid5" "3"
> +run_test "raid6" "4"
> +
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/btrfs/261.out b/tests/btrfs/261.out
> new file mode 100644
> index 00000000..679ddc0f
> --- /dev/null
> +++ b/tests/btrfs/261.out
> @@ -0,0 +1,2 @@
> +QA output created by 261
> +Silence is golden
> -- 
> 2.36.1
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH RFC] fstests: btrfs: add a tests case to make sure btrfs can handle certain interleaved free space correctly
  2022-07-18 17:59 ` Zorro Lang
@ 2022-07-19  5:16   ` Qu Wenruo
  2022-07-19  5:52     ` Zorro Lang
  0 siblings, 1 reply; 4+ messages in thread
From: Qu Wenruo @ 2022-07-19  5:16 UTC (permalink / raw)
  To: Zorro Lang, Qu Wenruo; +Cc: linux-btrfs, fstests, Johannes Thumshirn



On 2022/7/19 01:59, Zorro Lang wrote:
> On Mon, Jul 18, 2022 at 02:18:23PM +0800, Qu Wenruo wrote:
>> This is a future-proof test mostly for future zoned raid-stripe-tree
>> (RST) and P/Q COW based RAID56 implementation.
>>
>> Unlike regular devices, zoned device can not do overwrite without
>> resetting (reclaim) a whole zone.
>>
>> And for the RST and P/Q COW based RAID56, the idea is to CoW the parity
>> stripe to other location.
>>
>> But all above behaviors introduce some limitation, if we fill the fs,
>> then free half of the space interleaved.
>>
>> - For basic zoned btrfs (aka SINGLE profile for now)
>>    Normally this means we have no free space at all.
>>
>>    Thankfully zoned btrfs has GC and reserved zones to reclaim those
>>    half filled zones.
>>    In theory we should be able to do new writes.
>>
>> - For future RST with P/Q CoW for RAID56, on non-zoned device.
>>    This is more complex, in this case, we should have the following
>>    full stripe layout for every full stripe:
>>            0                           64K
>>    Disk A  |XXXXXXXXXXXXXXXXXXXXXXXXXXX| (Data 1)
>>    Disk B  |                           | (Data 2)
>>    Disk C  |XXXXXXXXXXXXXXXXXXXXXXXXXXX| (P stripe)
>>
>>    Although in theory we can write into Disk B, but we have to find
>>    a free space for the new Parity.
>>
>>    But all other full stripe are like this, which means we're deadlocking
>>    to find a pure free space without sub-stripe writing.
>>
>>    This means, even for non-zoned btrfs, we still need GC and reserved
>>    space to handle P/Q CoW properly.
>>
>> Another thing specific to this test case is, to reduce the runtime, I
>> use 256M as the mkfs size for each device.
>> (A full run with KASAN enabled kernel already takes over 700 seconds)
>>
>> So far this can only works for non-zoned disks, as 256M is too small for
>> zoned devices to have enough zones.
>>
>> Thus need extra advice from zoned device guys.
>>
>> Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>
> I think this patch need more review from btrfs list. I just review this patch
> from fstests side as below ...
>
>>   tests/btrfs/261     | 129 ++++++++++++++++++++++++++++++++++++++++++++
>>   tests/btrfs/261.out |   2 +
>>   2 files changed, 131 insertions(+)
>>   create mode 100755 tests/btrfs/261
>>   create mode 100644 tests/btrfs/261.out
>>
>> diff --git a/tests/btrfs/261 b/tests/btrfs/261
>> new file mode 100755
>> index 00000000..01da4759
>> --- /dev/null
>> +++ b/tests/btrfs/261
>> @@ -0,0 +1,129 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
>> +#
>> +# FS QA Test 261
>> +#
>> +# Make sure all supported profiles (including future zoned RAID56) have proper
>> +# way to handle fs with interleaved filled space, and can still write data
>> +# into the fs.
>> +#
>> +# This is mostly inspired by some discussion on P/Q COW for RAID56, even for
>> +# regular devices, this can be problematic if we fill the fs then delete
>> +# half of the extents interleavedly. Without proper GC and extra reserved
>> +# space, such CoW P/Q way should run out of space (even one data stripe is
>> +# free, there is no place to CoW its P/Q).
>> +#
>> +. ./common/preamble
>> +_begin_fstest auto enospc raid
>> +
>> +# Override the default cleanup function.
>> +# _cleanup()
>> +# {
>> +# 	cd /
>> +# 	rm -r -f $tmp.*
>> +# }
>
> This _cleanup looks like nothing special, you can remove it, to use the default
> one.

It's still commented out, just from the template.

Or you mean I should delete the unused cleanup function if we don't need?

>
>> +
>> +# Import common functions.
>> +. ./common/filter
>> +
>> +# real QA test starts here
>> +
>> +# Modify as appropriate.
>       ^^^
>     Remove this line please.
>
>> +_supported_fs btrfs
>> +# we check scratch dev after each loop
>> +_require_scratch_nocheck
>> +_require_scratch_dev_pool 4
>> +
>> +fill_fs()
>
> There's a help named _fill_fs() in common/populate file. I'm not sure if there
> are special things in your fill_fs function, better to check if our common
> helper can help you?

The fill fs here is to make sure we fill the fs in a specific way
(always fill the fs using 128KiB, while still being able to delete 64KiB).

I'll add a comment for the reason.

>
>> +{
>> +	for (( i = 0;; i += 2 )); do
>> +		$XFS_IO_PROG -f -c "pwrite 0 64K" $SCRATCH_MNT/file_$i \
>> +			&> /dev/null
>> +		if [ $? -ne 0 ]; then
>> +			break
>> +		fi
>> +		$XFS_IO_PROG -f -c "pwrite 0 64K" $SCRATCH_MNT/file_$(($i + 1)) \
>> +			&> /dev/null
>> +		if [ $? -ne 0 ]; then
>> +			break
>> +		fi
>> +
>> +		# Only sync after data 1M writes.
>> +		if [ $(( $i % 8)) -eq 0 ]; then
>> +			sync
>> +		fi
>> +	done
>> +
>> +	# Sync what hasn't yet synced.
>> +	sync
>> +
>> +	echo "fs filled with $i full stripe write" >> $seqres.full
>> +
>> +	# Delete half of the files created above, which should leave
>> +	# the fs half empty. For RAID56 this would leave all of its full
>> +	# stripes to be have one full data stripe, one free data stripe,
>> +	# and one P/Q stripe still in use.
>> +	rm -rf -- $SCRATCH_MNT/file_*[02468]
>> +
>> +	# Sync to make sure above deleted files really got freed.
>> +	sync
>> +}
>> +
>> +run_test()
>> +{
>> +	local profile=$1
>> +	local nr_dev=$2
>> +
>> +	echo "=== profile=$profile nr_dev=$nr_dev ===" >> $seqres.full
>> +	_scratch_dev_pool_get $nr_dev
>> +	# -b is for each device.
>> +	# Here we use 256M to reduce the runtime.
>> +	_scratch_pool_mkfs -b 256M -m$profile -d$profile >>$seqres.full 2>&1
>
> Do you need to make sure this mkfs successed at here?

Yes.

>
>> +	# make sure we created btrfs with desired options
>> +	if [ $? -ne 0 ]; then
>> +		echo "mkfs $mkfs_opts failed"
>> +		return
>> +	fi
>> +	_scratch_mount >>$seqres.full 2>&1
>
> If _scratch_mount fails, the testing will exit directly. So generally we don't
> need to fill out stdout/stderr. Or you actually want to use _try_scratch_mount
> at here?

_scratch_mount is exactly what I need, I'll just remove the unnecessary
redirection.

>
>> +
>> +	fill_fs
>> +
>> +	# Now try to write 4M data, with the fs half empty we should be
>> +	# able to do that.
>> +	# For zoned devices, this will test if the GC and reserved zones
>> +	# can handle such cases properly.
>> +	$XFS_IO_PROG -f -c "pwrite 0 4M" -c sync $SCRATCH_MNT/final_write \
>> +		>> $seqres.full 2>&1
>> +	if [ $? -ne 0 ]; then
>> +		echo "The final write failed"
>> +	fi
>> +
>> +	_scratch_unmount
>> +	# we called _require_scratch_nocheck instead of _require_scratch
>> +	# do check after test for each profile config
>> +	_check_scratch_fs
>> +	echo  >> $seqres.full
>> +	_scratch_dev_pool_put
>> +}
>> +
>> +# Here we don't use _btrfs_profile_configs as that doesn't include
>> +# the number of devices, but for full stripe writes for RAID56, we
>> +# need to ensure nr_data must be 2, so here we manually specify
>> +# the profile and number of devices.
>> +run_test "single" "1"
>> +
>> +# Zoned only support
>> +if _scratch_btrfs_is_zoned; then
>> +	exit
>
> I think this "exit" will fail this test directly, due to status=1 currectly.
> You can use _require_non_zoned_device() to run this case for non-zoned device
> only. Or
>
>    if ! _scratch_btrfs_is_zoned;then
> 	run_test "raid0" "2"
> 	run_test "raid1" "2"
> 	run_test "raid10" "4"
> 	run_test "raid5" "3"
> 	run_test "raid6" "4"
>    fi
>
> As this case is "Silence is golden".
>
> I'm not sure what do you really need at here, can these help?

My bad, I forgot to finish the comment, and your example is perfect.

Thanks for the review.
Qu

>
> Thanks,
> Zorro
>
>> +fi
>> +
>> +run_test "raid0" "2"
>> +run_test "raid1" "2"
>> +run_test "raid10" "4"
>> +run_test "raid5" "3"
>> +run_test "raid6" "4"
>> +
>> +echo "Silence is golden"
>> +status=0
>> +exit
>> diff --git a/tests/btrfs/261.out b/tests/btrfs/261.out
>> new file mode 100644
>> index 00000000..679ddc0f
>> --- /dev/null
>> +++ b/tests/btrfs/261.out
>> @@ -0,0 +1,2 @@
>> +QA output created by 261
>> +Silence is golden
>> --
>> 2.36.1
>>
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH RFC] fstests: btrfs: add a tests case to make sure btrfs can handle certain interleaved free space correctly
  2022-07-19  5:16   ` Qu Wenruo
@ 2022-07-19  5:52     ` Zorro Lang
  0 siblings, 0 replies; 4+ messages in thread
From: Zorro Lang @ 2022-07-19  5:52 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Qu Wenruo, linux-btrfs, fstests, Johannes Thumshirn

On Tue, Jul 19, 2022 at 01:16:54PM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/7/19 01:59, Zorro Lang wrote:
> > On Mon, Jul 18, 2022 at 02:18:23PM +0800, Qu Wenruo wrote:
> > > This is a future-proof test mostly for future zoned raid-stripe-tree
> > > (RST) and P/Q COW based RAID56 implementation.
> > > 
> > > Unlike regular devices, zoned device can not do overwrite without
> > > resetting (reclaim) a whole zone.
> > > 
> > > And for the RST and P/Q COW based RAID56, the idea is to CoW the parity
> > > stripe to other location.
> > > 
> > > But all above behaviors introduce some limitation, if we fill the fs,
> > > then free half of the space interleaved.
> > > 
> > > - For basic zoned btrfs (aka SINGLE profile for now)
> > >    Normally this means we have no free space at all.
> > > 
> > >    Thankfully zoned btrfs has GC and reserved zones to reclaim those
> > >    half filled zones.
> > >    In theory we should be able to do new writes.
> > > 
> > > - For future RST with P/Q CoW for RAID56, on non-zoned device.
> > >    This is more complex, in this case, we should have the following
> > >    full stripe layout for every full stripe:
> > >            0                           64K
> > >    Disk A  |XXXXXXXXXXXXXXXXXXXXXXXXXXX| (Data 1)
> > >    Disk B  |                           | (Data 2)
> > >    Disk C  |XXXXXXXXXXXXXXXXXXXXXXXXXXX| (P stripe)
> > > 
> > >    Although in theory we can write into Disk B, but we have to find
> > >    a free space for the new Parity.
> > > 
> > >    But all other full stripe are like this, which means we're deadlocking
> > >    to find a pure free space without sub-stripe writing.
> > > 
> > >    This means, even for non-zoned btrfs, we still need GC and reserved
> > >    space to handle P/Q CoW properly.
> > > 
> > > Another thing specific to this test case is, to reduce the runtime, I
> > > use 256M as the mkfs size for each device.
> > > (A full run with KASAN enabled kernel already takes over 700 seconds)
> > > 
> > > So far this can only works for non-zoned disks, as 256M is too small for
> > > zoned devices to have enough zones.
> > > 
> > > Thus need extra advice from zoned device guys.
> > > 
> > > Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
> > > Signed-off-by: Qu Wenruo <wqu@suse.com>
> > > ---
> > 
> > I think this patch need more review from btrfs list. I just review this patch
> > from fstests side as below ...
> > 
> > >   tests/btrfs/261     | 129 ++++++++++++++++++++++++++++++++++++++++++++
> > >   tests/btrfs/261.out |   2 +
> > >   2 files changed, 131 insertions(+)
> > >   create mode 100755 tests/btrfs/261
> > >   create mode 100644 tests/btrfs/261.out
> > > 
> > > diff --git a/tests/btrfs/261 b/tests/btrfs/261
> > > new file mode 100755
> > > index 00000000..01da4759
> > > --- /dev/null
> > > +++ b/tests/btrfs/261
> > > @@ -0,0 +1,129 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
> > > +#
> > > +# FS QA Test 261
> > > +#
> > > +# Make sure all supported profiles (including future zoned RAID56) have proper
> > > +# way to handle fs with interleaved filled space, and can still write data
> > > +# into the fs.
> > > +#
> > > +# This is mostly inspired by some discussion on P/Q COW for RAID56, even for
> > > +# regular devices, this can be problematic if we fill the fs then delete
> > > +# half of the extents interleavedly. Without proper GC and extra reserved
> > > +# space, such CoW P/Q way should run out of space (even one data stripe is
> > > +# free, there is no place to CoW its P/Q).
> > > +#
> > > +. ./common/preamble
> > > +_begin_fstest auto enospc raid
> > > +
> > > +# Override the default cleanup function.
> > > +# _cleanup()
> > > +# {
> > > +# 	cd /
> > > +# 	rm -r -f $tmp.*
> > > +# }
> > 
> > This _cleanup looks like nothing special, you can remove it, to use the default
> > one.
> 
> It's still commented out, just from the template.
> 
> Or you mean I should delete the unused cleanup function if we don't need?

Sure, please remove those useless comments from template, they're just reminder.

> 
> > 
> > > +
> > > +# Import common functions.
> > > +. ./common/filter
> > > +
> > > +# real QA test starts here
> > > +
> > > +# Modify as appropriate.
> >       ^^^
> >     Remove this line please.
> > 
> > > +_supported_fs btrfs
> > > +# we check scratch dev after each loop
> > > +_require_scratch_nocheck
> > > +_require_scratch_dev_pool 4
> > > +
> > > +fill_fs()
> > 
> > There's a help named _fill_fs() in common/populate file. I'm not sure if there
> > are special things in your fill_fs function, better to check if our common
> > helper can help you?
> 
> The fill fs here is to make sure we fill the fs in a specific way
> (always fill the fs using 128KiB, while still being able to delete 64KiB).
> 
> I'll add a comment for the reason.

Sure, if the comment _fill_fs can't help you, feel free to have your own
one (and better to explain what's special in it).

> 
> > 
> > > +{
> > > +	for (( i = 0;; i += 2 )); do
> > > +		$XFS_IO_PROG -f -c "pwrite 0 64K" $SCRATCH_MNT/file_$i \
> > > +			&> /dev/null
> > > +		if [ $? -ne 0 ]; then
> > > +			break
> > > +		fi
> > > +		$XFS_IO_PROG -f -c "pwrite 0 64K" $SCRATCH_MNT/file_$(($i + 1)) \
> > > +			&> /dev/null
> > > +		if [ $? -ne 0 ]; then
> > > +			break
> > > +		fi
> > > +
> > > +		# Only sync after data 1M writes.
> > > +		if [ $(( $i % 8)) -eq 0 ]; then
> > > +			sync
> > > +		fi
> > > +	done
> > > +
> > > +	# Sync what hasn't yet synced.
> > > +	sync
> > > +
> > > +	echo "fs filled with $i full stripe write" >> $seqres.full
> > > +
> > > +	# Delete half of the files created above, which should leave
> > > +	# the fs half empty. For RAID56 this would leave all of its full
> > > +	# stripes to be have one full data stripe, one free data stripe,
> > > +	# and one P/Q stripe still in use.
> > > +	rm -rf -- $SCRATCH_MNT/file_*[02468]
> > > +
> > > +	# Sync to make sure above deleted files really got freed.
> > > +	sync
> > > +}
> > > +
> > > +run_test()
> > > +{
> > > +	local profile=$1
> > > +	local nr_dev=$2
> > > +
> > > +	echo "=== profile=$profile nr_dev=$nr_dev ===" >> $seqres.full
> > > +	_scratch_dev_pool_get $nr_dev
> > > +	# -b is for each device.
> > > +	# Here we use 256M to reduce the runtime.
> > > +	_scratch_pool_mkfs -b 256M -m$profile -d$profile >>$seqres.full 2>&1
> > 
> > Do you need to make sure this mkfs successed at here?
> 
> Yes.
> 
> > 
> > > +	# make sure we created btrfs with desired options
> > > +	if [ $? -ne 0 ]; then
> > > +		echo "mkfs $mkfs_opts failed"
> > > +		return
> > > +	fi
> > > +	_scratch_mount >>$seqres.full 2>&1
> > 
> > If _scratch_mount fails, the testing will exit directly. So generally we don't
> > need to fill out stdout/stderr. Or you actually want to use _try_scratch_mount
> > at here?
> 
> _scratch_mount is exactly what I need, I'll just remove the unnecessary
> redirection.

OK

> 
> > 
> > > +
> > > +	fill_fs
> > > +
> > > +	# Now try to write 4M data, with the fs half empty we should be
> > > +	# able to do that.
> > > +	# For zoned devices, this will test if the GC and reserved zones
> > > +	# can handle such cases properly.
> > > +	$XFS_IO_PROG -f -c "pwrite 0 4M" -c sync $SCRATCH_MNT/final_write \
> > > +		>> $seqres.full 2>&1
> > > +	if [ $? -ne 0 ]; then
> > > +		echo "The final write failed"
> > > +	fi
> > > +
> > > +	_scratch_unmount
> > > +	# we called _require_scratch_nocheck instead of _require_scratch
> > > +	# do check after test for each profile config
> > > +	_check_scratch_fs
> > > +	echo  >> $seqres.full
> > > +	_scratch_dev_pool_put
> > > +}
> > > +
> > > +# Here we don't use _btrfs_profile_configs as that doesn't include
> > > +# the number of devices, but for full stripe writes for RAID56, we
> > > +# need to ensure nr_data must be 2, so here we manually specify
> > > +# the profile and number of devices.
> > > +run_test "single" "1"
> > > +
> > > +# Zoned only support
> > > +if _scratch_btrfs_is_zoned; then
> > > +	exit
> > 
> > I think this "exit" will fail this test directly, due to status=1 currectly.
> > You can use _require_non_zoned_device() to run this case for non-zoned device
> > only. Or
> > 
> >    if ! _scratch_btrfs_is_zoned;then
> > 	run_test "raid0" "2"
> > 	run_test "raid1" "2"
> > 	run_test "raid10" "4"
> > 	run_test "raid5" "3"
> > 	run_test "raid6" "4"
> >    fi
> > 
> > As this case is "Silence is golden".
> > 
> > I'm not sure what do you really need at here, can these help?
> 
> My bad, I forgot to finish the comment, and your example is perfect.
> 
> Thanks for the review.

Np. But before I merge your patch, we still need a RVB from btrfs list at least,
even if I give it my RVB, due to this case is marked as btrfs RFC.

Thanks,
Zorro

> Qu
> 
> > 
> > Thanks,
> > Zorro
> > 
> > > +fi
> > > +
> > > +run_test "raid0" "2"
> > > +run_test "raid1" "2"
> > > +run_test "raid10" "4"
> > > +run_test "raid5" "3"
> > > +run_test "raid6" "4"
> > > +
> > > +echo "Silence is golden"
> > > +status=0
> > > +exit
> > > diff --git a/tests/btrfs/261.out b/tests/btrfs/261.out
> > > new file mode 100644
> > > index 00000000..679ddc0f
> > > --- /dev/null
> > > +++ b/tests/btrfs/261.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 261
> > > +Silence is golden
> > > --
> > > 2.36.1
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-07-19  5:53 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-18  6:18 [PATCH RFC] fstests: btrfs: add a tests case to make sure btrfs can handle certain interleaved free space correctly Qu Wenruo
2022-07-18 17:59 ` Zorro Lang
2022-07-19  5:16   ` Qu Wenruo
2022-07-19  5:52     ` Zorro Lang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.