From: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
To: Hans Holmberg <Hans.Holmberg@wdc.com>, Zorro Lang <zlang@redhat.com>
Cc: "Zorro Lang" <zlang@kernel.org>,
"Darrick J. Wong" <djwong@kernel.org>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
"Damien Le Moal" <Damien.LeMoal@wdc.com>,
"Matias Bjørling" <Matias.Bjorling@wdc.com>,
"Naohiro Aota" <Naohiro.Aota@wdc.com>, "hch@lst.de" <hch@lst.de>,
"fstests@vger.kernel.org" <fstests@vger.kernel.org>,
"Jaegeuk Kim" <jaegeuk@kernel.org>,
"bvanassche@acm.org" <bvanassche@acm.org>,
"daeho43@gmail.com" <daeho43@gmail.com>,
"Boris Burkov" <boris@bur.io>
Subject: Re: [PATCH] generic: add gc stress test
Date: Sun, 12 May 2024 16:54:30 +0000
Message-ID: <7889fa5e-e79f-44c4-ade5-bc2508ce8950@wdc.com>
In-Reply-To: <9c38fffc-72e9-4766-a9d0-ef90411df6f2@wdc.com>
[ +CC Boris ]
On 11.05.24 07:08, Hans Holmberg wrote:
> On 2024-05-08 10:51, Zorro Lang wrote:
>> On Wed, May 08, 2024 at 07:08:01AM +0000, Hans Holmberg wrote:
>>> On 2024-04-17 16:50, Hans Holmberg wrote:
>>>> On 2024-04-17 16:07, Zorro Lang wrote:
>>>>> On Wed, Apr 17, 2024 at 01:21:39PM +0000, Hans Holmberg wrote:
>>>>>> On 2024-04-17 14:43, Zorro Lang wrote:
>>>>>>> On Tue, Apr 16, 2024 at 11:54:37AM -0700, Darrick J. Wong wrote:
>>>>>>>> On Tue, Apr 16, 2024 at 09:07:43AM +0000, Hans Holmberg wrote:
>>>>>>>>> +Zorro (doh!)
>>>>>>>>>
>>>>>>>>> On 2024-04-15 13:23, Hans Holmberg wrote:
>>>>>>>>>> This test stresses garbage collection for file systems by first filling
>>>>>>>>>> up a scratch mount to a specific usage point with files of random size,
>>>>>>>>>> then doing overwrites in parallel with deletes to fragment the backing
>>>>>>>>>> storage, forcing reclaim.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> Test results in my setup (kernel 6.8.0-rc4+)
>>>>>>>>>> f2fs on zoned nullblk: pass (77s)
>>>>>>>>>> f2fs on conventional nvme ssd: pass (13s)
>>>>>>>>>> btrfs on zoned nullblk: fails (-ENOSPC)
>>>>>>>>>> btrfs on conventional nvme ssd: fails (-ENOSPC)
>>>>>>>>>> xfs on conventional nvme ssd: pass (8s)
>>>>>>>>>>
>>>>>>>>>> Johannes(cc) is working on the btrfs ENOSPC issue.
>>>>>>>>>>
>>>>>>>>>> tests/generic/744 | 124 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>> tests/generic/744.out | 6 ++
>>>>>>>>>> 2 files changed, 130 insertions(+)
>>>>>>>>>> create mode 100755 tests/generic/744
>>>>>>>>>> create mode 100644 tests/generic/744.out
>>>>>>>>>>
>>>>>>>>>> diff --git a/tests/generic/744 b/tests/generic/744
>>>>>>>>>> new file mode 100755
>>>>>>>>>> index 000000000000..2c7ab76bf8b1
>>>>>>>>>> --- /dev/null
>>>>>>>>>> +++ b/tests/generic/744
>>>>>>>>>> @@ -0,0 +1,124 @@
>>>>>>>>>> +#! /bin/bash
>>>>>>>>>> +# SPDX-License-Identifier: GPL-2.0
>>>>>>>>>> +# Copyright (c) 2024 Western Digital Corporation. All Rights Reserved.
>>>>>>>>>> +#
>>>>>>>>>> +# FS QA Test No. 744
>>>>>>>>>> +#
>>>>>>>>>> +# Inspired by btrfs/273 and generic/015
>>>>>>>>>> +#
>>>>>>>>>> +# This test stresses garbage collection in file systems
>>>>>>>>>> +# by first filling up a scratch mount to a specific usage point with
>>>>>>>>>> +# files of random size, then doing overwrites in parallel with
>>>>>>>>>> +# deletes to fragment the backing zones, forcing reclaim.
>>>>>>>>>> +
>>>>>>>>>> +. ./common/preamble
>>>>>>>>>> +_begin_fstest auto
>>>>>>>>>> +
>>>>>>>>>> +# real QA test starts here
>>>>>>>>>> +
>>>>>>>>>> +_require_scratch
>>>>>>>>>> +
>>>>>>>>>> +# This test requires specific data space usage, skip if we have compression
>>>>>>>>>> +# enabled.
>>>>>>>>>> +_require_no_compress
>>>>>>>>>> +
>>>>>>>>>> +M=$((1024 * 1024))
>>>>>>>>>> +min_fsz=$((1 * ${M}))
>>>>>>>>>> +max_fsz=$((256 * ${M}))
>>>>>>>>>> +bs=${M}
>>>>>>>>>> +fill_percent=95
>>>>>>>>>> +overwrite_percentage=20
>>>>>>>>>> +seq=0
>>>>>>>>>> +
>>>>>>>>>> +_create_file() {
>>>>>>>>>> +	local file_name=${SCRATCH_MNT}/data_$1
>>>>>>>>>> +	local file_sz=$2
>>>>>>>>>> +	local dd_extra=$3
>>>>>>>>>> +
>>>>>>>>>> +	POSIXLY_CORRECT=yes dd if=/dev/zero of=${file_name} \
>>>>>>>>>> +		bs=${bs} count=$(( $file_sz / ${bs} )) \
>>>>>>>>>> +		status=none $dd_extra 2>&1
>>>>>>>>>> +
>>>>>>>>>> +	status=$?
>>>>>>>>>> +	if [ $status -ne 0 ]; then
>>>>>>>>>> +		echo "Failed writing $file_name" >>$seqres.full
>>>>>>>>>> +		exit
>>>>>>>>>> +	fi
>>>>>>>>>> +}
>>>>>>>>
>>>>>>>> I wonder, is there a particular reason for doing all these file
>>>>>>>> operations with shell code instead of using fsstress to create and
>>>>>>>> delete files to fill the fs and stress all the zone-gc code? This test
>>>>>>>> reminds me a lot of generic/476 but with more fork()ing.
>>>>>>>
>>>>>>> /me has the same confusion. Can this test cover more things than using
>>>>>>> fsstress (to do reclaim test) ? Or does it uncover some known bugs which
>>>>>>> other cases can't?
>>>>>>
>>>>>> ah, adding some more background is probably useful:
>>>>>>
>>>>>> I've been using this test to stress the crap out of the zoned xfs garbage
>>>>>> collection / write throttling implementation for zoned rt subvolumes
>>>>>> support in xfs and it has found a number of issues during implementation
>>>>>> that I did not reproduce by other means.
>>>>>>
>>>>>> I think it also has wider applicability as it triggers bugs in btrfs.
>>>>>> f2fs passes without issues, but probably benefits from a quick smoke gc
>>>>>> test as well. Discussed this with Bart and Daeho (now in cc) before
>>>>>> submitting.
>>>>>>
>>>>>> Using fsstress would be cool, but as far as I can tell it cannot
>>>>>> be told to operate at a specific file system usage point, which
>>>>>> is a key thing for this test.
>>>>>
>>>>> As a random test case, if this case can be transformed to use fsstress to cover
>>>>> same issues, that would be nice.
>>>>>
>>>>> But if as a regression test case, it has its particular test coverage, and the
>>>>> issue it covered can't be reproduced by fsstress way, then let's work on this
>>>>> bash script one.
>>>>>
>>>>> Any thoughts?
>>>>
>>>> Yeah, I think bash is preferable for this particular test case.
>>>> Bash also makes it easy to hack for people's private uses.
>>>>
>>>> I use longer versions of this test (increasing overwrite_percentage)
>>>> for weekly testing.
>>>>
>>>> If we need fsstress for reproducing any future gc bug we can add
>>>> what's missing to it then.
>>>>
>>>> Does that make sense?
>>>>
>>>
>>> Hey Zorro,
>>>
>>> Any remaining concerns for adding this test? I could run it across
>>> more file systems (bcachefs could be interesting) and share the results
>>> if need be.
>>
>> Hi,
>>
>> I remember you mentioned btrfs fails on this test, and I can reproduce it
>> on btrfs [1] with a general disk. Have you figured out the reason? I don't
>> want to give btrfs a test failure suddenly without a proper explanation :)
>> If it's a case issue, better to fix it for btrfs.
>
>
> I was surprised to see the failure for btrfs on a conventional block
> device, but have not dug into it. I suspect/assume it's the same root
> cause as the issue Johannes is looking into when using a zoned block
> device as backing storage.
>
> I debugged that a bit with Johannes, and noticed that if I manually
> kick btrfs rebalancing after each write via sysfs, the test progresses
> further (but super slow).
>
> So *I think* that btrfs needs to:
>
> * tune the triggering of gc to kick in way before available free space
> runs out
> * start slowing down / blocking writes when reclaim pressure is high to
> avoid premature -ENOSPC errors.

Yes, both Boris and I are working on different solutions to the GC
problem. But apart from that, I have the feeling that using stat to
check on the available space is not the best idea, at least for btrfs.

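
For reference, a minimal sketch of what a stat-based usage check can look
like. The exact command the test uses is not shown in this excerpt, so this
assumes GNU stat's file-system mode (stat -f) and a hypothetical helper name,
fs_used_percent. The result depends entirely on what the file system reports
through statfs(2); on btrfs those free-space figures are estimates, which may
be one reason such a check is unreliable there (that reason is an assumption,
it is not stated in this message).

```shell
#!/bin/bash
# Hypothetical helper, not taken from the patch: percentage of data blocks
# in use, computed from the statfs(2) fields reported by GNU stat.
#   %b = total data blocks in the file system
#   %a = free blocks available to a non-superuser
fs_used_percent() {
	local mnt=$1
	local total avail

	# read fails (and we return 1) if stat prints nothing
	read -r total avail < <(stat -f -c '%b %a' "$mnt") || return 1
	[ "$total" -gt 0 ] || return 1

	echo $(( (total - avail) * 100 / total ))
}
```

Usage would be something like "fs_used_percent $SCRATCH_MNT", compared
against a target such as fill_percent from the patch.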
> It's a pretty nasty problem, as potentially any write could -ENOSPC
> long before the reported available space runs out when a workload
> ends up fragmenting the disk and write pressure is high.
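
The two points in Hans' list (start reclaim early, throttle writes under
pressure) can be illustrated with a two-threshold policy. This is only a
sketch of the idea, not how btrfs or any other file system decides it: the
function name gc_action and the default thresholds (25% and 10% free) are
hypothetical.

```shell
#!/bin/bash
# Hypothetical policy sketch: map the percentage of free space to an action.
#   none     - plenty of free space, nothing to do
#   reclaim  - start background garbage collection early
#   throttle - free space is critically low: keep reclaiming and also
#              slow down or block writers instead of failing with ENOSPC
gc_action() {
	local free_pct=$1
	local reclaim_at=${2:-25}	# assumed threshold, percent free
	local throttle_at=${3:-10}	# assumed threshold, percent free

	if [ "$free_pct" -le "$throttle_at" ]; then
		echo throttle
	elif [ "$free_pct" -le "$reclaim_at" ]; then
		echo reclaim
	else
		echo none
	fi
}
```

The point of keeping the reclaim threshold well above the throttle threshold
is that garbage collection gets a head start before writers have to be
slowed down.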