From: Zorro Lang <zlang@redhat.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: fstests@vger.kernel.org, linux-fsdevel@vger.kernel.org,
willy@infradead.org, p.raghav@samsung.com, da.gomez@samsung.com,
hare@suse.de, john.g.garry@oracle.com, ziy@nvidia.com,
linux-xfs@vger.kernel.org, patches@lists.linux.dev
Subject: Re: [RFC] fstests: add fsstress + writeback + debugfs folio split test
Date: Thu, 25 Apr 2024 13:31:50 +0800 [thread overview]
Message-ID: <20240425053150.tx7pdosbfv5adat6@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com> (raw)
In-Reply-To: <20240424224649.1494092-1-mcgrof@kernel.org>
On Wed, Apr 24, 2024 at 03:46:48PM -0700, Luis Chamberlain wrote:
> Stress test folio splits by using the debugfs interface to a target
> a new smaller folio order. This is dangerous at the moment as its using
> a debugfs API which requires two out of tree fixes [0] [1] which will
> be posted shortly. With these patches applied no crash is possible yet.
> However, this test was designed to try to exacerbate races with folio
> splits and writeback, at least running generic/447 twice ends up
> producing a crash only if large folio splits with minimum folio order
> is enabled.
>
> With the known fixes for the debugfs interface, this test produces no
> crashes even after 3 hour soaking for 4k and LBS. We should enhance
> this test a bit more so to reproduce the issues observed by running
> generic/447 twice.
>
> This also begs the question if something like MADV_NOHUGEPAGE might be
> desirable from userspace, so to enable userspace to request splits when
> possible.
>
> If inspecting more closely, you'll want to enable on your kernel boot:
>
> dyndbg='file mm/huge_memory.c +p'
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=20240424-lbs&id=80f6df5037fd0ad560526af45bd7f4d779fe03f6
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=20240424-lbs&id=38f6fac5b4283ea48b1876fc56728f062168f8c3
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>
> For those that want to help stress test folio splits to an order, hopefully
> this will help start to enable this. Perhaps there are better ways to create
> more easy targets to stress test folio splits, and in particular try to
> reproduce the issue which so far is only possible by running generic/447 twice
> on LBS. The issue with generic/447 on LBS is not observed on 4k, and this test
> produces no crashes on LBS...
>
> common/rc | 20 +++++++++
> tests/generic/745 | 97 +++++++++++++++++++++++++++++++++++++++++++
> tests/generic/745.out | 2 +
> 3 files changed, 119 insertions(+)
> create mode 100755 tests/generic/745
> create mode 100644 tests/generic/745.out
>
> diff --git a/common/rc b/common/rc
> index d4432f5ce259..1eefb53aa84b 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -127,6 +127,26 @@ _require_compaction()
> _notrun "Need compaction enabled CONFIG_COMPACTION=y"
> fi
> }
> +
> +# Requires CONFIG_DEBUGFS and truncation knobs
> +SPLIT_DEBUGFS="/sys/kernel/debug/split_huge_pages"
> +_require_split_debugfs()
> +{
> + if [ ! -f $SPLIT_DEBUGFS ]; then
> + _notrun "Needs CONFIG_DEBUGFS and split_huge_pages"
> + fi
> +}
The global SPLIT_DEBUGFS isn't necessary, you can just use:
$DEBUGFS_MNT/split_huge_pages
And this remind me we have a _require_debugfs helper in common/rc, but
it's not used by any one test case. And it looks like not correct:
_require_debugfs()
{
#boot_params always present in debugfs
[ -d "$DEBUGFS_MNT/boot_params" ] || _notrun "Debugfs not mounted"
}
I can't find "boot_params" in /sys/kernel/debug/. So we might need to
fix this helper, and then call it in debugfs related test cases.
I'll send another patch to talk about that fix. For this case, it needs
_require_debugfs and $DEBUGFS_MNT.
> +
> +_split_huge_pages_file_full()
May we have a comment to explain what this common helper for? Thanks.
> +{
> + local file=$1
> + local offset="0x0"
> + local len=$(printf "%x" $(stat --format='%s' $file))
> + local order="0"
> + local split_cmd="$file,$offset,0x${len},$order"
> + echo $split_cmd > $SPLIT_DEBUGFS
^^^^^^^^^^^^^
$DEBUGFS_MNT/split_huge_pages is good enough.
> +}
> +
> # Get hugepagesize in bytes
> _get_hugepagesize()
> {
> diff --git a/tests/generic/745 b/tests/generic/745
> new file mode 100755
> index 000000000000..0a30dbee35bd
> --- /dev/null
> +++ b/tests/generic/745
> @@ -0,0 +1,97 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2024 Luis Chamberlain. All Rights Reserved.
> +#
> +# FS QA Test No. 734
> +#
> +# Run fsstress in a loop, and in the background force some writeback and
> +# folio splits for every file. If you're enabling this and want to check
> +# underneath the hood you may want to enable:
> +#
> +# dyndbg='file mm/huge_memory.c +p'
> +. ./common/preamble
> +_begin_fstest long_rw stress soak smoketest dangerous_fuzzers
Just double check, is it a dangerous_fuzzers test, and not good to be in auto group?
> +
> +# Override the default cleanup function.
> +_cleanup()
> +{
> + cd /
> + rm -f $tmp.*
> + $KILLALL_PROG -9 fsstress > /dev/null 2>&1
> +}
> +
> +# Import common functions.
> +. ./common/filter
I didn't see any filters called in this case, so don't need to
import it.
> +
> +# real QA test starts here
> +_supported_fs generic
> +_require_test
I didn't see TEST_DIR or TEST_DEV in this test case, so don't need
_require_test
> +_require_scratch
> +_require_split_debugfs
> +_require_command "$KILLALL_PROG" "killall"
> +
> +echo "Silence is golden"
> +
> +_scratch_mkfs >>$seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +nr_cpus=$((LOAD_FACTOR * 4))
> +nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
> +
> +fsstress_args=(-w -d $SCRATCH_MNT/test -n $nr_ops -p $nr_cpus)
> +
> +# used to let our loops know when to stop
> +runfile="$tmp.keep.running.loop"
> +touch $runfile
> +
> +# The background ops are out of bounds, the goal is to race with fsstress.
> +
> +# Force folio split if possible, this seems to be screaming for MADV_NOHUGEPAGE
> +# for large folios.
> +while [ -e $runfile ]; do
> + for i in $(find $SCRATCH_MNT/test \( -type f \) 2>/dev/null); do
May I ask why the "\( .. \)" is needed?
> + _split_huge_pages_file_full $i >/dev/null 2>&1
Just make sure, don't you want to output to $seqres.full file to get more information
for debug, if something wrong.
> + done
> + sleep 2
> +done &
> +split_huge_pages_files_pid=$!
This split_huge_pages_files_pid isn't be used anywhere, you can deal with it in
_cleanup.
> +
> +blocksize=$(_get_file_block_size $SCRATCH_MNT)
> +export XFS_DIO_MIN=$((blocksize * 2))
Oh, I even forgot we have this parameter in fsstress.c :)
> +
> +test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
> +
> +# Our general goal here is to race with ops which can stress folio addition,
> +# removal, edits, and writeback.
> +
> +# zero frequencies for write ops to minimize writeback
> +fsstress_args+=(-R)
But you set "fsstress_args=(-w -d $SCRATCH_MNT/test -n $nr_ops -p $nr_cpus)" above,
so you zeros frequencies of non-write operations (-w) and then zeros frequencies of
write operations (-R). Do you want to use "-z" directly, to zeros frequencies of
all operations ?
> +
> +# XXX: we can improve this, so to increase the chances to allow more
> +# folio splits. Also running generic/447 twice triggers a corner case we can't
> +# capture here on folio splits and write_cache_pages, increasing the chances of
> +# this test to cover that same corner case would be ideal.
> +#
> +# https://gist.github.com/mcgrof/d12f586ec6ebe32b2472b5d634c397df
> +fsstress_args+=(-f creat=1)
> +fsstress_args+=(-f write=1)
> +fsstress_args+=(-f dwrite=1)
> +fsstress_args+=(-f truncate=1)
> +fsstress_args+=(-f zero=1)
> +fsstress_args+=(-f unlink=1)
> +fsstress_args+=(-f fsync=1)
> +fsstress_args+=(-f punch=2)
> +fsstress_args+=(-f copyrange=4)
> +fsstress_args+=(-f clonerange=4)
> +
> +if [[ "$FSTYP" != "xfs" || "$FSTYP" == "ext4" ]]; then
> + fsstress_args+=(-f collapse=1)
Can you explain why the FALLOC_FL_COLLAPSE_RANGE is special, and not suitable
for xfs?
And I think the range of $FSTYP" != "xfs" contains "$FSTYP" == "ext4", so
this logic makes me confused.
> +fi
> +
> +$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" >> $seqres.full
> +
> +rm -f $runfile
> +wait > /dev/null 2>&1
Better to move this to _cleanup. e.g.
rm -f $runfile
[ -n "$split_huge_pages_files_pid" ] && kill $split_huge_pages_files_pid
$KILLALL_PROG -9 fsstress
wait
Thanks,
Zorro
> +
> +status=0
> +exit
> diff --git a/tests/generic/745.out b/tests/generic/745.out
> new file mode 100644
> index 000000000000..fce6b7f5489d
> --- /dev/null
> +++ b/tests/generic/745.out
> @@ -0,0 +1,2 @@
> +QA output created by 745
> +Silence is golden
> --
> 2.43.0
>
>
next prev parent reply other threads:[~2024-04-25 5:32 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-24 22:46 [RFC] fstests: add fsstress + writeback + debugfs folio split test Luis Chamberlain
2024-04-25 5:31 ` Zorro Lang [this message]
2024-04-29 4:40 ` Luis Chamberlain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240425053150.tx7pdosbfv5adat6@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com \
--to=zlang@redhat.com \
--cc=da.gomez@samsung.com \
--cc=fstests@vger.kernel.org \
--cc=hare@suse.de \
--cc=john.g.garry@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=p.raghav@samsung.com \
--cc=patches@lists.linux.dev \
--cc=willy@infradead.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).