On Mon, Mar 18, 2024 at 11:02:19PM +0100, David Sterba wrote: > On Sat, Mar 16, 2024 at 10:32:33PM +0530, Anand Jain wrote: > > Given that ext4 also allows mounting of a cloned filesystem, the btrfs > > test case btrfs/312, which assesses the functionality of cloned filesystem > > support, can be refactored to be under the shared group. > > > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > > --- > > v2: > > Move to shared testcase instead of generic. > > What's the purpose of shared/ ? We have tests that make sense for a > subset of supported filesystems in generic/, with proper _required and > other the checks it works fine. > > I see that v1 did the move to generic/ but then the 'shared' got > suggested, which is IMHO the wrong direction. I remember some distant > past discussions about shared/ and what to put there. Right now there > are 3 remaining tests which I think is a good opportunity to make it 0. I didn't suggest to make it a shared case directly, I asked if there's a _require_xxxx helper to make this case notrun on "not proper" fs, not just use "btrfs ext4" to be whitelist : https://lore.kernel.org/fstests/20240312044629.hpaqdkl24nxaa3dv@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/ In my personal opinion, the "shared" directory is a place to store the cases which are nearly to be generic, but not ready. It's a place to remind us there're still some cases use something likes "supported btrfs ext4" as the hard condition of _notrun, rather than a flexible _require_xxx helper. These cases in shared better to be moved to generic, if we can improve it in one day. It more likes a "TODO" list of generic. If we just write it in generic/ directory, I'm afraid we'll leave it in hundreds of generic cases then forget it. What do you think? Thanks, Zorro >
On Mon, Mar 18, 2024 at 11:02:19PM +0100, David Sterba wrote:
> What's the purpose of shared/ ? We have tests that make sense for a
> subset of supported filesystems in generic/, with proper _required and
> other the checks it works fine.
Yes. I'd rather get rid of shared and the few tets in there as it just
creats confusion.
On Sat, Mar 16, 2024 at 10:32:33PM +0530, Anand Jain wrote:
> Given that ext4 also allows mounting of a cloned filesystem, the btrfs
> test case btrfs/312, which assesses the functionality of cloned filesystem
> support, can be refactored to be under the shared group.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> v2:
> Move to shared testcase instead of generic.
What's the purpose of shared/ ? We have tests that make sense for a
subset of supported filesystems in generic/, with proper _required and
other the checks it works fine.
I see that v1 did the move to generic/ but then the 'shared' got
suggested, which is IMHO the wrong direction. I remember some distant
past discussions about shared/ and what to put there. Right now there
are 3 remaining tests which I think is a good opportunity to make it 0.
Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de>
> +#
> +# Check if asm/mman.h can be included
> +#
> +AC_DEFUN([AC_HAVE_KERNEL_MADVISE_FLAGS],
> + [ AC_MSG_CHECKING([for kernel madvise flags in asm/mman.h ])
> + AC_COMPILE_IFELSE(
> + [ AC_LANG_PROGRAM([[
> +#include <asm/mman.h>
> + ]], [[
> +int moo = MADV_COLLAPSE;
> + ]])
> + ], have_kernel_madvise=yes
> + AC_MSG_RESULT(yes),
> + AC_MSG_RESULT(no))
> + AC_SUBST(have_kernel_madvise)
> + ])
> +
I don't think we really need this check, as madvise and asm/mman.h
have been around forever. We can probably also drop most of the
actual flag idefs, probably for everything older than MADV_WIPEONFORK.
The rest looks good to me.
On Sat, Mar 16, 2024 at 05:42:30PM +0000, Filipe Manana wrote:
> Mine is also non-debug and yet it always succeeds for me (without your
> patch or any other of course):
Same here.
And the idea behind the test is that when the device is removed the
file system should always return an error, i.e. have a check for shutdown
fs in the read path.
From: Darrick J. Wong <djwong@kernel.org> This is a regression test for "mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly". Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- tests/generic/1835 | 65 ++++++++++++++++++++++++++++++++++++++++++++++++ tests/generic/1835.out | 4 +++ 2 files changed, 69 insertions(+) create mode 100755 tests/generic/1835 create mode 100644 tests/generic/1835.out diff --git a/tests/generic/1835 b/tests/generic/1835 new file mode 100755 index 0000000000..07479ab712 --- /dev/null +++ b/tests/generic/1835 @@ -0,0 +1,65 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2024 Oracle. All Rights Reserved. +# +# FS QA Test 1835 +# +# This is a regression test for a kernel hang that I saw when creating a memory +# mapping, injecting EIO errors on the block device, and invoking +# MADV_POPULATE_READ on the mapping to fault in the pages. +# +. ./common/preamble +_begin_fstest auto rw + +# Override the default cleanup function. +_cleanup() +{ + cd / + rm -f $tmp.* + _dmerror_unmount + _dmerror_cleanup +} + +# Import common functions. +. ./common/dmerror + +_fixed_by_kernel_commit XXXXXXXXXXXX \ + "mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly" + +# real QA test starts here + +# Modify as appropriate. +_supported_fs generic +_require_xfs_io_command madvise -R +_require_scratch +_require_dm_target error +_require_command "$TIMEOUT_PROG" "timeout" + +_scratch_mkfs >> $seqres.full 2>&1 +_dmerror_init + +filesz=2m + +# Create a file that we'll read, then cycle mount to zap pagecache +_dmerror_mount +$XFS_IO_PROG -f -c "pwrite -S 0x58 0 $filesz" "$SCRATCH_MNT/a" >> $seqres.full +_dmerror_unmount +_dmerror_mount + +# Try to read the file data in a regular fashion just to prove that it works. +echo read with no errors +timeout -s KILL 10s $XFS_IO_PROG -c "mmap -r 0 $filesz" -c "madvise -R 0 $filesz" "$SCRATCH_MNT/a" +_dmerror_unmount +_dmerror_mount + +# Load file metadata and induce EIO errors on read. Try to provoke the kernel; +# kill the process after 10s so we can clean up. +stat "$SCRATCH_MNT/a" >> $seqres.full +echo read with IO errors +_dmerror_load_error_table +timeout -s KILL 10s $XFS_IO_PROG -c "mmap -r 0 $filesz" -c "madvise -R 0 $filesz" "$SCRATCH_MNT/a" +_dmerror_load_working_table + +# success, all done +status=0 +exit diff --git a/tests/generic/1835.out b/tests/generic/1835.out new file mode 100644 index 0000000000..1b03586e8c --- /dev/null +++ b/tests/generic/1835.out @@ -0,0 +1,4 @@ +QA output created by 1835 +read with no errors +read with IO errors +madvise: Bad address
From: Darrick J. Wong <djwong@kernel.org> Add all the Linux-specific madvise codes. We're going to need MADV_POPULATE_READ for a regression test. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- configure.ac | 1 include/builddefs.in | 1 io/Makefile | 4 ++ io/madvise.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++ m4/package_libcdev.m4 | 17 ++++++++ 5 files changed, 133 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 3786e44db6fd..723bdca506d1 100644 --- a/configure.ac +++ b/configure.ac @@ -187,6 +187,7 @@ AC_CONFIG_SYSTEMD_SYSTEM_UNIT_DIR AC_CONFIG_CROND_DIR AC_CONFIG_UDEV_DIR AC_HAVE_BLKID_TOPO +AC_HAVE_KERNEL_MADVISE_FLAGS if test "$enable_ubsan" = "yes" || test "$enable_ubsan" = "probe"; then AC_PACKAGE_CHECK_UBSAN diff --git a/include/builddefs.in b/include/builddefs.in index 07428206da45..a04f3e70f19d 100644 --- a/include/builddefs.in +++ b/include/builddefs.in @@ -193,6 +193,7 @@ HAVE_O_TMPFILE = @have_o_tmpfile@ HAVE_MKOSTEMP_CLOEXEC = @have_mkostemp_cloexec@ USE_RADIX_TREE_FOR_INUMS = @use_radix_tree_for_inums@ HAVE_FSVERITY_DESCR = @have_fsverity_descr@ +HAVE_KERNEL_MADVISE = @have_kernel_madvise@ GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall -Werror -Wextra -Wno-unused-parameter # -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl diff --git a/io/Makefile b/io/Makefile index 6f903e3df9a7..ce39fda0e82a 100644 --- a/io/Makefile +++ b/io/Makefile @@ -84,6 +84,10 @@ ifeq ($(HAVE_GETFSMAP),yes) CFILES += fsmap.c endif +ifeq ($(HAVE_KERNEL_MADVISE),yes) +LCFLAGS += -DHAVE_KERNEL_MADVISE +endif + default: depend $(LTCOMMAND) include $(BUILDRULES) diff --git a/io/madvise.c b/io/madvise.c index 6e9c5b121d72..081666f403bb 100644 --- a/io/madvise.c +++ b/io/madvise.c @@ -9,6 +9,9 @@ #include <sys/mman.h> #include "init.h" #include "io.h" +#ifdef HAVE_KERNEL_MADVISE +# include <asm/mman.h> +#endif static cmdinfo_t madvise_cmd; @@ -26,6 +29,47 @@ madvise_help(void) " -r -- expect random page references (POSIX_MADV_RANDOM)\n" " -s -- expect sequential page references (POSIX_MADV_SEQUENTIAL)\n" " -w -- will need these pages (POSIX_MADV_WILLNEED) [*]\n" +"\n" +"The following Linux-specific advise values are available:\n" +#ifdef MADV_COLLAPSE +" -c -- try to collapse range into transparent hugepages (MADV_COLLAPSE)\n" +#endif +#ifdef MADV_COLD +" -D -- deactivate the range (MADV_COLD)\n" +#endif +#ifdef MADV_FREE +" -f -- free the range (MADV_FREE)\n" +#endif +#ifdef MADV_NOHUGEPAGE +" -h -- disable transparent hugepages (MADV_NOHUGEPAGE)\n" +#endif +#ifdef MADV_HUGEPAGE +" -H -- enable transparent hugepages (MADV_HUGEPAGE)\n" +#endif +#ifdef MADV_MERGEABLE +" -m -- mark the range mergeable (MADV_MERGEABLE)\n" +#endif +#ifdef MADV_UNMERGEABLE +" -M -- mark the range unmergeable (MADV_UNMERGEABLE)\n" +#endif +#ifdef MADV_SOFT_OFFLINE +" -o -- mark the range offline (MADV_SOFT_OFFLINE)\n" +#endif +#ifdef MADV_REMOVE +" -p -- punch a hole in the file (MADV_REMOVE)\n" +#endif +#ifdef MADV_HWPOISON +" -P -- poison the page cache (MADV_HWPOISON)\n" +#endif +#ifdef MADV_POPULATE_READ +" -R -- prefault in the range for read (MADV_POPULATE_READ)\n" +#endif +#ifdef MADV_POPULATE_WRITE +" -W -- prefault in the range for write (MADV_POPULATE_WRITE)\n" +#endif +#ifdef MADV_PAGEOUT +" -X -- reclaim the range (MADV_PAGEOUT)\n" +#endif " Notes:\n" " NORMAL sets the default readahead setting on the file.\n" " RANDOM sets the readahead setting on the file to zero.\n" @@ -45,20 +89,85 @@ madvise_f( int advise = MADV_NORMAL, c; size_t blocksize, sectsize; - while ((c = getopt(argc, argv, "drsw")) != EOF) { + while ((c = getopt(argc, argv, "cdDfhHmMopPrRswWX")) != EOF) { switch (c) { +#ifdef MADV_COLLAPSE + case 'c': /* collapse to thp */ + advise = MADV_COLLAPSE; + break; +#endif case 'd': /* Don't need these pages */ advise = MADV_DONTNEED; break; +#ifdef MADV_COLD + case 'D': /* make more likely to be reclaimed */ + advise = MADV_COLD; + break; +#endif +#ifdef MADV_FREE + case 'f': /* page range out of memory */ + advise = MADV_FREE; + break; +#endif +#ifdef MADV_HUGEPAGE + case 'h': /* enable thp memory */ + advise = MADV_HUGEPAGE; + break; +#endif +#ifdef MADV_NOHUGEPAGE + case 'H': /* disable thp memory */ + advise = MADV_NOHUGEPAGE; + break; +#endif +#ifdef MADV_MERGEABLE + case 'm': /* enable merging */ + advise = MADV_MERGEABLE; + break; +#endif +#ifdef MADV_UNMERGEABLE + case 'M': /* disable merging */ + advise = MADV_UNMERGEABLE; + break; +#endif +#ifdef MADV_SOFT_OFFLINE + case 'o': /* offline */ + advise = MADV_SOFT_OFFLINE; + break; +#endif +#ifdef MADV_REMOVE + case 'p': /* punch hole */ + advise = MADV_REMOVE; + break; +#endif +#ifdef MADV_HWPOISON + case 'P': /* poison */ + advise = MADV_HWPOISON; + break; +#endif case 'r': /* Expect random page references */ advise = MADV_RANDOM; break; +#ifdef MADV_POPULATE_READ + case 'R': /* fault in pages for read */ + advise = MADV_POPULATE_READ; + break; +#endif case 's': /* Expect sequential page references */ advise = MADV_SEQUENTIAL; break; case 'w': /* Will need these pages */ advise = MADV_WILLNEED; break; +#ifdef MADV_POPULATE_WRITE + case 'W': /* fault in pages for write */ + advise = MADV_POPULATE_WRITE; + break; +#endif +#ifdef MADV_PAGEOUT + case 'X': /* reclaim memory */ + advise = MADV_PAGEOUT; + break; +#endif default: exitcode = 1; return command_usage(&madvise_cmd); diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4 index 84f288dfcfdb..064d050b2b55 100644 --- a/m4/package_libcdev.m4 +++ b/m4/package_libcdev.m4 @@ -322,3 +322,20 @@ struct fsverity_descriptor m = { }; AC_SUBST(have_fsverity_descr) ]) +# +# Check if asm/mman.h can be included +# +AC_DEFUN([AC_HAVE_KERNEL_MADVISE_FLAGS], + [ AC_MSG_CHECKING([for kernel madvise flags in asm/mman.h ]) + AC_COMPILE_IFELSE( + [ AC_LANG_PROGRAM([[ +#include <asm/mman.h> + ]], [[ +int moo = MADV_COLLAPSE; + ]]) + ], have_kernel_madvise=yes + AC_MSG_RESULT(yes), + AC_MSG_RESULT(no)) + AC_SUBST(have_kernel_madvise) + ]) +
From: Darrick J. Wong <djwong@kernel.org> If verity is enabled on a filesystem, we should create some sample verity files. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- common/populate | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/common/populate b/common/populate index 35071f4210..3f3ec0480d 100644 --- a/common/populate +++ b/common/populate @@ -520,6 +520,27 @@ _scratch_xfs_populate() { done fi + # verity merkle trees + is_verity="$(_xfs_has_feature "$SCRATCH_MNT" verity -v)" + if [ $is_verity -gt 0 ]; then + echo "+ fsverity" + + # Create a biggish file with all zeroes, because metadump + # won't preserve data blocks and we don't want the hashes to + # stop working for our sample fs. + for ((pos = 0, i = 88; pos < 23456789; pos += 234567, i++)); do + $XFS_IO_PROG -f -c "pwrite -S 0 $pos 234567" "$SCRATCH_MNT/verity" + done + + fsverity enable "$SCRATCH_MNT/verity" + + # Create a sparse file + $XFS_IO_PROG -f -c "pwrite -S 0 0 3" "$SCRATCH_MNT/sparse_verity" + truncate -s 23456789 "$SCRATCH_MNT/sparse_verity" + $XFS_IO_PROG -f -c "pwrite -S 0 23456789 3" "$SCRATCH_MNT/sparse_verity" + fsverity enable "$SCRATCH_MNT/sparse_verity" + fi + # Copy some real files (xfs tests, I guess...) echo "+ real files" test $fill -ne 0 && __populate_fill_fs "${SCRATCH_MNT}" 5
From: Darrick J. Wong <djwong@kernel.org> Adjust these tests to accomdate the use of xattrs to store fsverity metadata. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- tests/xfs/021 | 3 +++ tests/xfs/122.out | 1 + 2 files changed, 4 insertions(+) diff --git a/tests/xfs/021 b/tests/xfs/021 index ef307fc064..dcecf41958 100755 --- a/tests/xfs/021 +++ b/tests/xfs/021 @@ -118,6 +118,7 @@ _scratch_xfs_db -r -c "inode $inum_1" -c "print a.sfattr" | \ perl -ne ' /\.secure/ && next; /\.parent/ && next; +/\.verity/ && next; print unless /^\d+:\[.*/;' echo "*** dump attributes (2)" @@ -128,6 +129,7 @@ _scratch_xfs_db -r -c "inode $inum_2" -c "a a.bmx[0].startblock" -c print \ | perl -ne ' s/,secure//; s/,parent//; +s/,verity//; s/info.hdr/info/; /hdr.info.crc/ && next; /hdr.info.bno/ && next; @@ -135,6 +137,7 @@ s/info.hdr/info/; /hdr.info.lsn/ && next; /hdr.info.owner/ && next; /\.parent/ && next; +/\.verity/ && next; s/^(hdr.info.magic =) 0x3bee/\1 0xfbee/; s/^(hdr.firstused =) (\d+)/\1 FIRSTUSED/; s/^(hdr.freemap\[0-2] = \[base,size]).*/\1 [FREEMAP..]/; diff --git a/tests/xfs/122.out b/tests/xfs/122.out index 3a99ce77bb..ff886b4eec 100644 --- a/tests/xfs/122.out +++ b/tests/xfs/122.out @@ -141,6 +141,7 @@ sizeof(struct xfs_scrub_vec) = 16 sizeof(struct xfs_scrub_vec_head) = 32 sizeof(struct xfs_swap_extent) = 64 sizeof(struct xfs_unmount_log_format) = 8 +sizeof(struct xfs_verity_merkle_key) = 8 sizeof(struct xfs_xmd_log_format) = 16 sizeof(struct xfs_xmi_log_format) = 80 sizeof(union xfs_rtword_raw) = 4
From: Andrey Albershteyn <aalbersh@redhat.com> XFS supports verity and can be enabled for -g verity group. Signed-off-by: Andrey Albershteyn <andrey.albershteyn@gmail.com> --- common/verity | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/common/verity b/common/verity index 03d175ce1b..df4eb5dee7 100644 --- a/common/verity +++ b/common/verity @@ -43,7 +43,16 @@ _require_scratch_verity() # The filesystem may be aware of fs-verity but have it disabled by # CONFIG_FS_VERITY=n. Detect support via sysfs. - if [ ! -e /sys/fs/$fstyp/features/verity ]; then + case $FSTYP in + xfs) + _scratch_unmount + _check_scratch_xfs_features VERITY &>>$seqres.full + _scratch_mount + ;; + *) + test -e /sys/fs/$fstyp/features/verity + esac + if [ ! $? ]; then _notrun "kernel $fstyp isn't configured with verity support" fi @@ -201,6 +210,9 @@ _scratch_mkfs_verity() ext4|f2fs) _scratch_mkfs -O verity ;; + xfs) + _scratch_mkfs -i verity + ;; btrfs) _scratch_mkfs ;; @@ -407,6 +419,21 @@ _fsv_scratch_corrupt_merkle_tree() done _scratch_mount ;; + xfs) + local ino=$(stat -c '%i' $file) + local attr_offset=$(( $offset % $FSV_BLOCK_SIZE )) + local attr_index=$(printf "%08d" $(( offset - attr_offset ))) + _scratch_unmount + # Attribute name is 8 bytes long (index of Merkle tree page) + _scratch_xfs_db -x -c "inode $ino" \ + -c "attr_modify -f -m 8 -o $attr_offset $attr_index \"BUG\"" \ + >>$seqres.full + # In case bsize == 4096 and merkle block size == 1024, by + # modifying attribute with 'attr_modify we can corrupt quota + # account. Let's repair it + _scratch_xfs_repair > $seqres.full 2>&1 + _scratch_mount + ;; *) _fail "_fsv_scratch_corrupt_merkle_tree() unimplemented on $FSTYP" ;;
Hi all, From Darrick J. Wong: This v5.3 patchset builds upon v5.2 of Andrey's patchset to implement fsverity for XFS. The biggest thing that I didn't like in the v5 patchset is the abuse of the data device's buffer cache to store the incore version of the merkle tree blocks. Not only do verity state flags end up in xfs_buf, but the double-alloc flag wastes memory and doesn't remain internally consistent if the xattrs shift around. I replaced all of that with a per-inode xarray that indexes incore merkle tree blocks. For cache hits, this dramatically reduces the amount of work that xfs has to do to feed fsverity. The per-block overhead is much lower (8 bytes instead of ~300 for xfs_bufs), and we no longer have to entertain layering violations in the buffer cache. I also added a per-filesystem shrinker so that reclaim can cull cached merkle tree blocks, starting with the leaf tree nodes. I've also rolled in some changes recommended by the fsverity maintainer, fixed some organization and naming problems in the xfs code, fixed a collision in the xfs_inode iflags, and improved dead merkle tree cleanup per the discussion of the v5 series. At this point I'm happy enough with this code to start integrating and testing it in my trees, so it's time to send it out a coherent patchset for comments. For v5.3, I've added bits and pieces of online and offline repair support, reduced the size of partially filled merkle tree blocks by removing trailing zeroes, changed the xattr hash function to better avoid collisions between merkle tree keys, made the fsverity invalidation bitmap unnecessary, and made it so that we can save space on sparse verity files by not storing merkle tree blocks that hash totally zeroed data blocks. From Andrey Albershteyn: Here's v5 of my patchset of adding fs-verity support to XFS. This implementation uses extended attributes to store fs-verity metadata. The Merkle tree blocks are stored in the remote extended attributes. The names are offsets into the tree. A few key points of this patchset: - fs-verity can work with Merkle tree blocks based caching (xfs) and PAGE caching (ext4, f2fs, btrfs) - iomap does fs-verity verification - In XFS, fs-verity metadata is stored in extended attributes - per-sb workqueue for verification processing - Inodes with fs-verity have new on-disk diflag - xfs_attr_get() can return a buffer with an extended attribute - xfs_buf can allocate double space for Merkle tree blocks. Part of the space is used to store the extended attribute data without leaf headers - xfs_buf tracks verified status of merkle tree blocks The patchset consists of five parts: - [1]: fs-verity spinlock removal pending in fsverity/for-next - [2..4]: Parent pointers adding binary xattr names - [5]: Expose FS_XFLAG_VERITY for fs-verity files - [6..9]: Changes to fs-verity core - [10]: Integrate fs-verity to iomap - [11-24]: Add fs-verity support to XFS Testing: The patchset is tested with xfstests -g verity on xfs_1k, xfs_4k, xfs_1k_quota, xfs_4k_quota, ext4_4k, and ext4_4k_quota. With KMEMLEAK and KASAN enabled. More testing on the way. Changes from V4: - Mainly fs-verity changes; removed unnecessary functions - Replace XFS workqueue with per-sb workqueue created in fsverity_set_ops() - Drop patch with readahead calculation in bytes Changes from V3: - redone changes to fs-verity core as previous version had an issue on ext4 - add blocks invalidation interface to fs-verity - move memory ordering primitives out of block status check to fs read block function - add fs-verity verification to iomap instead of general post read processing Changes from V2: - FS_XFLAG_VERITY extended attribute flag - Change fs-verity to use Merkle tree blocks instead of expecting PAGE references from filesystem - Change approach in iomap to filesystem provided bio_set and submit_io instead of just callouts to filesystem - Add possibility for xfs_buf allocate more space for fs-verity extended attributes - Make xfs_attr module to copy fs-verity blocks inside the xfs_buf, so XFS can get data without leaf headers - Add Merkle tree removal for error path - Makae scrub aware of new dinode flag Changes from V1: - Added parent pointer patches for easier testing - Many issues and refactoring points fixed from the V1 review - Adjusted for recent changes in fs-verity core (folios, non-4k) - Dropped disabling of large folios - Completely new fsverity patches (fix, callout, log_blocksize) - Change approach to verification in iomap to the same one as in write path. Callouts to fs instead of direct fs-verity use. - New XFS workqueue for post read folio verification - xfs_attr_get() can return underlying xfs_buf - xfs_bufs are marked with XBF_VERITY_CHECKED to track verified blocks If you're going to start using this code, I strongly recommend pulling from my git trees, which are linked below. This has been running on the djcloud for months with no problems. Enjoy! Comments and questions are, as always, welcome. --D kernel git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fsverity xfsprogs git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=fsverity fstests git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fsverity --- Commits in this patchset: * common/verity: enable fsverity for XFS * xfs/{021,122}: adapt to fsverity xattrs * common/populate: add verity files to populate xfs images --- common/populate | 21 +++++++++++++++++++++ common/verity | 29 ++++++++++++++++++++++++++++- tests/xfs/021 | 3 +++ tests/xfs/122.out | 1 + 4 files changed, 53 insertions(+), 1 deletion(-)
[-- Attachment #1: Type: text/plain, Size: 5619 bytes --] Hi all, The for-next branch of the xfstests repository at: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git has just been updated and tagged as v2024.03.17 release. Release Notes: 1) There're 32 commits in this release, but most of them are fixes and updates. Then One new test cases -- xfs/607. Another new case btrfs/320 is splited from old btrfs/022. 2) As upstream dm-error supports zoned devices now, so fstests allow dm-error tests on zoned devices too. But it might cause some failures on old kernel, if you test with zoned devices. 3) Some of cases are improved for 64k blocksize testing, feel free to give it a try. There're also lots of fixes or improvements, more details refer to below. Thanks for all these contributions! Thanks, Zorro The new head of the for-next branch is commit: ec5b77f70e60 fstests: add missing commit IDs to some tests New commits ([N]ew, [U]pdate): Anand Jain (1): [U] [2b0216017bc2] common/rc: specify required device size Christoph Hellwig (4): [U] [5e1e51df9f29] shared/298: call fs commands on the loop device [U] [926ae266c86a] shared/298: run xfs_db against the loop device instead of the image file [U] [cfb92ce9070c] common: dm-error now supports zoned devices [U] [7bb78927c0f5] generic/392: stop checking st_blocks Darrick J. Wong (10): [U] [d46a46c7d5d0] generic/604: try to make race occur reliably [U] [ed6967784854] xfs/155: fail the test if xfs_repair hangs for too long [U] [0943fadcf0e9] generic/192: fix spurious timeout [U] [a3ce52894e0e] generic/491: increase test timeout [U] [27f314276da9] xfs/599: reduce the amount of attrs created here [U] [480bf0025bff] xfs/122: update test to pick up rtword/suminfo ondisk unions [U] [b41b9ca96f69] xfs/43[4-6]: make module reloading optional [N] [70fc4244b7dc] xfs: test for premature ENOSPC with large cow delalloc extents [U] [7a97d9c0fa1b] misc: fix test that fail formatting with 64k blocksize [U] [c9d50e4d1c2a] generic/574: don't fail the test on intentional coredump Disha Goel (1): [U] [7a338a6af80d] xfstest: add detection for ext4.h presence in configure.ac Filipe Manana (2): [U] [07fa85f11232] btrfs: fix grep warning at _require_btrfs_mkfs_uuid_option() [U] [ec5b77f70e60] fstests: add missing commit IDs to some tests Josef Bacik (8): [U] [18ecbf559bc5] btrfs/011: increase the runtime for replace cancel [U] [51f280cfa124] btrfs/012: adjust how we populate the fs to convert [U] [46bc2d68fa74] btrfs/131: don't run with subpage blocksizes [U] [e990496ffc09] btrfs/213: make the test more reliable [U] [94937700f4ee] btrfs/271: adjust failure condition [U] [f682432926a0] btrfs/287,btrfs/293: filter all btrfs subvolume delete calls [U] [3552178f39be] btrfs/291: remove image file after teardown [N] [512dba4fdbfc] btrfs: test normal qgroup operations in a compress friendly way Li Zhijian (1): [U] [d76c4886ba83] xfs: Fix no executable permission Qu Wenruo (1): [U] [8000febb32af] fstests: btrfs/121: allow snapshot with invalid qgroup to return error Su Yue (1): [U] [b3ff58fa319a] btrfs/172,206: call _log_writes_cleanup in _cleanup Zorro Lang (3): [U] [46b783fe67fc] fsstress: check io_uring_queue_init errno properly [U] [7bad5527e2bc] fsstress: bypass io_uring testing if io_uring_queue_init returns EPERM [U] [790a9e7c3c94] common/rc: notrun if io_uring is disabled by sysctl Code Diffstat: README | 6 +++ check | 6 --- common/btrfs | 2 +- common/module | 33 +++++++++++++--- common/rc | 66 +++++++++++++++++++++++++++++--- configure.ac | 1 + ltp/fsstress.c | 32 ++++++++++++---- src/ext4_resize.c | 5 ++- src/feature.c | 19 ++++++---- tests/btrfs/011 | 9 ++++- tests/btrfs/012 | 14 ++++--- tests/btrfs/022 | 86 ++--------------------------------------- tests/btrfs/121 | 10 ++++- tests/btrfs/131 | 4 ++ tests/btrfs/172 | 7 ++++ tests/btrfs/206 | 7 ++++ tests/btrfs/213 | 20 +++++----- tests/btrfs/249 | 2 +- tests/btrfs/271 | 11 +++--- tests/btrfs/287 | 4 +- tests/btrfs/287.out | 2 +- tests/btrfs/291 | 2 +- tests/btrfs/293 | 6 +-- tests/btrfs/293.out | 4 +- tests/btrfs/303 | 2 +- tests/btrfs/309 | 2 +- tests/btrfs/316 | 2 +- tests/btrfs/317 | 4 +- tests/btrfs/320 | 107 ++++++++++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/320.out | 2 + tests/generic/042 | 9 +---- tests/generic/081 | 7 +++- tests/generic/108 | 6 ++- tests/generic/192 | 16 ++++++-- tests/generic/392 | 16 ++------ tests/generic/455 | 3 +- tests/generic/457 | 3 +- tests/generic/482 | 3 +- tests/generic/491 | 2 +- tests/generic/574 | 3 +- tests/generic/604 | 8 ++-- tests/generic/704 | 3 +- tests/generic/707 | 4 +- tests/generic/730 | 3 +- tests/generic/731 | 3 +- tests/shared/298 | 22 ++++++----- tests/xfs/122 | 2 +- tests/xfs/122.out | 2 + tests/xfs/155 | 5 +++ tests/xfs/279 | 7 ++-- tests/xfs/434 | 3 +- tests/xfs/435 | 3 +- tests/xfs/436 | 3 +- tests/xfs/557 | 0 tests/xfs/567 | 0 tests/xfs/568 | 0 tests/xfs/599 | 9 ++--- tests/xfs/607 | 86 +++++++++++++++++++++++++++++++++++++++++ tests/xfs/607.out | 8 ++++ 59 files changed, 495 insertions(+), 221 deletions(-) -- Zorro Lang zlang@kernel.org [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --]
[-- Attachment #1: Type: text/plain, Size: 5274 bytes --] Hi all, The master branch of the xfstests repository at: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git has just been updated to v2024.03.03 version. Thanks, Zorro The new head of the master branch is commit: 9b6df9a01ac8 btrfs/016: fix a false alert due to xattrs mismatch New commits: Anand Jain (10): [c5c8a2223881] common/rc: assign SCRATCH_DEV_POOL to an array [b89613e85fe2] btrfs: introduce tempfsid test group [257d00e13dca] btrfs: create a helper function, check_fsid(), to verify the tempfsid [e6c1d2ab76a9] btrfs: verify that subvolume mounts are unaffected by tempfsid [408376708514] btrfs: check if cloned device mounts with tempfsid [5673516f9824] btrfs: test case prerequisite _require_btrfs_mkfs_uuid_option [3ec2d4378c6c] btrfs: introduce helper for creating cloned devices with mkfs [0a50ae6c2006] btrfs: verify tempfsid clones using mkfs [4af68c15504d] btrfs: validate send-receive operation with tempfsid. [a6c984e10159] btrfs: test tempfsid with device add, seed, and balance Anthony Iliopoulos (2): [cc23dcb3ebe6] generic/68[12]: use the dir blocksize for xfs filesystems [28928ea7af86] generic/682: update and fix-up golden output Filipe Manana (6): [faabc4852ad9] btrfs: test incremental send on sparse file with trailing hole [15c084fe1137] btrfs: require no nodatacow for tests that exercise compression [45d779c9d241] btrfs/173: make the test work when mounting with nodatacow [d9d267ada5a0] btrfs/299: skip test if we were mounted with nodatacow [07e9d3d39c3c] btrfs: require no nodatacow for tests that exercise read repair [3f740470e96b] generic/736: fix a buffer overflow in readdir-while-renames.c Johannes Thumshirn (3): [43f4d620203e] filter.brtfs: add filter for conversion [fe6326040eb3] filter.btrfs: add filter for btrfs device add [8c9102789fec] btrfs: check conversion of zoned fileystems Luis Chamberlain (1): [cb1f76a28453] common/config: fix CANON_DEVS=yes when file does not exist Martin Jansa (1): [a8fbb364834f] tests/*/Makefile: make sure group.list DIRT exists before install Qu Wenruo (4): [00989655e212] btrfs: validate inconsitent qgroup won't leak reserved data space [779ae21a305d] btrfs: btrfs/224 do not assign snapshot to a subvolume qgroup [662db2673538] btrfs: detect regular qgroup for older kernels correctly [9b6df9a01ac8] btrfs/016: fix a false alert due to xattrs mismatch Code Diffstat: common/btrfs | 76 ++++++++++++++++++++++++++++++++++++- common/config | 2 +- common/filter.btrfs | 15 ++++++++ common/rc | 29 ++++++++++++-- doc/group-names.txt | 1 + src/readdir-while-renames.c | 10 ++++- tests/btrfs/016 | 53 +++++++++++--------------- tests/btrfs/024 | 1 + tests/btrfs/048 | 1 + tests/btrfs/059 | 1 + tests/btrfs/138 | 1 + tests/btrfs/140 | 3 +- tests/btrfs/141 | 2 + tests/btrfs/157 | 2 + tests/btrfs/158 | 2 + tests/btrfs/173 | 5 +++ tests/btrfs/215 | 2 + tests/btrfs/224 | 6 +-- tests/btrfs/234 | 1 + tests/btrfs/265 | 2 + tests/btrfs/266 | 2 + tests/btrfs/267 | 2 + tests/btrfs/268 | 2 + tests/btrfs/269 | 2 + tests/btrfs/281 | 1 + tests/btrfs/289 | 2 + tests/btrfs/299 | 3 ++ tests/btrfs/303 | 92 +++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/303.out | 24 ++++++++++++ tests/btrfs/311 | 87 ++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/311.out | 24 ++++++++++++ tests/btrfs/312 | 78 ++++++++++++++++++++++++++++++++++++++ tests/btrfs/312.out | 19 ++++++++++ tests/btrfs/313 | 52 +++++++++++++++++++++++++ tests/btrfs/313.out | 16 ++++++++ tests/btrfs/314 | 78 ++++++++++++++++++++++++++++++++++++++ tests/btrfs/314.out | 23 ++++++++++++ tests/btrfs/315 | 91 ++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/315.out | 10 +++++ tests/btrfs/316 | 59 +++++++++++++++++++++++++++++ tests/btrfs/316.out | 2 + tests/btrfs/317 | 67 +++++++++++++++++++++++++++++++++ tests/btrfs/317.out | 9 +++++ tests/btrfs/Makefile | 2 +- tests/ceph/Makefile | 2 +- tests/cifs/Makefile | 2 +- tests/ext4/Makefile | 2 +- tests/f2fs/Makefile | 2 +- tests/generic/681 | 2 +- tests/generic/682 | 10 ++++- tests/generic/682.out | 2 +- tests/generic/Makefile | 2 +- tests/nfs/Makefile | 2 +- tests/ocfs2/Makefile | 2 +- tests/overlay/Makefile | 2 +- tests/perf/Makefile | 2 +- tests/selftest/Makefile | 2 +- tests/shared/Makefile | 2 +- tests/tmpfs/Makefile | 2 +- tests/udf/Makefile | 2 +- tests/xfs/Makefile | 2 +- 61 files changed, 942 insertions(+), 62 deletions(-) -- Zorro Lang zlang@kernel.org [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --]
On Sat, Mar 16, 2024 at 2:39 PM Boris Burkov <boris@bur.io> wrote: > > On Sat, Mar 16, 2024 at 01:08:57PM +0000, Filipe Manana wrote: > > On Fri, Mar 15, 2024 at 11:30 PM Boris Burkov <boris@bur.io> wrote: > > > > > > On Thu, Mar 14, 2024 at 07:56:28PM -0700, Darrick J. Wong wrote: > > > > On Wed, Mar 13, 2024 at 04:47:47PM -0700, Boris Burkov wrote: > > > > > This test removes a SCSI debug device out from under a mounted > > > > > filesystem with a (probably) dirty file. This assumes that page cache > > > > > cannot save us from EIO, for a reason that I can't quite explain. In > > > > > fact, this test fails for exactly that reason, at least on btrfs. > > > > > > > > > > The original patches: > > > > > > > > > > https://lore.kernel.org/fstests/20230807112100.GB15405@lst.de/ > > > > > > > > > > refer to this passing on xfs and not btrfs, so I suspect I am missing > > > > > something. With that said, on my machine this actually fails on xfs with > > > > > and without my patch, so this is clearly not enough. > > > > > > > > > > High level, I am trying to understand what is really the expected > > > > > behavior from a filesystem under this condition and what this test is > > > > > getting at. Of btrfs, ext4, and xfs, only ext4 passes it, while btrfs > > > > > does pass with this additional syncing/cache dropping to nudge it to an > > > > > error. > > > > > > > > Does btrfs prefetch pagecache data as soon as a file opens? Or I guess > > > > it could be that xfs trips an IO error and shuts down, and > > > > xfs_file_read_iter will return EIO after that happens. > > > > > > > > --D > > > > > > > > > > I might be wildly misunderstanding, since I'm very far from an expert on > > > the page cache, but didn't we just do a buffered write to the file? So it > > > would make sense for it to be in cache? > > > > > > At any rate, what I observed on my system was that xfs does not pass > > > this test. Curious if that is only on my system or just the current > > > state of the test. > > > > Instead of xfs, do you mean btrfs? > > > > For me xfs and ext4 always pass this test, i.e. the cat command fails > > with -EIO, both with and without your patch. > > > > On the other hand, btrfs and f2fs always fail without your patch, and > > pass with your patch applied. > > $ sudo ./check generic/730 > FSTYP -- xfs (non-debug) > PLATFORM -- Linux/x86_64 v0 6.8.0-rc7+ #205 SMP PREEMPT_DYNAMIC Thu Mar 14 16:44:32 PDT 2024 > MKFS_OPTIONS -- -f /dev/mapper/tst-scr0 > MOUNT_OPTIONS -- /dev/mapper/tst-scr0 /mnt/scratch > > generic/730 1s ... [failed, exit status 1]- output mismatch (see /home/vmuser/fstests/results//generic/730.out.bad) > --- tests/generic/730.out 2023-11-01 22:14:50.011000000 +0000 > +++ /home/vmuser/fstests/results//generic/730.out.bad 2024-03-16 14:38:02.431000000 +0000 > @@ -1,2 +1 @@ > QA output created by 730 > -cat: -: Input/output error > ... > (Run 'diff -u /home/vmuser/fstests/tests/generic/730.out /home/vmuser/fstests/results//generic/730.out.bad' to see the entire diff) > Ran: generic/730 > Failures: generic/730 > Failed 1 of 1 tests > > I suspect it has something to do with my Kconfig. Perhaps "xfs (non-debug)" is a clue? Mine is also non-debug and yet it always succeeds for me (without your patch or any other of course): root 17:40:02 /home/fdmanana/git/hub/xfstests > ./check generic/730 FSTYP -- xfs (non-debug) PLATFORM -- Linux/x86_64 debian0 6.8.0-rc7-btrfs-next-153+ #1 SMP PREEMPT_DYNAMIC Mon Mar 4 17:19:19 WET 2024 MKFS_OPTIONS -- -f /dev/sdb MOUNT_OPTIONS -- /dev/sdb /home/fdmanana/btrfs-tests/scratch_1 generic/730 2s ... 2s Ran: generic/730 Passed all 1 tests > > > > > From a btrfs point of view, it makes sense for the cat command to > > succeed, since all the file data is in the page cache. > > That was my intuition as well, as I mentioned above. > > > > > I don't know why the cat command fails (-EIO) with xfs and ext4, and I > > haven't looked at any of their source code (my knowledge of those > > filesystem's internals is too little anyway). > > > > An alternative to your patch is just to make the write with direct IO > > by passing -d to $XFS_IO_PROG. It makes the test succeed on these 4 > > filesystems. > > > > > > I like this idea a lot more than my patch! > > > > > > > Summary of my test matrix: > > > ext4: passes with and without this patch > > > btrfs: fails without this patch, passes with it > > > xfs: fails with and without this patch > > > > > > For that reason, I don't think this a good patch, I just mostly want to > > > figure out where to go with this test! > > > > > > Thanks, > > > Boris > > > > > > > > Signed-off-by: Boris Burkov <boris@bur.io> > > > > > --- > > > > > tests/generic/730 | 3 +++ > > > > > 1 file changed, 3 insertions(+) > > > > > > > > > > diff --git a/tests/generic/730 b/tests/generic/730 > > > > > index 11308cdaa..ca5037c57 100755 > > > > > --- a/tests/generic/730 > > > > > +++ b/tests/generic/730 > > > > > @@ -47,6 +47,9 @@ exec 3< $SCSI_DEBUG_MNT/testfile > > > > > # delete the scsi debug device while it still has dirty data > > > > > echo 1 > /sys/block/$(_short_dev $SCSI_DEBUG_DEV)/device/delete > > > > > > > > > > +sync > > > > > +echo 3 > /proc/sys/vm/drop_caches > > > > > + > > > > > # try to read from the file, which should give us -EIO > > > > > cat <&3 > /dev/null > > > > > > > > > > -- > > > > > 2.43.0 > > > > > > > > > > > > >
When a dm Flakey device is configured, (or similar dm where both physical and dm devices are accessible) we have access to both the physical device and the dm flakey device, ensure that the physical device mount fails. Signed-off-by: Anand Jain <anand.jain@oracle.com> --- Remove redirect for rm Add _require_test Add _filter_error_mount Change log - make it more sound generic with a dm device Add quick group tests/generic/741 | 60 +++++++++++++++++++++++++++++++++++++++++++ tests/generic/741.out | 3 +++ 2 files changed, 63 insertions(+) create mode 100755 tests/generic/741 create mode 100644 tests/generic/741.out diff --git a/tests/generic/741 b/tests/generic/741 new file mode 100755 index 000000000000..f8f9a7be7619 --- /dev/null +++ b/tests/generic/741 @@ -0,0 +1,60 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2024 Oracle. All Rights Reserved. +# +# FS QA Test 741 +# +# Attempt to mount both the DM physical device and the DM flakey device. +# Verify the returned error message. +# +. ./common/preamble +_begin_fstest auto quick volume tempfsid + +# Override the default cleanup function. +_cleanup() +{ + umount $extra_mnt &> /dev/null + rm -rf $extra_mnt + _unmount_flakey + _cleanup_flakey + cd / + rm -r -f $tmp.* +} + +# Import common functions. +. ./common/filter +. ./common/dmflakey + +# real QA test starts here +_supported_fs generic +_require_test +_require_scratch +_require_dm_target flakey + +[ "$FSTYP" = "btrfs" ] && _fixed_by_kernel_commit XXXXXXXXXXXX \ + "btrfs: return accurate error code on open failure" + +_scratch_mkfs >> $seqres.full +_init_flakey +_mount_flakey + +extra_mnt=$TEST_DIR/extra_mnt +rm -rf $extra_mnt +mkdir -p $extra_mnt + +# Mount must fail because the physical device has a dm created on it. +# Filters alter the return code of the mount. +_mount $SCRATCH_DEV $extra_mnt 2>&1 | \ + _filter_testdir_and_scratch | _filter_error_mount + +# Try again with flakey unmounted, must fail. +_unmount_flakey +_mount $SCRATCH_DEV $extra_mnt 2>&1 | \ + _filter_testdir_and_scratch | _filter_error_mount + +# Removing dm should make mount successful. +_cleanup_flakey +_scratch_mount + +status=0 +exit diff --git a/tests/generic/741.out b/tests/generic/741.out new file mode 100644 index 000000000000..b694f5fad6b8 --- /dev/null +++ b/tests/generic/741.out @@ -0,0 +1,3 @@ +QA output created by 741 +mount: TEST_DIR/extra_mnt: SCRATCH_DEV already mounted or mount point busy +mount: TEST_DIR/extra_mnt: SCRATCH_DEV already mounted or mount point busy -- 2.39.3
Given that ext4 also allows mounting of a cloned filesystem, the btrfs test case btrfs/312, which assesses the functionality of cloned filesystem support, can be refactored to be under the shared group. Signed-off-by: Anand Jain <anand.jain@oracle.com> --- v2: Move to shared testcase instead of generic. Add _require_block_device $TEST_DEV. Add _require_duplicated_fsid. common/rc | 14 +++++++ tests/btrfs/312 | 78 -------------------------------------- tests/btrfs/312.out | 19 ---------- tests/shared/001 | 89 ++++++++++++++++++++++++++++++++++++++++++++ tests/shared/001.out | 4 ++ 5 files changed, 107 insertions(+), 97 deletions(-) delete mode 100755 tests/btrfs/312 delete mode 100644 tests/btrfs/312.out create mode 100755 tests/shared/001 create mode 100644 tests/shared/001.out diff --git a/common/rc b/common/rc index 36cad89cfc5d..2638dfb8e9b3 100644 --- a/common/rc +++ b/common/rc @@ -5408,6 +5408,20 @@ _random_file() { echo "$basedir/$(ls -U $basedir | shuf -n 1)" } +_require_duplicate_fsid() +{ + case "$FSTYP" in + "btrfs") + _require_btrfs_fs_feature temp_fsid + ;; + "ext4") + ;; + *) + _notrun "$FSTYP cannot support mounting with duplicate fsid" + ;; + esac +} + init_rc ################################################################################ diff --git a/tests/btrfs/312 b/tests/btrfs/312 deleted file mode 100755 index eedcf11a2308..000000000000 --- a/tests/btrfs/312 +++ /dev/null @@ -1,78 +0,0 @@ -#! /bin/bash -# SPDX-License-Identifier: GPL-2.0 -# Copyright (c) 2024 Oracle. All Rights Reserved. -# -# FS QA Test 312 -# -# On a clone a device check to see if tempfsid is activated. -# -. ./common/preamble -_begin_fstest auto quick clone tempfsid - -_cleanup() -{ - cd / - $UMOUNT_PROG $mnt1 > /dev/null 2>&1 - rm -r -f $tmp.* - rm -r -f $mnt1 -} - -. ./common/filter.btrfs -. ./common/reflink - -_supported_fs btrfs -_require_scratch_dev_pool 2 -_scratch_dev_pool_get 2 -_require_btrfs_fs_feature temp_fsid - -mnt1=$TEST_DIR/$seq/mnt1 -mkdir -p $mnt1 - -create_cloned_devices() -{ - local dev1=$1 - local dev2=$2 - - echo -n Creating cloned device... - _mkfs_dev -fq -b $((1024 * 1024 * 300)) $dev1 - - _mount $dev1 $SCRATCH_MNT - - $XFS_IO_PROG -fc 'pwrite -S 0x61 0 9000' $SCRATCH_MNT/foo | \ - _filter_xfs_io - $UMOUNT_PROG $SCRATCH_MNT - # device dump of $dev1 to $dev2 - dd if=$dev1 of=$dev2 bs=300M count=1 conv=fsync status=none || \ - _fail "dd failed: $?" - echo done -} - -mount_cloned_device() -{ - echo ---- $FUNCNAME ---- - create_cloned_devices ${SCRATCH_DEV_NAME[0]} ${SCRATCH_DEV_NAME[1]} - - echo Mounting original device - _mount ${SCRATCH_DEV_NAME[0]} $SCRATCH_MNT - $XFS_IO_PROG -fc 'pwrite -S 0x61 0 9000' $SCRATCH_MNT/foo | \ - _filter_xfs_io - check_fsid ${SCRATCH_DEV_NAME[0]} - - echo Mounting cloned device - _mount ${SCRATCH_DEV_NAME[1]} $mnt1 || \ - _fail "mount failed, tempfsid didn't work" - - echo cp reflink must fail - _cp_reflink $SCRATCH_MNT/foo $mnt1/bar 2>&1 | \ - _filter_testdir_and_scratch - - check_fsid ${SCRATCH_DEV_NAME[1]} -} - -mount_cloned_device - -_scratch_dev_pool_put - -# success, all done -status=0 -exit diff --git a/tests/btrfs/312.out b/tests/btrfs/312.out deleted file mode 100644 index b7de6ce3cc6e..000000000000 --- a/tests/btrfs/312.out +++ /dev/null @@ -1,19 +0,0 @@ -QA output created by 312 ----- mount_cloned_device ---- -Creating cloned device...wrote 9000/9000 bytes at offset 0 -XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) -done -Mounting original device -wrote 9000/9000 bytes at offset 0 -XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) -On disk fsid: FSID -Metadata uuid: FSID -Temp fsid: FSID -Tempfsid status: 0 -Mounting cloned device -cp reflink must fail -cp: failed to clone 'TEST_DIR/312/mnt1/bar' from 'SCRATCH_MNT/foo': Invalid cross-device link -On disk fsid: FSID -Metadata uuid: FSID -Temp fsid: TEMPFSID -Tempfsid status: 1 diff --git a/tests/shared/001 b/tests/shared/001 new file mode 100755 index 000000000000..3f2b85a41099 --- /dev/null +++ b/tests/shared/001 @@ -0,0 +1,89 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2024 Oracle. All Rights Reserved. +# +# FS QA Test 001 +# +# Set up a filesystem, create a clone, mount both, and verify if the cp reflink +# operation between these two mounts fails. +# +. ./common/preamble +_begin_fstest auto quick clone volume tempfsid + +_cleanup() +{ + cd / + rm -r -f $tmp.* + + $UMOUNT_PROG $mnt2 &> /dev/null + rm -r -f $mnt2 + _destroy_loop_device $loop_dev2 &> /dev/null + rm -r -f $loop_file2 + + $UMOUNT_PROG $mnt1 &> /dev/null + rm -r -f $mnt1 + _destroy_loop_device $loop_dev1 &> /dev/null + rm -r -f $loop_file1 +} + +. ./common/filter +. ./common/reflink + +# Modify as appropriate. +_supported_fs btrfs ext4 +_require_duplicate_fsid +_require_cp_reflink +_require_test +_require_block_device $TEST_DEV +_require_loop + +[[ $FSTYP == "btrfs" ]] && _require_btrfs_fs_feature temp_fsid + +clone_filesystem() +{ + local dev1=$1 + local dev2=$2 + + _mkfs_dev $dev1 + + _mount $dev1 $mnt1 + $XFS_IO_PROG -fc 'pwrite -S 0x61 0 9000' $mnt1/foo >> $seqres.full + $UMOUNT_PROG $mnt1 + + # device dump of $dev1 to $dev2 + dd if=$dev1 of=$dev2 conv=fsync status=none || _fail "dd failed: $?" +} + +mnt1=$TEST_DIR/$seq/mnt1 +rm -r -f $mnt1 +mkdir -p $mnt1 + +mnt2=$TEST_DIR/$seq/mnt2 +rm -r -f $mnt2 +mkdir -p $mnt2 + +loop_file1="$TEST_DIR/$seq/image1" +rm -r -f $loop_file1 +truncate -s 300m "$loop_file1" +loop_dev1=$(_create_loop_device "$loop_file1") + +loop_file2="$TEST_DIR/$seq/image2" +rm -r -f $loop_file2 +truncate -s 300m "$loop_file2" +loop_dev2=$(_create_loop_device "$loop_file2") + +clone_filesystem ${loop_dev1} ${loop_dev2} + +# Mounting original device +_mount $loop_dev1 $mnt1 +$XFS_IO_PROG -fc 'pwrite -S 0x61 0 9000' $mnt1/foo | _filter_xfs_io + +# Mounting cloned device +_mount $loop_dev2 $mnt2 || _fail "mount of cloned device failed" + +# cp reflink across two different filesystems must fail +_cp_reflink $mnt1/foo $mnt2/bar 2>&1 | _filter_test_dir + +# success, all done +status=0 +exit diff --git a/tests/shared/001.out b/tests/shared/001.out new file mode 100644 index 000000000000..56b697ca3972 --- /dev/null +++ b/tests/shared/001.out @@ -0,0 +1,4 @@ +QA output created by 001 +wrote 9000/9000 bytes at offset 0 +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +cp: failed to clone 'TEST_DIR/001/mnt2/bar' from 'TEST_DIR/001/mnt1/foo': Invalid cross-device link -- 2.39.3
v2: Fixes as per review comments on the individual patches. Patch 1/2 relocates a tempfsid (clone device) test case from btrfs group to the generic group. Patch 2/2 validates a recently discovered and resolved bug in Btrfs; however, the test case can be made generic. Anand Jain (2): shared: move btrfs clone device testcase to the shared group generic: test mount fails on physical device with configured dm volume common/rc | 14 +++++++ tests/btrfs/312 | 78 ------------------------------------- tests/btrfs/312.out | 19 --------- tests/generic/741 | 60 +++++++++++++++++++++++++++++ tests/generic/741.out | 3 ++ tests/shared/001 | 89 +++++++++++++++++++++++++++++++++++++++++++ tests/shared/001.out | 4 ++ 7 files changed, 170 insertions(+), 97 deletions(-) delete mode 100755 tests/btrfs/312 delete mode 100644 tests/btrfs/312.out create mode 100755 tests/generic/741 create mode 100644 tests/generic/741.out create mode 100755 tests/shared/001 create mode 100644 tests/shared/001.out -- 2.39.3
On Sat, Mar 16, 2024 at 01:08:57PM +0000, Filipe Manana wrote: > On Fri, Mar 15, 2024 at 11:30 PM Boris Burkov <boris@bur.io> wrote: > > > > On Thu, Mar 14, 2024 at 07:56:28PM -0700, Darrick J. Wong wrote: > > > On Wed, Mar 13, 2024 at 04:47:47PM -0700, Boris Burkov wrote: > > > > This test removes a SCSI debug device out from under a mounted > > > > filesystem with a (probably) dirty file. This assumes that page cache > > > > cannot save us from EIO, for a reason that I can't quite explain. In > > > > fact, this test fails for exactly that reason, at least on btrfs. > > > > > > > > The original patches: > > > > > > > > https://lore.kernel.org/fstests/20230807112100.GB15405@lst.de/ > > > > > > > > refer to this passing on xfs and not btrfs, so I suspect I am missing > > > > something. With that said, on my machine this actually fails on xfs with > > > > and without my patch, so this is clearly not enough. > > > > > > > > High level, I am trying to understand what is really the expected > > > > behavior from a filesystem under this condition and what this test is > > > > getting at. Of btrfs, ext4, and xfs, only ext4 passes it, while btrfs > > > > does pass with this additional syncing/cache dropping to nudge it to an > > > > error. > > > > > > Does btrfs prefetch pagecache data as soon as a file opens? Or I guess > > > it could be that xfs trips an IO error and shuts down, and > > > xfs_file_read_iter will return EIO after that happens. > > > > > > --D > > > > > > > I might be wildly misunderstanding, since I'm very far from an expert on > > the page cache, but didn't we just do a buffered write to the file? So it > > would make sense for it to be in cache? > > > > At any rate, what I observed on my system was that xfs does not pass > > this test. Curious if that is only on my system or just the current > > state of the test. > > Instead of xfs, do you mean btrfs? > > For me xfs and ext4 always pass this test, i.e. the cat command fails > with -EIO, both with and without your patch. > > On the other hand, btrfs and f2fs always fail without your patch, and > pass with your patch applied. $ sudo ./check generic/730 FSTYP -- xfs (non-debug) PLATFORM -- Linux/x86_64 v0 6.8.0-rc7+ #205 SMP PREEMPT_DYNAMIC Thu Mar 14 16:44:32 PDT 2024 MKFS_OPTIONS -- -f /dev/mapper/tst-scr0 MOUNT_OPTIONS -- /dev/mapper/tst-scr0 /mnt/scratch generic/730 1s ... [failed, exit status 1]- output mismatch (see /home/vmuser/fstests/results//generic/730.out.bad) --- tests/generic/730.out 2023-11-01 22:14:50.011000000 +0000 +++ /home/vmuser/fstests/results//generic/730.out.bad 2024-03-16 14:38:02.431000000 +0000 @@ -1,2 +1 @@ QA output created by 730 -cat: -: Input/output error ... (Run 'diff -u /home/vmuser/fstests/tests/generic/730.out /home/vmuser/fstests/results//generic/730.out.bad' to see the entire diff) Ran: generic/730 Failures: generic/730 Failed 1 of 1 tests I suspect it has something to do with my Kconfig. Perhaps "xfs (non-debug)" is a clue? > > From a btrfs point of view, it makes sense for the cat command to > succeed, since all the file data is in the page cache. That was my intuition as well, as I mentioned above. > > I don't know why the cat command fails (-EIO) with xfs and ext4, and I > haven't looked at any of their source code (my knowledge of those > filesystem's internals is too little anyway). > > An alternative to your patch is just to make the write with direct IO > by passing -d to $XFS_IO_PROG. It makes the test succeed on these 4 > filesystems. > > I like this idea a lot more than my patch! > > > > Summary of my test matrix: > > ext4: passes with and without this patch > > btrfs: fails without this patch, passes with it > > xfs: fails with and without this patch > > > > For that reason, I don't think this a good patch, I just mostly want to > > figure out where to go with this test! > > > > Thanks, > > Boris > > > > > > Signed-off-by: Boris Burkov <boris@bur.io> > > > > --- > > > > tests/generic/730 | 3 +++ > > > > 1 file changed, 3 insertions(+) > > > > > > > > diff --git a/tests/generic/730 b/tests/generic/730 > > > > index 11308cdaa..ca5037c57 100755 > > > > --- a/tests/generic/730 > > > > +++ b/tests/generic/730 > > > > @@ -47,6 +47,9 @@ exec 3< $SCSI_DEBUG_MNT/testfile > > > > # delete the scsi debug device while it still has dirty data > > > > echo 1 > /sys/block/$(_short_dev $SCSI_DEBUG_DEV)/device/delete > > > > > > > > +sync > > > > +echo 3 > /proc/sys/vm/drop_caches > > > > + > > > > # try to read from the file, which should give us -EIO > > > > cat <&3 > /dev/null > > > > > > > > -- > > > > 2.43.0 > > > > > > > > > >
On Fri, Mar 15, 2024 at 11:30 PM Boris Burkov <boris@bur.io> wrote: > > On Thu, Mar 14, 2024 at 07:56:28PM -0700, Darrick J. Wong wrote: > > On Wed, Mar 13, 2024 at 04:47:47PM -0700, Boris Burkov wrote: > > > This test removes a SCSI debug device out from under a mounted > > > filesystem with a (probably) dirty file. This assumes that page cache > > > cannot save us from EIO, for a reason that I can't quite explain. In > > > fact, this test fails for exactly that reason, at least on btrfs. > > > > > > The original patches: > > > > > > https://lore.kernel.org/fstests/20230807112100.GB15405@lst.de/ > > > > > > refer to this passing on xfs and not btrfs, so I suspect I am missing > > > something. With that said, on my machine this actually fails on xfs with > > > and without my patch, so this is clearly not enough. > > > > > > High level, I am trying to understand what is really the expected > > > behavior from a filesystem under this condition and what this test is > > > getting at. Of btrfs, ext4, and xfs, only ext4 passes it, while btrfs > > > does pass with this additional syncing/cache dropping to nudge it to an > > > error. > > > > Does btrfs prefetch pagecache data as soon as a file opens? Or I guess > > it could be that xfs trips an IO error and shuts down, and > > xfs_file_read_iter will return EIO after that happens. > > > > --D > > > > I might be wildly misunderstanding, since I'm very far from an expert on > the page cache, but didn't we just do a buffered write to the file? So it > would make sense for it to be in cache? > > At any rate, what I observed on my system was that xfs does not pass > this test. Curious if that is only on my system or just the current > state of the test. Instead of xfs, do you mean btrfs? For me xfs and ext4 always pass this test, i.e. the cat command fails with -EIO, both with and without your patch. On the other hand, btrfs and f2fs always fail without your patch, and pass with your patch applied. From a btrfs point of view, it makes sense for the cat command to succeed, since all the file data is in the page cache. I don't know why the cat command fails (-EIO) with xfs and ext4, and I haven't looked at any of their source code (my knowledge of those filesystem's internals is too little anyway). An alternative to your patch is just to make the write with direct IO by passing -d to $XFS_IO_PROG. It makes the test succeed on these 4 filesystems. > > Summary of my test matrix: > ext4: passes with and without this patch > btrfs: fails without this patch, passes with it > xfs: fails with and without this patch > > For that reason, I don't think this a good patch, I just mostly want to > figure out where to go with this test! > > Thanks, > Boris > > > > Signed-off-by: Boris Burkov <boris@bur.io> > > > --- > > > tests/generic/730 | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/tests/generic/730 b/tests/generic/730 > > > index 11308cdaa..ca5037c57 100755 > > > --- a/tests/generic/730 > > > +++ b/tests/generic/730 > > > @@ -47,6 +47,9 @@ exec 3< $SCSI_DEBUG_MNT/testfile > > > # delete the scsi debug device while it still has dirty data > > > echo 1 > /sys/block/$(_short_dev $SCSI_DEBUG_DEV)/device/delete > > > > > > +sync > > > +echo 3 > /proc/sys/vm/drop_caches > > > + > > > # try to read from the file, which should give us -EIO > > > cat <&3 > /dev/null > > > > > > -- > > > 2.43.0 > > > > > > >
On Thu, Mar 14, 2024 at 07:56:28PM -0700, Darrick J. Wong wrote: > On Wed, Mar 13, 2024 at 04:47:47PM -0700, Boris Burkov wrote: > > This test removes a SCSI debug device out from under a mounted > > filesystem with a (probably) dirty file. This assumes that page cache > > cannot save us from EIO, for a reason that I can't quite explain. In > > fact, this test fails for exactly that reason, at least on btrfs. > > > > The original patches: > > > > https://lore.kernel.org/fstests/20230807112100.GB15405@lst.de/ > > > > refer to this passing on xfs and not btrfs, so I suspect I am missing > > something. With that said, on my machine this actually fails on xfs with > > and without my patch, so this is clearly not enough. > > > > High level, I am trying to understand what is really the expected > > behavior from a filesystem under this condition and what this test is > > getting at. Of btrfs, ext4, and xfs, only ext4 passes it, while btrfs > > does pass with this additional syncing/cache dropping to nudge it to an > > error. > > Does btrfs prefetch pagecache data as soon as a file opens? Or I guess > it could be that xfs trips an IO error and shuts down, and > xfs_file_read_iter will return EIO after that happens. > > --D > I might be wildly misunderstanding, since I'm very far from an expert on the page cache, but didn't we just do a buffered write to the file? So it would make sense for it to be in cache? At any rate, what I observed on my system was that xfs does not pass this test. Curious if that is only on my system or just the current state of the test. Summary of my test matrix: ext4: passes with and without this patch btrfs: fails without this patch, passes with it xfs: fails with and without this patch For that reason, I don't think this a good patch, I just mostly want to figure out where to go with this test! Thanks, Boris > > Signed-off-by: Boris Burkov <boris@bur.io> > > --- > > tests/generic/730 | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/tests/generic/730 b/tests/generic/730 > > index 11308cdaa..ca5037c57 100755 > > --- a/tests/generic/730 > > +++ b/tests/generic/730 > > @@ -47,6 +47,9 @@ exec 3< $SCSI_DEBUG_MNT/testfile > > # delete the scsi debug device while it still has dirty data > > echo 1 > /sys/block/$(_short_dev $SCSI_DEBUG_DEV)/device/delete > > > > +sync > > +echo 3 > /proc/sys/vm/drop_caches > > + > > # try to read from the file, which should give us -EIO > > cat <&3 > /dev/null > > > > -- > > 2.43.0 > > > >
Since commit 9bab148bb3c7 ("common/fuzzy: exercise the filesystem a little harder after repairing") funtion _scratch_fuzz_modify() has become xfs-specific due to the use of some functions that assume this filesytem, namely _xfs_force_bdev() and _xfs_has_feature(). Ensure _scratch_fuzz_modify() works again with other filesystems by using these functions only when testing xfs. Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> --- common/fuzzy | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/common/fuzzy b/common/fuzzy index f5d45cb28f07..218fe1654386 100644 --- a/common/fuzzy +++ b/common/fuzzy @@ -8,15 +8,17 @@ _scratch_fuzz_modify() { echo "+++ stressing filesystem" mkdir -p $SCRATCH_MNT/data - _xfs_force_bdev data $SCRATCH_MNT/data + [ "$FSTYP" == "xfs" ] && _xfs_force_bdev data $SCRATCH_MNT/data $FSSTRESS_PROG -n $((TIME_FACTOR * 10000)) -p $((LOAD_FACTOR * 4)) -d $SCRATCH_MNT/data - if _xfs_has_feature "$SCRATCH_MNT" realtime; then - mkdir -p $SCRATCH_MNT/rt - _xfs_force_bdev realtime $SCRATCH_MNT/rt - $FSSTRESS_PROG -n $((TIME_FACTOR * 10000)) -p $((LOAD_FACTOR * 4)) -d $SCRATCH_MNT/rt - else - echo "+++ xfs realtime not configured" + if [ "$FSTYP" = "xfs" ]; then + if _xfs_has_feature "$SCRATCH_MNT" realtime; then + mkdir -p $SCRATCH_MNT/rt + _xfs_force_bdev realtime $SCRATCH_MNT/rt + $FSSTRESS_PROG -n $((TIME_FACTOR * 10000)) -p $((LOAD_FACTOR * 4)) -d $SCRATCH_MNT/rt + else + echo "+++ xfs realtime not configured" + fi fi }
Test ext4/006 takes into account the number of lines produced by its own output. However, changes introduced to function _scratch_fuzz_modify() by commit 9bab148bb3c7 ("common/fuzzy: exercise the filesystem a little harder after repairing"), modified the output. Namely, the following three lines were removed: echo "+++ touch ${nr} files" echo "+++ create files" echo "+++ remove files" And a new one was added: echo "+++ stressing filesystem" However, the usage of 'fsstress' also added an extra line with: printf("seed = %ld\n", seed); So the delta is one line (-3 + 2). Modify test ext4/006 to take this change into account. Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> --- tests/ext4/006 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/ext4/006 b/tests/ext4/006 index 8792167bd9c2..b73692aa5b81 100755 --- a/tests/ext4/006 +++ b/tests/ext4/006 @@ -131,7 +131,7 @@ echo "++ check fs (2)" >> $seqres.full _check_scratch_fs >> $seqres.full 2>&1 grep -E -q '(did not fix|makes no progress)' $seqres.full && echo "e2fsck failed" | tee -a $seqres.full -if [ "$(wc -l < "$ROUND2_LOG")" -ne 8 ]; then +if [ "$(wc -l < "$ROUND2_LOG")" -ne 7 ]; then echo "e2fsck did not fix everything" | tee -a $seqres.full fi echo "finished fuzzing" | tee -a "$seqres.full"
Hi! I guess test ext4/006 isn't executed very often, because as far as I can see it has been broken for quite some time. The problem is that the function _scratch_fuzz_modify() is executing xfs-specific operations that will make this test fail. The first patch updates this function so that xfs-specific operations are executed only when that filesystem is being tested. The second patch simply updates the test to take into account the new output of the function (and yeah, counting log lines is an ugly hack). Changes since v1: * Updated the second patch commit message to add further details on what changed with the test output. Luis Henriques (SUSE) (2): common/fuzzy: make _scratch_fuzz_modify work for non-xfs filesystems ext4/006: take into account updates to _scratch_fuzz_modify() common/fuzzy | 16 +++++++++------- tests/ext4/006 | 2 +- 2 files changed, 10 insertions(+), 8 deletions(-)
On Fri, Mar 15, 2024 at 09:11:32AM +0000, Luis Henriques wrote: > "Darrick J. Wong" <djwong@kernel.org> writes: > > > On Thu, Mar 14, 2024 at 05:25:12PM +0000, Luis Henriques (SUSE) wrote: > >> Since function _scratch_fuzz_modify() has been updated, its output has > >> changed. Modify test ext4/006 to take this change into account. > > > > Now that you've blocked off the XFS stuff, what line changed? > > Well, there are three lines that are gone with commit 9bab148bb3c7: > > echo "+++ touch ${nr} files" > echo "+++ create files" > echo "+++ remove files" > > And a new one was added: > > echo "+++ stressing filesystem" > > However, running fsstress will also add a new line with: > > printf("seed = %ld\n", seed); > > So the delta is one line (seven instead of eight). Ah, ok. Can you copy-paste that into the commit message so that we have some record of what caused the need for the adjustment, please? With that added, Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > Cheers, > -- > Luís > > > > > --D > > > >> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> > >> --- > >> tests/ext4/006 | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/tests/ext4/006 b/tests/ext4/006 > >> index 8792167bd9c2..b73692aa5b81 100755 > >> --- a/tests/ext4/006 > >> +++ b/tests/ext4/006 > >> @@ -131,7 +131,7 @@ echo "++ check fs (2)" >> $seqres.full > >> _check_scratch_fs >> $seqres.full 2>&1 > >> > >> grep -E -q '(did not fix|makes no progress)' $seqres.full && echo "e2fsck failed" | tee -a $seqres.full > >> -if [ "$(wc -l < "$ROUND2_LOG")" -ne 8 ]; then > >> +if [ "$(wc -l < "$ROUND2_LOG")" -ne 7 ]; then > >> echo "e2fsck did not fix everything" | tee -a $seqres.full > >> fi > >> echo "finished fuzzing" | tee -a "$seqres.full" > >> >