All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET v2 0/1] fstests: fix unshare data corruption bug
@ 2023-09-25 21:43 Darrick J. Wong
  2023-09-25 21:43 ` [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded Darrick J. Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2023-09-25 21:43 UTC (permalink / raw)
  To: djwong, zlang; +Cc: linux-xfs, fstests, guan, ritesh.list, willy

Hi all,

I rebased djwong-dev atop 6.6-rc1, and discovered that the iomap unshare
code writes garbage into the unshared file if the unshared range covers
at least one base page's worth of file range and there weren't any
folios in the pagecache for that region.

The root cause is an optimization applied to __iomap_write_begin for 6.6
that caused it to ignore !uptodate folios.  This is fine for the
write/zeroing cases since they're going to write to the folio anyway,
but unshare merely marks the folio dirty and lets writeback handle the
unsharing.

While I was rooting around in there, I also noticed that the unshare
operation wasn't ported to use large folios.  This leads to suboptimal
performance if userspace funshares a file and continues using the page
cache, since the cache is now using base pages.

v2: reduce redundant variable access

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been lightly tested with fstests.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=iomap-fix-unshare

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=iomap-fix-unshare
---
 tests/xfs/1936     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1936.out |    4 ++
 2 files changed, 92 insertions(+)
 create mode 100755 tests/xfs/1936
 create mode 100644 tests/xfs/1936.out


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded
  2023-09-25 21:43 [PATCHSET v2 0/1] fstests: fix unshare data corruption bug Darrick J. Wong
@ 2023-09-25 21:43 ` Darrick J. Wong
  2023-09-26  5:21   ` Ritesh Harjani
  0 siblings, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2023-09-25 21:43 UTC (permalink / raw)
  To: djwong, zlang; +Cc: linux-xfs, fstests, guan, ritesh.list, willy

From: Darrick J. Wong <djwong@kernel.org>

Add a regression test for funsharing uncached files to ensure that we
actually manage the pagecache state correctly.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 tests/xfs/1936     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1936.out |    4 ++
 2 files changed, 92 insertions(+)
 create mode 100755 tests/xfs/1936
 create mode 100644 tests/xfs/1936.out


diff --git a/tests/xfs/1936 b/tests/xfs/1936
new file mode 100755
index 0000000000..e07b8f4796
--- /dev/null
+++ b/tests/xfs/1936
@@ -0,0 +1,88 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2023 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1936
+#
+# This is a regression test for the kernel commit noted below.  The stale
+# memory exposure can be exploited by creating a file with shared blocks,
+# evicting the page cache for that file, and then funshareing at least one
+# memory page's worth of data.  iomap will mark the page uptodate and dirty
+# without ever reading the ondisk contents.
+#
+. ./common/preamble
+_begin_fstest auto quick unshare clone
+
+_cleanup()
+{
+	cd /
+	rm -r -f $tmp.* $testdir
+}
+
+# real QA test starts here
+
+# Import common functions.
+. ./common/filter
+. ./common/attr
+. ./common/reflink
+
+_fixed_by_git_commit kernel XXXXXXXXXXXXX \
+	"iomap: don't skip reading in !uptodate folios when unsharing a range"
+
+# real QA test starts here
+_require_test_reflink
+_require_cp_reflink
+_require_xfs_io_command "funshare"
+
+testdir=$TEST_DIR/test-$seq
+rm -rf $testdir
+mkdir $testdir
+
+# Create a file that is at least four pages in size and aligned to the
+# file allocation unit size so that we don't trigger any unnecessary zeroing.
+pagesz=$(_get_page_size)
+alloc_unit=$(_get_file_block_size $TEST_DIR)
+filesz=$(( ( (4 * pagesz) + alloc_unit - 1) / alloc_unit * alloc_unit))
+
+echo "Create the original file and a clone"
+_pwrite_byte 0x61 0 $filesz $testdir/file2.chk >> $seqres.full
+_pwrite_byte 0x61 0 $filesz $testdir/file1 >> $seqres.full
+_cp_reflink $testdir/file1 $testdir/file2
+_cp_reflink $testdir/file1 $testdir/file3
+
+_test_cycle_mount
+
+cat $testdir/file3 > /dev/null
+
+echo "Funshare at least one pagecache page"
+$XFS_IO_PROG -c "funshare 0 $filesz" $testdir/file2
+$XFS_IO_PROG -c "funshare 0 $filesz" $testdir/file3
+_pwrite_byte 0x61 0 $filesz $testdir/file2.chk >> $seqres.full
+
+echo "Check contents"
+
+# file2 wasn't cached when it was unshared, but it should match
+if ! cmp -s $testdir/file2.chk $testdir/file2; then
+	echo "file2.chk does not match file2"
+
+	echo "file2.chk contents" >> $seqres.full
+	od -tx1 -Ad -c $testdir/file2.chk >> $seqres.full
+	echo "file2 contents" >> $seqres.full
+	od -tx1 -Ad -c $testdir/file2 >> $seqres.full
+	echo "end bad contents" >> $seqres.full
+fi
+
+# file3 was cached when it was unshared, and it should match
+if ! cmp -s $testdir/file2.chk $testdir/file3; then
+	echo "file2.chk does not match file3"
+
+	echo "file2.chk contents" >> $seqres.full
+	od -tx1 -Ad -c $testdir/file2.chk >> $seqres.full
+	echo "file3 contents" >> $seqres.full
+	od -tx1 -Ad -c $testdir/file3 >> $seqres.full
+	echo "end bad contents" >> $seqres.full
+fi
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/1936.out b/tests/xfs/1936.out
new file mode 100644
index 0000000000..c7c820ced5
--- /dev/null
+++ b/tests/xfs/1936.out
@@ -0,0 +1,4 @@
+QA output created by 1936
+Create the original file and a clone
+Funshare at least one pagecache page
+Check contents


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded
  2023-09-25 21:43 ` [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded Darrick J. Wong
@ 2023-09-26  5:21   ` Ritesh Harjani
  2023-09-26 14:47     ` Darrick J. Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Ritesh Harjani @ 2023-09-26  5:21 UTC (permalink / raw)
  To: Darrick J. Wong, djwong, zlang; +Cc: linux-xfs, fstests, guan, willy

"Darrick J. Wong" <djwong@kernel.org> writes:

> From: Darrick J. Wong <djwong@kernel.org>
>
> Add a regression test for funsharing uncached files to ensure that we
> actually manage the pagecache state correctly.
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  tests/xfs/1936     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1936.out |    4 ++
>  2 files changed, 92 insertions(+)
>  create mode 100755 tests/xfs/1936
>  create mode 100644 tests/xfs/1936.out
>
>
> diff --git a/tests/xfs/1936 b/tests/xfs/1936
> new file mode 100755
> index 0000000000..e07b8f4796
> --- /dev/null
> +++ b/tests/xfs/1936
> @@ -0,0 +1,88 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2023 Oracle.  All Rights Reserved.
> +#
> +# FS QA Test 1936
> +#
> +# This is a regression test for the kernel commit noted below.  The stale
> +# memory exposure can be exploited by creating a file with shared blocks,
> +# evicting the page cache for that file, and then funshareing at least one
> +# memory page's worth of data.  iomap will mark the page uptodate and dirty
> +# without ever reading the ondisk contents.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick unshare clone
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -r -f $tmp.* $testdir
> +}
> +
> +# real QA test starts here
> +
> +# Import common functions.
> +. ./common/filter
> +. ./common/attr

We might as well remove above imports if we are not using those in this test.

> +. ./common/reflink
> +
> +_fixed_by_git_commit kernel XXXXXXXXXXXXX \
> +	"iomap: don't skip reading in !uptodate folios when unsharing a range"

Once I guess it is merged, we will have the commit-id. Ohh wait, we have
it already right? 

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=35d30c9cf12730a1e37053dfde4007c7cc452d1a

With that the testcode looks good to me. Thanks for finding an easy
reproducer. Please feel free to add - 

Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

-ritesh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded
  2023-09-26  5:21   ` Ritesh Harjani
@ 2023-09-26 14:47     ` Darrick J. Wong
  0 siblings, 0 replies; 6+ messages in thread
From: Darrick J. Wong @ 2023-09-26 14:47 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: zlang, linux-xfs, fstests, guan, willy

On Tue, Sep 26, 2023 at 10:51:19AM +0530, Ritesh Harjani wrote:
> "Darrick J. Wong" <djwong@kernel.org> writes:
> 
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Add a regression test for funsharing uncached files to ensure that we
> > actually manage the pagecache state correctly.
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  tests/xfs/1936     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1936.out |    4 ++
> >  2 files changed, 92 insertions(+)
> >  create mode 100755 tests/xfs/1936
> >  create mode 100644 tests/xfs/1936.out
> >
> >
> > diff --git a/tests/xfs/1936 b/tests/xfs/1936
> > new file mode 100755
> > index 0000000000..e07b8f4796
> > --- /dev/null
> > +++ b/tests/xfs/1936
> > @@ -0,0 +1,88 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2023 Oracle.  All Rights Reserved.
> > +#
> > +# FS QA Test 1936
> > +#
> > +# This is a regression test for the kernel commit noted below.  The stale
> > +# memory exposure can be exploited by creating a file with shared blocks,
> > +# evicting the page cache for that file, and then funshareing at least one
> > +# memory page's worth of data.  iomap will mark the page uptodate and dirty
> > +# without ever reading the ondisk contents.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto quick unshare clone
> > +
> > +_cleanup()
> > +{
> > +	cd /
> > +	rm -r -f $tmp.* $testdir
> > +}
> > +
> > +# real QA test starts here
> > +
> > +# Import common functions.
> > +. ./common/filter
> > +. ./common/attr
> 
> We might as well remove above imports if we are not using those in this test.

<nod>

> > +. ./common/reflink
> > +
> > +_fixed_by_git_commit kernel XXXXXXXXXXXXX \
> > +	"iomap: don't skip reading in !uptodate folios when unsharing a range"
> 
> Once I guess it is merged, we will have the commit-id. Ohh wait, we have
> it already right? 
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=35d30c9cf12730a1e37053dfde4007c7cc452d1a

Oops, yeah, I'll update the tag.

> With that the testcode looks good to me. Thanks for finding an easy
> reproducer. Please feel free to add - 
> 
> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Thanks!

> -ritesh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded
  2023-10-09 18:19 ` [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded Darrick J. Wong
@ 2023-10-10  7:05   ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2023-10-10  7:05 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: zlang, Ritesh Harjani (IBM), linux-xfs, fstests, guan, willy

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded
  2023-10-09 18:18 [PATCHSET v3 0/1] fstests: fix unshare data corruption bug Darrick J. Wong
@ 2023-10-09 18:19 ` Darrick J. Wong
  2023-10-10  7:05   ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2023-10-09 18:19 UTC (permalink / raw)
  To: djwong, zlang
  Cc: Ritesh Harjani (IBM), linux-xfs, fstests, guan, ritesh.list, willy

From: Darrick J. Wong <djwong@kernel.org>

Add a regression test for funsharing uncached files to ensure that we
actually manage the pagecache state correctly.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 tests/xfs/1936     |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1936.out |    4 ++
 2 files changed, 91 insertions(+)
 create mode 100755 tests/xfs/1936
 create mode 100644 tests/xfs/1936.out


diff --git a/tests/xfs/1936 b/tests/xfs/1936
new file mode 100755
index 0000000000..4f183f17d7
--- /dev/null
+++ b/tests/xfs/1936
@@ -0,0 +1,87 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2023 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1936
+#
+# This is a regression test for the kernel commit noted below.  The stale
+# memory exposure can be exploited by creating a file with shared blocks,
+# evicting the page cache for that file, and then funshareing at least one
+# memory page's worth of data.  iomap will mark the page uptodate and dirty
+# without ever reading the ondisk contents.
+#
+. ./common/preamble
+_begin_fstest auto quick unshare clone
+
+_cleanup()
+{
+	cd /
+	rm -r -f $tmp.* $testdir
+}
+
+# real QA test starts here
+
+# Import common functions.
+. ./common/filter
+. ./common/reflink
+
+_fixed_by_git_commit kernel 35d30c9cf127 \
+	"iomap: don't skip reading in !uptodate folios when unsharing a range"
+
+# real QA test starts here
+_require_test_reflink
+_require_cp_reflink
+_require_xfs_io_command "funshare"
+
+testdir=$TEST_DIR/test-$seq
+rm -rf $testdir
+mkdir $testdir
+
+# Create a file that is at least four pages in size and aligned to the
+# file allocation unit size so that we don't trigger any unnecessary zeroing.
+pagesz=$(_get_page_size)
+alloc_unit=$(_get_file_block_size $TEST_DIR)
+filesz=$(( ( (4 * pagesz) + alloc_unit - 1) / alloc_unit * alloc_unit))
+
+echo "Create the original file and a clone"
+_pwrite_byte 0x61 0 $filesz $testdir/file2.chk >> $seqres.full
+_pwrite_byte 0x61 0 $filesz $testdir/file1 >> $seqres.full
+_cp_reflink $testdir/file1 $testdir/file2
+_cp_reflink $testdir/file1 $testdir/file3
+
+_test_cycle_mount
+
+cat $testdir/file3 > /dev/null
+
+echo "Funshare at least one pagecache page"
+$XFS_IO_PROG -c "funshare 0 $filesz" $testdir/file2
+$XFS_IO_PROG -c "funshare 0 $filesz" $testdir/file3
+_pwrite_byte 0x61 0 $filesz $testdir/file2.chk >> $seqres.full
+
+echo "Check contents"
+
+# file2 wasn't cached when it was unshared, but it should match
+if ! cmp -s $testdir/file2.chk $testdir/file2; then
+	echo "file2.chk does not match file2"
+
+	echo "file2.chk contents" >> $seqres.full
+	od -tx1 -Ad -c $testdir/file2.chk >> $seqres.full
+	echo "file2 contents" >> $seqres.full
+	od -tx1 -Ad -c $testdir/file2 >> $seqres.full
+	echo "end bad contents" >> $seqres.full
+fi
+
+# file3 was cached when it was unshared, and it should match
+if ! cmp -s $testdir/file2.chk $testdir/file3; then
+	echo "file2.chk does not match file3"
+
+	echo "file2.chk contents" >> $seqres.full
+	od -tx1 -Ad -c $testdir/file2.chk >> $seqres.full
+	echo "file3 contents" >> $seqres.full
+	od -tx1 -Ad -c $testdir/file3 >> $seqres.full
+	echo "end bad contents" >> $seqres.full
+fi
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/1936.out b/tests/xfs/1936.out
new file mode 100644
index 0000000000..c7c820ced5
--- /dev/null
+++ b/tests/xfs/1936.out
@@ -0,0 +1,4 @@
+QA output created by 1936
+Create the original file and a clone
+Funshare at least one pagecache page
+Check contents


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-10-10  7:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-25 21:43 [PATCHSET v2 0/1] fstests: fix unshare data corruption bug Darrick J. Wong
2023-09-25 21:43 ` [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded Darrick J. Wong
2023-09-26  5:21   ` Ritesh Harjani
2023-09-26 14:47     ` Darrick J. Wong
2023-10-09 18:18 [PATCHSET v3 0/1] fstests: fix unshare data corruption bug Darrick J. Wong
2023-10-09 18:19 ` [PATCH 1/1] generic: test FALLOC_FL_UNSHARE when pagecache is not loaded Darrick J. Wong
2023-10-10  7:05   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.