linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] xfs: add writepage map error tag
@ 2018-10-23 18:15 Brian Foster
  2018-10-23 18:16 ` [PATCH] tests/xfs: writepage map error unmount test Brian Foster
  0 siblings, 1 reply; 4+ messages in thread
From: Brian Foster @ 2018-10-23 18:15 UTC (permalink / raw)
  To: linux-xfs

Add an error tag to inject errors in the writeback block mapping
codepath. This facilitates testing of the error path responsible for
discarding delalloc blocks that could not be converted to real
blocks.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

Hi all,

I'm sending this along with a quick xfstest to call attention to an
issue I ran into when poking around the writepage error handling path.
The test demonstrates the details, but the short of it is that returning
an error from ->writepages() interrupts a sync writeback and allows us
to unmount a filesystem with dirty+delalloc pages. This behavior can
also be observed with something like the following:

# xfs_io -fc "pwrite 0 64k" /mnt/file
wrote 65536/65536 bytes at offset 0
64 KiB, 16 ops; 0.0000 sec (102.627 MiB/sec and 26272.5780 ops/sec)
# xfs_io -c fsync /mnt/file 
fsync: Input/output error
# xfs_io -c fsync /mnt/file 
fsync: Input/output error
# xfs_io -c fsync /mnt/file 
fsync: Input/output error

... and so on until we finally process every dirty page.

A quick hack to return 0 from xfs_writepage_map() changes the behavior
to process (i.e. toss/discard) all of the dirty pages as if writeback
was successful. I don't necessarily think that is the right approach
since ->writepage() should probably return an error even if
->writepages() doesn't, but it raises the question as to what the
expected behavior should be in this case.

ISTM that we have a bit of an impedence mismatch between what a return
code and mapping error mean vs. what XFS does. XFS essentially tosses
the page by throwing away delalloc blocks, clearing uptodate status on
the page and setting an error on the page/mapping (the latter of which
makes it to userspace on fsync, for example). By also returning an
error, we're telling mm not to continue processing the writeback, but
should we be doing that if we need to process these dirty pages whether
writeback succeeds or fails? For background writeback it might not
matter, it will loop around again until all pages are processed, but for
an integrity sync (or the umount case)?

Further, it also seems we're a bit inconsistent with page states between
writepage failures and I/O errors. The former clears the uptodate status
and thus immediately loses the page content, whereas the latter looks
like it just clears writeback and the content presumably sticks around
if/until the page is reclaimed and the old data is read from disk. This
is somewhat of a separate issue [1], however..

I need to poke around the writeback code some more, but I was thinking
of doing something like filtering out an error return somehow or another
if we're in ->writepages() and doing an integrity writeback. Any high
level thoughts on that? Other ideas?

Brian

[1] I'm mulling over the idea of deferring all such page processing to a
workqueue by setting some kind of an inode flag (a la inode reclaim,
eofblocks processing, etc.) such that we can make page error processing
1.) consistent, 2.) properly serialized with buffered writes and 3.)
perhaps even configurable via something like the metadata error handling
tunables.

 fs/xfs/libxfs/xfs_errortag.h | 4 +++-
 fs/xfs/xfs_aops.c            | 6 ++++++
 fs/xfs/xfs_error.c           | 3 +++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
index 66077a105cbb..97c9eaa72dee 100644
--- a/fs/xfs/libxfs/xfs_errortag.h
+++ b/fs/xfs/libxfs/xfs_errortag.h
@@ -54,7 +54,8 @@
 #define XFS_ERRTAG_BUF_LRU_REF				31
 #define XFS_ERRTAG_FORCE_SCRUB_REPAIR			32
 #define XFS_ERRTAG_FORCE_SUMMARY_RECALC			33
-#define XFS_ERRTAG_MAX					34
+#define XFS_ERRTAG_WRITEPAGE_MAP			34
+#define XFS_ERRTAG_MAX					35
 
 /*
  * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -93,5 +94,6 @@
 #define XFS_RANDOM_BUF_LRU_REF				2
 #define XFS_RANDOM_FORCE_SCRUB_REPAIR			1
 #define XFS_RANDOM_FORCE_SUMMARY_RECALC			1
+#define XFS_RANDOM_WRITEPAGE_MAP			2
 
 #endif /* __XFS_ERRORTAG_H_ */
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 338b9d9984e0..0b5d41195282 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -21,6 +21,7 @@
 #include "xfs_bmap_util.h"
 #include "xfs_bmap_btree.h"
 #include "xfs_reflink.h"
+#include "xfs_errortag.h"
 #include <linux/writeback.h>
 
 /*
@@ -718,6 +719,11 @@ xfs_writepage_map(
 		if (iop && !test_bit(i, iop->uptodate))
 			continue;
 
+		if (XFS_TEST_ERROR(false, XFS_I(inode)->i_mount,
+				   XFS_ERRTAG_WRITEPAGE_MAP)) {
+			error = -EIO;
+			break;
+		}
 		error = xfs_map_blocks(wpc, inode, file_offset);
 		if (error)
 			break;
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index 9866f542e77b..e15ac398b1da 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -51,6 +51,7 @@ static unsigned int xfs_errortag_random_default[] = {
 	XFS_RANDOM_BUF_LRU_REF,
 	XFS_RANDOM_FORCE_SCRUB_REPAIR,
 	XFS_RANDOM_FORCE_SUMMARY_RECALC,
+	XFS_RANDOM_WRITEPAGE_MAP,
 };
 
 struct xfs_errortag_attr {
@@ -159,6 +160,7 @@ XFS_ERRORTAG_ATTR_RW(log_item_pin,	XFS_ERRTAG_LOG_ITEM_PIN);
 XFS_ERRORTAG_ATTR_RW(buf_lru_ref,	XFS_ERRTAG_BUF_LRU_REF);
 XFS_ERRORTAG_ATTR_RW(force_repair,	XFS_ERRTAG_FORCE_SCRUB_REPAIR);
 XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
+XFS_ERRORTAG_ATTR_RW(writepage_map,	XFS_ERRTAG_WRITEPAGE_MAP);
 
 static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -195,6 +197,7 @@ static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(buf_lru_ref),
 	XFS_ERRORTAG_ATTR_LIST(force_repair),
 	XFS_ERRORTAG_ATTR_LIST(bad_summary),
+	XFS_ERRORTAG_ATTR_LIST(writepage_map),
 	NULL,
 };
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH] tests/xfs: writepage map error unmount test
  2018-10-23 18:15 [PATCH] xfs: add writepage map error tag Brian Foster
@ 2018-10-23 18:16 ` Brian Foster
  2018-10-28 13:56   ` Eryu Guan
  0 siblings, 1 reply; 4+ messages in thread
From: Brian Foster @ 2018-10-23 18:16 UTC (permalink / raw)
  To: linux-xfs; +Cc: fstests

Certain failures in the page writeback codepath can cause a broader
writeback (i.e., a sync) sequence to exit prematurely. If this
occurs during an unmount, it's likely the filesystem will reclaim
inodes without having cleared all delayed allocation blocks. This
produces a warning and leaves the filesystem inconsistent.

This test reproduces this scenario using the 'writepage_map'
errortag. It performs delayed allocation, injects writeback errors
and immediately unmounts the filesystem.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

Note that this depends on a not yet available error tag.

Brian

 tests/xfs/494     | 57 +++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/494.out |  2 ++
 tests/xfs/group   |  1 +
 3 files changed, 60 insertions(+)
 create mode 100755 tests/xfs/494
 create mode 100644 tests/xfs/494.out

diff --git a/tests/xfs/494 b/tests/xfs/494
new file mode 100755
index 00000000..30877d38
--- /dev/null
+++ b/tests/xfs/494
@@ -0,0 +1,57 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2018 Red Hat, Inc.  All Rights Reserved.
+#
+# FS QA Test 494
+#
+# Simulate a serialization problem between page writeback error handling and
+# unmount. If ->writepages() fails and returns an error back to the mm
+# subsystem, writeback of subsequent pages is interrupted and the filesystem
+# unmounts before processing all dirty pages. This results in unmounting a
+# filesystem with dirty+delalloc pages, which in turn causes unmount time
+# warnings and filesystem inconsistency.
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/inject
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_error_injection
+_require_xfs_io_error_injection "writepage_map"
+_require_scratch
+
+_scratch_mkfs >> $seqres.full 2>&1
+
+_scratch_mount
+_scratch_inject_error "writepage_map" 1
+# use 512k to dirty multiple pages on large page size systems
+$XFS_IO_PROG -fc "pwrite 0 512k" $SCRATCH_MNT/file >> $seqres.full 2>&1
+_scratch_unmount
+
+echo Silence is golden
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/494.out b/tests/xfs/494.out
new file mode 100644
index 00000000..827c239a
--- /dev/null
+++ b/tests/xfs/494.out
@@ -0,0 +1,2 @@
+QA output created by 494
+Silence is golden
diff --git a/tests/xfs/group b/tests/xfs/group
index 2cec0585..136d3707 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -491,3 +491,4 @@
 491 auto quick fuzz
 492 auto quick fuzz
 493 auto quick fuzz
+494 auto quick
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] tests/xfs: writepage map error unmount test
  2018-10-23 18:16 ` [PATCH] tests/xfs: writepage map error unmount test Brian Foster
@ 2018-10-28 13:56   ` Eryu Guan
  2018-10-29  9:28     ` Brian Foster
  0 siblings, 1 reply; 4+ messages in thread
From: Eryu Guan @ 2018-10-28 13:56 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs, fstests

On Tue, Oct 23, 2018 at 02:16:38PM -0400, Brian Foster wrote:
> Certain failures in the page writeback codepath can cause a broader
> writeback (i.e., a sync) sequence to exit prematurely. If this
> occurs during an unmount, it's likely the filesystem will reclaim
> inodes without having cleared all delayed allocation blocks. This
> produces a warning and leaves the filesystem inconsistent.
> 
> This test reproduces this scenario using the 'writepage_map'
> errortag. It performs delayed allocation, injects writeback errors
> and immediately unmounts the filesystem.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> Note that this depends on a not yet available error tag.

I'll wait for the "error tag" patch goes in first, or gets some ACKs
first. One really minor issue below.

> 
> Brian
> 
>  tests/xfs/494     | 57 +++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/494.out |  2 ++
>  tests/xfs/group   |  1 +
>  3 files changed, 60 insertions(+)
>  create mode 100755 tests/xfs/494
>  create mode 100644 tests/xfs/494.out
> 
> diff --git a/tests/xfs/494 b/tests/xfs/494
> new file mode 100755
> index 00000000..30877d38
> --- /dev/null
> +++ b/tests/xfs/494
> @@ -0,0 +1,57 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2018 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test 494
> +#
> +# Simulate a serialization problem between page writeback error handling and
> +# unmount. If ->writepages() fails and returns an error back to the mm
> +# subsystem, writeback of subsequent pages is interrupted and the filesystem
> +# unmounts before processing all dirty pages. This results in unmounting a
> +# filesystem with dirty+delalloc pages, which in turn causes unmount time
> +# warnings and filesystem inconsistency.
> +#
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/inject
> +
> +# remove previous $seqres.full before test
> +rm -f $seqres.full
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs generic

Supported fs should be 'xfs' here.

Thanks,
Eryu

> +_supported_os Linux
> +_require_error_injection
> +_require_xfs_io_error_injection "writepage_map"
> +_require_scratch
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +
> +_scratch_mount
> +_scratch_inject_error "writepage_map" 1
> +# use 512k to dirty multiple pages on large page size systems
> +$XFS_IO_PROG -fc "pwrite 0 512k" $SCRATCH_MNT/file >> $seqres.full 2>&1
> +_scratch_unmount
> +
> +echo Silence is golden
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/xfs/494.out b/tests/xfs/494.out
> new file mode 100644
> index 00000000..827c239a
> --- /dev/null
> +++ b/tests/xfs/494.out
> @@ -0,0 +1,2 @@
> +QA output created by 494
> +Silence is golden
> diff --git a/tests/xfs/group b/tests/xfs/group
> index 2cec0585..136d3707 100644
> --- a/tests/xfs/group
> +++ b/tests/xfs/group
> @@ -491,3 +491,4 @@
>  491 auto quick fuzz
>  492 auto quick fuzz
>  493 auto quick fuzz
> +494 auto quick
> -- 
> 2.17.2
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] tests/xfs: writepage map error unmount test
  2018-10-28 13:56   ` Eryu Guan
@ 2018-10-29  9:28     ` Brian Foster
  0 siblings, 0 replies; 4+ messages in thread
From: Brian Foster @ 2018-10-29  9:28 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-xfs, fstests

On Sun, Oct 28, 2018 at 09:56:33PM +0800, Eryu Guan wrote:
> On Tue, Oct 23, 2018 at 02:16:38PM -0400, Brian Foster wrote:
> > Certain failures in the page writeback codepath can cause a broader
> > writeback (i.e., a sync) sequence to exit prematurely. If this
> > occurs during an unmount, it's likely the filesystem will reclaim
> > inodes without having cleared all delayed allocation blocks. This
> > produces a warning and leaves the filesystem inconsistent.
> > 
> > This test reproduces this scenario using the 'writepage_map'
> > errortag. It performs delayed allocation, injects writeback errors
> > and immediately unmounts the filesystem.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> > 
> > Note that this depends on a not yet available error tag.
> 
> I'll wait for the "error tag" patch goes in first, or gets some ACKs
> first. One really minor issue below.
> 

Yep. I probably should have been more clear and/or set this RFC. This
test is wholly dependent on if/whether/how to deal with the associated
error handling issue, which atm is still up in the air. The tag/test is
primarily to demonstrate the problem and may not be independently useful
until there's a supporting fix. I may float a patch for that soon..

> > 
> > Brian
> > 
> >  tests/xfs/494     | 57 +++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/494.out |  2 ++
> >  tests/xfs/group   |  1 +
> >  3 files changed, 60 insertions(+)
> >  create mode 100755 tests/xfs/494
> >  create mode 100644 tests/xfs/494.out
> > 
> > diff --git a/tests/xfs/494 b/tests/xfs/494
> > new file mode 100755
> > index 00000000..30877d38
> > --- /dev/null
> > +++ b/tests/xfs/494
> > @@ -0,0 +1,57 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2018 Red Hat, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test 494
> > +#
> > +# Simulate a serialization problem between page writeback error handling and
> > +# unmount. If ->writepages() fails and returns an error back to the mm
> > +# subsystem, writeback of subsequent pages is interrupted and the filesystem
> > +# unmounts before processing all dirty pages. This results in unmounting a
> > +# filesystem with dirty+delalloc pages, which in turn causes unmount time
> > +# warnings and filesystem inconsistency.
> > +#
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +
> > +here=`pwd`
> > +tmp=/tmp/$$
> > +status=1	# failure is the default!
> > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > +	cd /
> > +	rm -f $tmp.*
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/inject
> > +
> > +# remove previous $seqres.full before test
> > +rm -f $seqres.full
> > +
> > +# real QA test starts here
> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> 
> Supported fs should be 'xfs' here.
> 

Will fix, thanks.

Brian

> Thanks,
> Eryu
> 
> > +_supported_os Linux
> > +_require_error_injection
> > +_require_xfs_io_error_injection "writepage_map"
> > +_require_scratch
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +
> > +_scratch_mount
> > +_scratch_inject_error "writepage_map" 1
> > +# use 512k to dirty multiple pages on large page size systems
> > +$XFS_IO_PROG -fc "pwrite 0 512k" $SCRATCH_MNT/file >> $seqres.full 2>&1
> > +_scratch_unmount
> > +
> > +echo Silence is golden
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/xfs/494.out b/tests/xfs/494.out
> > new file mode 100644
> > index 00000000..827c239a
> > --- /dev/null
> > +++ b/tests/xfs/494.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 494
> > +Silence is golden
> > diff --git a/tests/xfs/group b/tests/xfs/group
> > index 2cec0585..136d3707 100644
> > --- a/tests/xfs/group
> > +++ b/tests/xfs/group
> > @@ -491,3 +491,4 @@
> >  491 auto quick fuzz
> >  492 auto quick fuzz
> >  493 auto quick fuzz
> > +494 auto quick
> > -- 
> > 2.17.2
> > 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-10-29 18:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-23 18:15 [PATCH] xfs: add writepage map error tag Brian Foster
2018-10-23 18:16 ` [PATCH] tests/xfs: writepage map error unmount test Brian Foster
2018-10-28 13:56   ` Eryu Guan
2018-10-29  9:28     ` Brian Foster

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).