linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [xfstests PATCH v2 0/3] xfstest for updated writeback error handling
@ 2017-05-09 16:12 Jeff Layton
  2017-05-09 16:12 ` [xfstests PATCH v2 1/3] generic: add a writeback error handling test Jeff Layton
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Jeff Layton @ 2017-05-09 16:12 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, linux-btrfs, linux-ext4, linux-cifs,
	linux-nfs, linux-mm, jfs-discussion, linux-xfs, cluster-devel,
	linux-f2fs-devel, v9fs-developer, linux-nilfs, linux-block,
	fstests
  Cc: dhowells, akpm, hch, ross.zwisler, mawilcox, jack, viro, corbet,
	neilb, clm, tytso, axboe, josef, hubcap, rpeterso, bo.li.liu

I've numbered the new test as 999 for the moment so as not to collide with
tests being added while I've been working on this. I can change that and
resend if this should go in.

I'm working on a set of kernel patches to change how writeback errors
are handled and reported in the kernel. Instead of reporting a
writeback error to only the first fsync caller on the file, I aim
to make the kernel report them once on every file description:

   http://www.spinics.net/lists/kernel/msg2504453.html

This patch adds a test for the new behavior, on local filesystems that
can handle journalling to a separate device. Basically, open many fds to
the same file, turn on dm_error, write to each of the fds, and then
fsync them all to ensure that they all get an error back. Then, flip the
device back to working, reopen the files and ensure that no error is
reported.

With the kernel patch series in place, ext4 and xfs now pass. btrfs still
clears the error after the first fsync, so it seems like it still needs a
bit of work.

Jeff Layton (3):
  generic: add a writeback error handling test
  ext4: allow ext4 to use $SCRATCH_LOGDEV
  btrfs: allow it to use $SCRATCH_LOGDEV

 common/dmerror        |  13 +++--
 common/rc             |   5 ++
 src/Makefile          |   2 +-
 src/fsync-err.c       | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/999     |  75 +++++++++++++++++++++++++++
 tests/generic/999.out |   3 ++
 tests/generic/group   |   1 +
 tools/dmerror         |  47 +++++++++++++++++
 8 files changed, 278 insertions(+), 6 deletions(-)
 create mode 100644 src/fsync-err.c
 create mode 100755 tests/generic/999
 create mode 100644 tests/generic/999.out
 create mode 100755 tools/dmerror

-- 
2.9.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [xfstests PATCH v2 1/3] generic: add a writeback error handling test
  2017-05-09 16:12 [xfstests PATCH v2 0/3] xfstest for updated writeback error handling Jeff Layton
@ 2017-05-09 16:12 ` Jeff Layton
  2017-05-09 16:12 ` [xfstests PATCH v2 2/3] ext4: allow ext4 to use $SCRATCH_LOGDEV Jeff Layton
  2017-05-09 16:12 ` [xfstests PATCH v2 3/3] btrfs: allow it " Jeff Layton
  2 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2017-05-09 16:12 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, linux-btrfs, linux-ext4, linux-cifs,
	linux-nfs, linux-mm, jfs-discussion, linux-xfs, cluster-devel,
	linux-f2fs-devel, v9fs-developer, linux-nilfs, linux-block,
	fstests
  Cc: dhowells, akpm, hch, ross.zwisler, mawilcox, jack, viro, corbet,
	neilb, clm, tytso, axboe, josef, hubcap, rpeterso, bo.li.liu

I'm working on a set of kernel patches to change how writeback errors
are handled and reported in the kernel. Instead of reporting a
writeback error to only the first fsync caller on the file, I aim
to make the kernel report them once on every file description.

This patch adds a test for the new behavior. Basically, open many fds
to the same file, turn on dm_error, write to each of the fds, and then
fsync them all to ensure that they all get an error back.

To do that, I'm adding a new tools/dmerror script that the C program
can use to load the error table. For now, that's all it can do, but
we can fill it out with other commands as necessary.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 common/dmerror        |  13 +++--
 src/Makefile          |   2 +-
 src/fsync-err.c       | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/999     |  75 +++++++++++++++++++++++++++
 tests/generic/999.out |   3 ++
 tests/generic/group   |   1 +
 tools/dmerror         |  47 +++++++++++++++++
 7 files changed, 273 insertions(+), 6 deletions(-)
 create mode 100644 src/fsync-err.c
 create mode 100755 tests/generic/999
 create mode 100644 tests/generic/999.out
 create mode 100755 tools/dmerror

diff --git a/common/dmerror b/common/dmerror
index d46c5d0b7266..238baa213b1f 100644
--- a/common/dmerror
+++ b/common/dmerror
@@ -23,22 +23,25 @@ if [ $? -eq 0 ]; then
 	_notrun "Cannot run tests with DAX on dmerror devices"
 fi
 
-_dmerror_init()
+_dmerror_setup()
 {
 	local dm_backing_dev=$SCRATCH_DEV
 
-	$DMSETUP_PROG remove error-test > /dev/null 2>&1
-
 	local blk_dev_size=`blockdev --getsz $dm_backing_dev`
 
 	DMERROR_DEV='/dev/mapper/error-test'
 
 	DMLINEAR_TABLE="0 $blk_dev_size linear $dm_backing_dev 0"
 
+	DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
+}
+
+_dmerror_init()
+{
+	_dmerror_setup
+	$DMSETUP_PROG remove error-test > /dev/null 2>&1
 	$DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
 		_fatal "failed to create dm linear device"
-
-	DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
 }
 
 _dmerror_mount()
diff --git a/src/Makefile b/src/Makefile
index e5042c9b7d8e..eb8ff6c4a344 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -12,7 +12,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
 	godown resvtest writemod makeextents itrash rename \
 	multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
 	t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
-	holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd
+	holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd fsync-err
 
 LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
 	preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
diff --git a/src/fsync-err.c b/src/fsync-err.c
new file mode 100644
index 000000000000..8ad27b3da3e7
--- /dev/null
+++ b/src/fsync-err.c
@@ -0,0 +1,138 @@
+/*
+ * fsync-err.c: test whether writeback errors are reported to all open fds
+ * Copyright (c) 2017: Jeff Layton <jlayton@redhat.com>
+ *
+ * Open a file several times, write to it and then fsync. Flip dm_error over
+ * to make the backing device stop working. Overwrite the same section and
+ * call fsync on all fds and verify that we get errors on all of them. Then,
+ * fsync one more time on all of them and verify that they return 0.
+ */
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+
+#define NUM_FDS	10
+
+static void usage() {
+	fprintf(stderr, "Usage: fsync-err <filename>\n");
+}
+
+int main(int argc, char **argv)
+{
+	int fd[NUM_FDS], ret, i;
+	char *fname, *buf;
+
+	if (argc < 1) {
+		usage();
+		return 1;
+	}
+
+	/* First argument is filename */
+	fname = argv[1];
+
+	for (i = 0; i < NUM_FDS; ++i) {
+		fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
+		if (fd[i] < 0) {
+			printf("open of fd[%d] failed: %m\n", i);
+			return 1;
+		}
+	}
+
+	buf = "foobar";
+	for (i = 0; i < NUM_FDS; ++i) {
+		ret = write(fd[i], buf, strlen(buf) + 1);
+		if (ret < 0) {
+			printf("First write on fd[%d] failed: %m\n", i);
+			return 1;
+		}
+	}
+
+	for (i = 0; i < NUM_FDS; ++i) {
+		ret = fsync(fd[i]);
+		if (ret < 0) {
+			printf("First fsync on fd[%d] failed: %m\n", i);
+			return 1;
+		}
+	}
+
+	/* flip the device to non-working mode */
+	ret = system("./tools/dmerror load_error_table");
+	if (ret) {
+		if (WIFEXITED(ret))
+			printf("system: program exited: %d\n",
+					WEXITSTATUS(ret));
+		else
+			printf("system: 0x%x\n", (int)ret);
+
+		return 1;
+	}
+
+	for (i = 0; i < NUM_FDS; ++i) {
+		ret = write(fd[i], buf, strlen(buf) + 1);
+		if (ret < 0) {
+			printf("Second write on fd[%d] failed: %m\n", i);
+			return 1;
+		}
+	}
+
+	for (i = 0; i < NUM_FDS; ++i) {
+		ret = fsync(fd[i]);
+		/* Now, we EXPECT the error! */
+		if (ret >= 0) {
+			printf("Success on second fsync on fd[%d]!\n", i);
+			return 1;
+		}
+	}
+
+	/* flip the device to working mode */
+	ret = system("./tools/dmerror load_working_table");
+	if (ret) {
+		if (WIFEXITED(ret))
+			printf("system: program exited: %d\n",
+					WEXITSTATUS(ret));
+		else
+			printf("system: 0x%x\n", (int)ret);
+
+		return 1;
+	}
+
+	for (i = 0; i < NUM_FDS; ++i) {
+		ret = fsync(fd[i]);
+		if (ret < 0) {
+			/* Now the error should be clear */
+			printf("Third fsync on fd[%d] failed: %m\n", i);
+			return 1;
+		}
+	}
+
+	/*
+	 * reopen each file to guarantee the same one stays in core. fsync
+	 * each one to make sure we see no errors.
+	 */
+	for (i = 0; i < NUM_FDS; ++i) {
+		ret = close(fd[i]);
+		if (ret < 0) {
+			printf("Close of fd[%d] returned unexpected error: %m\n", i);
+			return 1;
+		}
+		fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
+		if (fd[i] < 0) {
+			printf("Second open of fd[%d] failed: %m\n", i);
+			return 1;
+		}
+		ret = fsync(fd[i]);
+		if (ret < 0) {
+			/* New opens should not return an error */
+			printf("First fsync after reopen of fd[%d] failed: %m\n", i);
+			return 1;
+		}
+	}
+
+	printf("Test passed!\n");
+	return 0;
+}
diff --git a/tests/generic/999 b/tests/generic/999
new file mode 100755
index 000000000000..57e093b43416
--- /dev/null
+++ b/tests/generic/999
@@ -0,0 +1,75 @@
+#! /bin/bash
+# FS QA Test No. 999
+#
+# Open a file several times, write to it, fsync on all fds and make sure that
+# they all return 0. Change the device to start throwing errors. Write again
+# on all fds and fsync on all fds. Ensure that we get errors on all of them.
+# Then fsync on all one last time and verify that all return 0.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017, Jeff Layton <jlayton@redhat.com>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1    # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+    cd /
+    rm -rf $tmp.* $testdir
+    _dmerror_cleanup
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmerror
+
+# real QA test starts here
+_supported_os Linux
+_require_scratch
+_require_logdev
+_require_dm_target error
+
+rm -f $seqres.full
+
+echo "Format and mount"
+$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" $SCRATCH_DEV >> $seqres.full
+_scratch_mkfs > $seqres.full 2>&1
+_dmerror_init
+_dmerror_mount >> $seqres.full 2>&1
+_dmerror_unmount
+_dmerror_mount
+
+_require_fs_space $SCRATCH_MNT 8192
+
+testfile=$SCRATCH_MNT/fsync-err-test
+
+$here/src/fsync-err $testfile
+
+# success, all done
+_dmerror_load_working_table
+_dmerror_unmount
+_dmerror_cleanup
+_repair_scratch_fs >> $seqres.full
+status=0
+exit
diff --git a/tests/generic/999.out b/tests/generic/999.out
new file mode 100644
index 000000000000..2e48492ff6d1
--- /dev/null
+++ b/tests/generic/999.out
@@ -0,0 +1,3 @@
+QA output created by 999
+Format and mount
+Test passed!
diff --git a/tests/generic/group b/tests/generic/group
index 9fc7f9d419a8..854c39277dd5 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -432,3 +432,4 @@
 427 auto quick aio rw
 428 auto quick
 429 auto encrypt
+999 auto quick
diff --git a/tools/dmerror b/tools/dmerror
new file mode 100755
index 000000000000..fc0de295e778
--- /dev/null
+++ b/tools/dmerror
@@ -0,0 +1,47 @@
+#!/bin/bash
+#-----------------------------------------------------------------------
+# Copyright (c) 2017, Jeff Layton <jlayton@redhat.com>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+
+. ./common/rc
+. ./common/dmerror
+
+_dmerror_setup
+
+case $1 in
+cleanup)
+	_dmerror_cleanup
+	;;
+init)
+	_dmerror_init
+	;;
+load_error_table)
+	_dmerror_load_error_table
+	;;
+load_working_table)
+	_dmerror_load_working_table
+	;;
+mount)
+	_dmerror_mount
+	;;
+*)
+	echo "Usage: $0 {init|cleanup|load_error_table|load_working_table|mount}"
+	exit 1
+	;;
+esac
+
+status=0
+exit
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [xfstests PATCH v2 2/3] ext4: allow ext4 to use $SCRATCH_LOGDEV
  2017-05-09 16:12 [xfstests PATCH v2 0/3] xfstest for updated writeback error handling Jeff Layton
  2017-05-09 16:12 ` [xfstests PATCH v2 1/3] generic: add a writeback error handling test Jeff Layton
@ 2017-05-09 16:12 ` Jeff Layton
  2017-05-15 17:42   ` Darrick J. Wong
  2017-05-09 16:12 ` [xfstests PATCH v2 3/3] btrfs: allow it " Jeff Layton
  2 siblings, 1 reply; 5+ messages in thread
From: Jeff Layton @ 2017-05-09 16:12 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, linux-btrfs, linux-ext4, linux-cifs,
	linux-nfs, linux-mm, jfs-discussion, linux-xfs, cluster-devel,
	linux-f2fs-devel, v9fs-developer, linux-nilfs, linux-block,
	fstests
  Cc: dhowells, akpm, hch, ross.zwisler, mawilcox, jack, viro, corbet,
	neilb, clm, tytso, axboe, josef, hubcap, rpeterso, bo.li.liu

The writeback error handling test requires that you put the journal on a
separate device. This allows us to use dmerror to simulate data
writeback failure, without affecting the journal.

xfs already has infrastructure for this (a'la $SCRATCH_LOGDEV), so wire
up the ext4 code so that it can do the same thing when _scratch_mkfs is
called.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 common/rc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/common/rc b/common/rc
index 257b1903359d..8b815d9c8c33 100644
--- a/common/rc
+++ b/common/rc
@@ -675,6 +675,9 @@ _scratch_mkfs_ext4()
 	local tmp=`mktemp`
 	local mkfs_status
 
+	[ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
+	    $mkfs_cmd -O journal_dev $SCRATCH_LOGDEV && \
+	    mkfs_cmd="$mkfs_cmd -J device=$SCRATCH_LOGDEV"
 
 	_scratch_do_mkfs "$mkfs_cmd" "$mkfs_filter" $* 2>$tmp.mkfserr 1>$tmp.mkfsstd
 	mkfs_status=$?
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [xfstests PATCH v2 3/3] btrfs: allow it to use $SCRATCH_LOGDEV
  2017-05-09 16:12 [xfstests PATCH v2 0/3] xfstest for updated writeback error handling Jeff Layton
  2017-05-09 16:12 ` [xfstests PATCH v2 1/3] generic: add a writeback error handling test Jeff Layton
  2017-05-09 16:12 ` [xfstests PATCH v2 2/3] ext4: allow ext4 to use $SCRATCH_LOGDEV Jeff Layton
@ 2017-05-09 16:12 ` Jeff Layton
  2 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2017-05-09 16:12 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, linux-btrfs, linux-ext4, linux-cifs,
	linux-nfs, linux-mm, jfs-discussion, linux-xfs, cluster-devel,
	linux-f2fs-devel, v9fs-developer, linux-nilfs, linux-block,
	fstests
  Cc: dhowells, akpm, hch, ross.zwisler, mawilcox, jack, viro, corbet,
	neilb, clm, tytso, axboe, josef, hubcap, rpeterso, bo.li.liu

With btrfs, we can't really put the log on a separate device. What we
can do however is mirror the metadata across two devices and put the
data on a single device. When we turn on dmerror then the metadata can
fall back to using the other mirror while the data errors out.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 common/rc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/common/rc b/common/rc
index 8b815d9c8c33..2084c1f24f30 100644
--- a/common/rc
+++ b/common/rc
@@ -829,6 +829,8 @@ _scratch_mkfs()
 		;;
 	btrfs)
 		mkfs_cmd="$MKFS_BTRFS_PROG"
+		[ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
+			mkfs_cmd="$mkfs_cmd -d single -m raid1 $SCRATCH_LOGDEV"
 		mkfs_filter="cat"
 		;;
 	ext2|ext3)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [xfstests PATCH v2 2/3] ext4: allow ext4 to use $SCRATCH_LOGDEV
  2017-05-09 16:12 ` [xfstests PATCH v2 2/3] ext4: allow ext4 to use $SCRATCH_LOGDEV Jeff Layton
@ 2017-05-15 17:42   ` Darrick J. Wong
  0 siblings, 0 replies; 5+ messages in thread
From: Darrick J. Wong @ 2017-05-15 17:42 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, linux-btrfs, linux-ext4, linux-cifs,
	linux-nfs, linux-mm, jfs-discussion, linux-xfs, cluster-devel,
	linux-f2fs-devel, v9fs-developer, linux-nilfs, linux-block,
	fstests, dhowells, akpm, hch, ross.zwisler, mawilcox, jack, viro,
	corbet, neilb, clm, tytso, axboe, josef, hubcap, rpeterso,
	bo.li.liu

On Tue, May 09, 2017 at 12:12:44PM -0400, Jeff Layton wrote:
> The writeback error handling test requires that you put the journal on a
> separate device. This allows us to use dmerror to simulate data
> writeback failure, without affecting the journal.
> 
> xfs already has infrastructure for this (a'la $SCRATCH_LOGDEV), so wire
> up the ext4 code so that it can do the same thing when _scratch_mkfs is
> called.
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  common/rc | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/common/rc b/common/rc
> index 257b1903359d..8b815d9c8c33 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -675,6 +675,9 @@ _scratch_mkfs_ext4()
>  	local tmp=`mktemp`
>  	local mkfs_status
>  
> +	[ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
> +	    $mkfs_cmd -O journal_dev $SCRATCH_LOGDEV && \
> +	    mkfs_cmd="$mkfs_cmd -J device=$SCRATCH_LOGDEV"
>  
>  	_scratch_do_mkfs "$mkfs_cmd" "$mkfs_filter" $* 2>$tmp.mkfserr 1>$tmp.mkfsstd
>  	mkfs_status=$?
> -- 
> 2.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-05-15 17:45 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-09 16:12 [xfstests PATCH v2 0/3] xfstest for updated writeback error handling Jeff Layton
2017-05-09 16:12 ` [xfstests PATCH v2 1/3] generic: add a writeback error handling test Jeff Layton
2017-05-09 16:12 ` [xfstests PATCH v2 2/3] ext4: allow ext4 to use $SCRATCH_LOGDEV Jeff Layton
2017-05-15 17:42   ` Darrick J. Wong
2017-05-09 16:12 ` [xfstests PATCH v2 3/3] btrfs: allow it " Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).