* [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
@ 2017-08-30 14:51 Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 01/14] common/rc: convert some egrep to grep Amir Goldstein
                   ` (14 more replies)
  0 siblings, 15 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Hi all,

This is the 2nd revision of the crash consistency patch set.
The main thing that changed since v1 is my confidence in the failures
reported by the test, along with some more debugging options for
running the test tools.

I've collected these patches that have been sitting in Josef Bacik's
tree for a few years and kicked them a bit into shape.
The dm-log-writes target has been merged to kernel v4.1, see:
https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/log-writes.txt

For this posting, I kept the random seeds constant for the test.
I set these constant seeds after running with random seed for a little
while and getting failure reports. With the current values in the test
I was able to reproduce failures with high probability on xfs, ext4 and btrfs.
The probability of reproducing the failure is higher on a spinning disk.

For xfs, I posted a fix for potential data loss post fsync+crash.
For ext4, I posted a reliable reproducer using dm-flakey.
For btrfs, I shared the recorded log with Josef.

There is an outstanding problem with the test - when I run it with
kvm-xfstests, the test halts and I get soft lockup of log_writes_kthread.
I suppose it's a bug in dm-log-writes with some kernel config or with virtio.
I wasn't able to determine the reason and have little time to debug this.

Since dm-log-writes is in the upstream kernel anyway, I don't think a bug
in dm-log-writes for a certain config is a reason to block this xfstest
from being merged.
Anyway, I would be glad if someone could take a look at the soft lockup
issue. Josef?

Thanks,
Amir.

Amir Goldstein (14):
  common/rc: convert some egrep to grep
  common/rc: fix _require_xfs_io_command params check
  fsx: fixes to random seed
  fsx: fix path of .fsx* files
  fsx: fix compile warnings
  fsx: add support for integrity check with dm-log-writes target
  fsx: add optional logid prefix to log messages
  fsx: add support for --record-ops
  fsx: add support for -g filldata
  log-writes: add replay-log program to replay dm-log-writes target
  replay-log: output log replay offset in verbose mode
  replay-log: add support for replaying ops in target device range
  fstests: add support for working with dm-log-writes target
  fstests: add crash consistency fsx test using dm-log-writes

 .gitignore                   |   1 +
 README                       |   2 +
 common/dmlogwrites           |  84 +++++++++
 common/rc                    |  15 +-
 doc/auxiliary-programs.txt   |   8 +
 doc/requirement-checking.txt |  20 +++
 ltp/fsx.c                    | 220 +++++++++++++++++++----
 src/Makefile                 |   2 +-
 src/log-writes/Makefile      |  23 +++
 src/log-writes/SOURCE        |   6 +
 src/log-writes/log-writes.c  | 410 +++++++++++++++++++++++++++++++++++++++++++
 src/log-writes/log-writes.h  |  72 ++++++++
 src/log-writes/replay-log.c  | 381 ++++++++++++++++++++++++++++++++++++++++
 tests/generic/500            | 138 +++++++++++++++
 tests/generic/500.out        |   2 +
 tests/generic/group          |   1 +
 16 files changed, 1340 insertions(+), 45 deletions(-)
 create mode 100644 common/dmlogwrites
 create mode 100644 src/log-writes/Makefile
 create mode 100644 src/log-writes/SOURCE
 create mode 100644 src/log-writes/log-writes.c
 create mode 100644 src/log-writes/log-writes.h
 create mode 100644 src/log-writes/replay-log.c
 create mode 100755 tests/generic/500
 create mode 100644 tests/generic/500.out

-- 
2.7.4



* [PATCH v2 01/14] common/rc: convert some egrep to grep
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-08-30 15:45   ` Darrick J. Wong
  2017-08-30 14:51 ` [PATCH v2 02/14] common/rc: fix _require_xfs_io_command params check Amir Goldstein
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 common/rc | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/common/rc b/common/rc
index 9c5f54a..9d7b783 100644
--- a/common/rc
+++ b/common/rc
@@ -2177,7 +2177,7 @@ _require_xfs_io_command()
 		;;
 	"fsmap" )
 		testio=`$XFS_IO_PROG -f -c "fsmap" $testfile 2>&1`
-		echo $testio | egrep -q "Inappropriate ioctl" && \
+		echo $testio | grep -q "Inappropriate ioctl" && \
 			_notrun "xfs_io $command support is missing"
 		;;
 	"open")
@@ -2185,12 +2185,12 @@ _require_xfs_io_command()
 		# a new -C flag was introduced to execute one shot commands.
 		# Check for -C flag support as an indication for the bug fix.
 		testio=`$XFS_IO_PROG -F -f -C "open $testfile" $testfile 2>&1`
-		echo $testio | egrep -q "invalid option" && \
+		echo $testio | grep -q "invalid option" && \
 			_notrun "xfs_io $command support is missing"
 		;;
 	"scrub"|"repair")
 		testio=`$XFS_IO_PROG -x -c "$command dummy 0" $TEST_DIR 2>&1`
-		echo $testio | egrep -q "Inappropriate ioctl" && \
+		echo $testio | grep -q "Inappropriate ioctl" && \
 			_notrun "xfs_io $command support is missing"
 		;;
 	"utimes" )
@@ -2209,7 +2209,7 @@ _require_xfs_io_command()
 		_notrun "xfs_io $command failed (old kernel/wrong fs/bad args?)"
 	echo $testio | grep -q "foreign file active" && \
 		_notrun "xfs_io $command not supported on $FSTYP"
-	echo $testio | egrep -q "Function not implemented" && \
+	echo $testio | grep -q "Function not implemented" && \
 		_notrun "xfs_io $command support is missing (missing syscall?)"
 
 	if [ -n "$param" -a $param_checked -eq 0 ]; then
-- 
2.7.4



* [PATCH v2 02/14] common/rc: fix _require_xfs_io_command params check
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 01/14] common/rc: convert some egrep to grep Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-08-30 16:17   ` Darrick J. Wong
  2017-08-30 14:51 ` [PATCH v2 03/14] fsx: fixes to random seed Amir Goldstein
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

When _require_xfs_io_command is passed command parameters,
the resulting error from invalid parameters may be ignored.

For example, the following bogus params would not abort the test:
_require_xfs_io_command "falloc" "-X"
_require_xfs_io_command "fiemap" "-X"

Fix this by looking for the relevant error message.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 common/rc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/common/rc b/common/rc
index 9d7b783..44b98f6 100644
--- a/common/rc
+++ b/common/rc
@@ -2212,9 +2212,14 @@ _require_xfs_io_command()
 	echo $testio | grep -q "Function not implemented" && \
 		_notrun "xfs_io $command support is missing (missing syscall?)"
 
-	if [ -n "$param" -a $param_checked -eq 0 ]; then
+	[ -n "$param" ] || return
+
+	if [ $param_checked -eq 0 ]; then
 		$XFS_IO_PROG -c "help $command" | grep -q "^ $param --" || \
 			_notrun "xfs_io $command doesn't support $param"
+	else
+		echo $testio | grep -q "invalid option" && \
+			_notrun "xfs_io $command doesn't support $param"
 	fi
 }
 
-- 
2.7.4



* [PATCH v2 03/14] fsx: fixes to random seed
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 01/14] common/rc: convert some egrep to grep Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 02/14] common/rc: fix _require_xfs_io_command params check Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 04/14] fsx: fix path of .fsx* files Amir Goldstein
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Not sure why, but with initstate()/setstate(), fsx generates the
same events regardless of the input seed argument.

Change to use srandom() to fix the problem.

Add pid to auto random seed, so parallel fsx executions with auto
seed will use different seed values.
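
For illustration only, here is a minimal standalone sketch of the two
behaviours described above. The helper names first_random_for_seed() and
auto_seed() are made up for this example and are not part of fsx; the
seed arithmetic follows the "-S 0" path in the diff below.

```c
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Return the first value random() yields after seeding with srandom(). */
long first_random_for_seed(unsigned int seed)
{
	srandom(seed);
	return random();
}

/*
 * Hypothetical helper mirroring the "-S 0" path: derive a seed from the
 * clock and add the pid, so that parallel runs started within the same
 * second still pick different seeds.
 */
int auto_seed(void)
{
	int seed = time(0) % 10000;

	seed += (int)getpid();
	return seed;
}
```

With srandom(), the same seed should reproduce the same sequence and a
different seed should give a different one, which is the property the
tests with an explicit seed rely on.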

At this time there are 6 tests that use fsx, out of which:
2 use -S 0 as seed (gettime()) - generic/{075,112}
2 do not specify seed (default = 1) - generic/{091,263}
1 uses explicit constant seed - generic/127
1 uses explicit $RANDOM seed - generic/231

This change affects all those tests.
The tests that intended to randomize the seed will now really
randomize the seed.
The tests that intended to use a constant seed will still use
a constant seed, but resulting event sequence will be different
than before this change.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 ltp/fsx.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index 3713bbe..572df75 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -116,7 +116,6 @@ int	fd;				/* fd for our test file */
 blksize_t	block_size = 0;
 off_t		file_size = 0;
 off_t		biggest = 0;
-char		state[256];
 unsigned long	testcalls = 0;		/* calls to function "test" */
 
 unsigned long	simulatedopcount = 0;	/* -b flag */
@@ -1909,8 +1908,10 @@ main(int argc, char **argv)
                         break;
 		case 'S':
                         seed = getnum(optarg, &endp);
-			if (seed == 0)
+			if (seed == 0) {
 				seed = time(0) % 10000;
+				seed += (int)getpid();
+			}
 			if (!quiet)
 				fprintf(stdout, "Seed set to %d\n", seed);
 			if (seed < 0)
@@ -1948,8 +1949,7 @@ main(int argc, char **argv)
 	signal(SIGUSR1,	cleanup);
 	signal(SIGUSR2,	cleanup);
 
-	initstate(seed, state, 256);
-	setstate(state);
+	srandom(seed);
 	fd = open(fname,
 		O_RDWR|(lite ? 0 : O_CREAT|O_TRUNC)|o_direct, 0666);
 	if (fd < 0) {
-- 
2.7.4



* [PATCH v2 04/14] fsx: fix path of .fsx* files
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (2 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 03/14] fsx: fixes to random seed Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 05/14] fsx: fix compile warnings Amir Goldstein
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

When command line arg -P <dirpath> is used, compose the
path for .fsxgood .fsxlog .fsxops files from dirpath and
work file basename.

This fix is ported from LTP.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 ltp/fsx.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index 572df75..1502905 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -1581,7 +1581,7 @@ usage(void)
 "	-L: fsxLite - no file creations & no file size changes\n\
 	-N numops: total # operations to do (default infinity)\n\
 	-O: use oplen (see -o flag) for every op (default random)\n\
-	-P: save .fsxlog and .fsxgood files in dirpath (default ./)\n\
+	-P: save .fsxlog .fsxops and .fsxgood files in dirpath (default ./)\n\
 	-S seed: for random # generator (default 1) 0 gets timestamp\n\
 	-W: mapped write operations DISabled\n\
         -R: read() system calls only (mapped reads disabled)\n\
@@ -1761,6 +1761,7 @@ main(int argc, char **argv)
 	char	*endp;
 	char goodfile[1024];
 	char logfile[1024];
+	int dirpath = 0;
 	struct stat statbuf;
 
 	goodfile[0] = 0;
@@ -1902,6 +1903,9 @@ main(int argc, char **argv)
 			strcat(goodfile, "/");
 			strncpy(logfile, optarg, sizeof(logfile));
 			strcat(logfile, "/");
+			strncpy(opsfile, optarg, sizeof(logfile));
+			strcat(opsfile, "/");
+			dirpath = 1;
 			break;
                 case 'R':
                         mapped_reads = 0;
@@ -1978,21 +1982,21 @@ main(int argc, char **argv)
 		}
 	}
 #endif
-	strncat(goodfile, fname, 256);
+	strncat(goodfile, dirpath ? basename(fname) : fname, 256);
 	strcat (goodfile, ".fsxgood");
 	fsxgoodfd = open(goodfile, O_RDWR|O_CREAT|O_TRUNC, 0666);
 	if (fsxgoodfd < 0) {
 		prterr(goodfile);
 		exit(92);
 	}
-	strncat(logfile, fname, 256);
+	strncat(logfile, dirpath ? basename(fname) : fname, 256);
 	strcat (logfile, ".fsxlog");
 	fsxlogf = fopen(logfile, "w");
 	if (fsxlogf == NULL) {
 		prterr(logfile);
 		exit(93);
 	}
-	strncat(opsfile, fname, 256);
+	strncat(opsfile, dirpath ? basename(fname) : fname, 256);
 	strcat(opsfile, ".fsxops");
 	unlink(opsfile);
 
-- 
2.7.4



* [PATCH v2 05/14] fsx: fix compile warnings
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (3 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 04/14] fsx: fix path of .fsx* files Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 06/14] fsx: add support for integrity check with dm-log-writes target Amir Goldstein
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 ltp/fsx.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index 1502905..e789aad 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -211,9 +211,9 @@ prt(const char *fmt, ...)
 	va_start(args, fmt);
 	vsnprintf(buffer, BUF_SIZE, fmt, args);
 	va_end(args);
-	fprintf(stdout, buffer);
+	fputs(buffer, stdout);
 	if (fsxlogf)
-		fprintf(fsxlogf, buffer);
+		fputs(buffer, fsxlogf);
 }
 
 void
@@ -569,14 +569,20 @@ check_trunc_hack(void)
 {
 	struct stat statbuf;
 
-	ftruncate(fd, (off_t)0);
-	ftruncate(fd, (off_t)100000);
+	if (ftruncate(fd, (off_t)0))
+		goto ftruncate_err;
+	if (ftruncate(fd, (off_t)100000))
+		goto ftruncate_err;
 	fstat(fd, &statbuf);
 	if (statbuf.st_size != (off_t)100000) {
 		prt("no extend on truncate! not posix!\n");
 		exit(130);
 	}
-	ftruncate(fd, 0);
+	if (ftruncate(fd, 0)) {
+ftruncate_err:
+		prterr("check_trunc_hack: ftruncate");
+		exit(131);
+	}
 }
 
 void
@@ -1742,7 +1748,10 @@ __test_fallocate(int mode, const char *mode_str)
 					mode_str);
 		} else {
 			ret = 1;
-			ftruncate(fd, 0);
+			if (ftruncate(fd, 0)) {
+				warn("main: ftruncate");
+				exit(132);
+			}
 		}
 	}
 	return ret;
-- 
2.7.4



* [PATCH v2 06/14] fsx: add support for integrity check with dm-log-writes target
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (4 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 05/14] fsx: fix compile warnings Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 07/14] fsx: add optional logid prefix to log messages Amir Goldstein
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Cherry-picked the relevant fsx bits from commit 70d41e17164b
in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
Quoting from Josef's commit message:

  I've rigged up fsx to have an integrity check mode.  Basically it works
  like it normally works, but when it fsync()'s it marks the log with a
  unique mark and dumps it's buffer to a file with the mark in the filename.
  I did this with a system() call simply because it was the fastest.  I can
  link the device-mapper libraries and do it programatically if that would
  be preferred, but this works pretty well.

  Signed-off-by: Josef Bacik <jbacik@fb.com>

[Amir:]
- Fix some exit codes
- Require -P dirpath for -i logdev
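
To make the naming scheme concrete, here is a small sketch of how the
mark name ties the log mark to the dumped buffer: fsync number N is
marked in the log as "<bname>.markN" and the expected file contents are
saved under "<dirpath>/<bname>.markN". The helper names are invented for
this example, and the trailing-slash handling is an assumption (the patch
always stores dname with a trailing '/').

```c
#include <stdio.h>
#include <string.h>

/*
 * Format the dmsetup command that marks the log for fsync number mark_nr,
 * using the same "dmsetup message <logdev> 0 mark <bname>.mark<N>" form
 * as the patch. Returns 0 on success, -1 if the command did not fit.
 */
int format_mark_command(char *cmd, size_t len, const char *logdev,
			const char *bname, int mark_nr)
{
	int n = snprintf(cmd, len, "dmsetup message %s 0 mark %s.mark%d",
			 logdev, bname, mark_nr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Name of the file holding the expected contents for the same mark. */
int format_mark_file(char *path, size_t len, const char *dname,
		     const char *bname, int mark_nr)
{
	size_t dlen = strlen(dname);
	/* assumption: accept dname with or without a trailing '/' */
	const char *sep = (dlen && dname[dlen - 1] != '/') ? "/" : "";
	int n = snprintf(path, len, "%s%s%s.mark%d", dname, sep, bname,
			 mark_nr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because both names share the same suffix, a checker can replay the log up
to a given mark and compare the file against the dump with the same name.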

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 ltp/fsx.c | 147 ++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 123 insertions(+), 24 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index e789aad..d206a3a 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -67,15 +67,17 @@ int			logcount = 0;	/* total ops */
  * be careful in how we select the different operations. The active operations
  * are mapped to numbers as follows:
  *
- *		lite	!lite
- * READ:	0	0
- * WRITE:	1	1
- * MAPREAD:	2	2
- * MAPWRITE:	3	3
- * TRUNCATE:	-	4
- * FALLOCATE:	-	5
- * PUNCH HOLE:	-	6
- * ZERO RANGE:	-	7
+ *			lite	!lite	integrity
+ * READ:		0	0	0
+ * WRITE:		1	1	1
+ * MAPREAD:		2	2	2
+ * MAPWRITE:		3	3	3
+ * TRUNCATE:		-	4	4
+ * FALLOCATE:		-	5	5
+ * PUNCH HOLE:		-	6	6
+ * ZERO RANGE:		-	7	7
+ * COLLAPSE RANGE:	-	8	8
+ * FSYNC:		-	-	9
  *
  * When mapped read/writes are disabled, they are simply converted to normal
  * reads and writes. When fallocate/fpunch calls are disabled, they are
@@ -102,6 +104,10 @@ int			logcount = 0;	/* total ops */
 #define OP_INSERT_RANGE	9
 #define OP_MAX_FULL		10
 
+/* integrity operations */
+#define OP_FSYNC		10
+#define OP_MAX_INTEGRITY	11
+
 #undef PAGE_SIZE
 #define PAGE_SIZE       getpagesize()
 #undef PAGE_MASK
@@ -111,6 +117,10 @@ char	*original_buf;			/* a pointer to the original data */
 char	*good_buf;			/* a pointer to the correct data */
 char	*temp_buf;			/* a pointer to the current data */
 char	*fname;				/* name of our test file */
+char	*bname;				/* basename of our test file */
+char	*logdev;			/* -i flag */
+char	dname[1024];			/* -P flag */
+int	dirpath = 0;			/* -P flag */
 int	fd;				/* fd for our test file */
 
 blksize_t	block_size = 0;
@@ -148,9 +158,11 @@ int     zero_range_calls = 1;           /* -z flag disables */
 int	collapse_range_calls = 1;	/* -C flag disables */
 int	insert_range_calls = 1;		/* -I flag disables */
 int 	mapped_reads = 1;		/* -R flag disables it */
+int	integrity = 0;			/* -i flag */
 int	fsxgoodfd = 0;
 int	o_direct;			/* -Z */
 int	aio = 0;
+int	mark_nr = 0;
 
 int page_size;
 int page_mask;
@@ -234,6 +246,7 @@ static const char *op_names[] = {
 	[OP_ZERO_RANGE] = "zero_range",
 	[OP_COLLAPSE_RANGE] = "collapse_range",
 	[OP_INSERT_RANGE] = "insert_range",
+	[OP_FSYNC] = "fsync",
 };
 
 static const char *op_name(int operation)
@@ -397,6 +410,9 @@ logdump(void)
 			if (overlap)
 				prt("\t******IIII");
 			break;
+		case OP_FSYNC:
+			prt("FSYNC");
+			break;
 		default:
 			prt("BOGUS LOG ENTRY (operation code = %d)!",
 			    lp->operation);
@@ -500,6 +516,43 @@ report_failure(int status)
 				        *(((unsigned char *)(cp)) + 1)))
 
 void
+mark_log(void)
+{
+	char command[256];
+	int ret;
+
+	snprintf(command, 256, "dmsetup message %s 0 mark %s.mark%d", logdev,
+		 bname, mark_nr);
+	ret = system(command);
+	if (ret) {
+		prterr("dmsetup mark failed");
+		exit(211);
+	}
+}
+
+void
+dump_fsync_buffer(void)
+{
+	char fname_buffer[1024];
+	int good_fd;
+
+	if (!good_buf)
+		return;
+
+	snprintf(fname_buffer, 1024, "%s%s.mark%d", dname,
+		 bname, mark_nr);
+	good_fd = open(fname_buffer, O_WRONLY|O_CREAT|O_TRUNC, 0666);
+	if (good_fd < 0) {
+		prterr(fname_buffer);
+		exit(212);
+	}
+
+	save_buffer(good_buf, file_size, good_fd);
+	close(good_fd);
+	prt("Dumped fsync buffer to %s\n", fname_buffer + dirpath);
+}
+
+void
 check_buffers(unsigned offset, unsigned size)
 {
 	unsigned char c, t;
@@ -1256,6 +1309,25 @@ docloseopen(void)
 	}
 }
 
+void
+dofsync(void)
+{
+	int ret;
+
+	if (testcalls <= simulatedopcount)
+		return;
+	if (debug)
+		prt("%lu fsync\n", testcalls);
+	log4(OP_FSYNC, 0, 0, 0);
+	ret = fsync(fd);
+	if (ret < 0) {
+		prterr("dofsync");
+		report_failure(210);
+	}
+	mark_log();
+	dump_fsync_buffer();
+	mark_nr++;
+}
 
 #define TRIM_OFF(off, size)			\
 do {						\
@@ -1403,8 +1475,10 @@ test(void)
 	/* calculate appropriate op to run */
 	if (lite)
 		op = rv % OP_MAX_LITE;
-	else
+	else if (!integrity)
 		op = rv % OP_MAX_FULL;
+	else
+		op = rv % OP_MAX_INTEGRITY;
 
 	switch(op) {
 	case OP_TRUNCATE:
@@ -1528,6 +1602,9 @@ have_op:
 
 		do_insert_range(offset, size);
 		break;
+	case OP_FSYNC:
+		dofsync();
+		break;
 	default:
 		prterr("test: unknown operation");
 		report_failure(42);
@@ -1547,11 +1624,12 @@ void
 usage(void)
 {
 	fprintf(stdout, "usage: %s",
-		"fsx [-dnqxAFLOWZ] [-b opnum] [-c Prob] [-l flen] [-m start:end] [-o oplen] [-p progressinterval] [-r readbdy] [-s style] [-t truncbdy] [-w writebdy] [-D startingop] [-N numops] [-P dirpath] [-S seed] fname\n\
+		"fsx [-dnqxAFLOWZ] [-b opnum] [-c Prob] [-i logdev] [-l flen] [-m start:end] [-o oplen] [-p progressinterval] [-r readbdy] [-s style] [-t truncbdy] [-w writebdy] [-D startingop] [-N numops] [-P dirpath] [-S seed] fname\n\
 	-b opnum: beginning operation number (default 1)\n\
 	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
 	-d: debug output for all operations\n\
 	-f flush and invalidate cache after I/O\n\
+	-i logdev: do integrity testing, logdev is the dm log writes device\n\
 	-l flen: the upper bound on file size (default 262144)\n\
 	-m startop:endop: monitor (print debug output) specified byte range (default 0:infinity)\n\
 	-n: no verifications of file size\n\
@@ -1767,14 +1845,14 @@ int
 main(int argc, char **argv)
 {
 	int	i, style, ch;
-	char	*endp;
+	char	*endp, *tmp;
 	char goodfile[1024];
 	char logfile[1024];
-	int dirpath = 0;
 	struct stat statbuf;
 
 	goodfile[0] = 0;
 	logfile[0] = 0;
+	dname[0] = 0;
 
 	page_size = getpagesize();
 	page_mask = page_size - 1;
@@ -1784,7 +1862,7 @@ main(int argc, char **argv)
 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
 
 	while ((ch = getopt_long(argc, argv,
-				 "b:c:dfl:m:no:p:qr:s:t:w:xyAD:FKHzCILN:OP:RS:WZ",
+				 "b:c:dfi:l:m:no:p:qr:s:t:w:xyAD:FKHzCILN:OP:RS:WZ",
 				 longopts, NULL)) != EOF)
 		switch (ch) {
 		case 'b':
@@ -1811,6 +1889,14 @@ main(int argc, char **argv)
 		case 'f':
 			flush = 1;
 			break;
+		case 'i':
+			integrity = 1;
+			logdev = strdup(optarg);
+			if (!logdev) {
+				prterr("strdup");
+				exit(101);
+			}
+			break;
 		case 'l':
 			maxfilelen = getnum(optarg, &endp);
 			if (maxfilelen <= 0)
@@ -1908,13 +1994,13 @@ main(int argc, char **argv)
 			randomoplen = 0;
 			break;
 		case 'P':
-			strncpy(goodfile, optarg, sizeof(goodfile));
-			strcat(goodfile, "/");
-			strncpy(logfile, optarg, sizeof(logfile));
-			strcat(logfile, "/");
-			strncpy(opsfile, optarg, sizeof(logfile));
-			strcat(opsfile, "/");
-			dirpath = 1;
+			strncpy(dname, optarg, sizeof(dname));
+			strcat(dname, "/");
+			dirpath = strlen(dname);
+
+			strncpy(goodfile, dname, sizeof(goodfile));
+			strncpy(logfile, dname, sizeof(logfile));
+			strncpy(opsfile, dname, sizeof(logfile));
 			break;
                 case 'R':
                         mapped_reads = 0;
@@ -1949,7 +2035,19 @@ main(int argc, char **argv)
 	argv += optind;
 	if (argc != 1)
 		usage();
+
+	if (integrity && !dirpath) {
+		fprintf(stderr, "option -i <logdev> requires -P <dirpath>\n");
+		usage();
+	}
+
 	fname = argv[0];
+	tmp = strdup(fname);
+	if (!tmp) {
+		prterr("strdup");
+		exit(101);
+	}
+	bname = basename(tmp);
 
 	signal(SIGHUP,	cleanup);
 	signal(SIGINT,	cleanup);
@@ -1991,21 +2089,21 @@ main(int argc, char **argv)
 		}
 	}
 #endif
-	strncat(goodfile, dirpath ? basename(fname) : fname, 256);
+	strncat(goodfile, dirpath ? bname : fname, 256);
 	strcat (goodfile, ".fsxgood");
 	fsxgoodfd = open(goodfile, O_RDWR|O_CREAT|O_TRUNC, 0666);
 	if (fsxgoodfd < 0) {
 		prterr(goodfile);
 		exit(92);
 	}
-	strncat(logfile, dirpath ? basename(fname) : fname, 256);
+	strncat(logfile, dirpath ? bname : fname, 256);
 	strcat (logfile, ".fsxlog");
 	fsxlogf = fopen(logfile, "w");
 	if (fsxlogf == NULL) {
 		prterr(logfile);
 		exit(93);
 	}
-	strncat(opsfile, dirpath ? basename(fname) : fname, 256);
+	strncat(opsfile, dirpath ? bname : fname, 256);
 	strcat(opsfile, ".fsxops");
 	unlink(opsfile);
 
@@ -2081,6 +2179,7 @@ main(int argc, char **argv)
 		if (!test())
 			break;
 
+	free(tmp);
 	if (close(fd)) {
 		prterr("close");
 		report_failure(99);
-- 
2.7.4



* [PATCH v2 07/14] fsx: add optional logid prefix to log messages
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (5 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 06/14] fsx: add support for integrity check with dm-log-writes target Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-09-05 10:46   ` Eryu Guan
  2017-08-30 14:51 ` [PATCH v2 08/14] fsx: add support for --record-ops Amir Goldstein
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

When writing the intermixed output of several fsx processes
to a single log file, it is useful to prefix logs with a log id.
Use fsx -j <logid> to define the log messages prefix.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 ltp/fsx.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index d206a3a..a6420f6 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -132,6 +132,7 @@ unsigned long	simulatedopcount = 0;	/* -b flag */
 int	closeprob = 0;			/* -c flag */
 int	debug = 0;			/* -d flag */
 unsigned long	debugstart = 0;		/* -D flag */
+int	logid = 0;			/* -j flag */
 int	flush = 0;			/* -f flag */
 int	do_fsync = 0;			/* -y flag */
 unsigned long	maxfilelen = 256 * 1024;	/* -l flag */
@@ -195,13 +196,16 @@ static void *round_ptr_up(void *ptr, unsigned long align, unsigned long offset)
 }
 
 void
-vwarnc(int code, const char *fmt, va_list ap) {
-  fprintf(stderr, "fsx: ");
-  if (fmt != NULL) {
-	vfprintf(stderr, fmt, ap);
-	fprintf(stderr, ": ");
-  }
-  fprintf(stderr, "%s\n", strerror(code));
+vwarnc(int code, const char *fmt, va_list ap)
+{
+	if (logid)
+		fprintf(stderr, "%d: ", logid);
+	fprintf(stderr, "fsx: ");
+	if (fmt != NULL) {
+		vfprintf(stderr, fmt, ap);
+		fprintf(stderr, ": ");
+	}
+	fprintf(stderr, "%s\n", strerror(code));
 }
 
 void
@@ -223,6 +227,8 @@ prt(const char *fmt, ...)
 	va_start(args, fmt);
 	vsnprintf(buffer, BUF_SIZE, fmt, args);
 	va_end(args);
+	if (logid)
+		fprintf(stdout, "%d: ", logid);
 	fputs(buffer, stdout);
 	if (fsxlogf)
 		fputs(buffer, fsxlogf);
@@ -1624,12 +1630,13 @@ void
 usage(void)
 {
 	fprintf(stdout, "usage: %s",
-		"fsx [-dnqxAFLOWZ] [-b opnum] [-c Prob] [-i logdev] [-l flen] [-m start:end] [-o oplen] [-p progressinterval] [-r readbdy] [-s style] [-t truncbdy] [-w writebdy] [-D startingop] [-N numops] [-P dirpath] [-S seed] fname\n\
+		"fsx [-dnqxAFLOWZ] [-b opnum] [-c Prob] [-i logdev] [-j logid] [-l flen] [-m start:end] [-o oplen] [-p progressinterval] [-r readbdy] [-s style] [-t truncbdy] [-w writebdy] [-D startingop] [-N numops] [-P dirpath] [-S seed] fname\n\
 	-b opnum: beginning operation number (default 1)\n\
 	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
 	-d: debug output for all operations\n\
 	-f flush and invalidate cache after I/O\n\
 	-i logdev: do integrity testing, logdev is the dm log writes device\n\
+	-j logid: prefix logs with this id\n\
 	-l flen: the upper bound on file size (default 262144)\n\
 	-m startop:endop: monitor (print debug output) specified byte range (default 0:infinity)\n\
 	-n: no verifications of file size\n\
@@ -1862,7 +1869,7 @@ main(int argc, char **argv)
 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
 
 	while ((ch = getopt_long(argc, argv,
-				 "b:c:dfi:l:m:no:p:qr:s:t:w:xyAD:FKHzCILN:OP:RS:WZ",
+				 "b:c:dfi:j:l:m:no:p:qr:s:t:w:xyAD:FKHzCILN:OP:RS:WZ",
 				 longopts, NULL)) != EOF)
 		switch (ch) {
 		case 'b':
@@ -1897,6 +1904,9 @@ main(int argc, char **argv)
 				exit(101);
 			}
 			break;
+		case 'j':
+			logid = getnum(optarg, &endp);
+			break;
 		case 'l':
 			maxfilelen = getnum(optarg, &endp);
 			if (maxfilelen <= 0)
-- 
2.7.4



* [PATCH v2 08/14] fsx: add support for --record-ops
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (6 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 07/14] fsx: add optional logid prefix to log messages Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 09/14] fsx: add support for -g filldata Amir Goldstein
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Usually, fsx dumps an .fsxops file on failure with the same basename
as the work file, possibly under the directory specified by -P dirpath.

The --record-ops[=opsfile] flag can be used to dump the ops file also
on success and to optionally specify the ops file name.
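
A standalone sketch of how an optional_argument long option behaves with
getopt_long() may help here. The function parse_record_ops() is made up
for this example and is not taken from fsx; only the option name and the
value 255 come from the diff below.

```c
#include <getopt.h>
#include <stddef.h>

/*
 * Hypothetical parser for a single long option with an optional value.
 * Returns 1 if --record-ops was given, 0 otherwise. *opsfile is set to
 * the value when one was attached with '=', and to NULL otherwise.
 */
int parse_record_ops(int argc, char **argv, const char **opsfile)
{
	static const struct option longopts[] = {
		{"record-ops", optional_argument, 0, 255},
		{0, 0, 0, 0}
	};
	int ch, seen = 0;

	*opsfile = NULL;
	optind = 1;	/* restart scanning so the helper can be called again */
	while ((ch = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
		if (ch == 255) {
			seen = 1;
			*opsfile = optarg;	/* NULL unless --record-ops=FILE */
		}
	}
	return seen;
}
```

Note that with optional_argument the file name has to be attached with
'=', so "--record-ops ops.txt" would treat ops.txt as a separate
argument rather than as the ops file name.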

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 ltp/fsx.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index a6420f6..dd6b637 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -180,6 +180,7 @@ int aio_rw(int rw, int fd, char *buf, unsigned len, unsigned offset);
 #endif
 
 const char *replayops = NULL;
+const char *recordops = NULL;
 FILE *	fsxlogf = NULL;
 FILE *	replayopsf = NULL;
 char opsfile[1024];
@@ -1677,6 +1678,8 @@ usage(void)
 	-W: mapped write operations DISabled\n\
         -R: read() system calls only (mapped reads disabled)\n\
         -Z: O_DIRECT (use -R, -W, -r and -w too)\n\
+	--replay-ops opsfile: replay ops from recorded .fsxops file\n\
+	--record-ops[=opsfile]: dump ops file also on success. optionally specify ops file name\n\
 	fname: this filename is REQUIRED (no default)\n");
 	exit(90);
 }
@@ -1845,6 +1848,7 @@ __test_fallocate(int mode, const char *mode_str)
 
 static struct option longopts[] = {
 	{"replay-ops", required_argument, 0, 256},
+	{"record-ops", optional_argument, 0, 255},
 	{ }
 };
 
@@ -2034,6 +2038,11 @@ main(int argc, char **argv)
 		case 'Z':
 			o_direct = O_DIRECT;
 			break;
+		case 255:  /* --record-ops */
+			if (optarg)
+				strncpy(opsfile, optarg, sizeof(opsfile));
+			recordops = opsfile;
+			break;
 		case 256:  /* --replay-ops */
 			replayops = optarg;
 			break;
@@ -2113,8 +2122,10 @@ main(int argc, char **argv)
 		prterr(logfile);
 		exit(93);
 	}
-	strncat(opsfile, dirpath ? bname : fname, 256);
-	strcat(opsfile, ".fsxops");
+	if (!*opsfile) {
+		strncat(opsfile, dirpath ? bname : fname, 256);
+		strcat(opsfile, ".fsxops");
+	}
 	unlink(opsfile);
 
 	if (replayops) {
@@ -2195,6 +2206,8 @@ main(int argc, char **argv)
 		report_failure(99);
 	}
 	prt("All %lu operations completed A-OK!\n", testcalls);
+	if (recordops)
+		logdump();
 
 	exit(0);
 	return 0;
-- 
2.7.4
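
For readers unfamiliar with optional_argument in getopt_long(), the
option handling above can be sketched in isolation as below.  This is a
minimal sketch and not the fsx code: struct record_opts and
parse_record_ops() are hypothetical names introduced for illustration,
and snprintf() is used in place of strncpy() so the copy is always
NUL-terminated.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result holder; fsx itself uses globals instead. */
struct record_opts {
	int record;		/* non-zero if --record-ops was given */
	char opsfile[1024];	/* empty string means "use the default name" */
};

/* Parse only --record-ops[=opsfile]; returns 0 on success, -1 on error. */
static int parse_record_ops(int argc, char **argv, struct record_opts *out)
{
	static const struct option longopts[] = {
		{"record-ops", optional_argument, NULL, 255},
		{NULL, 0, NULL, 0},
	};
	int ch;

	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	optind = 1;	/* restart scanning so the helper can be reused */

	while ((ch = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
		switch (ch) {
		case 255:	/* --record-ops */
			out->record = 1;
			/* optarg is NULL when no "=value" was attached */
			if (optarg)
				snprintf(out->opsfile, sizeof(out->opsfile),
					 "%s", optarg);
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Note that with optional_argument the file name has to be attached with
'=' (--record-ops=foo.fsxops); when written as a separate word
(--record-ops foo.fsxops), getopt_long() leaves optarg NULL and treats
the next word as a regular argument.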



* [PATCH v2 09/14] fsx: add support for -g filldata
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (7 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 08/14] fsx: add support for --record-ops Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-09-05 10:50   ` Eryu Guan
  2017-08-30 14:51 ` [PATCH v2 10/14] log-writes: add replay-log program to replay dm-log-writes target Amir Goldstein
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

-g X: write character X instead of randomly generated data

This is useful for comparing holes between the good and bad buffers.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 ltp/fsx.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index dd6b637..a75bc55 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -132,6 +132,7 @@ unsigned long	simulatedopcount = 0;	/* -b flag */
 int	closeprob = 0;			/* -c flag */
 int	debug = 0;			/* -d flag */
 unsigned long	debugstart = 0;		/* -D flag */
+char	filldata = 0;			/* -g flag */
 int	logid = 0;			/* -j flag */
 int	flush = 0;			/* -f flag */
 int	do_fsync = 0;			/* -y flag */
@@ -817,6 +818,8 @@ gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
 		good_buf[offset] = testcalls % 256; 
 		if (offset % 2)
 			good_buf[offset] += original_buf[offset];
+		if (filldata)
+			good_buf[offset] = filldata;
 		offset++;
 	}
 }
@@ -1631,11 +1634,12 @@ void
 usage(void)
 {
 	fprintf(stdout, "usage: %s",
-		"fsx [-dnqxAFLOWZ] [-b opnum] [-c Prob] [-i logdev] [-j logid] [-l flen] [-m start:end] [-o oplen] [-p progressinterval] [-r readbdy] [-s style] [-t truncbdy] [-w writebdy] [-D startingop] [-N numops] [-P dirpath] [-S seed] fname\n\
+		"fsx [-dnqxAFLOWZ] [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid] [-l flen] [-m start:end] [-o oplen] [-p progressinterval] [-r readbdy] [-s style] [-t truncbdy] [-w writebdy] [-D startingop] [-N numops] [-P dirpath] [-S seed] fname\n\
 	-b opnum: beginning operation number (default 1)\n\
 	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
 	-d: debug output for all operations\n\
 	-f flush and invalidate cache after I/O\n\
+	-g X: write character X instead of randomly generated data\n\
 	-i logdev: do integrity testing, logdev is the dm log writes device\n\
 	-j logid: prefix logs with this id\n\
 	-l flen: the upper bound on file size (default 262144)\n\
@@ -1873,7 +1877,7 @@ main(int argc, char **argv)
 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
 
 	while ((ch = getopt_long(argc, argv,
-				 "b:c:dfi:j:l:m:no:p:qr:s:t:w:xyAD:FKHzCILN:OP:RS:WZ",
+				 "b:c:dfg:i:j:l:m:no:p:qr:s:t:w:xyAD:FKHzCILN:OP:RS:WZ",
 				 longopts, NULL)) != EOF)
 		switch (ch) {
 		case 'b':
@@ -1900,6 +1904,9 @@ main(int argc, char **argv)
 		case 'f':
 			flush = 1;
 			break;
+		case 'g':
+			filldata = *optarg;
+			break;
 		case 'i':
 			integrity = 1;
 			logdev = strdup(optarg);
-- 
2.7.4
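
The effect of -g on the generated data can be sketched as a standalone
function.  This is an illustration under stated assumptions, not the fsx
code: fill_good_buf() is a hypothetical name, and testcalls and filldata
are passed as parameters here, whereas fsx keeps them as globals.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for the gendata() loop.  The parameter list is an
 * assumption made so that the sketch is self-contained.
 */
static void fill_good_buf(const unsigned char *original_buf,
			  unsigned char *good_buf, unsigned offset,
			  unsigned size, unsigned long testcalls,
			  char filldata)
{
	assert(original_buf && good_buf);
	while (size--) {
		/* Default pattern: op counter, mixed with old data on odd offsets */
		good_buf[offset] = testcalls % 256;
		if (offset % 2)
			good_buf[offset] += original_buf[offset];
		/* A non-zero fill character overrides the generated byte */
		if (filldata)
			good_buf[offset] = filldata;
		offset++;
	}
}
```

Because a zero value of filldata means "not set", a NUL byte cannot be
chosen as the fill character with this scheme.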



* [PATCH v2 10/14] log-writes: add replay-log program to replay dm-log-writes target
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (8 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 09/14] fsx: add support for -g filldata Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-09-05 11:03   ` Eryu Guan
  2017-08-30 14:51 ` [PATCH v2 11/14] replay-log: output log replay offset in verbose mode Amir Goldstein
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Imported Josef Bacik's code from:
https://github.com/josefbacik/log-writes.git

Specialized program for replaying a write log that was recorded by
the device mapper log-writes target.  The tool is used to perform
crash consistency tests, allowing an arbitrary check tool (fsck) to
be run at specified checkpoints in the write log.

[Amir:]
- Add project Makefile and SOURCE files
- Document the replay-log auxiliary program

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 .gitignore                  |   1 +
 doc/auxiliary-programs.txt  |   8 +
 src/Makefile                |   2 +-
 src/log-writes/Makefile     |  23 +++
 src/log-writes/SOURCE       |   6 +
 src/log-writes/log-writes.c | 379 ++++++++++++++++++++++++++++++++++++++++++++
 src/log-writes/log-writes.h |  70 ++++++++
 src/log-writes/replay-log.c | 348 ++++++++++++++++++++++++++++++++++++++++
 8 files changed, 836 insertions(+), 1 deletion(-)
 create mode 100644 src/log-writes/Makefile
 create mode 100644 src/log-writes/SOURCE
 create mode 100644 src/log-writes/log-writes.c
 create mode 100644 src/log-writes/log-writes.h
 create mode 100644 src/log-writes/replay-log.c

diff --git a/.gitignore b/.gitignore
index fcbc0cd..c26c92f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -153,6 +153,7 @@
 /src/t_mmap_stale_pmd
 /src/t_mmap_cow_race
 /src/t_mmap_fallocate
+/src/log-writes/replay-log
 
 # dmapi/ binaries
 /dmapi/src/common/cmd/read_invis
diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
index bcab453..de15832 100644
--- a/doc/auxiliary-programs.txt
+++ b/doc/auxiliary-programs.txt
@@ -18,6 +18,7 @@ Contents:
  - af_unix		-- Create an AF_UNIX socket
  - dmerror		-- fault injection block device control
  - fsync-err		-- tests fsync error reporting after failed writeback
+ - log-writes/replay-log -- Replay log from device mapper log-writes target
  - open_by_handle	-- open_by_handle_at syscall exercise
  - stat_test		-- statx syscall exercise
  - t_dir_type		-- print directory entries and their file type
@@ -46,6 +47,13 @@ fsync-err
 	writeback and test that errors are reported during fsync and cleared
 	afterward.
 
+log-writes/replay-log
+
+	Specialized program for replaying a write log that was recorded by
+	the device mapper log-writes target.  The tool is used to perform
+	crash consistency tests, allowing an arbitrary check tool (fsck) to
+	be run at specified checkpoints in the write log.
+
 open_by_handle
 
 	The open_by_handle program exercises the open_by_handle_at() system
diff --git a/src/Makefile b/src/Makefile
index b8aff49..7d1306b 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -25,7 +25,7 @@ LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
 	attr-list-by-handle-cursor-test listxattr dio-interleaved t_dir_type \
 	dio-invalidate-cache stat_test t_encrypted_d_revalidate
 
-SUBDIRS =
+SUBDIRS = log-writes
 
 LLDLIBS = $(LIBATTR) $(LIBHANDLE) $(LIBACL) -lpthread
 
diff --git a/src/log-writes/Makefile b/src/log-writes/Makefile
new file mode 100644
index 0000000..d114177
--- /dev/null
+++ b/src/log-writes/Makefile
@@ -0,0 +1,23 @@
+TOPDIR = ../..
+include $(TOPDIR)/include/builddefs
+
+TARGETS = replay-log
+
+CFILES = replay-log.c log-writes.c
+LDIRT = $(TARGETS)
+
+default: depend $(TARGETS)
+
+depend: .dep
+
+include $(BUILDRULES)
+
+$(TARGETS): $(CFILES)
+	@echo "    [CC]    $@"
+	$(Q)$(LTLINK) $(CFILES) -o $@ $(CFLAGS) $(LDFLAGS) $(LDLIBS)
+
+install:
+	$(INSTALL) -m 755 -d $(PKG_LIB_DIR)/src/log-writes
+	$(INSTALL) -m 755 $(TARGETS) $(PKG_LIB_DIR)/src/log-writes
+
+-include .dep
diff --git a/src/log-writes/SOURCE b/src/log-writes/SOURCE
new file mode 100644
index 0000000..d6d143c
--- /dev/null
+++ b/src/log-writes/SOURCE
@@ -0,0 +1,6 @@
+From:
+https://github.com/josefbacik/log-writes.git
+
+description	Helper code for dm-log-writes target
+owner	Josef Bacik <jbacik@fb.com>
+URL	https://github.com/josefbacik/log-writes.git
diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c
new file mode 100644
index 0000000..fa4f3f3
--- /dev/null
+++ b/src/log-writes/log-writes.c
@@ -0,0 +1,379 @@
+#include <linux/fs.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <unistd.h>
+#include <string.h>
+#include "log-writes.h"
+
+int log_writes_verbose = 0;
+
+/*
+ * @log: the log to free.
+ *
+ * This will close any open fd's the log has and free up its memory.
+ */
+void log_free(struct log *log)
+{
+	if (log->replayfd >= 0)
+		close(log->replayfd);
+	if (log->logfd >= 0)
+		close(log->logfd);
+	free(log);
+}
+
+static int discard_range(struct log *log, u64 start, u64 len)
+{
+	u64 range[2] = { start, len };
+
+	if (ioctl(log->replayfd, BLKDISCARD, &range) < 0) {
+		if (log_writes_verbose)
+			printf("replay device doesn't support discard, "
+			       "switching to writing zeros\n");
+		log->flags |= LOG_DISCARD_NOT_SUPP;
+	}
+	return 0;
+}
+
+static int zero_range(struct log *log, u64 start, u64 len)
+{
+	u64 bufsize = len;
+	ssize_t ret;
+	char *buf = NULL;
+
+	if (log->max_zero_size < len) {
+		if (log_writes_verbose)
+			printf("discard len %llu larger than max %llu\n",
+			       (unsigned long long)len,
+			       (unsigned long long)log->max_zero_size);
+		return 0;
+	}
+
+	while (!buf) {
+		buf = malloc(sizeof(char) * len);
+		if (!buf)
+			bufsize >>= 1;
+		if (!bufsize) {
+			fprintf(stderr, "Couldn't allocate zero buffer\n");
+			return -1;
+		}
+	}
+
+	memset(buf, 0, bufsize);
+	while (len) {
+		ret = pwrite(log->replayfd, buf, bufsize, start);
+		if (ret != bufsize) {
+			fprintf(stderr, "Error zeroing file: %d\n", errno);
+			free(buf);
+			return -1;
+		}
+		len -= ret;
+		start += ret;
+	}
+	free(buf);
+	return 0;
+}
+
+/*
+ * @log: the log we are replaying.
+ * @entry: the discard entry.
+ *
+ * Discard the given length.  If the device supports discard we will call that
+ * ioctl, otherwise we will write 0's to emulate discard.  If the discard size
+ * is larger than log->max_zero_size then we will simply skip the zero'ing if
+ * the drive doesn't support discard.
+ */
+int log_discard(struct log *log, struct log_write_entry *entry)
+{
+	u64 start = le64_to_cpu(entry->sector) * log->sectorsize;
+	u64 size = le64_to_cpu(entry->nr_sectors) * log->sectorsize;
+	u64 max_chunk = 1 * 1024 * 1024 * 1024;
+
+	if (log->flags & LOG_IGNORE_DISCARD)
+		return 0;
+
+	while (size) {
+		u64 len = size > max_chunk ? max_chunk : size;
+		int ret;
+
+		/*
+		 * Do this check first in case it is our first discard, that way
+		 * if we return EOPNOTSUPP we will fall back to the 0 method
+		 * automatically.
+		 */
+		if (!(log->flags & LOG_DISCARD_NOT_SUPP))
+			ret = discard_range(log, start, len);
+		if (log->flags & LOG_DISCARD_NOT_SUPP)
+			ret = zero_range(log, start, len);
+		if (ret)
+			return -1;
+		size -= len;
+		start += len;
+	}
+	return 0;
+}
+
+/*
+ * @log: the log we are replaying.
+ * @entry: where we put the entry.
+ * @read_data: read the entry data as well, entry must be log->sectorsize sized
+ * if this is set.
+ *
+ * @return: 0 if we replayed, 1 if we are at the end, -1 if there was an error.
+ *
+ * Replay the next entry in our log onto the replay device.
+ */
+int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
+			  int read_data)
+{
+	u64 size;
+	u64 flags;
+	size_t read_size = read_data ? log->sectorsize :
+		sizeof(struct log_write_entry);
+	char *buf;
+	ssize_t ret;
+	off_t offset;
+
+	if (log->cur_entry >= log->nr_entries)
+		return 1;
+
+	ret = read(log->logfd, entry, read_size);
+	if (ret != read_size) {
+		fprintf(stderr, "Error reading entry: %d\n", errno);
+		return -1;
+	}
+	log->cur_entry++;
+
+	size = le64_to_cpu(entry->nr_sectors) * log->sectorsize;
+	if (read_size < log->sectorsize) {
+		if (lseek(log->logfd,
+			  log->sectorsize - sizeof(struct log_write_entry),
+			  SEEK_CUR) == (off_t)-1) {
+			fprintf(stderr, "Error seeking in log: %d\n", errno);
+			return -1;
+		}
+	}
+
+	if (log_writes_verbose)
+		printf("replaying %d: sector %llu, size %llu, flags %llu\n",
+		       (int)log->cur_entry - 1,
+		       (unsigned long long)le64_to_cpu(entry->sector),
+		       (unsigned long long)size,
+		       (unsigned long long)le64_to_cpu(entry->flags));
+	if (!size)
+		return 0;
+
+	flags = le64_to_cpu(entry->flags);
+	if (flags & LOG_DISCARD_FLAG)
+		return log_discard(log, entry);
+
+	buf = malloc(size);
+	if (!buf) {
+		fprintf(stderr, "Error allocating buffer %llu entry %llu\n", (unsigned long long)size, (unsigned long long)log->cur_entry - 1);
+		return -1;
+	}
+
+	ret = read(log->logfd, buf, size);
+	if (ret != size) {
+		fprintf(stderr, "Error reading data: %d\n", errno);
+		free(buf);
+		return -1;
+	}
+
+	offset = le64_to_cpu(entry->sector) * log->sectorsize;
+	ret = pwrite(log->replayfd, buf, size, offset);
+	free(buf);
+	if (ret != size) {
+		fprintf(stderr, "Error writing data: %d\n", errno);
+		return -1;
+	}
+
+	return 0;
+}
+
+/*
+ * @log: the log we are manipulating.
+ * @entry_num: the entry we want.
+ *
+ * Seek to the given entry in the log, starting at 0 and ending at
+ * log->nr_entries - 1.
+ */
+int log_seek_entry(struct log *log, u64 entry_num)
+{
+	u64 i = 0;
+
+	if (entry_num >= log->nr_entries) {
+		fprintf(stderr, "Invalid entry number\n");
+		return -1;
+	}
+
+	if (lseek(log->logfd, log->sectorsize, SEEK_SET) == (off_t)-1) {
+		fprintf(stderr, "Error seeking in file: %d\n", errno);
+		return -1;
+	}
+
+	for (i = log->cur_entry; i < entry_num; i++) {
+		struct log_write_entry entry;
+		ssize_t ret;
+		off_t seek_size;
+		u64 flags;
+
+		ret = read(log->logfd, &entry, sizeof(entry));
+		if (ret != sizeof(entry)) {
+			fprintf(stderr, "Error reading entry: %d\n", errno);
+			return -1;
+		}
+		if (log_writes_verbose > 1)
+			printf("seek entry %d: %llu, size %llu, flags %llu\n",
+			       (int)i,
+			       (unsigned long long)le64_to_cpu(entry.sector),
+			       (unsigned long long)le64_to_cpu(entry.nr_sectors),
+			       (unsigned long long)le64_to_cpu(entry.flags));
+		flags = le64_to_cpu(entry.flags);
+		seek_size = log->sectorsize - sizeof(entry);
+		if (!(flags & LOG_DISCARD_FLAG))
+			seek_size += le64_to_cpu(entry.nr_sectors) *
+				log->sectorsize;
+		if (lseek(log->logfd, seek_size, SEEK_CUR) == (off_t)-1) {
+			fprintf(stderr, "Error seeking in file: %d\n", errno);
+			return -1;
+		}
+		log->cur_entry++;
+	}
+
+	return 0;
+}
+
+/*
+ * @log: the log we are manipulating.
+ * @entry: the entry we read.
+ * @read_data: read the extra data for the entry, your entry must be
+ * log->sectorsize large.
+ *
+ * @return: 1 if we hit the end of the log, 0 we got the next entry, < 0 if
+ * there was an error.
+ *
+ * Seek to the next entry in the log.
+ */
+int log_seek_next_entry(struct log *log, struct log_write_entry *entry,
+			int read_data)
+{
+	size_t read_size = read_data ? log->sectorsize :
+		sizeof(struct log_write_entry);
+	u64 flags;
+	ssize_t ret;
+
+	if (log->cur_entry >= log->nr_entries)
+		return 1;
+
+	ret = read(log->logfd, entry, read_size);
+	if (ret != read_size) {
+		fprintf(stderr, "Error reading entry: %d\n", errno);
+		return -1;
+	}
+	log->cur_entry++;
+
+	if (read_size < log->sectorsize) {
+		if (lseek(log->logfd,
+			  log->sectorsize - sizeof(struct log_write_entry),
+			  SEEK_CUR) == (off_t)-1) {
+			fprintf(stderr, "Error seeking in log: %d\n", errno);
+			return -1;
+		}
+	}
+	if (log_writes_verbose > 1)
+		printf("seek entry %d: %llu, size %llu, flags %llu\n",
+		       (int)log->cur_entry - 1,
+		       (unsigned long long)le64_to_cpu(entry->sector),
+		       (unsigned long long)le64_to_cpu(entry->nr_sectors),
+		       (unsigned long long)le64_to_cpu(entry->flags));
+
+	flags = le64_to_cpu(entry->flags);
+	read_size = le64_to_cpu(entry->nr_sectors) * log->sectorsize;
+	if (!read_size || (flags & LOG_DISCARD_FLAG))
+		return 0;
+
+	if (lseek(log->logfd, read_size, SEEK_CUR) == (off_t)-1) {
+		fprintf(stderr, "Error seeking in log: %d\n", errno);
+		return -1;
+	}
+
+	return 0;
+}
+
+/*
+ * @logfile: the file that contains the write log.
+ * @replayfile: the file/device to replay onto, can be NULL.
+ *
+ * Opens a logfile and makes sure it is valid and returns a struct log.
+ */
+struct log *log_open(char *logfile, char *replayfile)
+{
+	struct log *log;
+	struct log_write_super super;
+	ssize_t ret;
+
+	log = malloc(sizeof(struct log));
+	if (!log) {
+		fprintf(stderr, "Couldn't alloc log\n");
+		return NULL;
+	}
+
+	log->replayfd = -1;
+
+	log->logfd = open(logfile, O_RDONLY);
+	if (log->logfd < 0) {
+		fprintf(stderr, "Couldn't open log %s: %d\n", logfile,
+			errno);
+		log_free(log);
+		return NULL;
+	}
+
+	if (replayfile) {
+		log->replayfd = open(replayfile, O_WRONLY);
+		if (log->replayfd < 0) {
+			fprintf(stderr, "Couldn't open replay file %s: %d\n",
+				replayfile, errno);
+			log_free(log);
+			return NULL;
+		}
+	}
+
+	ret = read(log->logfd, &super, sizeof(struct log_write_super));
+	if (ret < sizeof(struct log_write_super)) {
+		fprintf(stderr, "Error reading super: %d\n", errno);
+		log_free(log);
+		return NULL;
+	}
+
+	if (le64_to_cpu(super.magic) != WRITE_LOG_MAGIC) {
+		fprintf(stderr, "Magic doesn't match\n");
+		log_free(log);
+		return NULL;
+	}
+
+	if (le64_to_cpu(super.version) != WRITE_LOG_VERSION) {
+		fprintf(stderr, "Version mismatch, wanted %d, have %d\n",
+			WRITE_LOG_VERSION, (int)le64_to_cpu(super.version));
+		log_free(log);
+		return NULL;
+	}
+
+	log->sectorsize = le32_to_cpu(super.sectorsize);
+	log->nr_entries = le64_to_cpu(super.nr_entries);
+	log->max_zero_size = 128 * 1024 * 1024;
+
+	if (lseek(log->logfd, log->sectorsize - sizeof(super), SEEK_CUR) ==
+	    (off_t) -1) {
+		fprintf(stderr, "Error seeking to first entry: %d\n", errno);
+		log_free(log);
+		return NULL;
+	}
+	log->cur_entry = 0;
+
+	return log;
+}
diff --git a/src/log-writes/log-writes.h b/src/log-writes/log-writes.h
new file mode 100644
index 0000000..13f98ff
--- /dev/null
+++ b/src/log-writes/log-writes.h
@@ -0,0 +1,70 @@
+#ifndef _LOG_WRITES_H_
+#define _LOG_WRITES_H_
+
+#include <linux/types.h>
+#include <linux/byteorder/little_endian.h>
+
+extern int log_writes_verbose;
+
+#define le64_to_cpu __le64_to_cpu
+#define le32_to_cpu __le32_to_cpu
+
+typedef __u64 u64;
+typedef __u32 u32;
+
+#define LOG_FLUSH_FLAG (1 << 0)
+#define LOG_FUA_FLAG (1 << 1)
+#define LOG_DISCARD_FLAG (1 << 2)
+#define LOG_MARK_FLAG (1 << 3)
+
+#define WRITE_LOG_VERSION 1
+#define WRITE_LOG_MAGIC 0x6a736677736872
+
+
+/*
+ * Basic info about the log for userspace.
+ */
+struct log_write_super {
+	__le64 magic;
+	__le64 version;
+	__le64 nr_entries;
+	__le32 sectorsize;
+};
+
+/*
+ * sector - the sector we wrote.
+ * nr_sectors - the number of sectors we wrote.
+ * flags - flags for this log entry.
+ * data_len - the size of the data in this log entry, this is for private log
+ * entry stuff, the MARK data provided by userspace for example.
+ */
+struct log_write_entry {
+	__le64 sector;
+	__le64 nr_sectors;
+	__le64 flags;
+	__le64 data_len;
+};
+
+#define LOG_IGNORE_DISCARD (1 << 0)
+#define LOG_DISCARD_NOT_SUPP (1 << 1)
+
+struct log {
+	int logfd;
+	int replayfd;
+	unsigned long flags;
+	u64 sectorsize;
+	u64 nr_entries;
+	u64 cur_entry;
+	u64 max_zero_size;
+	off_t cur_pos;
+};
+
+struct log *log_open(char *logfile, char *replayfile);
+int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
+			  int read_data);
+int log_seek_entry(struct log *log, u64 entry_num);
+int log_seek_next_entry(struct log *log, struct log_write_entry *entry,
+			int read_data);
+void log_free(struct log *log);
+
+#endif
diff --git a/src/log-writes/replay-log.c b/src/log-writes/replay-log.c
new file mode 100644
index 0000000..759c3c7
--- /dev/null
+++ b/src/log-writes/replay-log.c
@@ -0,0 +1,348 @@
+#include <stdio.h>
+#include <unistd.h>
+#include <getopt.h>
+#include <stdlib.h>
+#include <string.h>
+#include "log-writes.h"
+
+enum option_indexes {
+	NEXT_FLUSH,
+	NEXT_FUA,
+	START_ENTRY,
+	END_MARK,
+	LOG,
+	REPLAY,
+	LIMIT,
+	VERBOSE,
+	FIND,
+	NUM_ENTRIES,
+	NO_DISCARD,
+	FSCK,
+	CHECK,
+	START_MARK,
+};
+
+static struct option long_options[] = {
+	{"next-flush", no_argument, NULL, 0},
+	{"next-fua", no_argument, NULL, 0},
+	{"start-entry", required_argument, NULL, 0},
+	{"end-mark", required_argument, NULL, 0},
+	{"log", required_argument, NULL, 0},
+	{"replay", required_argument, NULL, 0},
+	{"limit", required_argument, NULL, 0},
+	{"verbose", no_argument, NULL, 'v'},
+	{"find", no_argument, NULL, 0},
+	{"num-entries", no_argument, NULL, 0},
+	{"no-discard", no_argument, NULL, 0},
+	{"fsck", required_argument, NULL, 0},
+	{"check", required_argument, NULL, 0},
+	{"start-mark", required_argument, NULL, 0},
+	{ NULL, 0, NULL, 0 },
+};
+
+static void usage(void)
+{
+	fprintf(stderr, "Usage: replay-log --log <logfile> [options]\n");
+	fprintf(stderr, "\t--replay <device> - replay onto a specific "
+		"device\n");
+	fprintf(stderr, "\t--limit <number> - number of entries to replay\n");
+	fprintf(stderr, "\t--next-flush - replay to/find the next flush\n");
+	fprintf(stderr, "\t--next-fua - replay to/find the next fua\n");
+	fprintf(stderr, "\t--start-entry <entry> - start at the given "
+		"entry #\n");
+	fprintf(stderr, "\t--start-mark <mark> - mark to start from\n");
+	fprintf(stderr, "\t--end-mark <mark> - replay to/find the given mark\n");
+	fprintf(stderr, "\t--find - put replay-log in find mode, will search "
+		"based on the other options\n");
+	fprintf(stderr, "\t--num-entries - print the number of entries in "
+		"the log\n");
+	fprintf(stderr, "\t--no-discard - don't process discard entries\n");
+	fprintf(stderr, "\t--fsck - the fsck command to run, must specify "
+		"--check\n");
+	fprintf(stderr, "\t--check [<number>|flush|fua] when to check the "
+		"file system, must specify --fsck\n");
+	exit(1);
+}
+
+static int should_stop(struct log_write_entry *entry, u64 stop_flags,
+		       char *mark)
+{
+	u64 flags = le64_to_cpu(entry->flags);
+	int check_mark = (stop_flags & LOG_MARK_FLAG);
+	char *buf = (char *)(entry + 1);
+
+	if (flags & stop_flags) {
+		if (!check_mark)
+			return 1;
+		if ((flags & LOG_MARK_FLAG) && !strcmp(mark, buf))
+			return 1;
+	}
+	return 0;
+}
+
+static int run_fsck(struct log *log, char *fsck_command)
+{
+	int ret = fsync(log->replayfd);
+	if (ret)
+		return ret;
+	ret = system(fsck_command);
+	if (ret >= 0)
+		ret = WEXITSTATUS(ret);
+	return ret ? -1 : 0;
+}
+
+enum log_replay_check_mode {
+	CHECK_NUMBER = 1,
+	CHECK_FUA = 2,
+	CHECK_FLUSH = 3,
+};
+
+static int seek_to_mark(struct log *log, struct log_write_entry *entry,
+			char *mark)
+{
+	int ret;
+
+	while ((ret = log_seek_next_entry(log, entry, 1)) == 0) {
+		if (should_stop(entry, LOG_MARK_FLAG, mark))
+			break;
+	}
+	if (ret == 1) {
+		fprintf(stderr, "Couldn't find starting mark\n");
+		ret = -1;
+	}
+
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	char *logfile = NULL, *replayfile = NULL, *fsck_command = NULL;
+	struct log_write_entry *entry;
+	u64 stop_flags = 0;
+	u64 start_entry = 0;
+	u64 run_limit = 0;
+	u64 num_entries = 0;
+	u64 check_number = 0;
+	char *end_mark = NULL, *start_mark = NULL;
+	char *tmp = NULL;
+	struct log *log;
+	int find_mode = 0;
+	int c;
+	int opt_index;
+	int ret;
+	int print_num_entries = 0;
+	int discard = 1;
+	enum log_replay_check_mode check_mode = 0;
+
+	while ((c = getopt_long(argc, argv, "v", long_options,
+				&opt_index)) >= 0) {
+		switch(c) {
+		case 'v':
+			log_writes_verbose++;
+			continue;
+		default:
+			break;
+		}
+
+		switch(opt_index) {
+		case NEXT_FLUSH:
+			stop_flags |= LOG_FLUSH_FLAG;
+			break;
+		case NEXT_FUA:
+			stop_flags |= LOG_FUA_FLAG;
+			break;
+		case START_ENTRY:
+			start_entry = strtoull(optarg, &tmp, 0);
+			if (tmp && *tmp != '\0') {
+				fprintf(stderr, "Invalid entry number\n");
+				exit(1);
+			}
+			tmp = NULL;
+			break;
+		case START_MARK:
+			/*
+			 * Biggest sectorsize is 4k atm, so limit the mark to 4k
+			 * minus the size of the entry.  Say 4097 since we want
+			 * an extra slot for \0.
+			 */
+			start_mark = strndup(optarg, 4097 -
+					     sizeof(struct log_write_entry));
+			if (!start_mark) {
+				fprintf(stderr, "Couldn't allocate memory\n");
+				exit(1);
+			}
+			break;
+		case END_MARK:
+			/*
+			 * Biggest sectorsize is 4k atm, so limit the mark to 4k
+			 * minus the size of the entry.  Say 4097 since we want
+			 * an extra slot for \0.
+			 */
+			end_mark = strndup(optarg, 4097 -
+					   sizeof(struct log_write_entry));
+			if (!end_mark) {
+				fprintf(stderr, "Couldn't allocate memory\n");
+				exit(1);
+			}
+			stop_flags |= LOG_MARK_FLAG;
+			break;
+		case LOG:
+			logfile = strdup(optarg);
+			if (!logfile) {
+				fprintf(stderr, "Couldn't allocate memory\n");
+				exit(1);
+			}
+			break;
+		case REPLAY:
+			replayfile = strdup(optarg);
+			if (!replayfile) {
+				fprintf(stderr, "Couldn't allocate memory\n");
+				exit(1);
+			}
+			break;
+		case LIMIT:
+			run_limit = strtoull(optarg, &tmp, 0);
+			if (tmp && *tmp != '\0') {
+				fprintf(stderr, "Invalid entry number\n");
+				exit(1);
+			}
+			tmp = NULL;
+			break;
+		case FIND:
+			find_mode = 1;
+			break;
+		case NUM_ENTRIES:
+			print_num_entries = 1;
+			break;
+		case NO_DISCARD:
+			discard = 0;
+			break;
+		case FSCK:
+			fsck_command = strdup(optarg);
+			if (!fsck_command) {
+				fprintf(stderr, "Couldn't allocate memory\n");
+				exit(1);
+			}
+			break;
+		case CHECK:
+			if (!strcmp(optarg, "flush")) {
+				check_mode = CHECK_FLUSH;
+			} else if (!strcmp(optarg, "fua")) {
+				check_mode = CHECK_FUA;
+			} else {
+				check_mode = CHECK_NUMBER;
+				check_number = strtoull(optarg, &tmp, 0);
+				if (!check_number || (tmp && *tmp != '\0')) {
+					fprintf(stderr,
+						"Invalid entry number\n");
+					exit(1);
+				}
+				tmp = NULL;
+			}
+			break;
+		default:
+			usage();
+		}
+	}
+
+	if (!logfile)
+		usage();
+
+	log = log_open(logfile, replayfile);
+	if (!log)
+		exit(1);
+	free(logfile);
+	free(replayfile);
+
+	if (!discard)
+		log->flags |= LOG_IGNORE_DISCARD;
+
+	entry = malloc(log->sectorsize);
+	if (!entry) {
+		fprintf(stderr, "Couldn't allocate buffer\n");
+		log_free(log);
+		exit(1);
+	}
+
+	if (start_mark) {
+		ret = seek_to_mark(log, entry, start_mark);
+		if (ret)
+			exit(1);
+		free(start_mark);
+	} else {
+		ret = log_seek_entry(log, start_entry);
+		if (ret)
+			exit(1);
+	}
+
+	if ((fsck_command && !check_mode) || (!fsck_command && check_mode))
+		usage();
+
+	/* We just want to find a given entry */
+	if (find_mode) {
+		while ((ret = log_seek_next_entry(log, entry, 1)) == 0) {
+			num_entries++;
+			if ((run_limit && num_entries == run_limit) ||
+			    should_stop(entry, stop_flags, end_mark)) {
+				printf("%llu\n",
+				       (unsigned long long)log->cur_entry - 1);
+				log_free(log);
+				return 0;
+			}
+		}
+		log_free(log);
+		if (ret < 0)
+			return ret;
+		fprintf(stderr, "Couldn't find entry\n");
+		return 1;
+	}
+
+	/* Used for scripts, just print the number of entries in the log */
+	if (print_num_entries) {
+		printf("%llu\n", (unsigned long long)log->nr_entries);
+		log_free(log);
+		return 0;
+	}
+
+	/* No replay, just spit out the log info. */
+	if (!replayfile) {
+		printf("Log version=%d, sectorsize=%lu, entries=%llu\n",
+		       WRITE_LOG_VERSION, (unsigned long)log->sectorsize,
+		       (unsigned long long)log->nr_entries);
+		log_free(log);
+		return 0;
+	}
+
+	while ((ret = log_replay_next_entry(log, entry, 1)) == 0) {
+		num_entries++;
+		if (fsck_command) {
+			if ((check_mode == CHECK_NUMBER) &&
+			    !(num_entries % check_number))
+				ret = run_fsck(log, fsck_command);
+			else if ((check_mode == CHECK_FUA) &&
+				 should_stop(entry, LOG_FUA_FLAG, NULL))
+				ret = run_fsck(log, fsck_command);
+			else if ((check_mode == CHECK_FLUSH) &&
+				 should_stop(entry, LOG_FLUSH_FLAG, NULL))
+				ret = run_fsck(log, fsck_command);
+			else
+				ret = 0;
+			if (ret) {
+				fprintf(stderr, "Fsck errored out on entry "
+					"%llu\n",
+					(unsigned long long)log->cur_entry - 1);
+				break;
+			}
+		}
+
+		if ((run_limit && num_entries == run_limit) ||
+		    should_stop(entry, stop_flags, end_mark))
+			break;
+	}
+	fsync(log->replayfd);
+	log_free(log);
+	free(end_mark);
+	if (ret < 0)
+		exit(1);
+	return 0;
+}
-- 
2.7.4
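
The log layout that log_open() and log_seek_entry() walk can be
summarized as: one sector for the super block, then for every entry one
sector holding the entry header, followed by nr_sectors sectors of data
unless the entry is a discard.  The sketch below computes the byte
offset of a given entry from that description.  It is an illustration
only: struct sketch_entry and sketch_entry_offset() are hypothetical
names, and the entries are assumed to be already converted to host byte
order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DISCARD_FLAG (1 << 2)	/* same bit as LOG_DISCARD_FLAG */

/* Hypothetical host-endian view of the fields needed for seeking. */
struct sketch_entry {
	uint64_t nr_sectors;
	uint64_t flags;
};

/* Byte offset in the log of the header of entry number 'target'. */
static uint64_t sketch_entry_offset(const struct sketch_entry *entries,
				    size_t nr_entries, size_t target,
				    uint64_t sectorsize)
{
	uint64_t off = sectorsize;	/* skip the super block sector */
	size_t i;

	assert(target <= nr_entries);
	for (i = 0; i < target; i++) {
		off += sectorsize;	/* entry header occupies one sector */
		/* Discard entries carry no data payload in the log */
		if (!(entries[i].flags & SKETCH_DISCARD_FLAG))
			off += entries[i].nr_sectors * sectorsize;
	}
	return off;
}
```

For example, with 512 byte sectors, an 8 sector write as entry 0 places
the header of entry 1 at byte 512 + 512 + 8 * 512 = 5120.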



* [PATCH v2 11/14] replay-log: output log replay offset in verbose mode
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (9 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 10/14] log-writes: add replay-log program to replay dm-log-writes target Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-08-30 14:51 ` [PATCH v2 12/14] replay-log: add support for replaying ops in target device sector range Amir Goldstein
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

This helps when exporting the recorded log to an image file using dd.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 src/log-writes/log-writes.c | 8 +++++---
 src/log-writes/replay-log.c | 6 ++++--
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c
index fa4f3f3..ba66a5c 100644
--- a/src/log-writes/log-writes.c
+++ b/src/log-writes/log-writes.c
@@ -158,12 +158,14 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
 		}
 	}
 
-	if (log_writes_verbose)
-		printf("replaying %d: sector %llu, size %llu, flags %llu\n",
-		       (int)log->cur_entry - 1,
+	if (log_writes_verbose) {
+		offset = lseek(log->logfd, 0, SEEK_CUR);
+		printf("replaying %d@%llu: sector %llu, size %llu, flags %llu\n",
+		       (int)log->cur_entry - 1, offset / log->sectorsize,
 		       (unsigned long long)le64_to_cpu(entry->sector),
 		       (unsigned long long)size,
 		       (unsigned long long)le64_to_cpu(entry->flags));
+	}
 	if (!size)
 		return 0;
 
diff --git a/src/log-writes/replay-log.c b/src/log-writes/replay-log.c
index 759c3c7..87c03a2 100644
--- a/src/log-writes/replay-log.c
+++ b/src/log-writes/replay-log.c
@@ -284,8 +284,10 @@ int main(int argc, char **argv)
 			num_entries++;
 			if ((run_limit && num_entries == run_limit) ||
 			    should_stop(entry, stop_flags, end_mark)) {
-				printf("%llu\n",
-				       (unsigned long long)log->cur_entry - 1);
+				off_t offset = lseek(log->logfd, 0, SEEK_CUR);
+
+				printf("%llu@%llu\n",
+				       (unsigned long long)log->cur_entry - 1, offset / log->sectorsize);
 				log_free(log);
 				return 0;
 			}
-- 
2.7.4



* [PATCH v2 12/14] replay-log: add support for replaying ops in target device sector range
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (10 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 11/14] replay-log: output log replay offset in verbose mode Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-09-05 11:07   ` Eryu Guan
  2017-08-30 14:51 ` [PATCH v2 13/14] fstests: add support for working with dm-log-writes target Amir Goldstein
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Using command line options --start-sector and --end-sector, only
operations acting on the specified target device range will be
replayed.

Single verbose mode (-v) prints out only replayed operations.
Double verbose mode (-vv) also prints out skipped operations.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 src/log-writes/log-writes.c | 33 +++++++++++++++++++++++++++++++--
 src/log-writes/log-writes.h |  2 ++
 src/log-writes/replay-log.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c
index ba66a5c..d832c2a 100644
--- a/src/log-writes/log-writes.c
+++ b/src/log-writes/log-writes.c
@@ -119,6 +119,24 @@ int log_discard(struct log *log, struct log_write_entry *entry)
 
 /*
  * @log: the log we are replaying.
+ * @entry: entry to be replayed.
+ *
+ * @return: 0 if we should replay the entry, > 0 if we should skip it.
+ *
+ * Should we skip the entry in our log or replay onto the replay device.
+ */
+int log_should_skip(struct log *log, struct log_write_entry *entry)
+{
+	if (!entry->nr_sectors)
+		return 0;
+	if (entry->sector + entry->nr_sectors < log->start_sector ||
+	    entry->sector > log->end_sector)
+		return 1;
+	return 0;
+}
+
+/*
+ * @log: the log we are replaying.
  * @entry: where we put the entry.
  * @read_data: read the entry data as well, entry must be log->sectorsize sized
  * if this is set.
@@ -137,6 +155,7 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
 	char *buf;
 	ssize_t ret;
 	off_t offset;
+	u64 skip = 0;
 
 	if (log->cur_entry >= log->nr_entries)
 		return 1;
@@ -158,9 +177,11 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
 		}
 	}
 
-	if (log_writes_verbose) {
+	skip = log_should_skip(log, entry);
+	if (log_writes_verbose > 1 || (log_writes_verbose && !skip)) {
 		offset = lseek(log->logfd, 0, SEEK_CUR);
-		printf("replaying %d@%llu: sector %llu, size %llu, flags %llu\n",
+		printf("%s %d@%llu: sector %llu, size %llu, flags %llu\n",
+		       skip ? "skipping" : "replaying",
 		       (int)log->cur_entry - 1, offset / log->sectorsize,
 		       (unsigned long long)le64_to_cpu(entry->sector),
 		       (unsigned long long)size,
@@ -173,6 +194,14 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
 	if (flags & LOG_DISCARD_FLAG)
 		return log_discard(log, entry);
 
+	if (skip) {
+		if (lseek(log->logfd, size, SEEK_CUR) == (off_t)-1) {
+			fprintf(stderr, "Error seeking in log: %d\n", errno);
+			return -1;
+		}
+		return 0;
+	}
+
 	buf = malloc(size);
 	if (!buf) {
 		fprintf(stderr, "Error allocating buffer %llu entry %llu\n", (unsigned long long)size, (unsigned long long)log->cur_entry - 1);
diff --git a/src/log-writes/log-writes.h b/src/log-writes/log-writes.h
index 13f98ff..fc84acf 100644
--- a/src/log-writes/log-writes.h
+++ b/src/log-writes/log-writes.h
@@ -53,6 +53,8 @@ struct log {
 	int replayfd;
 	unsigned long flags;
 	u64 sectorsize;
+	u64 start_sector;
+	u64 end_sector;
 	u64 nr_entries;
 	u64 cur_entry;
 	u64 max_zero_size;
diff --git a/src/log-writes/replay-log.c b/src/log-writes/replay-log.c
index 87c03a2..971974b 100644
--- a/src/log-writes/replay-log.c
+++ b/src/log-writes/replay-log.c
@@ -20,6 +20,8 @@ enum option_indexes {
 	FSCK,
 	CHECK,
 	START_MARK,
+	START_SECTOR,
+	END_SECTOR,
 };
 
 static struct option long_options[] = {
@@ -37,6 +39,8 @@ static struct option long_options[] = {
 	{"fsck", required_argument, NULL, 0},
 	{"check", required_argument, NULL, 0},
 	{"start-mark", required_argument, NULL, 0},
+	{"start-sector", required_argument, NULL, 0},
+	{"end-sector", required_argument, NULL, 0},
 	{ NULL, 0, NULL, 0 },
 };
 
@@ -61,6 +65,12 @@ static void usage(void)
 		"--check\n");
 	fprintf(stderr, "\t--check [<number>|flush|fua] when to check the "
 		"file system, mush specify --fsck\n");
+	fprintf(stderr, "\t--start-sector <sector> - replay ops on region "
+		"from <sector> onto <device>\n");
+	fprintf(stderr, "\t--end-sector <sector> - replay ops on region "
+		"to <sector> onto <device>\n");
+	fprintf(stderr, "\t-v or --verbose - print replayed ops\n");
+	fprintf(stderr, "\t-vv - print also skipped ops\n");
 	exit(1);
 }
 
@@ -120,6 +130,8 @@ int main(int argc, char **argv)
 	struct log_write_entry *entry;
 	u64 stop_flags = 0;
 	u64 start_entry = 0;
+	u64 start_sector = 0;
+	u64 end_sector = -1ULL;
 	u64 run_limit = 0;
 	u64 num_entries = 0;
 	u64 check_number = 0;
@@ -240,6 +252,22 @@ int main(int argc, char **argv)
 				tmp = NULL;
 			}
 			break;
+		case START_SECTOR:
+			start_sector = strtoull(optarg, &tmp, 0);
+			if (tmp && *tmp != '\0') {
+				fprintf(stderr, "Invalid sector number\n");
+				exit(1);
+			}
+			tmp = NULL;
+			break;
+		case END_SECTOR:
+			end_sector = strtoull(optarg, &tmp, 0);
+			if (tmp && *tmp != '\0') {
+				fprintf(stderr, "Invalid sector number\n");
+				exit(1);
+			}
+			tmp = NULL;
+			break;
 		default:
 			usage();
 		}
@@ -257,6 +285,9 @@ int main(int argc, char **argv)
 	if (!discard)
 		log->flags |= LOG_IGNORE_DISCARD;
 
+	log->start_sector = start_sector;
+	log->end_sector = end_sector;
+
 	entry = malloc(log->sectorsize);
 	if (!entry) {
 		fprintf(stderr, "Couldn't allocate buffer\n");
-- 
2.7.4
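
For readers following the range check, here is a minimal shell sketch of
the same test that log_should_skip() performs. The function name
should_skip and the sample sector values are made up for illustration;
only the comparison mirrors the patch. Note that, as in the patch, an
entry that ends exactly at the start sector is still replayed.

```shell
# Illustrative re-statement of log_should_skip(); not part of replay-log.
# Prints 1 when the entry should be skipped, 0 when it should be replayed.
should_skip() {
	sector=$1 nr_sectors=$2 start_sector=$3 end_sector=$4

	# Entries that carry no data are never skipped.
	if [ "$nr_sectors" -eq 0 ]; then
		echo 0
		return
	fi
	# Skip entries that end before the range or begin after it.
	if [ $((sector + nr_sectors)) -lt "$start_sector" ] ||
	   [ "$sector" -gt "$end_sector" ]; then
		echo 1
		return
	fi
	echo 0
}

should_skip 0 8 100 200		# prints 1 (entirely before the range)
should_skip 96 8 100 200	# prints 0 (overlaps the start of the range)
```

With the real tool the range would be given as --start-sector and
--end-sector, and -vv would show which entries were skipped.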


* [PATCH v2 13/14] fstests: add support for working with dm-log-writes target
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (11 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 12/14] replay-log: add support for replaying ops in target device sector range Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-09-05 11:22   ` Eryu Guan
  2017-08-30 14:51 ` [PATCH v2 14/14] fstests: add crash consistency fsx test using dm-log-writes Amir Goldstein
  2017-08-30 15:04 ` [PATCH v2 00/14] Crash consistency xfstest " Amir Goldstein
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

Cherry-picked the relevant common bits from commit 70d41e17164b
in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
Quoting from Josef's commit message:

  This patch adds the supporting code for using the dm-log-writes
  target.  The dmlogwrites code is similar to the dmflakey code, it just
  gives us functions to build and tear down a dm-log-writes target.  We
  add a new LOGWRITES_DEV variable to take in the device we will use as
  the log and add checks for that.

[Amir:]
- Removed unneeded _test_falloc_support
- Moved _require_log_writes to dmlogwrites
- Document _require_log_writes

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 README                       |  2 ++
 common/dmlogwrites           | 84 ++++++++++++++++++++++++++++++++++++++++++++
 doc/requirement-checking.txt | 20 +++++++++++
 3 files changed, 106 insertions(+)
 create mode 100644 common/dmlogwrites

diff --git a/README b/README
index 9456fa7..4963d28 100644
--- a/README
+++ b/README
@@ -91,6 +91,8 @@ Preparing system for tests:
              - set TEST_XFS_SCRUB=1 to have _check_xfs_filesystem run
                xfs_scrub -vd to scrub the filesystem metadata online before
                unmounting to run the offline check.
+             - setenv LOGWRITES_DEV to a block device to use for power fail
+               testing.
 
         - or add a case to the switch in common/config assigning
           these variables based on the hostname of your test
diff --git a/common/dmlogwrites b/common/dmlogwrites
new file mode 100644
index 0000000..a36724d
--- /dev/null
+++ b/common/dmlogwrites
@@ -0,0 +1,84 @@
+##/bin/bash
+#
+# Copyright (c) 2015 Facebook, Inc.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#
+# common functions for setting up and tearing down a dm log-writes device
+
+_require_log_writes()
+{
+	_require_dm_target log-writes
+	_require_test_program "log-writes/replay-log"
+}
+
+_init_log_writes()
+{
+	local BLK_DEV_SIZE=`blockdev --getsz $SCRATCH_DEV`
+	LOGWRITES_NAME=logwrites-test
+	LOGWRITES_DMDEV=/dev/mapper/$LOGWRITES_NAME
+	LOGWRITES_TABLE="0 $BLK_DEV_SIZE log-writes $SCRATCH_DEV $LOGWRITES_DEV"
+	$DMSETUP_PROG create $LOGWRITES_NAME --table "$LOGWRITES_TABLE" || \
+		_fatal "failed to create log-writes device"
+	$DMSETUP_PROG mknodes > /dev/null 2>&1
+}
+
+_log_writes_mark()
+{
+	[ $# -ne 1 ] && _fatal "_log_writes_mark takes one argument"
+	$DMSETUP_PROG message $LOGWRITES_NAME 0 mark $1
+}
+
+_log_writes_mkfs()
+{
+	_scratch_options mkfs
+	_mkfs_dev $SCRATCH_OPTIONS $LOGWRITES_DMDEV
+	_log_writes_mark mkfs
+}
+
+_mount_log_writes()
+{
+	mount -t $FSTYP $MOUNT_OPTIONS $* $LOGWRITES_DMDEV $SCRATCH_MNT
+}
+
+_unmount_log_writes()
+{
+	$UMOUNT_PROG $SCRATCH_MNT
+}
+
+# _replay_log <mark>
+#
+# This replays the log contained on $LOGWRITES_DEV onto $SCRATCH_DEV up to the
+# mark passed in.
+_replay_log()
+{
+	_mark=$1
+
+	$here/src/log-writes/replay-log --log $LOGWRITES_DEV --replay $SCRATCH_DEV \
+		--end-mark $_mark > /dev/null 2>&1
+	[ $? -ne 0 ] && _fatal "replay failed"
+}
+
+_log_writes_remove()
+{
+	$DMSETUP_PROG remove $LOGWRITES_NAME > /dev/null 2>&1
+	$DMSETUP_PROG mknodes > /dev/null 2>&1
+}
+
+_cleanup_log_writes()
+{
+	$UMOUNT_PROG $SCRATCH_MNT > /dev/null 2>&1
+	_log_writes_remove
+}
diff --git a/doc/requirement-checking.txt b/doc/requirement-checking.txt
index 95d10e6..4e01b1f 100644
--- a/doc/requirement-checking.txt
+++ b/doc/requirement-checking.txt
@@ -21,6 +21,10 @@ they have.  This is done with _require_<xxx> macros, which may take parameters.
 
 	_require_statx
 
+ (4) Device mapper requirement.
+
+	_require_dm_target
+	_require_log_writes
 
 ====================
 GENERAL REQUIREMENTS
@@ -102,3 +106,19 @@ _require_statx
 
      The test requires the use of the statx() system call and will be skipped
      if it isn't available in the kernel.
+
+
+==========================
+DEVICE MAPPER REQUIREMENTS
+==========================
+
+_require_dm_target <name>
+
+     The test requires the use of the device mapper target and will be skipped
+     if it isn't available in the kernel.
+
+_require_log_writes
+
+     The test requires the use of the device mapper target log-writes.
+     The test also requires the test program log-writes/replay-log is built
+     and will be skipped if either isn't available.
-- 
2.7.4
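
As a small illustration of what _init_log_writes passes to dmsetup, the
sketch below builds the same table line from its three inputs. The
helper name logwrites_table and the device paths are hypothetical; the
layout of the line ("0 <size> log-writes <data dev> <log dev>") is taken
from LOGWRITES_TABLE in the patch.

```shell
# Hypothetical helper: print a dm-log-writes table line covering the
# whole data device. The size is in 512-byte sectors, as reported by
# `blockdev --getsz`.
logwrites_table() {
	size_sectors=$1 data_dev=$2 log_dev=$3

	echo "0 $size_sectors log-writes $data_dev $log_dev"
}

# Example with made-up devices; nothing is created here.
logwrites_table 2097152 /dev/sdb /dev/sdc
# prints: 0 2097152 log-writes /dev/sdb /dev/sdc
```

In the patch, this string is handed to "$DMSETUP_PROG create
$LOGWRITES_NAME --table", and writes to the resulting
/dev/mapper device are then recorded on the log device.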


* [PATCH v2 14/14] fstests: add crash consistency fsx test using dm-log-writes
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (12 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 13/14] fstests: add support for working with dm-log-writes target Amir Goldstein
@ 2017-08-30 14:51 ` Amir Goldstein
  2017-09-05 11:28   ` Eryu Guan
  2017-08-30 15:04 ` [PATCH v2 00/14] Crash consistency xfstest " Amir Goldstein
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 14:51 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

DO NOT MERGE!!! This test fails, most likely due to a test bug.

The random seed values in this patch fail the test consistently on ext4,
always with the same fsck error (end of extent exceeds allowed value).
btrfs also fails, but with slightly different fsck errors each run.
xfs sometimes fails with a file checksum error.

Cherry-picked the test from commit 70d41e17164b
in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
Quoting from Josef's commit message:

  The test just runs some ops and exits, then finds all of the good buffers
  in the directory we provided and:
  - replays up to the mark given
  - mounts the file system and compares the md5sum
  - unmounts and fsck's to check for metadata integrity

  dm-log-writes will pretend to do discard and the replay-log tool will
  replay it properly depending on the underlying device, either by writing
  0's or actually calling the discard ioctl, so I've enabled discard in the
  test for maximum fun.

[Amir:]
- Removed unneeded _test_falloc_support dynamic FSX_OPTS
- Added placeholders for using constant random seeds
- Added test to new 'replay' group

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 tests/generic/500     | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/500.out |   2 +
 tests/generic/group   |   1 +
 3 files changed, 141 insertions(+)
 create mode 100755 tests/generic/500
 create mode 100644 tests/generic/500.out

diff --git a/tests/generic/500 b/tests/generic/500
new file mode 100755
index 0000000..81d45ef
--- /dev/null
+++ b/tests/generic/500
@@ -0,0 +1,138 @@
+#! /bin/bash
+# FS QA Test No. 500
+#
+# Run fsx with log writes to verify power fail safeness.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2015 Facebook. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_log_writes
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmlogwrites
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_test
+_require_scratch_nocheck
+_require_log_writes
+
+rm -f $seqres.full
+rm -rf $TEST_DIR/fsxtests
+
+check_files()
+{
+	local _name=$1
+
+	# Now look for our files
+	for i in $(find $SANITY_DIR -type f | grep $_name | grep mark)
+	do
+		local filename=$(basename $i)
+		local mark="${filename##*.}"
+		local expected_size=$(_ls_l -h $i | awk '{ print $5 }')
+		echo "checking $filename ($expected_size)" >> $seqres.full
+		_replay_log $filename
+		_scratch_mount
+		local expected_md5=$(md5sum $i | cut -f 1 -d ' ')
+		local md5=$(md5sum $SCRATCH_MNT/$_name | cut -f 1 -d ' ')
+		local size=$(_ls_l -h $SCRATCH_MNT/$_name | awk '{ print $5 }')
+		[ "${md5}" != "${expected_md5}" ] && _fatal "$filename ($size) md5sum mismatched"
+		_scratch_unmount
+		_check_scratch_fs
+	done
+}
+
+SANITY_DIR=$TEST_DIR/fsxtests
+mkdir $SANITY_DIR
+
+# Create the log
+_init_log_writes
+
+_log_writes_mkfs >> $seqres.full 2>&1
+
+# Log writes emulates discard support, turn it on for maximum crying.
+_mount_log_writes -o discard
+
+NUM_FILES=4
+NUM_OPS=200
+FSX_OPTS="-N $NUM_OPS -d -P $SANITY_DIR -i $LOGWRITES_DMDEV"
+# Set random seeds for fsx runs (0 for timestamp + pid)
+seeds=(- 2885 2886 2887 2888)
+# Run fsx for a while
+for j in `seq 1 $NUM_FILES`
+do
+	run_check $here/ltp/fsx $FSX_OPTS -S ${seeds[$j]} -j $j $SCRATCH_MNT/testfile$j &
+done
+wait
+
+test_md5=()
+test_size=()
+for j in `seq 1 $NUM_FILES`
+do
+	test_md5[$j]=$(md5sum $SCRATCH_MNT/testfile$j | cut -f 1 -d ' ')
+	test_size[$j]=$(_ls_l -h $SCRATCH_MNT/testfile$j | awk '{ print $5 }')
+done
+
+# Unmount the scratch dir and tear down the log writes target
+_log_writes_mark last
+_unmount_log_writes
+_log_writes_mark end
+_log_writes_remove
+_check_scratch_fs
+
+# check pre umount
+_replay_log last
+_scratch_mount
+_scratch_unmount
+_check_scratch_fs
+
+for j in `seq 1 $NUM_FILES`
+do
+	check_files testfile$j
+done
+
+# Check the end
+_replay_log end
+_scratch_mount
+for j in `seq 1 $NUM_FILES`
+do
+	md5=$(md5sum $SCRATCH_MNT/testfile$j | cut -f 1 -d ' ')
+	size=$(_ls_l -h $SCRATCH_MNT/testfile$j | awk '{ print $5 }')
+	[ "${md5}" != "${test_md5[$j]}" ] && _fatal "testfile$j end md5sum mismatched ($size:${test_size[$j]})"
+done
+_scratch_unmount
+_check_scratch_fs
+
+echo "Silence is golden"
+status=0
+exit
+
diff --git a/tests/generic/500.out b/tests/generic/500.out
new file mode 100644
index 0000000..883b2ca
--- /dev/null
+++ b/tests/generic/500.out
@@ -0,0 +1,2 @@
+QA output created by 500
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index 044ec3f..2396b72 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -453,3 +453,4 @@
 448 auto quick rw
 449 auto quick acl enospc
 450 auto quick rw
+500 auto log replay
-- 
2.7.4


* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
                   ` (13 preceding siblings ...)
  2017-08-30 14:51 ` [PATCH v2 14/14] fstests: add crash consistency fsx test using dm-log-writes Amir Goldstein
@ 2017-08-30 15:04 ` Amir Goldstein
  2017-08-30 15:23   ` Josef Bacik
  14 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 15:04 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests,
	linux-fsdevel, linux-xfs

Sorry for the noise, xfs list; I meant to CC fsdevel.

On Wed, Aug 30, 2017 at 5:51 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Hi all,
>
> This is the 2nd revision of crash consistency patch set.
> The main thing that changed since v1 is my confidence in the failures
> reported by the test, along with some more debugging options for
> running the test tools.
>
> I've collected these patches that have been sitting in Josef Bacik's
> tree for a few years and kicked them a bit into shape.
> The dm-log-writes target has been merged to kernel v4.1, see:
> https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/log-writes.txt
>
> For this posting, I kept the random seeds constant for the test.
> I set these constant seeds after running with random seed for a little
> while and getting failure reports. With the current values in the test
> I was able to reproduce at high probability failures with xfs, ext4 and btrfs.
> The probability of reproducing the failure is higher on a spinning disk.
>
> For xfs, I posted a fix for potential data loss post fsync+crash.
> For ext4, I posted a reliable reproducer using dm-flakey.
> For btrfs, I shared the recorded log with Josef.
>
> There is an outstanding problem with the test - when I run it with
> kvm-xfstests, the test halts and I get soft lockup of log_writes_kthread.
> I suppose it's a bug in dm-log-writes with some kernel config or with virtio.
> I wasn't able to determine the reason and have little time to debug this.
>
> Since dm-log-writes is anyway in upstream kernel, I don't think a bug
> in dm-log-writes for a certain config is a reason to block this xfstest
> from being merged.
> Anyway, I would be glad if someone could take a look at the soft lockup
> issue. Josef?
>
> Thanks,
> Amir.
>
> Amir Goldstein (14):
>   common/rc: convert some egrep to grep
>   common/rc: fix _require_xfs_io_command params check
>   fsx: fixes to random seed
>   fsx: fix path of .fsx* files
>   fsx: fix compile warnings
>   fsx: add support for integrity check with dm-log-writes target
>   fsx: add optional logid prefix to log messages
>   fsx: add support for --record-ops
>   fsx: add support for -g filldata
>   log-writes: add replay-log program to replay dm-log-writes target
>   replay-log: output log replay offset in verbose mode
>   replay-log: add support for replaying ops in target device range
>   fstests: add support for working with dm-log-writes target
>   fstests: add crash consistency fsx test using dm-log-writes
>
>  .gitignore                   |   1 +
>  README                       |   2 +
>  common/dmlogwrites           |  84 +++++++++
>  common/rc                    |  15 +-
>  doc/auxiliary-programs.txt   |   8 +
>  doc/requirement-checking.txt |  20 +++
>  ltp/fsx.c                    | 220 +++++++++++++++++++----
>  src/Makefile                 |   2 +-
>  src/log-writes/Makefile      |  23 +++
>  src/log-writes/SOURCE        |   6 +
>  src/log-writes/log-writes.c  | 410 +++++++++++++++++++++++++++++++++++++++++++
>  src/log-writes/log-writes.h  |  72 ++++++++
>  src/log-writes/replay-log.c  | 381 ++++++++++++++++++++++++++++++++++++++++
>  tests/generic/500            | 138 +++++++++++++++
>  tests/generic/500.out        |   2 +
>  tests/generic/group          |   1 +
>  16 files changed, 1340 insertions(+), 45 deletions(-)
>  create mode 100644 common/dmlogwrites
>  create mode 100644 src/log-writes/Makefile
>  create mode 100644 src/log-writes/SOURCE
>  create mode 100644 src/log-writes/log-writes.c
>  create mode 100644 src/log-writes/log-writes.h
>  create mode 100644 src/log-writes/replay-log.c
>  create mode 100755 tests/generic/500
>  create mode 100644 tests/generic/500.out
>
> --
> 2.7.4
>

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-08-30 15:04 ` [PATCH v2 00/14] Crash consistency xfstest " Amir Goldstein
@ 2017-08-30 15:23   ` Josef Bacik
  2017-08-30 18:39     ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Josef Bacik @ 2017-08-30 15:23 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Eryu Guan, Josef Bacik, Darrick J . Wong, Christoph Hellwig,
	fstests, linux-fsdevel, linux-xfs

On Wed, Aug 30, 2017 at 06:04:26PM +0300, Amir Goldstein wrote:
> Sorry noise xfs list, I meant to CC fsdevel
> 
> On Wed, Aug 30, 2017 at 5:51 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> > Hi all,
> >
> > This is the 2nd revision of crash consistency patch set.
> > The main thing that changed since v1 is my confidence in the failures
> > reported by the test, along with some more debugging options for
> > running the test tools.
> >
> > I've collected these patches that have been sitting in Josef Bacik's
> > tree for a few years and kicked them a bit into shape.
> > The dm-log-writes target has been merged to kernel v4.1, see:
> > https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/log-writes.txt
> >
> > For this posting, I kept the random seeds constant for the test.
> > I set these constant seeds after running with random seed for a little
> > while and getting failure reports. With the current values in the test
> > I was able to reproduce at high probability failures with xfs, ext4 and btrfs.
> > The probability of reproducing the failure is higher on a spinning disk.
> >

I'd rather we make it as evil as possible.  As long as we're printing out the
seed that was used in the output then we can go in and manually change the test
to use the same seed over and over again if we need to debug a problem.

> > For xfs, I posted a fix for potential data loss post fsync+crash.
> > For ext4, I posted a reliable reproducer using dm-flakey.
> > For btrfs, I shared the recorded log with Josef.
> >

I posted a patch to fix the problem you reported by the way, but my
git-send-email thing isn't set to cc people in the commit, sorry about that.

> > There is an outstanding problem with the test - when I run it with
> > kvm-xfstests, the test halts and I get soft lockup of log_writes_kthread.
> > I suppose it's a bug in dm-log-writes with some kernel config or with virtio.
> > I wasn't able to determine the reason and have little time to debug this.
> >
> > Since dm-log-writes is anyway in upstream kernel, I don't think a bug
> > in dm-log-writes for a certain config is a reason to block this xfstest
> > from being merged.
> > Anyway, I would be glad if someone could take a look at the soft lockup
> > issue. Josef?
> >

Yeah can you give this a try and see if the soft lockup goes away?

diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
index a1da0eb..b900758 100644
--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -345,6 +345,7 @@ static int log_writes_kthread(void *arg)
 		struct pending_block *block = NULL;
 		int ret;
 
+		cond_resched();
 		spin_lock_irq(&lc->blocks_lock);
 		if (!list_empty(&lc->logging_blocks)) {
 			block = list_first_entry(&lc->logging_blocks,

* Re: [PATCH v2 01/14] common/rc: convert some egrep to grep
  2017-08-30 14:51 ` [PATCH v2 01/14] common/rc: convert some egrep to grep Amir Goldstein
@ 2017-08-30 15:45   ` Darrick J. Wong
  0 siblings, 0 replies; 48+ messages in thread
From: Darrick J. Wong @ 2017-08-30 15:45 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Eryu Guan, Josef Bacik, Christoph Hellwig, fstests, linux-xfs

On Wed, Aug 30, 2017 at 05:51:33PM +0300, Amir Goldstein wrote:
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

> ---
>  common/rc | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/common/rc b/common/rc
> index 9c5f54a..9d7b783 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -2177,7 +2177,7 @@ _require_xfs_io_command()
>  		;;
>  	"fsmap" )
>  		testio=`$XFS_IO_PROG -f -c "fsmap" $testfile 2>&1`
> -		echo $testio | egrep -q "Inappropriate ioctl" && \
> +		echo $testio | grep -q "Inappropriate ioctl" && \
>  			_notrun "xfs_io $command support is missing"
>  		;;
>  	"open")
> @@ -2185,12 +2185,12 @@ _require_xfs_io_command()
>  		# a new -C flag was introduced to execute one shot commands.
>  		# Check for -C flag support as an indication for the bug fix.
>  		testio=`$XFS_IO_PROG -F -f -C "open $testfile" $testfile 2>&1`
> -		echo $testio | egrep -q "invalid option" && \
> +		echo $testio | grep -q "invalid option" && \
>  			_notrun "xfs_io $command support is missing"
>  		;;
>  	"scrub"|"repair")
>  		testio=`$XFS_IO_PROG -x -c "$command dummy 0" $TEST_DIR 2>&1`
> -		echo $testio | egrep -q "Inappropriate ioctl" && \
> +		echo $testio | grep -q "Inappropriate ioctl" && \
>  			_notrun "xfs_io $command support is missing"
>  		;;
>  	"utimes" )
> @@ -2209,7 +2209,7 @@ _require_xfs_io_command()
>  		_notrun "xfs_io $command failed (old kernel/wrong fs/bad args?)"
>  	echo $testio | grep -q "foreign file active" && \
>  		_notrun "xfs_io $command not supported on $FSTYP"
> -	echo $testio | egrep -q "Function not implemented" && \
> +	echo $testio | grep -q "Function not implemented" && \
>  		_notrun "xfs_io $command support is missing (missing syscall?)"
>  
>  	if [ -n "$param" -a $param_checked -eq 0 ]; then
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH v2 02/14] common/rc: fix _require_xfs_io_command params check
  2017-08-30 14:51 ` [PATCH v2 02/14] common/rc: fix _require_xfs_io_command params check Amir Goldstein
@ 2017-08-30 16:17   ` Darrick J. Wong
  0 siblings, 0 replies; 48+ messages in thread
From: Darrick J. Wong @ 2017-08-30 16:17 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Eryu Guan, Josef Bacik, Christoph Hellwig, fstests, linux-xfs

On Wed, Aug 30, 2017 at 05:51:34PM +0300, Amir Goldstein wrote:
> When _require_xfs_io_command is passed command parameters,
> the resulting error from invalid parameters may be ignored.
> 
> For example, the following bogus params would not abort the test:
> _require_xfs_io_command "falloc" "-X"
> _require_xfs_io_command "fiemap" "-X"
> 
> Fix this by looking for the relevant error message.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

> ---
>  common/rc | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/common/rc b/common/rc
> index 9d7b783..44b98f6 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -2212,9 +2212,14 @@ _require_xfs_io_command()
>  	echo $testio | grep -q "Function not implemented" && \
>  		_notrun "xfs_io $command support is missing (missing syscall?)"
>  
> -	if [ -n "$param" -a $param_checked -eq 0 ]; then
> +	[ -n "$param" ] || return
> +
> +	if [ $param_checked -eq 0 ]; then
>  		$XFS_IO_PROG -c "help $command" | grep -q "^ $param --" || \
>  			_notrun "xfs_io $command doesn't support $param"
> +	else
> +		echo $testio | grep -q "invalid option" && \
> +			_notrun "xfs_io $command doesn't support $param"
>  	fi
>  }
>  
> -- 
> 2.7.4
> 

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-08-30 15:23   ` Josef Bacik
@ 2017-08-30 18:39     ` Amir Goldstein
  2017-08-30 18:55       ` Josef Bacik
  2017-08-31  3:38       ` Eryu Guan
  0 siblings, 2 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 18:39 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Eryu Guan, Josef Bacik, Darrick J . Wong, Christoph Hellwig,
	fstests, linux-fsdevel, linux-xfs

On Wed, Aug 30, 2017 at 6:23 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> On Wed, Aug 30, 2017 at 06:04:26PM +0300, Amir Goldstein wrote:
>> Sorry noise xfs list, I meant to CC fsdevel
>>
>> On Wed, Aug 30, 2017 at 5:51 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> > Hi all,
>> >
>> > This is the 2nd revision of crash consistency patch set.
>> > The main thing that changed since v1 is my confidence in the failures
>> > reported by the test, along with some more debugging options for
>> > running the test tools.
>> >
>> > I've collected these patches that have been sitting in Josef Bacik's
>> > tree for a few years and kicked them a bit into shape.
>> > The dm-log-writes target has been merged to kernel v4.1, see:
>> > https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/log-writes.txt
>> >
>> > For this posting, I kept the random seeds constant for the test.
>> > I set these constant seeds after running with random seed for a little
>> > while and getting failure reports. With the current values in the test
>> > I was able to reproduce at high probability failures with xfs, ext4 and btrfs.
>> > The probability of reproducing the failure is higher on a spinning disk.
>> >
>
> I'd rather we make it as evil as possible.  As long as we're printing out the
> seed that was used in the output then we can go in and manually change the test
> to use the same seed over and over again if we need to debug a problem.

Yeah, that's what I did, but then I found values that reproduce a problem,
so maybe it's worth holding on to these values until the bugs are fixed
upstream, and then keeping them as regression tests.

Anyway, I can keep these presets commented out, or run the test twice,
once with presets and once with random seed, whatever Eryu decides.


>
>> > There is an outstanding problem with the test - when I run it with
>> > kvm-xfstests, the test halts and I get soft lockup of log_writes_kthread.
>> > I suppose its a bug in dm-log-writes with some kernel config or with virtio
>> > I wasn't able to determine the reason and have little time to debug this.
>> >
>> > Since dm-log-writes is anyway in upstream kernel, I don't think a bug
>> > in dm-log-writes for a certain config is a reason to block this xfstest
>> > from being merged.
>> > Anyway, I would be glad if someone could take a look at the soft lockup
>> > issue. Josef?
>> >
>
> Yeah can you give this a try and see if the soft lockup goes away?
>

It does go away. Thanks!
Now something's wrong with the log.
It gets corrupted in most of the test runs, something like this:

replaying 17624@158946: sector 8651296, size 4096, flags 0
replaying 17625@158955: sector 0, size 0, flags 0
replaying 17626@158956: sector 72057596591815616, size 103079215104, flags 0
Error allocating buffer 103079215104 entry 17626

I'll look into it
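
As a rough sketch of the kind of sanity check that would catch an entry
like 17626 before trying to allocate a buffer for it: compare the entry
against the size of the replay target. The helper name entry_is_sane is
hypothetical and is not part of replay-log. It assumes that "sector" is
in 512-byte units and that "size" and the device size are in bytes. The
all-zero entry 17625 would still pass this particular check.

```shell
# Hypothetical helper, not part of replay-log: returns 0 if a log entry
# fits on the target device, 1 otherwise.
# Assumptions: sector is in 512-byte units; size and dev_bytes are bytes.
entry_is_sane()
{
	sector=$1 size=$2 dev_bytes=$3

	# Reject entries larger than the whole device first, so the
	# subtraction below cannot go negative.
	[ "$size" -le "$dev_bytes" ] || return 1

	# Compare in sectors instead of multiplying sector by 512, which
	# could overflow 64-bit shell arithmetic for garbage values.
	[ "$sector" -le $(( (dev_bytes - size) / 512 )) ]
}
```

For example, with a 10 GiB target (an assumed size), the call
"entry_is_sane 8651296 4096 10737418240" succeeds, while the values
reported for entry 17626 are rejected.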

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-08-30 18:39     ` Amir Goldstein
@ 2017-08-30 18:55       ` Josef Bacik
  2017-08-30 19:43         ` Amir Goldstein
  2017-08-31  3:38       ` Eryu Guan
  1 sibling, 1 reply; 48+ messages in thread
From: Josef Bacik @ 2017-08-30 18:55 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Eryu Guan, Josef Bacik, Darrick J . Wong,
	Christoph Hellwig, fstests, linux-fsdevel, linux-xfs

On Wed, Aug 30, 2017 at 09:39:39PM +0300, Amir Goldstein wrote:
> On Wed, Aug 30, 2017 at 6:23 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> > On Wed, Aug 30, 2017 at 06:04:26PM +0300, Amir Goldstein wrote:
> >> Sorry noise xfs list, I meant to CC fsdevel
> >>
> >> On Wed, Aug 30, 2017 at 5:51 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > This is the 2nd revision of crash consistency patch set.
> >> > The main thing that changed since v1 is my confidence in the failures
> >> > reported by the test, along with some more debugging options for
> >> > running the test tools.
> >> >
> >> > I've collected these patches that have been sitting in Josef Bacik's
> >> > tree for a few years and kicked them a bit into shape.
> >> > The dm-log-writes target has been merged to kernel v4.1, see:
> >> > https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/log-writes.txt
> >> >
> >> > For this posting, I kept the random seeds constant for the test.
> >> > I set these constant seeds after running with random seed for a little
> >> > while and getting failure reports. With the current values in the test
> >> > I was able to reproduce at high probablity failures with xfs, ext4 and btrfs.
> >> > The probablity of reproducing the failure is higher on a spinning disk.
> >> >
> >
> > I'd rather we make it as evil as possible.  As long as we're printing out the
> > seed that was used in the output then we can go in and manually change the test
> > to use the same seed over and over again if we need to debug a problem.
> 
> Yeh that's what I did, but then I found values that reproduce a problem,
> so maybe its worth clinging on to these values now until the bugs are fixed in
> upstream and then as regression tests.
> 
> Anyway, I can keep these presets commented out, or run the test twice,
> once with presets and once with random seed, whatever Eryu decides.
> 
> 
> >
> >> > There is an outstanding problem with the test - when I run it with
> >> > kvm-xfstests, the test halts and I get soft lockup of log_writes_kthread.
> >> > I suppose its a bug in dm-log-writes with some kernel config or with virtio
> >> > I wasn't able to determine the reason and have little time to debug this.
> >> >
> >> > Since dm-log-writes is anyway in upstream kernel, I don't think a bug
> >> > in dm-log-writes for a certain config is a reason to block this xfstest
> >> > from being merged.
> >> > Anyway, I would be glad if someone could take a look at the soft lockup
> >> > issue. Josef?
> >> >
> >
> > Yeah can you give this a try and see if the soft lockup goes away?
> >
> 
> It does go away. Thanks!
> Now something's wrong with the log.
> it get corrupted in most of the test runs, something like this:
> 
> replaying 17624@158946: sector 8651296, size 4096, flags 0
> replaying 17625@158955: sector 0, size 0, flags 0
> replaying 17626@158956: sector 72057596591815616, size 103079215104, flags 0
> Error allocating buffer 103079215104 entry 17626
> 
> I'll look into it

Oh are the devices 4k sectorsize devices?  I fucked up 4k sectorsize support, I
sent some patches to fix it but they haven't been integrated yet, I'll poke
those again.  They are in my dm-log-writes-fixes branch in my btrfs-next tree on
kernel.org.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-08-30 18:55       ` Josef Bacik
@ 2017-08-30 19:43         ` Amir Goldstein
       [not found]           ` <CAOQ4uxjt-zZ7_iE7ZYUcp8qWYUH=aDLSum70Dmbnth-5smFQ+A@mail.gmail.com>
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-08-30 19:43 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Eryu Guan, Josef Bacik, Darrick J . Wong, Christoph Hellwig,
	fstests, linux-fsdevel, linux-xfs

On Wed, Aug 30, 2017 at 9:55 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> On Wed, Aug 30, 2017 at 09:39:39PM +0300, Amir Goldstein wrote:
>> On Wed, Aug 30, 2017 at 6:23 PM, Josef Bacik <josef@toxicpanda.com> wrote:
>> > On Wed, Aug 30, 2017 at 06:04:26PM +0300, Amir Goldstein wrote:
>> >> Sorry noise xfs list, I meant to CC fsdevel
>> >>
>> >> On Wed, Aug 30, 2017 at 5:51 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> >> > Hi all,
>> >> >
>> >> > This is the 2nd revision of crash consistency patch set.
>> >> > The main thing that changed since v1 is my confidence in the failures
>> >> > reported by the test, along with some more debugging options for
>> >> > running the test tools.
>> >> >
>> >> > I've collected these patches that have been sitting in Josef Bacik's
>> >> > tree for a few years and kicked them a bit into shape.
>> >> > The dm-log-writes target has been merged to kernel v4.1, see:
>> >> > https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/log-writes.txt
>> >> >
>> >> > For this posting, I kept the random seeds constant for the test.
>> >> > I set these constant seeds after running with random seed for a little
>> >> > while and getting failure reports. With the current values in the test
>> >> > I was able to reproduce at high probablity failures with xfs, ext4 and btrfs.
>> >> > The probablity of reproducing the failure is higher on a spinning disk.
>> >> >
>> >
>> > I'd rather we make it as evil as possible.  As long as we're printing out the
>> > seed that was used in the output then we can go in and manually change the test
>> > to use the same seed over and over again if we need to debug a problem.
>>
>> Yeh that's what I did, but then I found values that reproduce a problem,
>> so maybe its worth clinging on to these values now until the bugs are fixed in
>> upstream and then as regression tests.
>>
>> Anyway, I can keep these presets commented out, or run the test twice,
>> once with presets and once with random seed, whatever Eryu decides.
>>
>>
>> >
>> >> > There is an outstanding problem with the test - when I run it with
>> >> > kvm-xfstests, the test halts and I get soft lockup of log_writes_kthread.
>> >> > I suppose its a bug in dm-log-writes with some kernel config or with virtio
>> >> > I wasn't able to determine the reason and have little time to debug this.
>> >> >
>> >> > Since dm-log-writes is anyway in upstream kernel, I don't think a bug
>> >> > in dm-log-writes for a certain config is a reason to block this xfstest
>> >> > from being merged.
>> >> > Anyway, I would be glad if someone could take a look at the soft lockup
>> >> > issue. Josef?
>> >> >
>> >
>> > Yeah can you give this a try and see if the soft lockup goes away?
>> >
>>
>> It does go away. Thanks!
>> Now something's wrong with the log.
>> it get corrupted in most of the test runs, something like this:
>>
>> replaying 17624@158946: sector 8651296, size 4096, flags 0
>> replaying 17625@158955: sector 0, size 0, flags 0
>> replaying 17626@158956: sector 72057596591815616, size 103079215104, flags 0
>> Error allocating buffer 103079215104 entry 17626
>>
>> I'll look into it
>
> Oh are the devices 4k sectorsize devices?  I fucked up 4k sectorsize support, I
> sent some patches to fix it but they haven't been integrated yet, I'll poke
> those again.  They are in my dm-log-writes-fixes branch in my btrfs-next tree on
> kernel.org.  Thanks,
>

No, they are just virtio devices in kvm backed by my SSD LV, and the
same test works just fine on that LV outside of kvm.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-08-30 18:39     ` Amir Goldstein
  2017-08-30 18:55       ` Josef Bacik
@ 2017-08-31  3:38       ` Eryu Guan
  2017-08-31  4:29         ` Amir Goldstein
  2017-09-01  7:29         ` Amir Goldstein
  1 sibling, 2 replies; 48+ messages in thread
From: Eryu Guan @ 2017-08-31  3:38 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Josef Bacik, Darrick J . Wong, Christoph Hellwig,
	fstests, linux-fsdevel, linux-xfs

On Wed, Aug 30, 2017 at 09:39:39PM +0300, Amir Goldstein wrote:
> On Wed, Aug 30, 2017 at 6:23 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> > On Wed, Aug 30, 2017 at 06:04:26PM +0300, Amir Goldstein wrote:
> >> Sorry noise xfs list, I meant to CC fsdevel
> >>
> >> On Wed, Aug 30, 2017 at 5:51 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > This is the 2nd revision of crash consistency patch set.
> >> > The main thing that changed since v1 is my confidence in the failures
> >> > reported by the test, along with some more debugging options for
> >> > running the test tools.
> >> >
> >> > I've collected these patches that have been sitting in Josef Bacik's
> >> > tree for a few years and kicked them a bit into shape.
> >> > The dm-log-writes target has been merged to kernel v4.1, see:
> >> > https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/log-writes.txt
> >> >
> >> > For this posting, I kept the random seeds constant for the test.
> >> > I set these constant seeds after running with random seed for a little
> >> > while and getting failure reports. With the current values in the test
> >> > I was able to reproduce at high probablity failures with xfs, ext4 and btrfs.
> >> > The probablity of reproducing the failure is higher on a spinning disk.
> >> >
> >
> > I'd rather we make it as evil as possible.  As long as we're printing out the
> > seed that was used in the output then we can go in and manually change the test
> > to use the same seed over and over again if we need to debug a problem.
> 
> Yeh that's what I did, but then I found values that reproduce a problem,
> so maybe its worth clinging on to these values now until the bugs are fixed in
> upstream and then as regression tests.
> 
> Anyway, I can keep these presets commented out, or run the test twice,
> once with presets and once with random seed, whatever Eryu decides.

My first thought on this is to use a random seed. If a specific seed
reproduces something, maybe another targeted regression test can be
added, as you did for that ext4 corruption?

> 
> 
> >
> >> > There is an outstanding problem with the test - when I run it with
> >> > kvm-xfstests, the test halts and I get soft lockup of log_writes_kthread.
> >> > I suppose its a bug in dm-log-writes with some kernel config or with virtio
> >> > I wasn't able to determine the reason and have little time to debug this.
> >> >
> >> > Since dm-log-writes is anyway in upstream kernel, I don't think a bug
> >> > in dm-log-writes for a certain config is a reason to block this xfstest
> >> > from being merged.
> >> > Anyway, I would be glad if someone could take a look at the soft lockup
> >> > issue. Josef?
> >> >
> >
> > Yeah can you give this a try and see if the soft lockup goes away?
> >
> 
> It does go away. Thanks!
> Now something's wrong with the log.
> it get corrupted in most of the test runs, something like this:
> 
> replaying 17624@158946: sector 8651296, size 4096, flags 0
> replaying 17625@158955: sector 0, size 0, flags 0
> replaying 17626@158956: sector 72057596591815616, size 103079215104, flags 0
> Error allocating buffer 103079215104 entry 17626
> 
> I'll look into it
> 
> Amir.

The first 6 patches are all preparatory work and seem fine, so I will
probably push them out this week. But I may need more time to look into
all these log-writes dm target and fsx changes.

But it seems that there are still problems not sorted out (e.g. this
log-writes bug), so when they get merged I'd prefer to remove the auto
group for now, until things settle down a bit.

Thanks,
Eryu

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-08-31  3:38       ` Eryu Guan
@ 2017-08-31  4:29         ` Amir Goldstein
  2017-09-01  7:29         ` Amir Goldstein
  1 sibling, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-08-31  4:29 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Josef Bacik, Darrick J . Wong, Christoph Hellwig,
	fstests, linux-fsdevel, Theodore Tso

On Thu, Aug 31, 2017 at 6:38 AM, Eryu Guan <eguan@redhat.com> wrote:
> On Wed, Aug 30, 2017 at 09:39:39PM +0300, Amir Goldstein wrote:
...
>> >> > For this posting, I kept the random seeds constant for the test.
>> >> > I set these constant seeds after running with random seed for a little
>> >> > while and getting failure reports. With the current values in the test
>> >> > I was able to reproduce at high probablity failures with xfs, ext4 and btrfs.
>> >> > The probablity of reproducing the failure is higher on a spinning disk.
>> >> >
>> >
>> > I'd rather we make it as evil as possible.  As long as we're printing out the
>> > seed that was used in the output then we can go in and manually change the test
>> > to use the same seed over and over again if we need to debug a problem.
>>
>> Yeh that's what I did, but then I found values that reproduce a problem,
>> so maybe its worth clinging on to these values now until the bugs are fixed in
>> upstream and then as regression tests.
>>
>> Anyway, I can keep these presets commented out, or run the test twice,
>> once with presets and once with random seed, whatever Eryu decides.
>
> My thought on this with first glance is using random seed, if a specific
> seed reproduce something, maybe another targeted regression test can be
> added, as what you did for that ext4 corruption?
>

Sure. Speaking of the ext4 corruption, I did not re-post that test with
this series because it's quite an ugly black box test. I figured that if
the ext4 guys took a look and understood the problem, they could write a
more intelligent test. OTOH maybe it's better than nothing?

BTW, Josef, did/could you write a more intelligent test to catch the
extent crc bug that you fixed? If not, was it easy to reproduce with the
provided seed presets? And without them?
I am asking in order to understand whether a regression test for that bug
is needed beyond random seed fsx.

BTW2, the xfs bug I found is reproduced with reasonable likelihood
with any random seed. By using the provided presets, I was able to
reduce the test run time and debug cycle considerably. I used
NUM_FILES=2; NUM_OPS=31 to reproduce at > 50% probability
within seconds. So this bug doesn't require a specialized regression test.

...
>
> The first 6 patches are all prepare work and seem fine, so I probably
> will push them out this week. But I may need more time to look into all
> these log-writes dm target and fsx changes.
>
> But seems that there're still problems not sorted out (e.g. this
> log-write bug), I'd prefer, when they get merged, removing the auto
> group for now until things settle down a bit.
>

Good idea. Anyway, I would be happy to see these tests used by N > 1
testers for a start.
If some version is merged so people can start pointing this big gun at
their file systems, I imagine more interesting bugs will surface.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
       [not found]                 ` <20170831205403.2tene34ccvw55yo7@destiny>
@ 2017-09-01  6:52                   ` Amir Goldstein
  2017-09-01  7:03                     ` Josef Bacik
                                       ` (3 more replies)
  0 siblings, 4 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-09-01  6:52 UTC (permalink / raw)
  To: Josef Bacik; +Cc: fstests, Theodore Tso, Eryu Guan

[CC list, Ted]

On Thu, Aug 31, 2017 at 11:54 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> On Thu, Aug 31, 2017 at 05:02:46PM +0300, Amir Goldstein wrote:
>> On Thu, Aug 31, 2017 at 4:43 PM, Josef Bacik <josef@toxicpanda.com> wrote:
>> > On Thu, Aug 31, 2017 at 03:48:44PM +0300, Amir Goldstein wrote:
>> >>
>> >> Josef,
>> >>
>> >> I am at lost with these log corruptions.
>> >> I see log entry bios submitted and log_end_io report success,
>> >> but then in the log I see old data on disk where that entry should be.
>> >> This happens quite randomly and I assume it also happens on
>> >> logged data, because tests sometime fail on checksum on ext4.
>> >>
>> >> Mean while I added some more log entry sanity checks and debug
>> >> prints to replay-log to debug the corruption:
>> >> https://github.com/amir73il/xfstests/commit/bb946deb0dc285867be394613ddb19ce281392cc
>> >>
>> >> This only happens to me when running in kvm, so maybe something
>> >> with the virtio devices is fishy.
>> >>
>> >> Anyway, I ran out of time to work on this for now, so if you have
>> >> any ideas and/or time to test this issue, let me know.
>> >>
>> >
...
>>
>
> Alright I tested it and it's working fine for me.  I'm creating three lv's and
> then doing
>
> -drive file=/dev/mapper/whatever,format=raw,cache=none,if=virtio,aio=native
>
> And I get /dev/vd[bcd] which I use for my test/scratch/log dev and it works out
> fine.  What is your -drive option line and I'll duplicate what you are doing.
> Thanks,
>

I am using Ted's kvm-xfstests, so this is the qemu command line:
https://github.com/tytso/xfstests-bld/blob/master/kvm-xfstests/kvm-xfstests#L104

The only difference in the -drive command is that it has no aio=native.
BINGO! When I add aio=native there are no more log corruptions :)
Please try to use aio=threads to see if you also get log corruptions.

The thing is, we cannot change kvm-xfstests to always use aio=native
because it is not recommended for sparse images:
https://access.redhat.com/articles/41313
I will try to work something out so that kvm-xfstests will use aio=native
when using the recommended (but not default) LV setup.
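
One way this could work, as a rough sketch: pick aio=native only when the
backing store is a block device such as an LV, and fall back to
aio=threads for image files. The function names aio_mode_for and
drive_opt are hypothetical and this is not the actual kvm-xfstests code;
the remaining -drive options are copied from Josef's command line quoted
above. Keeping cache=none matters here because, as far as I know, QEMU
only accepts aio=native together with direct I/O.

```shell
# Hypothetical helpers, not taken from kvm-xfstests.

# Print the aio mode to use for a given backing file or device.
aio_mode_for()
{
	if [ -b "$1" ]; then
		# Block device (e.g. an LV): native AIO is suitable.
		echo native
	else
		# Regular (possibly sparse) image file: stay with threads.
		echo threads
	fi
}

# Print a complete -drive option string for a backing file or device.
drive_opt()
{
	echo "file=$1,format=raw,cache=none,if=virtio,aio=$(aio_mode_for "$1")"
}
```

A caller would then pass something like -drive "$(drive_opt "$dev")" to
qemu for each of the test, scratch and log devices.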

However, why would aio=threads cause log corruption?
Does it indicate a bug in kvm-qemu or in dm-log-writes??

Did you try to use kvm-xfstests? It's quite convenient to deploy en masse,
so I think it would be an ideal environment to integrate crash tests with.
It also helps unify the environment among us fs developers
when a bug cannot be reproduced on another system. See:
https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-xfstests.md

Anyway, if you do end up using kvm-xfstests, you'll need this
small patch to automatically define the log-writes device:

--- a/kvm-xfstests/test-appliance/files/root/runtests.sh
+++ b/kvm-xfstests/test-appliance/files/root/runtests.sh
@@ -269,9 +269,11 @@ do
            if test "$SIZE" = "large" ; then
                export SCRATCH_DEV=$LG_SCR_DEV
                export SCRATCH_MNT=$LG_SCR_MNT
+               export LOGWRITES_DEV=$SM_SCR_DEV
            else
                export SCRATCH_DEV=$SM_SCR_DEV
                export SCRATCH_MNT=$SM_SCR_MNT
+               export LOGWRITES_DEV=$LG_SCR_DEV
            fi
        fi

kvm-xfstests defines 2 sets of test/scratch devices, a small set and a
large set, and uses only one of those sets depending on the command line,
so I use the "other" scratch device as the log-writes device.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-09-01  6:52                   ` Amir Goldstein
@ 2017-09-01  7:03                     ` Josef Bacik
  2017-09-01 20:07                     ` Josef Bacik
                                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 48+ messages in thread
From: Josef Bacik @ 2017-09-01  7:03 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Josef Bacik, fstests, Theodore Tso, Eryu Guan

On Fri, Sep 01, 2017 at 09:52:18AM +0300, Amir Goldstein wrote:
> [CC list, Ted]
> 
> On Thu, Aug 31, 2017 at 11:54 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> > On Thu, Aug 31, 2017 at 05:02:46PM +0300, Amir Goldstein wrote:
> >> On Thu, Aug 31, 2017 at 4:43 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> >> > On Thu, Aug 31, 2017 at 03:48:44PM +0300, Amir Goldstein wrote:
> >> >>
> >> >> Josef,
> >> >>
> >> >> I am at lost with these log corruptions.
> >> >> I see log entry bios submitted and log_end_io report success,
> >> >> but then in the log I see old data on disk where that entry should be.
> >> >> This happens quite randomly and I assume it also happens on
> >> >> logged data, because tests sometime fail on checksum on ext4.
> >> >>
> >> >> Mean while I added some more log entry sanity checks and debug
> >> >> prints to replay-log to debug the corruption:
> >> >> https://github.com/amir73il/xfstests/commit/bb946deb0dc285867be394613ddb19ce281392cc
> >> >>
> >> >> This only happens to me when running in kvm, so maybe something
> >> >> with the virtio devices is fishy.
> >> >>
> >> >> Anyway, I ran out of time to work on this for now, so if you have
> >> >> any ideas and/or time to test this issue, let me know.
> >> >>
> >> >
> ...
> >>
> >
> > Alright I tested it and it's working fine for me.  I'm creating three lv's and
> > then doing
> >
> > -drive file=/dev/mapper/whatever,format=raw,cache=none,if=virtio,aio=native
> >
> > And I get /dev/vd[bcd] which I use for my test/scratch/log dev and it works out
> > fine.  What is your -drive option line and I'll duplicate what you are doing.
> > Thanks,
> >
> 
> I am using Ted's kvm-xfstests, so this is the qemu command line:
> https://github.com/tytso/xfstests-bld/blob/master/kvm-xfstests/kvm-xfstests#L104
> 
> The only difference in -drive command is no aio=native.
> BINGO! when I add aio-native there are no more log corruptions :)
> Please try to use aio=threads to see if you also get log corruptions.
> 
> Thing is we cannot change kvm-xfstests to always use aio=native because
> it is not recommended for sparse images:
> https://access.redhat.com/articles/41313
> I will try to work something out so that kvm-xfstest will use aio=native
> when using the recommended (by not default) LV setup.
> 
> However, why would aio=threads cause log corruption?
> Does it indicate a bug in kvm-qemu or in dm-log-writes??
> 
> Did you try to use kvm-xfstests? its quite convenient to deploy in masses,
> so I think it would be ideal to integrate crash tests with.
> It also helps unifying the environment between us fs developers
> when a bug can not be reproduced on another system. see:
> https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-xfstests.md
> 
> Anyway, if you do end up using kvm-xfstests, you'l need this
> small patch to automatically define the log-writes device:
> 
> --- a/kvm-xfstests/test-appliance/files/root/runtests.sh
> +++ b/kvm-xfstests/test-appliance/files/root/runtests.sh
> @@ -269,9 +269,11 @@ do
>             if test "$SIZE" = "large" ; then
>                 export SCRATCH_DEV=$LG_SCR_DEV
>                 export SCRATCH_MNT=$LG_SCR_MNT
> +               export LOGWRITES_DEV=$SM_SCR_DEV
>             else
>                 export SCRATCH_DEV=$SM_SCR_DEV
>                 export SCRATCH_MNT=$SM_SCR_MNT
> +               export LOGWRITES_DEV=$LG_SCR_DEV
>             fi
>         fi
> 
> kvm-xfstests defined 2 sets of test/scratch a small and a large set
> and uses only one of those sets depending on command line,
> so I use the "other" scratch as the log writes device.
>

Cool I didn't know about kvm-xfstests, I'll give that a whirl.  The baby just
woke me up but when I get up for real I'll switch my config to use aio=threads
and see what happens, but I'm starting to suspect there's a bug in qemu.
Thanks,

Josef 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-08-31  3:38       ` Eryu Guan
  2017-08-31  4:29         ` Amir Goldstein
@ 2017-09-01  7:29         ` Amir Goldstein
  2017-09-01  7:45           ` Eryu Guan
  1 sibling, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-09-01  7:29 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Josef Bacik, Darrick J . Wong, Christoph Hellwig,
	fstests, linux-fsdevel, linux-xfs

On Thu, Aug 31, 2017 at 6:38 AM, Eryu Guan <eguan@redhat.com> wrote:
...
> The first 6 patches are all prepare work and seem fine, so I probably
> will push them out this week. But I may need more time to look into all
> these log-writes dm target and fsx changes.
>
> But seems that there're still problems not sorted out (e.g. this
> log-write bug), I'd prefer, when they get merged, removing the auto
> group for now until things settle down a bit.
>

I don't object to removing the auto group, but keep in mind that this test
is opt-in anyway, because it requires LOGWRITES_DEV to be defined
(well, it SHOULD require that, I actually forgot to check it...)
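
A minimal sketch of such a check, making no assumption about the helper
names used in the patch set: require_logwrites_dev is a hypothetical
name, and a real xfstests helper would presumably call _notrun instead of
returning an error.

```shell
# Hypothetical helper: succeeds only if LOGWRITES_DEV names a block
# device, otherwise prints the reason and returns non-zero.
require_logwrites_dev()
{
	if [ -z "$LOGWRITES_DEV" ]; then
		echo "this test requires LOGWRITES_DEV to be defined"
		return 1
	fi
	if [ ! -b "$LOGWRITES_DEV" ]; then
		echo "LOGWRITES_DEV=$LOGWRITES_DEV is not a block device"
		return 1
	fi
	return 0
}
```

With a check like this at the top of the test, a setup that does not
define LOGWRITES_DEV would skip the test rather than fail it.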

For now, it seems that the problem observed with kvm-xfstests
is specific to the kvm-qemu aio=threads config, so you shouldn't have any
problems trying out the test on a non-kvm setup.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-09-01  7:29         ` Amir Goldstein
@ 2017-09-01  7:45           ` Eryu Guan
  0 siblings, 0 replies; 48+ messages in thread
From: Eryu Guan @ 2017-09-01  7:45 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Josef Bacik, Darrick J . Wong, Christoph Hellwig,
	fstests, linux-fsdevel, linux-xfs

On Fri, Sep 01, 2017 at 10:29:38AM +0300, Amir Goldstein wrote:
> On Thu, Aug 31, 2017 at 6:38 AM, Eryu Guan <eguan@redhat.com> wrote:
> ...
> > The first 6 patches are all prepare work and seem fine, so I probably
> > will push them out this week. But I may need more time to look into all
> > these log-writes dm target and fsx changes.
> >
> > But seems that there're still problems not sorted out (e.g. this
> > log-write bug), I'd prefer, when they get merged, removing the auto
> > group for now until things settle down a bit.
> >
> 
> I don't object to removing the auto group, but keep in mind that this test
> is opt-in anyway, because it requires to define LOGWRITES_DEV

That's a good point.

> (well it SHOULD require, I actually forgot to check it...)
> 
> For now, it seems that the problem observed with kvm-xfstests
> is specific to kvm-qemu aio=threads config, so you shouldn't have any
> problems trying out the test on non kvm setup.

Thanks for the heads-up! I'll run this test and look into the code
closely and see what's the best option.

Thanks,
Eryu

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-09-01  6:52                   ` Amir Goldstein
  2017-09-01  7:03                     ` Josef Bacik
@ 2017-09-01 20:07                     ` Josef Bacik
  2017-09-03 13:39                       ` Amir Goldstein
  2017-09-04  6:42                     ` Dave Chinner
  2018-05-25  8:58                     ` Amir Goldstein
  3 siblings, 1 reply; 48+ messages in thread
From: Josef Bacik @ 2017-09-01 20:07 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Josef Bacik, fstests, Theodore Tso, Eryu Guan

On Fri, Sep 01, 2017 at 09:52:18AM +0300, Amir Goldstein wrote:
> [CC list, Ted]
> 
> On Thu, Aug 31, 2017 at 11:54 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> > On Thu, Aug 31, 2017 at 05:02:46PM +0300, Amir Goldstein wrote:
> >> On Thu, Aug 31, 2017 at 4:43 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> >> > On Thu, Aug 31, 2017 at 03:48:44PM +0300, Amir Goldstein wrote:
> >> >>
> >> >> Josef,
> >> >>
> >> >> I am at lost with these log corruptions.
> >> >> I see log entry bios submitted and log_end_io report success,
> >> >> but then in the log I see old data on disk where that entry should be.
> >> >> This happens quite randomly and I assume it also happens on
> >> >> logged data, because tests sometime fail on checksum on ext4.
> >> >>
> >> >> Mean while I added some more log entry sanity checks and debug
> >> >> prints to replay-log to debug the corruption:
> >> >> https://github.com/amir73il/xfstests/commit/bb946deb0dc285867be394613ddb19ce281392cc
> >> >>
> >> >> This only happens to me when running in kvm, so maybe something
> >> >> with the virtio devices is fishy.
> >> >>
> >> >> Anyway, I ran out of time to work on this for now, so if you have
> >> >> any ideas and/or time to test this issue, let me know.
> >> >>
> >> >
> ...
> >>
> >
> > Alright I tested it and it's working fine for me.  I'm creating three lv's and
> > then doing
> >
> > -drive file=/dev/mapper/whatever,format=raw,cache=none,if=virtio,aio=native
> >
> > And I get /dev/vd[bcd] which I use for my test/scratch/log dev and it works out
> > fine.  What is your -drive option line and I'll duplicate what you are doing.
> > Thanks,
> >
> 
> I am using Ted's kvm-xfstests, so this is the qemu command line:
> https://github.com/tytso/xfstests-bld/blob/master/kvm-xfstests/kvm-xfstests#L104
> 
> The only difference in -drive command is no aio=native.
> BINGO! when I add aio-native there are no more log corruptions :)
> Please try to use aio=threads to see if you also get log corruptions.
> 
> Thing is we cannot change kvm-xfstests to always use aio=native because
> it is not recommended for sparse images:
> https://access.redhat.com/articles/41313
> I will try to work something out so that kvm-xfstest will use aio=native
> when using the recommended (by not default) LV setup.
> 
> However, why would aio=threads cause log corruption?
> Does it indicate a bug in kvm-qemu or in dm-log-writes??

So I've been running this in a loop all day with aio=threads and it's not
blowing up.  This is my qemu version

QEMU emulator version 2.9.0(qemu-2.9.0-1.fb1)

Maybe it has to do with the version of qemu?  Thanks,

Josef

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-09-01 20:07                     ` Josef Bacik
@ 2017-09-03 13:39                       ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-09-03 13:39 UTC (permalink / raw)
  To: Josef Bacik; +Cc: fstests, Theodore Tso, Eryu Guan

On Fri, Sep 1, 2017 at 11:07 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> On Fri, Sep 01, 2017 at 09:52:18AM +0300, Amir Goldstein wrote:
>> [CC list, Ted]
>>
>> On Thu, Aug 31, 2017 at 11:54 PM, Josef Bacik <josef@toxicpanda.com> wrote:
>> > On Thu, Aug 31, 2017 at 05:02:46PM +0300, Amir Goldstein wrote:
>> >> On Thu, Aug 31, 2017 at 4:43 PM, Josef Bacik <josef@toxicpanda.com> wrote:
>> >> > On Thu, Aug 31, 2017 at 03:48:44PM +0300, Amir Goldstein wrote:
>> >> >>
>> >> >> Josef,
>> >> >>
>> >> >> I am at a loss with these log corruptions.
>> >> >> I see log entry bios submitted and log_end_io report success,
>> >> >> but then in the log I see old data on disk where that entry should be.
>> >> >> This happens quite randomly and I assume it also happens on
>> >> >> logged data, because tests sometimes fail on checksum on ext4.
>> >> >>
>> >> >> Meanwhile I added some more log entry sanity checks and debug
>> >> >> prints to replay-log to debug the corruption:
>> >> >> https://github.com/amir73il/xfstests/commit/bb946deb0dc285867be394613ddb19ce281392cc
>> >> >>
>> >> >> This only happens to me when running in kvm, so maybe something
>> >> >> with the virtio devices is fishy.
>> >> >>
>> >> >> Anyway, I ran out of time to work on this for now, so if you have
>> >> >> any ideas and/or time to test this issue, let me know.
>> >> >>
>> >> >
>> ...
>> >>
>> >
>> > Alright I tested it and it's working fine for me.  I'm creating three lv's and
>> > then doing
>> >
>> > -drive file=/dev/mapper/whatever,format=raw,cache=none,if=virtio,aio=native
>> >
>> > And I get /dev/vd[bcd] which I use for my test/scratch/log dev and it works out
>> > fine.  What is your -drive option line and I'll duplicate what you are doing.
>> > Thanks,
>> >
>>
>> I am using Ted's kvm-xfstests, so this is the qemu command line:
>> https://github.com/tytso/xfstests-bld/blob/master/kvm-xfstests/kvm-xfstests#L104
>>
>> The only difference in -drive command is no aio=native.
>> BINGO! When I add aio=native there are no more log corruptions :)
>> Please try to use aio=threads to see if you also get log corruptions.
>>
>> Thing is we cannot change kvm-xfstests to always use aio=native because
>> it is not recommended for sparse images:
>> https://access.redhat.com/articles/41313
>> I will try to work something out so that kvm-xfstests will use aio=native
>> when using the recommended (but not default) LV setup.
>>
>> However, why would aio=threads cause log corruption?
>> Does it indicate a bug in kvm-qemu or in dm-log-writes??
>
> So I've been running this in a loop all day with aio=threads and it's not
> blowing up.  This is my qemu version
>
> QEMU emulator version 2.9.0(qemu-2.9.0-1.fb1)
>
> Maybe it has to do with the version of qemu?  Thanks,
>

Maybe. I am running QEMU 2.5.0


* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-09-01  6:52                   ` Amir Goldstein
  2017-09-01  7:03                     ` Josef Bacik
  2017-09-01 20:07                     ` Josef Bacik
@ 2017-09-04  6:42                     ` Dave Chinner
  2017-09-04  6:49                       ` Amir Goldstein
  2018-05-25  8:58                     ` Amir Goldstein
  3 siblings, 1 reply; 48+ messages in thread
From: Dave Chinner @ 2017-09-04  6:42 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Josef Bacik, fstests, Theodore Tso, Eryu Guan

On Fri, Sep 01, 2017 at 09:52:18AM +0300, Amir Goldstein wrote:
> [CC list, Ted]
> 
> On Thu, Aug 31, 2017 at 11:54 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> > On Thu, Aug 31, 2017 at 05:02:46PM +0300, Amir Goldstein wrote:
> >> On Thu, Aug 31, 2017 at 4:43 PM, Josef Bacik <josef@toxicpanda.com> wrote:
> >> > On Thu, Aug 31, 2017 at 03:48:44PM +0300, Amir Goldstein wrote:
> >> >>
> >> >> Josef,
> >> >>
> >> >> I am at a loss with these log corruptions.
> >> >> I see log entry bios submitted and log_end_io report success,
> >> >> but then in the log I see old data on disk where that entry should be.
> >> >> This happens quite randomly and I assume it also happens on
> >> >> logged data, because tests sometimes fail on checksum on ext4.
> >> >>
> >> >> Meanwhile I added some more log entry sanity checks and debug
> >> >> prints to replay-log to debug the corruption:
> >> >> https://github.com/amir73il/xfstests/commit/bb946deb0dc285867be394613ddb19ce281392cc
> >> >>
> >> >> This only happens to me when running in kvm, so maybe something
> >> >> with the virtio devices is fishy.
> >> >>
> >> >> Anyway, I ran out of time to work on this for now, so if you have
> >> >> any ideas and/or time to test this issue, let me know.
> >> >>
> >> >
> ...
> >>
> >
> > Alright I tested it and it's working fine for me.  I'm creating three lv's and
> > then doing
> >
> > -drive file=/dev/mapper/whatever,format=raw,cache=none,if=virtio,aio=native
> >
> > And I get /dev/vd[bcd] which I use for my test/scratch/log dev and it works out
> > fine.  What is your -drive option line and I'll duplicate what you are doing.
> > Thanks,
> >
> 
> I am using Ted's kvm-xfstests, so this is the qemu command line:
> https://github.com/tytso/xfstests-bld/blob/master/kvm-xfstests/kvm-xfstests#L104
> 
> The only difference in -drive command is no aio=native.
> BINGO! When I add aio=native there are no more log corruptions :)
> Please try to use aio=threads to see if you also get log corruptions.
> 
> Thing is we cannot change kvm-xfstests to always use aio=native because
> it is not recommended for sparse images:
> https://access.redhat.com/articles/41313

Hmmmm. I think you're looking at an article that's at least 6 years
out of date. It was last updated at:

	Updated September 16 2012 at 2:04 AM

Looking at the bug it references there was a heap of problems in the
DIO code, the AIO code and the filesystem code that we fixed in
upstream kernels in late 2010/early 2011. e.g

http://oss.sgi.com/archives/xfs/2011-01/msg00156.html

Those took some time to get back into vendor kernels, but the
aio=native kvm problems described in that kbase article were fixed
in a RHEL 6.1 point release in May 2011.

IOWs, if qemu w/ aio=native doesn't work these days, the article
you've quoted is not the reason.


Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-09-04  6:42                     ` Dave Chinner
@ 2017-09-04  6:49                       ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-09-04  6:49 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Josef Bacik, fstests, Theodore Tso, Eryu Guan

On Mon, Sep 4, 2017 at 9:42 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Sep 01, 2017 at 09:52:18AM +0300, Amir Goldstein wrote:
>> [CC list, Ted]
>>
>> ...
>> >>
>> >
>> > Alright I tested it and it's working fine for me.  I'm creating three lv's and
>> > then doing
>> >
>> > -drive file=/dev/mapper/whatever,format=raw,cache=none,if=virtio,aio=native
>> >
>> > And I get /dev/vd[bcd] which I use for my test/scratch/log dev and it works out
>> > fine.  What is your -drive option line and I'll duplicate what you are doing.
>> > Thanks,
>> >
>>
>> I am using Ted's kvm-xfstests, so this is the qemu command line:
>> https://github.com/tytso/xfstests-bld/blob/master/kvm-xfstests/kvm-xfstests#L104
>>
>> The only difference in -drive command is no aio=native.
>> BINGO! When I add aio=native there are no more log corruptions :)
>> Please try to use aio=threads to see if you also get log corruptions.
>>
>> Thing is we cannot change kvm-xfstests to always use aio=native because
>> it is not recommended for sparse images:
>> https://access.redhat.com/articles/41313
>
> Hmmmm. I think you're looking at an article that's at least 6 years
> out of date. It was last updated at:
>
>         Updated September 16 2012 at 2:04 AM
>
> Looking at the bug it references there was a heap of problems in the
> DIO code, the AIO code and the filesystem code that we fixed in
> upstream kernels in late 2010/early 2011. e.g
>
> http://oss.sgi.com/archives/xfs/2011-01/msg00156.html
>
> Those took some time to get back into vendor kernels, but the
> aio=native kvm problems described in that kbase article were fixed
> in a RHEL 6.1 point release in May 2011.
>
> IOWs, if qemu w/ aio=native doesn't work these days, the article
> you've quoted is not the reason.
>
>

In that case, I'll post a patch to have kvm-xfstests use aio=native.
I suppose it is the proper way to run xfstests inside kvm anyway.

Thanks,
Amir.


* Re: [PATCH v2 07/14] fsx: add optional logid prefix to log messages
  2017-08-30 14:51 ` [PATCH v2 07/14] fsx: add optional logid prefix to log messages Amir Goldstein
@ 2017-09-05 10:46   ` Eryu Guan
  2017-09-05 11:24     ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Eryu Guan @ 2017-09-05 10:46 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Wed, Aug 30, 2017 at 05:51:39PM +0300, Amir Goldstein wrote:
> When writing the intermixed output of several fsx processes
> to a single log file, it is useful to prefix logs with a log id.
> Use fsx -j <logid> to define the log messages prefix.

Would it be better to allow any string as the prefix, rather than
limiting it to an id number?
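
To illustrate, here is a minimal sketch of a string prefix. It is not
taken from fsx.c: the names logid_prefix and format_log_line() and the
"<prefix>: " output format are all assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical: would be set from optarg for -j; NULL means no prefix. */
static const char *logid_prefix = NULL;

/*
 * Format one log line into out, prepending "<prefix>: " when a prefix
 * is set.  Returns whatever snprintf() returns.
 */
static int format_log_line(char *out, size_t outlen, const char *msg)
{
	if (logid_prefix)
		return snprintf(out, outlen, "%s: %s", logid_prefix, msg);
	return snprintf(out, outlen, "%s", msg);
}
```

With a string, callers could pass a test name or a pid as the prefix,
and a numeric id would still work unchanged.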

Thanks,
Eryu


* Re: [PATCH v2 09/14] fsx: add support for -g filldata
  2017-08-30 14:51 ` [PATCH v2 09/14] fsx: add support for -g filldata Amir Goldstein
@ 2017-09-05 10:50   ` Eryu Guan
  2017-09-05 11:29     ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Eryu Guan @ 2017-09-05 10:50 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Wed, Aug 30, 2017 at 05:51:41PM +0300, Amir Goldstein wrote:
> -g X: write character X instead of random generated data
> 
> This is useful to compare holes between good and bad buffer.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

This seems useful, but I don't see this option being used anywhere in
this patchset. Perhaps introduce it together with the test that uses it?

> ---
>  ltp/fsx.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/ltp/fsx.c b/ltp/fsx.c
> index dd6b637..a75bc55 100644
> --- a/ltp/fsx.c
> +++ b/ltp/fsx.c
> @@ -132,6 +132,7 @@ unsigned long	simulatedopcount = 0;	/* -b flag */
>  int	closeprob = 0;			/* -c flag */
>  int	debug = 0;			/* -d flag */
>  unsigned long	debugstart = 0;		/* -D flag */
> +char	filldata = 0;			/* -g flag */
>  int	logid = 0;			/* -j flag */
>  int	flush = 0;			/* -f flag */
>  int	do_fsync = 0;			/* -y flag */
> @@ -817,6 +818,8 @@ gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
>  		good_buf[offset] = testcalls % 256; 
>  		if (offset % 2)
>  			good_buf[offset] += original_buf[offset];
> +		if (filldata)
> +			good_buf[offset] = filldata;

If filldata is set, we're wasting cycles computing good_buf[offset] only
to overwrite it with filldata. Use an if-else instead?
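
Something along these lines, as a standalone sketch rather than a
replacement hunk: testcalls and filldata here are local stand-ins for
the fsx globals of the same name, and the loop shape is assumed from the
context lines of the diff.

```c
/* Stand-ins for the fsx globals, so the sketch compiles on its own. */
static unsigned long testcalls;
static char filldata;		/* -g flag; 0 means "not set" */

static void gendata(char *original_buf, char *good_buf,
		    unsigned offset, unsigned size)
{
	while (size--) {
		if (filldata) {
			/* Fixed fill byte requested: skip the generated data. */
			good_buf[offset] = filldata;
		} else {
			good_buf[offset] = testcalls % 256;
			if (offset % 2)
				good_buf[offset] += original_buf[offset];
		}
		offset++;
	}
}
```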

Thanks,
Eryu

>  		offset++;
>  	}
>  }
> @@ -1631,11 +1634,12 @@ void
>  usage(void)
>  {
>  	fprintf(stdout, "usage: %s",
> -		"fsx [-dnqxAFLOWZ] [-b opnum] [-c Prob] [-i logdev] [-j logid] [-l flen] [-m start:end] [-o oplen] [-p progressinterval] [-r readbdy] [-s style] [-t truncbdy] [-w writebdy] [-D startingop] [-N numops] [-P dirpath] [-S seed] fname\n\
> +		"fsx [-dnqxAFLOWZ] [-b opnum] [-c Prob] [-g filldata] [-i logdev] [-j logid] [-l flen] [-m start:end] [-o oplen] [-p progressinterval] [-r readbdy] [-s style] [-t truncbdy] [-w writebdy] [-D startingop] [-N numops] [-P dirpath] [-S seed] fname\n\
>  	-b opnum: beginning operation number (default 1)\n\
>  	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
>  	-d: debug output for all operations\n\
>  	-f flush and invalidate cache after I/O\n\
> +	-g X: write character X instead of random generated data\n\
>  	-i logdev: do integrity testing, logdev is the dm log writes device\n\
>  	-j logid: prefix logs with this id\n\
>  	-l flen: the upper bound on file size (default 262144)\n\
> @@ -1873,7 +1877,7 @@ main(int argc, char **argv)
>  	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
>  
>  	while ((ch = getopt_long(argc, argv,
> -				 "b:c:dfi:j:l:m:no:p:qr:s:t:w:xyAD:FKHzCILN:OP:RS:WZ",
> +				 "b:c:dfg:i:j:l:m:no:p:qr:s:t:w:xyAD:FKHzCILN:OP:RS:WZ",
>  				 longopts, NULL)) != EOF)
>  		switch (ch) {
>  		case 'b':
> @@ -1900,6 +1904,9 @@ main(int argc, char **argv)
>  		case 'f':
>  			flush = 1;
>  			break;
> +		case 'g':
> +			filldata = *optarg;
> +			break;
>  		case 'i':
>  			integrity = 1;
>  			logdev = strdup(optarg);
> -- 
> 2.7.4
> 


* Re: [PATCH v2 10/14] log-writes: add replay-log program to replay dm-log-writes target
  2017-08-30 14:51 ` [PATCH v2 10/14] log-writes: add replay-log program to replay dm-log-writes target Amir Goldstein
@ 2017-09-05 11:03   ` Eryu Guan
  2017-09-05 13:40     ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Eryu Guan @ 2017-09-05 11:03 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Wed, Aug 30, 2017 at 05:51:42PM +0300, Amir Goldstein wrote:
> Imported Josef Bacik's code from:
> https://github.com/josefbacik/log-writes.git
> 
> Specialized program for replaying a write log that was recorded by
> device mapper log-writes target.  The tools is used to perform
> crash consistency tests, allowing to run an arbitrary check tool
> (fsck) at specified checkpoints in the write log.
> 
> [Amir:]
> - Add project Makefile and SOURCE files
> - Document the replay-log auxiliary program
> 
> Cc: Josef Bacik <jbacik@fb.com>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  .gitignore                  |   1 +
>  doc/auxiliary-programs.txt  |   8 +
>  src/Makefile                |   2 +-
>  src/log-writes/Makefile     |  23 +++
>  src/log-writes/SOURCE       |   6 +
>  src/log-writes/log-writes.c | 379 ++++++++++++++++++++++++++++++++++++++++++++
>  src/log-writes/log-writes.h |  70 ++++++++
>  src/log-writes/replay-log.c | 348 ++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 836 insertions(+), 1 deletion(-)
>  create mode 100644 src/log-writes/Makefile
>  create mode 100644 src/log-writes/SOURCE
>  create mode 100644 src/log-writes/log-writes.c
>  create mode 100644 src/log-writes/log-writes.h
>  create mode 100644 src/log-writes/replay-log.c
> 
> diff --git a/.gitignore b/.gitignore
> index fcbc0cd..c26c92f 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -153,6 +153,7 @@
>  /src/t_mmap_stale_pmd
>  /src/t_mmap_cow_race
>  /src/t_mmap_fallocate
> +/src/log-writes/replay-log
>  
>  # dmapi/ binaries
>  /dmapi/src/common/cmd/read_invis
> diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
> index bcab453..de15832 100644
> --- a/doc/auxiliary-programs.txt
> +++ b/doc/auxiliary-programs.txt
> @@ -18,6 +18,7 @@ Contents:
>   - af_unix		-- Create an AF_UNIX socket
>   - dmerror		-- fault injection block device control
>   - fsync-err		-- tests fsync error reporting after failed writeback
> + - log-writes/replay-log -- Replay log from device mapper log-writes target
>   - open_by_handle	-- open_by_handle_at syscall exercise
>   - stat_test		-- statx syscall exercise
>   - t_dir_type		-- print directory entries and their file type
> @@ -46,6 +47,13 @@ fsync-err
>  	writeback and test that errors are reported during fsync and cleared
>  	afterward.
>  
> +log-writes/replay-log
> +
> +	Specialized program for replaying a write log that was recorded by
> +	device mapper log-writes target.  The tools is used to perform crash
> +	consistency tests, allowing to run an arbitrary check tool (fsck) at
> +	specified checkpoints in the write log.
> +
>  open_by_handle
>  
>  	The open_by_handle program exercises the open_by_handle_at() system
> diff --git a/src/Makefile b/src/Makefile
> index b8aff49..7d1306b 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -25,7 +25,7 @@ LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
>  	attr-list-by-handle-cursor-test listxattr dio-interleaved t_dir_type \
>  	dio-invalidate-cache stat_test t_encrypted_d_revalidate
>  
> -SUBDIRS =
> +SUBDIRS = log-writes
>  
>  LLDLIBS = $(LIBATTR) $(LIBHANDLE) $(LIBACL) -lpthread
>  
> diff --git a/src/log-writes/Makefile b/src/log-writes/Makefile
> new file mode 100644
> index 0000000..d114177
> --- /dev/null
> +++ b/src/log-writes/Makefile
> @@ -0,0 +1,23 @@
> +TOPDIR = ../..
> +include $(TOPDIR)/include/builddefs
> +
> +TARGETS = replay-log
> +
> +CFILES = replay-log.c log-writes.c
> +LDIRT = $(TARGETS)
> +
> +default: depend $(TARGETS)
> +
> +depend: .dep
> +
> +include $(BUILDRULES)
> +
> +$(TARGETS): $(CFILES)
> +	@echo "    [CC]    $@"
> +	$(Q)$(LTLINK) $(CFILES) -o $@ $(CFLAGS) $(LDFLAGS) $(LDLIBS)
> +
> +install:
> +	$(INSTALL) -m 755 -d $(PKG_LIB_DIR)/src/log-writes
> +	$(INSTALL) -m 755 $(TARGETS) $(PKG_LIB_DIR)/src/log-writes
> +
> +-include .dep
> diff --git a/src/log-writes/SOURCE b/src/log-writes/SOURCE
> new file mode 100644
> index 0000000..d6d143c
> --- /dev/null
> +++ b/src/log-writes/SOURCE
> @@ -0,0 +1,6 @@
> +From:
> +https://github.com/josefbacik/log-writes.git
> +
> +description	Helper code for dm-log-writes target
> +owner	Josef Bacik <jbacik@fb.com>
> +URL	https://github.com/josefbacik/log-writes.git
> diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c
> new file mode 100644
> index 0000000..fa4f3f3
> --- /dev/null
> +++ b/src/log-writes/log-writes.c
> @@ -0,0 +1,379 @@
> +#include <linux/fs.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <sys/ioctl.h>
> +#include <fcntl.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <errno.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include "log-writes.h"
> +
> +int log_writes_verbose = 0;
> +
> +/*
> + * @log: the log to free.
> + *
> + * This will close any open fd's the log has and free up its memory.
> + */
> +void log_free(struct log *log)
> +{
> +	if (log->replayfd >= 0)
> +		close(log->replayfd);
> +	if (log->logfd >= 0)
> +		close(log->logfd);
> +	free(log);
> +}
> +
> +static int discard_range(struct log *log, u64 start, u64 len)
> +{
> +	u64 range[2] = { start, len };
> +
> +	if (ioctl(log->replayfd, BLKDISCARD, &range) < 0) {
> +		if (log_writes_verbose)
> +			printf("replay device doesn't support discard, "
> +			       "switching to writing zeros\n");
> +		log->flags |= LOG_DISCARD_NOT_SUPP;
> +	}
> +	return 0;
> +}
> +
> +static int zero_range(struct log *log, u64 start, u64 len)
> +{
> +	u64 bufsize = len;
> +	ssize_t ret;
> +	char *buf = NULL;
> +
> +	if (log->max_zero_size < len) {
> +		if (log_writes_verbose)
> +			printf("discard len %llu larger than max %llu\n",
> +			       (unsigned long long)len,
> +			       (unsigned long long)log->max_zero_size);
> +		return 0;
> +	}
> +
> +	while (!buf) {
> +		buf = malloc(sizeof(char) * len);
                                            ^^^^ shouldn't this be bufsize?

> +		if (!buf)
> +			bufsize >>= 1;
> +		if (!bufsize) {
> +			fprintf(stderr, "Couldn't allocate zero buffer");
> +			return -1;
> +		}
> +	}
> +
> +	memset(buf, 0, bufsize);
> +	while (len) {
> +		ret = pwrite(log->replayfd, buf, bufsize, start);
> +		if (ret != bufsize) {
> +			fprintf(stderr, "Error zeroing file: %d\n", errno);
> +			free(buf);
> +			return -1;
> +		}
> +		len -= ret;
> +		start += ret;
> +	}
> +	free(buf);
> +	return 0;
> +}
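
For reference, a minimal sketch of how zero_range() above could
allocate with bufsize and also cope with a final chunk shorter than the
buffer. It is a standalone illustration, not a proposed hunk:
zero_range_fd() is a hypothetical name, and it takes a plain fd instead
of struct log.

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Write len zero bytes to fd starting at byte offset start.
 * Returns 0 on success, -1 on allocation or write failure.
 */
static int zero_range_fd(int fd, uint64_t start, uint64_t len)
{
	uint64_t bufsize = len;
	char *buf = NULL;

	if (!len)
		return 0;

	/* Retry with half the size until the allocation succeeds. */
	while (!(buf = calloc(1, bufsize))) {
		bufsize >>= 1;
		if (!bufsize)
			return -1;
	}

	while (len) {
		/* The last chunk may be shorter than the buffer. */
		size_t chunk = len < bufsize ? len : bufsize;
		ssize_t ret = pwrite(fd, buf, chunk, start);

		if (ret < 0 || (size_t)ret != chunk) {
			free(buf);
			return -1;
		}
		len -= chunk;
		start += chunk;
	}
	free(buf);
	return 0;
}
```

Clamping the write to the remaining length matters once bufsize has
been halved, because len is then not guaranteed to be a multiple of
bufsize.
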
> +
> +/*
> + * @log: the log we are replaying.
> + * @entry: the discard entry.
> + *
> + * Discard the given length.  If the device supports discard we will call that
> + * ioctl, otherwise we will write 0's to emulate discard.  If the discard size
> + * is larger than log->max_zero_size then we will simply skip the zero'ing if
> + * the drive doesn't support discard.
> + */
> +int log_discard(struct log *log, struct log_write_entry *entry)
> +{
> +	u64 start = le64_to_cpu(entry->sector) * log->sectorsize;
> +	u64 size = le64_to_cpu(entry->nr_sectors) * log->sectorsize;
> +	u64 max_chunk = 1 * 1024 * 1024 * 1024;
> +
> +	if (log->flags & LOG_IGNORE_DISCARD)
> +		return 0;
> +
> +	while (size) {
> +		u64 len = size > max_chunk ? max_chunk : size;
> +		int ret;
> +
> +		/*
> +		 * Do this check first in case it is our first discard, that way
> +		 * if we return EOPNOTSUPP we will fall back to the 0 method
> +		 * automatically.
> +		 */
> +		if (!(log->flags & LOG_DISCARD_NOT_SUPP))
> +			ret = discard_range(log, start, len);
> +		if (log->flags & LOG_DISCARD_NOT_SUPP)
> +			ret = zero_range(log, start, len);
> +		if (ret)
> +			return -1;
> +		size -= len;
> +		start += len;
> +	}
> +	return 0;
> +}
> +
> +/*
> + * @log: the log we are replaying.
> + * @entry: where we put the entry.
> + * @read_data: read the entry data as well, entry must be log->sectorsize sized
> + * if this is set.
> + *
> + * @return: 0 if we replayed, 1 if we are at the end, -1 if there was an error.
> + *
> + * Replay the next entry in our log onto the replay device.
> + */
> +int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
> +			  int read_data)
> +{
> +	u64 size;
> +	u64 flags;
> +	size_t read_size = read_data ? log->sectorsize :
> +		sizeof(struct log_write_entry);
> +	char *buf;
> +	ssize_t ret;
> +	off_t offset;
> +
> +	if (log->cur_entry >= log->nr_entries)
> +		return 1;
> +
> +	ret = read(log->logfd, entry, read_size);
> +	if (ret != read_size) {
> +		fprintf(stderr, "Error reading entry: %d\n", errno);
> +		return -1;
> +	}
> +	log->cur_entry++;
> +
> +	size = le64_to_cpu(entry->nr_sectors) * log->sectorsize;
> +	if (read_size < log->sectorsize) {
> +		if (lseek(log->logfd,
> +			  log->sectorsize - sizeof(struct log_write_entry),
> +			  SEEK_CUR) == (off_t)-1) {
> +			fprintf(stderr, "Error seeking in log: %d\n", errno);
> +			return -1;
> +		}
> +	}
> +
> +	if (log_writes_verbose)
> +		printf("replaying %d: sector %llu, size %llu, flags %llu\n",
> +		       (int)log->cur_entry - 1,
> +		       (unsigned long long)le64_to_cpu(entry->sector),
> +		       (unsigned long long)size,
> +		       (unsigned long long)le64_to_cpu(entry->flags));
> +	if (!size)
> +		return 0;
> +
> +	flags = le64_to_cpu(entry->flags);
> +	if (flags & LOG_DISCARD_FLAG)
> +		return log_discard(log, entry);
> +
> +	buf = malloc(size);
> +	if (!buf) {
> +		fprintf(stderr, "Error allocating buffer %llu entry %llu\n", (unsigned long long)size, (unsigned long long)log->cur_entry - 1);
> +		return -1;
> +	}
> +
> +	ret = read(log->logfd, buf, size);
> +	if (ret != size) {
> +		fprintf(stderr, "Erro reading data: %d\n", errno);
                                 ^^^^ Typo here :)

> +		free(buf);
> +		return -1;
> +	}
> +
> +	offset = le64_to_cpu(entry->sector) * log->sectorsize;
> +	ret = pwrite(log->replayfd, buf, size, offset);
> +	free(buf);
> +	if (ret != size) {
> +		fprintf(stderr, "Error writing data: %d\n", errno);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * @log: the log we are manipulating.
> + * @entry_num: the entry we want.
> + *
> + * Seek to the given entry in the log, starting at 0 and ending at
> + * log->nr_entries - 1.
> + */
> +int log_seek_entry(struct log *log, u64 entry_num)
> +{
> +	u64 i = 0;
> +
> +	if (entry_num >= log->nr_entries) {
> +		fprintf(stderr, "Invalid entry number\n");
> +		return -1;
> +	}
> +
> +	if (lseek(log->logfd, log->sectorsize, SEEK_SET) == (off_t)-1) {
> +		fprintf(stderr, "Error seeking in file: %d\n", errno);
> +		return -1;
> +	}

Hmm, we reset the log position to the first log entry by seeking to
log->sectorsize, so shouldn't log->cur_entry be reset to 0 too? It
doesn't make any difference for now, because log_seek_entry() is only
called at init time, when log->cur_entry is 0 anyway. But I still think
it should be fixed.

BTW, it would be good to add a comment about the seek; it's not obvious
that it skips over the log super block before the first read :)
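
A minimal sketch of what that could look like, assuming a trimmed-down
stand-in for struct log. Both struct log_pos and log_rewind() are
hypothetical names for this example.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in holding only the fields the rewind needs. */
struct log_pos {
	int logfd;
	uint64_t sectorsize;
	uint64_t cur_entry;
};

/*
 * The log super block occupies the first sector, so the first entry
 * starts at byte offset sectorsize.  Reset cur_entry together with the
 * file position so the two cannot drift apart.
 */
static int log_rewind(struct log_pos *log)
{
	if (lseek(log->logfd, (off_t)log->sectorsize, SEEK_SET) == (off_t)-1)
		return -1;
	log->cur_entry = 0;
	return 0;
}
```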

> +
> +	for (i = log->cur_entry; i < entry_num; i++) {
> +		struct log_write_entry entry;
> +		ssize_t ret;
> +		off_t seek_size;
> +		u64 flags;
> +
> +		ret = read(log->logfd, &entry, sizeof(entry));
> +		if (ret != sizeof(entry)) {
> +			fprintf(stderr, "Error reading entry: %d\n", errno);
> +			return -1;
> +		}
> +		if (log_writes_verbose > 1)
> +			printf("seek entry %d: %llu, size %llu, flags %llu\n",
> +			       (int)i,
> +			       (unsigned long long)le64_to_cpu(entry.sector),
> +			       (unsigned long long)le64_to_cpu(entry.nr_sectors),
> +			       (unsigned long long)le64_to_cpu(entry.flags));
> +		flags = le64_to_cpu(entry.flags);
> +		seek_size = log->sectorsize - sizeof(entry);
> +		if (!(flags & LOG_DISCARD_FLAG))
> +			seek_size += le64_to_cpu(entry.nr_sectors) *
> +				log->sectorsize;
> +		if (lseek(log->logfd, seek_size, SEEK_CUR) == (off_t)-1) {
> +			fprintf(stderr, "Error seeking in file: %d\n", errno);
> +			return -1;
> +		}
> +		log->cur_entry++;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * @log: the log we are manipulating.
> + * @entry: the entry we read.
> + * @read_data: read the extra data for the entry, your entry must be
> + * log->sectorsize large.
> + *
> + * @return: 1 if we hit the end of the log, 0 we got the next entry, < 0 if
> + * there was an error.
> + *
> + * Seek to the next entry in the log.
> + */
> +int log_seek_next_entry(struct log *log, struct log_write_entry *entry,
> +			int read_data)
> +{
> +	size_t read_size = read_data ? log->sectorsize :
> +		sizeof(struct log_write_entry);
> +	u64 flags;
> +	ssize_t ret;
> +
> +	if (log->cur_entry >= log->nr_entries)
> +		return 1;
> +
> +	ret = read(log->logfd, entry, read_size);
> +	if (ret != read_size) {
> +		fprintf(stderr, "Error reading entry: %d\n", errno);
> +		return -1;
> +	}
> +	log->cur_entry++;
> +
> +	if (read_size < log->sectorsize) {
> +		if (lseek(log->logfd,
> +			  log->sectorsize - sizeof(struct log_write_entry),
> +			  SEEK_CUR) == (off_t)-1) {
> +			fprintf(stderr, "Error seeking in log: %d\n", errno);
> +			return -1;
> +		}
> +	}
> +	if (log_writes_verbose > 1)
> +		printf("seek entry %d: %llu, size %llu, flags %llu\n",
> +		       (int)log->cur_entry - 1,
> +		       (unsigned long long)le64_to_cpu(entry->sector),
> +		       (unsigned long long)le64_to_cpu(entry->nr_sectors),
> +		       (unsigned long long)le64_to_cpu(entry->flags));
> +
> +	flags = le32_to_cpu(entry->flags);
> +	read_size = le32_to_cpu(entry->nr_sectors) * log->sectorsize;
> +	if (!read_size || (flags & LOG_DISCARD_FLAG))
> +		return 0;
> +
> +	if (lseek(log->logfd, read_size, SEEK_CUR) == (off_t)-1) {
> +		fprintf(stderr, "Error seeking in log: %d\n", errno);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * @logfile: the file that contains the write log.
> + * @replayfile: the file/device to replay onto, can be NULL.
> + *
> + * Opens a logfile and makes sure it is valid and returns a struct log.
> + */
> +struct log *log_open(char *logfile, char *replayfile)
> +{
> +	struct log *log;
> +	struct log_write_super super;
> +	ssize_t ret;
> +
> +	log = malloc(sizeof(struct log));
> +	if (!log) {
> +		fprintf(stderr, "Couldn't alloc log\n");
> +		return NULL;
> +	}
> +
> +	log->replayfd = -1;
> +
> +	log->logfd = open(logfile, O_RDONLY);
> +	if (log->logfd < 0) {
> +		fprintf(stderr, "Couldn't open log %s: %d\n", logfile,
> +			errno);
> +		log_free(log);
> +		return NULL;
> +	}
> +
> +	if (replayfile) {
> +		log->replayfd = open(replayfile, O_WRONLY);
> +		if (log->replayfd < 0) {
> +			fprintf(stderr, "Couldn't open replay file %s: %d\n",
> +				replayfile, errno);
> +			log_free(log);
> +			return NULL;
> +		}
> +	}
> +
> +	ret = read(log->logfd, &super, sizeof(struct log_write_super));
> +	if (ret < sizeof(struct log_write_super)) {
> +		fprintf(stderr, "Error reading super: %d\n", errno);
> +		log_free(log);
> +		return NULL;
> +	}
> +
> +	if (le64_to_cpu(super.magic) != WRITE_LOG_MAGIC) {
> +		fprintf(stderr, "Magic doesn't match\n");
> +		log_free(log);
> +		return NULL;
> +	}
> +
> +	if (le64_to_cpu(super.version) != WRITE_LOG_VERSION) {
> +		fprintf(stderr, "Version mismatch, wanted %d, have %d\n",
> +			WRITE_LOG_VERSION, (int)le64_to_cpu(super.version));
> +		log_free(log);
> +		return NULL;
> +	}
> +
> +	log->sectorsize = le32_to_cpu(super.sectorsize);
> +	log->nr_entries = le64_to_cpu(super.nr_entries);
> +	log->max_zero_size = 128 * 1024 * 1024;
> +
> +	if (lseek(log->logfd, log->sectorsize - sizeof(super), SEEK_CUR) ==
> +	    (off_t) -1) {
> +		fprintf(stderr, "Error seeking to first entry: %d\n", errno);
> +		log_free(log);
> +		return NULL;
> +	}
> +	log->cur_entry = 0;
> +
> +	return log;
> +}
> diff --git a/src/log-writes/log-writes.h b/src/log-writes/log-writes.h
> new file mode 100644
> index 0000000..13f98ff
> --- /dev/null
> +++ b/src/log-writes/log-writes.h
> @@ -0,0 +1,70 @@
> +#ifndef _LOG_WRITES_H_
> +#define _LOG_WRITES_H_
> +
> +#include <linux/types.h>
> +#include <linux/byteorder/little_endian.h>
> +
> +extern int log_writes_verbose;
> +
> +#define le64_to_cpu __le64_to_cpu
> +#define le32_to_cpu __le32_to_cpu
> +
> +typedef __u64 u64;
> +typedef __u32 u32;
> +
> +#define LOG_FLUSH_FLAG (1 << 0)
> +#define LOG_FUA_FLAG (1 << 1)
> +#define LOG_DISCARD_FLAG (1 << 2)
> +#define LOG_MARK_FLAG (1 << 3)
> +
> +#define WRITE_LOG_VERSION 1
> +#define WRITE_LOG_MAGIC 0x6a736677736872
> +
> +
> +/*
> + * Basic info about the log for userspace.
> + */
> +struct log_write_super {
> +	__le64 magic;
> +	__le64 version;
> +	__le64 nr_entries;
> +	__le32 sectorsize;
> +};
> +
> +/*
> + * sector - the sector we wrote.
> + * nr_sectors - the number of sectors we wrote.
> + * flags - flags for this log entry.
> + * data_len - the size of the data in this log entry, this is for private log
> + * entry stuff, the MARK data provided by userspace for example.
> + */
> +struct log_write_entry {
> +	__le64 sector;
> +	__le64 nr_sectors;
> +	__le64 flags;
> +	__le64 data_len;

This has to match the in-kernel log_write_entry structure, but the
data_len field is not used in this userspace program, better to add
comments to explain that.
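
For instance, a sketch of the kind of comment meant here. It uses
stdint types and a hypothetical struct name so that it compiles on its
own; the real header would keep the __le64 fields.

```c
#include <stdint.h>

/*
 * On-disk log entry header.  The layout has to match the in-kernel
 * log_write_entry structure, so no field may be dropped or reordered
 * even when this program has no use for it.  Fields are little-endian
 * on disk.
 */
struct log_write_entry_sketch {
	uint64_t sector;	/* sector that was written */
	uint64_t nr_sectors;	/* number of sectors written */
	uint64_t flags;		/* flags for this log entry */
	uint64_t data_len;	/* kept for layout only, unused in userspace */
};

/* Four 64-bit fields, so the header is 32 bytes. */
_Static_assert(sizeof(struct log_write_entry_sketch) == 32,
	       "log entry header must stay 32 bytes");
```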

> +};
> +
> +#define LOG_IGNORE_DISCARD (1 << 0)
> +#define LOG_DISCARD_NOT_SUPP (1 << 1)
> +
> +struct log {
> +	int logfd;
> +	int replayfd;
> +	unsigned long flags;
> +	u64 sectorsize;
> +	u64 nr_entries;
> +	u64 cur_entry;
> +	u64 max_zero_size;
> +	off_t cur_pos;

cur_pos is not used; can it be removed?

> +};
> +
> +struct log *log_open(char *logfile, char *replayfile);
> +int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
> +			  int read_data);
> +int log_seek_entry(struct log *log, u64 entry_num);
> +int log_seek_next_entry(struct log *log, struct log_write_entry *entry,
> +			int read_data);
> +void log_free(struct log *log);
> +
> +#endif
> diff --git a/src/log-writes/replay-log.c b/src/log-writes/replay-log.c
> new file mode 100644
> index 0000000..759c3c7
> --- /dev/null
> +++ b/src/log-writes/replay-log.c
> @@ -0,0 +1,348 @@
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <getopt.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include "log-writes.h"
> +
> +enum option_indexes {
> +	NEXT_FLUSH,
> +	NEXT_FUA,
> +	START_ENTRY,
> +	END_MARK,
> +	LOG,
> +	REPLAY,
> +	LIMIT,
> +	VERBOSE,
> +	FIND,
> +	NUM_ENTRIES,
> +	NO_DISCARD,
> +	FSCK,
> +	CHECK,
> +	START_MARK,
> +};
> +
> +static struct option long_options[] = {
> +	{"next-flush", no_argument, NULL, 0},
> +	{"next-fua", no_argument, NULL, 0},
> +	{"start-entry", required_argument, NULL, 0},
> +	{"end-mark", required_argument, NULL, 0},
> +	{"log", required_argument, NULL, 0},
> +	{"replay", required_argument, NULL, 0},
> +	{"limit", required_argument, NULL, 0},
> +	{"verbose", no_argument, NULL, 'v'},
> +	{"find", no_argument, NULL, 0},
> +	{"num-entries", no_argument, NULL, 0},
> +	{"no-discard", no_argument, NULL, 0},
> +	{"fsck", required_argument, NULL, 0},
> +	{"check", required_argument, NULL, 0},
> +	{"start-mark", required_argument, NULL, 0},
> +	{ NULL, 0, NULL, 0 },
> +};
> +
> +static void usage(void)
> +{
> +	fprintf(stderr, "Usage: replay-log --log <logfile> [options]\n");
> +	fprintf(stderr, "\t--replay <device> - replay onto a specific "
> +		"device\n");
> +	fprintf(stderr, "\t--limit <number> - number of entries to replay\n");
> +	fprintf(stderr, "\t--next-flush - replay to/find the next flush\n");
> +	fprintf(stderr, "\t--next-fua - replay to/find the next fua\n");
> +	fprintf(stderr, "\t--start-entry <entry> - start at the given "
> +		"entry #\n");
> +	fprintf(stderr, "\t--start-mark <mark> - mark to start from\n");
> +	fprintf(stderr, "\t--end-mark <mark> - replay to/find the given mark\n");
> +	fprintf(stderr, "\t--find - put replay-log in find mode, will search "
> +		"based on the other options\n");
> +	fprintf(stderr, "\t--number-entries - print the number of entries in "
> +		"the log\n");
> +	fprintf(stderr, "\t--no-discard - don't process discard entries\n");
> +	fprintf(stderr, "\t--fsck - the fsck command to run, must specify "
> +		"--check\n");
> +	fprintf(stderr, "\t--check [<number>|flush|fua] when to check the "
> +		"file system, mush specify --fsck\n");
> +	exit(1);
> +}
> +
> +static int should_stop(struct log_write_entry *entry, u64 stop_flags,
> +		       char *mark)

I found the semantics of this function hard to follow; some
comments would help.
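
For illustration only, here is a commented sketch of the stop logic as I
read it from the patch. The name should_stop_sketch, the flattened
argument list (flags and mark text passed directly instead of via the
entry buffer) and the flag values are assumptions made so the snippet
stands alone; the real definitions live in log-writes.h.

```c
#include <stdint.h>
#include <string.h>

/* Assumed flag values; the authoritative ones are in log-writes.h. */
#define LOG_FLUSH_FLAG	(1ULL << 0)
#define LOG_FUA_FLAG	(1ULL << 1)
#define LOG_MARK_FLAG	(1ULL << 3)

/*
 * @entry_flags: flags of the current entry, already in CPU byte order.
 * @entry_mark:  mark text stored right after the entry header.
 * @stop_flags:  OR of the LOG_*_FLAG conditions the caller asked for.
 * @mark:        mark name to match when LOG_MARK_FLAG is in @stop_flags.
 *
 * Returns 1 if replay/search should stop at this entry, 0 otherwise.
 */
static int should_stop_sketch(uint64_t entry_flags, const char *entry_mark,
			      uint64_t stop_flags, const char *mark)
{
	/* The entry carries none of the requested flags: keep going. */
	if (!(entry_flags & stop_flags))
		return 0;

	/* No mark was requested, so any matching flush/FUA flag stops us. */
	if (!(stop_flags & LOG_MARK_FLAG))
		return 1;

	/*
	 * A mark was requested: only a mark entry with the same name stops
	 * us, even if a flush/FUA flag matched as well.
	 */
	if ((entry_flags & LOG_MARK_FLAG) && !strcmp(mark, entry_mark))
		return 1;

	return 0;
}
```

The part that seems easy to miss is the last case: once an end mark is
requested, a matching flush or FUA flag alone no longer stops the replay.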

Thanks,
Eryu

> +{
> +	u64 flags = le64_to_cpu(entry->flags);
> +	int check_mark = (stop_flags & LOG_MARK_FLAG);
> +	char *buf = (char *)(entry + 1);
> +
> +	if (flags & stop_flags) {
> +		if (!check_mark)
> +			return 1;
> +		if ((flags & LOG_MARK_FLAG) && !strcmp(mark, buf))
> +			return 1;
> +	}
> +	return 0;
> +}
> +
> +static int run_fsck(struct log *log, char *fsck_command)
> +{
> +	int ret = fsync(log->replayfd);
> +	if (ret)
> +		return ret;
> +	ret = system(fsck_command);
> +	if (ret >= 0)
> +		ret = WEXITSTATUS(ret);
> +	return ret ? -1 : 0;
> +}
> +
> +enum log_replay_check_mode {
> +	CHECK_NUMBER = 1,
> +	CHECK_FUA = 2,
> +	CHECK_FLUSH = 3,
> +};
> +
> +static int seek_to_mark(struct log *log, struct log_write_entry *entry,
> +			char *mark)
> +{
> +	int ret;
> +
> +	while ((ret = log_seek_next_entry(log, entry, 1)) == 0) {
> +		if (should_stop(entry, LOG_MARK_FLAG, mark))
> +			break;
> +	}
> +	if (ret == 1) {
> +		fprintf(stderr, "Couldn't find starting mark\n");
> +		ret = -1;
> +	}
> +
> +	return ret;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	char *logfile = NULL, *replayfile = NULL, *fsck_command = NULL;
> +	struct log_write_entry *entry;
> +	u64 stop_flags = 0;
> +	u64 start_entry = 0;
> +	u64 run_limit = 0;
> +	u64 num_entries = 0;
> +	u64 check_number = 0;
> +	char *end_mark = NULL, *start_mark = NULL;
> +	char *tmp = NULL;
> +	struct log *log;
> +	int find_mode = 0;
> +	int c;
> +	int opt_index;
> +	int ret;
> +	int print_num_entries = 0;
> +	int discard = 1;
> +	enum log_replay_check_mode check_mode = 0;
> +
> +	while ((c = getopt_long(argc, argv, "v", long_options,
> +				&opt_index)) >= 0) {
> +		switch(c) {
> +		case 'v':
> +			log_writes_verbose++;
> +			continue;
> +		default:
> +			break;
> +		}
> +
> +		switch(opt_index) {
> +		case NEXT_FLUSH:
> +			stop_flags |= LOG_FLUSH_FLAG;
> +			break;
> +		case NEXT_FUA:
> +			stop_flags |= LOG_FUA_FLAG;
> +			break;
> +		case START_ENTRY:
> +			start_entry = strtoull(optarg, &tmp, 0);
> +			if (tmp && *tmp != '\0') {
> +				fprintf(stderr, "Invalid entry number\n");
> +				exit(1);
> +			}
> +			tmp = NULL;
> +			break;
> +		case START_MARK:
> +			/*
> +			 * Biggest sectorsize is 4k atm, so limit the mark to 4k
> +			 * minus the size of the entry.  Say 4097 since we want
> +			 * an extra slot for \0.
> +			 */
> +			start_mark = strndup(optarg, 4097 -
> +					     sizeof(struct log_write_entry));
> +			if (!start_mark) {
> +				fprintf(stderr, "Couldn't allocate memory\n");
> +				exit(1);
> +			}
> +			break;
> +		case END_MARK:
> +			/*
> +			 * Biggest sectorsize is 4k atm, so limit the mark to 4k
> +			 * minus the size of the entry.  Say 4097 since we want
> +			 * an extra slot for \0.
> +			 */
> +			end_mark = strndup(optarg, 4097 -
> +					   sizeof(struct log_write_entry));
> +			if (!end_mark) {
> +				fprintf(stderr, "Couldn't allocate memory\n");
> +				exit(1);
> +			}
> +			stop_flags |= LOG_MARK_FLAG;
> +			break;
> +		case LOG:
> +			logfile = strdup(optarg);
> +			if (!logfile) {
> +				fprintf(stderr, "Couldn't allocate memory\n");
> +				exit(1);
> +			}
> +			break;
> +		case REPLAY:
> +			replayfile = strdup(optarg);
> +			if (!replayfile) {
> +				fprintf(stderr, "Couldn't allocate memory\n");
> +				exit(1);
> +			}
> +			break;
> +		case LIMIT:
> +			run_limit = strtoull(optarg, &tmp, 0);
> +			if (tmp && *tmp != '\0') {
> +				fprintf(stderr, "Invalid entry number\n");
> +				exit(1);
> +			}
> +			tmp = NULL;
> +			break;
> +		case FIND:
> +			find_mode = 1;
> +			break;
> +		case NUM_ENTRIES:
> +			print_num_entries = 1;
> +			break;
> +		case NO_DISCARD:
> +			discard = 0;
> +			break;
> +		case FSCK:
> +			fsck_command = strdup(optarg);
> +			if (!fsck_command) {
> +				fprintf(stderr, "Couldn't allocate memory\n");
> +				exit(1);
> +			}
> +			break;
> +		case CHECK:
> +			if (!strcmp(optarg, "flush")) {
> +				check_mode = CHECK_FLUSH;
> +			} else if (!strcmp(optarg, "fua")) {
> +				check_mode = CHECK_FUA;
> +			} else {
> +				check_mode = CHECK_NUMBER;
> +				check_number = strtoull(optarg, &tmp, 0);
> +				if (!check_number || (tmp && *tmp != '\0')) {
> +					fprintf(stderr,
> +						"Invalid entry number\n");
> +					exit(1);
> +				}
> +				tmp = NULL;
> +			}
> +			break;
> +		default:
> +			usage();
> +		}
> +	}
> +
> +	if (!logfile)
> +		usage();
> +
> +	log = log_open(logfile, replayfile);
> +	if (!log)
> +		exit(1);
> +	free(logfile);
> +	free(replayfile);
> +
> +	if (!discard)
> +		log->flags |= LOG_IGNORE_DISCARD;
> +
> +	entry = malloc(log->sectorsize);
> +	if (!entry) {
> +		fprintf(stderr, "Couldn't allocate buffer\n");
> +		log_free(log);
> +		exit(1);
> +	}
> +
> +	if (start_mark) {
> +		ret = seek_to_mark(log, entry, start_mark);
> +		if (ret)
> +			exit(1);
> +		free(start_mark);
> +	} else {
> +		ret = log_seek_entry(log, start_entry);
> +		if (ret)
> +			exit(1);
> +	}
> +
> +	if ((fsck_command && !check_mode) || (!fsck_command && check_mode))
> +		usage();
> +
> +	/* We just want to find a given entry */
> +	if (find_mode) {
> +		while ((ret = log_seek_next_entry(log, entry, 1)) == 0) {
> +			num_entries++;
> +			if ((run_limit && num_entries == run_limit) ||
> +			    should_stop(entry, stop_flags, end_mark)) {
> +				printf("%llu\n",
> +				       (unsigned long long)log->cur_entry - 1);
> +				log_free(log);
> +				return 0;
> +			}
> +		}
> +		log_free(log);
> +		if (ret < 0)
> +			return ret;
> +		fprintf(stderr, "Couldn't find entry\n");
> +		return 1;
> +	}
> +
> +	/* Used for scripts, just print the number of entries in the log */
> +	if (print_num_entries) {
> +		printf("%llu\n", (unsigned long long)log->nr_entries);
> +		log_free(log);
> +		return 0;
> +	}
> +
> +	/* No replay, just spit out the log info. */
> +	if (!replayfile) {
> +		printf("Log version=%d, sectorsize=%lu, entries=%llu\n",
> +		       WRITE_LOG_VERSION, (unsigned long)log->sectorsize,
> +		       (unsigned long long)log->nr_entries);
> +		log_free(log);
> +		return 0;
> +	}
> +
> +	while ((ret = log_replay_next_entry(log, entry, 1)) == 0) {
> +		num_entries++;
> +		if (fsck_command) {
> +			if ((check_mode == CHECK_NUMBER) &&
> +			    !(num_entries % check_number))
> +				ret = run_fsck(log, fsck_command);
> +			else if ((check_mode == CHECK_FUA) &&
> +				 should_stop(entry, LOG_FUA_FLAG, NULL))
> +				ret = run_fsck(log, fsck_command);
> +			else if ((check_mode == CHECK_FLUSH) &&
> +				 should_stop(entry, LOG_FLUSH_FLAG, NULL))
> +				ret = run_fsck(log, fsck_command);
> +			else
> +				ret = 0;
> +			if (ret) {
> +				fprintf(stderr, "Fsck errored out on entry "
> +					"%llu\n",
> +					(unsigned long long)log->cur_entry - 1);
> +				break;
> +			}
> +		}
> +
> +		if ((run_limit && num_entries == run_limit) ||
> +		    should_stop(entry, stop_flags, end_mark))
> +			break;
> +	}
> +	fsync(log->replayfd);
> +	log_free(log);
> +	free(end_mark);
> +	if (ret < 0)
> +		exit(1);
> +	return 0;
> +}
> -- 
> 2.7.4
> 


* Re: [PATCH v2 12/14] replay-log: add support for replaying ops in target device sector range
  2017-08-30 14:51 ` [PATCH v2 12/14] replay-log: add support for replaying ops in target device sector range Amir Goldstein
@ 2017-09-05 11:07   ` Eryu Guan
  2017-09-05 11:41     ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Eryu Guan @ 2017-09-05 11:07 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Wed, Aug 30, 2017 at 05:51:44PM +0300, Amir Goldstein wrote:
> Using command line options --start-sector and --end-sector, only
> operations acting on the specified target device range will be
> replayed.
> 
> Single verbose mode (-v) prints out only replayed operations.
> Double verbose mode (-vv) prints out also skipped operations.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  src/log-writes/log-writes.c | 33 +++++++++++++++++++++++++++++++--
>  src/log-writes/log-writes.h |  2 ++
>  src/log-writes/replay-log.c | 31 +++++++++++++++++++++++++++++++
>  3 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c
> index ba66a5c..d832c2a 100644
> --- a/src/log-writes/log-writes.c
> +++ b/src/log-writes/log-writes.c
> @@ -119,6 +119,24 @@ int log_discard(struct log *log, struct log_write_entry *entry)
>  
>  /*
>   * @log: the log we are replaying.
> + * @entry: entry to be replayed.
> + *
> + * @return: 0 if we should replay the entry, > 0 if we should skip it.
> + *
> + * Should we skip the entry in our log or replay onto the replay device.
> + */
> +int log_should_skip(struct log *log, struct log_write_entry *entry)
> +{
> +	if (!entry->nr_sectors)
> +		return 0;
> +	if (entry->sector + entry->nr_sectors < log->start_sector ||
> +	    entry->sector > log->end_sector)

It seems the values from entry can't be used directly; they need
le64_to_cpu first, I think.

> +		return 1;
> +	return 0;
> +}
> +
> +/*
> + * @log: the log we are replaying.
>   * @entry: where we put the entry.
>   * @read_data: read the entry data as well, entry must be log->sectorsize sized
>   * if this is set.
> @@ -137,6 +155,7 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
>  	char *buf;
>  	ssize_t ret;
>  	off_t offset;
> +	u64 skip = 0;

Should this be int skip? log_should_skip returns int too.

Thanks,
Eryu

>  
>  	if (log->cur_entry >= log->nr_entries)
>  		return 1;
> @@ -158,9 +177,11 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
>  		}
>  	}
>  
> -	if (log_writes_verbose) {
> +	skip = log_should_skip(log, entry);
> +	if (log_writes_verbose > 1 || (log_writes_verbose && !skip)) {
>  		offset = lseek(log->logfd, 0, SEEK_CUR);
> -		printf("replaying %d@%llu: sector %llu, size %llu, flags %llu\n",
> +		printf("%s %d@%llu: sector %llu, size %llu, flags %llu\n",
> +		       skip ? "skipping" : "replaying",
>  		       (int)log->cur_entry - 1, offset / log->sectorsize,
>  		       (unsigned long long)le64_to_cpu(entry->sector),
>  		       (unsigned long long)size,
> @@ -173,6 +194,14 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
>  	if (flags & LOG_DISCARD_FLAG)
>  		return log_discard(log, entry);
>  
> +	if (skip) {
> +		if (lseek(log->logfd, size, SEEK_CUR) == (off_t)-1) {
> +			fprintf(stderr, "Error seeking in log: %d\n", errno);
> +			return -1;
> +		}
> +		return 0;
> +	}
> +
>  	buf = malloc(size);
>  	if (!buf) {
>  		fprintf(stderr, "Error allocating buffer %llu entry %llu\n", (unsigned long long)size, (unsigned long long)log->cur_entry - 1);
> diff --git a/src/log-writes/log-writes.h b/src/log-writes/log-writes.h
> index 13f98ff..fc84acf 100644
> --- a/src/log-writes/log-writes.h
> +++ b/src/log-writes/log-writes.h
> @@ -53,6 +53,8 @@ struct log {
>  	int replayfd;
>  	unsigned long flags;
>  	u64 sectorsize;
> +	u64 start_sector;
> +	u64 end_sector;
>  	u64 nr_entries;
>  	u64 cur_entry;
>  	u64 max_zero_size;
> diff --git a/src/log-writes/replay-log.c b/src/log-writes/replay-log.c
> index 87c03a2..971974b 100644
> --- a/src/log-writes/replay-log.c
> +++ b/src/log-writes/replay-log.c
> @@ -20,6 +20,8 @@ enum option_indexes {
>  	FSCK,
>  	CHECK,
>  	START_MARK,
> +	START_SECTOR,
> +	END_SECTOR,
>  };
>  
>  static struct option long_options[] = {
> @@ -37,6 +39,8 @@ static struct option long_options[] = {
>  	{"fsck", required_argument, NULL, 0},
>  	{"check", required_argument, NULL, 0},
>  	{"start-mark", required_argument, NULL, 0},
> +	{"start-sector", required_argument, NULL, 0},
> +	{"end-sector", required_argument, NULL, 0},
>  	{ NULL, 0, NULL, 0 },
>  };
>  
> @@ -61,6 +65,12 @@ static void usage(void)
>  		"--check\n");
>  	fprintf(stderr, "\t--check [<number>|flush|fua] when to check the "
>  		"file system, mush specify --fsck\n");
> +	fprintf(stderr, "\t--start-sector <sector> - replay ops on region "
> +		"from <sector> onto <device>\n");
> +	fprintf(stderr, "\t--end-sector <sector> - replay ops on region "
> +		"to <sector> onto <device>\n");
> +	fprintf(stderr, "\t-v or --verbose - print replayed ops\n");
> +	fprintf(stderr, "\t-vv - print also skipped ops\n");
>  	exit(1);
>  }
>  
> @@ -120,6 +130,8 @@ int main(int argc, char **argv)
>  	struct log_write_entry *entry;
>  	u64 stop_flags = 0;
>  	u64 start_entry = 0;
> +	u64 start_sector = 0;
> +	u64 end_sector = -1ULL;
>  	u64 run_limit = 0;
>  	u64 num_entries = 0;
>  	u64 check_number = 0;
> @@ -240,6 +252,22 @@ int main(int argc, char **argv)
>  				tmp = NULL;
>  			}
>  			break;
> +		case START_SECTOR:
> +			start_sector = strtoull(optarg, &tmp, 0);
> +			if (tmp && *tmp != '\0') {
> +				fprintf(stderr, "Invalid sector number\n");
> +				exit(1);
> +			}
> +			tmp = NULL;
> +			break;
> +		case END_SECTOR:
> +			end_sector = strtoull(optarg, &tmp, 0);
> +			if (tmp && *tmp != '\0') {
> +				fprintf(stderr, "Invalid sector number\n");
> +				exit(1);
> +			}
> +			tmp = NULL;
> +			break;
>  		default:
>  			usage();
>  		}
> @@ -257,6 +285,9 @@ int main(int argc, char **argv)
>  	if (!discard)
>  		log->flags |= LOG_IGNORE_DISCARD;
>  
> +	log->start_sector = start_sector;
> +	log->end_sector = end_sector;
> +
>  	entry = malloc(log->sectorsize);
>  	if (!entry) {
>  		fprintf(stderr, "Couldn't allocate buffer\n");
> -- 
> 2.7.4
> 


* Re: [PATCH v2 13/14] fstests: add support for working with dm-log-writes target
  2017-08-30 14:51 ` [PATCH v2 13/14] fstests: add support for working with dm-log-writes target Amir Goldstein
@ 2017-09-05 11:22   ` Eryu Guan
  2017-09-05 15:15     ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Eryu Guan @ 2017-09-05 11:22 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Wed, Aug 30, 2017 at 05:51:45PM +0300, Amir Goldstein wrote:
> Cherry-picked the relevant common bits from commit 70d41e17164b
> in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
> Quoting from Josef's commit message:
> 
>   This patch adds the supporting code for using the dm-log-writes
>   target.  The dmlogwrites code is similar to the dmflakey code, it just
>   gives us functions to build and tear down a dm-log-writes target.  We
>   add a new LOGWRITES_DEV variable to take in the device we will use as
>   the log and add checks for that.
> 
> [Amir:]
> - Removed unneeded _test_falloc_support
> - Moved _require_log_writes to dmlogwrites
> - Document _require_log_writes
> 
> Cc: Josef Bacik <jbacik@fb.com>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  README                       |  2 ++
>  common/dmlogwrites           | 84 ++++++++++++++++++++++++++++++++++++++++++++
>  doc/requirement-checking.txt | 20 +++++++++++
>  3 files changed, 106 insertions(+)
>  create mode 100644 common/dmlogwrites
> 
> diff --git a/README b/README
> index 9456fa7..4963d28 100644
> --- a/README
> +++ b/README
> @@ -91,6 +91,8 @@ Preparing system for tests:
>               - set TEST_XFS_SCRUB=1 to have _check_xfs_filesystem run
>                 xfs_scrub -vd to scrub the filesystem metadata online before
>                 unmounting to run the offline check.
> +             - setenv LOGWRITES_DEV to a block device to use for power fail
> +               testing.
>  
>          - or add a case to the switch in common/config assigning
>            these variables based on the hostname of your test
> diff --git a/common/dmlogwrites b/common/dmlogwrites
> new file mode 100644
> index 0000000..a36724d
> --- /dev/null
> +++ b/common/dmlogwrites
> @@ -0,0 +1,84 @@
> +##/bin/bash
> +#
> +# Copyright (c) 2015 Facebook, Inc.  All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#
> +#
> +# common functions for setting up and tearing down a dm log-writes device

I think we need to _notrun when testing with the dax mount option, like
all other dm target tests do.

> +
> +_require_log_writes()
> +{
> +	_require_dm_target log-writes
> +	_require_test_program "log-writes/replay-log"

As you mentioned before, we need to check that LOGWRITES_DEV exists
first. And is the size of LOGWRITES_DEV required to be >= SCRATCH_DEV?
I guess so.

> +}
> +
> +_init_log_writes()

The function names don't seem consistent in this file: some are in
verb-noun format, some are in noun-verb format. Better to use the
same format across the file, either all prefixed with "_log_writes"
(except _require_log_writes) or suffixed with it.

> +{
> +	local BLK_DEV_SIZE=`blockdev --getsz $SCRATCH_DEV`
> +	LOGWRITES_NAME=logwrites-test

Not a big deal, but LOGWRITES_DEVNAME seems better to me.

> +	LOGWRITES_DMDEV=/dev/mapper/$LOGWRITES_NAME
> +	LOGWRITES_TABLE="0 $BLK_DEV_SIZE log-writes $SCRATCH_DEV $LOGWRITES_DEV"
> +	$DMSETUP_PROG create $LOGWRITES_NAME --table "$LOGWRITES_TABLE" || \
> +		_fatal "failed to create log-writes device"

I think s/_fatal/_fail/g should be OK in this file.

> +	$DMSETUP_PROG mknodes > /dev/null 2>&1
> +}
> +
> +_log_writes_mark()
> +{
> +	[ $# -ne 1 ] && _fatal "_log_writes_mark takes one argument"
> +	$DMSETUP_PROG message $LOGWRITES_NAME 0 mark $1
> +}
> +
> +_log_writes_mkfs()
> +{
> +	_scratch_options mkfs
> +	_mkfs_dev $SCRATCH_OPTIONS $LOGWRITES_DMDEV
> +	_log_writes_mark mkfs
> +}
> +
> +_mount_log_writes()
> +{
> +	mount -t $FSTYP $MOUNT_OPTIONS $* $LOGWRITES_DMDEV $SCRATCH_MNT

$MOUNT_PROG, and I think we can follow _dmerror_mount() for this mount
function.

> +}
> +
> +_unmount_log_writes()
> +{
> +	$UMOUNT_PROG $SCRATCH_MNT
> +}
> +
> +# _replay_log <mark>

_replay_log looks like it replays the filesystem log/journal; prefix it
with _log_writes? So I guess we're going the "_log_writes_<verb>" way :)

> +#
> +# This replays the log contained on $INTEGRITY_DEV onto $SCRATCH_DEV upto the
                                       ^^^^^^^^^^^^^^ LOGWRITES_DEV
> +# mark passed in.
> +_replay_log()
> +{
> +	_mark=$1
> +
> +	$here/src/log-writes/replay-log --log $LOGWRITES_DEV --replay $SCRATCH_DEV \
> +		--end-mark $_mark > /dev/null 2>&1

Dump output to $seqres.full for debugging purposes?

Thanks,
Eryu

> +	[ $? -ne 0 ] && _fatal "replay failed"
> +}
> +
> +_log_writes_remove()
> +{
> +	$DMSETUP_PROG remove $LOGWRITES_NAME > /dev/null 2>&1
> +	$DMSETUP_PROG mknodes > /dev/null 2>&1
> +}
> +
> +_cleanup_log_writes()
> +{
> +	$UMOUNT_PROG $SCRATCH_MNT > /dev/null 2>&1
> +	_log_writes_remove
> +}
> diff --git a/doc/requirement-checking.txt b/doc/requirement-checking.txt
> index 95d10e6..4e01b1f 100644
> --- a/doc/requirement-checking.txt
> +++ b/doc/requirement-checking.txt
> @@ -21,6 +21,10 @@ they have.  This is done with _require_<xxx> macros, which may take parameters.
>  
>  	_require_statx
>  
> + (4) Device mapper requirement.
> +
> +	_require_dm_target
> +	_require_log_writes
>  
>  ====================
>  GENERAL REQUIREMENTS
> @@ -102,3 +106,19 @@ _require_statx
>  
>       The test requires the use of the statx() system call and will be skipped
>       if it isn't available in the kernel.
> +
> +
> +==========================
> +DEVICE MAPPER REQUIREMENTS
> +==========================
> +
> +_require_dm_target <name>
> +
> +     The test requires the use of the device mapper target and will be skipped
> +     if it isn't available in the kernel.
> +
> +_require_log_writes
> +
> +     The test requires the use of the device mapper target log-writes.
> +     The test also requires the test program log-writes/replay-log is built
> +     and will be skipped if either isn't available.
> -- 
> 2.7.4
> 


* Re: [PATCH v2 07/14] fsx: add optional logid prefix to log messages
  2017-09-05 10:46   ` Eryu Guan
@ 2017-09-05 11:24     ` Amir Goldstein
  2017-09-05 11:31       ` Eryu Guan
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-09-05 11:24 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 5, 2017 at 1:46 PM, Eryu Guan <eguan@redhat.com> wrote:
> On Wed, Aug 30, 2017 at 05:51:39PM +0300, Amir Goldstein wrote:
>> When writing the intermixed output of several fsx processes
>> to a single log file, it is useful to prefix logs with a log id.
>> Use fsx -j <logid> to define the log messages prefix.
>
> Would it be better to allow any string as prefix, not limit to id
> number?
>

Maybe, but I didn't see an immediate need for that beyond
the concurrent test runs, for which a numeric id is sufficient.

Besides, the function that prepends the prefix, prt(),
is sometimes used for continued lines, which results in
weird-looking lines like this one:

1: 60( 60 mod 256): 1: FALLOC   0x30140 thru 0x30f3c    (0xdfc bytes)
1: EXTENDING1:

So it's not really worth fixing properly, but anything more than a single
numeric prefix is going to look quite bad.
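
For the record, one way the continued-line case could be handled is to
remember whether the previous message ended with a newline and only
prepend the prefix at the start of a line. This is just a sketch, not
what the patch does: prt_sketch, at_line_start and the output buffer
argument are made up for illustration (the real prt() writes to stdout
and the fsx log file), and logid stands in for the -j value.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static int logid = 1;		/* hypothetical stand-in for the -j value */
static int at_line_start = 1;	/* did the previous message end with '\n'? */

/*
 * Format a message into @out, prepending "<logid>: " only when the
 * previous message completed its line.
 */
static int prt_sketch(char *out, size_t outlen, const char *fmt, ...)
{
	char msg[512];
	va_list args;
	size_t len;
	int n;

	va_start(args, fmt);
	vsnprintf(msg, sizeof(msg), fmt, args);
	va_end(args);

	if (logid && at_line_start)
		n = snprintf(out, outlen, "%d: %s", logid, msg);
	else
		n = snprintf(out, outlen, "%s", msg);

	/* Remember whether the next message starts a fresh line. */
	len = strlen(msg);
	if (len)
		at_line_start = (msg[len - 1] == '\n');
	return n;
}
```

That only tracks line state within one process, so interleaving between
concurrent fsx processes would still be possible.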


* Re: [PATCH v2 14/14] fstests: add crash consistency fsx test using dm-log-writes
  2017-08-30 14:51 ` [PATCH v2 14/14] fstests: add crash consistency fsx test using dm-log-writes Amir Goldstein
@ 2017-09-05 11:28   ` Eryu Guan
  2017-09-05 11:52     ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Eryu Guan @ 2017-09-05 11:28 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Wed, Aug 30, 2017 at 05:51:46PM +0300, Amir Goldstein wrote:
> DO NOT MERGE!!! this test fails most likely due to test bug.
> 
> The random seed values in this patch fail the test consistently on ext4
> always with the same fsck error (end of extent exceeds allowed value).
> btrfs also fails, but with slightly different fsck errors each run.
> xfs fails sometimes on file checksum error.
> 
> Cherry-picked the test from commit 70d41e17164b
> in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
> Quoting from Josef's commit message:
> 
>   The test just runs some ops and exits, then finds all of the good buffers
>   in the directory we provided and:
>   - replays up to the mark given
>   - mounts the file system and compares the md5sum
>   - unmounts and fsck's to check for metadata integrity
> 
>   dm-log-writes will pretend to do discard and the replay-log tool will
>   replay it properly depending on the underlying device, either by writing
>   0's or actually calling the discard ioctl, so I've enabled discard in the
>   test for maximum fun.
> 
> [Amir:]
> - Removed unneeded _test_falloc_support dynamic FSX_OPTS
> - Added place holders for using constant random seeds
> - Add test to new 'replay' group

Perhaps replace it with 'log' group?

> 
> Cc: Josef Bacik <jbacik@fb.com>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  tests/generic/500     | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/500.out |   2 +
>  tests/generic/group   |   1 +
>  3 files changed, 141 insertions(+)
>  create mode 100755 tests/generic/500
>  create mode 100644 tests/generic/500.out
> 
> diff --git a/tests/generic/500 b/tests/generic/500
> new file mode 100755
> index 0000000..81d45ef
> --- /dev/null
> +++ b/tests/generic/500
> @@ -0,0 +1,138 @@
> +#! /bin/bash
> +# FS QA Test No. 500
> +#
> +# Run fsx with log writes to verify power fail safeness.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2015 Facebook. All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +status=1	# failure is the default!
> +
> +_cleanup()
> +{
> +	_cleanup_log_writes
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmlogwrites
> +
> +# real QA test starts here
> +_supported_fs generic
> +_supported_os Linux
> +_require_test
> +_require_scratch_nocheck
> +_require_log_writes
> +
> +rm -f $seqres.full
> +rm -rf $TEST_DIR/fsxtests

Remove it after SANITY_DIR is defined?

> +
> +check_files()
> +{
> +	local _name=$1

Don't need underscore for a local var.

> +
> +	# Now look for our files
> +	for i in $(find $SANITY_DIR -type f | grep $_name | grep mark)
> +	do
> +		local filename=$(basename $i)
> +		local mark="${filename##*.}"
> +		local expected_size=$(_ls_l -h $i | awk '{ print $5 }')
> +		echo "checking $filename ($expected_size)" >> $seqres.full
> +		_replay_log $filename
> +		_scratch_mount
> +		local expected_md5=$(md5sum $i | cut -f 1 -d ' ')
> +		local md5=$(md5sum $SCRATCH_MNT/$_name | cut -f 1 -d ' ')
> +		local size=$(_ls_l -h $SCRATCH_MNT/$_name | awk '{ print $5 }')

I see these md5 and size patterns repeated several times in this
test; write new helpers?

> +		[ "${md5}" != "${expected_md5}" ] && _fatal "$filename ($size) md5sum mismatched"

Use _fail in this test.

> +		_scratch_unmount
> +		_check_scratch_fs
> +	done
> +}
> +
> +SANITY_DIR=$TEST_DIR/fsxtests
> +mkdir $SANITY_DIR
> +
> +# Create the log
> +_init_log_writes
> +
> +_log_writes_mkfs >> $seqres.full 2>&1
> +
> +# Log writes emulates discard support, turn it on for maximum crying.
> +_mount_log_writes -o discard
> +
> +NUM_FILES=4
> +NUM_OPS=200
> +FSX_OPTS="-N $NUM_OPS -d -P $SANITY_DIR -i $LOGWRITES_DMDEV"
> +# Set random seeds for fsx runs (0 for timestamp + pid)
> +seeds=(- 2885 2886 2887 2888)
         ^^ meant 0?

> +# Run fsx for a while
> +for j in `seq 1 $NUM_FILES`
> +do
> +	run_check $here/ltp/fsx $FSX_OPTS -S ${seeds[$j]} -j $j $SCRATCH_MNT/testfile$j &
> +done
> +wait
> +
> +test_md5=()
> +test_size=()
> +for j in `seq 1 $NUM_FILES`
> +do
> +	test_md5[$j]=$(md5sum $SCRATCH_MNT/testfile$j | cut -f 1 -d ' ')
> +	test_size[$j]=$(_ls_l -h $SCRATCH_MNT/testfile$j | awk '{ print $5 }')
> +done
> +
> +# Unmount the scratch dir and tear down the log writes target
> +_log_writes_mark last
> +_unmount_log_writes
> +_log_writes_mark end
> +_log_writes_remove
> +_check_scratch_fs
> +
> +# check pre umount
> +_replay_log last
> +_scratch_mount
> +_scratch_unmount
> +_check_scratch_fs
> +
> +for j in `seq 1 $NUM_FILES`
> +do
> +	check_files testfile$j
> +done
> +
> +# Check the end
> +_replay_log end
> +_scratch_mount
> +for j in `seq 1 $NUM_FILES`
> +do
> +	md5=$(md5sum $SCRATCH_MNT/testfile$j | cut -f 1 -d ' ')
> +	size=$(_ls_l -h $SCRATCH_MNT/testfile$j | awk '{ print $5 }')
> +	[ "${md5}" != "${test_md5[$j]}" ] && _fatal "testfile$j end md5sum mismatched ($size:${test_size[$j]})"
> +done
> +_scratch_unmount
> +_check_scratch_fs
> +
> +echo "Silence is golden"
> +status=0
> +exit
> +
> diff --git a/tests/generic/500.out b/tests/generic/500.out
> new file mode 100644
> index 0000000..883b2ca
> --- /dev/null
> +++ b/tests/generic/500.out
> @@ -0,0 +1,2 @@
> +QA output created by 500
> +Silence is golden
> diff --git a/tests/generic/group b/tests/generic/group
> index 044ec3f..2396b72 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -453,3 +453,4 @@
>  448 auto quick rw
>  449 auto quick acl enospc
>  450 auto quick rw
> +500 auto log replay

Adding it to auto group seems fine to me.

Thanks,
Eryu

> -- 
> 2.7.4
> 


* Re: [PATCH v2 09/14] fsx: add support for -g filldata
  2017-09-05 10:50   ` Eryu Guan
@ 2017-09-05 11:29     ` Amir Goldstein
  2017-09-05 11:33       ` Eryu Guan
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-09-05 11:29 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 5, 2017 at 1:50 PM, Eryu Guan <eguan@redhat.com> wrote:
> On Wed, Aug 30, 2017 at 05:51:41PM +0300, Amir Goldstein wrote:
>> -g X: write character X instead of random generated data
>>
>> This is useful to compare holes between good and bad buffer.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> This seems useful, but I don't see this option gets used in this
> patchset. Perhaps introduce it when it gets used in the test?

I used it for debugging, to compare hexdump of good and bad files
when I suspected that checksum errors were due to different zeroed
ranges.


>
>> ---
>>  ltp/fsx.c | 11 +++++++++--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/ltp/fsx.c b/ltp/fsx.c
>> index dd6b637..a75bc55 100644
>> --- a/ltp/fsx.c
>> +++ b/ltp/fsx.c
>> @@ -132,6 +132,7 @@ unsigned long     simulatedopcount = 0;   /* -b flag */
>>  int  closeprob = 0;                  /* -c flag */
>>  int  debug = 0;                      /* -d flag */
>>  unsigned long        debugstart = 0;         /* -D flag */
>> +char filldata = 0;                   /* -g flag */
>>  int  logid = 0;                      /* -j flag */
>>  int  flush = 0;                      /* -f flag */
>>  int  do_fsync = 0;                   /* -y flag */
>> @@ -817,6 +818,8 @@ gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
>>               good_buf[offset] = testcalls % 256;
>>               if (offset % 2)
>>                       good_buf[offset] += original_buf[offset];
>> +             if (filldata)
>> +                     good_buf[offset] = filldata;
>
> If filldata is not null, we're wasting cycles setting good_buf[offset]
> and then overwriting it with filldata. Use an if-else instead? e.g.
>

Considering this is a debugging option that is not meant to optimize
performance, I don't think that's critical, but I can fix that.
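
Roughly like this, as a standalone sketch. gendata_sketch and the two
file-scope variables are stand-ins for the fsx function and globals, and
the loop shape is assumed from the hunk context above.

```c
/* Stand-ins for the fsx globals; values here are illustrative only. */
static char filldata;			/* -g flag, 0 means not set */
static unsigned long testcalls;		/* operation counter */

static void gendata_sketch(const char *original_buf, char *good_buf,
			   unsigned offset, unsigned size)
{
	while (size--) {
		if (filldata) {
			/* -g X: constant fill, no generated data needed. */
			good_buf[offset] = filldata;
		} else {
			good_buf[offset] = testcalls % 256;
			if (offset % 2)
				good_buf[offset] += original_buf[offset];
		}
		offset++;
	}
}
```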

Thanks
Amir.


* Re: [PATCH v2 07/14] fsx: add optional logid prefix to log messages
  2017-09-05 11:24     ` Amir Goldstein
@ 2017-09-05 11:31       ` Eryu Guan
  2017-09-07  7:10         ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Eryu Guan @ 2017-09-05 11:31 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 05, 2017 at 02:24:20PM +0300, Amir Goldstein wrote:
> On Tue, Sep 5, 2017 at 1:46 PM, Eryu Guan <eguan@redhat.com> wrote:
> > On Wed, Aug 30, 2017 at 05:51:39PM +0300, Amir Goldstein wrote:
> >> When writing the intermixed output of several fsx processes
> >> to a single log file, it is useful to prefix logs with a log id.
> >> Use fsx -j <logid> to define the log messages prefix.
> >
> > Would it be better to allow any string as prefix, not limit to id
> > number?
> >
> 
> Maybe, but I didn't see an immediate need for that beyond
> the concurrent test runs, for which a numeric id is sufficient.

Agreed, I don't have a strong preference on this.

Thanks,
Eryu
> 
> Besides, the function that prepends the prefix, prt(),
> is sometimes used for continued lines, which results in
> weird-looking lines like this one:
> 
> 1: 60( 60 mod 256): 1: FALLOC   0x30140 thru 0x30f3c    (0xdfc bytes)
> 1: EXTENDING1:
> 
> So it's not really worth fixing properly, but anything more than a single
> numeric prefix is going to look quite bad.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/14] fsx: add support for -g filldata
  2017-09-05 11:29     ` Amir Goldstein
@ 2017-09-05 11:33       ` Eryu Guan
  0 siblings, 0 replies; 48+ messages in thread
From: Eryu Guan @ 2017-09-05 11:33 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 05, 2017 at 02:29:35PM +0300, Amir Goldstein wrote:
> On Tue, Sep 5, 2017 at 1:50 PM, Eryu Guan <eguan@redhat.com> wrote:
> > On Wed, Aug 30, 2017 at 05:51:41PM +0300, Amir Goldstein wrote:
> >> -g X: write character X instead of random generated data
> >>
> >> This is useful to compare holes between good and bad buffer.
> >>
> >> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >
> > This seems useful, but I don't see this option being used in this
> > patchset. Perhaps introduce it when it gets used in the test?
> 
> I used it for debugging, to compare hexdump of good and bad files
> when I suspected that checksum errors were due to different zeroed
> ranges.

That's fine, mentioning the debug purpose in the commit log would be good :)

> 
> 
> >
> >> ---
> >>  ltp/fsx.c | 11 +++++++++--
> >>  1 file changed, 9 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/ltp/fsx.c b/ltp/fsx.c
> >> index dd6b637..a75bc55 100644
> >> --- a/ltp/fsx.c
> >> +++ b/ltp/fsx.c
> >> @@ -132,6 +132,7 @@ unsigned long     simulatedopcount = 0;   /* -b flag */
> >>  int  closeprob = 0;                  /* -c flag */
> >>  int  debug = 0;                      /* -d flag */
> >>  unsigned long        debugstart = 0;         /* -D flag */
> >> +char filldata = 0;                   /* -g flag */
> >>  int  logid = 0;                      /* -j flag */
> >>  int  flush = 0;                      /* -f flag */
> >>  int  do_fsync = 0;                   /* -y flag */
> >> @@ -817,6 +818,8 @@ gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
> >>               good_buf[offset] = testcalls % 256;
> >>               if (offset % 2)
> >>                       good_buf[offset] += original_buf[offset];
> >> +             if (filldata)
> >> +                     good_buf[offset] = filldata;
> >
> > If filldata is not null, we're wasting cycles setting good_buf[offset]
> > and overwriting it with filldata. Use an if-else switch? e.g.
> >
> 
> considering this is a debugging option that is not meant to optimize
> performance, I don't think that's critical, but I can fix that.

Thanks!

Eryu

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 12/14] replay-log: add support for replaying ops in target device sector range
  2017-09-05 11:07   ` Eryu Guan
@ 2017-09-05 11:41     ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-09-05 11:41 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 5, 2017 at 2:07 PM, Eryu Guan <eguan@redhat.com> wrote:
> On Wed, Aug 30, 2017 at 05:51:44PM +0300, Amir Goldstein wrote:
>> Using command line options --start-sector and --end-sector, only
>> operations acting on the specified target device range will be
>> replayed.
>>
>> Single verbose mode (-v) prints out only replayed operations.
>> Double verbose mode (-vv) prints out also skipped operations.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  src/log-writes/log-writes.c | 33 +++++++++++++++++++++++++++++++--
>>  src/log-writes/log-writes.h |  2 ++
>>  src/log-writes/replay-log.c | 31 +++++++++++++++++++++++++++++++
>>  3 files changed, 64 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c
>> index ba66a5c..d832c2a 100644
>> --- a/src/log-writes/log-writes.c
>> +++ b/src/log-writes/log-writes.c
>> @@ -119,6 +119,24 @@ int log_discard(struct log *log, struct log_write_entry *entry)
>>
>>  /*
>>   * @log: the log we are replaying.
>> + * @entry: entry to be replayed.
>> + *
>> + * @return: 0 if we should replay the entry, > 0 if we should skip it.
>> + *
>> + * Should we skip the entry in our log or replay onto the replay device.
>> + */
>> +int log_should_skip(struct log *log, struct log_write_entry *entry)
>> +{
>> +     if (!entry->nr_sectors)
>> +             return 0;
>> +     if (entry->sector + entry->nr_sectors < log->start_sector ||
>> +         entry->sector > log->end_sector)
>
> Seems values from entry can't be used directly, need le64_to_cpu first I
> think.
>
>> +             return 1;
>> +     return 0;
>> +}
>> +
>> +/*
>> + * @log: the log we are replaying.
>>   * @entry: where we put the entry.
>>   * @read_data: read the entry data as well, entry must be log->sectorsize sized
>>   * if this is set.
>> @@ -137,6 +155,7 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry,
>>       char *buf;
>>       ssize_t ret;
>>       off_t offset;
>> +     u64 skip = 0;
>
> int skip? and log_should_skip returns int too.
>

Right. Thanks

FYI, this is also a debugging option.
I used it to replay operations on a given range that was different between
good and bad buffers to narrow down the suspects.
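
Both review comments (convert the on-disk little-endian fields before comparing, and return int) can be sketched as below. This is only an illustration under stated assumptions: the two structs are trimmed-down stand-ins for the ones in log-writes.h, and glibc's le64toh() from <endian.h> is used in place of the le64_to_cpu helper mentioned in the review. The range comparison itself is kept as in the quoted hunk.

```c
#define _DEFAULT_SOURCE
#include <endian.h>
#include <stdint.h>

/* Trimmed-down stand-ins for the structures in log-writes.h (assumed). */
struct log_write_entry {
	uint64_t sector;	/* little-endian, as stored in the log */
	uint64_t nr_sectors;	/* little-endian, as stored in the log */
};

struct log {
	uint64_t start_sector;	/* CPU byte order, from --start-sector */
	uint64_t end_sector;	/* CPU byte order, from --end-sector */
};

/* Return 0 if the entry should be replayed, 1 if it should be skipped. */
static int log_should_skip(const struct log *log,
			   const struct log_write_entry *entry)
{
	/* Convert once, then compare in CPU byte order. */
	uint64_t sector = le64toh(entry->sector);
	uint64_t nr_sectors = le64toh(entry->nr_sectors);

	if (!nr_sectors)
		return 0;
	if (sector + nr_sectors < log->start_sector ||
	    sector > log->end_sector)
		return 1;
	return 0;
}
```

On a little-endian host the conversion is a no-op, which is why the unconverted version appears to work there.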

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 14/14] fstests: add crash consistency fsx test using dm-log-writes
  2017-09-05 11:28   ` Eryu Guan
@ 2017-09-05 11:52     ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-09-05 11:52 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 5, 2017 at 2:28 PM, Eryu Guan <eguan@redhat.com> wrote:
> On Wed, Aug 30, 2017 at 05:51:46PM +0300, Amir Goldstein wrote:
>> DO NOT MERGE!!! this test fails most likely due to test bug.
>>
>> The random seed values in this patch fail the test consistently on ext4
>> always with the same fsck error (end of extent exceeds allowed value).
>> btrfs also fails, but with slightly different fsck errors each run.
>> xfs fails sometimes on file checksum error.
>>
>> Cherry-picked the test from commit 70d41e17164b
>> in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
>> Quoting from Josef's commit message:
>>
>>   The test just runs some ops and exits, then finds all of the good buffers
>>   in the directory we provided and:
>>   - replays up to the mark given
>>   - mounts the file system and compares the md5sum
>>   - unmounts and fsck's to check for metadata integrity
>>
>>   dm-log-writes will pretend to do discard and the replay-log tool will
>>   replay it properly depending on the underlying device, either by writing
>>   0's or actually calling the discard ioctl, so I've enabled discard in the
>>   test for maximum fun.
>>
>> [Amir:]
>> - Removed unneeded _test_falloc_support dynamic FSX_OPTS
>> - Added place holders for using constant random seeds
>> - Add test to new 'replay' group
>
> Perhaps replace it with 'log' group?

Josef's version was in log group. I added replay also to define
a new group for crash consistency tests.
...
>> +
>> +NUM_FILES=4
>> +NUM_OPS=200
>> +FSX_OPTS="-N $NUM_OPS -d -P $SANITY_DIR -i $LOGWRITES_DMDEV"
>> +# Set random seeds for fsx runs (0 for timestamp + pid)
>> +seeds=(- 2885 2886 2887 2888)
>          ^^ meant 0?

The for loops below work on indices 1..$NUM_FILES, so seeds[0] is not used.
I can change them to iterate over 0..$((NUM_FILES-1)) if that matters.
Anyway, the next version is not going to have preset seed values.

>> +# Run fsx for a while
>> +for j in `seq 1 $NUM_FILES`
>> +do
>> +     run_check $here/ltp/fsx $FSX_OPTS -S ${seeds[$j]} -j $j $SCRATCH_MNT/testfile$j &
>> +done
>> +wait
>> +
>> +test_md5=()
>> +test_size=()
>> +for j in `seq 1 $NUM_FILES`
>> +do
>> +     test_md5[$j]=$(md5sum $SCRATCH_MNT/testfile$j | cut -f 1 -d ' ')
>> +     test_size[$j]=$(_ls_l -h $SCRATCH_MNT/testfile$j | awk '{ print $5 }')
>> +done
>> +

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 10/14] log-writes: add replay-log program to replay dm-log-writes target
  2017-09-05 11:03   ` Eryu Guan
@ 2017-09-05 13:40     ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-09-05 13:40 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 5, 2017 at 2:03 PM, Eryu Guan <eguan@redhat.com> wrote:
> On Wed, Aug 30, 2017 at 05:51:42PM +0300, Amir Goldstein wrote:
>> Imported Josef Bacik's code from:
>> https://github.com/josefbacik/log-writes.git
>>
>> Specialized program for replaying a write log that was recorded by
>> device mapper log-writes target.  The tool is used to perform
>> crash consistency tests, allowing an arbitrary check tool (fsck)
>> to be run at specified checkpoints in the write log.
>>
>> [Amir:]
>> - Add project Makefile and SOURCE files
>> - Document the replay-log auxiliary program
>>
>> Cc: Josef Bacik <jbacik@fb.com>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
...
>> +static int zero_range(struct log *log, u64 start, u64 len)
>> +{
>> +     u64 bufsize = len;
>> +     ssize_t ret;
>> +     char *buf = NULL;
>> +
>> +     if (log->max_zero_size < len) {
>> +             if (log_writes_verbose)
>> +                     printf("discard len %llu larger than max %llu\n",
>> +                            (unsigned long long)len,
>> +                            (unsigned long long)log->max_zero_size);
>> +             return 0;
>> +     }
>> +
>> +     while (!buf) {
>> +             buf = malloc(sizeof(char) * len);
>                                             ^^^^ shouldn't this be bufsize?
>

Yeah, looks like it should be...
FYI, zero_range() is used to emulate a DISCARD that
was recorded on a device that supports DISCARD but then
replayed on a device that does not support DISCARD.
The only time I tested this scenario was when I replayed the log to /dev/null.
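
For illustration, an allocation loop that requests bufsize rather than len might look like this. It is a hedged sketch: alloc_zero_buf() is a hypothetical helper, not a function in replay-log, and the halve-on-failure fallback is an assumption about what the retry loop is meant to do, since the rest of the loop is not quoted here.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a zero-filled buffer of at most len bytes,
 * halving the request each time malloc() fails. On success the size that
 * was actually allocated is stored in *bufsize_out. Returns NULL if len is
 * zero or if no allocation succeeds.
 */
static char *alloc_zero_buf(uint64_t len, uint64_t *bufsize_out)
{
	uint64_t bufsize = len;
	char *buf = NULL;

	while (!buf && bufsize) {
		buf = malloc((size_t)bufsize);	/* bufsize, not len */
		if (!buf)
			bufsize >>= 1;
	}
	if (!buf)
		return NULL;

	memset(buf, 0, (size_t)bufsize);
	*bufsize_out = bufsize;
	return buf;
}
```

The caller would then write the buffer repeatedly until len bytes have been zeroed, using the returned size as the chunk size.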

>> +/*
>> + * @log: the log we are manipulating.
>> + * @entry_num: the entry we want.
>> + *
>> + * Seek to the given entry in the log, starting at 0 and ending at
>> + * log->nr_entries - 1.
>> + */
>> +int log_seek_entry(struct log *log, u64 entry_num)
>> +{
>> +     u64 i = 0;
>> +
>> +     if (entry_num >= log->nr_entries) {
>> +             fprintf(stderr, "Invalid entry number\n");
>> +             return -1;
>> +     }
>> +
>> +     if (lseek(log->logfd, log->sectorsize, SEEK_SET) == (off_t)-1) {
>> +             fprintf(stderr, "Error seeking in file: %d\n", errno);
>> +             return -1;
>> +     }
>
> Hmm, we reset the log position to the first log entry by seeking to
> log->sectorsize, shouldn't log->cur_entry be reset to 0 too? Though it
> doesn't make any difference for now, because log_seek_entry() is only
> called at init time, log->cur_entry is 0 anyway. But still, I think it
> should be fixed.
>

True.
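
A sketch of the rewind step with the counter reset, under assumptions: log_rewind() is a hypothetical helper name, and struct log is reduced to the three fields used here. Only the seek past the super block and the cur_entry reset are shown; the rest of log_seek_entry() is not reproduced.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Reduced stand-in for struct log from log-writes.h (assumed). */
struct log {
	int logfd;
	uint64_t sectorsize;
	uint64_t cur_entry;
};

/* Hypothetical helper: position the log file at entry 0. */
static int log_rewind(struct log *log)
{
	/*
	 * The log super block occupies the first sector, so the first
	 * log entry starts at offset log->sectorsize.
	 */
	if (lseek(log->logfd, (off_t)log->sectorsize, SEEK_SET) == (off_t)-1) {
		fprintf(stderr, "Error seeking in file: %d\n", errno);
		return -1;
	}
	/* Keep the entry counter consistent with the file position. */
	log->cur_entry = 0;
	return 0;
}
```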

> BTW, better to add some comments about the seek, it's not so obvious
> it's seeking off the log super block on first read :)
>
...
>> +
>> +/*
>> + * Basic info about the log for userspace.
>> + */
>> +struct log_write_super {
>> +     __le64 magic;
>> +     __le64 version;
>> +     __le64 nr_entries;
>> +     __le32 sectorsize;
>> +};
>> +
>> +/*
>> + * sector - the sector we wrote.
>> + * nr_sectors - the number of sectors we wrote.
>> + * flags - flags for this log entry.
>> + * data_len - the size of the data in this log entry, this is for private log
>> + * entry stuff, the MARK data provided by userspace for example.
>> + */
>> +struct log_write_entry {
>> +     __le64 sector;
>> +     __le64 nr_sectors;
>> +     __le64 flags;
>> +     __le64 data_len;
>
> This has to match the in-kernel log_write_entry structure, but the
> data_len field is not used in this userspace program, better to add
> comments to explain that.

OK. Also, should_stop() should use strncmp() with data_len instead of strcmp(),
so there is a use for data_len...

>
>> +};
>> +
>> +#define LOG_IGNORE_DISCARD (1 << 0)
>> +#define LOG_DISCARD_NOT_SUPP (1 << 1)
>> +
>> +struct log {
>> +     int logfd;
>> +     int replayfd;
>> +     unsigned long flags;
>> +     u64 sectorsize;
>> +     u64 nr_entries;
>> +     u64 cur_entry;
>> +     u64 max_zero_size;
>> +     off_t cur_pos;
>
> cur_pos is not used, can be removed?

I think it is best if I use it in the patch
("replay-log: add validations for corrupt log entries")
everywhere I added lseek(log->logfd, 0, SEEK_CUR)
for printing the offset in debug logs.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 13/14] fstests: add support for working with dm-log-writes target
  2017-09-05 11:22   ` Eryu Guan
@ 2017-09-05 15:15     ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-09-05 15:15 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 5, 2017 at 2:22 PM, Eryu Guan <eguan@redhat.com> wrote:
> On Wed, Aug 30, 2017 at 05:51:45PM +0300, Amir Goldstein wrote:
>> Cherry-picked the relevant common bits from commit 70d41e17164b
>> in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
>> Quoting from Josef's commit message:
>>
>>   This patch adds the supporting code for using the dm-log-writes
>>   target.  The dmlogwrites code is similar to the dmflakey code, it just
>>   gives us functions to build and tear down a dm-log-writes target.  We
>>   add a new LOGWRITES_DEV variable to take in the device we will use as
>>   the log and add checks for that.
>>
>> [Amir:]
>> - Removed unneeded _test_falloc_support
>> - Moved _require_log_writes to dmlogwrites
>> - Document _require_log_writes
>>
>> Cc: Josef Bacik <jbacik@fb.com>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  README                       |  2 ++
>>  common/dmlogwrites           | 84 ++++++++++++++++++++++++++++++++++++++++++++
>>  doc/requirement-checking.txt | 20 +++++++++++
>>  3 files changed, 106 insertions(+)
>>  create mode 100644 common/dmlogwrites
>>
>> diff --git a/README b/README
>> index 9456fa7..4963d28 100644
>> --- a/README
>> +++ b/README
>> @@ -91,6 +91,8 @@ Preparing system for tests:
>>               - set TEST_XFS_SCRUB=1 to have _check_xfs_filesystem run
>>                 xfs_scrub -vd to scrub the filesystem metadata online before
>>                 unmounting to run the offline check.
>> +             - setenv LOGWRITES_DEV to a block device to use for power fail
>> +               testing.
>>
>>          - or add a case to the switch in common/config assigning
>>            these variables based on the hostname of your test
>> diff --git a/common/dmlogwrites b/common/dmlogwrites
>> new file mode 100644
>> index 0000000..a36724d
>> --- /dev/null
>> +++ b/common/dmlogwrites
>> @@ -0,0 +1,84 @@
>> +##/bin/bash
>> +#
>> +# Copyright (c) 2015 Facebook, Inc.  All Rights Reserved.
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU General Public License as
>> +# published by the Free Software Foundation.
>> +#
>> +# This program is distributed in the hope that it would be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program; if not, write the Free Software Foundation,
>> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
>> +#
>> +#
>> +# common functions for setting up and tearing down a dm log-writes device
>
> I think we need to _notrun if testing with dax mount option? Like all
> other dm target tests do.
>
>> +
>> +_require_log_writes()
>> +{
>> +     _require_dm_target log-writes
>> +     _require_test_program "log-writes/replay-log"
>
> As you mentioned before, need to check the existence of LOGWRITES_DEV
> first. And is the size of LOGWRITES_DEV required to be >= SCRATCH_DEV?
> I guess so.

Not really. LOGWRITES_DEV just has to be large enough to log all the IOs
in the test. A test may be allocating large files via fallocate so may need
a large SCRATCH_DEV but those fallocs do not translate to large IOs.

>
>> +}
>> +
>> +_init_log_writes()
>
> Seems the function names are not unified in this file, some are in
> verb-noun format, some are in noun-verb format. Better to use the
> same format across the file, either all prefixed with "_log_writes"
> (except _require_log_writes) or suffixed with it.

Agreed.

>
>> +{
>> +     local BLK_DEV_SIZE=`blockdev --getsz $SCRATCH_DEV`
>> +     LOGWRITES_NAME=logwrites-test
>
> Not a big deal, but LOGWRITES_DEVNAME seems better to me.

Too confusing :)
we already have LOGWRITES_DMDEV and LOGWRITES_DEV.

>
>> +     LOGWRITES_DMDEV=/dev/mapper/$LOGWRITES_NAME
>> +     LOGWRITES_TABLE="0 $BLK_DEV_SIZE log-writes $SCRATCH_DEV $LOGWRITES_DEV"
>> +     $DMSETUP_PROG create $LOGWRITES_NAME --table "$LOGWRITES_TABLE" || \
>> +             _fatal "failed to create log-writes device"
>
> I think s/_fatal/_fail/g should be OK in this file.
>
>> +     $DMSETUP_PROG mknodes > /dev/null 2>&1
>> +}
>> +
>> +_log_writes_mark()
>> +{
>> +     [ $# -ne 1 ] && _fatal "_log_writes_mark takes one argument"
>> +     $DMSETUP_PROG message $LOGWRITES_NAME 0 mark $1
>> +}
>> +
>> +_log_writes_mkfs()
>> +{
>> +     _scratch_options mkfs
>> +     _mkfs_dev $SCRATCH_OPTIONS $LOGWRITES_DMDEV
>> +     _log_writes_mark mkfs
>> +}
>> +
>> +_mount_log_writes()
>> +{
>> +     mount -t $FSTYP $MOUNT_OPTIONS $* $LOGWRITES_DMDEV $SCRATCH_MNT
>
> $MOUNT_PROG, and I think we can follow _dmerror_mount() for this mount
> function.
>
>> +}
>> +
>> +_unmount_log_writes()
>> +{
>> +     $UMOUNT_PROG $SCRATCH_MNT
>> +}
>> +
>> +# _replay_log <mark>
>
> _replay_log looks like replaying filesystem log/journal, prefixed with
> _log_writes? So I guess we're going the "_log_writes_<verb>" way :)
>
>> +#
>> +# This replays the log contained on $INTEGRITY_DEV onto $SCRATCH_DEV upto the
>                                        ^^^^^^^^^^^^^^ LOGWRITES_DEV
>> +# mark passed in.
>> +_replay_log()
>> +{
>> +     _mark=$1
>> +
>> +     $here/src/log-writes/replay-log --log $LOGWRITES_DEV --replay $SCRATCH_DEV \
>> +             --end-mark $_mark > /dev/null 2>&1
>
> Dump output to $seqres.full for debug purpose?
>

Definitely. Good idea.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 07/14] fsx: add optional logid prefix to log messages
  2017-09-05 11:31       ` Eryu Guan
@ 2017-09-07  7:10         ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-09-07  7:10 UTC (permalink / raw)
  To: Eryu Guan
  Cc: Josef Bacik, Darrick J . Wong, Christoph Hellwig, fstests, linux-xfs

On Tue, Sep 5, 2017 at 2:31 PM, Eryu Guan <eguan@redhat.com> wrote:
> On Tue, Sep 05, 2017 at 02:24:20PM +0300, Amir Goldstein wrote:
>> On Tue, Sep 5, 2017 at 1:46 PM, Eryu Guan <eguan@redhat.com> wrote:
>> > On Wed, Aug 30, 2017 at 05:51:39PM +0300, Amir Goldstein wrote:
>> >> When writing the intermixed output of several fsx processes
>> >> to a single log file, it is useful to prefix logs with a log id.
>> >> Use fsx -j <logid> to define the log messages prefix.
>> >
>> > Would it be better to allow any string as prefix, not limit to id
>> > number?
>> >
>>
>> Maybe, but I didn't see an immediate need for that beyond
>> the concurrent test runs, for which a numeric id is sufficient.
>
> Agreed, I don't have a strong preference on this.
>

Nah, you were right the first time. Using a numeric id was just me
being lazy, and then I had a bug where -j 0 (not surprisingly) does
not prefix logs with "0:".
I'll fix this and re-post the fsx patches.

Amir.
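
One way to avoid the "-j 0" ambiguity is to keep the prefix as a string and use NULL, rather than 0, to mean "no prefix". The following is a minimal sketch of that idea, not the re-posted fsx code: format_prefix() is a hypothetical helper, and logid here is a local stand-in for the fsx variable.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the fsx -j option value; NULL means "-j was not given". */
static const char *logid = NULL;

/*
 * Write the log prefix ("<logid>: ") into out, or an empty string when no
 * logid was set. Returns the snprintf()-style length of the prefix.
 */
static int format_prefix(char *out, size_t outlen)
{
	if (!logid) {
		if (outlen)
			out[0] = '\0';
		return 0;
	}
	return snprintf(out, outlen, "%s: ", logid);
}
```

With this shape, "-j 0" yields the prefix "0: " like any other id, and arbitrary strings work as prefixes too.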

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes
  2017-09-01  6:52                   ` Amir Goldstein
                                       ` (2 preceding siblings ...)
  2017-09-04  6:42                     ` Dave Chinner
@ 2018-05-25  8:58                     ` Amir Goldstein
  3 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2018-05-25  8:58 UTC (permalink / raw)
  To: Josef Bacik; +Cc: fstests, Theodore Tso, Eryu Guan

On Fri, Sep 1, 2017 at 9:52 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> [CC list, Ted]
>
> On Thu, Aug 31, 2017 at 11:54 PM, Josef Bacik <josef@toxicpanda.com> wrote:
>> On Thu, Aug 31, 2017 at 05:02:46PM +0300, Amir Goldstein wrote:
>>> On Thu, Aug 31, 2017 at 4:43 PM, Josef Bacik <josef@toxicpanda.com> wrote:
>>> > On Thu, Aug 31, 2017 at 03:48:44PM +0300, Amir Goldstein wrote:
>>> >>
>>> >> Josef,
>>> >>
>>> >> I am at a loss with these log corruptions.
>>> >> I see log entry bios submitted and log_end_io report success,
>>> >> but then in the log I see old data on disk where that entry should be.
>>> >> This happens quite randomly and I assume it also happens on
>>> >> logged data, because tests sometimes fail on checksum on ext4.
>>> >>
>>> >> Meanwhile, I added some more log entry sanity checks and debug
>>> >> prints to replay-log to debug the corruption:
>>> >> https://github.com/amir73il/xfstests/commit/bb946deb0dc285867be394613ddb19ce281392cc
>>> >>
>>> >> This only happens to me when running in kvm, so maybe something
>>> >> with the virtio devices is fishy.
>>> >>
>>> >> Anyway, I ran out of time to work on this for now, so if you have
>>> >> any ideas and/or time to test this issue, let me know.
>>> >>
>>> >
> ...
>>>
>>
>> Alright I tested it and it's working fine for me.  I'm creating three lv's and
>> then doing
>>
>> -drive file=/dev/mapper/whatever,format=raw,cache=none,if=virtio,aio=native
>>
>> And I get /dev/vd[bcd] which I use for my test/scratch/log dev and it works out
>> fine.  What is your -drive option line and I'll duplicate what you are doing.
>> Thanks,
>>
>
> I am using Ted's kvm-xfstests, so this is the qemu command line:
> https://github.com/tytso/xfstests-bld/blob/master/kvm-xfstests/kvm-xfstests#L104
>
> The only difference in -drive command is no aio=native.
> BINGO! when I add aio=native there are no more log corruptions :)
> Please try to use aio=threads to see if you also get log corruptions.
>
> Thing is we cannot change kvm-xfstests to always use aio=native because
> it is not recommended for sparse images:
> https://access.redhat.com/articles/41313
> I will try to work something out so that kvm-xfstests will use aio=native
> when using the recommended (but not default) LV setup.
>
> However, why would aio=threads cause log corruption?
> Does it indicate a bug in kvm-qemu or in dm-log-writes??
>
> Did you try to use kvm-xfstests? It's quite convenient to deploy en masse,
> so I think it would be ideal to integrate crash tests with.
> It also helps unifying the environment between us fs developers
> when a bug can not be reproduced on another system. see:
> https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-xfstests.md
>
> Anyway, if you do end up using kvm-xfstests, you'll need this
> small patch to automatically define the log-writes device:
>
> --- a/kvm-xfstests/test-appliance/files/root/runtests.sh
> +++ b/kvm-xfstests/test-appliance/files/root/runtests.sh
> @@ -269,9 +269,11 @@ do
>             if test "$SIZE" = "large" ; then
>                 export SCRATCH_DEV=$LG_SCR_DEV
>                 export SCRATCH_MNT=$LG_SCR_MNT
> +               export LOGWRITES_DEV=$SM_SCR_DEV
>             else
>                 export SCRATCH_DEV=$SM_SCR_DEV
>                 export SCRATCH_MNT=$SM_SCR_MNT
> +               export LOGWRITES_DEV=$LG_SCR_DEV
>             fi
>         fi
>
> kvm-xfstests defines two sets of test/scratch devices, a small and a large
> set, and uses only one of those sets depending on the command line,
> so I use the "other" scratch device as the log-writes device.
>

Ted,

I just realized that I am still carrying this patch, so please pull:
https://github.com/tytso/xfstests-bld/pull/8

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

Thread overview: 48+ messages
2017-08-30 14:51 [PATCH v2 00/14] Crash consistency xfstest using dm-log-writes Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 01/14] common/rc: convert some egrep to grep Amir Goldstein
2017-08-30 15:45   ` Darrick J. Wong
2017-08-30 14:51 ` [PATCH v2 02/14] common/rc: fix _require_xfs_io_command params check Amir Goldstein
2017-08-30 16:17   ` Darrick J. Wong
2017-08-30 14:51 ` [PATCH v2 03/14] fsx: fixes to random seed Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 04/14] fsx: fix path of .fsx* files Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 05/14] fsx: fix compile warnings Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 06/14] fsx: add support for integrity check with dm-log-writes target Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 07/14] fsx: add optional logid prefix to log messages Amir Goldstein
2017-09-05 10:46   ` Eryu Guan
2017-09-05 11:24     ` Amir Goldstein
2017-09-05 11:31       ` Eryu Guan
2017-09-07  7:10         ` Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 08/14] fsx: add support for --record-ops Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 09/14] fsx: add support for -g filldata Amir Goldstein
2017-09-05 10:50   ` Eryu Guan
2017-09-05 11:29     ` Amir Goldstein
2017-09-05 11:33       ` Eryu Guan
2017-08-30 14:51 ` [PATCH v2 10/14] log-writes: add replay-log program to replay dm-log-writes target Amir Goldstein
2017-09-05 11:03   ` Eryu Guan
2017-09-05 13:40     ` Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 11/14] replay-log: output log replay offset in verbose mode Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 12/14] replay-log: add support for replaying ops in target device sector range Amir Goldstein
2017-09-05 11:07   ` Eryu Guan
2017-09-05 11:41     ` Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 13/14] fstests: add support for working with dm-log-writes target Amir Goldstein
2017-09-05 11:22   ` Eryu Guan
2017-09-05 15:15     ` Amir Goldstein
2017-08-30 14:51 ` [PATCH v2 14/14] fstests: add crash consistency fsx test using dm-log-writes Amir Goldstein
2017-09-05 11:28   ` Eryu Guan
2017-09-05 11:52     ` Amir Goldstein
2017-08-30 15:04 ` [PATCH v2 00/14] Crash consistency xfstest " Amir Goldstein
2017-08-30 15:23   ` Josef Bacik
2017-08-30 18:39     ` Amir Goldstein
2017-08-30 18:55       ` Josef Bacik
2017-08-30 19:43         ` Amir Goldstein
     [not found]           ` <CAOQ4uxjt-zZ7_iE7ZYUcp8qWYUH=aDLSum70Dmbnth-5smFQ+A@mail.gmail.com>
     [not found]             ` <20170831134320.lnyu4jibsm3amuk7@destiny>
     [not found]               ` <CAOQ4uxhgOYDfRxZ74RNd=omOMHxF2MgP+wLe0O6HO7+emnrMfA@mail.gmail.com>
     [not found]                 ` <20170831205403.2tene34ccvw55yo7@destiny>
2017-09-01  6:52                   ` Amir Goldstein
2017-09-01  7:03                     ` Josef Bacik
2017-09-01 20:07                     ` Josef Bacik
2017-09-03 13:39                       ` Amir Goldstein
2017-09-04  6:42                     ` Dave Chinner
2017-09-04  6:49                       ` Amir Goldstein
2018-05-25  8:58                     ` Amir Goldstein
2017-08-31  3:38       ` Eryu Guan
2017-08-31  4:29         ` Amir Goldstein
2017-09-01  7:29         ` Amir Goldstein
2017-09-01  7:45           ` Eryu Guan
