nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [fstests PATCH v2 0/2] Add test for DAX+ext4+DMA race
@ 2018-07-10 22:09 Ross Zwisler
  2018-07-10 22:09 ` [fstests PATCH v2 1/2] src/: add license and copyright info to files Ross Zwisler
  2018-07-10 22:09 ` [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch Ross Zwisler
  0 siblings, 2 replies; 8+ messages in thread
From: Ross Zwisler @ 2018-07-10 22:09 UTC (permalink / raw)
  To: Eryu Guan, fstests, Dave Chinner
  Cc: Jan Kara, linux-nvdimm, Christoph Hellwig, linux-ext4

As of kernel version 4.18-rc4, Linux has an issue with ext4+DAX where DMA
and direct I/O operations aren't synchronized with respect to operations
which can change the block mappings of an inode.  This can potentially
lead to data corruption, and is fixed in the following kernel series:

https://lists.01.org/pipermail/linux-nvdimm/2018-July/016842.html

These patches add a regression test for that series.

---

Changes since v1:
 * Added copyright and license information to files I've previously
   authored in src/.

 * Added copyright, license information and info about the test's
   purpose to src/t_mmap_collision.c. (Dave)

 * Added comments to tests/generic/999 explaining why we unset the
   mount flags for SCRATCH_MNT. (Dave)

 * Subtests in src/t_mmap_collision.c are now conditional based on the
   existence of the necessary FALLOC_FL_* flags. (Eryu)

 * Added the test to the punch, collapse and zero groups. (Eryu)

Ross Zwisler (2):
  src/: add license and copyright info to files
  generic/999: test DAX DMA vs truncate/hole-punch

 .gitignore                          |   1 +
 src/Makefile                        |   2 +-
 src/t_ext4_dax_inline_corruption.c  |   2 +
 src/t_ext4_dax_journal_corruption.c |   2 +
 src/t_mmap_collision.c              | 235 ++++++++++++++++++++++++++++++++++++
 src/t_mmap_cow_race.c               |   2 +
 src/t_mmap_stale_pmd.c              |   2 +
 tests/generic/999                   |  53 ++++++++
 tests/generic/999.out               |   2 +
 tests/generic/group                 |   1 +
 10 files changed, 301 insertions(+), 1 deletion(-)
 create mode 100644 src/t_mmap_collision.c
 create mode 100755 tests/generic/999
 create mode 100644 tests/generic/999.out

-- 
2.14.4

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [fstests PATCH v2 1/2] src/: add license and copyright info to files
  2018-07-10 22:09 [fstests PATCH v2 0/2] Add test for DAX+ext4+DMA race Ross Zwisler
@ 2018-07-10 22:09 ` Ross Zwisler
  2018-07-14 12:54   ` Eryu Guan
  2018-07-10 22:09 ` [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch Ross Zwisler
  1 sibling, 1 reply; 8+ messages in thread
From: Ross Zwisler @ 2018-07-10 22:09 UTC (permalink / raw)
  To: Eryu Guan, fstests, Dave Chinner
  Cc: Jan Kara, linux-nvdimm, Christoph Hellwig, linux-ext4

Add copyright and license info to files that I've authored in src/.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 src/t_ext4_dax_inline_corruption.c  | 2 ++
 src/t_ext4_dax_journal_corruption.c | 2 ++
 src/t_mmap_cow_race.c               | 2 ++
 src/t_mmap_stale_pmd.c              | 2 ++
 4 files changed, 8 insertions(+)

diff --git a/src/t_ext4_dax_inline_corruption.c b/src/t_ext4_dax_inline_corruption.c
index b52bcc0d..e1a39a6c 100644
--- a/src/t_ext4_dax_inline_corruption.c
+++ b/src/t_ext4_dax_inline_corruption.c
@@ -1,3 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018 Intel Corporation. */
 #include <errno.h>
 #include <fcntl.h>
 #include <stdio.h>
diff --git a/src/t_ext4_dax_journal_corruption.c b/src/t_ext4_dax_journal_corruption.c
index fccef8f5..ba7a96e4 100644
--- a/src/t_ext4_dax_journal_corruption.c
+++ b/src/t_ext4_dax_journal_corruption.c
@@ -1,3 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018 Intel Corporation. */
 #include <errno.h>
 #include <fcntl.h>
 #include <stdio.h>
diff --git a/src/t_mmap_cow_race.c b/src/t_mmap_cow_race.c
index 207ba425..5941fcf3 100644
--- a/src/t_mmap_cow_race.c
+++ b/src/t_mmap_cow_race.c
@@ -1,3 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2017 Intel Corporation. */
 #include <errno.h>
 #include <fcntl.h>
 #include <libgen.h>
diff --git a/src/t_mmap_stale_pmd.c b/src/t_mmap_stale_pmd.c
index 6a52201c..a139d8ac 100644
--- a/src/t_mmap_stale_pmd.c
+++ b/src/t_mmap_stale_pmd.c
@@ -1,3 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2017 Intel Corporation. */
 #include <errno.h>
 #include <fcntl.h>
 #include <libgen.h>
-- 
2.14.4

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch
  2018-07-10 22:09 [fstests PATCH v2 0/2] Add test for DAX+ext4+DMA race Ross Zwisler
  2018-07-10 22:09 ` [fstests PATCH v2 1/2] src/: add license and copyright info to files Ross Zwisler
@ 2018-07-10 22:09 ` Ross Zwisler
  2018-07-14  1:41   ` Dave Chinner
  2018-07-14 12:53   ` Eryu Guan
  1 sibling, 2 replies; 8+ messages in thread
From: Ross Zwisler @ 2018-07-10 22:09 UTC (permalink / raw)
  To: Eryu Guan, fstests, Dave Chinner
  Cc: Jan Kara, linux-nvdimm, Christoph Hellwig, linux-ext4

This adds a regression test for the following series:

https://lists.01.org/pipermail/linux-nvdimm/2018-July/016842.html

which adds synchronization between DAX DMA in ext4 and truncate/hole-punch.
The intention of the test is to test those specific changes, but it runs
fine both with XFS and without DAX so I've put it in the generic tests
instead of ext4 and not restricted it to only DAX configurations.

When run with v4.18-rc4 + DAX + ext4, this test will hit the following
WARN_ON_ONCE() in dax_disassociate_entry():

	WARN_ON_ONCE(trunc && page_ref_count(page) > 1);

If you change this to a WARN_ON() instead, you can see that each of the
four paths being exercised in this test hits that condition many times in
the one second that the subtest is being run.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 .gitignore             |   1 +
 src/Makefile           |   2 +-
 src/t_mmap_collision.c | 235 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/999      |  53 +++++++++++
 tests/generic/999.out  |   2 +
 tests/generic/group    |   1 +
 6 files changed, 293 insertions(+), 1 deletion(-)
 create mode 100644 src/t_mmap_collision.c
 create mode 100755 tests/generic/999
 create mode 100644 tests/generic/999.out

diff --git a/.gitignore b/.gitignore
index efc73a7c..ea1aac8a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -125,6 +125,7 @@
 /src/t_holes
 /src/t_immutable
 /src/t_locks_execve
+/src/t_mmap_collision
 /src/t_mmap_cow_race
 /src/t_mmap_dio
 /src/t_mmap_fallocate
diff --git a/src/Makefile b/src/Makefile
index 9e971bcc..41826585 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -16,7 +16,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
 	holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
 	t_mmap_cow_race t_mmap_fallocate fsync-err t_mmap_write_ro \
 	t_ext4_dax_journal_corruption t_ext4_dax_inline_corruption \
-	t_ofd_locks t_locks_execve
+	t_ofd_locks t_locks_execve t_mmap_collision
 
 LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
 	preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
diff --git a/src/t_mmap_collision.c b/src/t_mmap_collision.c
new file mode 100644
index 00000000..5f58d858
--- /dev/null
+++ b/src/t_mmap_collision.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018 Intel Corporation.
+ *
+ * As of kernel version 4.18-rc4, Linux has an issue with ext4+DAX where DMA
+ * and direct I/O operations aren't synchronized with respect to operations
+ * which can change the block mappings of an inode.  This means that we can
+ * schedule an I/O for an inode and have the block mapping for that inode
+ * change before the I/O is actually complete.  So, blocks which were once
+ * allocated to a given inode and then freed could still have I/O operations
+ * happening to them.  If these blocks have also been reallocated to a
+ * different inode, this interaction can lead to data corruption.
+ *
+ * This test exercises four of the paths in ext4 which hit this issue.
+ */
+#include <errno.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#define PAGE(a) ((a)*0x1000)
+#define FILE_SIZE PAGE(4)
+
+void *dax_data;
+int nodax_fd;
+int dax_fd;
+bool done;
+
+#define err_exit(op)                                                          \
+{                                                                             \
+	fprintf(stderr, "%s %s: %s\n", __func__, op, strerror(errno));        \
+	exit(1);                                                              \
+}                                                                             \
+
+#if defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE)
+void punch_hole_fn(void *ptr)
+{
+	ssize_t read;
+	int rc;
+
+	while (!done) {
+		read = 0;
+
+		do {
+			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+					read);
+			if (rc > 0)
+				read += rc;
+		} while (rc > 0);
+
+		if (read != FILE_SIZE || rc != 0)
+			err_exit("pread");
+
+		rc = fallocate(dax_fd,
+				FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+				0, FILE_SIZE);
+		if (rc < 0)
+			err_exit("fallocate");
+
+		usleep(rand() % 1000);
+	}
+}
+#else
+void punch_hole_fn(void *ptr) { }
+#endif
+
+#if defined(FALLOC_FL_ZERO_RANGE) && defined(FALLOC_FL_KEEP_SIZE)
+void zero_range_fn(void *ptr)
+{
+	ssize_t read;
+	int rc;
+
+	while (!done) {
+		read = 0;
+
+		do {
+			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+					read);
+			if (rc > 0)
+				read += rc;
+		} while (rc > 0);
+
+		if (read != FILE_SIZE || rc != 0)
+			err_exit("pread");
+
+		rc = fallocate(dax_fd,
+				FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
+				0, FILE_SIZE);
+		if (rc < 0)
+			err_exit("fallocate");
+
+		usleep(rand() % 1000);
+	}
+}
+#else
+void zero_range_fn(void *ptr) { }
+#endif
+
+void truncate_down_fn(void *ptr)
+{
+	ssize_t read;
+	int rc;
+
+	while (!done) {
+		read = 0;
+
+		if (ftruncate(dax_fd, 0) < 0)
+			err_exit("ftruncate");
+		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
+			err_exit("fallocate");
+
+		do {
+			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+					read);
+			if (rc > 0)
+				read += rc;
+		} while (rc > 0);
+
+		/*
+		 * For this test we ignore errors from pread().  These errors
+		 * can happen if we try and read while the other thread has
+		 * made the file size 0.
+		 */
+
+		usleep(rand() % 1000);
+	}
+}
+
+#ifdef FALLOC_FL_COLLAPSE_RANGE
+void collapse_range_fn(void *ptr)
+{
+	ssize_t read;
+	int rc;
+
+	while (!done) {
+		read = 0;
+
+		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
+			err_exit("fallocate 1");
+		if (fallocate(dax_fd, FALLOC_FL_COLLAPSE_RANGE, 0, PAGE(1)) < 0)
+			err_exit("fallocate 2");
+		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
+			err_exit("fallocate 3");
+
+		do {
+			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+					read);
+			if (rc > 0)
+				read += rc;
+		} while (rc > 0);
+
+		/* For this test we ignore errors from pread. */
+
+		usleep(rand() % 1000);
+	}
+}
+#else
+void collapse_range_fn(void *ptr) { }
+#endif
+
+void run_test(void (*test_fn)(void *))
+{
+	const int NUM_THREADS = 2;
+	pthread_t worker_thread[NUM_THREADS];
+	int i;
+
+	done = 0;
+	for (i = 0; i < NUM_THREADS; i++)
+		pthread_create(&worker_thread[i], NULL, (void*)test_fn, NULL);
+
+	sleep(1);
+	done = 1;
+
+	for (i = 0; i < NUM_THREADS; i++)
+		pthread_join(worker_thread[i], NULL);
+}
+
+int main(int argc, char *argv[])
+{
+	int err;
+
+	if (argc != 3) {
+		printf("Usage: %s <dax file> <non-dax file>\n",
+				basename(argv[0]));
+		exit(0);
+	}
+
+	dax_fd = open(argv[1], O_RDWR|O_CREAT, S_IRUSR|S_IWUSR);
+	if (dax_fd < 0)
+		err_exit("dax_fd open");
+
+	nodax_fd = open(argv[2], O_RDWR|O_CREAT|O_DIRECT, S_IRUSR|S_IWUSR);
+	if (nodax_fd < 0)
+		err_exit("nodax_fd open");
+
+	if (ftruncate(dax_fd, 0) < 0)
+		err_exit("dax_fd ftruncate");
+	if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
+		err_exit("dax_fd fallocate");
+
+	if (ftruncate(nodax_fd, 0) < 0)
+		err_exit("nodax_fd ftruncate");
+	if (fallocate(nodax_fd, 0, 0, FILE_SIZE) < 0)
+		err_exit("nodax_fd fallocate");
+
+	dax_data = mmap(NULL, FILE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED,
+			dax_fd, 0);
+	if (dax_data == MAP_FAILED)
+		err_exit("mmap");
+
+	run_test(&punch_hole_fn);
+	run_test(&zero_range_fn);
+	run_test(&truncate_down_fn);
+	run_test(&collapse_range_fn);
+
+	if (munmap(dax_data, FILE_SIZE) != 0)
+		err_exit("munmap");
+
+	err = close(dax_fd);
+	if (err < 0)
+		err_exit("dax_fd close");
+
+	err = close(nodax_fd);
+	if (err < 0)
+		err_exit("nodax_fd close");
+
+	return 0;
+}
diff --git a/tests/generic/999 b/tests/generic/999
new file mode 100755
index 00000000..0db824fb
--- /dev/null
+++ b/tests/generic/999
@@ -0,0 +1,53 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2018 Intel Corporation.  All Rights Reserved.
+#
+# FS QA Test generic/999
+#
+# This is a regression test for kernel patch:
+#   ext4: handle layout changes to pinned DAX mapping
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_test
+_require_test_program "t_mmap_collision"
+
+# To get the failure we turn off DAX on our SCRATCH_MNT so we can get O_DIRECT
+# behavior.  We will continue to use unmodified mount options for the test
+# TEST_DIR.  The failures fixed by the above mentioned kernel patch trigger
+# when those mount options include "-o dax", but the test runs fine without
+# that option so we don't require it.
+export MOUNT_OPTIONS=""
+_scratch_unmount >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+# real QA test starts here
+$here/src/t_mmap_collision $TEST_DIR/testfile $SCRATCH_MNT/testfile
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/999.out b/tests/generic/999.out
new file mode 100644
index 00000000..3b276ca8
--- /dev/null
+++ b/tests/generic/999.out
@@ -0,0 +1,2 @@
+QA output created by 999
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index b2a093f4..71f5c945 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -502,3 +502,4 @@
 497 auto quick swap collapse
 498 auto quick log
 499 auto quick rw collapse zero
+999 auto quick dax punch collapse zero
-- 
2.14.4

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch
  2018-07-10 22:09 ` [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch Ross Zwisler
@ 2018-07-14  1:41   ` Dave Chinner
  2018-07-16 16:07     ` Ross Zwisler
  2018-07-14 12:53   ` Eryu Guan
  1 sibling, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2018-07-14  1:41 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Jan Kara, Eryu Guan, linux-nvdimm, fstests, linux-ext4,
	Christoph Hellwig

On Tue, Jul 10, 2018 at 04:09:42PM -0600, Ross Zwisler wrote:
> This adds a regression test for the following series:
> 
> https://lists.01.org/pipermail/linux-nvdimm/2018-July/016842.html
> 
> which adds synchronization between DAX DMA in ext4 and truncate/hole-punch.
> The intention of the test is to test those specific changes, but it runs
> fine both with XFS and without DAX so I've put it in the generic tests
> instead of ext4 and not restricted it to only DAX configurations.
> 
> When run with v4.18-rc4 + DAX + ext4, this test will hit the following
> WARN_ON_ONCE() in dax_disassociate_entry():
> 
> 	WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
> 
> If you change this to a WARN_ON() instead, you can see that each of the
> four paths being exercised in this test hits that condition many times in
> the one second that the subtest is being run.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>

just a FWIW:

> +int main(int argc, char *argv[])
> +{
> +	int err;
> +
> +	if (argc != 3) {
> +		printf("Usage: %s <dax file> <non-dax file>\n",
> +				basename(argv[0]));
> +		exit(0);
> +	}

This is an example of why we should be supporting per-inode DAX
enablement in all filesystems, not a global mount option....

> +_supported_fs generic
> +_supported_os Linux
> +_require_test
> +_require_test_program "t_mmap_collision"
> +
> +# To get the failure we turn off DAX on our SCRATCH_MNT so we can get O_DIRECT
> +# behavior.  We will continue to use unmodified mount options for the test
> +# TEST_DIR.  The failures fixed by the above mentioned kernel patch trigger
> +# when those mount options include "-o dax", but the test runs fine without
> +# that option so we don't require it.
> +export MOUNT_OPTIONS=""
> +_scratch_unmount >> $seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1

And then all this crap would go away....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch
  2018-07-10 22:09 ` [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch Ross Zwisler
  2018-07-14  1:41   ` Dave Chinner
@ 2018-07-14 12:53   ` Eryu Guan
  2018-07-27 21:12     ` [fstests PATCH v3] " Ross Zwisler
  1 sibling, 1 reply; 8+ messages in thread
From: Eryu Guan @ 2018-07-14 12:53 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Jan Kara, Eryu Guan, linux-nvdimm, Dave Chinner, fstests,
	linux-ext4, Christoph Hellwig

On Tue, Jul 10, 2018 at 04:09:42PM -0600, Ross Zwisler wrote:
> This adds a regression test for the following series:
> 
> https://lists.01.org/pipermail/linux-nvdimm/2018-July/016842.html
> 
> which adds synchronization between DAX DMA in ext4 and truncate/hole-punch.
> The intention of the test is to test those specific changes, but it runs
> fine both with XFS and without DAX so I've put it in the generic tests
> instead of ext4 and not restricted it to only DAX configurations.
> 
> When run with v4.18-rc4 + DAX + ext4, this test will hit the following
> WARN_ON_ONCE() in dax_disassociate_entry():
> 
> 	WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
> 
> If you change this to a WARN_ON() instead, you can see that each of the
> four paths being exercised in this test hits that condition many times in
> the one second that the subtest is being run.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> ---
>  .gitignore             |   1 +
>  src/Makefile           |   2 +-
>  src/t_mmap_collision.c | 235 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/999      |  53 +++++++++++
>  tests/generic/999.out  |   2 +
>  tests/generic/group    |   1 +
>  6 files changed, 293 insertions(+), 1 deletion(-)
>  create mode 100644 src/t_mmap_collision.c
>  create mode 100755 tests/generic/999
>  create mode 100644 tests/generic/999.out
> 
> diff --git a/.gitignore b/.gitignore
> index efc73a7c..ea1aac8a 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -125,6 +125,7 @@
>  /src/t_holes
>  /src/t_immutable
>  /src/t_locks_execve
> +/src/t_mmap_collision
>  /src/t_mmap_cow_race
>  /src/t_mmap_dio
>  /src/t_mmap_fallocate
> diff --git a/src/Makefile b/src/Makefile
> index 9e971bcc..41826585 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -16,7 +16,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
>  	holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
>  	t_mmap_cow_race t_mmap_fallocate fsync-err t_mmap_write_ro \
>  	t_ext4_dax_journal_corruption t_ext4_dax_inline_corruption \
> -	t_ofd_locks t_locks_execve
> +	t_ofd_locks t_locks_execve t_mmap_collision
>  
>  LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
>  	preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
> diff --git a/src/t_mmap_collision.c b/src/t_mmap_collision.c
> new file mode 100644
> index 00000000..5f58d858
> --- /dev/null
> +++ b/src/t_mmap_collision.c
> @@ -0,0 +1,235 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2018 Intel Corporation.
> + *
> + * As of kernel version 4.18-rc4, Linux has an issue with ext4+DAX where DMA
> + * and direct I/O operations aren't synchronized with respect to operations
> + * which can change the block mappings of an inode.  This means that we can
> + * schedule an I/O for an inode and have the block mapping for that inode
> + * change before the I/O is actually complete.  So, blocks which were once
> + * allocated to a given inode and then freed could still have I/O operations
> + * happening to them.  If these blocks have also been reallocated to a
> + * different inode, this interaction can lead to data corruption.
> + *
> + * This test exercises four of the paths in ext4 which hit this issue.
> + */
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <pthread.h>
> +#include <stdbool.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <sys/mman.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +
> +#define PAGE(a) ((a)*0x1000)
> +#define FILE_SIZE PAGE(4)
> +
> +void *dax_data;
> +int nodax_fd;
> +int dax_fd;
> +bool done;
> +
> +#define err_exit(op)                                                          \
> +{                                                                             \
> +	fprintf(stderr, "%s %s: %s\n", __func__, op, strerror(errno));        \
> +	exit(1);                                                              \
> +}                                                                             \
> +
> +#if defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE)

Yeah, dropping '_require_xfs_io_command "falloc"' etc is fine, as long
as we do the check in c code.

But this check doesn't seem sufficient, it's possible that the system
has the flags defined but the filesystem doesn't support such
operations. e.g. I got failures like below on btrfs

    +collapse_range_fn fallocate 2: Operation not supported
    +collapse_range_fn fallocate 2: Operation not supported

and similar failure on NFS

    +main dax_fd fallocate: Operation not supported

IMHO, using _require_xfs_io_command is much easier :)

(Though we only need filesystems to support such operations not xfs_io,
I don't think we'll lose much test coverage, as it's unlikely that
filesystems have fallocate/collapse/punch/zero support but xfs_io
doesn't.)

> +void punch_hole_fn(void *ptr)
> +{
> +	ssize_t read;
> +	int rc;
> +
> +	while (!done) {
> +		read = 0;
> +
> +		do {
> +			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
> +					read);
> +			if (rc > 0)
> +				read += rc;
> +		} while (rc > 0);
> +
> +		if (read != FILE_SIZE || rc != 0)
> +			err_exit("pread");
> +
> +		rc = fallocate(dax_fd,
> +				FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> +				0, FILE_SIZE);
> +		if (rc < 0)
> +			err_exit("fallocate");
> +
> +		usleep(rand() % 1000);
> +	}
> +}
> +#else
> +void punch_hole_fn(void *ptr) { }
> +#endif
> +
> +#if defined(FALLOC_FL_ZERO_RANGE) && defined(FALLOC_FL_KEEP_SIZE)
> +void zero_range_fn(void *ptr)
> +{
> +	ssize_t read;
> +	int rc;
> +
> +	while (!done) {
> +		read = 0;
> +
> +		do {
> +			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
> +					read);
> +			if (rc > 0)
> +				read += rc;
> +		} while (rc > 0);
> +
> +		if (read != FILE_SIZE || rc != 0)
> +			err_exit("pread");
> +
> +		rc = fallocate(dax_fd,
> +				FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
> +				0, FILE_SIZE);
> +		if (rc < 0)
> +			err_exit("fallocate");
> +
> +		usleep(rand() % 1000);
> +	}
> +}
> +#else
> +void zero_range_fn(void *ptr) { }
> +#endif
> +
> +void truncate_down_fn(void *ptr)
> +{
> +	ssize_t read;
> +	int rc;
> +
> +	while (!done) {
> +		read = 0;
> +
> +		if (ftruncate(dax_fd, 0) < 0)
> +			err_exit("ftruncate");
> +		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
> +			err_exit("fallocate");
> +
> +		do {
> +			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
> +					read);
> +			if (rc > 0)
> +				read += rc;
> +		} while (rc > 0);
> +
> +		/*
> +		 * For this test we ignore errors from pread().  These errors
> +		 * can happen if we try and read while the other thread has
> +		 * made the file size 0.
> +		 */
> +
> +		usleep(rand() % 1000);
> +	}
> +}
> +
> +#ifdef FALLOC_FL_COLLAPSE_RANGE
> +void collapse_range_fn(void *ptr)
> +{
> +	ssize_t read;
> +	int rc;
> +
> +	while (!done) {
> +		read = 0;
> +
> +		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
> +			err_exit("fallocate 1");
> +		if (fallocate(dax_fd, FALLOC_FL_COLLAPSE_RANGE, 0, PAGE(1)) < 0)
> +			err_exit("fallocate 2");
> +		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
> +			err_exit("fallocate 3");
> +
> +		do {
> +			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
> +					read);
> +			if (rc > 0)
> +				read += rc;
> +		} while (rc > 0);
> +
> +		/* For this test we ignore errors from pread. */
> +
> +		usleep(rand() % 1000);
> +	}
> +}
> +#else
> +void collapse_range_fn(void *ptr) { }
> +#endif
> +
> +void run_test(void (*test_fn)(void *))
> +{
> +	const int NUM_THREADS = 2;
> +	pthread_t worker_thread[NUM_THREADS];
> +	int i;
> +
> +	done = 0;
> +	for (i = 0; i < NUM_THREADS; i++)
> +		pthread_create(&worker_thread[i], NULL, (void*)test_fn, NULL);
> +
> +	sleep(1);
> +	done = 1;
> +
> +	for (i = 0; i < NUM_THREADS; i++)
> +		pthread_join(worker_thread[i], NULL);
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	int err;
> +
> +	if (argc != 3) {
> +		printf("Usage: %s <dax file> <non-dax file>\n",
> +				basename(argv[0]));
> +		exit(0);
> +	}
> +
> +	dax_fd = open(argv[1], O_RDWR|O_CREAT, S_IRUSR|S_IWUSR);
> +	if (dax_fd < 0)
> +		err_exit("dax_fd open");
> +
> +	nodax_fd = open(argv[2], O_RDWR|O_CREAT|O_DIRECT, S_IRUSR|S_IWUSR);
> +	if (nodax_fd < 0)
> +		err_exit("nodax_fd open");
> +
> +	if (ftruncate(dax_fd, 0) < 0)
> +		err_exit("dax_fd ftruncate");
> +	if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
> +		err_exit("dax_fd fallocate");
> +
> +	if (ftruncate(nodax_fd, 0) < 0)
> +		err_exit("nodax_fd ftruncate");
> +	if (fallocate(nodax_fd, 0, 0, FILE_SIZE) < 0)
> +		err_exit("nodax_fd fallocate");
> +
> +	dax_data = mmap(NULL, FILE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED,
> +			dax_fd, 0);
> +	if (dax_data == MAP_FAILED)
> +		err_exit("mmap");
> +
> +	run_test(&punch_hole_fn);
> +	run_test(&zero_range_fn);
> +	run_test(&truncate_down_fn);
> +	run_test(&collapse_range_fn);
> +
> +	if (munmap(dax_data, FILE_SIZE) != 0)
> +		err_exit("munmap");
> +
> +	err = close(dax_fd);
> +	if (err < 0)
> +		err_exit("dax_fd close");
> +
> +	err = close(nodax_fd);
> +	if (err < 0)
> +		err_exit("nodax_fd close");
> +
> +	return 0;
> +}
> diff --git a/tests/generic/999 b/tests/generic/999
> new file mode 100755
> index 00000000..0db824fb
> --- /dev/null
> +++ b/tests/generic/999
> @@ -0,0 +1,53 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2018 Intel Corporation.  All Rights Reserved.
> +#
> +# FS QA Test generic/999
> +#
> +# This is a regression test for kernel patch:
> +#   ext4: handle layout changes to pinned DAX mapping

Better to have detailed test description here too.

> +#
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# remove previous $seqres.full before test
> +rm -f $seqres.full
> +
> +# Modify as appropriate.
> +_supported_fs generic
> +_supported_os Linux
> +_require_test
> +_require_test_program "t_mmap_collision"
> +
> +# To get the failure we turn off DAX on our SCRATCH_MNT so we can get O_DIRECT
> +# behavior.  We will continue to use unmodified mount options for the test
> +# TEST_DIR.  The failures fixed by the above mentioned kernel patch trigger
> +# when those mount options include "-o dax", but the test runs fine without
> +# that option so we don't require it.
> +export MOUNT_OPTIONS=""
> +_scratch_unmount >> $seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1

Need to _require_scratch first and do _scratch_mkfs, then
_scratch_mount. e.g.

_require_scratch
...
_scratch_mkfs >> $seqres.full
# <comments on unseting MOUNT_OPTIONS here>
export MOUNT_OPTIONS=""
_scratch_mount

Thanks,
Eryu

> +
> +# real QA test starts here
> +$here/src/t_mmap_collision $TEST_DIR/testfile $SCRATCH_MNT/testfile
> +
> +# success, all done
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/generic/999.out b/tests/generic/999.out
> new file mode 100644
> index 00000000..3b276ca8
> --- /dev/null
> +++ b/tests/generic/999.out
> @@ -0,0 +1,2 @@
> +QA output created by 999
> +Silence is golden
> diff --git a/tests/generic/group b/tests/generic/group
> index b2a093f4..71f5c945 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -502,3 +502,4 @@
>  497 auto quick swap collapse
>  498 auto quick log
>  499 auto quick rw collapse zero
> +999 auto quick dax punch collapse zero
> -- 
> 2.14.4
> 
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [fstests PATCH v2 1/2] src/: add license and copyright info to files
  2018-07-10 22:09 ` [fstests PATCH v2 1/2] src/: add license and copyright info to files Ross Zwisler
@ 2018-07-14 12:54   ` Eryu Guan
  0 siblings, 0 replies; 8+ messages in thread
From: Eryu Guan @ 2018-07-14 12:54 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Jan Kara, Eryu Guan, linux-nvdimm, Dave Chinner, fstests,
	linux-ext4, Christoph Hellwig

On Tue, Jul 10, 2018 at 04:09:41PM -0600, Ross Zwisler wrote:
> Add copyright and license info to files that I've authored in src/.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>

Thanks for the copyright and license update, I've applied it for next
update.

Thanks,
Eryu
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch
  2018-07-14  1:41   ` Dave Chinner
@ 2018-07-16 16:07     ` Ross Zwisler
  0 siblings, 0 replies; 8+ messages in thread
From: Ross Zwisler @ 2018-07-16 16:07 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Jan Kara, Eryu Guan, linux-nvdimm, fstests, Christoph Hellwig,
	linux-ext4

On Sat, Jul 14, 2018 at 11:41:05AM +1000, Dave Chinner wrote:
> On Tue, Jul 10, 2018 at 04:09:42PM -0600, Ross Zwisler wrote:
> > This adds a regression test for the following series:
> > 
> > https://lists.01.org/pipermail/linux-nvdimm/2018-July/016842.html
> > 
> > which adds synchronization between DAX DMA in ext4 and truncate/hole-punch.
> > The intention of the test is to test those specific changes, but it runs
> > fine both with XFS and without DAX so I've put it in the generic tests
> > instead of ext4 and not restricted it to only DAX configurations.
> > 
> > When run with v4.18-rc4 + DAX + ext4, this test will hit the following
> > WARN_ON_ONCE() in dax_disassociate_entry():
> > 
> > 	WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
> > 
> > If you change this to a WARN_ON() instead, you can see that each of the
> > four paths being exercised in this test hits that condition many times in
> > the one second that the subtest is being run.
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> 
> just a FWIW:
> 
> > +int main(int argc, char *argv[])
> > +{
> > +	int err;
> > +
> > +	if (argc != 3) {
> > +		printf("Usage: %s <dax file> <non-dax file>\n",
> > +				basename(argv[0]));
> > +		exit(0);
> > +	}
> 
> This is an example of why we should be supporting per-inode DAX
> enablement in all filesystems, not a global mount option....
> 
> > +_supported_fs generic
> > +_supported_os Linux
> > +_require_test
> > +_require_test_program "t_mmap_collision"
> > +
> > +# To get the failure we turn off DAX on our SCRATCH_MNT so we can get O_DIRECT
> > +# behavior.  We will continue to use unmodified mount options for the test
> > +# TEST_DIR.  The failures fixed by the above mentioned kernel patch trigger
> > +# when those mount options include "-o dax", but the test runs fine without
> > +# that option so we don't require it.
> > +export MOUNT_OPTIONS=""
> > +_scratch_unmount >> $seqres.full 2>&1
> > +_scratch_mount >> $seqres.full 2>&1
> 
> And then all this crap would go away....

Yep, I totally agree. :)
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [fstests PATCH v3] generic/999: test DAX DMA vs truncate/hole-punch
  2018-07-14 12:53   ` Eryu Guan
@ 2018-07-27 21:12     ` Ross Zwisler
  0 siblings, 0 replies; 8+ messages in thread
From: Ross Zwisler @ 2018-07-27 21:12 UTC (permalink / raw)
  To: Eryu Guan, fstests, Dave Chinner, Jan Kara, Dan Williams,
	Christoph Hellwig, linux-nvdimm, linux-ext4

From: Ross Zwisler <ross.zwisler@linux.intel.com>

This adds a regression test for the following series:

https://lists.01.org/pipermail/linux-nvdimm/2018-July/016842.html

which adds synchronization between DAX DMA in ext4 and truncate/hole-punch.
The intention of the test is to test those specific changes, but it runs
fine both with XFS and without DAX so I've put it in the generic tests
instead of ext4 and not restricted it to only DAX configurations.

When run with v4.18-rc6 + DAX + ext4, this test will hit the following
WARN_ON_ONCE() in dax_disassociate_entry():

	WARN_ON_ONCE(trunc && page_ref_count(page) > 1);

If you change this to a WARN_ON() instead, you can see that each of the
four paths being exercised in this test hits that condition many times in
the one second that the subtest is being run.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---

Changes since v2:
 - Added detailed description to tests/generic/999 explaining the purpose
   of the test (Eryu).

 - Added _require_xfs_io_command for falloc, fpunch, fcollapse and fzero
   to account for filesystems that don't support these commands (Eryu).

 - Incorporated other feedback from Eryu.  Thanks, Eryu, for the review.

---
 .gitignore             |   1 +
 src/Makefile           |   2 +-
 src/t_mmap_collision.c | 235 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/999      |  64 ++++++++++++++
 tests/generic/999.out  |   2 +
 tests/generic/group    |   1 +
 6 files changed, 304 insertions(+), 1 deletion(-)
 create mode 100644 src/t_mmap_collision.c
 create mode 100755 tests/generic/999
 create mode 100644 tests/generic/999.out

diff --git a/.gitignore b/.gitignore
index efc73a7c..ea1aac8a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -125,6 +125,7 @@
 /src/t_holes
 /src/t_immutable
 /src/t_locks_execve
+/src/t_mmap_collision
 /src/t_mmap_cow_race
 /src/t_mmap_dio
 /src/t_mmap_fallocate
diff --git a/src/Makefile b/src/Makefile
index 9e971bcc..41826585 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -16,7 +16,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
 	holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
 	t_mmap_cow_race t_mmap_fallocate fsync-err t_mmap_write_ro \
 	t_ext4_dax_journal_corruption t_ext4_dax_inline_corruption \
-	t_ofd_locks t_locks_execve
+	t_ofd_locks t_locks_execve t_mmap_collision
 
 LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
 	preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
diff --git a/src/t_mmap_collision.c b/src/t_mmap_collision.c
new file mode 100644
index 00000000..d547bc05
--- /dev/null
+++ b/src/t_mmap_collision.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018 Intel Corporation.
+ *
+ * As of kernel version 4.18-rc6 Linux has an issue with ext4+DAX where DMA
+ * and direct I/O operations aren't synchronized with respect to operations
+ * which can change the block mappings of an inode.  This means that we can
+ * schedule an I/O for an inode and have the block mapping for that inode
+ * change before the I/O is actually complete.  So, blocks which were once
+ * allocated to a given inode and then freed could still have I/O operations
+ * happening to them.  If these blocks have also been reallocated to a
+ * different inode, this interaction can lead to data corruption.
+ *
+ * This test exercises four of the paths in ext4 which hit this issue.
+ */
+#include <errno.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#define PAGE(a) ((a)*0x1000)
+#define FILE_SIZE PAGE(4)
+
+void *dax_data;
+int nodax_fd;
+int dax_fd;
+bool done;
+
+#define err_exit(op)                                                          \
+{                                                                             \
+	fprintf(stderr, "%s %s: %s\n", __func__, op, strerror(errno));        \
+	exit(1);                                                              \
+}
+
+#if defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE)
+void punch_hole_fn(void *ptr)
+{
+	ssize_t read;
+	int rc;
+
+	while (!done) {
+		read = 0;
+
+		do {
+			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+					read);
+			if (rc > 0)
+				read += rc;
+		} while (rc > 0);
+
+		if (read != FILE_SIZE || rc != 0)
+			err_exit("pread");
+
+		rc = fallocate(dax_fd,
+				FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+				0, FILE_SIZE);
+		if (rc < 0)
+			err_exit("fallocate");
+
+		usleep(rand() % 1000);
+	}
+}
+#else
+void punch_hole_fn(void *ptr) { }
+#endif
+
+#if defined(FALLOC_FL_ZERO_RANGE) && defined(FALLOC_FL_KEEP_SIZE)
+void zero_range_fn(void *ptr)
+{
+	ssize_t read;
+	int rc;
+
+	while (!done) {
+		read = 0;
+
+		do {
+			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+					read);
+			if (rc > 0)
+				read += rc;
+		} while (rc > 0);
+
+		if (read != FILE_SIZE || rc != 0)
+			err_exit("pread");
+
+		rc = fallocate(dax_fd,
+				FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
+				0, FILE_SIZE);
+		if (rc < 0)
+			err_exit("fallocate");
+
+		usleep(rand() % 1000);
+	}
+}
+#else
+void zero_range_fn(void *ptr) { }
+#endif
+
+void truncate_down_fn(void *ptr)
+{
+	ssize_t read;
+	int rc;
+
+	while (!done) {
+		read = 0;
+
+		if (ftruncate(dax_fd, 0) < 0)
+			err_exit("ftruncate");
+		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
+			err_exit("fallocate");
+
+		do {
+			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+					read);
+			if (rc > 0)
+				read += rc;
+		} while (rc > 0);
+
+		/*
+		 * For this test we ignore errors from pread().  These errors
+		 * can happen if we try and read while the other thread has
+		 * made the file size 0.
+		 */
+
+		usleep(rand() % 1000);
+	}
+}
+
+#ifdef FALLOC_FL_COLLAPSE_RANGE
+void collapse_range_fn(void *ptr)
+{
+	ssize_t read;
+	int rc;
+
+	while (!done) {
+		read = 0;
+
+		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
+			err_exit("fallocate 1");
+		if (fallocate(dax_fd, FALLOC_FL_COLLAPSE_RANGE, 0, PAGE(1)) < 0)
+			err_exit("fallocate 2");
+		if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
+			err_exit("fallocate 3");
+
+		do {
+			rc = pread(nodax_fd, dax_data + read, FILE_SIZE - read,
+					read);
+			if (rc > 0)
+				read += rc;
+		} while (rc > 0);
+
+		/* For this test we ignore errors from pread. */
+
+		usleep(rand() % 1000);
+	}
+}
+#else
+void collapse_range_fn(void *ptr) { }
+#endif
+
+void run_test(void (*test_fn)(void *))
+{
+	const int NUM_THREADS = 2;
+	pthread_t worker_thread[NUM_THREADS];
+	int i;
+
+	done = 0;
+	for (i = 0; i < NUM_THREADS; i++)
+		pthread_create(&worker_thread[i], NULL, (void*)test_fn, NULL);
+
+	sleep(1);
+	done = 1;
+
+	for (i = 0; i < NUM_THREADS; i++)
+		pthread_join(worker_thread[i], NULL);
+}
+
+int main(int argc, char *argv[])
+{
+	int err;
+
+	if (argc != 3) {
+		printf("Usage: %s <dax file> <non-dax file>\n",
+				basename(argv[0]));
+		exit(0);
+	}
+
+	dax_fd = open(argv[1], O_RDWR|O_CREAT, S_IRUSR|S_IWUSR);
+	if (dax_fd < 0)
+		err_exit("dax_fd open");
+
+	nodax_fd = open(argv[2], O_RDWR|O_CREAT|O_DIRECT, S_IRUSR|S_IWUSR);
+	if (nodax_fd < 0)
+		err_exit("nodax_fd open");
+
+	if (ftruncate(dax_fd, 0) < 0)
+		err_exit("dax_fd ftruncate");
+	if (fallocate(dax_fd, 0, 0, FILE_SIZE) < 0)
+		err_exit("dax_fd fallocate");
+
+	if (ftruncate(nodax_fd, 0) < 0)
+		err_exit("nodax_fd ftruncate");
+	if (fallocate(nodax_fd, 0, 0, FILE_SIZE) < 0)
+		err_exit("nodax_fd fallocate");
+
+	dax_data = mmap(NULL, FILE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED,
+			dax_fd, 0);
+	if (dax_data == MAP_FAILED)
+		err_exit("mmap");
+
+	run_test(&punch_hole_fn);
+	run_test(&zero_range_fn);
+	run_test(&truncate_down_fn);
+	run_test(&collapse_range_fn);
+
+	if (munmap(dax_data, FILE_SIZE) != 0)
+		err_exit("munmap");
+
+	err = close(dax_fd);
+	if (err < 0)
+		err_exit("dax_fd close");
+
+	err = close(nodax_fd);
+	if (err < 0)
+		err_exit("nodax_fd close");
+
+	return 0;
+}
diff --git a/tests/generic/999 b/tests/generic/999
new file mode 100755
index 00000000..0681b075
--- /dev/null
+++ b/tests/generic/999
@@ -0,0 +1,64 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2018 Intel Corporation.  All Rights Reserved.
+#
+# FS QA Test generic/999
+#
+# This is a regression test for kernel patch:
+#   ext4: handle layout changes to pinned DAX mapping
+#
+# This test exercises each of the DAX paths in ext4 which remove blocks from
+# an inode's block map.  This includes things like hole punch, truncate down,
+# etc.  This test was written to regression test errors seen with an ext4 +
+# DAX setup, but the test runs fine with or without DAX and with XFS so we
+# don't require the DAX mount option or a specific filesystem for the test.
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_test
+_require_scratch
+_require_test_program "t_mmap_collision"
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "fcollapse"
+_require_xfs_io_command "fzero"
+
+_scratch_mkfs >> $seqres.full 2>&1
+# To get the failure we turn off DAX on our SCRATCH_MNT so we can get O_DIRECT
+# behavior.  We will continue to use unmodified mount options for the test
+# TEST_DIR.  The failures fixed by the above mentioned kernel patch trigger
+# when those mount options include "-o dax", but the test runs fine without
+# that option so we don't require it.
+export MOUNT_OPTIONS=""
+_scratch_mount >> $seqres.full 2>&1
+
+# real QA test starts here
+$here/src/t_mmap_collision $TEST_DIR/testfile $SCRATCH_MNT/testfile
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/999.out b/tests/generic/999.out
new file mode 100644
index 00000000..3b276ca8
--- /dev/null
+++ b/tests/generic/999.out
@@ -0,0 +1,2 @@
+QA output created by 999
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index d0b7dcf6..92016cf4 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -505,3 +505,4 @@
 500 auto thin trim
 501 auto quick clone log
 502 auto quick log
+999 auto quick dax punch collapse zero
-- 
2.14.4

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-07-27 21:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-10 22:09 [fstests PATCH v2 0/2] Add test for DAX+ext4+DMA race Ross Zwisler
2018-07-10 22:09 ` [fstests PATCH v2 1/2] src/: add license and copyright info to files Ross Zwisler
2018-07-14 12:54   ` Eryu Guan
2018-07-10 22:09 ` [fstests PATCH v2 2/2] generic/999: test DAX DMA vs truncate/hole-punch Ross Zwisler
2018-07-14  1:41   ` Dave Chinner
2018-07-16 16:07     ` Ross Zwisler
2018-07-14 12:53   ` Eryu Guan
2018-07-27 21:12     ` [fstests PATCH v3] " Ross Zwisler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).