linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 0/7] Fix DM DAX handling
@ 2018-05-25  2:55 Ross Zwisler
  2018-05-25  2:55 ` [PATCH 1/7] fs: allow per-device dax status checking for filesystems Ross Zwisler
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25  2:55 UTC (permalink / raw)
  To: Toshi Kani, Mike Snitzer, dm-devel
  Cc: linux-fsdevel, linux-kernel, linux-nvdimm, Ross Zwisler

This series fixes a few issues that I found with DM's handling of DAX
devices.  Here are some of the issues I found:

 * We can create a dm-stripe or dm-linear device which is made up of an
   fsdax PMEM namespace and a raw PMEM namespace but which can hold a
   filesystem mounted with the -o dax mount option.  DAX operations to
   the raw PMEM namespace part lack struct page and can fail in
   interesting/unexpected ways when doing things like fork(), examining
   memory with gdb, etc.

 * We can create a dm-stripe or dm-linear device which is made up of an
   fsdax PMEM namespace and a BRD ramdisk which can hold a filesystem
   mounted with the -o dax mount option.  All I/O to this filesystem
   will fail.

 * In DM you can't transition a dm target which could possibly support
   DAX (mode DM_TYPE_DAX_BIO_BASED) to one which can't support DAX
   (mode DM_TYPE_BIO_BASED), even if you never use DAX.

The first 2 patches in this series are prep work from Darrick and Dave
which improve bdev_dax_supported().  The last 5 patches fix the
above-mentioned problems in DM.  I feel that this series simplifies the
handling of DAX devices in DM, and the last 5 DM-related patches have a
net code reduction of 50 lines.


Darrick J. Wong (1):
  fs: allow per-device dax status checking for filesystems

Dave Jiang (1):
  dax: change bdev_dax_supported() to support boolean returns

Ross Zwisler (5):
  dm: fix test for DAX device support
  dm: prevent DAX mounts if not supported
  dm: remove DM_TYPE_DAX_BIO_BASED dm_queue_mode
  dm-snap: remove unnecessary direct_access() stub
  dm-error: remove unnecessary direct_access() stub

 drivers/dax/super.c           | 44 +++++++++++++++++++++----------------------
 drivers/md/dm-ioctl.c         | 16 ++++++----------
 drivers/md/dm-snap.c          |  8 --------
 drivers/md/dm-table.c         | 29 +++++++++++-----------------
 drivers/md/dm-target.c        |  7 -------
 drivers/md/dm.c               |  7 ++-----
 fs/ext2/super.c               |  3 +--
 fs/ext4/super.c               |  3 +--
 fs/xfs/xfs_ioctl.c            |  3 ++-
 fs/xfs/xfs_iops.c             | 30 ++++++++++++++++++++++++-----
 fs/xfs/xfs_super.c            | 10 ++++++++--
 include/linux/dax.h           | 12 ++++--------
 include/linux/device-mapper.h |  8 ++++++--
 13 files changed, 88 insertions(+), 92 deletions(-)

-- 
2.14.3


* [PATCH 1/7] fs: allow per-device dax status checking for filesystems
  2018-05-25  2:55 [PATCH 0/7] Fix DM DAX handling Ross Zwisler
@ 2018-05-25  2:55 ` Ross Zwisler
  2018-05-25  5:02   ` Darrick J. Wong
  2018-05-26 14:07   ` kbuild test robot
  2018-05-25  2:55 ` [PATCH 2/7] dax: change bdev_dax_supported() to support boolean returns Ross Zwisler
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25  2:55 UTC (permalink / raw)
  To: Toshi Kani, Mike Snitzer, dm-devel
  Cc: linux-fsdevel, linux-kernel, linux-nvdimm, Darrick J. Wong, Ross Zwisler

From: "Darrick J. Wong" <darrick.wong@oracle.com>

Remove __bdev_dax_supported() and change bdev_dax_supported() to take a
bdev parameter.  This enables multi-device filesystems like xfs to check
that a dax device can work for the particular filesystem.  Once that's
in place, actually fix all the parts of XFS where we need to be able to
distinguish between datadev and rtdev.

This patch fixes the problem where we screw up the dax support checking
in xfs if the datadev and rtdev have different dax capabilities.
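
As a rough userspace illustration of the per-device check (not kernel
code): the model_* types and helpers below are hypothetical stand-ins
for struct block_device, the XFS mount and the XFS inode, and the
device selection only mirrors what xfs_find_bdev_for_inode() is
understood to do.  The return convention is the 0/-errno one used at
this point in the series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct block_device; not the kernel type. */
struct model_bdev {
	const char *name;
	bool dax_capable;
};

/* Hypothetical mount with a data device and an optional realtime device. */
struct model_mount {
	const struct model_bdev *datadev;
	const struct model_bdev *rtdev;	/* NULL when there is no rtdev */
};

/* Hypothetical inode: only records which device backs its data. */
struct model_inode {
	const struct model_mount *mount;
	bool realtime;
};

/* Model of the per-device check: 0 if supported, negative errno if not. */
int model_bdev_dax_supported(const struct model_bdev *bdev)
{
	assert(bdev != NULL);
	return bdev->dax_capable ? 0 : -EOPNOTSUPP;
}

/* Pick the device that holds this inode's data. */
const struct model_bdev *model_bdev_for_inode(const struct model_inode *ip)
{
	return ip->realtime ? ip->mount->rtdev : ip->mount->datadev;
}

/* The answer now depends on the inode's device, not on the superblock. */
bool model_inode_can_use_dax(const struct model_inode *ip)
{
	return model_bdev_dax_supported(model_bdev_for_inode(ip)) == 0;
}
```

With a DAX-capable datadev and a non-DAX rtdev, a regular inode passes
the check and a realtime inode does not, which a single
superblock-wide answer cannot express.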

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 drivers/dax/super.c | 30 +++++++++++++++---------------
 fs/ext2/super.c     |  2 +-
 fs/ext4/super.c     |  2 +-
 fs/xfs/xfs_ioctl.c  |  3 ++-
 fs/xfs/xfs_iops.c   | 30 +++++++++++++++++++++++++-----
 fs/xfs/xfs_super.c  | 10 ++++++++--
 include/linux/dax.h | 10 +++-------
 7 files changed, 55 insertions(+), 32 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 2b2332b605e4..9206539c8330 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -73,8 +73,8 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
 #endif
 
 /**
- * __bdev_dax_supported() - Check if the device supports dax for filesystem
- * @sb: The superblock of the device
+ * bdev_dax_supported() - Check if the device supports dax for filesystem
+ * @bdev: block device to check
  * @blocksize: The block size of the device
  *
  * This is a library function for filesystems to check if the block device
@@ -82,33 +82,33 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
  *
  * Return: negative errno if unsupported, 0 if supported.
  */
-int __bdev_dax_supported(struct super_block *sb, int blocksize)
+int bdev_dax_supported(struct block_device *bdev, int blocksize)
 {
-	struct block_device *bdev = sb->s_bdev;
 	struct dax_device *dax_dev;
 	pgoff_t pgoff;
 	int err, id;
 	void *kaddr;
 	pfn_t pfn;
 	long len;
+	char buf[BDEVNAME_SIZE];
 
 	if (blocksize != PAGE_SIZE) {
-		pr_debug("VFS (%s): error: unsupported blocksize for dax\n",
-				sb->s_id);
+		pr_debug("%s: error: unsupported blocksize for dax\n",
+				bdevname(bdev, buf));
 		return -EINVAL;
 	}
 
 	err = bdev_dax_pgoff(bdev, 0, PAGE_SIZE, &pgoff);
 	if (err) {
-		pr_debug("VFS (%s): error: unaligned partition for dax\n",
-				sb->s_id);
+		pr_debug("%s: error: unaligned partition for dax\n",
+				bdevname(bdev, buf));
 		return err;
 	}
 
 	dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
 	if (!dax_dev) {
-		pr_debug("VFS (%s): error: device does not support dax\n",
-				sb->s_id);
+		pr_debug("%s: error: device does not support dax\n",
+				bdevname(bdev, buf));
 		return -EOPNOTSUPP;
 	}
 
@@ -119,8 +119,8 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
 	put_dax(dax_dev);
 
 	if (len < 1) {
-		pr_debug("VFS (%s): error: dax access failed (%ld)\n",
-				sb->s_id, len);
+		pr_debug("%s: error: dax access failed (%ld)\n",
+				bdevname(bdev, buf), len);
 		return len < 0 ? len : -EIO;
 	}
 
@@ -137,14 +137,14 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
 	} else if (pfn_t_devmap(pfn)) {
 		/* pass */;
 	} else {
-		pr_debug("VFS (%s): error: dax support not enabled\n",
-				sb->s_id);
+		pr_debug("%s: error: dax support not enabled\n",
+				bdevname(bdev, buf));
 		return -EOPNOTSUPP;
 	}
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(__bdev_dax_supported);
+EXPORT_SYMBOL_GPL(bdev_dax_supported);
 #endif
 
 enum dax_device_flags {
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index de1694512f1f..9627c3054b5c 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -961,7 +961,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
 	blocksize = BLOCK_SIZE << le32_to_cpu(sbi->s_es->s_log_block_size);
 
 	if (sbi->s_mount_opt & EXT2_MOUNT_DAX) {
-		err = bdev_dax_supported(sb, blocksize);
+		err = bdev_dax_supported(sb->s_bdev, blocksize);
 		if (err) {
 			ext2_msg(sb, KERN_ERR,
 				"DAX unsupported by block device. Turning off DAX.");
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index eb104e8476f0..089170e99895 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3732,7 +3732,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 					" that may contain inline data");
 			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
 		}
-		err = bdev_dax_supported(sb, blocksize);
+		err = bdev_dax_supported(sb->s_bdev, blocksize);
 		if (err) {
 			ext4_msg(sb, KERN_ERR,
 				"DAX unsupported by block device. Turning off DAX.");
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 89fb1eb80aae..0effd46b965f 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1103,7 +1103,8 @@ xfs_ioctl_setattr_dax_invalidate(
 	if (fa->fsx_xflags & FS_XFLAG_DAX) {
 		if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
 			return -EINVAL;
-		if (bdev_dax_supported(sb, sb->s_blocksize) < 0)
+		if (bdev_dax_supported(xfs_find_bdev_for_inode(VFS_I(ip)),
+				sb->s_blocksize) < 0)
 			return -EINVAL;
 	}
 
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index a3ed3c811dfa..6e83acf74a95 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1195,6 +1195,30 @@ static const struct inode_operations xfs_inline_symlink_inode_operations = {
 	.update_time		= xfs_vn_update_time,
 };
 
+/* Figure out if this file actually supports DAX. */
+static bool
+xfs_inode_supports_dax(
+	struct xfs_inode	*ip)
+{
+	struct xfs_mount	*mp = ip->i_mount;
+
+	/* Only supported on non-reflinked files. */
+	if (!S_ISREG(VFS_I(ip)->i_mode) || xfs_is_reflink_inode(ip))
+		return false;
+
+	/* DAX mount option or DAX iflag must be set. */
+	if (!(mp->m_flags & XFS_MOUNT_DAX) &&
+	    !(ip->i_d.di_flags2 & XFS_DIFLAG2_DAX))
+		return false;
+
+	/* Block size must match page size */
+	if (mp->m_sb.sb_blocksize != PAGE_SIZE)
+		return false;
+
+	/* Device has to support DAX too. */
+	return xfs_find_daxdev_for_inode(VFS_I(ip)) != NULL;
+}
+
 STATIC void
 xfs_diflags_to_iflags(
 	struct inode		*inode,
@@ -1213,11 +1237,7 @@ xfs_diflags_to_iflags(
 		inode->i_flags |= S_SYNC;
 	if (flags & XFS_DIFLAG_NOATIME)
 		inode->i_flags |= S_NOATIME;
-	if (S_ISREG(inode->i_mode) &&
-	    ip->i_mount->m_sb.sb_blocksize == PAGE_SIZE &&
-	    !xfs_is_reflink_inode(ip) &&
-	    (ip->i_mount->m_flags & XFS_MOUNT_DAX ||
-	     ip->i_d.di_flags2 & XFS_DIFLAG2_DAX))
+	if (xfs_inode_supports_dax(ip))
 		inode->i_flags |= S_DAX;
 }
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index d71424052917..62188c2a4c36 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1690,11 +1690,17 @@ xfs_fs_fill_super(
 		sb->s_flags |= SB_I_VERSION;
 
 	if (mp->m_flags & XFS_MOUNT_DAX) {
+		int	error2 = 0;
+
 		xfs_warn(mp,
 		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
 
-		error = bdev_dax_supported(sb, sb->s_blocksize);
-		if (error) {
+		error = bdev_dax_supported(mp->m_ddev_targp->bt_bdev,
+				sb->s_blocksize);
+		if (mp->m_rtdev_targp)
+			error2 = bdev_dax_supported(mp->m_rtdev_targp->bt_bdev,
+					sb->s_blocksize);
+		if (error && error2) {
 			xfs_alert(mp,
 			"DAX unsupported by block device. Turning off DAX.");
 			mp->m_flags &= ~XFS_MOUNT_DAX;
diff --git a/include/linux/dax.h b/include/linux/dax.h
index f9eb22ad341e..509a85ac8470 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -64,12 +64,7 @@ static inline bool dax_write_cache_enabled(struct dax_device *dax_dev)
 struct writeback_control;
 int bdev_dax_pgoff(struct block_device *, sector_t, size_t, pgoff_t *pgoff);
 #if IS_ENABLED(CONFIG_FS_DAX)
-int __bdev_dax_supported(struct super_block *sb, int blocksize);
-static inline int bdev_dax_supported(struct super_block *sb, int blocksize)
-{
-	return __bdev_dax_supported(sb, blocksize);
-}
-
+int bdev_dax_supported(struct block_device *bdev, int blocksize);
 static inline struct dax_device *fs_dax_get_by_host(const char *host)
 {
 	return dax_get_by_host(host);
@@ -84,7 +79,8 @@ struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev);
 int dax_writeback_mapping_range(struct address_space *mapping,
 		struct block_device *bdev, struct writeback_control *wbc);
 #else
-static inline int bdev_dax_supported(struct super_block *sb, int blocksize)
+static inline int bdev_dax_supported(struct block_device *bdev,
+		int blocksize)
 {
 	return -EOPNOTSUPP;
 }
-- 
2.14.3


* [PATCH 2/7] dax: change bdev_dax_supported() to support boolean returns
  2018-05-25  2:55 [PATCH 0/7] Fix DM DAX handling Ross Zwisler
  2018-05-25  2:55 ` [PATCH 1/7] fs: allow per-device dax status checking for filesystems Ross Zwisler
@ 2018-05-25  2:55 ` Ross Zwisler
  2018-05-25  5:01   ` Darrick J. Wong
  2018-05-25  2:55 ` [PATCH 3/7] dm: fix test for DAX device support Ross Zwisler
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25  2:55 UTC (permalink / raw)
  To: Toshi Kani, Mike Snitzer, dm-devel
  Cc: linux-fsdevel, linux-kernel, linux-nvdimm, Dave Jiang, Ross Zwisler

From: Dave Jiang <dave.jiang@intel.com>

The function's return values are confusing given its name.  We expect
a true or false return value, but it actually returns 0/-errno, which
makes callers hard to read.  Change the function to return a bool:
true if DAX is supported and false if it is not.
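
A small userspace sketch of the confusion, for illustration only.  The
function names are made up for this example and merely model the two
return conventions; they are not the kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Old convention (model): 0 if supported, negative errno if not. */
int old_dax_supported(bool capable)
{
	return capable ? 0 : -EOPNOTSUPP;
}

/* New convention (model): true if supported, false if not. */
bool new_dax_supported(bool capable)
{
	return capable;
}

/* A caller that reads the name as a predicate inverts the old result. */
bool naive_old_caller_enables_dax(bool capable)
{
	if (old_dax_supported(capable))	/* nonzero here means an error */
		return true;
	return false;
}

/* The same reading is correct under the new convention. */
bool new_caller_enables_dax(bool capable)
{
	if (new_dax_supported(capable))
		return true;
	return false;
}

/* Self-check of the behaviour described above. */
void check_conventions(void)
{
	assert(!naive_old_caller_enables_dax(true));
	assert(naive_old_caller_enables_dax(false));
	assert(new_caller_enables_dax(true));
	assert(!new_caller_enables_dax(false));
}
```

The point is only that "if (xxx_supported(...))" reads naturally as a
predicate, so a bool return removes one easy way to get a call site
backwards.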

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 drivers/dax/super.c | 16 ++++++++--------
 fs/ext2/super.c     |  3 +--
 fs/ext4/super.c     |  3 +--
 fs/xfs/xfs_ioctl.c  |  4 ++--
 fs/xfs/xfs_super.c  | 12 ++++++------
 include/linux/dax.h |  6 +++---
 6 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 9206539c8330..e5447eddecf8 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -80,9 +80,9 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
  * This is a library function for filesystems to check if the block device
  * can be mounted with dax option.
  *
- * Return: negative errno if unsupported, 0 if supported.
+ * Return: true if supported, false if unsupported
  */
-int bdev_dax_supported(struct block_device *bdev, int blocksize)
+bool bdev_dax_supported(struct block_device *bdev, int blocksize)
 {
 	struct dax_device *dax_dev;
 	pgoff_t pgoff;
@@ -95,21 +95,21 @@ int bdev_dax_supported(struct block_device *bdev, int blocksize)
 	if (blocksize != PAGE_SIZE) {
 		pr_debug("%s: error: unsupported blocksize for dax\n",
 				bdevname(bdev, buf));
-		return -EINVAL;
+		return false;
 	}
 
 	err = bdev_dax_pgoff(bdev, 0, PAGE_SIZE, &pgoff);
 	if (err) {
 		pr_debug("%s: error: unaligned partition for dax\n",
 				bdevname(bdev, buf));
-		return err;
+		return false;
 	}
 
 	dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
 	if (!dax_dev) {
 		pr_debug("%s: error: device does not support dax\n",
 				bdevname(bdev, buf));
-		return -EOPNOTSUPP;
+		return false;
 	}
 
 	id = dax_read_lock();
@@ -121,7 +121,7 @@ int bdev_dax_supported(struct block_device *bdev, int blocksize)
 	if (len < 1) {
 		pr_debug("%s: error: dax access failed (%ld)\n",
 				bdevname(bdev, buf), len);
-		return len < 0 ? len : -EIO;
+		return false;
 	}
 
 	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn)) {
@@ -139,10 +139,10 @@ int bdev_dax_supported(struct block_device *bdev, int blocksize)
 	} else {
 		pr_debug("%s: error: dax support not enabled\n",
 				bdevname(bdev, buf));
-		return -EOPNOTSUPP;
+		return false;
 	}
 
-	return 0;
+	return true;
 }
 EXPORT_SYMBOL_GPL(bdev_dax_supported);
 #endif
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 9627c3054b5c..c09289a42dc5 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -961,8 +961,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
 	blocksize = BLOCK_SIZE << le32_to_cpu(sbi->s_es->s_log_block_size);
 
 	if (sbi->s_mount_opt & EXT2_MOUNT_DAX) {
-		err = bdev_dax_supported(sb->s_bdev, blocksize);
-		if (err) {
+		if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
 			ext2_msg(sb, KERN_ERR,
 				"DAX unsupported by block device. Turning off DAX.");
 			sbi->s_mount_opt &= ~EXT2_MOUNT_DAX;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 089170e99895..2e1622907f4a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3732,8 +3732,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 					" that may contain inline data");
 			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
 		}
-		err = bdev_dax_supported(sb->s_bdev, blocksize);
-		if (err) {
+		if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
 			ext4_msg(sb, KERN_ERR,
 				"DAX unsupported by block device. Turning off DAX.");
 			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 0effd46b965f..2c70a0a4f59f 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1103,8 +1103,8 @@ xfs_ioctl_setattr_dax_invalidate(
 	if (fa->fsx_xflags & FS_XFLAG_DAX) {
 		if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
 			return -EINVAL;
-		if (bdev_dax_supported(xfs_find_bdev_for_inode(VFS_I(ip)),
-				sb->s_blocksize) < 0)
+		if (!bdev_dax_supported(xfs_find_bdev_for_inode(VFS_I(ip)),
+				sb->s_blocksize))
 			return -EINVAL;
 	}
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 62188c2a4c36..86915dc40eed 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1690,17 +1690,17 @@ xfs_fs_fill_super(
 		sb->s_flags |= SB_I_VERSION;
 
 	if (mp->m_flags & XFS_MOUNT_DAX) {
-		int	error2 = 0;
+		bool rtdev_is_dax = false, datadev_is_dax;
 
 		xfs_warn(mp,
 		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
 
-		error = bdev_dax_supported(mp->m_ddev_targp->bt_bdev,
-				sb->s_blocksize);
+		datadev_is_dax = bdev_dax_supported(mp->m_ddev_targp->bt_bdev,
+			sb->s_blocksize);
 		if (mp->m_rtdev_targp)
-			error2 = bdev_dax_supported(mp->m_rtdev_targp->bt_bdev,
-					sb->s_blocksize);
-		if (error && error2) {
+			rtdev_is_dax = bdev_dax_supported(
+				mp->m_rtdev_targp->bt_bdev, sb->s_blocksize);
+		if (!rtdev_is_dax && !datadev_is_dax) {
 			xfs_alert(mp,
 			"DAX unsupported by block device. Turning off DAX.");
 			mp->m_flags &= ~XFS_MOUNT_DAX;
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 509a85ac8470..547eb33dbc9e 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -64,7 +64,7 @@ static inline bool dax_write_cache_enabled(struct dax_device *dax_dev)
 struct writeback_control;
 int bdev_dax_pgoff(struct block_device *, sector_t, size_t, pgoff_t *pgoff);
 #if IS_ENABLED(CONFIG_FS_DAX)
-int bdev_dax_supported(struct block_device *bdev, int blocksize);
+bool bdev_dax_supported(struct block_device *bdev, int blocksize);
 static inline struct dax_device *fs_dax_get_by_host(const char *host)
 {
 	return dax_get_by_host(host);
@@ -79,10 +79,10 @@ struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev);
 int dax_writeback_mapping_range(struct address_space *mapping,
 		struct block_device *bdev, struct writeback_control *wbc);
 #else
-static inline int bdev_dax_supported(struct block_device *bdev,
+static inline bool bdev_dax_supported(struct block_device *bdev,
 		int blocksize)
 {
-	return -EOPNOTSUPP;
+	return false;
 }
 
 static inline struct dax_device *fs_dax_get_by_host(const char *host)
-- 
2.14.3


* [PATCH 3/7] dm: fix test for DAX device support
  2018-05-25  2:55 [PATCH 0/7] Fix DM DAX handling Ross Zwisler
  2018-05-25  2:55 ` [PATCH 1/7] fs: allow per-device dax status checking for filesystems Ross Zwisler
  2018-05-25  2:55 ` [PATCH 2/7] dax: change bdev_dax_supported() to support boolean returns Ross Zwisler
@ 2018-05-25  2:55 ` Ross Zwisler
  2018-05-25  2:55 ` [PATCH 4/7] dm: prevent DAX mounts if not supported Ross Zwisler
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25  2:55 UTC (permalink / raw)
  To: Toshi Kani, Mike Snitzer, dm-devel
  Cc: linux-fsdevel, linux-kernel, linux-nvdimm, Ross Zwisler

Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
flag is set on the device's request queue to decide whether or not the
device supports filesystem DAX.  This is insufficient because there are
devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but
which don't actually support DAX.

This means that you could create a dm-linear device, for example, where the
first part of the dm-linear device was a PMEM namespace in fsdax mode and
the second part was a PMEM namespace in raw mode.  Both DM and the
filesystem you put on that dm-linear device would think the whole device
supports DAX, which would lead to bad behavior once DAX accesses to the
raw PMEM namespace part needed struct page for something.

Fix this by using bdev_dax_supported() like filesystems do at mount time.
This checks for raw mode and also performs other tests like checking to
make sure the dax_direct_access() path works.
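
To make the difference concrete, here is a simplified userspace model
(an assumption-laden sketch, not the kernel implementation).  The
model_dev fields are hypothetical: queue_flag_dax stands for
QUEUE_FLAG_DAX and pfn_has_struct_page stands for the part of the
bdev_dax_supported() probe that a raw-mode namespace fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device underneath a DM table. */
struct model_dev {
	const char *name;
	bool queue_flag_dax;		/* models QUEUE_FLAG_DAX */
	bool pfn_has_struct_page;	/* true for fsdax, false for raw */
};

typedef bool (*dev_test_fn)(const struct model_dev *dev);

/* Old test: trust the request queue flag alone. */
bool old_device_supports_dax(const struct model_dev *dev)
{
	return dev->queue_flag_dax;
}

/* New test (model of the probe): the flag alone is not enough. */
bool new_device_supports_dax(const struct model_dev *dev)
{
	return dev->queue_flag_dax && dev->pfn_has_struct_page;
}

/* A table supports DAX only if every underlying device passes the test. */
bool model_table_supports_dax(const struct model_dev *devs, size_t n,
			      dev_test_fn test)
{
	assert(devs != NULL && test != NULL);
	for (size_t i = 0; i < n; i++)
		if (!test(&devs[i]))
			return false;
	return true;
}
```

For a table made of one fsdax namespace and one raw namespace, the old
test accepts the table and the new test rejects it.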

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: 545ed20e6df6 ("dm: add infrastructure for DAX support")
---
 drivers/md/dm-table.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 0589a4da12bb..5bb994b012ca 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -885,9 +885,7 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
 static int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
 			       sector_t start, sector_t len, void *data)
 {
-	struct request_queue *q = bdev_get_queue(dev->bdev);
-
-	return q && blk_queue_dax(q);
+	return bdev_dax_supported(dev->bdev, PAGE_SIZE);
 }
 
 static bool dm_table_supports_dax(struct dm_table *t)
-- 
2.14.3


* [PATCH 4/7] dm: prevent DAX mounts if not supported
  2018-05-25  2:55 [PATCH 0/7] Fix DM DAX handling Ross Zwisler
                   ` (2 preceding siblings ...)
  2018-05-25  2:55 ` [PATCH 3/7] dm: fix test for DAX device support Ross Zwisler
@ 2018-05-25  2:55 ` Ross Zwisler
  2018-05-25 19:54   ` Mike Snitzer
  2018-05-25  2:55 ` [PATCH 5/7] dm: remove DM_TYPE_DAX_BIO_BASED dm_queue_mode Ross Zwisler
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25  2:55 UTC (permalink / raw)
  To: Toshi Kani, Mike Snitzer, dm-devel
  Cc: linux-fsdevel, linux-kernel, linux-nvdimm, Ross Zwisler

Currently the code in dm_dax_direct_access() only checks whether the target
type has a direct_access() operation defined, not whether the underlying
block devices all support DAX.  This latter property can be seen by looking
at whether we set the QUEUE_FLAG_DAX request queue flag when creating the
DM device.

This is problematic if we have, for example, a dm-linear device made up of
a PMEM namespace in fsdax mode followed by a ramdisk from BRD.
QUEUE_FLAG_DAX won't be set on the dm-linear device's request queue, but
we have a working direct_access() entry point and the first member of the
dm-linear set *does* support DAX.

This allows the user to create a filesystem on the dm-linear device, and
then mount it with DAX.  The filesystem's bdev_dax_supported() test will
pass because it'll operate on the first member of the dm-linear device,
which happens to be a fsdax PMEM namespace.

All DAX I/O to that dm-linear device will then fail because the lack of
QUEUE_FLAG_DAX prevents fs_dax_get_by_bdev() from working.  This means that
the struct dax_device isn't ever set in the filesystem, so
dax_direct_access() will always return -EOPNOTSUPP.

By failing out of dm_dax_direct_access() if QUEUE_FLAG_DAX isn't set we let
the filesystem know we don't support DAX at mount time.  The filesystem
will then silently fall back and remove the dax mount option, causing it to
work properly.
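
The change of gate can be modelled in userspace roughly as follows.
This is a sketch under stated assumptions: model_md and model_target
are hypothetical stand-ins, the -EIO failure value is assumed, and the
model relies on the invariant that a DM device only has the DAX flag
set when its target type provides direct_access().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a DM target type; direct_access may be NULL. */
struct model_target {
	long (*direct_access)(long nr_pages);
};

/* Hypothetical model of the mapped device. */
struct model_md {
	bool queue_flag_dax;	/* models QUEUE_FLAG_DAX on md->queue */
	const struct model_target *ti;
};

/* Stand-in for a working entry point: maps every page asked for. */
long model_linear_direct_access(long nr_pages)
{
	return nr_pages;
}

/* Old gate: only asks whether the target type has a direct_access() op. */
long old_dm_direct_access(const struct model_md *md, long nr_pages)
{
	assert(md != NULL);
	if (!md->ti || !md->ti->direct_access)
		return -EIO;
	return md->ti->direct_access(nr_pages);
}

/* New gate: refuses unless the DM device's own queue advertises DAX. */
long new_dm_direct_access(const struct model_md *md, long nr_pages)
{
	assert(md != NULL);
	if (!md->ti || !md->queue_flag_dax)
		return -EIO;
	/* Assumed invariant: the flag implies the op exists. */
	assert(md->ti->direct_access != NULL);
	return md->ti->direct_access(nr_pages);
}
```

For a dm-linear device without the flag (fsdax PMEM followed by a
ramdisk), the old gate still hands out a mapping while the new gate
fails, so the mount-time probe fails as well.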

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: 545ed20e6df6 ("dm: add infrastructure for DAX support")
---
 drivers/md/dm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 0a7b0107ca78..9728433362d1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1050,14 +1050,13 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
 
 	if (!ti)
 		goto out;
-	if (!ti->type->direct_access)
+	if (!blk_queue_dax(md->queue))
 		goto out;
 	len = max_io_len(sector, ti) / PAGE_SECTORS;
 	if (len < 1)
 		goto out;
 	nr_pages = min(len, nr_pages);
-	if (ti->type->direct_access)
-		ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
+	ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
 
  out:
 	dm_put_live_table(md, srcu_idx);
-- 
2.14.3


* [PATCH 5/7] dm: remove DM_TYPE_DAX_BIO_BASED dm_queue_mode
  2018-05-25  2:55 [PATCH 0/7] Fix DM DAX handling Ross Zwisler
                   ` (3 preceding siblings ...)
  2018-05-25  2:55 ` [PATCH 4/7] dm: prevent DAX mounts if not supported Ross Zwisler
@ 2018-05-25  2:55 ` Ross Zwisler
  2018-05-25  2:55 ` [PATCH 6/7] dm-snap: remove unnecessary direct_access() stub Ross Zwisler
  2018-05-25  2:55 ` [PATCH 7/7] dm-error: " Ross Zwisler
  6 siblings, 0 replies; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25  2:55 UTC (permalink / raw)
  To: Toshi Kani, Mike Snitzer, dm-devel
  Cc: linux-fsdevel, linux-kernel, linux-nvdimm, Ross Zwisler

The DM_TYPE_DAX_BIO_BASED dm_queue_mode was introduced to prevent DM
devices that could possibly support DAX from transitioning into DM devices
that cannot support DAX.

For example, the following transition will currently fail:

 dm-linear: [fsdax pmem][fsdax pmem] => [fsdax pmem][fsdax raw]
	      DM_TYPE_DAX_BIO_BASED       DM_TYPE_BIO_BASED

but these will both succeed:

 dm-linear: [fsdax pmem][brd ramdisk] => [fsdax pmem][fsdax raw]
 		DM_TYPE_BIO_BASED        DM_TYPE_BIO_BASED

 dm-linear: [fsdax pmem][fsdax raw] => [fsdax pmem][fsdax pmem]
 		DM_TYPE_BIO_BASED        DM_TYPE_DAX_BIO_BASED

This seems arbitrary, as really the choice on whether to use DAX happens at
filesystem mount time.  There's no guarantee that in the first case
(double fsdax pmem) we were using the dax mount option with our file
system.

Instead, get rid of DM_TYPE_DAX_BIO_BASED and all the special casing around
it, and instead make the request queue's QUEUE_FLAG_DAX be our one source
of truth.  If this is set, we can use DAX, and if not, not.  We keep this
up to date in table_load() as the table changes.  As with regular block
devices the filesystem will then know at mount time whether DAX is a
supported mount option or not.
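
A userspace sketch of the two schemes, for illustration only.  The
MODEL_* enum values and the model_md structure are hypothetical; the
old rule is written to mirror is_valid_type(), and the new scheme
simply recomputes a flag on every table load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative subset of queue modes; values are not the kernel's. */
enum model_queue_mode {
	MODEL_BIO_BASED,
	MODEL_DAX_BIO_BASED,
};

/* Old rule: same type, or an upgrade from bio-based to DAX bio-based. */
bool old_transition_ok(enum model_queue_mode cur,
		       enum model_queue_mode new_mode)
{
	return cur == new_mode ||
	       (cur == MODEL_BIO_BASED && new_mode == MODEL_DAX_BIO_BASED);
}

/* New scheme: one bio-based mode plus a flag on the device's queue. */
struct model_md {
	bool queue_flag_dax;	/* models QUEUE_FLAG_DAX on md->queue */
};

/* Every table load sets or clears the flag from the new table. */
void model_table_load(struct model_md *md, bool table_supports_dax)
{
	assert(md != NULL);
	md->queue_flag_dax = table_supports_dax;
}
```

Under the old rule the DAX-to-non-DAX direction is rejected outright;
under the new scheme the same reload just clears the flag, and a later
reload with all-DAX devices sets it again.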

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 drivers/md/dm-ioctl.c         | 16 ++++++----------
 drivers/md/dm-table.c         | 25 ++++++++++---------------
 drivers/md/dm.c               |  2 --
 include/linux/device-mapper.h |  8 ++++++--
 4 files changed, 22 insertions(+), 29 deletions(-)

diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index 5acf77de5945..d1f86d0bb2d0 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1292,15 +1292,6 @@ static int populate_table(struct dm_table *table,
 	return dm_table_complete(table);
 }
 
-static bool is_valid_type(enum dm_queue_mode cur, enum dm_queue_mode new)
-{
-	if (cur == new ||
-	    (cur == DM_TYPE_BIO_BASED && new == DM_TYPE_DAX_BIO_BASED))
-		return true;
-
-	return false;
-}
-
 static int table_load(struct file *filp, struct dm_ioctl *param, size_t param_size)
 {
 	int r;
@@ -1343,12 +1334,17 @@ static int table_load(struct file *filp, struct dm_ioctl *param, size_t param_si
 			DMWARN("unable to set up device queue for new table.");
 			goto err_unlock_md_type;
 		}
-	} else if (!is_valid_type(dm_get_md_type(md), dm_table_get_type(t))) {
+	} else if (dm_get_md_type(md) != dm_table_get_type(t)) {
 		DMWARN("can't change device type after initial table load.");
 		r = -EINVAL;
 		goto err_unlock_md_type;
 	}
 
+	if (dm_table_supports_dax(t))
+		blk_queue_flag_set(QUEUE_FLAG_DAX, md->queue);
+	else
+		blk_queue_flag_clear(QUEUE_FLAG_DAX, md->queue);
+
 	dm_unlock_md_type(md);
 
 	/* stage inactive table */
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 5bb994b012ca..ea5c4a1e6f5b 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -866,7 +866,6 @@ EXPORT_SYMBOL(dm_consume_args);
 static bool __table_type_bio_based(enum dm_queue_mode table_type)
 {
 	return (table_type == DM_TYPE_BIO_BASED ||
-		table_type == DM_TYPE_DAX_BIO_BASED ||
 		table_type == DM_TYPE_NVME_BIO_BASED);
 }
 
@@ -888,7 +887,7 @@ static int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
 	return bdev_dax_supported(dev->bdev, PAGE_SIZE);
 }
 
-static bool dm_table_supports_dax(struct dm_table *t)
+bool dm_table_supports_dax(struct dm_table *t)
 {
 	struct dm_target *ti;
 	unsigned i;
@@ -907,6 +906,7 @@ static bool dm_table_supports_dax(struct dm_table *t)
 
 	return true;
 }
+EXPORT_SYMBOL_GPL(dm_table_supports_dax);
 
 static bool dm_table_does_not_support_partial_completion(struct dm_table *t);
 
@@ -944,7 +944,6 @@ static int dm_table_determine_type(struct dm_table *t)
 			/* possibly upgrade to a variant of bio-based */
 			goto verify_bio_based;
 		}
-		BUG_ON(t->type == DM_TYPE_DAX_BIO_BASED);
 		BUG_ON(t->type == DM_TYPE_NVME_BIO_BASED);
 		goto verify_rq_based;
 	}
@@ -981,18 +980,14 @@ static int dm_table_determine_type(struct dm_table *t)
 verify_bio_based:
 		/* We must use this table as bio-based */
 		t->type = DM_TYPE_BIO_BASED;
-		if (dm_table_supports_dax(t) ||
-		    (list_empty(devices) && live_md_type == DM_TYPE_DAX_BIO_BASED)) {
-			t->type = DM_TYPE_DAX_BIO_BASED;
-		} else {
-			/* Check if upgrading to NVMe bio-based is valid or required */
-			tgt = dm_table_get_immutable_target(t);
-			if (tgt && !tgt->max_io_len && dm_table_does_not_support_partial_completion(t)) {
-				t->type = DM_TYPE_NVME_BIO_BASED;
-				goto verify_rq_based; /* must be stacked directly on NVMe (blk-mq) */
-			} else if (list_empty(devices) && live_md_type == DM_TYPE_NVME_BIO_BASED) {
-				t->type = DM_TYPE_NVME_BIO_BASED;
-			}
+
+		/* Check if upgrading to NVMe bio-based is valid or required */
+		tgt = dm_table_get_immutable_target(t);
+		if (tgt && !tgt->max_io_len && dm_table_does_not_support_partial_completion(t)) {
+			t->type = DM_TYPE_NVME_BIO_BASED;
+			goto verify_rq_based; /* must be stacked directly on NVMe (blk-mq) */
+		} else if (list_empty(devices) && live_md_type == DM_TYPE_NVME_BIO_BASED) {
+			t->type = DM_TYPE_NVME_BIO_BASED;
 		}
 		return 0;
 	}
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 9728433362d1..0ce06fa292fd 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2192,7 +2192,6 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 		}
 		break;
 	case DM_TYPE_BIO_BASED:
-	case DM_TYPE_DAX_BIO_BASED:
 		dm_init_normal_md_queue(md);
 		blk_queue_make_request(md->queue, dm_make_request);
 		break;
@@ -2910,7 +2909,6 @@ struct dm_md_mempools *dm_alloc_md_mempools(struct mapped_device *md, enum dm_qu
 
 	switch (type) {
 	case DM_TYPE_BIO_BASED:
-	case DM_TYPE_DAX_BIO_BASED:
 	case DM_TYPE_NVME_BIO_BASED:
 		pool_size = max(dm_get_reserved_bio_based_ios(), min_pool_size);
 		front_pad = roundup(per_io_data_size, __alignof__(struct dm_target_io)) + offsetof(struct dm_target_io, clone);
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 31fef7c34185..cbf3d7e7ed33 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -27,8 +27,7 @@ enum dm_queue_mode {
 	DM_TYPE_BIO_BASED	 = 1,
 	DM_TYPE_REQUEST_BASED	 = 2,
 	DM_TYPE_MQ_REQUEST_BASED = 3,
-	DM_TYPE_DAX_BIO_BASED	 = 4,
-	DM_TYPE_NVME_BIO_BASED	 = 5,
+	DM_TYPE_NVME_BIO_BASED	 = 4,
 };
 
 typedef enum { STATUSTYPE_INFO, STATUSTYPE_TABLE } status_type_t;
@@ -460,6 +459,11 @@ void dm_table_add_target_callbacks(struct dm_table *t, struct dm_target_callback
  */
 void dm_table_set_type(struct dm_table *t, enum dm_queue_mode type);
 
+/*
+ * Check to see if this target type and all table devices support DAX.
+ */
+bool dm_table_supports_dax(struct dm_table *t);
+
 /*
  * Finally call this to make the table ready for use.
  */
-- 
2.14.3


* [PATCH 6/7] dm-snap: remove unnecessary direct_access() stub
  2018-05-25  2:55 [PATCH 0/7] Fix DM DAX handling Ross Zwisler
                   ` (4 preceding siblings ...)
  2018-05-25  2:55 ` [PATCH 5/7] dm: remove DM_TYPE_DAX_BIO_BASED dm_queue_mode Ross Zwisler
@ 2018-05-25  2:55 ` Ross Zwisler
  2018-05-25  2:55 ` [PATCH 7/7] dm-error: " Ross Zwisler
  6 siblings, 0 replies; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25  2:55 UTC (permalink / raw)
  To: Toshi Kani, Mike Snitzer, dm-devel
  Cc: linux-fsdevel, linux-kernel, linux-nvdimm, Ross Zwisler

This stub was added so that we could use dm-snap with DM_TYPE_DAX_BIO_BASED
mode devices.  That mode and the transition issues associated with it no
longer exist, so we can remove this dead code.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 drivers/md/dm-snap.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index 216035be5661..0143b158d52d 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -2305,13 +2305,6 @@ static int origin_map(struct dm_target *ti, struct bio *bio)
 	return do_origin(o->dev, bio);
 }
 
-static long origin_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
-{
-	DMWARN("device does not support dax.");
-	return -EIO;
-}
-
 /*
  * Set the target "max_io_len" field to the minimum of all the snapshots'
  * chunk sizes.
@@ -2371,7 +2364,6 @@ static struct target_type origin_target = {
 	.postsuspend = origin_postsuspend,
 	.status  = origin_status,
 	.iterate_devices = origin_iterate_devices,
-	.direct_access = origin_dax_direct_access,
 };
 
 static struct target_type snapshot_target = {
-- 
2.14.3


* [PATCH 7/7] dm-error: remove unnecessary direct_access() stub
  2018-05-25  2:55 [PATCH 0/7] Fix DM DAX handling Ross Zwisler
                   ` (5 preceding siblings ...)
  2018-05-25  2:55 ` [PATCH 6/7] dm-snap: remove unnecessary direct_access() stub Ross Zwisler
@ 2018-05-25  2:55 ` Ross Zwisler
  6 siblings, 0 replies; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25  2:55 UTC (permalink / raw)
  To: Toshi Kani, Mike Snitzer, dm-devel
  Cc: linux-fsdevel, linux-kernel, linux-nvdimm, Ross Zwisler

This stub was added so that we could use dm-error with
DM_TYPE_DAX_BIO_BASED mode devices.  That mode and the transition issues
associated with it no longer exist, so we can remove this dead code.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 drivers/md/dm-target.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/md/dm-target.c b/drivers/md/dm-target.c
index 314d17ca6466..c4dbc15f7862 100644
--- a/drivers/md/dm-target.c
+++ b/drivers/md/dm-target.c
@@ -140,12 +140,6 @@ static void io_err_release_clone_rq(struct request *clone)
 {
 }
 
-static long io_err_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
-{
-	return -EIO;
-}
-
 static struct target_type error_target = {
 	.name = "error",
 	.version = {1, 5, 0},
@@ -155,7 +149,6 @@ static struct target_type error_target = {
 	.map  = io_err_map,
 	.clone_and_map_rq = io_err_clone_and_map_rq,
 	.release_clone_rq = io_err_release_clone_rq,
-	.direct_access = io_err_dax_direct_access,
 };
 
 int __init dm_target_init(void)
-- 
2.14.3


* Re: [PATCH 2/7] dax: change bdev_dax_supported() to support boolean returns
  2018-05-25  2:55 ` [PATCH 2/7] dax: change bdev_dax_supported() to support boolean returns Ross Zwisler
@ 2018-05-25  5:01   ` Darrick J. Wong
  0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2018-05-25  5:01 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Toshi Kani, Mike Snitzer, dm-devel, linux-fsdevel, linux-kernel,
	linux-nvdimm, Dave Jiang

On Thu, May 24, 2018 at 08:55:13PM -0600, Ross Zwisler wrote:
> From: Dave Jiang <dave.jiang@intel.com>
> 
> The function return values are confusing with the way the function is
> named. We expect a true or false return value but it actually returns
> 0/-errno.  This makes the code very confusing. Change the return type
> to bool: return true if DAX is supported and false if it is not.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  drivers/dax/super.c | 16 ++++++++--------
>  fs/ext2/super.c     |  3 +--
>  fs/ext4/super.c     |  3 +--
>  fs/xfs/xfs_ioctl.c  |  4 ++--
>  fs/xfs/xfs_super.c  | 12 ++++++------
>  include/linux/dax.h |  6 +++---
>  6 files changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 9206539c8330..e5447eddecf8 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -80,9 +80,9 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
>   * This is a library function for filesystems to check if the block device
>   * can be mounted with dax option.
>   *
> - * Return: negative errno if unsupported, 0 if supported.
> + * Return: true if supported, false if unsupported
>   */
> -int bdev_dax_supported(struct block_device *bdev, int blocksize)
> +bool bdev_dax_supported(struct block_device *bdev, int blocksize)
>  {
>  	struct dax_device *dax_dev;
>  	pgoff_t pgoff;
> @@ -95,21 +95,21 @@ int bdev_dax_supported(struct block_device *bdev, int blocksize)
>  	if (blocksize != PAGE_SIZE) {
>  		pr_debug("%s: error: unsupported blocksize for dax\n",
>  				bdevname(bdev, buf));
> -		return -EINVAL;
> +		return false;
>  	}
>  
>  	err = bdev_dax_pgoff(bdev, 0, PAGE_SIZE, &pgoff);
>  	if (err) {
>  		pr_debug("%s: error: unaligned partition for dax\n",
>  				bdevname(bdev, buf));
> -		return err;
> +		return false;
>  	}
>  
>  	dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
>  	if (!dax_dev) {
>  		pr_debug("%s: error: device does not support dax\n",
>  				bdevname(bdev, buf));
> -		return -EOPNOTSUPP;
> +		return false;
>  	}
>  
>  	id = dax_read_lock();
> @@ -121,7 +121,7 @@ int bdev_dax_supported(struct block_device *bdev, int blocksize)
>  	if (len < 1) {
>  		pr_debug("%s: error: dax access failed (%ld)\n",
>  				bdevname(bdev, buf), len);
> -		return len < 0 ? len : -EIO;
> +		return false;
>  	}
>  
>  	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn)) {
> @@ -139,10 +139,10 @@ int bdev_dax_supported(struct block_device *bdev, int blocksize)
>  	} else {
>  		pr_debug("%s: error: dax support not enabled\n",
>  				bdevname(bdev, buf));
> -		return -EOPNOTSUPP;
> +		return false;
>  	}
>  
> -	return 0;
> +	return true;
>  }
>  EXPORT_SYMBOL_GPL(bdev_dax_supported);
>  #endif
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index 9627c3054b5c..c09289a42dc5 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -961,8 +961,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
>  	blocksize = BLOCK_SIZE << le32_to_cpu(sbi->s_es->s_log_block_size);
>  
>  	if (sbi->s_mount_opt & EXT2_MOUNT_DAX) {
> -		err = bdev_dax_supported(sb->s_bdev, blocksize);
> -		if (err) {
> +		if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
>  			ext2_msg(sb, KERN_ERR,
>  				"DAX unsupported by block device. Turning off DAX.");
>  			sbi->s_mount_opt &= ~EXT2_MOUNT_DAX;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 089170e99895..2e1622907f4a 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3732,8 +3732,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  					" that may contain inline data");
>  			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
>  		}
> -		err = bdev_dax_supported(sb->s_bdev, blocksize);
> -		if (err) {
> +		if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
>  			ext4_msg(sb, KERN_ERR,
>  				"DAX unsupported by block device. Turning off DAX.");
>  			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 0effd46b965f..2c70a0a4f59f 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1103,8 +1103,8 @@ xfs_ioctl_setattr_dax_invalidate(
>  	if (fa->fsx_xflags & FS_XFLAG_DAX) {
>  		if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
>  			return -EINVAL;
> -		if (bdev_dax_supported(xfs_find_bdev_for_inode(VFS_I(ip)),
> -				sb->s_blocksize) < 0)
> +		if (!bdev_dax_supported(xfs_find_bdev_for_inode(VFS_I(ip)),
> +				sb->s_blocksize))
>  			return -EINVAL;
>  	}
>  
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 62188c2a4c36..86915dc40eed 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1690,17 +1690,17 @@ xfs_fs_fill_super(
>  		sb->s_flags |= SB_I_VERSION;
>  
>  	if (mp->m_flags & XFS_MOUNT_DAX) {
> -		int	error2 = 0;
> +		bool rtdev_is_dax = false, datadev_is_dax;
>  
>  		xfs_warn(mp,
>  		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
>  
> -		error = bdev_dax_supported(mp->m_ddev_targp->bt_bdev,
> -				sb->s_blocksize);
> +		datadev_is_dax = bdev_dax_supported(mp->m_ddev_targp->bt_bdev,
> +			sb->s_blocksize);
>  		if (mp->m_rtdev_targp)
> -			error2 = bdev_dax_supported(mp->m_rtdev_targp->bt_bdev,
> -					sb->s_blocksize);
> -		if (error && error2) {
> +			rtdev_is_dax = bdev_dax_supported(
> +				mp->m_rtdev_targp->bt_bdev, sb->s_blocksize);
> +		if (!rtdev_is_dax && !datadev_is_dax) {
>  			xfs_alert(mp,
>  			"DAX unsupported by block device. Turning off DAX.");
>  			mp->m_flags &= ~XFS_MOUNT_DAX;
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index 509a85ac8470..547eb33dbc9e 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -64,7 +64,7 @@ static inline bool dax_write_cache_enabled(struct dax_device *dax_dev)
>  struct writeback_control;
>  int bdev_dax_pgoff(struct block_device *, sector_t, size_t, pgoff_t *pgoff);
>  #if IS_ENABLED(CONFIG_FS_DAX)
> -int bdev_dax_supported(struct block_device *bdev, int blocksize);
> +bool bdev_dax_supported(struct block_device *bdev, int blocksize);
>  static inline struct dax_device *fs_dax_get_by_host(const char *host)
>  {
>  	return dax_get_by_host(host);
> @@ -79,10 +79,10 @@ struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev);
>  int dax_writeback_mapping_range(struct address_space *mapping,
>  		struct block_device *bdev, struct writeback_control *wbc);
>  #else
> -static inline int bdev_dax_supported(struct block_device *bdev,
> +static inline bool bdev_dax_supported(struct block_device *bdev,
>  		int blocksize)
>  {
> -	return -EOPNOTSUPP;
> +	return false;
>  }
>  
>  static inline struct dax_device *fs_dax_get_by_host(const char *host)
> -- 
> 2.14.3
> 
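
To make the effect of the bool conversion on the XFS mount check quoted
above easier to see, here is a small userspace sketch of the two
predicates.  It is a model under stated assumptions, not kernel code: the
helper names (model_supported_errno, dax_off_errno, dax_off_bool) and the
-95 stand-in for -EOPNOTSUPP are invented for illustration.  In this model
the errno form leaves error2 at 0 when there is no realtime device, so
"error && error2" cannot fire; the bool form defaults rtdev_is_dax to
false and does.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for -EOPNOTSUPP; illustrative value only. */
#define MODEL_EOPNOTSUPP (-95)

/* errno convention: 0 means supported, negative means unsupported. */
static int model_supported_errno(bool dev_has_dax)
{
	return dev_has_dax ? 0 : MODEL_EOPNOTSUPP;
}

/* Returns true when the mount would turn DAX off (errno-based check). */
static bool dax_off_errno(bool data_dax, bool have_rtdev, bool rt_dax)
{
	int error, error2 = 0;

	error = model_supported_errno(data_dax);
	if (have_rtdev)
		error2 = model_supported_errno(rt_dax);
	return error && error2;
}

/* Returns true when the mount would turn DAX off (bool-based check). */
static bool dax_off_bool(bool data_dax, bool have_rtdev, bool rt_dax)
{
	bool rtdev_is_dax = false, datadev_is_dax;

	datadev_is_dax = data_dax;
	if (have_rtdev)
		rtdev_is_dax = rt_dax;
	return !rtdev_is_dax && !datadev_is_dax;
}
```

The two forms agree whenever a realtime device is present; they differ
only in the "no rtdev, data device without DAX" case.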


* Re: [PATCH 1/7] fs: allow per-device dax status checking for filesystems
  2018-05-25  2:55 ` [PATCH 1/7] fs: allow per-device dax status checking for filesystems Ross Zwisler
@ 2018-05-25  5:02   ` Darrick J. Wong
  2018-05-25 15:42     ` Ross Zwisler
  2018-05-26 14:07   ` kbuild test robot
  1 sibling, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2018-05-25  5:02 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Toshi Kani, Mike Snitzer, dm-devel, linux-fsdevel, linux-kernel,
	linux-nvdimm, xfs

On Thu, May 24, 2018 at 08:55:12PM -0600, Ross Zwisler wrote:
> From: "Darrick J. Wong" <darrick.wong@oracle.com>
> 
> Remove __bdev_dax_supported and change to bdev_dax_supported that takes a
> bdev parameter.  This enables multi-device filesystems like xfs to check
> that a dax device can work for the particular filesystem.  Once that's
> in place, actually fix all the parts of XFS where we need to be able to
> distinguish between datadev and rtdev.
> 
> This patch fixes the problem where we screw up the dax support checking
> in xfs if the datadev and rtdev have different dax capabilities.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>

Reviewed-by: Darr...oh, I'm not allowed to do that, am I?

Would you mind (re)sending this to the xfs list so that someone else can
review it?

--D

> ---
>  drivers/dax/super.c | 30 +++++++++++++++---------------
>  fs/ext2/super.c     |  2 +-
>  fs/ext4/super.c     |  2 +-
>  fs/xfs/xfs_ioctl.c  |  3 ++-
>  fs/xfs/xfs_iops.c   | 30 +++++++++++++++++++++++++-----
>  fs/xfs/xfs_super.c  | 10 ++++++++--
>  include/linux/dax.h | 10 +++-------
>  7 files changed, 55 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 2b2332b605e4..9206539c8330 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -73,8 +73,8 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
>  #endif
>  
>  /**
> - * __bdev_dax_supported() - Check if the device supports dax for filesystem
> - * @sb: The superblock of the device
> + * bdev_dax_supported() - Check if the device supports dax for filesystem
> + * @bdev: block device to check
>   * @blocksize: The block size of the device
>   *
>   * This is a library function for filesystems to check if the block device
> @@ -82,33 +82,33 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
>   *
>   * Return: negative errno if unsupported, 0 if supported.
>   */
> -int __bdev_dax_supported(struct super_block *sb, int blocksize)
> +int bdev_dax_supported(struct block_device *bdev, int blocksize)
>  {
> -	struct block_device *bdev = sb->s_bdev;
>  	struct dax_device *dax_dev;
>  	pgoff_t pgoff;
>  	int err, id;
>  	void *kaddr;
>  	pfn_t pfn;
>  	long len;
> +	char buf[BDEVNAME_SIZE];
>  
>  	if (blocksize != PAGE_SIZE) {
> -		pr_debug("VFS (%s): error: unsupported blocksize for dax\n",
> -				sb->s_id);
> +		pr_debug("%s: error: unsupported blocksize for dax\n",
> +				bdevname(bdev, buf));
>  		return -EINVAL;
>  	}
>  
>  	err = bdev_dax_pgoff(bdev, 0, PAGE_SIZE, &pgoff);
>  	if (err) {
> -		pr_debug("VFS (%s): error: unaligned partition for dax\n",
> -				sb->s_id);
> +		pr_debug("%s: error: unaligned partition for dax\n",
> +				bdevname(bdev, buf));
>  		return err;
>  	}
>  
>  	dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
>  	if (!dax_dev) {
> -		pr_debug("VFS (%s): error: device does not support dax\n",
> -				sb->s_id);
> +		pr_debug("%s: error: device does not support dax\n",
> +				bdevname(bdev, buf));
>  		return -EOPNOTSUPP;
>  	}
>  
> @@ -119,8 +119,8 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
>  	put_dax(dax_dev);
>  
>  	if (len < 1) {
> -		pr_debug("VFS (%s): error: dax access failed (%ld)\n",
> -				sb->s_id, len);
> +		pr_debug("%s: error: dax access failed (%ld)\n",
> +				bdevname(bdev, buf), len);
>  		return len < 0 ? len : -EIO;
>  	}
>  
> @@ -137,14 +137,14 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
>  	} else if (pfn_t_devmap(pfn)) {
>  		/* pass */;
>  	} else {
> -		pr_debug("VFS (%s): error: dax support not enabled\n",
> -				sb->s_id);
> +		pr_debug("%s: error: dax support not enabled\n",
> +				bdevname(bdev, buf));
>  		return -EOPNOTSUPP;
>  	}
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(__bdev_dax_supported);
> +EXPORT_SYMBOL_GPL(bdev_dax_supported);
>  #endif
>  
>  enum dax_device_flags {
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index de1694512f1f..9627c3054b5c 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -961,7 +961,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
>  	blocksize = BLOCK_SIZE << le32_to_cpu(sbi->s_es->s_log_block_size);
>  
>  	if (sbi->s_mount_opt & EXT2_MOUNT_DAX) {
> -		err = bdev_dax_supported(sb, blocksize);
> +		err = bdev_dax_supported(sb->s_bdev, blocksize);
>  		if (err) {
>  			ext2_msg(sb, KERN_ERR,
>  				"DAX unsupported by block device. Turning off DAX.");
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index eb104e8476f0..089170e99895 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3732,7 +3732,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  					" that may contain inline data");
>  			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
>  		}
> -		err = bdev_dax_supported(sb, blocksize);
> +		err = bdev_dax_supported(sb->s_bdev, blocksize);
>  		if (err) {
>  			ext4_msg(sb, KERN_ERR,
>  				"DAX unsupported by block device. Turning off DAX.");
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 89fb1eb80aae..0effd46b965f 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1103,7 +1103,8 @@ xfs_ioctl_setattr_dax_invalidate(
>  	if (fa->fsx_xflags & FS_XFLAG_DAX) {
>  		if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
>  			return -EINVAL;
> -		if (bdev_dax_supported(sb, sb->s_blocksize) < 0)
> +		if (bdev_dax_supported(xfs_find_bdev_for_inode(VFS_I(ip)),
> +				sb->s_blocksize) < 0)
>  			return -EINVAL;
>  	}
>  
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index a3ed3c811dfa..6e83acf74a95 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -1195,6 +1195,30 @@ static const struct inode_operations xfs_inline_symlink_inode_operations = {
>  	.update_time		= xfs_vn_update_time,
>  };
>  
> +/* Figure out if this file actually supports DAX. */
> +static bool
> +xfs_inode_supports_dax(
> +	struct xfs_inode	*ip)
> +{
> +	struct xfs_mount	*mp = ip->i_mount;
> +
> +	/* Only supported on non-reflinked files. */
> +	if (!S_ISREG(VFS_I(ip)->i_mode) || xfs_is_reflink_inode(ip))
> +		return false;
> +
> +	/* DAX mount option or DAX iflag must be set. */
> +	if (!(mp->m_flags & XFS_MOUNT_DAX) &&
> +	    !(ip->i_d.di_flags2 & XFS_DIFLAG2_DAX))
> +		return false;
> +
> +	/* Block size must match page size */
> +	if (mp->m_sb.sb_blocksize != PAGE_SIZE)
> +		return false;
> +
> +	/* Device has to support DAX too. */
> +	return xfs_find_daxdev_for_inode(VFS_I(ip)) != NULL;
> +}
> +
>  STATIC void
>  xfs_diflags_to_iflags(
>  	struct inode		*inode,
> @@ -1213,11 +1237,7 @@ xfs_diflags_to_iflags(
>  		inode->i_flags |= S_SYNC;
>  	if (flags & XFS_DIFLAG_NOATIME)
>  		inode->i_flags |= S_NOATIME;
> -	if (S_ISREG(inode->i_mode) &&
> -	    ip->i_mount->m_sb.sb_blocksize == PAGE_SIZE &&
> -	    !xfs_is_reflink_inode(ip) &&
> -	    (ip->i_mount->m_flags & XFS_MOUNT_DAX ||
> -	     ip->i_d.di_flags2 & XFS_DIFLAG2_DAX))
> +	if (xfs_inode_supports_dax(ip))
>  		inode->i_flags |= S_DAX;
>  }
>  
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index d71424052917..62188c2a4c36 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1690,11 +1690,17 @@ xfs_fs_fill_super(
>  		sb->s_flags |= SB_I_VERSION;
>  
>  	if (mp->m_flags & XFS_MOUNT_DAX) {
> +		int	error2 = 0;
> +
>  		xfs_warn(mp,
>  		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
>  
> -		error = bdev_dax_supported(sb, sb->s_blocksize);
> -		if (error) {
> +		error = bdev_dax_supported(mp->m_ddev_targp->bt_bdev,
> +				sb->s_blocksize);
> +		if (mp->m_rtdev_targp)
> +			error2 = bdev_dax_supported(mp->m_rtdev_targp->bt_bdev,
> +					sb->s_blocksize);
> +		if (error && error2) {
>  			xfs_alert(mp,
>  			"DAX unsupported by block device. Turning off DAX.");
>  			mp->m_flags &= ~XFS_MOUNT_DAX;
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index f9eb22ad341e..509a85ac8470 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -64,12 +64,7 @@ static inline bool dax_write_cache_enabled(struct dax_device *dax_dev)
>  struct writeback_control;
>  int bdev_dax_pgoff(struct block_device *, sector_t, size_t, pgoff_t *pgoff);
>  #if IS_ENABLED(CONFIG_FS_DAX)
> -int __bdev_dax_supported(struct super_block *sb, int blocksize);
> -static inline int bdev_dax_supported(struct super_block *sb, int blocksize)
> -{
> -	return __bdev_dax_supported(sb, blocksize);
> -}
> -
> +int bdev_dax_supported(struct block_device *bdev, int blocksize);
>  static inline struct dax_device *fs_dax_get_by_host(const char *host)
>  {
>  	return dax_get_by_host(host);
> @@ -84,7 +79,8 @@ struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev);
>  int dax_writeback_mapping_range(struct address_space *mapping,
>  		struct block_device *bdev, struct writeback_control *wbc);
>  #else
> -static inline int bdev_dax_supported(struct super_block *sb, int blocksize)
> +static inline int bdev_dax_supported(struct block_device *bdev,
> +		int blocksize)
>  {
>  	return -EOPNOTSUPP;
>  }
> -- 
> 2.14.3
> 


* Re: [PATCH 1/7] fs: allow per-device dax status checking for filesystems
  2018-05-25  5:02   ` Darrick J. Wong
@ 2018-05-25 15:42     ` Ross Zwisler
  2018-05-25 19:23       ` Darrick J. Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25 15:42 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Ross Zwisler, Toshi Kani, Mike Snitzer, dm-devel, linux-fsdevel,
	linux-kernel, linux-nvdimm, xfs

On Thu, May 24, 2018 at 10:02:18PM -0700, Darrick J. Wong wrote:
> On Thu, May 24, 2018 at 08:55:12PM -0600, Ross Zwisler wrote:
> > From: "Darrick J. Wong" <darrick.wong@oracle.com>
> > 
> > Remove __bdev_dax_supported and change to bdev_dax_supported that takes a
> > bdev parameter.  This enables multi-device filesystems like xfs to check
> > that a dax device can work for the particular filesystem.  Once that's
> > in place, actually fix all the parts of XFS where we need to be able to
> > distinguish between datadev and rtdev.
> > 
> > This patch fixes the problem where we screw up the dax support checking
> > in xfs if the datadev and rtdev have different dax capabilities.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> 
> Reviewed-by: Darr...oh, I'm not allowed to do that, am I?
> 
> Would you mind (re)sending this to the xfs list so that someone else can
> review it?
> 
> --D

Thanks for the review, Darrick.

I think at one point Dave said that if you touch more than 1 filesystem with a
series you should just CC linux-fsdevel and omit the individual filesystems?
I realize that this series only touches ext2 and ext4 a little, but that's
what I opted for.

Is that sufficient to get to the rest of the XFS developers, or would you like
a resend adding linux-xfs?


* Re: [PATCH 1/7] fs: allow per-device dax status checking for filesystems
  2018-05-25 15:42     ` Ross Zwisler
@ 2018-05-25 19:23       ` Darrick J. Wong
  0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2018-05-25 19:23 UTC (permalink / raw)
  To: Ross Zwisler, Toshi Kani, Mike Snitzer, dm-devel, linux-fsdevel,
	linux-kernel, linux-nvdimm, xfs

On Fri, May 25, 2018 at 09:42:29AM -0600, Ross Zwisler wrote:
> On Thu, May 24, 2018 at 10:02:18PM -0700, Darrick J. Wong wrote:
> > On Thu, May 24, 2018 at 08:55:12PM -0600, Ross Zwisler wrote:
> > > From: "Darrick J. Wong" <darrick.wong@oracle.com>
> > > 
> > > Remove __bdev_dax_supported and change to bdev_dax_supported that takes a
> > > bdev parameter.  This enables multi-device filesystems like xfs to check
> > > that a dax device can work for the particular filesystem.  Once that's
> > > in place, actually fix all the parts of XFS where we need to be able to
> > > distinguish between datadev and rtdev.
> > > 
> > > This patch fixes the problem where we screw up the dax support checking
> > > in xfs if the datadev and rtdev have different dax capabilities.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > 
> > Reviewed-by: Darr...oh, I'm not allowed to do that, am I?
> > 
> > Would you mind (re)sending this to the xfs list so that someone else can
> > review it?
> > 
> > --D
> 
> Thanks for the review, Darrick.
> 
> I think at one point Dave said that if you touch more than 1 filesystem with a
> series you should just CC linux-fsdevel and omit the individual filesystems?
> I realize that this series only touches ext2 and ext4 a little, but that's
> what I opted for.
> 
> Is that sufficient to get to the rest of the XFS developers, or would you like
> a resend adding linux-xfs?

For a patch by any other author I'd be fine with -fsdevel, but for this
specific one (because it's written by me and I'm not yet so bofh that I
review my own patches ;)) someone else from the xfs community has to
review it, so yes I'd like it to be sent to linux-xfs, please.

(And it's ok to note in the resend that this is the same as the -fsdevel
series, but posted here to attract the attention of the other xfs
developers.)

--D


* Re: [PATCH 4/7] dm: prevent DAX mounts if not supported
  2018-05-25  2:55 ` [PATCH 4/7] dm: prevent DAX mounts if not supported Ross Zwisler
@ 2018-05-25 19:54   ` Mike Snitzer
  2018-05-25 21:36     ` Ross Zwisler
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2018-05-25 19:54 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Toshi Kani, dm-devel, linux-fsdevel, linux-kernel, linux-nvdimm

On Thu, May 24 2018 at 10:55pm -0400,
Ross Zwisler <ross.zwisler@linux.intel.com> wrote:

> Currently the code in dm_dax_direct_access() only checks whether the target
> type has a direct_access() operation defined, not whether the underlying
> block devices all support DAX.  This latter property can be seen by looking
> at whether we set the QUEUE_FLAG_DAX request queue flag when creating the
> DM device.
> 
> This is problematic if we have, for example, a dm-linear device made up of
> a PMEM namespace in fsdax mode followed by a ramdisk from BRD.
> QUEUE_FLAG_DAX won't be set on the dm-linear device's request queue, but
> we have a working direct_access() entry point and the first member of the
> dm-linear set *does* support DAX.
> 
> This allows the user to create a filesystem on the dm-linear device, and
> then mount it with DAX.  The filesystem's bdev_dax_supported() test will
> pass because it'll operate on the first member of the dm-linear device,
> which happens to be a fsdax PMEM namespace.
> 
> All DAX I/O will then fail to that dm-linear device because the lack of
> QUEUE_FLAG_DAX prevents fs_dax_get_by_bdev() from working.  This means that
> the struct dax_device isn't ever set in the filesystem, so
> dax_direct_access() will always return -EOPNOTSUPP.
> 
> By failing out of dm_dax_direct_access() if QUEUE_FLAG_DAX isn't set we let
> the filesystem know we don't support DAX at mount time.  The filesystem
> will then silently fall back and remove the dax mount option, causing it to
> work properly.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Fixes: 545ed20e6df6 ("dm: add infrastructure for DAX support")
> ---
>  drivers/md/dm.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 0a7b0107ca78..9728433362d1 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1050,14 +1050,13 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
>  
>  	if (!ti)
>  		goto out;
> -	if (!ti->type->direct_access)
> +	if (!blk_queue_dax(md->queue))
>  		goto out;
>  	len = max_io_len(sector, ti) / PAGE_SECTORS;
>  	if (len < 1)
>  		goto out;
>  	nr_pages = min(len, nr_pages);
> -	if (ti->type->direct_access)
> -		ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
> +	ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);

So I followed all the rationale for this patch.  But the last change
doesn't make any sense.  We should still verify that the target has
ti->type->direct_access before calling it.  So please reinstate that
check before calling it.

Thanks,
Mike


* Re: [PATCH 4/7] dm: prevent DAX mounts if not supported
  2018-05-25 19:54   ` Mike Snitzer
@ 2018-05-25 21:36     ` Ross Zwisler
  0 siblings, 0 replies; 15+ messages in thread
From: Ross Zwisler @ 2018-05-25 21:36 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Ross Zwisler, Toshi Kani, dm-devel, linux-fsdevel, linux-kernel,
	linux-nvdimm

On Fri, May 25, 2018 at 03:54:10PM -0400, Mike Snitzer wrote:
> On Thu, May 24 2018 at 10:55pm -0400,
> Ross Zwisler <ross.zwisler@linux.intel.com> wrote:
> 
> > Currently the code in dm_dax_direct_access() only checks whether the target
> > type has a direct_access() operation defined, not whether the underlying
> > block devices all support DAX.  This latter property can be seen by looking
> > at whether we set the QUEUE_FLAG_DAX request queue flag when creating the
> > DM device.
> > 
> > This is problematic if we have, for example, a dm-linear device made up of
> > a PMEM namespace in fsdax mode followed by a ramdisk from BRD.
> > QUEUE_FLAG_DAX won't be set on the dm-linear device's request queue, but
> > we have a working direct_access() entry point and the first member of the
> > dm-linear set *does* support DAX.
> > 
> > This allows the user to create a filesystem on the dm-linear device, and
> > then mount it with DAX.  The filesystem's bdev_dax_supported() test will
> > pass because it'll operate on the first member of the dm-linear device,
> > which happens to be a fsdax PMEM namespace.
> > 
> > All DAX I/O will then fail to that dm-linear device because the lack of
> > QUEUE_FLAG_DAX prevents fs_dax_get_by_bdev() from working.  This means that
> > the struct dax_device isn't ever set in the filesystem, so
> > dax_direct_access() will always return -EOPNOTSUPP.
> > 
> > By failing out of dm_dax_direct_access() if QUEUE_FLAG_DAX isn't set we let
> > the filesystem know we don't support DAX at mount time.  The filesystem
> > will then silently fall back and remove the dax mount option, causing it to
> > work properly.
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Fixes: 545ed20e6df6 ("dm: add infrastructure for DAX support")
> > ---
> >  drivers/md/dm.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index 0a7b0107ca78..9728433362d1 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1050,14 +1050,13 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
> >  
> >  	if (!ti)
> >  		goto out;
> > -	if (!ti->type->direct_access)
> > +	if (!blk_queue_dax(md->queue))
> >  		goto out;
> >  	len = max_io_len(sector, ti) / PAGE_SECTORS;
> >  	if (len < 1)
> >  		goto out;
> >  	nr_pages = min(len, nr_pages);
> > -	if (ti->type->direct_access)
> > -		ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
> > +	ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
> 
> So I followed all the rationale for this patch.  But the last change
> doesn't make any sense.  We should still verify that the target has
> ti->type->direct_access before calling it.  So please reinstate that
> check before calling it.

You know that the target type has direct_access() via the blk_queue_dax()
check.  That check tells you not only that the target has direct_access(),
but also that you've successfully checked all members of that DM device and
they all have working DAX I/O paths, etc.  This is all done via the
bdev_dax_supported() check and the rest of the code in
dm_table_supports_dax() and device_supports_dax().

If this is too subtle I can add a comment or add the check back.
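
As a rough sketch of the invariant described above: the queue flag is set
only when every target type has direct_access() and every underlying
device passed the DAX check, so a set flag implies the callback exists.
This is a simplified userspace model, not the DM code.  The struct layouts
and the names model_target, model_table, model_table_supports_dax,
model_table_load and model_direct_access are hypothetical; only the shape
of the check is meant to mirror the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_target {
	bool has_direct_access;	/* target type provides direct_access() */
	const bool *dev_dax;	/* per-device result of the DAX check */
	size_t nr_devs;
};

struct model_table {
	const struct model_target *targets;
	size_t nr_targets;
	bool queue_flag_dax;	/* stands in for QUEUE_FLAG_DAX */
};

/* True only if every target has the callback and every device has DAX. */
static bool model_table_supports_dax(const struct model_table *t)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->nr_targets; i++) {
		const struct model_target *ti = &t->targets[i];

		if (!ti->has_direct_access)
			return false;
		for (size_t j = 0; j < ti->nr_devs; j++)
			if (!ti->dev_dax[j])
				return false;
	}
	return true;
}

/* The flag is derived from the table once, when the table is loaded. */
static void model_table_load(struct model_table *t)
{
	t->queue_flag_dax = model_table_supports_dax(t);
}

/* Returns 0 on success, -1 (standing in for an errno) when refused. */
static int model_direct_access(const struct model_table *t, size_t idx)
{
	assert(idx < t->nr_targets);
	if (!t->queue_flag_dax)
		return -1;
	/* Invariant: a set flag implies the target has the callback. */
	assert(t->targets[idx].has_direct_access);
	return 0;
}
```

In this model a linear table mixing a DAX and a non-DAX device never gets
the flag, so the access path refuses before the callback is considered.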


* Re: [PATCH 1/7] fs: allow per-device dax status checking for filesystems
  2018-05-25  2:55 ` [PATCH 1/7] fs: allow per-device dax status checking for filesystems Ross Zwisler
  2018-05-25  5:02   ` Darrick J. Wong
@ 2018-05-26 14:07   ` kbuild test robot
  1 sibling, 0 replies; 15+ messages in thread
From: kbuild test robot @ 2018-05-26 14:07 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: kbuild-all, Toshi Kani, Mike Snitzer, dm-devel, linux-fsdevel,
	linux-kernel, linux-nvdimm, Darrick J. Wong, Ross Zwisler


Hi Darrick,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.17-rc6]
[cannot apply to dm/for-next]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Ross-Zwisler/Fix-DM-DAX-handling/20180526-212746
config: i386-randconfig-x014-201820 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

>> drivers//dax/super.c:85:5: error: redefinition of 'bdev_dax_supported'
    int bdev_dax_supported(struct block_device *bdev, int blocksize)
        ^~~~~~~~~~~~~~~~~~
   In file included from drivers//dax/super.c:23:0:
   include/linux/dax.h:82:19: note: previous definition of 'bdev_dax_supported' was here
    static inline int bdev_dax_supported(struct block_device *bdev,
                      ^~~~~~~~~~~~~~~~~~

vim +/bdev_dax_supported +85 drivers//dax/super.c

    74	
    75	/**
    76	 * bdev_dax_supported() - Check if the device supports dax for filesystem
    77	 * @bdev: block device to check
    78	 * @blocksize: The block size of the device
    79	 *
    80	 * This is a library function for filesystems to check if the block device
    81	 * can be mounted with dax option.
    82	 *
    83	 * Return: negative errno if unsupported, 0 if supported.
    84	 */
  > 85	int bdev_dax_supported(struct block_device *bdev, int blocksize)
    86	{
    87		struct dax_device *dax_dev;
    88		pgoff_t pgoff;
    89		int err, id;
    90		void *kaddr;
    91		pfn_t pfn;
    92		long len;
    93		char buf[BDEVNAME_SIZE];
    94	
    95		if (blocksize != PAGE_SIZE) {
    96			pr_debug("%s: error: unsupported blocksize for dax\n",
    97					bdevname(bdev, buf));
    98			return -EINVAL;
    99		}
   100	
   101		err = bdev_dax_pgoff(bdev, 0, PAGE_SIZE, &pgoff);
   102		if (err) {
   103			pr_debug("%s: error: unaligned partition for dax\n",
   104					bdevname(bdev, buf));
   105			return err;
   106		}
   107	
   108		dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
   109		if (!dax_dev) {
   110			pr_debug("%s: error: device does not support dax\n",
   111					bdevname(bdev, buf));
   112			return -EOPNOTSUPP;
   113		}
   114	
   115		id = dax_read_lock();
   116		len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
   117		dax_read_unlock(id);
   118	
   119		put_dax(dax_dev);
   120	
   121		if (len < 1) {
   122			pr_debug("%s: error: dax access failed (%ld)\n",
   123					bdevname(bdev, buf), len);
   124			return len < 0 ? len : -EIO;
   125		}
   126	
   127		if (IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn)) {
   128			/*
   129			 * An arch that has enabled the pmem api should also
   130			 * have its drivers support pfn_t_devmap()
   131			 *
   132			 * This is a developer warning and should not trigger in
   133			 * production. dax_flush() will crash since it depends
   134			 * on being able to do (page_address(pfn_to_page())).
   135			 */
   136			WARN_ON(IS_ENABLED(CONFIG_ARCH_HAS_PMEM_API));
   137		} else if (pfn_t_devmap(pfn)) {
   138			/* pass */;
   139		} else {
   140			pr_debug("%s: error: dax support not enabled\n",
   141					bdevname(bdev, buf));
   142			return -EOPNOTSUPP;
   143		}
   144	
   145		return 0;
   146	}
   147	EXPORT_SYMBOL_GPL(bdev_dax_supported);
   148	#endif
   149	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30057 bytes --]


end of thread, other threads:[~2018-05-26 14:08 UTC | newest]

Thread overview: 15+ messages
2018-05-25  2:55 [PATCH 0/7] Fix DM DAX handling Ross Zwisler
2018-05-25  2:55 ` [PATCH 1/7] fs: allow per-device dax status checking for filesystems Ross Zwisler
2018-05-25  5:02   ` Darrick J. Wong
2018-05-25 15:42     ` Ross Zwisler
2018-05-25 19:23       ` Darrick J. Wong
2018-05-26 14:07   ` kbuild test robot
2018-05-25  2:55 ` [PATCH 2/7] dax: change bdev_dax_supported() to support boolean returns Ross Zwisler
2018-05-25  5:01   ` Darrick J. Wong
2018-05-25  2:55 ` [PATCH 3/7] dm: fix test for DAX device support Ross Zwisler
2018-05-25  2:55 ` [PATCH 4/7] dm: prevent DAX mounts if not supported Ross Zwisler
2018-05-25 19:54   ` Mike Snitzer
2018-05-25 21:36     ` Ross Zwisler
2018-05-25  2:55 ` [PATCH 5/7] dm: remove DM_TYPE_DAX_BIO_BASED dm_queue_mode Ross Zwisler
2018-05-25  2:55 ` [PATCH 6/7] dm-snap: remove unnecessary direct_access() stub Ross Zwisler
2018-05-25  2:55 ` [PATCH 7/7] dm-error: " Ross Zwisler
