linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations
@ 2020-05-21 19:13 ira.weiny
  2020-05-21 19:13 ` [PATCH V4 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Changes from V3:
	Change EXT4_DAX_FL to bit24
	Cache device DAX support in the super block and use that is
		ext4_should_use_dax()

Changes from V2:
	Rework DAX exclusivity with verity and encryption based on feedback
	from Eric

Enable the same per file DAX support in ext4 as was done for xfs.  This series
builds and depends on the V11 series for xfs.[1]

This passes the same xfstests test as XFS.

The only issue is that this modifies the old mount option parsing code rather
than waiting for the new parsing code to be finalized.

This series starts with 3 fixes which include making Verity and Encrypt truly
mutually exclusive from DAX.  I think these first 3 patches should be picked up
for 5.8 regardless of what is decided regarding the mount parsing.

[1] https://lore.kernel.org/lkml/20200428002142.404144-1-ira.weiny@intel.com/

To: linux-ext4@vger.kernel.org
To: Andreas Dilger <adilger.kernel@dilger.ca>
To: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
To: Eric Biggers <ebiggers@kernel.org>

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Ira Weiny (8):
  fs/ext4: Narrow scope of DAX check in setflags
  fs/ext4: Disallow verity if inode is DAX
  fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
  fs/ext4: Update ext4_should_use_dax()
  fs/ext4: Only change S_DAX on inode load
  fs/ext4: Make DAX mount option a tri-state
  fs/ext4: Introduce DAX inode flag
  Documentation/dax: Update DAX enablement for ext4

 Documentation/filesystems/dax.txt         |  6 +-
 Documentation/filesystems/ext4/verity.rst |  3 +
 fs/ext4/ext4.h                            | 23 ++++--
 fs/ext4/ialloc.c                          |  2 +-
 fs/ext4/inode.c                           | 26 +++++--
 fs/ext4/ioctl.c                           | 41 +++++++++--
 fs/ext4/super.c                           | 85 ++++++++++++++++++-----
 fs/ext4/verity.c                          |  5 +-
 include/uapi/linux/fs.h                   |  1 +
 9 files changed, 154 insertions(+), 38 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH V4 1/8] fs/ext4: Narrow scope of DAX check in setflags
  2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
@ 2020-05-21 19:13 ` ira.weiny
  2020-05-22  7:58   ` Jan Kara
  2020-05-21 19:13 ` [PATCH V4 2/8] fs/ext4: Disallow verity if inode is DAX ira.weiny
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

When preventing DAX and journaling on an inode.  Use the effective DAX
check rather than the mount option.

This will be required to support per inode DAX flags.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/ioctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index bfc1281fc4cb..5813e5e73eab 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -393,9 +393,9 @@ static int ext4_ioctl_setflags(struct inode *inode,
 	if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
 		/*
 		 * Changes to the journaling mode can cause unsafe changes to
-		 * S_DAX if we are using the DAX mount option.
+		 * S_DAX if the inode is DAX
 		 */
-		if (test_opt(inode->i_sb, DAX)) {
+		if (IS_DAX(inode)) {
 			err = -EBUSY;
 			goto flags_out;
 		}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V4 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
  2020-05-21 19:13 ` [PATCH V4 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
@ 2020-05-21 19:13 ` ira.weiny
  2020-05-21 19:13 ` [PATCH V4 3/8] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS ira.weiny
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Verity and DAX are incompatible.  Changing the DAX mode due to a verity
flag change is wrong without a corresponding address_space_operations
update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

(Setting DAX is already disabled if Verity is set first.)

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V2:
	Remove Section title 'Verity and DAX'

Changes:
	remove WARN_ON_ONCE
	Add documentation for DAX/Verity exclusivity
---
 Documentation/filesystems/ext4/verity.rst | 3 +++
 fs/ext4/verity.c                          | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/Documentation/filesystems/ext4/verity.rst b/Documentation/filesystems/ext4/verity.rst
index 3e4c0ee0e068..e99ff3fd09f7 100644
--- a/Documentation/filesystems/ext4/verity.rst
+++ b/Documentation/filesystems/ext4/verity.rst
@@ -39,3 +39,6 @@ is encrypted as well as the data itself.
 
 Verity files cannot have blocks allocated past the end of the verity
 metadata.
+
+Verity and DAX are not compatible and attempts to set both of these flags
+on a file will fail.
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index dc5ec724d889..f05a09fb2ae4 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
 	handle_t *handle;
 	int err;
 
+	if (IS_DAX(inode))
+		return -EINVAL;
+
 	if (ext4_verity_in_progress(inode))
 		return -EBUSY;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V4 3/8] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
  2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
  2020-05-21 19:13 ` [PATCH V4 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
  2020-05-21 19:13 ` [PATCH V4 2/8] fs/ext4: Disallow verity if inode is DAX ira.weiny
@ 2020-05-21 19:13 ` ira.weiny
  2020-05-21 19:13 ` [PATCH V4 4/8] fs/ext4: Update ext4_should_use_dax() ira.weiny
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

In prep for the new tri-state mount option which then introduces
EXT4_MOUNT_DAX_NEVER.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes:
	New patch
---
 fs/ext4/ext4.h  |  4 ++--
 fs/ext4/inode.c |  2 +-
 fs/ext4/super.c | 12 ++++++------
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 91eb4381cae5..1a3daf2d18ef 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1123,9 +1123,9 @@ struct ext4_inode_info {
 #define EXT4_MOUNT_MINIX_DF		0x00080	/* Mimics the Minix statfs */
 #define EXT4_MOUNT_NOLOAD		0x00100	/* Don't use existing journal*/
 #ifdef CONFIG_FS_DAX
-#define EXT4_MOUNT_DAX			0x00200	/* Direct Access */
+#define EXT4_MOUNT_DAX_ALWAYS		0x00200	/* Direct Access */
 #else
-#define EXT4_MOUNT_DAX			0
+#define EXT4_MOUNT_DAX_ALWAYS		0
 #endif
 #define EXT4_MOUNT_DATA_FLAGS		0x00C00	/* Mode for data writes: */
 #define EXT4_MOUNT_JOURNAL_DATA		0x00400	/* Write data to journal */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2a4aae6acdcb..a10ff12194db 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,7 +4400,7 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_use_dax(struct inode *inode)
 {
-	if (!test_opt(inode->i_sb, DAX))
+	if (!test_opt(inode->i_sb, DAX_ALWAYS))
 		return false;
 	if (!S_ISREG(inode->i_mode))
 		return false;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index bf5fcb477f66..7b99c44d0a91 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1775,7 +1775,7 @@ static const struct mount_opts {
 	{Opt_min_batch_time, 0, MOPT_GTE0},
 	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
 	{Opt_init_itable, 0, MOPT_GTE0},
-	{Opt_dax, EXT4_MOUNT_DAX, MOPT_SET},
+	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
 	{Opt_stripe, 0, MOPT_GTE0},
 	{Opt_resuid, 0, MOPT_GTE0},
 	{Opt_resgid, 0, MOPT_GTE0},
@@ -3982,7 +3982,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 				 "both data=journal and dioread_nolock");
 			goto failed_mount;
 		}
-		if (test_opt(sb, DAX)) {
+		if (test_opt(sb, DAX_ALWAYS)) {
 			ext4_msg(sb, KERN_ERR, "can't mount with "
 				 "both data=journal and dax");
 			goto failed_mount;
@@ -4092,7 +4092,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		goto failed_mount;
 	}
 
-	if (sbi->s_mount_opt & EXT4_MOUNT_DAX) {
+	if (sbi->s_mount_opt & EXT4_MOUNT_DAX_ALWAYS) {
 		if (ext4_has_feature_inline_data(sb)) {
 			ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
 					" that may contain inline data");
@@ -5412,7 +5412,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 			err = -EINVAL;
 			goto restore_opts;
 		}
-		if (test_opt(sb, DAX)) {
+		if (test_opt(sb, DAX_ALWAYS)) {
 			ext4_msg(sb, KERN_ERR, "can't mount with "
 				 "both data=journal and dax");
 			err = -EINVAL;
@@ -5433,10 +5433,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 		goto restore_opts;
 	}
 
-	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX) {
+	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
 		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
 			"dax flag with busy inodes while remounting");
-		sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
+		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
 	}
 
 	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V4 4/8] fs/ext4: Update ext4_should_use_dax()
  2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (2 preceding siblings ...)
  2020-05-21 19:13 ` [PATCH V4 3/8] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS ira.weiny
@ 2020-05-21 19:13 ` ira.weiny
  2020-05-22 10:41   ` Jan Kara
  2020-05-21 19:13 ` [PATCH V4 5/8] fs/ext4: Only change S_DAX on inode load ira.weiny
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

S_DAX should only be enabled when the underlying block device supports
dax.

Cache the underlying support for DAX in the super block and modify
ext4_should_use_dax() to check for device support prior to the over
riding mount option.

While we are at it change the function to ext4_should_enable_dax() as
this better reflects the ask as well as matches xfs.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V3:
	Add a sb DAX supported flag for performance

Changes from RFC
	Change function name to 'should enable'
	Clean up bool conversion
	Reorder this for better bisect-ability
---
 fs/ext4/ext4.h  |  1 +
 fs/ext4/inode.c | 15 ++++++++++-----
 fs/ext4/super.c |  5 ++++-
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1a3daf2d18ef..0b4db9ce7756 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1979,6 +1979,7 @@ static inline bool ext4_has_incompat_features(struct super_block *sb)
  */
 #define EXT4_FLAGS_RESIZING	0
 #define EXT4_FLAGS_SHUTDOWN	1
+#define EXT4_FLAGS_BDEV_IS_DAX	2
 
 static inline int ext4_forced_shutdown(struct ext4_sb_info *sbi)
 {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a10ff12194db..6532870f6a0b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4398,10 +4398,10 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 		!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
 }
 
-static bool ext4_should_use_dax(struct inode *inode)
+static bool ext4_should_enable_dax(struct inode *inode)
 {
-	if (!test_opt(inode->i_sb, DAX_ALWAYS))
-		return false;
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+
 	if (!S_ISREG(inode->i_mode))
 		return false;
 	if (ext4_should_journal_data(inode))
@@ -4412,7 +4412,12 @@ static bool ext4_should_use_dax(struct inode *inode)
 		return false;
 	if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
 		return false;
-	return true;
+	if (!test_bit(EXT4_FLAGS_BDEV_IS_DAX, &sbi->s_ext4_flags))
+		return false;
+	if (test_opt(inode->i_sb, DAX_ALWAYS))
+		return true;
+
+	return false;
 }
 
 void ext4_set_inode_flags(struct inode *inode)
@@ -4430,7 +4435,7 @@ void ext4_set_inode_flags(struct inode *inode)
 		new_fl |= S_NOATIME;
 	if (flags & EXT4_DIRSYNC_FL)
 		new_fl |= S_DIRSYNC;
-	if (ext4_should_use_dax(inode))
+	if (ext4_should_enable_dax(inode))
 		new_fl |= S_DAX;
 	if (flags & EXT4_ENCRYPT_FL)
 		new_fl |= S_ENCRYPTED;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 7b99c44d0a91..f7d76dcaedfe 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4092,13 +4092,16 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		goto failed_mount;
 	}
 
+	if (bdev_dax_supported(sb->s_bdev, blocksize))
+		set_bit(EXT4_FLAGS_BDEV_IS_DAX, &sbi->s_ext4_flags);
+
 	if (sbi->s_mount_opt & EXT4_MOUNT_DAX_ALWAYS) {
 		if (ext4_has_feature_inline_data(sb)) {
 			ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
 					" that may contain inline data");
 			goto failed_mount;
 		}
-		if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
+		if (!test_bit(EXT4_FLAGS_BDEV_IS_DAX, &sbi->s_ext4_flags)) {
 			ext4_msg(sb, KERN_ERR,
 				"DAX unsupported by block device.");
 			goto failed_mount;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V4 5/8] fs/ext4: Only change S_DAX on inode load
  2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (3 preceding siblings ...)
  2020-05-21 19:13 ` [PATCH V4 4/8] fs/ext4: Update ext4_should_use_dax() ira.weiny
@ 2020-05-21 19:13 ` ira.weiny
  2020-05-21 19:13 ` [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state ira.weiny
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

To prevent complications with in memory inodes we only set S_DAX on
inode load.  FS_XFLAG_DAX can be changed at any time and S_DAX will
change after inode eviction and reload.

Add init bool to ext4_set_inode_flags() to indicate if the inode is
being newly initialized.

Assert that S_DAX is not set on an inode which is just being loaded.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V2:
	Rework based on moving the encryption patch to the end.

Changes from RFC:
	Change J_ASSERT() to WARN_ON_ONCE()
	Fix bug which would clear S_DAX incorrectly
---
 fs/ext4/ext4.h   |  2 +-
 fs/ext4/ialloc.c |  2 +-
 fs/ext4/inode.c  | 13 ++++++++++---
 fs/ext4/ioctl.c  |  3 ++-
 fs/ext4/super.c  |  4 ++--
 fs/ext4/verity.c |  2 +-
 6 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 0b4db9ce7756..f5291693ce6e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2693,7 +2693,7 @@ extern int ext4_can_truncate(struct inode *inode);
 extern int ext4_truncate(struct inode *);
 extern int ext4_break_layouts(struct inode *);
 extern int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length);
-extern void ext4_set_inode_flags(struct inode *);
+extern void ext4_set_inode_flags(struct inode *, bool init);
 extern int ext4_alloc_da_blocks(struct inode *inode);
 extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 4b8c9a9bdf0c..7941c140723f 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1116,7 +1116,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
 	ei->i_block_group = group;
 	ei->i_last_alloc_group = ~0;
 
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, true);
 	if (IS_DIRSYNC(inode))
 		ext4_handle_sync(handle);
 	if (insert_inode_locked(inode) < 0) {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6532870f6a0b..01636cf5f322 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4420,11 +4420,13 @@ static bool ext4_should_enable_dax(struct inode *inode)
 	return false;
 }
 
-void ext4_set_inode_flags(struct inode *inode)
+void ext4_set_inode_flags(struct inode *inode, bool init)
 {
 	unsigned int flags = EXT4_I(inode)->i_flags;
 	unsigned int new_fl = 0;
 
+	WARN_ON_ONCE(IS_DAX(inode) && init);
+
 	if (flags & EXT4_SYNC_FL)
 		new_fl |= S_SYNC;
 	if (flags & EXT4_APPEND_FL)
@@ -4435,8 +4437,13 @@ void ext4_set_inode_flags(struct inode *inode)
 		new_fl |= S_NOATIME;
 	if (flags & EXT4_DIRSYNC_FL)
 		new_fl |= S_DIRSYNC;
-	if (ext4_should_enable_dax(inode))
+
+	/* Because of the way inode_set_flags() works we must preserve S_DAX
+	 * here if already set. */
+	new_fl |= (inode->i_flags & S_DAX);
+	if (init && ext4_should_enable_dax(inode))
 		new_fl |= S_DAX;
+
 	if (flags & EXT4_ENCRYPT_FL)
 		new_fl |= S_ENCRYPTED;
 	if (flags & EXT4_CASEFOLD_FL)
@@ -4650,7 +4657,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 		 * not initialized on a new filesystem. */
 	}
 	ei->i_flags = le32_to_cpu(raw_inode->i_flags);
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, true);
 	inode->i_blocks = ext4_inode_blocks(raw_inode, ei);
 	ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo);
 	if (ext4_has_feature_64bit(sb))
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 5813e5e73eab..145083e8cd1e 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -381,7 +381,8 @@ static int ext4_ioctl_setflags(struct inode *inode,
 			ext4_clear_inode_flag(inode, i);
 	}
 
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, false);
+
 	inode->i_ctime = current_time(inode);
 
 	err = ext4_mark_iloc_dirty(handle, inode, &iloc);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index f7d76dcaedfe..80eb814c47eb 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1348,7 +1348,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
 			 * S_DAX may be disabled
 			 */
-			ext4_set_inode_flags(inode);
+			ext4_set_inode_flags(inode, false);
 		}
 		return res;
 	}
@@ -1375,7 +1375,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
 		 * S_DAX may be disabled
 		 */
-		ext4_set_inode_flags(inode);
+		ext4_set_inode_flags(inode, false);
 		res = ext4_mark_inode_dirty(handle, inode);
 		if (res)
 			EXT4_ERROR_INODE(inode, "Failed to mark inode dirty");
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index f05a09fb2ae4..89a155ece323 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -244,7 +244,7 @@ static int ext4_end_enable_verity(struct file *filp, const void *desc,
 		if (err)
 			goto out_stop;
 		ext4_set_inode_flag(inode, EXT4_INODE_VERITY);
-		ext4_set_inode_flags(inode);
+		ext4_set_inode_flags(inode, false);
 		err = ext4_mark_iloc_dirty(handle, inode, &iloc);
 	}
 out_stop:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state
  2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (4 preceding siblings ...)
  2020-05-21 19:13 ` [PATCH V4 5/8] fs/ext4: Only change S_DAX on inode load ira.weiny
@ 2020-05-21 19:13 ` ira.weiny
  2020-05-27  5:54   ` Xiao Yang
  2020-05-21 19:13 ` [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag ira.weiny
  2020-05-21 19:13 ` [PATCH V4 8/8] Documentation/dax: Update DAX enablement for ext4 ira.weiny
  7 siblings, 1 reply; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
operate the same which is equivalent to 'always'.  This new
functionality is limited to ext4 only.

Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.

We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.

Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
specified that option for printing.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V1:
	Fix up mounting options to only show an option if specified
	Fix remount to prevent dax changes
	Isolate behavior to ext4 only

Changes from RFC:
	Combine remount check for DAX_NEVER with DAX_ALWAYS
	Update ext4_should_enable_dax()
---
 fs/ext4/ext4.h  |  2 ++
 fs/ext4/inode.c |  2 ++
 fs/ext4/super.c | 67 +++++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f5291693ce6e..65ffb831b2b9 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1168,6 +1168,8 @@ struct ext4_inode_info {
 						      blocks */
 #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
 						      file systems */
+#define EXT4_MOUNT2_DAX_NEVER		0x00000008 /* Do not allow Direct Access */
+#define EXT4_MOUNT2_DAX_INODE		0x00000010 /* For printing options only */
 
 #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
 						specified journal checksum */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 01636cf5f322..68fac9289109 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4402,6 +4402,8 @@ static bool ext4_should_enable_dax(struct inode *inode)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 
+	if (test_opt2(inode->i_sb, DAX_NEVER))
+		return false;
 	if (!S_ISREG(inode->i_mode))
 		return false;
 	if (ext4_should_journal_data(inode))
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 80eb814c47eb..5e056aa20ce9 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1512,7 +1512,8 @@ enum {
 	Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
 	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
 	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
-	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
+	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
+	Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
 	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
 	Opt_nowarn_on_error, Opt_mblk_io_submit,
 	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
@@ -1579,6 +1580,9 @@ static const match_table_t tokens = {
 	{Opt_nobarrier, "nobarrier"},
 	{Opt_i_version, "i_version"},
 	{Opt_dax, "dax"},
+	{Opt_dax_always, "dax=always"},
+	{Opt_dax_inode, "dax=inode"},
+	{Opt_dax_never, "dax=never"},
 	{Opt_stripe, "stripe=%u"},
 	{Opt_delalloc, "delalloc"},
 	{Opt_warn_on_error, "warn_on_error"},
@@ -1726,6 +1730,7 @@ static int clear_qf_name(struct super_block *sb, int qtype)
 #define MOPT_NO_EXT3	0x0200
 #define MOPT_EXT4_ONLY	(MOPT_NO_EXT2 | MOPT_NO_EXT3)
 #define MOPT_STRING	0x0400
+#define MOPT_SKIP	0x0800
 
 static const struct mount_opts {
 	int	token;
@@ -1775,7 +1780,13 @@ static const struct mount_opts {
 	{Opt_min_batch_time, 0, MOPT_GTE0},
 	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
 	{Opt_init_itable, 0, MOPT_GTE0},
-	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
+	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET | MOPT_SKIP},
+	{Opt_dax_always, EXT4_MOUNT_DAX_ALWAYS,
+		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
+	{Opt_dax_inode, EXT4_MOUNT2_DAX_INODE,
+		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
+	{Opt_dax_never, EXT4_MOUNT2_DAX_NEVER,
+		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
 	{Opt_stripe, 0, MOPT_GTE0},
 	{Opt_resuid, 0, MOPT_GTE0},
 	{Opt_resgid, 0, MOPT_GTE0},
@@ -2084,13 +2095,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
 		}
 		sbi->s_jquota_fmt = m->mount_opt;
 #endif
-	} else if (token == Opt_dax) {
+	} else if (token == Opt_dax || token == Opt_dax_always ||
+		   token == Opt_dax_inode || token == Opt_dax_never) {
 #ifdef CONFIG_FS_DAX
-		ext4_msg(sb, KERN_WARNING,
-		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
-		sbi->s_mount_opt |= m->mount_opt;
+		switch (token) {
+		case Opt_dax:
+		case Opt_dax_always:
+			ext4_msg(sb, KERN_WARNING,
+				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
+			sbi->s_mount_opt |= EXT4_MOUNT_DAX_ALWAYS;
+			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
+			break;
+		case Opt_dax_never:
+			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
+			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
+			break;
+		case Opt_dax_inode:
+			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
+			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
+			/* Strictly for printing options */
+			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_INODE;
+			break;
+		}
 #else
 		ext4_msg(sb, KERN_INFO, "dax option not supported");
+		sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
+		sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
 		return -1;
 #endif
 	} else if (token == Opt_data_err_abort) {
@@ -2254,7 +2284,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
 	for (m = ext4_mount_opts; m->token != Opt_err; m++) {
 		int want_set = m->flags & MOPT_SET;
 		if (((m->flags & (MOPT_SET|MOPT_CLEAR)) == 0) ||
-		    (m->flags & MOPT_CLEAR_ERR))
+		    (m->flags & MOPT_CLEAR_ERR) || m->flags & MOPT_SKIP)
 			continue;
 		if (!nodefs && !(m->mount_opt & (sbi->s_mount_opt ^ def_mount_opt)))
 			continue; /* skip if same as the default */
@@ -2314,6 +2344,17 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
 	if (DUMMY_ENCRYPTION_ENABLED(sbi))
 		SEQ_OPTS_PUTS("test_dummy_encryption");
 
+	if (test_opt(sb, DAX_ALWAYS)) {
+		if (IS_EXT2_SB(sb))
+			SEQ_OPTS_PUTS("dax");
+		else
+			SEQ_OPTS_PUTS("dax=always");
+	} else if (test_opt2(sb, DAX_NEVER)) {
+		SEQ_OPTS_PUTS("dax=never");
+	} else if (test_opt2(sb, DAX_INODE)) {
+		SEQ_OPTS_PUTS("dax=inode");
+	}
+
 	ext4_show_quota_options(seq, sb);
 	return 0;
 }
@@ -5436,10 +5477,16 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 		goto restore_opts;
 	}
 
-	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
+	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS ||
+	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_DAX_NEVER ||
+	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_DAX_INODE) {
 		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
-			"dax flag with busy inodes while remounting");
-		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
+			"dax mount option with busy inodes while remounting");
+		sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
+		sbi->s_mount_opt |= old_opts.s_mount_opt & EXT4_MOUNT_DAX_ALWAYS;
+		sbi->s_mount_opt2 &= ~(EXT4_MOUNT2_DAX_NEVER | EXT4_MOUNT2_DAX_INODE);
+		sbi->s_mount_opt2 |= old_opts.s_mount_opt2 &
+				     (EXT4_MOUNT2_DAX_NEVER | EXT4_MOUNT2_DAX_INODE);
 	}
 
 	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag
  2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (5 preceding siblings ...)
  2020-05-21 19:13 ` [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state ira.weiny
@ 2020-05-21 19:13 ` ira.weiny
  2020-05-22 11:48   ` Jan Kara
  2020-05-23  5:13   ` Andreas Dilger
  2020-05-21 19:13 ` [PATCH V4 8/8] Documentation/dax: Update DAX enablement for ext4 ira.weiny
  7 siblings, 2 replies; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.

Set the flag to be user visible and changeable.  Set the flag to be
inherited.  Allow applications to change the flag at any time with the
exception of if VERITY or ENCRYPT is set.

Disallow setting VERITY or ENCRYPT if DAX is set.

Finally, on regular files, flag the inode to not be cached to facilitate
changing S_DAX on the next creation of the inode.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V3:
	Move bit to bit25 per Andreas

Change from V2:
	Add in making verity and DAX exclusive.
	'Squash' in making encryption and DAX exclusive.
	Add in EXT4_INODE_DAX flag definition to be compatible with
		ext4_[set|test]_inode_flag() bit operations
	Use ext4_[set|test]_inode_flag() bit operations to be consistent
		with other code.

Change from V0:
	Add FS_DAX_FL to include/uapi/linux/fs.h
		to be consistent
	Move ext4_dax_dontcache() to ext4_ioctl_setflags()
		This ensures that it is only set when the flags are going to be
		set and not if there is an error
		Also this sets don't cache in the FS_IOC_SETFLAGS case

Change from RFC:
	use new d_mark_dontcache()
	Allow caching if ALWAYS/NEVER is set
	Rebased to latest Linus master
	Change flag to unused 0x01000000
	update ext4_should_enable_dax()
---
 fs/ext4/ext4.h          | 14 ++++++++++----
 fs/ext4/inode.c         |  2 +-
 fs/ext4/ioctl.c         | 34 +++++++++++++++++++++++++++++++++-
 fs/ext4/super.c         |  3 +++
 fs/ext4/verity.c        |  2 +-
 include/uapi/linux/fs.h |  1 +
 6 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 65ffb831b2b9..09b8906568d2 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -415,13 +415,16 @@ struct flex_groups {
 #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
 #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
 /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
+
+#define EXT4_DAX_FL			0x02000000 /* Inode is DAX */
+
 #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
 #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
 #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
 #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
 
-#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
-#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
+#define EXT4_FL_USER_VISIBLE		0x725BDFFF /* User visible flags */
+#define EXT4_FL_USER_MODIFIABLE		0x624BC0FF /* User modifiable flags */
 
 /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
 #define EXT4_FL_XFLAG_VISIBLE		(EXT4_SYNC_FL | \
@@ -429,14 +432,16 @@ struct flex_groups {
 					 EXT4_APPEND_FL | \
 					 EXT4_NODUMP_FL | \
 					 EXT4_NOATIME_FL | \
-					 EXT4_PROJINHERIT_FL)
+					 EXT4_PROJINHERIT_FL | \
+					 EXT4_DAX_FL)
 
 /* Flags that should be inherited by new inodes from their parent. */
 #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
 			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
 			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
 			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
-			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
+			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
+			   EXT4_DAX_FL)
 
 /* Flags that are appropriate for regular files (all but dir-specific ones). */
 #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
@@ -488,6 +493,7 @@ enum {
 	EXT4_INODE_VERITY	= 20,	/* Verity protected inode */
 	EXT4_INODE_EA_INODE	= 21,	/* Inode used for large EA */
 /* 22 was formerly EXT4_INODE_EOFBLOCKS */
+	EXT4_INODE_DAX		= 25,	/* Inode is DAX */
 	EXT4_INODE_INLINE_DATA	= 28,	/* Data in inode. */
 	EXT4_INODE_PROJINHERIT	= 29,	/* Create with parents projid */
 	EXT4_INODE_RESERVED	= 31,	/* reserved for ext4 lib */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 68fac9289109..778b0dbe3da6 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4419,7 +4419,7 @@ static bool ext4_should_enable_dax(struct inode *inode)
 	if (test_opt(inode->i_sb, DAX_ALWAYS))
 		return true;
 
-	return false;
+	return ext4_test_inode_flag(inode, EXT4_INODE_DAX);
 }
 
 void ext4_set_inode_flags(struct inode *inode, bool init)
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 145083e8cd1e..668b8c17d6eb 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -292,6 +292,21 @@ static int ext4_ioctl_check_immutable(struct inode *inode, __u32 new_projid,
 	return 0;
 }
 
+static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
+{
+	struct ext4_inode_info *ei = EXT4_I(inode);
+
+	if (S_ISDIR(inode->i_mode))
+		return;
+
+	if (test_opt2(inode->i_sb, DAX_NEVER) ||
+	    test_opt(inode->i_sb, DAX_ALWAYS))
+		return;
+
+	if ((ei->i_flags ^ flags) & EXT4_DAX_FL)
+		d_mark_dontcache(inode);
+}
+
 static int ext4_ioctl_setflags(struct inode *inode,
 			       unsigned int flags)
 {
@@ -303,6 +318,16 @@ static int ext4_ioctl_setflags(struct inode *inode,
 	unsigned int jflag;
 	struct super_block *sb = inode->i_sb;
 
+	if (ext4_test_inode_flag(inode, EXT4_INODE_DAX)) {
+		if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY) ||
+		    ext4_test_inode_flag(inode, EXT4_INODE_ENCRYPT) ||
+		    ext4_test_inode_state(inode,
+					  EXT4_STATE_VERITY_IN_PROGRESS)) {
+			err = -EOPNOTSUPP;
+			goto flags_out;
+		}
+	}
+
 	/* Is it quota file? Do not allow user to mess with it */
 	if (ext4_is_quota_file(inode))
 		goto flags_out;
@@ -369,6 +394,8 @@ static int ext4_ioctl_setflags(struct inode *inode,
 	if (err)
 		goto flags_err;
 
+	ext4_dax_dontcache(inode, flags);
+
 	for (i = 0, mask = 1; i < 32; i++, mask <<= 1) {
 		if (!(mask & EXT4_FL_USER_MODIFIABLE))
 			continue;
@@ -528,12 +555,15 @@ static inline __u32 ext4_iflags_to_xflags(unsigned long iflags)
 		xflags |= FS_XFLAG_NOATIME;
 	if (iflags & EXT4_PROJINHERIT_FL)
 		xflags |= FS_XFLAG_PROJINHERIT;
+	if (iflags & EXT4_DAX_FL)
+		xflags |= FS_XFLAG_DAX;
 	return xflags;
 }
 
 #define EXT4_SUPPORTED_FS_XFLAGS (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | \
 				  FS_XFLAG_APPEND | FS_XFLAG_NODUMP | \
-				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT)
+				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT | \
+				  FS_XFLAG_DAX)
 
 /* Transfer xflags flags to internal */
 static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
@@ -552,6 +582,8 @@ static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
 		iflags |= EXT4_NOATIME_FL;
 	if (xflags & FS_XFLAG_PROJINHERIT)
 		iflags |= EXT4_PROJINHERIT_FL;
+	if (xflags & FS_XFLAG_DAX)
+		iflags |= EXT4_DAX_FL;
 
 	return iflags;
 }
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 5e056aa20ce9..3658e3016999 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1323,6 +1323,9 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
 		return -EINVAL;
 
+	if (ext4_test_inode_flag(inode, EXT4_INODE_DAX))
+		return -EOPNOTSUPP;
+
 	res = ext4_convert_inline_data(inode);
 	if (res)
 		return res;
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index 89a155ece323..4fecb3e4e338 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -113,7 +113,7 @@ static int ext4_begin_enable_verity(struct file *filp)
 	handle_t *handle;
 	int err;
 
-	if (IS_DAX(inode))
+	if (IS_DAX(inode) || ext4_test_inode_flag(inode, EXT4_INODE_DAX))
 		return -EINVAL;
 
 	if (ext4_verity_in_progress(inode))
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 379a612f8f1d..f44eb0a04afd 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -262,6 +262,7 @@ struct fsxattr {
 #define FS_EA_INODE_FL			0x00200000 /* Inode used for large EA */
 #define FS_EOFBLOCKS_FL			0x00400000 /* Reserved for ext4 */
 #define FS_NOCOW_FL			0x00800000 /* Do not cow file */
+#define FS_DAX_FL			0x02000000 /* Inode is DAX */
 #define FS_INLINE_DATA_FL		0x10000000 /* Reserved for ext4 */
 #define FS_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
 #define FS_CASEFOLD_FL			0x40000000 /* Folder is case insensitive */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V4 8/8] Documentation/dax: Update DAX enablement for ext4
  2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (6 preceding siblings ...)
  2020-05-21 19:13 ` [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag ira.weiny
@ 2020-05-21 19:13 ` ira.weiny
  7 siblings, 0 replies; 20+ messages in thread
From: ira.weiny @ 2020-05-21 19:13 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara, Eric Biggers
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Update the document to reflect ext4 and xfs now behave the same.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Update with ext2 text...
---
 Documentation/filesystems/dax.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/dax.txt b/Documentation/filesystems/dax.txt
index 735fb4b54117..265c4f808dbf 100644
--- a/Documentation/filesystems/dax.txt
+++ b/Documentation/filesystems/dax.txt
@@ -25,7 +25,7 @@ size when creating the filesystem.
 Currently 3 filesystems support DAX: ext2, ext4 and xfs.  Enabling DAX on them
 is different.
 
-Enabling DAX on ext4 and ext2
+Enabling DAX on ext2
 -----------------------------
 
 When mounting the filesystem, use the "-o dax" option on the command line or
@@ -33,8 +33,8 @@ add 'dax' to the options in /etc/fstab.  This works to enable DAX on all files
 within the filesystem.  It is equivalent to the '-o dax=always' behavior below.
 
 
-Enabling DAX on xfs
--------------------
+Enabling DAX on xfs and ext4
+----------------------------
 
 Summary
 -------
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 1/8] fs/ext4: Narrow scope of DAX check in setflags
  2020-05-21 19:13 ` [PATCH V4 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
@ 2020-05-22  7:58   ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2020-05-22  7:58 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On Thu 21-05-20 12:13:06, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> When preventing DAX and journaling on an inode.  Use the effective DAX
> check rather than the mount option.
> 
> This will be required to support per inode DAX flags.
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
...
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index bfc1281fc4cb..5813e5e73eab 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -393,9 +393,9 @@ static int ext4_ioctl_setflags(struct inode *inode,
>  	if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
>  		/*
>  		 * Changes to the journaling mode can cause unsafe changes to
> -		 * S_DAX if we are using the DAX mount option.
> +		 * S_DAX if the inode is DAX
>  		 */
> -		if (test_opt(inode->i_sb, DAX)) {
> +		if (IS_DAX(inode)) {
>  			err = -EBUSY;
>  			goto flags_out;
>  		}

Now one problem popped up in my mind: We should also make EXT4_JOURNAL_DATA_FL
and EXT4_DAX_FL exclusive so that both cannot be turned on at the same
time. And IS_DAX() check isn't enough for that. Sorry for not noticing
earlier... it's like peeling an onion...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 4/8] fs/ext4: Update ext4_should_use_dax()
  2020-05-21 19:13 ` [PATCH V4 4/8] fs/ext4: Update ext4_should_use_dax() ira.weiny
@ 2020-05-22 10:41   ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2020-05-22 10:41 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On Thu 21-05-20 12:13:09, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> S_DAX should only be enabled when the underlying block device supports
> dax.
> 
> Cache the underlying support for DAX in the super block and modify
> ext4_should_use_dax() to check for device support prior to the over
> riding mount option.
> 
> While we are at it change the function to ext4_should_enable_dax() as
> this better reflects the ask as well as matches xfs.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

The patch looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

									Honza

> 
> ---
> Changes from V3:
> 	Add a sb DAX supported flag for performance
> 
> Changes from RFC
> 	Change function name to 'should enable'
> 	Clean up bool conversion
> 	Reorder this for better bisect-ability
> ---
>  fs/ext4/ext4.h  |  1 +
>  fs/ext4/inode.c | 15 ++++++++++-----
>  fs/ext4/super.c |  5 ++++-
>  3 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 1a3daf2d18ef..0b4db9ce7756 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1979,6 +1979,7 @@ static inline bool ext4_has_incompat_features(struct super_block *sb)
>   */
>  #define EXT4_FLAGS_RESIZING	0
>  #define EXT4_FLAGS_SHUTDOWN	1
> +#define EXT4_FLAGS_BDEV_IS_DAX	2
>  
>  static inline int ext4_forced_shutdown(struct ext4_sb_info *sbi)
>  {
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index a10ff12194db..6532870f6a0b 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4398,10 +4398,10 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
>  		!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
>  }
>  
> -static bool ext4_should_use_dax(struct inode *inode)
> +static bool ext4_should_enable_dax(struct inode *inode)
>  {
> -	if (!test_opt(inode->i_sb, DAX_ALWAYS))
> -		return false;
> +	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +
>  	if (!S_ISREG(inode->i_mode))
>  		return false;
>  	if (ext4_should_journal_data(inode))
> @@ -4412,7 +4412,12 @@ static bool ext4_should_use_dax(struct inode *inode)
>  		return false;
>  	if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
>  		return false;
> -	return true;
> +	if (!test_bit(EXT4_FLAGS_BDEV_IS_DAX, &sbi->s_ext4_flags))
> +		return false;
> +	if (test_opt(inode->i_sb, DAX_ALWAYS))
> +		return true;
> +
> +	return false;
>  }
>  
>  void ext4_set_inode_flags(struct inode *inode)
> @@ -4430,7 +4435,7 @@ void ext4_set_inode_flags(struct inode *inode)
>  		new_fl |= S_NOATIME;
>  	if (flags & EXT4_DIRSYNC_FL)
>  		new_fl |= S_DIRSYNC;
> -	if (ext4_should_use_dax(inode))
> +	if (ext4_should_enable_dax(inode))
>  		new_fl |= S_DAX;
>  	if (flags & EXT4_ENCRYPT_FL)
>  		new_fl |= S_ENCRYPTED;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 7b99c44d0a91..f7d76dcaedfe 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4092,13 +4092,16 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  		goto failed_mount;
>  	}
>  
> +	if (bdev_dax_supported(sb->s_bdev, blocksize))
> +		set_bit(EXT4_FLAGS_BDEV_IS_DAX, &sbi->s_ext4_flags);
> +
>  	if (sbi->s_mount_opt & EXT4_MOUNT_DAX_ALWAYS) {
>  		if (ext4_has_feature_inline_data(sb)) {
>  			ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
>  					" that may contain inline data");
>  			goto failed_mount;
>  		}
> -		if (!bdev_dax_supported(sb->s_bdev, blocksize)) {
> +		if (!test_bit(EXT4_FLAGS_BDEV_IS_DAX, &sbi->s_ext4_flags)) {
>  			ext4_msg(sb, KERN_ERR,
>  				"DAX unsupported by block device.");
>  			goto failed_mount;
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag
  2020-05-21 19:13 ` [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag ira.weiny
@ 2020-05-22 11:48   ` Jan Kara
  2020-05-25  4:39     ` Ira Weiny
  2020-05-23  5:13   ` Andreas Dilger
  1 sibling, 1 reply; 20+ messages in thread
From: Jan Kara @ 2020-05-22 11:48 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On Thu 21-05-20 12:13:12, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> 
> Set the flag to be user visible and changeable.  Set the flag to be
> inherited.  Allow applications to change the flag at any time with the
> exception of if VERITY or ENCRYPT is set.
> 
> Disallow setting VERITY or ENCRYPT if DAX is set.
> 
> Finally, on regular files, flag the inode to not be cached to facilitate
> changing S_DAX on the next creation of the inode.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

...

> @@ -303,6 +318,16 @@ static int ext4_ioctl_setflags(struct inode *inode,
>  	unsigned int jflag;
>  	struct super_block *sb = inode->i_sb;
>  
> +	if (ext4_test_inode_flag(inode, EXT4_INODE_DAX)) {
> +		if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY) ||
> +		    ext4_test_inode_flag(inode, EXT4_INODE_ENCRYPT) ||
> +		    ext4_test_inode_state(inode,
> +					  EXT4_STATE_VERITY_IN_PROGRESS)) {
> +			err = -EOPNOTSUPP;
> +			goto flags_out;
> +		}
> +	}

The way this check is implemented wouldn't IMO do what we need... It
doesn't check the flags that are being set but just the current inode
state. I think it should rather be:

	if ((flags ^ oldflags) & EXT4_INODE_DAX_FL) {
		...
	}

And perhaps move this to a place in ext4_ioctl_setflags() where we check
other similar conflicts.

And then we should check conflicts with the journal flag as well, as I
mentioned in reply to the first patch. There it is more complicated by the
fact that we should disallow setting of both EXT4_INODE_DAX_FL and
EXT4_JOURNAL_DATA_FL at the same time so the checks will be somewhat more
complicated.

								Honza

> +
>  	/* Is it quota file? Do not allow user to mess with it */
>  	if (ext4_is_quota_file(inode))
>  		goto flags_out;
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag
  2020-05-21 19:13 ` [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag ira.weiny
  2020-05-22 11:48   ` Jan Kara
@ 2020-05-23  5:13   ` Andreas Dilger
  1 sibling, 0 replies; 20+ messages in thread
From: Andreas Dilger @ 2020-05-23  5:13 UTC (permalink / raw)
  To: ira.weiny
  Cc: Ext4 Developers List, Theodore Y. Ts'o, Jan Kara,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 8878 bytes --]

On May 21, 2020, at 1:13 PM, ira.weiny@intel.com wrote:
> 
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> 
> Set the flag to be user visible and changeable.  Set the flag to be
> inherited.  Allow applications to change the flag at any time with the
> exception of if VERITY or ENCRYPT is set.
> 
> Disallow setting VERITY or ENCRYPT if DAX is set.
> 
> Finally, on regular files, flag the inode to not be cached to facilitate
> changing S_DAX on the next creation of the inode.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> 
> ---
> Changes from V3:
> 	Move bit to bit25 per Andreas
> 
> Change from V2:
> 	Add in making verity and DAX exclusive.
> 	'Squash' in making encryption and DAX exclusive.
> 	Add in EXT4_INODE_DAX flag definition to be compatible with
> 		ext4_[set|test]_inode_flag() bit operations
> 	Use ext4_[set|test]_inode_flag() bit operations to be consistent
> 		with other code.
> 
> Change from V0:
> 	Add FS_DAX_FL to include/uapi/linux/fs.h
> 		to be consistent
> 	Move ext4_dax_dontcache() to ext4_ioctl_setflags()
> 		This ensures that it is only set when the flags are going to be
> 		set and not if there is an error
> 		Also this sets don't cache in the FS_IOC_SETFLAGS case
> 
> Change from RFC:
> 	use new d_mark_dontcache()
> 	Allow caching if ALWAYS/NEVER is set
> 	Rebased to latest Linus master
> 	Change flag to unused 0x01000000
> 	update ext4_should_enable_dax()
> ---
> fs/ext4/ext4.h          | 14 ++++++++++----
> fs/ext4/inode.c         |  2 +-
> fs/ext4/ioctl.c         | 34 +++++++++++++++++++++++++++++++++-
> fs/ext4/super.c         |  3 +++
> fs/ext4/verity.c        |  2 +-
> include/uapi/linux/fs.h |  1 +
> 6 files changed, 49 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 65ffb831b2b9..09b8906568d2 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -415,13 +415,16 @@ struct flex_groups {
> #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
> #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
> +
> +#define EXT4_DAX_FL			0x02000000 /* Inode is DAX */
> +
> #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
> #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
> #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
> #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
> 
> -#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
> -#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
> +#define EXT4_FL_USER_VISIBLE		0x725BDFFF /* User visible flags */
> +#define EXT4_FL_USER_MODIFIABLE		0x624BC0FF /* User modifiable flags */
> 
> /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
> #define EXT4_FL_XFLAG_VISIBLE		(EXT4_SYNC_FL | \
> @@ -429,14 +432,16 @@ struct flex_groups {
> 					 EXT4_APPEND_FL | \
> 					 EXT4_NODUMP_FL | \
> 					 EXT4_NOATIME_FL | \
> -					 EXT4_PROJINHERIT_FL)
> +					 EXT4_PROJINHERIT_FL | \
> +					 EXT4_DAX_FL)
> 
> /* Flags that should be inherited by new inodes from their parent. */
> #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
> 			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
> 			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
> 			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
> -			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
> +			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
> +			   EXT4_DAX_FL)
> 
> /* Flags that are appropriate for regular files (all but dir-specific ones). */
> #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
> @@ -488,6 +493,7 @@ enum {
> 	EXT4_INODE_VERITY	= 20,	/* Verity protected inode */
> 	EXT4_INODE_EA_INODE	= 21,	/* Inode used for large EA */
> /* 22 was formerly EXT4_INODE_EOFBLOCKS */
> +	EXT4_INODE_DAX		= 25,	/* Inode is DAX */
> 	EXT4_INODE_INLINE_DATA	= 28,	/* Data in inode. */
> 	EXT4_INODE_PROJINHERIT	= 29,	/* Create with parents projid */
> 	EXT4_INODE_RESERVED	= 31,	/* reserved for ext4 lib */
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 68fac9289109..778b0dbe3da6 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4419,7 +4419,7 @@ static bool ext4_should_enable_dax(struct inode *inode)
> 	if (test_opt(inode->i_sb, DAX_ALWAYS))
> 		return true;
> 
> -	return false;
> +	return ext4_test_inode_flag(inode, EXT4_INODE_DAX);
> }
> 
> void ext4_set_inode_flags(struct inode *inode, bool init)
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index 145083e8cd1e..668b8c17d6eb 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -292,6 +292,21 @@ static int ext4_ioctl_check_immutable(struct inode *inode, __u32 new_projid,
> 	return 0;
> }
> 
> +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> +{
> +	struct ext4_inode_info *ei = EXT4_I(inode);
> +
> +	if (S_ISDIR(inode->i_mode))
> +		return;
> +
> +	if (test_opt2(inode->i_sb, DAX_NEVER) ||
> +	    test_opt(inode->i_sb, DAX_ALWAYS))
> +		return;
> +
> +	if ((ei->i_flags ^ flags) & EXT4_DAX_FL)
> +		d_mark_dontcache(inode);
> +}
> +
> static int ext4_ioctl_setflags(struct inode *inode,
> 			       unsigned int flags)
> {
> @@ -303,6 +318,16 @@ static int ext4_ioctl_setflags(struct inode *inode,
> 	unsigned int jflag;
> 	struct super_block *sb = inode->i_sb;
> 
> +	if (ext4_test_inode_flag(inode, EXT4_INODE_DAX)) {
> +		if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY) ||
> +		    ext4_test_inode_flag(inode, EXT4_INODE_ENCRYPT) ||
> +		    ext4_test_inode_state(inode,
> +					  EXT4_STATE_VERITY_IN_PROGRESS)) {
> +			err = -EOPNOTSUPP;
> +			goto flags_out;
> +		}
> +	}
> +
> 	/* Is it quota file? Do not allow user to mess with it */
> 	if (ext4_is_quota_file(inode))
> 		goto flags_out;
> @@ -369,6 +394,8 @@ static int ext4_ioctl_setflags(struct inode *inode,
> 	if (err)
> 		goto flags_err;
> 
> +	ext4_dax_dontcache(inode, flags);
> +
> 	for (i = 0, mask = 1; i < 32; i++, mask <<= 1) {
> 		if (!(mask & EXT4_FL_USER_MODIFIABLE))
> 			continue;
> @@ -528,12 +555,15 @@ static inline __u32 ext4_iflags_to_xflags(unsigned long iflags)
> 		xflags |= FS_XFLAG_NOATIME;
> 	if (iflags & EXT4_PROJINHERIT_FL)
> 		xflags |= FS_XFLAG_PROJINHERIT;
> +	if (iflags & EXT4_DAX_FL)
> +		xflags |= FS_XFLAG_DAX;
> 	return xflags;
> }
> 
> #define EXT4_SUPPORTED_FS_XFLAGS (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | \
> 				  FS_XFLAG_APPEND | FS_XFLAG_NODUMP | \
> -				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT)
> +				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT | \
> +				  FS_XFLAG_DAX)
> 
> /* Transfer xflags flags to internal */
> static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
> @@ -552,6 +582,8 @@ static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
> 		iflags |= EXT4_NOATIME_FL;
> 	if (xflags & FS_XFLAG_PROJINHERIT)
> 		iflags |= EXT4_PROJINHERIT_FL;
> +	if (xflags & FS_XFLAG_DAX)
> +		iflags |= EXT4_DAX_FL;
> 
> 	return iflags;
> }
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 5e056aa20ce9..3658e3016999 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1323,6 +1323,9 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> 	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> 		return -EINVAL;
> 
> +	if (ext4_test_inode_flag(inode, EXT4_INODE_DAX))
> +		return -EOPNOTSUPP;
> +
> 	res = ext4_convert_inline_data(inode);
> 	if (res)
> 		return res;
> diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> index 89a155ece323..4fecb3e4e338 100644
> --- a/fs/ext4/verity.c
> +++ b/fs/ext4/verity.c
> @@ -113,7 +113,7 @@ static int ext4_begin_enable_verity(struct file *filp)
> 	handle_t *handle;
> 	int err;
> 
> -	if (IS_DAX(inode))
> +	if (IS_DAX(inode) || ext4_test_inode_flag(inode, EXT4_INODE_DAX))
> 		return -EINVAL;
> 
> 	if (ext4_verity_in_progress(inode))
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index 379a612f8f1d..f44eb0a04afd 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -262,6 +262,7 @@ struct fsxattr {
> #define FS_EA_INODE_FL			0x00200000 /* Inode used for large EA */
> #define FS_EOFBLOCKS_FL			0x00400000 /* Reserved for ext4 */
> #define FS_NOCOW_FL			0x00800000 /* Do not cow file */
> +#define FS_DAX_FL			0x02000000 /* Inode is DAX */
> #define FS_INLINE_DATA_FL		0x10000000 /* Reserved for ext4 */
> #define FS_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
> #define FS_CASEFOLD_FL			0x40000000 /* Folder is case insensitive */
> --
> 2.25.1
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag
  2020-05-22 11:48   ` Jan Kara
@ 2020-05-25  4:39     ` Ira Weiny
  2020-05-25  7:28       ` Jan Kara
  0 siblings, 1 reply; 20+ messages in thread
From: Ira Weiny @ 2020-05-25  4:39 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Eric Biggers,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Fri, May 22, 2020 at 01:48:48PM +0200, Jan Kara wrote:
> On Thu 21-05-20 12:13:12, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> > 
> > Set the flag to be user visible and changeable.  Set the flag to be
> > inherited.  Allow applications to change the flag at any time with the
> > exception of if VERITY or ENCRYPT is set.
> > 
> > Disallow setting VERITY or ENCRYPT if DAX is set.
> > 
> > Finally, on regular files, flag the inode to not be cached to facilitate
> > changing S_DAX on the next creation of the inode.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ...
> 
> > @@ -303,6 +318,16 @@ static int ext4_ioctl_setflags(struct inode *inode,
> >  	unsigned int jflag;
> >  	struct super_block *sb = inode->i_sb;
> >  
> > +	if (ext4_test_inode_flag(inode, EXT4_INODE_DAX)) {
> > +		if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY) ||
> > +		    ext4_test_inode_flag(inode, EXT4_INODE_ENCRYPT) ||
> > +		    ext4_test_inode_state(inode,
> > +					  EXT4_STATE_VERITY_IN_PROGRESS)) {
> > +			err = -EOPNOTSUPP;
> > +			goto flags_out;
> > +		}
> > +	}
> 
> The way this check is implemented wouldn't IMO do what we need... It
> doesn't check the flags that are being set but just the current inode
> state. I think it should rather be:

Sorry, I got confused by the flags when I wrote this.

> 
> 	if ((flags ^ oldflags) & EXT4_INODE_DAX_FL) {
> 		...
> 	}
> 
> And perhaps move this to a place in ext4_ioctl_setflags() where we check
> other similar conflicts.

Sure.  It seems like a ext4_setflags_prepare() helper would be in order.  I'll
see what I can do.

> 
> And then we should check conflicts with the journal flag as well, as I
> mentioned in reply to the first patch. There it is more complicated by the
> fact that we should disallow setting of both EXT4_INODE_DAX_FL and
> EXT4_JOURNAL_DATA_FL at the same time so the checks will be somewhat more
> complicated.

I'm confused by jflag.  Why is EXT4_JOURNAL_DATA_FL stored in jflag?

Ira

> 
> 								Honza
> 
> > +
> >  	/* Is it quota file? Do not allow user to mess with it */
> >  	if (ext4_is_quota_file(inode))
> >  		goto flags_out;
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag
  2020-05-25  4:39     ` Ira Weiny
@ 2020-05-25  7:28       ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2020-05-25  7:28 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Jan Kara, linux-ext4, Andreas Dilger, Theodore Y. Ts'o,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On Sun 24-05-20 21:39:10, Ira Weiny wrote:
> On Fri, May 22, 2020 at 01:48:48PM +0200, Jan Kara wrote:
> > And then we should check conflicts with the journal flag as well, as I
> > mentioned in reply to the first patch. There it is more complicated by the
> > fact that we should disallow setting of both EXT4_INODE_DAX_FL and
> > EXT4_JOURNAL_DATA_FL at the same time so the checks will be somewhat more
> > complicated.
> 
> I'm confused by jflag.  Why is EXT4_JOURNAL_DATA_FL stored in jflag?

It isn't just EXT4_JOURNAL_DATA_FL. It is:

	jflag = flags & EXT4_JOURNAL_DATA_FL;

so it is EXT4_JOURNAL_DATA_FL if it should be set by the current ioctl and 0
otherwise. But I agree that since we mostly do

	(jflag ^ oldflags) & EXT4_JOURNAL_DATA_FL

jflags is mostly useless as we could do just

	(flags ^ oldflags) & EXT4_JOURNAL_DATA_FL

I guess it's mostly a relict from the past...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state
  2020-05-21 19:13 ` [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state ira.weiny
@ 2020-05-27  5:54   ` Xiao Yang
  2020-05-27 23:50     ` Ira Weiny
  0 siblings, 1 reply; 20+ messages in thread
From: Xiao Yang @ 2020-05-27  5:54 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On 2020/5/22 3:13, ira.weiny@intel.com wrote:
> From: Ira Weiny<ira.weiny@intel.com>
> 
> We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
> operate the same which is equivalent to 'always'.  This new
> functionality is limited to ext4 only.
> 
> Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
> it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.
> 
> We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
> 
> Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
> specified that option for printing.
Hi Ira,

I have two questions when reviewing this patch:
1) After doing mount with the same dax=inode option, ext4/xfs shows
differnt output(i.e. xfs doesn't print 'dax=inode'):
---------------------------------------------------
# mount -o dax=inode /dev/pmem0 /mnt/xfstests/test/
# mount | grep pmem0
/dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel,dax=inode)

# mount -odax=inode /dev/pmem1 /mnt/xfstests/scratch/
# mount | grep pmem1
/dev/pmem1 on /mnt/xfstests/scratch type xfs
(rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
----------------------------------------------------
Is this expected output? why don't unify the output?

2) Do mount without dax and mount with -odax=inode have the same behavior?
---------------------------------------------------
# mount /dev/pmem0 /mnt/xfstests/test/
# mount | grep pmem0
/dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel)
# umount /mnt/xfstests/test
# mount -odax=inode /dev/pmem0 /mnt/xfstests/test/
# mount | grep pmem0
/dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel,dax=inode
---------------------------------------------------

BTW: I focus on the support of per-file/directory DAX operations recently.

> 
> Reviewed-by: Jan Kara<jack@suse.cz>
> Signed-off-by: Ira Weiny<ira.weiny@intel.com>
> 
> ---
> Changes from V1:
> 	Fix up mounting options to only show an option if specified
> 	Fix remount to prevent dax changes
> 	Isolate behavior to ext4 only
> 
> Changes from RFC:
> 	Combine remount check for DAX_NEVER with DAX_ALWAYS
> 	Update ext4_should_enable_dax()
> ---
>   fs/ext4/ext4.h  |  2 ++
>   fs/ext4/inode.c |  2 ++
>   fs/ext4/super.c | 67 +++++++++++++++++++++++++++++++++++++++++--------
>   3 files changed, 61 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index f5291693ce6e..65ffb831b2b9 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1168,6 +1168,8 @@ struct ext4_inode_info {
>   						      blocks */
>   #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
>   						      file systems */
> +#define EXT4_MOUNT2_DAX_NEVER		0x00000008 /* Do not allow Direct Access */
> +#define EXT4_MOUNT2_DAX_INODE		0x00000010 /* For printing options only */
> 
>   #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
>   						specified journal checksum */
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 01636cf5f322..68fac9289109 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4402,6 +4402,8 @@ static bool ext4_should_enable_dax(struct inode *inode)
>   {
>   	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> 
> +	if (test_opt2(inode->i_sb, DAX_NEVER))
> +		return false;
>   	if (!S_ISREG(inode->i_mode))
>   		return false;
>   	if (ext4_should_journal_data(inode))
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 80eb814c47eb..5e056aa20ce9 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1512,7 +1512,8 @@ enum {
>   	Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
>   	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
>   	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> -	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
> +	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
> +	Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
>   	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
>   	Opt_nowarn_on_error, Opt_mblk_io_submit,
>   	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
> @@ -1579,6 +1580,9 @@ static const match_table_t tokens = {
>   	{Opt_nobarrier, "nobarrier"},
>   	{Opt_i_version, "i_version"},
>   	{Opt_dax, "dax"},
> +	{Opt_dax_always, "dax=always"},
> +	{Opt_dax_inode, "dax=inode"},
> +	{Opt_dax_never, "dax=never"},
>   	{Opt_stripe, "stripe=%u"},
>   	{Opt_delalloc, "delalloc"},
>   	{Opt_warn_on_error, "warn_on_error"},
> @@ -1726,6 +1730,7 @@ static int clear_qf_name(struct super_block *sb, int qtype)
>   #define MOPT_NO_EXT3	0x0200
>   #define MOPT_EXT4_ONLY	(MOPT_NO_EXT2 | MOPT_NO_EXT3)
>   #define MOPT_STRING	0x0400
> +#define MOPT_SKIP	0x0800
> 
>   static const struct mount_opts {
>   	int	token;
> @@ -1775,7 +1780,13 @@ static const struct mount_opts {
>   	{Opt_min_batch_time, 0, MOPT_GTE0},
>   	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
>   	{Opt_init_itable, 0, MOPT_GTE0},
> -	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
> +	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET | MOPT_SKIP},
> +	{Opt_dax_always, EXT4_MOUNT_DAX_ALWAYS,
> +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
> +	{Opt_dax_inode, EXT4_MOUNT2_DAX_INODE,
> +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
> +	{Opt_dax_never, EXT4_MOUNT2_DAX_NEVER,
> +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
>   	{Opt_stripe, 0, MOPT_GTE0},
>   	{Opt_resuid, 0, MOPT_GTE0},
>   	{Opt_resgid, 0, MOPT_GTE0},
> @@ -2084,13 +2095,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
>   		}
>   		sbi->s_jquota_fmt = m->mount_opt;
>   #endif
> -	} else if (token == Opt_dax) {
> +	} else if (token == Opt_dax || token == Opt_dax_always ||
> +		   token == Opt_dax_inode || token == Opt_dax_never) {
>   #ifdef CONFIG_FS_DAX
> -		ext4_msg(sb, KERN_WARNING,
> -		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> -		sbi->s_mount_opt |= m->mount_opt;
> +		switch (token) {
> +		case Opt_dax:
> +		case Opt_dax_always:
> +			ext4_msg(sb, KERN_WARNING,
> +				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> +			sbi->s_mount_opt |= EXT4_MOUNT_DAX_ALWAYS;
> +			sbi->s_mount_opt2&= ~EXT4_MOUNT2_DAX_NEVER;
> +			break;
> +		case Opt_dax_never:
> +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> +			sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
> +			break;
> +		case Opt_dax_inode:
> +			sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
> +			sbi->s_mount_opt2&= ~EXT4_MOUNT2_DAX_NEVER;
> +			/* Strictly for printing options */
> +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_INODE;
> +			break;
> +		}
>   #else
>   		ext4_msg(sb, KERN_INFO, "dax option not supported");
> +		sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> +		sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
>   		return -1;
>   #endif

For s_mount_opt/s_mount_opt2, could we make the code more readable by
using set_opt()/set_opt2()/clear_opt()/clear_opt2() macros?

Thanks,
Xiao Yang
>   	} else if (token == Opt_data_err_abort) {
> @@ -2254,7 +2284,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
>   	for (m = ext4_mount_opts; m->token != Opt_err; m++) {
>   		int want_set = m->flags&  MOPT_SET;
>   		if (((m->flags&  (MOPT_SET|MOPT_CLEAR)) == 0) ||
> -		    (m->flags&  MOPT_CLEAR_ERR))
> +		    (m->flags&  MOPT_CLEAR_ERR) || m->flags&  MOPT_SKIP)
>   			continue;
>   		if (!nodefs&&  !(m->mount_opt&  (sbi->s_mount_opt ^ def_mount_opt)))
>   			continue; /* skip if same as the default */
> @@ -2314,6 +2344,17 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
>   	if (DUMMY_ENCRYPTION_ENABLED(sbi))
>   		SEQ_OPTS_PUTS("test_dummy_encryption");
> 
> +	if (test_opt(sb, DAX_ALWAYS)) {
> +		if (IS_EXT2_SB(sb))
> +			SEQ_OPTS_PUTS("dax");
> +		else
> +			SEQ_OPTS_PUTS("dax=always");
> +	} else if (test_opt2(sb, DAX_NEVER)) {
> +		SEQ_OPTS_PUTS("dax=never");
> +	} else if (test_opt2(sb, DAX_INODE)) {
> +		SEQ_OPTS_PUTS("dax=inode");
> +	}
> +
>   	ext4_show_quota_options(seq, sb);
>   	return 0;
>   }
> @@ -5436,10 +5477,16 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
>   		goto restore_opts;
>   	}
> 
> -	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt)&  EXT4_MOUNT_DAX_ALWAYS) {
> +	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt)&  EXT4_MOUNT_DAX_ALWAYS ||
> +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2)&  EXT4_MOUNT2_DAX_NEVER ||
> +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2)&  EXT4_MOUNT2_DAX_INODE) {
>   		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
> -			"dax flag with busy inodes while remounting");
> -		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
> +			"dax mount option with busy inodes while remounting");
> +		sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
> +		sbi->s_mount_opt |= old_opts.s_mount_opt&  EXT4_MOUNT_DAX_ALWAYS;
> +		sbi->s_mount_opt2&= ~(EXT4_MOUNT2_DAX_NEVER | EXT4_MOUNT2_DAX_INODE);
> +		sbi->s_mount_opt2 |= old_opts.s_mount_opt2&
> +				     (EXT4_MOUNT2_DAX_NEVER | EXT4_MOUNT2_DAX_INODE);
>   	}
> 
>   	if (sbi->s_mount_flags&  EXT4_MF_FS_ABORTED)




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state
  2020-05-27  5:54   ` Xiao Yang
@ 2020-05-27 23:50     ` Ira Weiny
  2020-05-28  8:56       ` Xiao Yang
  0 siblings, 1 reply; 20+ messages in thread
From: Ira Weiny @ 2020-05-27 23:50 UTC (permalink / raw)
  To: Xiao Yang
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On Wed, May 27, 2020 at 01:54:54PM +0800, Xiao Yang wrote:
> On 2020/5/22 3:13, ira.weiny@intel.com wrote:
> > From: Ira Weiny<ira.weiny@intel.com>
> > 
> > We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
> > operate the same which is equivalent to 'always'.  This new
> > functionality is limited to ext4 only.
> > 
> > Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
> > it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.
> > 
> > We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
> > 
> > Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
> > specified that option for printing.
> Hi Ira,
> 
> I have two questions when reviewing this patch:
> 1) After doing mount with the same dax=inode option, ext4/xfs shows
> differnt output(i.e. xfs doesn't print 'dax=inode'):
> ---------------------------------------------------
> # mount -o dax=inode /dev/pmem0 /mnt/xfstests/test/
> # mount | grep pmem0
> /dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel,dax=inode)
> 
> # mount -odax=inode /dev/pmem1 /mnt/xfstests/scratch/
> # mount | grep pmem1
> /dev/pmem1 on /mnt/xfstests/scratch type xfs
> (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
> ----------------------------------------------------
> Is this expected output? why don't unify the output?

Correct. dax=inode is the default.  xfs treats that default the same whether
you specify it on the command line or not.

For ext4 Jan specifically asked that if the user specified dax=inode on the
command line that it be printed on the mount options.  If you don't specify
anything then dax=inode is in effect but ext4 will not print anything.

I had the behavior the same as XFS originally but Jan wanted it this way.  The
XFS behavior is IMO better and is what the new mount infrastructure gives by
default.

> 
> 2) Do mount without dax and mount with -odax=inode have the same behavior?

Yes.

> ---------------------------------------------------
> # mount /dev/pmem0 /mnt/xfstests/test/
> # mount | grep pmem0
> /dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel)
> # umount /mnt/xfstests/test
> # mount -odax=inode /dev/pmem0 /mnt/xfstests/test/
> # mount | grep pmem0
> /dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel,dax=inode
> ---------------------------------------------------

These are the same behavior.  But because you specified the option it was
echoed in the second output.

> 
> BTW: I focus on the support of per-file/directory DAX operations recently.
> 
> > 
> > Reviewed-by: Jan Kara<jack@suse.cz>
> > Signed-off-by: Ira Weiny<ira.weiny@intel.com>
> > 
> > ---
> > Changes from V1:
> > 	Fix up mounting options to only show an option if specified
> > 	Fix remount to prevent dax changes
> > 	Isolate behavior to ext4 only
> > 
> > Changes from RFC:
> > 	Combine remount check for DAX_NEVER with DAX_ALWAYS
> > 	Update ext4_should_enable_dax()
> > ---
> >   fs/ext4/ext4.h  |  2 ++
> >   fs/ext4/inode.c |  2 ++
> >   fs/ext4/super.c | 67 +++++++++++++++++++++++++++++++++++++++++--------
> >   3 files changed, 61 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > index f5291693ce6e..65ffb831b2b9 100644
> > --- a/fs/ext4/ext4.h
> > +++ b/fs/ext4/ext4.h
> > @@ -1168,6 +1168,8 @@ struct ext4_inode_info {
> >   						      blocks */
> >   #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
> >   						      file systems */
> > +#define EXT4_MOUNT2_DAX_NEVER		0x00000008 /* Do not allow Direct Access */
> > +#define EXT4_MOUNT2_DAX_INODE		0x00000010 /* For printing options only */
> > 
> >   #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
> >   						specified journal checksum */
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 01636cf5f322..68fac9289109 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -4402,6 +4402,8 @@ static bool ext4_should_enable_dax(struct inode *inode)
> >   {
> >   	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> > 
> > +	if (test_opt2(inode->i_sb, DAX_NEVER))
> > +		return false;
> >   	if (!S_ISREG(inode->i_mode))
> >   		return false;
> >   	if (ext4_should_journal_data(inode))
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 80eb814c47eb..5e056aa20ce9 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1512,7 +1512,8 @@ enum {
> >   	Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
> >   	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
> >   	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> > -	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
> > +	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
> > +	Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
> >   	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
> >   	Opt_nowarn_on_error, Opt_mblk_io_submit,
> >   	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
> > @@ -1579,6 +1580,9 @@ static const match_table_t tokens = {
> >   	{Opt_nobarrier, "nobarrier"},
> >   	{Opt_i_version, "i_version"},
> >   	{Opt_dax, "dax"},
> > +	{Opt_dax_always, "dax=always"},
> > +	{Opt_dax_inode, "dax=inode"},
> > +	{Opt_dax_never, "dax=never"},
> >   	{Opt_stripe, "stripe=%u"},
> >   	{Opt_delalloc, "delalloc"},
> >   	{Opt_warn_on_error, "warn_on_error"},
> > @@ -1726,6 +1730,7 @@ static int clear_qf_name(struct super_block *sb, int qtype)
> >   #define MOPT_NO_EXT3	0x0200
> >   #define MOPT_EXT4_ONLY	(MOPT_NO_EXT2 | MOPT_NO_EXT3)
> >   #define MOPT_STRING	0x0400
> > +#define MOPT_SKIP	0x0800
> > 
> >   static const struct mount_opts {
> >   	int	token;
> > @@ -1775,7 +1780,13 @@ static const struct mount_opts {
> >   	{Opt_min_batch_time, 0, MOPT_GTE0},
> >   	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
> >   	{Opt_init_itable, 0, MOPT_GTE0},
> > -	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
> > +	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET | MOPT_SKIP},
> > +	{Opt_dax_always, EXT4_MOUNT_DAX_ALWAYS,
> > +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
> > +	{Opt_dax_inode, EXT4_MOUNT2_DAX_INODE,
> > +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
> > +	{Opt_dax_never, EXT4_MOUNT2_DAX_NEVER,
> > +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
> >   	{Opt_stripe, 0, MOPT_GTE0},
> >   	{Opt_resuid, 0, MOPT_GTE0},
> >   	{Opt_resgid, 0, MOPT_GTE0},
> > @@ -2084,13 +2095,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
> >   		}
> >   		sbi->s_jquota_fmt = m->mount_opt;
> >   #endif
> > -	} else if (token == Opt_dax) {
> > +	} else if (token == Opt_dax || token == Opt_dax_always ||
> > +		   token == Opt_dax_inode || token == Opt_dax_never) {
> >   #ifdef CONFIG_FS_DAX
> > -		ext4_msg(sb, KERN_WARNING,
> > -		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> > -		sbi->s_mount_opt |= m->mount_opt;
> > +		switch (token) {
> > +		case Opt_dax:
> > +		case Opt_dax_always:
> > +			ext4_msg(sb, KERN_WARNING,
> > +				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> > +			sbi->s_mount_opt |= EXT4_MOUNT_DAX_ALWAYS;
> > +			sbi->s_mount_opt2&= ~EXT4_MOUNT2_DAX_NEVER;
> > +			break;
> > +		case Opt_dax_never:
> > +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> > +			sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
> > +			break;
> > +		case Opt_dax_inode:
> > +			sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
> > +			sbi->s_mount_opt2&= ~EXT4_MOUNT2_DAX_NEVER;
> > +			/* Strictly for printing options */
> > +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_INODE;
> > +			break;
> > +		}
> >   #else
> >   		ext4_msg(sb, KERN_INFO, "dax option not supported");
> > +		sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> > +		sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
> >   		return -1;
> >   #endif
> 
> For s_mount_opt/s_mount_opt2, could we make the code more readable by
> using set_opt()/set_opt2()/clear_opt()/clear_opt2() macros?

I could but it seems like there is a mixture of styles in that function/file...

For me it is really a toss up which is more readable.  I'm inclined to keep it
since it's been reviewed multiple times.  I was about to send V5 but I'll hold
off a bit.  If you feel strongly about it I can change it.

Thanks for the review!
Ira

> 
> Thanks,
> Xiao Yang
> >   	} else if (token == Opt_data_err_abort) {
> > @@ -2254,7 +2284,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
> >   	for (m = ext4_mount_opts; m->token != Opt_err; m++) {
> >   		int want_set = m->flags&  MOPT_SET;
> >   		if (((m->flags&  (MOPT_SET|MOPT_CLEAR)) == 0) ||
> > -		    (m->flags&  MOPT_CLEAR_ERR))
> > +		    (m->flags&  MOPT_CLEAR_ERR) || m->flags&  MOPT_SKIP)
> >   			continue;
> >   		if (!nodefs&&  !(m->mount_opt&  (sbi->s_mount_opt ^ def_mount_opt)))
> >   			continue; /* skip if same as the default */
> > @@ -2314,6 +2344,17 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
> >   	if (DUMMY_ENCRYPTION_ENABLED(sbi))
> >   		SEQ_OPTS_PUTS("test_dummy_encryption");
> > 
> > +	if (test_opt(sb, DAX_ALWAYS)) {
> > +		if (IS_EXT2_SB(sb))
> > +			SEQ_OPTS_PUTS("dax");
> > +		else
> > +			SEQ_OPTS_PUTS("dax=always");
> > +	} else if (test_opt2(sb, DAX_NEVER)) {
> > +		SEQ_OPTS_PUTS("dax=never");
> > +	} else if (test_opt2(sb, DAX_INODE)) {
> > +		SEQ_OPTS_PUTS("dax=inode");
> > +	}
> > +
> >   	ext4_show_quota_options(seq, sb);
> >   	return 0;
> >   }
> > @@ -5436,10 +5477,16 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
> >   		goto restore_opts;
> >   	}
> > 
> > -	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt)&  EXT4_MOUNT_DAX_ALWAYS) {
> > +	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt)&  EXT4_MOUNT_DAX_ALWAYS ||
> > +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2)&  EXT4_MOUNT2_DAX_NEVER ||
> > +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2)&  EXT4_MOUNT2_DAX_INODE) {
> >   		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
> > -			"dax flag with busy inodes while remounting");
> > -		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
> > +			"dax mount option with busy inodes while remounting");
> > +		sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
> > +		sbi->s_mount_opt |= old_opts.s_mount_opt&  EXT4_MOUNT_DAX_ALWAYS;
> > +		sbi->s_mount_opt2&= ~(EXT4_MOUNT2_DAX_NEVER | EXT4_MOUNT2_DAX_INODE);
> > +		sbi->s_mount_opt2 |= old_opts.s_mount_opt2&
> > +				     (EXT4_MOUNT2_DAX_NEVER | EXT4_MOUNT2_DAX_INODE);
> >   	}
> > 
> >   	if (sbi->s_mount_flags&  EXT4_MF_FS_ABORTED)
> 
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state
  2020-05-27 23:50     ` Ira Weiny
@ 2020-05-28  8:56       ` Xiao Yang
  2020-05-28  9:41         ` Jan Kara
  0 siblings, 1 reply; 20+ messages in thread
From: Xiao Yang @ 2020-05-28  8:56 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On 2020/5/28 7:50, Ira Weiny wrote:
> On Wed, May 27, 2020 at 01:54:54PM +0800, Xiao Yang wrote:
>> On 2020/5/22 3:13, ira.weiny@intel.com wrote:
>>> From: Ira Weiny<ira.weiny@intel.com>
>>>
>>> We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
>>> operate the same which is equivalent to 'always'.  This new
>>> functionality is limited to ext4 only.
>>>
>>> Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
>>> it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.
>>>
>>> We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
>>>
>>> Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
>>> specified that option for printing.
>> Hi Ira,
>>
>> I have two questions when reviewing this patch:
>> 1) After doing mount with the same dax=inode option, ext4/xfs shows
>> differnt output(i.e. xfs doesn't print 'dax=inode'):
>> ---------------------------------------------------
>> # mount -o dax=inode /dev/pmem0 /mnt/xfstests/test/
>> # mount | grep pmem0
>> /dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel,dax=inode)
>>
>> # mount -odax=inode /dev/pmem1 /mnt/xfstests/scratch/
>> # mount | grep pmem1
>> /dev/pmem1 on /mnt/xfstests/scratch type xfs
>> (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
>> ----------------------------------------------------
>> Is this expected output? why don't unify the output?
>
> Correct. dax=inode is the default.  xfs treats that default the same whether
> you specify it on the command line or not.
>
> For ext4 Jan specifically asked that if the user specified dax=inode on the
> command line that it be printed on the mount options.  If you don't specify
> anything then dax=inode is in effect but ext4 will not print anything.
>
> I had the behavior the same as XFS originally but Jan wanted it this way.  The
> XFS behavior is IMO better and is what the new mount infrastructure gives by
> default.
Hi Ira,

Could we unify the output?  It is strange for me to use differnt output 
on ext4 and xfs.

>
>>
>> 2) Do mount without dax and mount with -odax=inode have the same behavior?
>
> Yes.
>
>> ---------------------------------------------------
>> # mount /dev/pmem0 /mnt/xfstests/test/
>> # mount | grep pmem0
>> /dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel)
>> # umount /mnt/xfstests/test
>> # mount -odax=inode /dev/pmem0 /mnt/xfstests/test/
>> # mount | grep pmem0
>> /dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel,dax=inode
>> ---------------------------------------------------
>
> These are the same behavior.  But because you specified the option it was
> echoed in the second output.

Got it, thanks for your explanation.

>
>>
>> BTW: I focus on the support of per-file/directory DAX operations recently.
>>
>>>
>>> Reviewed-by: Jan Kara<jack@suse.cz>
>>> Signed-off-by: Ira Weiny<ira.weiny@intel.com>
>>>
>>> ---
>>> Changes from V1:
>>> 	Fix up mounting options to only show an option if specified
>>> 	Fix remount to prevent dax changes
>>> 	Isolate behavior to ext4 only
>>>
>>> Changes from RFC:
>>> 	Combine remount check for DAX_NEVER with DAX_ALWAYS
>>> 	Update ext4_should_enable_dax()
>>> ---
>>>    fs/ext4/ext4.h  |  2 ++
>>>    fs/ext4/inode.c |  2 ++
>>>    fs/ext4/super.c | 67 +++++++++++++++++++++++++++++++++++++++++--------
>>>    3 files changed, 61 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>>> index f5291693ce6e..65ffb831b2b9 100644
>>> --- a/fs/ext4/ext4.h
>>> +++ b/fs/ext4/ext4.h
>>> @@ -1168,6 +1168,8 @@ struct ext4_inode_info {
>>>    						      blocks */
>>>    #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
>>>    						      file systems */
>>> +#define EXT4_MOUNT2_DAX_NEVER		0x00000008 /* Do not allow Direct Access */
>>> +#define EXT4_MOUNT2_DAX_INODE		0x00000010 /* For printing options only */
>>>
>>>    #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
>>>    						specified journal checksum */
>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>> index 01636cf5f322..68fac9289109 100644
>>> --- a/fs/ext4/inode.c
>>> +++ b/fs/ext4/inode.c
>>> @@ -4402,6 +4402,8 @@ static bool ext4_should_enable_dax(struct inode *inode)
>>>    {
>>>    	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>>>
>>> +	if (test_opt2(inode->i_sb, DAX_NEVER))
>>> +		return false;
>>>    	if (!S_ISREG(inode->i_mode))
>>>    		return false;
>>>    	if (ext4_should_journal_data(inode))
>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>>> index 80eb814c47eb..5e056aa20ce9 100644
>>> --- a/fs/ext4/super.c
>>> +++ b/fs/ext4/super.c
>>> @@ -1512,7 +1512,8 @@ enum {
>>>    	Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
>>>    	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
>>>    	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
>>> -	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
>>> +	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
>>> +	Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
>>>    	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
>>>    	Opt_nowarn_on_error, Opt_mblk_io_submit,
>>>    	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
>>> @@ -1579,6 +1580,9 @@ static const match_table_t tokens = {
>>>    	{Opt_nobarrier, "nobarrier"},
>>>    	{Opt_i_version, "i_version"},
>>>    	{Opt_dax, "dax"},
>>> +	{Opt_dax_always, "dax=always"},
>>> +	{Opt_dax_inode, "dax=inode"},
>>> +	{Opt_dax_never, "dax=never"},
>>>    	{Opt_stripe, "stripe=%u"},
>>>    	{Opt_delalloc, "delalloc"},
>>>    	{Opt_warn_on_error, "warn_on_error"},
>>> @@ -1726,6 +1730,7 @@ static int clear_qf_name(struct super_block *sb, int qtype)
>>>    #define MOPT_NO_EXT3	0x0200
>>>    #define MOPT_EXT4_ONLY	(MOPT_NO_EXT2 | MOPT_NO_EXT3)
>>>    #define MOPT_STRING	0x0400
>>> +#define MOPT_SKIP	0x0800
>>>
>>>    static const struct mount_opts {
>>>    	int	token;
>>> @@ -1775,7 +1780,13 @@ static const struct mount_opts {
>>>    	{Opt_min_batch_time, 0, MOPT_GTE0},
>>>    	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
>>>    	{Opt_init_itable, 0, MOPT_GTE0},
>>> -	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
>>> +	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET | MOPT_SKIP},
>>> +	{Opt_dax_always, EXT4_MOUNT_DAX_ALWAYS,
>>> +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
>>> +	{Opt_dax_inode, EXT4_MOUNT2_DAX_INODE,
>>> +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
>>> +	{Opt_dax_never, EXT4_MOUNT2_DAX_NEVER,
>>> +		MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
>>>    	{Opt_stripe, 0, MOPT_GTE0},
>>>    	{Opt_resuid, 0, MOPT_GTE0},
>>>    	{Opt_resgid, 0, MOPT_GTE0},
>>> @@ -2084,13 +2095,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
>>>    		}
>>>    		sbi->s_jquota_fmt = m->mount_opt;
>>>    #endif
>>> -	} else if (token == Opt_dax) {
>>> +	} else if (token == Opt_dax || token == Opt_dax_always ||
>>> +		   token == Opt_dax_inode || token == Opt_dax_never) {
>>>    #ifdef CONFIG_FS_DAX
>>> -		ext4_msg(sb, KERN_WARNING,
>>> -		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
>>> -		sbi->s_mount_opt |= m->mount_opt;
>>> +		switch (token) {
>>> +		case Opt_dax:
>>> +		case Opt_dax_always:
>>> +			ext4_msg(sb, KERN_WARNING,
>>> +				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
>>> +			sbi->s_mount_opt |= EXT4_MOUNT_DAX_ALWAYS;
>>> +			sbi->s_mount_opt2&= ~EXT4_MOUNT2_DAX_NEVER;
>>> +			break;
>>> +		case Opt_dax_never:
>>> +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
>>> +			sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
>>> +			break;
>>> +		case Opt_dax_inode:
>>> +			sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
>>> +			sbi->s_mount_opt2&= ~EXT4_MOUNT2_DAX_NEVER;
>>> +			/* Strictly for printing options */
>>> +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_INODE;
>>> +			break;
>>> +		}
>>>    #else
>>>    		ext4_msg(sb, KERN_INFO, "dax option not supported");
>>> +		sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
>>> +		sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
>>>    		return -1;
>>>    #endif
>>
>> For s_mount_opt/s_mount_opt2, could we make the code more readable by
>> using set_opt()/set_opt2()/clear_opt()/clear_opt2() macros?
>
> I could but it seems like there is a mixture of styles in that function/file...
>
> For me it is really a toss up which is more readable.  I'm inclined to keep it
> since it's been reviewed multiple times.  I was about to send V5 but I'll hold
> off a bit.  If you feel strongly about it I can change it.

It is fine for me to keep it.
I think we can send a separate patch to use unified style in future.

Best Regards,
Xiao Yang
>
> Thanks for the review!
> Ira
>
>>
>> Thanks,
>> Xiao Yang
>>>    	} else if (token == Opt_data_err_abort) {
>>> @@ -2254,7 +2284,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
>>>    	for (m = ext4_mount_opts; m->token != Opt_err; m++) {
>>>    		int want_set = m->flags&   MOPT_SET;
>>>    		if (((m->flags&   (MOPT_SET|MOPT_CLEAR)) == 0) ||
>>> -		    (m->flags&   MOPT_CLEAR_ERR))
>>> +		    (m->flags&   MOPT_CLEAR_ERR) || m->flags&   MOPT_SKIP)
>>>    			continue;
>>>    		if (!nodefs&&   !(m->mount_opt&   (sbi->s_mount_opt ^ def_mount_opt)))
>>>    			continue; /* skip if same as the default */
>>> @@ -2314,6 +2344,17 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
>>>    	if (DUMMY_ENCRYPTION_ENABLED(sbi))
>>>    		SEQ_OPTS_PUTS("test_dummy_encryption");
>>>
>>> +	if (test_opt(sb, DAX_ALWAYS)) {
>>> +		if (IS_EXT2_SB(sb))
>>> +			SEQ_OPTS_PUTS("dax");
>>> +		else
>>> +			SEQ_OPTS_PUTS("dax=always");
>>> +	} else if (test_opt2(sb, DAX_NEVER)) {
>>> +		SEQ_OPTS_PUTS("dax=never");
>>> +	} else if (test_opt2(sb, DAX_INODE)) {
>>> +		SEQ_OPTS_PUTS("dax=inode");
>>> +	}
>>> +
>>>    	ext4_show_quota_options(seq, sb);
>>>    	return 0;
>>>    }
>>> @@ -5436,10 +5477,16 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
>>>    		goto restore_opts;
>>>    	}
>>>
>>> -	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt)&   EXT4_MOUNT_DAX_ALWAYS) {
>>> +	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt)&   EXT4_MOUNT_DAX_ALWAYS ||
>>> +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2)&   EXT4_MOUNT2_DAX_NEVER ||
>>> +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2)&   EXT4_MOUNT2_DAX_INODE) {
>>>    		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
>>> -			"dax flag with busy inodes while remounting");
>>> -		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
>>> +			"dax mount option with busy inodes while remounting");
>>> +		sbi->s_mount_opt&= ~EXT4_MOUNT_DAX_ALWAYS;
>>> +		sbi->s_mount_opt |= old_opts.s_mount_opt&   EXT4_MOUNT_DAX_ALWAYS;
>>> +		sbi->s_mount_opt2&= ~(EXT4_MOUNT2_DAX_NEVER | EXT4_MOUNT2_DAX_INODE);
>>> +		sbi->s_mount_opt2 |= old_opts.s_mount_opt2&
>>> +				     (EXT4_MOUNT2_DAX_NEVER | EXT4_MOUNT2_DAX_INODE);
>>>    	}
>>>
>>>    	if (sbi->s_mount_flags&   EXT4_MF_FS_ABORTED)
>>
>>
>>
>
>
> .
>




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state
  2020-05-28  8:56       ` Xiao Yang
@ 2020-05-28  9:41         ` Jan Kara
  2020-06-02  1:16           ` Xiao Yang
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Kara @ 2020-05-28  9:41 UTC (permalink / raw)
  To: Xiao Yang
  Cc: Ira Weiny, linux-ext4, Andreas Dilger, Theodore Y. Ts'o,
	Jan Kara, Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On Thu 28-05-20 16:56:51, Xiao Yang wrote:
> On 2020/5/28 7:50, Ira Weiny wrote:
> > On Wed, May 27, 2020 at 01:54:54PM +0800, Xiao Yang wrote:
> > > On 2020/5/22 3:13, ira.weiny@intel.com wrote:
> > > > From: Ira Weiny<ira.weiny@intel.com>
> > > > 
> > > > We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
> > > > operate the same which is equivalent to 'always'.  This new
> > > > functionality is limited to ext4 only.
> > > > 
> > > > Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
> > > > it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.
> > > > 
> > > > We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
> > > > 
> > > > Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
> > > > specified that option for printing.
> > > Hi Ira,
> > > 
> > > I have two questions when reviewing this patch:
> > > 1) After doing mount with the same dax=inode option, ext4/xfs shows
> > > differnt output(i.e. xfs doesn't print 'dax=inode'):
> > > ---------------------------------------------------
> > > # mount -o dax=inode /dev/pmem0 /mnt/xfstests/test/
> > > # mount | grep pmem0
> > > /dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel,dax=inode)
> > > 
> > > # mount -odax=inode /dev/pmem1 /mnt/xfstests/scratch/
> > > # mount | grep pmem1
> > > /dev/pmem1 on /mnt/xfstests/scratch type xfs
> > > (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
> > > ----------------------------------------------------
> > > Is this expected output? why don't unify the output?
> > 
> > Correct. dax=inode is the default.  xfs treats that default the same whether
> > you specify it on the command line or not.
> > 
> > For ext4 Jan specifically asked that if the user specified dax=inode on the
> > command line that it be printed on the mount options.  If you don't specify
> > anything then dax=inode is in effect but ext4 will not print anything.
> > 
> > I had the behavior the same as XFS originally but Jan wanted it this way.  The
> > XFS behavior is IMO better and is what the new mount infrastructure gives by
> > default.
> 
> Could we unify the output?  It is strange for me to use differnt output on
> ext4 and xfs.

If we'd unify the output with XFS, it would be inconsistent with all the
other ext4 mount options. So I disagree with that. I agree it is not ideal
to have different behavior between xfs and ext4 but such is the historical
behavior. If we want to change that, we need to change the handling for all
the ext4 mount options. I'm open for that discussion but it is a problem
unrelated to this patch set.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state
  2020-05-28  9:41         ` Jan Kara
@ 2020-06-02  1:16           ` Xiao Yang
  0 siblings, 0 replies; 20+ messages in thread
From: Xiao Yang @ 2020-06-02  1:16 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ira Weiny, linux-ext4, Andreas Dilger, Theodore Y. Ts'o,
	Eric Biggers, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

On 2020/5/28 17:41, Jan Kara wrote:
> On Thu 28-05-20 16:56:51, Xiao Yang wrote:
>> On 2020/5/28 7:50, Ira Weiny wrote:
>>> On Wed, May 27, 2020 at 01:54:54PM +0800, Xiao Yang wrote:
>>>> On 2020/5/22 3:13, ira.weiny@intel.com wrote:
>>>>> From: Ira Weiny<ira.weiny@intel.com>
>>>>>
>>>>> We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
>>>>> operate the same which is equivalent to 'always'.  This new
>>>>> functionality is limited to ext4 only.
>>>>>
>>>>> Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
>>>>> it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.
>>>>>
>>>>> We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
>>>>>
>>>>> Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
>>>>> specified that option for printing.
>>>> Hi Ira,
>>>>
>>>> I have two questions when reviewing this patch:
>>>> 1) After doing mount with the same dax=inode option, ext4/xfs shows
>>>> differnt output(i.e. xfs doesn't print 'dax=inode'):
>>>> ---------------------------------------------------
>>>> # mount -o dax=inode /dev/pmem0 /mnt/xfstests/test/
>>>> # mount | grep pmem0
>>>> /dev/pmem0 on /mnt/xfstests/test type ext4 (rw,relatime,seclabel,dax=inode)
>>>>
>>>> # mount -odax=inode /dev/pmem1 /mnt/xfstests/scratch/
>>>> # mount | grep pmem1
>>>> /dev/pmem1 on /mnt/xfstests/scratch type xfs
>>>> (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
>>>> ----------------------------------------------------
>>>> Is this expected output? why don't unify the output?
>>>
>>> Correct. dax=inode is the default.  xfs treats that default the same whether
>>> you specify it on the command line or not.
>>>
>>> For ext4 Jan specifically asked that if the user specified dax=inode on the
>>> command line that it be printed on the mount options.  If you don't specify
>>> anything then dax=inode is in effect but ext4 will not print anything.
>>>
>>> I had the behavior the same as XFS originally but Jan wanted it this way.  The
>>> XFS behavior is IMO better and is what the new mount infrastructure gives by
>>> default.
>>
>> Could we unify the output?  It is strange for me to use differnt output on
>> ext4 and xfs.
>
> If we'd unify the output with XFS, it would be inconsistent with all the
> other ext4 mount options. So I disagree with that. I agree it is not ideal
> to have different behavior between xfs and ext4 but such is the historical
> behavior. If we want to change that, we need to change the handling for all
> the ext4 mount options. I'm open for that discussion but it is a problem
> unrelated to this patch set.
Hi Jan,

Thanks for your quick feedback.
Of course, this doubt should not block the patch set.

Best Regards,
Xiao Yang
>
> 								Honza




^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-06-02  1:16 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-21 19:13 [PATCH V4 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
2020-05-21 19:13 ` [PATCH V4 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
2020-05-22  7:58   ` Jan Kara
2020-05-21 19:13 ` [PATCH V4 2/8] fs/ext4: Disallow verity if inode is DAX ira.weiny
2020-05-21 19:13 ` [PATCH V4 3/8] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS ira.weiny
2020-05-21 19:13 ` [PATCH V4 4/8] fs/ext4: Update ext4_should_use_dax() ira.weiny
2020-05-22 10:41   ` Jan Kara
2020-05-21 19:13 ` [PATCH V4 5/8] fs/ext4: Only change S_DAX on inode load ira.weiny
2020-05-21 19:13 ` [PATCH V4 6/8] fs/ext4: Make DAX mount option a tri-state ira.weiny
2020-05-27  5:54   ` Xiao Yang
2020-05-27 23:50     ` Ira Weiny
2020-05-28  8:56       ` Xiao Yang
2020-05-28  9:41         ` Jan Kara
2020-06-02  1:16           ` Xiao Yang
2020-05-21 19:13 ` [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag ira.weiny
2020-05-22 11:48   ` Jan Kara
2020-05-25  4:39     ` Ira Weiny
2020-05-25  7:28       ` Jan Kara
2020-05-23  5:13   ` Andreas Dilger
2020-05-21 19:13 ` [PATCH V4 8/8] Documentation/dax: Update DAX enablement for ext4 ira.weiny

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).