Linux-Fsdevel Archive on lore.kernel.org
* [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations
@ 2020-05-13  5:43 ira.weiny
  2020-05-13  5:43 ` [PATCH 1/9] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
                   ` (8 more replies)
  0 siblings, 9 replies; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, linux-xfs, linux-fsdevel, Al Viro, Jeff Moyer,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Enable the same per-file DAX support in ext4 as was done for xfs.  This series
builds on and depends on the V11 series for xfs.[1]

This passes the same xfstests test as XFS.

The only issue is that this modifies the old mount option parsing code rather
than waiting for the new parsing code to be finalized.

This series starts with 3 fixes which include making Verity and Encrypt truly
mutually exclusive with DAX.  I think these first 3 patches should be picked up
for 5.8 regardless of what is decided regarding the mount parsing.

[1] https://lore.kernel.org/lkml/20200428002142.404144-1-ira.weiny@intel.com/

To: linux-kernel@vger.kernel.org
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org
Cc: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org

Ira Weiny (9):
  fs/ext4: Narrow scope of DAX check in setflags
  fs/ext4: Disallow verity if inode is DAX
  fs/ext4: Disallow encryption if inode is DAX
  fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
  fs/ext4: Update ext4_should_use_dax()
  fs/ext4: Only change S_DAX on inode load
  fs/ext4: Make DAX mount option a tri-state
  fs/ext4: Introduce DAX inode flag
  Documentation/dax: Update DAX enablement for ext4

 Documentation/filesystems/dax.txt         |  6 +-
 Documentation/filesystems/ext4/verity.rst |  7 +++
 Documentation/filesystems/fscrypt.rst     |  4 +-
 fs/ext4/ext4.h                            | 20 ++++---
 fs/ext4/ialloc.c                          |  2 +-
 fs/ext4/inode.c                           | 27 +++++++--
 fs/ext4/ioctl.c                           | 32 +++++++++--
 fs/ext4/super.c                           | 67 +++++++++++++++--------
 fs/ext4/verity.c                          |  5 +-
 9 files changed, 125 insertions(+), 45 deletions(-)

-- 
2.25.1



* [PATCH 1/9] fs/ext4: Narrow scope of DAX check in setflags
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  2020-05-13  5:43 ` [PATCH 2/9] fs/ext4: Disallow verity if inode is DAX ira.weiny
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

When preventing DAX and journaling from being combined on an inode, use
the effective DAX check rather than the mount option.

This will be required to support per-inode DAX flags.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/ioctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index bfc1281fc4cb..5813e5e73eab 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -393,9 +393,9 @@ static int ext4_ioctl_setflags(struct inode *inode,
 	if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
 		/*
 		 * Changes to the journaling mode can cause unsafe changes to
-		 * S_DAX if we are using the DAX mount option.
+		 * S_DAX if the inode is DAX
 		 */
-		if (test_opt(inode->i_sb, DAX)) {
+		if (IS_DAX(inode)) {
 			err = -EBUSY;
 			goto flags_out;
 		}
-- 
2.25.1



* [PATCH 2/9] fs/ext4: Disallow verity if inode is DAX
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
  2020-05-13  5:43 ` [PATCH 1/9] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  2020-05-16  1:49   ` Eric Biggers
  2020-05-13  5:43 ` [PATCH 3/9] fs/ext4: Disallow encryption " ira.weiny
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Verity and DAX are incompatible.  Changing the DAX mode due to a verity
flag change is wrong without a corresponding address_space_operations
update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

(Setting DAX is already disabled if Verity is set first.)

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes:
	remove WARN_ON_ONCE
	Add documentation for DAX/Verity exclusivity
---
 Documentation/filesystems/ext4/verity.rst | 7 +++++++
 fs/ext4/verity.c                          | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/Documentation/filesystems/ext4/verity.rst b/Documentation/filesystems/ext4/verity.rst
index 3e4c0ee0e068..51ab1aa17e59 100644
--- a/Documentation/filesystems/ext4/verity.rst
+++ b/Documentation/filesystems/ext4/verity.rst
@@ -39,3 +39,10 @@ is encrypted as well as the data itself.
 
 Verity files cannot have blocks allocated past the end of the verity
 metadata.
+
+Verity and DAX
+--------------
+
+Verity and DAX are not compatible and attempts to set both of these flags on a
+file will fail.
+
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index dc5ec724d889..f05a09fb2ae4 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
 	handle_t *handle;
 	int err;
 
+	if (IS_DAX(inode))
+		return -EINVAL;
+
 	if (ext4_verity_in_progress(inode))
 		return -EBUSY;
 
-- 
2.25.1



* [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
  2020-05-13  5:43 ` [PATCH 1/9] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
  2020-05-13  5:43 ` [PATCH 2/9] fs/ext4: Disallow verity if inode is DAX ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  2020-05-16  2:02   ` Eric Biggers
  2020-05-13  5:43 ` [PATCH 4/9] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS ira.weiny
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Encryption and DAX are incompatible.  Changing the DAX mode due to a
change in Encryption mode is wrong without a corresponding
address_space_operations update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

Furthermore, clarify the documentation of the exclusivity and how that
will work.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes:
	remove WARN_ON_ONCE
	Add documentation to the encrypt doc WRT DAX
---
 Documentation/filesystems/fscrypt.rst |  4 +++-
 fs/ext4/super.c                       | 10 +---------
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index aa072112cfff..1475b8d52fef 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -1038,7 +1038,9 @@ astute users may notice some differences in behavior:
 - The ext4 filesystem does not support data journaling with encrypted
   regular files.  It will fall back to ordered data mode instead.
 
-- DAX (Direct Access) is not supported on encrypted files.
+- DAX (Direct Access) is not supported on encrypted files.  Attempts to enable
+  DAX on an encrypted file will fail.  Mount options will _not_ enable DAX on
+  encrypted files.
 
 - The st_size of an encrypted symlink will not necessarily give the
   length of the symlink target as required by POSIX.  It will actually
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index bf5fcb477f66..9873ab27e3fa 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1320,7 +1320,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 	if (inode->i_ino == EXT4_ROOT_INO)
 		return -EPERM;
 
-	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
+	if (IS_DAX(inode))
 		return -EINVAL;
 
 	res = ext4_convert_inline_data(inode);
@@ -1344,10 +1344,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
 			ext4_clear_inode_state(inode,
 					EXT4_STATE_MAY_INLINE_DATA);
-			/*
-			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
-			 * S_DAX may be disabled
-			 */
 			ext4_set_inode_flags(inode);
 		}
 		return res;
@@ -1371,10 +1367,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 				    ctx, len, 0);
 	if (!res) {
 		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
-		/*
-		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
-		 * S_DAX may be disabled
-		 */
 		ext4_set_inode_flags(inode);
 		res = ext4_mark_inode_dirty(handle, inode);
 		if (res)
-- 
2.25.1



* [PATCH 4/9] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (2 preceding siblings ...)
  2020-05-13  5:43 ` [PATCH 3/9] fs/ext4: Disallow encryption " ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  2020-05-13 11:25   ` Jan Kara
  2020-05-13  5:43 ` [PATCH 5/9] fs/ext4: Update ext4_should_use_dax() ira.weiny
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Rename EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS in preparation for the new
tri-state mount option, which introduces EXT4_MOUNT2_DAX_NEVER.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes:
	New patch
---
 fs/ext4/ext4.h  |  4 ++--
 fs/ext4/inode.c |  2 +-
 fs/ext4/super.c | 12 ++++++------
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 91eb4381cae5..1a3daf2d18ef 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1123,9 +1123,9 @@ struct ext4_inode_info {
 #define EXT4_MOUNT_MINIX_DF		0x00080	/* Mimics the Minix statfs */
 #define EXT4_MOUNT_NOLOAD		0x00100	/* Don't use existing journal*/
 #ifdef CONFIG_FS_DAX
-#define EXT4_MOUNT_DAX			0x00200	/* Direct Access */
+#define EXT4_MOUNT_DAX_ALWAYS		0x00200	/* Direct Access */
 #else
-#define EXT4_MOUNT_DAX			0
+#define EXT4_MOUNT_DAX_ALWAYS		0
 #endif
 #define EXT4_MOUNT_DATA_FLAGS		0x00C00	/* Mode for data writes: */
 #define EXT4_MOUNT_JOURNAL_DATA		0x00400	/* Write data to journal */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2a4aae6acdcb..a10ff12194db 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,7 +4400,7 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_use_dax(struct inode *inode)
 {
-	if (!test_opt(inode->i_sb, DAX))
+	if (!test_opt(inode->i_sb, DAX_ALWAYS))
 		return false;
 	if (!S_ISREG(inode->i_mode))
 		return false;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9873ab27e3fa..d0434b513919 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1767,7 +1767,7 @@ static const struct mount_opts {
 	{Opt_min_batch_time, 0, MOPT_GTE0},
 	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
 	{Opt_init_itable, 0, MOPT_GTE0},
-	{Opt_dax, EXT4_MOUNT_DAX, MOPT_SET},
+	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
 	{Opt_stripe, 0, MOPT_GTE0},
 	{Opt_resuid, 0, MOPT_GTE0},
 	{Opt_resgid, 0, MOPT_GTE0},
@@ -3974,7 +3974,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 				 "both data=journal and dioread_nolock");
 			goto failed_mount;
 		}
-		if (test_opt(sb, DAX)) {
+		if (test_opt(sb, DAX_ALWAYS)) {
 			ext4_msg(sb, KERN_ERR, "can't mount with "
 				 "both data=journal and dax");
 			goto failed_mount;
@@ -4084,7 +4084,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		goto failed_mount;
 	}
 
-	if (sbi->s_mount_opt & EXT4_MOUNT_DAX) {
+	if (sbi->s_mount_opt & EXT4_MOUNT_DAX_ALWAYS) {
 		if (ext4_has_feature_inline_data(sb)) {
 			ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
 					" that may contain inline data");
@@ -5404,7 +5404,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 			err = -EINVAL;
 			goto restore_opts;
 		}
-		if (test_opt(sb, DAX)) {
+		if (test_opt(sb, DAX_ALWAYS)) {
 			ext4_msg(sb, KERN_ERR, "can't mount with "
 				 "both data=journal and dax");
 			err = -EINVAL;
@@ -5425,10 +5425,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 		goto restore_opts;
 	}
 
-	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX) {
+	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
 		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
 			"dax flag with busy inodes while remounting");
-		sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
+		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
 	}
 
 	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
-- 
2.25.1



* [PATCH 5/9] fs/ext4: Update ext4_should_use_dax()
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (3 preceding siblings ...)
  2020-05-13  5:43 ` [PATCH 4/9] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  2020-05-13 11:30   ` Jan Kara
  2020-05-13  5:43 ` [PATCH 6/9] fs/ext4: Only change S_DAX on inode load ira.weiny
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

S_DAX should only be enabled when the underlying block device supports
dax.

Change ext4_should_use_dax() to check for device support prior to the
overriding mount option.

While we are at it, rename the function to ext4_should_enable_dax() as
this better reflects the ask as well as matches xfs.
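
For illustration only, the check order described above can be modeled in
userspace C.  This is a sketch, not the kernel function: struct dax_inputs and
model_should_enable_dax() are hypothetical names, and the per-inode checks that
fall outside the hunk context of this patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical snapshot of the state the kernel function consults.
 * These field names are illustrative only and do not exist in ext4.
 */
struct dax_inputs {
	bool is_regular_file;   /* models S_ISREG(inode->i_mode) */
	bool journals_data;     /* models ext4_should_journal_data(inode) */
	bool is_verity;         /* models the EXT4_INODE_VERITY flag test */
	bool bdev_supports_dax; /* models bdev_dax_supported(...) */
	bool mount_dax_always;  /* models test_opt(sb, DAX_ALWAYS) */
};

/* Per-inode exclusions first, then device support, then the mount option. */
static bool model_should_enable_dax(const struct dax_inputs *in)
{
	assert(in != NULL);

	if (!in->is_regular_file)
		return false;
	if (in->journals_data)
		return false;
	if (in->is_verity)
		return false;
	if (!in->bdev_supports_dax)
		return false;
	return in->mount_dax_always;
}
```

In this model the mount option is consulted last, so a device without DAX
support yields false even when the option is set.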

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC
	Change function name to 'should enable'
	Clean up bool conversion
	Reorder this for better bisect-ability
---
 fs/ext4/inode.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a10ff12194db..d3a4c2ed7a1c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4398,10 +4398,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 		!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
 }
 
-static bool ext4_should_use_dax(struct inode *inode)
+static bool ext4_should_enable_dax(struct inode *inode)
 {
-	if (!test_opt(inode->i_sb, DAX_ALWAYS))
-		return false;
 	if (!S_ISREG(inode->i_mode))
 		return false;
 	if (ext4_should_journal_data(inode))
@@ -4412,7 +4410,13 @@ static bool ext4_should_use_dax(struct inode *inode)
 		return false;
 	if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
 		return false;
-	return true;
+	if (!bdev_dax_supported(inode->i_sb->s_bdev,
+				inode->i_sb->s_blocksize))
+		return false;
+	if (test_opt(inode->i_sb, DAX_ALWAYS))
+		return true;
+
+	return false;
 }
 
 void ext4_set_inode_flags(struct inode *inode)
@@ -4430,7 +4434,7 @@ void ext4_set_inode_flags(struct inode *inode)
 		new_fl |= S_NOATIME;
 	if (flags & EXT4_DIRSYNC_FL)
 		new_fl |= S_DIRSYNC;
-	if (ext4_should_use_dax(inode))
+	if (ext4_should_enable_dax(inode))
 		new_fl |= S_DAX;
 	if (flags & EXT4_ENCRYPT_FL)
 		new_fl |= S_ENCRYPTED;
-- 
2.25.1



* [PATCH 6/9] fs/ext4: Only change S_DAX on inode load
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (4 preceding siblings ...)
  2020-05-13  5:43 ` [PATCH 5/9] fs/ext4: Update ext4_should_use_dax() ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  2020-05-13 11:33   ` Jan Kara
  2020-05-13  5:43 ` [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state ira.weiny
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

To prevent complications with in-memory inodes we only set S_DAX on
inode load.  FS_XFLAG_DAX can be changed at any time and S_DAX will
change after inode eviction and reload.

Add init bool to ext4_set_inode_flags() to indicate if the inode is
being newly initialized.

Assert that S_DAX is not set on an inode which is just being loaded.
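
As a rough userspace model of the rule above (a sketch under assumptions, not
the kernel code): MODEL_S_DAX and model_next_dax_bit() are hypothetical names,
and the bit value is arbitrary.  An already-set bit is carried over unchanged,
and the "should enable" decision only takes effect when init is true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for S_DAX; the real value lives in the kernel. */
#define MODEL_S_DAX 0x1u

/*
 * Compute the DAX bit of the new in-memory flags.
 * cur_flags:     the flags currently set on the in-memory inode
 * init:          true only when the inode is being newly initialized
 * should_enable: result of the "should enable DAX" policy check
 */
static unsigned int model_next_dax_bit(unsigned int cur_flags, bool init,
				       bool should_enable)
{
	unsigned int new_fl;

	/* Mirrors the intent of WARN_ON_ONCE(IS_DAX(inode) && init). */
	assert(!(init && (cur_flags & MODEL_S_DAX)));

	/* Preserve the bit if it is already set. */
	new_fl = cur_flags & MODEL_S_DAX;

	/* Only a fresh load may turn the bit on. */
	if (init && should_enable)
		new_fl |= MODEL_S_DAX;

	return new_fl;
}
```

With init false, the function returns whatever DAX bit was passed in, which is
the behavior the ioctl, encryption, and verity callers rely on in this patch.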

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Change J_ASSERT() to WARN_ON_ONCE()
	Fix bug which would clear S_DAX incorrectly
---
 fs/ext4/ext4.h   |  2 +-
 fs/ext4/ialloc.c |  2 +-
 fs/ext4/inode.c  | 13 ++++++++++---
 fs/ext4/ioctl.c  |  3 ++-
 fs/ext4/super.c  |  4 ++--
 fs/ext4/verity.c |  2 +-
 6 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1a3daf2d18ef..86a0994332ce 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2692,7 +2692,7 @@ extern int ext4_can_truncate(struct inode *inode);
 extern int ext4_truncate(struct inode *);
 extern int ext4_break_layouts(struct inode *);
 extern int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length);
-extern void ext4_set_inode_flags(struct inode *);
+extern void ext4_set_inode_flags(struct inode *, bool init);
 extern int ext4_alloc_da_blocks(struct inode *inode);
 extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 4b8c9a9bdf0c..7941c140723f 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1116,7 +1116,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
 	ei->i_block_group = group;
 	ei->i_last_alloc_group = ~0;
 
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, true);
 	if (IS_DIRSYNC(inode))
 		ext4_handle_sync(handle);
 	if (insert_inode_locked(inode) < 0) {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d3a4c2ed7a1c..23e42a223235 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4419,11 +4419,13 @@ static bool ext4_should_enable_dax(struct inode *inode)
 	return false;
 }
 
-void ext4_set_inode_flags(struct inode *inode)
+void ext4_set_inode_flags(struct inode *inode, bool init)
 {
 	unsigned int flags = EXT4_I(inode)->i_flags;
 	unsigned int new_fl = 0;
 
+	WARN_ON_ONCE(IS_DAX(inode) && init);
+
 	if (flags & EXT4_SYNC_FL)
 		new_fl |= S_SYNC;
 	if (flags & EXT4_APPEND_FL)
@@ -4434,8 +4436,13 @@ void ext4_set_inode_flags(struct inode *inode)
 		new_fl |= S_NOATIME;
 	if (flags & EXT4_DIRSYNC_FL)
 		new_fl |= S_DIRSYNC;
-	if (ext4_should_enable_dax(inode))
+
+	/* Because of the way inode_set_flags() works we must preserve S_DAX
+	 * here if already set. */
+	new_fl |= (inode->i_flags & S_DAX);
+	if (init && ext4_should_enable_dax(inode))
 		new_fl |= S_DAX;
+
 	if (flags & EXT4_ENCRYPT_FL)
 		new_fl |= S_ENCRYPTED;
 	if (flags & EXT4_CASEFOLD_FL)
@@ -4649,7 +4656,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 		 * not initialized on a new filesystem. */
 	}
 	ei->i_flags = le32_to_cpu(raw_inode->i_flags);
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, true);
 	inode->i_blocks = ext4_inode_blocks(raw_inode, ei);
 	ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo);
 	if (ext4_has_feature_64bit(sb))
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 5813e5e73eab..145083e8cd1e 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -381,7 +381,8 @@ static int ext4_ioctl_setflags(struct inode *inode,
 			ext4_clear_inode_flag(inode, i);
 	}
 
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, false);
+
 	inode->i_ctime = current_time(inode);
 
 	err = ext4_mark_iloc_dirty(handle, inode, &iloc);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d0434b513919..5ec900fdf73c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1344,7 +1344,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
 			ext4_clear_inode_state(inode,
 					EXT4_STATE_MAY_INLINE_DATA);
-			ext4_set_inode_flags(inode);
+			ext4_set_inode_flags(inode, false);
 		}
 		return res;
 	}
@@ -1367,7 +1367,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 				    ctx, len, 0);
 	if (!res) {
 		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
-		ext4_set_inode_flags(inode);
+		ext4_set_inode_flags(inode, false);
 		res = ext4_mark_inode_dirty(handle, inode);
 		if (res)
 			EXT4_ERROR_INODE(inode, "Failed to mark inode dirty");
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index f05a09fb2ae4..89a155ece323 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -244,7 +244,7 @@ static int ext4_end_enable_verity(struct file *filp, const void *desc,
 		if (err)
 			goto out_stop;
 		ext4_set_inode_flag(inode, EXT4_INODE_VERITY);
-		ext4_set_inode_flags(inode);
+		ext4_set_inode_flags(inode, false);
 		err = ext4_mark_iloc_dirty(handle, inode, &iloc);
 	}
 out_stop:
-- 
2.25.1



* [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (5 preceding siblings ...)
  2020-05-13  5:43 ` [PATCH 6/9] fs/ext4: Only change S_DAX on inode load ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  2020-05-13 14:35   ` Jan Kara
  2020-05-13  5:43 ` [PATCH 8/9] fs/ext4: Introduce DAX inode flag ira.weiny
  2020-05-13  5:43 ` [PATCH 9/9] Documentation/dax: Update DAX enablement for ext4 ira.weiny
  8 siblings, 1 reply; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
operate the same.

Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
it and EXT4_MOUNT_DAX_ALWAYS appropriately.

We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
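
The mapping from option string to the two flags can be sketched in userspace C
as follows.  This is only a model: struct dax_mount_state, model_parse_dax()
and model_show_dax() are hypothetical names, the two booleans stand in for
EXT4_MOUNT_DAX_ALWAYS and EXT4_MOUNT2_DAX_NEVER, and a NULL argument is assumed
to represent a bare '-o dax'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the two mount flags. */
struct dax_mount_state {
	bool always; /* models EXT4_MOUNT_DAX_ALWAYS */
	bool never;  /* models EXT4_MOUNT2_DAX_NEVER */
};

/*
 * Apply a dax option value to the state.  Returns 0 on success and -1 for
 * an unrecognized value, in which case the state is left untouched.
 */
static int model_parse_dax(const char *arg, struct dax_mount_state *st)
{
	assert(st != NULL);

	if (arg == NULL || strcmp(arg, "always") == 0) {
		st->always = true;
		st->never = false;
	} else if (strcmp(arg, "never") == 0) {
		st->never = true;
		st->always = false;
	} else if (strcmp(arg, "inode") == 0) {
		st->always = false;
		st->never = false;
	} else {
		return -1;
	}
	return 0;
}

/* Report the state the way the show-options hunk prints it. */
static const char *model_show_dax(const struct dax_mount_state *st)
{
	assert(st != NULL);

	if (st->never)
		return "dax=never";
	if (st->always)
		return "dax=always";
	return "dax=inode";
}
```

Because each branch sets one flag and clears the other, the two flags are never
both set in this model, and "inode" is simply the state with neither set.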

https://lore.kernel.org/lkml/20200405061945.GA94792@iweiny-DESK2.sc.intel.com/

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Combine remount check for DAX_NEVER with DAX_ALWAYS
	Update ext4_should_enable_dax()
---
 fs/ext4/ext4.h  |  1 +
 fs/ext4/inode.c |  2 ++
 fs/ext4/super.c | 43 +++++++++++++++++++++++++++++++++++++------
 3 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 86a0994332ce..01d1de838896 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1168,6 +1168,7 @@ struct ext4_inode_info {
 						      blocks */
 #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
 						      file systems */
+#define EXT4_MOUNT2_DAX_NEVER		0x00000008 /* Do not allow Direct Access */
 
 #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
 						specified journal checksum */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 23e42a223235..140b1930e2f4 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,6 +4400,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_enable_dax(struct inode *inode)
 {
+	if (test_opt2(inode->i_sb, DAX_NEVER))
+		return false;
 	if (!S_ISREG(inode->i_mode))
 		return false;
 	if (ext4_should_journal_data(inode))
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 5ec900fdf73c..e01a040a58a9 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1505,6 +1505,7 @@ enum {
 	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
 	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
 	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
+	Opt_dax_str,
 	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
 	Opt_nowarn_on_error, Opt_mblk_io_submit,
 	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
@@ -1570,6 +1571,7 @@ static const match_table_t tokens = {
 	{Opt_barrier, "barrier"},
 	{Opt_nobarrier, "nobarrier"},
 	{Opt_i_version, "i_version"},
+	{Opt_dax_str, "dax=%s"},
 	{Opt_dax, "dax"},
 	{Opt_stripe, "stripe=%u"},
 	{Opt_delalloc, "delalloc"},
@@ -1767,6 +1769,7 @@ static const struct mount_opts {
 	{Opt_min_batch_time, 0, MOPT_GTE0},
 	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
 	{Opt_init_itable, 0, MOPT_GTE0},
+	{Opt_dax_str, 0, MOPT_STRING},
 	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
 	{Opt_stripe, 0, MOPT_GTE0},
 	{Opt_resuid, 0, MOPT_GTE0},
@@ -2076,13 +2079,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
 		}
 		sbi->s_jquota_fmt = m->mount_opt;
 #endif
-	} else if (token == Opt_dax) {
+	} else if (token == Opt_dax || token == Opt_dax_str) {
 #ifdef CONFIG_FS_DAX
-		ext4_msg(sb, KERN_WARNING,
-		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
-		sbi->s_mount_opt |= m->mount_opt;
+		char *tmp = match_strdup(&args[0]);
+
+		if (!tmp || !strcmp(tmp, "always")) {
+			ext4_msg(sb, KERN_WARNING,
+				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
+			sbi->s_mount_opt |= EXT4_MOUNT_DAX_ALWAYS;
+			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
+		} else if (!strcmp(tmp, "never")) {
+			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
+			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
+		} else if (!strcmp(tmp, "inode")) {
+			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
+			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
+		} else {
+			ext4_msg(sb, KERN_WARNING, "DAX invalid option.");
+			kfree(tmp);
+			return -1;
+		}
+
+		kfree(tmp);
 #else
 		ext4_msg(sb, KERN_INFO, "dax option not supported");
+		sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
+		sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
 		return -1;
 #endif
 	} else if (token == Opt_data_err_abort) {
@@ -2306,6 +2328,13 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
 	if (DUMMY_ENCRYPTION_ENABLED(sbi))
 		SEQ_OPTS_PUTS("test_dummy_encryption");
 
+	if (test_opt2(sb, DAX_NEVER))
+		SEQ_OPTS_PUTS("dax=never");
+	else if (test_opt(sb, DAX_ALWAYS))
+		SEQ_OPTS_PUTS("dax=always");
+	else
+		SEQ_OPTS_PUTS("dax=inode");
+
 	ext4_show_quota_options(seq, sb);
 	return 0;
 }
@@ -5425,10 +5454,12 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 		goto restore_opts;
 	}
 
-	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
+	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS ||
+	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_DAX_NEVER) {
 		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
-			"dax flag with busy inodes while remounting");
+			"dax mount option with busy inodes while remounting");
 		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
+		sbi->s_mount_opt2 ^= EXT4_MOUNT2_DAX_NEVER;
 	}
 
 	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
-- 
2.25.1



* [PATCH 8/9] fs/ext4: Introduce DAX inode flag
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (6 preceding siblings ...)
  2020-05-13  5:43 ` [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  2020-05-13 14:47   ` Jan Kara
  2020-05-13  5:43 ` [PATCH 9/9] Documentation/dax: Update DAX enablement for ext4 ira.weiny
  8 siblings, 1 reply; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.

Set the flag to be user visible and changeable.  Set the flag to be
inherited.  Allow applications to change the flag at any time.

Finally, on regular files, flag the inode to not be cached to facilitate
changing S_DAX on the next creation of the inode.
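
As a quick arithmetic check of the mask updates in the ext4.h hunk of this
patch, the following standalone C11 sketch verifies that each new mask is the
old mask with only the 0x01000000 bit added.  The MODEL_* macro names and
model_differs_only_by_dax() are hypothetical; the constants are copied from
the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the fs/ext4/ext4.h hunk in this patch. */
#define MODEL_DAX_FL              UINT32_C(0x01000000)
#define MODEL_OLD_USER_VISIBLE    UINT32_C(0x705BDFFF)
#define MODEL_NEW_USER_VISIBLE    UINT32_C(0x715BDFFF)
#define MODEL_OLD_USER_MODIFIABLE UINT32_C(0x604BC0FF)
#define MODEL_NEW_USER_MODIFIABLE UINT32_C(0x614BC0FF)

/* Compile-time checks (C11): new mask == old mask plus the DAX bit. */
static_assert((MODEL_OLD_USER_VISIBLE | MODEL_DAX_FL) ==
	      MODEL_NEW_USER_VISIBLE, "visible mask gains only the DAX bit");
static_assert((MODEL_OLD_USER_MODIFIABLE | MODEL_DAX_FL) ==
	      MODEL_NEW_USER_MODIFIABLE, "modifiable mask gains only the DAX bit");

/* Runtime form of the same check: the masks differ in exactly the DAX bit. */
static bool model_differs_only_by_dax(uint32_t old_mask, uint32_t new_mask)
{
	return (old_mask ^ new_mask) == MODEL_DAX_FL;
}
```

The XOR form also confirms that the bit was clear in both old masks, since a
bit already set in the old value would not show up in the difference.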

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Change from RFC:
	use new d_mark_dontcache()
	Allow caching if ALWAYS/NEVER is set
	Rebased to latest Linus master
	Change flag to unused 0x01000000
	update ext4_should_enable_dax()
---
 fs/ext4/ext4.h  | 13 +++++++++----
 fs/ext4/inode.c |  4 +++-
 fs/ext4/ioctl.c | 25 ++++++++++++++++++++++++-
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 01d1de838896..715f8f2029b2 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -415,13 +415,16 @@ struct flex_groups {
 #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
 #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
 /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
+
+#define EXT4_DAX_FL			0x01000000 /* Inode is DAX */
+
 #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
 #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
 #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
 #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
 
-#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
-#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
+#define EXT4_FL_USER_VISIBLE		0x715BDFFF /* User visible flags */
+#define EXT4_FL_USER_MODIFIABLE		0x614BC0FF /* User modifiable flags */
 
 /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
 #define EXT4_FL_XFLAG_VISIBLE		(EXT4_SYNC_FL | \
@@ -429,14 +432,16 @@ struct flex_groups {
 					 EXT4_APPEND_FL | \
 					 EXT4_NODUMP_FL | \
 					 EXT4_NOATIME_FL | \
-					 EXT4_PROJINHERIT_FL)
+					 EXT4_PROJINHERIT_FL | \
+					 EXT4_DAX_FL)
 
 /* Flags that should be inherited by new inodes from their parent. */
 #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
 			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
 			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
 			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
-			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
+			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
+			   EXT4_DAX_FL)
 
 /* Flags that are appropriate for regular files (all but dir-specific ones). */
 #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 140b1930e2f4..105cf04f7940 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,6 +4400,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_enable_dax(struct inode *inode)
 {
+	unsigned int flags = EXT4_I(inode)->i_flags;
+
 	if (test_opt2(inode->i_sb, DAX_NEVER))
 		return false;
 	if (!S_ISREG(inode->i_mode))
@@ -4418,7 +4420,7 @@ static bool ext4_should_enable_dax(struct inode *inode)
 	if (test_opt(inode->i_sb, DAX_ALWAYS))
 		return true;
 
-	return false;
+	return flags & EXT4_DAX_FL;
 }
 
 void ext4_set_inode_flags(struct inode *inode, bool init)
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 145083e8cd1e..6996a5c3e101 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -528,12 +528,15 @@ static inline __u32 ext4_iflags_to_xflags(unsigned long iflags)
 		xflags |= FS_XFLAG_NOATIME;
 	if (iflags & EXT4_PROJINHERIT_FL)
 		xflags |= FS_XFLAG_PROJINHERIT;
+	if (iflags & EXT4_DAX_FL)
+		xflags |= FS_XFLAG_DAX;
 	return xflags;
 }
 
 #define EXT4_SUPPORTED_FS_XFLAGS (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | \
 				  FS_XFLAG_APPEND | FS_XFLAG_NODUMP | \
-				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT)
+				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT | \
+				  FS_XFLAG_DAX)
 
 /* Transfer xflags flags to internal */
 static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
@@ -552,6 +555,8 @@ static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
 		iflags |= EXT4_NOATIME_FL;
 	if (xflags & FS_XFLAG_PROJINHERIT)
 		iflags |= EXT4_PROJINHERIT_FL;
+	if (xflags & FS_XFLAG_DAX)
+		iflags |= EXT4_DAX_FL;
 
 	return iflags;
 }
@@ -802,6 +807,21 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
 	return error;
 }
 
+static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
+{
+	struct ext4_inode_info *ei = EXT4_I(inode);
+
+	if (S_ISDIR(inode->i_mode))
+		return;
+
+	if (test_opt2(inode->i_sb, DAX_NEVER) ||
+	    test_opt(inode->i_sb, DAX_ALWAYS))
+		return;
+
+	if (((ei->i_flags ^ flags) & EXT4_DAX_FL) == EXT4_DAX_FL)
+		d_mark_dontcache(inode);
+}
+
 long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 {
 	struct inode *inode = file_inode(filp);
@@ -1267,6 +1287,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 			return err;
 
 		inode_lock(inode);
+
+		ext4_dax_dontcache(inode, flags);
+
 		ext4_fill_fsxattr(inode, &old_fa);
 		err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa);
 		if (err)
-- 
2.25.1



* [PATCH 9/9] Documentation/dax: Update DAX enablement for ext4
  2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (7 preceding siblings ...)
  2020-05-13  5:43 ` [PATCH 8/9] fs/ext4: Introduce DAX inode flag ira.weiny
@ 2020-05-13  5:43 ` ira.weiny
  8 siblings, 0 replies; 29+ messages in thread
From: ira.weiny @ 2020-05-13  5:43 UTC (permalink / raw)
  To: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara
  Cc: Ira Weiny, Al Viro, Dan Williams, Dave Chinner,
	Christoph Hellwig, Jeff Moyer, Darrick J. Wong, linux-fsdevel,
	linux-kernel

From: Ira Weiny <ira.weiny@intel.com>

Update the document to reflect that ext4 and xfs now behave the same.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Update with ext2 text...
---
 Documentation/filesystems/dax.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/dax.txt b/Documentation/filesystems/dax.txt
index 735fb4b54117..265c4f808dbf 100644
--- a/Documentation/filesystems/dax.txt
+++ b/Documentation/filesystems/dax.txt
@@ -25,7 +25,7 @@ size when creating the filesystem.
 Currently 3 filesystems support DAX: ext2, ext4 and xfs.  Enabling DAX on them
 is different.
 
-Enabling DAX on ext4 and ext2
+Enabling DAX on ext2
 -----------------------------
 
 When mounting the filesystem, use the "-o dax" option on the command line or
@@ -33,8 +33,8 @@ add 'dax' to the options in /etc/fstab.  This works to enable DAX on all files
 within the filesystem.  It is equivalent to the '-o dax=always' behavior below.
 
 
-Enabling DAX on xfs
--------------------
+Enabling DAX on xfs and ext4
+----------------------------
 
 Summary
 -------
-- 
2.25.1



* Re: [PATCH 4/9] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
  2020-05-13  5:43 ` [PATCH 4/9] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS ira.weiny
@ 2020-05-13 11:25   ` Jan Kara
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Kara @ 2020-05-13 11:25 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Tue 12-05-20 22:43:19, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> In prep for the new tri-state mount option which then introduces
> EXT4_MOUNT_DAX_NEVER.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> 
> ---
> Changes:
> 	New patch
> ---
>  fs/ext4/ext4.h  |  4 ++--
>  fs/ext4/inode.c |  2 +-
>  fs/ext4/super.c | 12 ++++++------
>  3 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 91eb4381cae5..1a3daf2d18ef 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1123,9 +1123,9 @@ struct ext4_inode_info {
>  #define EXT4_MOUNT_MINIX_DF		0x00080	/* Mimics the Minix statfs */
>  #define EXT4_MOUNT_NOLOAD		0x00100	/* Don't use existing journal*/
>  #ifdef CONFIG_FS_DAX
> -#define EXT4_MOUNT_DAX			0x00200	/* Direct Access */
> +#define EXT4_MOUNT_DAX_ALWAYS		0x00200	/* Direct Access */
>  #else
> -#define EXT4_MOUNT_DAX			0
> +#define EXT4_MOUNT_DAX_ALWAYS		0
>  #endif
>  #define EXT4_MOUNT_DATA_FLAGS		0x00C00	/* Mode for data writes: */
>  #define EXT4_MOUNT_JOURNAL_DATA		0x00400	/* Write data to journal */
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 2a4aae6acdcb..a10ff12194db 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4400,7 +4400,7 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
>  
>  static bool ext4_should_use_dax(struct inode *inode)
>  {
> -	if (!test_opt(inode->i_sb, DAX))
> +	if (!test_opt(inode->i_sb, DAX_ALWAYS))
>  		return false;
>  	if (!S_ISREG(inode->i_mode))
>  		return false;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 9873ab27e3fa..d0434b513919 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1767,7 +1767,7 @@ static const struct mount_opts {
>  	{Opt_min_batch_time, 0, MOPT_GTE0},
>  	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
>  	{Opt_init_itable, 0, MOPT_GTE0},
> -	{Opt_dax, EXT4_MOUNT_DAX, MOPT_SET},
> +	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
>  	{Opt_stripe, 0, MOPT_GTE0},
>  	{Opt_resuid, 0, MOPT_GTE0},
>  	{Opt_resgid, 0, MOPT_GTE0},
> @@ -3974,7 +3974,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  				 "both data=journal and dioread_nolock");
>  			goto failed_mount;
>  		}
> -		if (test_opt(sb, DAX)) {
> +		if (test_opt(sb, DAX_ALWAYS)) {
>  			ext4_msg(sb, KERN_ERR, "can't mount with "
>  				 "both data=journal and dax");
>  			goto failed_mount;
> @@ -4084,7 +4084,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  		goto failed_mount;
>  	}
>  
> -	if (sbi->s_mount_opt & EXT4_MOUNT_DAX) {
> +	if (sbi->s_mount_opt & EXT4_MOUNT_DAX_ALWAYS) {
>  		if (ext4_has_feature_inline_data(sb)) {
>  			ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
>  					" that may contain inline data");
> @@ -5404,7 +5404,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
>  			err = -EINVAL;
>  			goto restore_opts;
>  		}
> -		if (test_opt(sb, DAX)) {
> +		if (test_opt(sb, DAX_ALWAYS)) {
>  			ext4_msg(sb, KERN_ERR, "can't mount with "
>  				 "both data=journal and dax");
>  			err = -EINVAL;
> @@ -5425,10 +5425,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
>  		goto restore_opts;
>  	}
>  
> -	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX) {
> +	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
>  		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
>  			"dax flag with busy inodes while remounting");
> -		sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
> +		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
>  	}
>  
>  	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 5/9] fs/ext4: Update ext4_should_use_dax()
  2020-05-13  5:43 ` [PATCH 5/9] fs/ext4: Update ext4_should_use_dax() ira.weiny
@ 2020-05-13 11:30   ` Jan Kara
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Kara @ 2020-05-13 11:30 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Tue 12-05-20 22:43:20, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> S_DAX should only be enabled when the underlying block device supports
> dax.
> 
> Change ext4_should_use_dax() to check for device support prior to the
> overriding mount option.
> 
> While we are at it change the function to ext4_should_enable_dax() as
> this better reflects the ask as well as matches xfs.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

The patch looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza


> 
> ---
> Changes from RFC
> 	Change function name to 'should enable'
> 	Clean up bool conversion
> 	Reorder this for better bisect-ability
> ---
>  fs/ext4/inode.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index a10ff12194db..d3a4c2ed7a1c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4398,10 +4398,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
>  		!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
>  }
>  
> -static bool ext4_should_use_dax(struct inode *inode)
> +static bool ext4_should_enable_dax(struct inode *inode)
>  {
> -	if (!test_opt(inode->i_sb, DAX_ALWAYS))
> -		return false;
>  	if (!S_ISREG(inode->i_mode))
>  		return false;
>  	if (ext4_should_journal_data(inode))
> @@ -4412,7 +4410,13 @@ static bool ext4_should_use_dax(struct inode *inode)
>  		return false;
>  	if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
>  		return false;
> -	return true;
> +	if (!bdev_dax_supported(inode->i_sb->s_bdev,
> +				inode->i_sb->s_blocksize))
> +		return false;
> +	if (test_opt(inode->i_sb, DAX_ALWAYS))
> +		return true;
> +
> +	return false;
>  }
>  
>  void ext4_set_inode_flags(struct inode *inode)
> @@ -4430,7 +4434,7 @@ void ext4_set_inode_flags(struct inode *inode)
>  		new_fl |= S_NOATIME;
>  	if (flags & EXT4_DIRSYNC_FL)
>  		new_fl |= S_DIRSYNC;
> -	if (ext4_should_use_dax(inode))
> +	if (ext4_should_enable_dax(inode))
>  		new_fl |= S_DAX;
>  	if (flags & EXT4_ENCRYPT_FL)
>  		new_fl |= S_ENCRYPTED;
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 6/9] fs/ext4: Only change S_DAX on inode load
  2020-05-13  5:43 ` [PATCH 6/9] fs/ext4: Only change S_DAX on inode load ira.weiny
@ 2020-05-13 11:33   ` Jan Kara
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Kara @ 2020-05-13 11:33 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Tue 12-05-20 22:43:21, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> To prevent complications with in memory inodes we only set S_DAX on
> inode load.  FS_XFLAG_DAX can be changed at any time and S_DAX will
> change after inode eviction and reload.
> 
> Add init bool to ext4_set_inode_flags() to indicate if the inode is
> being newly initialized.
> 
> Assert that S_DAX is not set on an inode which is just being loaded.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

The patch looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza


> 
> ---
> Changes from RFC:
> 	Change J_ASSERT() to WARN_ON_ONCE()
> 	Fix bug which would clear S_DAX incorrectly
> ---
>  fs/ext4/ext4.h   |  2 +-
>  fs/ext4/ialloc.c |  2 +-
>  fs/ext4/inode.c  | 13 ++++++++++---
>  fs/ext4/ioctl.c  |  3 ++-
>  fs/ext4/super.c  |  4 ++--
>  fs/ext4/verity.c |  2 +-
>  6 files changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 1a3daf2d18ef..86a0994332ce 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2692,7 +2692,7 @@ extern int ext4_can_truncate(struct inode *inode);
>  extern int ext4_truncate(struct inode *);
>  extern int ext4_break_layouts(struct inode *);
>  extern int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length);
> -extern void ext4_set_inode_flags(struct inode *);
> +extern void ext4_set_inode_flags(struct inode *, bool init);
>  extern int ext4_alloc_da_blocks(struct inode *inode);
>  extern void ext4_set_aops(struct inode *inode);
>  extern int ext4_writepage_trans_blocks(struct inode *);
> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
> index 4b8c9a9bdf0c..7941c140723f 100644
> --- a/fs/ext4/ialloc.c
> +++ b/fs/ext4/ialloc.c
> @@ -1116,7 +1116,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
>  	ei->i_block_group = group;
>  	ei->i_last_alloc_group = ~0;
>  
> -	ext4_set_inode_flags(inode);
> +	ext4_set_inode_flags(inode, true);
>  	if (IS_DIRSYNC(inode))
>  		ext4_handle_sync(handle);
>  	if (insert_inode_locked(inode) < 0) {
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d3a4c2ed7a1c..23e42a223235 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4419,11 +4419,13 @@ static bool ext4_should_enable_dax(struct inode *inode)
>  	return false;
>  }
>  
> -void ext4_set_inode_flags(struct inode *inode)
> +void ext4_set_inode_flags(struct inode *inode, bool init)
>  {
>  	unsigned int flags = EXT4_I(inode)->i_flags;
>  	unsigned int new_fl = 0;
>  
> +	WARN_ON_ONCE(IS_DAX(inode) && init);
> +
>  	if (flags & EXT4_SYNC_FL)
>  		new_fl |= S_SYNC;
>  	if (flags & EXT4_APPEND_FL)
> @@ -4434,8 +4436,13 @@ void ext4_set_inode_flags(struct inode *inode)
>  		new_fl |= S_NOATIME;
>  	if (flags & EXT4_DIRSYNC_FL)
>  		new_fl |= S_DIRSYNC;
> -	if (ext4_should_enable_dax(inode))
> +
> +	/* Because of the way inode_set_flags() works we must preserve S_DAX
> +	 * here if already set. */
> +	new_fl |= (inode->i_flags & S_DAX);
> +	if (init && ext4_should_enable_dax(inode))
>  		new_fl |= S_DAX;
> +
>  	if (flags & EXT4_ENCRYPT_FL)
>  		new_fl |= S_ENCRYPTED;
>  	if (flags & EXT4_CASEFOLD_FL)
> @@ -4649,7 +4656,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
>  		 * not initialized on a new filesystem. */
>  	}
>  	ei->i_flags = le32_to_cpu(raw_inode->i_flags);
> -	ext4_set_inode_flags(inode);
> +	ext4_set_inode_flags(inode, true);
>  	inode->i_blocks = ext4_inode_blocks(raw_inode, ei);
>  	ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo);
>  	if (ext4_has_feature_64bit(sb))
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index 5813e5e73eab..145083e8cd1e 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -381,7 +381,8 @@ static int ext4_ioctl_setflags(struct inode *inode,
>  			ext4_clear_inode_flag(inode, i);
>  	}
>  
> -	ext4_set_inode_flags(inode);
> +	ext4_set_inode_flags(inode, false);
> +
>  	inode->i_ctime = current_time(inode);
>  
>  	err = ext4_mark_iloc_dirty(handle, inode, &iloc);
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index d0434b513919..5ec900fdf73c 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1344,7 +1344,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
>  			ext4_clear_inode_state(inode,
>  					EXT4_STATE_MAY_INLINE_DATA);
> -			ext4_set_inode_flags(inode);
> +			ext4_set_inode_flags(inode, false);
>  		}
>  		return res;
>  	}
> @@ -1367,7 +1367,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  				    ctx, len, 0);
>  	if (!res) {
>  		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> -		ext4_set_inode_flags(inode);
> +		ext4_set_inode_flags(inode, false);
>  		res = ext4_mark_inode_dirty(handle, inode);
>  		if (res)
>  			EXT4_ERROR_INODE(inode, "Failed to mark inode dirty");
> diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> index f05a09fb2ae4..89a155ece323 100644
> --- a/fs/ext4/verity.c
> +++ b/fs/ext4/verity.c
> @@ -244,7 +244,7 @@ static int ext4_end_enable_verity(struct file *filp, const void *desc,
>  		if (err)
>  			goto out_stop;
>  		ext4_set_inode_flag(inode, EXT4_INODE_VERITY);
> -		ext4_set_inode_flags(inode);
> +		ext4_set_inode_flags(inode, false);
>  		err = ext4_mark_iloc_dirty(handle, inode, &iloc);
>  	}
>  out_stop:
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state
  2020-05-13  5:43 ` [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state ira.weiny
@ 2020-05-13 14:35   ` Jan Kara
  2020-05-13 18:17     ` Darrick J. Wong
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2020-05-13 14:35 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Tue 12-05-20 22:43:22, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
> operate the same.
> 
> Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
> it and EXT4_MOUNT_DAX_ALWAYS appropriately.
> 
> We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
> 
> https://lore.kernel.org/lkml/20200405061945.GA94792@iweiny-DESK2.sc.intel.com/
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from RFC:
> 	Combine remount check for DAX_NEVER with DAX_ALWAYS
> 	Update ext4_should_enable_dax()
> ---
>  fs/ext4/ext4.h  |  1 +
>  fs/ext4/inode.c |  2 ++
>  fs/ext4/super.c | 43 +++++++++++++++++++++++++++++++++++++------
>  3 files changed, 40 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 86a0994332ce..01d1de838896 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1168,6 +1168,7 @@ struct ext4_inode_info {
>  						      blocks */
>  #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
>  						      file systems */
> +#define EXT4_MOUNT2_DAX_NEVER		0x00000008 /* Do not allow Direct Access */
>  
>  #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
>  						specified journal checksum */
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 23e42a223235..140b1930e2f4 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4400,6 +4400,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
>  
>  static bool ext4_should_enable_dax(struct inode *inode)
>  {
> +	if (test_opt2(inode->i_sb, DAX_NEVER))
> +		return false;
>  	if (!S_ISREG(inode->i_mode))
>  		return false;
>  	if (ext4_should_journal_data(inode))
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 5ec900fdf73c..e01a040a58a9 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1505,6 +1505,7 @@ enum {
>  	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
>  	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
>  	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
> +	Opt_dax_str,
>  	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
>  	Opt_nowarn_on_error, Opt_mblk_io_submit,
>  	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
> @@ -1570,6 +1571,7 @@ static const match_table_t tokens = {
>  	{Opt_barrier, "barrier"},
>  	{Opt_nobarrier, "nobarrier"},
>  	{Opt_i_version, "i_version"},
> +	{Opt_dax_str, "dax=%s"},

Hum, maybe it would be easier to handle this like we do with e.g. 'data='
mount option? I.e. like:

	{Opt_dax_always, "dax=always"},
	{Opt_dax_never, "dax=never"},
	{Opt_dax_inode, "dax=inode"},

and then handle these three tokens... Not that it would be a big difference
but that's why we usually handle mount options with small "enums" in ext4.
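
For illustration only, a minimal user-space sketch of that three-token
approach could look like the following.  The demo_* names and the flag values
are hypothetical stand-ins, not the real ext4 definitions, and the actual
handle_mount_opt() does considerably more than this.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for EXT4_MOUNT_DAX_ALWAYS / EXT4_MOUNT2_DAX_NEVER. */
#define DEMO_MOUNT_DAX_ALWAYS	0x00200u
#define DEMO_MOUNT2_DAX_NEVER	0x00008u

/* One token per accepted spelling, in the style of the "data=" options. */
enum demo_token {
	DEMO_OPT_DAX,		/* "dax" */
	DEMO_OPT_DAX_ALWAYS,	/* "dax=always" */
	DEMO_OPT_DAX_NEVER,	/* "dax=never" */
	DEMO_OPT_DAX_INODE,	/* "dax=inode" */
	DEMO_OPT_OTHER		/* anything that is not a DAX option */
};

/* Hypothetical stand-in for the two mount option words in ext4_sb_info. */
struct demo_sbi {
	unsigned int s_mount_opt;
	unsigned int s_mount_opt2;
};

/* Returns true if the token was a DAX token and the flags were updated. */
static bool demo_handle_dax_token(struct demo_sbi *sbi, enum demo_token token)
{
	switch (token) {
	case DEMO_OPT_DAX:		/* plain "dax" behaves like "dax=always" */
	case DEMO_OPT_DAX_ALWAYS:
		sbi->s_mount_opt |= DEMO_MOUNT_DAX_ALWAYS;
		sbi->s_mount_opt2 &= ~DEMO_MOUNT2_DAX_NEVER;
		return true;
	case DEMO_OPT_DAX_NEVER:
		sbi->s_mount_opt2 |= DEMO_MOUNT2_DAX_NEVER;
		sbi->s_mount_opt &= ~DEMO_MOUNT_DAX_ALWAYS;
		return true;
	case DEMO_OPT_DAX_INODE:	/* neither bit set: per-inode flag decides */
		sbi->s_mount_opt &= ~DEMO_MOUNT_DAX_ALWAYS;
		sbi->s_mount_opt2 &= ~DEMO_MOUNT2_DAX_NEVER;
		return true;
	default:
		return false;
	}
}
```

With one token per value there is no string to duplicate, compare or free, and
an unknown "dax=..." value simply fails to match any table entry.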

								Honza

>  	{Opt_dax, "dax"},
>  	{Opt_stripe, "stripe=%u"},
>  	{Opt_delalloc, "delalloc"},
> @@ -1767,6 +1769,7 @@ static const struct mount_opts {
>  	{Opt_min_batch_time, 0, MOPT_GTE0},
>  	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
>  	{Opt_init_itable, 0, MOPT_GTE0},
> +	{Opt_dax_str, 0, MOPT_STRING},
>  	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
>  	{Opt_stripe, 0, MOPT_GTE0},
>  	{Opt_resuid, 0, MOPT_GTE0},
> @@ -2076,13 +2079,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
>  		}
>  		sbi->s_jquota_fmt = m->mount_opt;
>  #endif
> -	} else if (token == Opt_dax) {
> +	} else if (token == Opt_dax || token == Opt_dax_str) {
>  #ifdef CONFIG_FS_DAX
> -		ext4_msg(sb, KERN_WARNING,
> -		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> -		sbi->s_mount_opt |= m->mount_opt;
> +		char *tmp = match_strdup(&args[0]);
> +
> +		if (!tmp || !strcmp(tmp, "always")) {
> +			ext4_msg(sb, KERN_WARNING,
> +				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> +			sbi->s_mount_opt |= EXT4_MOUNT_DAX_ALWAYS;
> +			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
> +		} else if (!strcmp(tmp, "never")) {
> +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> +			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
> +		} else if (!strcmp(tmp, "inode")) {
> +			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
> +			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
> +		} else {
> +			ext4_msg(sb, KERN_WARNING, "DAX invalid option.");
> +			kfree(tmp);
> +			return -1;
> +		}
> +
> +		kfree(tmp);
>  #else
>  		ext4_msg(sb, KERN_INFO, "dax option not supported");
> +		sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> +		sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
>  		return -1;
>  #endif
>  	} else if (token == Opt_data_err_abort) {
> @@ -2306,6 +2328,13 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
>  	if (DUMMY_ENCRYPTION_ENABLED(sbi))
>  		SEQ_OPTS_PUTS("test_dummy_encryption");
>  
> +	if (test_opt2(sb, DAX_NEVER))
> +		SEQ_OPTS_PUTS("dax=never");
> +	else if (test_opt(sb, DAX_ALWAYS))
> +		SEQ_OPTS_PUTS("dax=always");
> +	else
> +		SEQ_OPTS_PUTS("dax=inode");
> +
>  	ext4_show_quota_options(seq, sb);
>  	return 0;
>  }
> @@ -5425,10 +5454,12 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
>  		goto restore_opts;
>  	}
>  
> -	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
> +	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS ||
> +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_DAX_NEVER) {
>  		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
> -			"dax flag with busy inodes while remounting");
> +			"dax mount option with busy inodes while remounting");
>  		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
> +		sbi->s_mount_opt2 ^= EXT4_MOUNT2_DAX_NEVER;
>  	}
>  
>  	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 8/9] fs/ext4: Introduce DAX inode flag
  2020-05-13  5:43 ` [PATCH 8/9] fs/ext4: Introduce DAX inode flag ira.weiny
@ 2020-05-13 14:47   ` Jan Kara
  2020-05-13 21:41     ` Ira Weiny
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2020-05-13 14:47 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Tue 12-05-20 22:43:23, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> 
> Set the flag to be user visible and changeable.  Set the flag to be
> inherited.  Allow applications to change the flag at any time.
> 
> Finally, on regular files, flag the inode to not be cached to facilitate
> changing S_DAX on the next creation of the inode.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Change from RFC:
> 	use new d_mark_dontcache()
> 	Allow caching if ALWAYS/NEVER is set
> 	Rebased to latest Linus master
> 	Change flag to unused 0x01000000
> 	update ext4_should_enable_dax()
> ---
>  fs/ext4/ext4.h  | 13 +++++++++----
>  fs/ext4/inode.c |  4 +++-
>  fs/ext4/ioctl.c | 25 ++++++++++++++++++++++++-
>  3 files changed, 36 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 01d1de838896..715f8f2029b2 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -415,13 +415,16 @@ struct flex_groups {
>  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
>  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
>  /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
> +
> +#define EXT4_DAX_FL			0x01000000 /* Inode is DAX */
> +
>  #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
>  #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
>  #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
>  #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
>  
> -#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
> -#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
> +#define EXT4_FL_USER_VISIBLE		0x715BDFFF /* User visible flags */
> +#define EXT4_FL_USER_MODIFIABLE		0x614BC0FF /* User modifiable flags */

Hum, I think this was already mentioned but there are also definitions in
include/uapi/linux/fs.h which should be kept in sync... Also if the DAX flag
gets modified through FS_IOC_SETFLAGS, we should call ext4_dax_dontcache() as
well, shouldn't we?

> @@ -802,6 +807,21 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
>  	return error;
>  }
>  
> +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> +{
> +	struct ext4_inode_info *ei = EXT4_I(inode);
> +
> +	if (S_ISDIR(inode->i_mode))
> +		return;
> +
> +	if (test_opt2(inode->i_sb, DAX_NEVER) ||
> +	    test_opt(inode->i_sb, DAX_ALWAYS))
> +		return;
> +
> +	if (((ei->i_flags ^ flags) & EXT4_DAX_FL) == EXT4_DAX_FL)
> +		d_mark_dontcache(inode);
> +}
> +
>  long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>  {
>  	struct inode *inode = file_inode(filp);
> @@ -1267,6 +1287,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>  			return err;
>  
>  		inode_lock(inode);
> +
> +		ext4_dax_dontcache(inode, flags);
> +

I don't think we should set the dontcache flag when setting of the DAX flag
fails (it could even be a security issue). So I think you'll have to check
whether the DAX flag is being changed, call vfs_ioc_fssetxattr_check(), and
only if it succeeded and the DAX flag was changing call ext4_dax_dontcache().
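
As a rough user-space sketch of that ordering (the demo_* names, the
simplified inode structure and the boolean "permitted" argument are all
hypothetical; demo_check() merely stands in for vfs_ioc_fssetxattr_check(),
and the mount option tests from the patch are left out):

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for EXT4_DAX_FL. */
#define DEMO_DAX_FL	0x01000000u

/* Simplified model of the inode state that matters here. */
struct demo_inode {
	unsigned int i_flags;	/* current on-disk style flags */
	bool is_dir;		/* directories are never marked */
	bool dontcache;		/* models the effect of d_mark_dontcache() */
};

/* Stand-in for the permission/validity check: 0 on success, -EPERM on failure. */
static int demo_check(bool permitted)
{
	return permitted ? 0 : -EPERM;
}

/* True if the DAX bit differs between the current and the requested flags. */
static bool demo_dax_flag_changes(const struct demo_inode *inode,
				  unsigned int new_flags)
{
	return ((inode->i_flags ^ new_flags) & DEMO_DAX_FL) != 0;
}

static int demo_setxattr(struct demo_inode *inode, unsigned int new_flags,
			 bool permitted)
{
	/* 1. Note whether the DAX flag is being changed. */
	bool dax_changing = demo_dax_flag_changes(inode, new_flags);
	/* 2. Run the check before touching any inode state. */
	int err = demo_check(permitted);

	if (err)
		return err;	/* a failed request leaves the inode untouched */

	/* 3. Only now mark the inode, and only if the DAX bit really changes. */
	if (dax_changing && !inode->is_dir)
		inode->dontcache = true;

	inode->i_flags = new_flags;
	return 0;
}
```

The point of the ordering is that a caller who is refused by the check cannot
force eviction of the inode as a side effect.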

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state
  2020-05-13 14:35   ` Jan Kara
@ 2020-05-13 18:17     ` Darrick J. Wong
  2020-05-13 19:53       ` Ira Weiny
  0 siblings, 1 reply; 29+ messages in thread
From: Darrick J. Wong @ 2020-05-13 18:17 UTC (permalink / raw)
  To: Jan Kara
  Cc: ira.weiny, linux-ext4, Andreas Dilger, Theodore Y. Ts'o,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, linux-fsdevel, linux-kernel

On Wed, May 13, 2020 at 04:35:26PM +0200, Jan Kara wrote:
> On Tue 12-05-20 22:43:22, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
> > operate the same.
> > 
> > Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
> > it and EXT4_MOUNT_DAX_ALWAYS appropriately.
> > 
> > We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
> > 
> > https://lore.kernel.org/lkml/20200405061945.GA94792@iweiny-DESK2.sc.intel.com/
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > ---
> > Changes from RFC:
> > 	Combine remount check for DAX_NEVER with DAX_ALWAYS
> > 	Update ext4_should_enable_dax()
> > ---
> >  fs/ext4/ext4.h  |  1 +
> >  fs/ext4/inode.c |  2 ++
> >  fs/ext4/super.c | 43 +++++++++++++++++++++++++++++++++++++------
> >  3 files changed, 40 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > index 86a0994332ce..01d1de838896 100644
> > --- a/fs/ext4/ext4.h
> > +++ b/fs/ext4/ext4.h
> > @@ -1168,6 +1168,7 @@ struct ext4_inode_info {
> >  						      blocks */
> >  #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
> >  						      file systems */
> > +#define EXT4_MOUNT2_DAX_NEVER		0x00000008 /* Do not allow Direct Access */
> >  
> >  #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
> >  						specified journal checksum */
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 23e42a223235..140b1930e2f4 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -4400,6 +4400,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
> >  
> >  static bool ext4_should_enable_dax(struct inode *inode)
> >  {
> > +	if (test_opt2(inode->i_sb, DAX_NEVER))
> > +		return false;
> >  	if (!S_ISREG(inode->i_mode))
> >  		return false;
> >  	if (ext4_should_journal_data(inode))
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 5ec900fdf73c..e01a040a58a9 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1505,6 +1505,7 @@ enum {
> >  	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
> >  	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> >  	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
> > +	Opt_dax_str,
> >  	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
> >  	Opt_nowarn_on_error, Opt_mblk_io_submit,
> >  	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
> > @@ -1570,6 +1571,7 @@ static const match_table_t tokens = {
> >  	{Opt_barrier, "barrier"},
> >  	{Opt_nobarrier, "nobarrier"},
> >  	{Opt_i_version, "i_version"},
> > +	{Opt_dax_str, "dax=%s"},
> 
> Hum, maybe it would be easier to handle this like we do with e.g. 'data='
> mount option? I.e. like:
> 
> 	{Opt_dax_always, "dax=always"},
> 	{Opt_dax_never, "dax=never"},
> 	{Opt_dax_inode, "dax=inode"},
> 
> and then handle these three tokens... Not that it would be a big difference
> but that's why we usually handle mount options with small "enums" in ext4.

I was hoping that we could hoist the tristate enum bits out of XFS and
simply share them across the three DAX filesystems, but I have no idea
if that will work with a filesystem that hasn't been converted to the
new mount option parsing api.  I'm betting no. :/

(FWIW see enum xfs_dax_mode and struct constant_table dax_param_enums in
fs/xfs/xfs_super.c in the for-next tree.)

Hm, otoh I don't see any recent posting of an ext4 mount parsing
conversion series, so yeah this is probably as good as can be done until
that happens.
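
For illustration only, here is a userspace sketch of what a shared tri-state
could look like: a small enum plus a name table, in the spirit of enum
xfs_dax_mode and dax_param_enums.  The names dax_mode, dax_mode_names and
dax_mode_parse() are hypothetical, and none of this uses the kernel's mount
parsing API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tri-state shared by the DAX-capable filesystems. */
enum dax_mode {
	DAX_MODE_INODE,		/* follow the per-inode flag (default) */
	DAX_MODE_ALWAYS,	/* use DAX wherever it is possible */
	DAX_MODE_NEVER,		/* never use DAX */
};

struct dax_mode_name {
	const char *name;
	enum dax_mode mode;
};

static const struct dax_mode_name dax_mode_names[] = {
	{ "inode",  DAX_MODE_INODE },
	{ "always", DAX_MODE_ALWAYS },
	{ "never",  DAX_MODE_NEVER },
};

/*
 * Parse the value of "dax=<value>".  A NULL value stands for a bare
 * "-o dax", which keeps its old meaning of "always".
 * Returns 0 and stores the mode on success, -1 on an unknown value.
 */
static int dax_mode_parse(const char *value, enum dax_mode *mode)
{
	size_t i;

	assert(mode != NULL);

	if (!value) {
		*mode = DAX_MODE_ALWAYS;
		return 0;
	}
	for (i = 0; i < sizeof(dax_mode_names) / sizeof(dax_mode_names[0]); i++) {
		if (!strcmp(value, dax_mode_names[i].name)) {
			*mode = dax_mode_names[i].mode;
			return 0;
		}
	}
	return -1;
}
```

A filesystem on the old parser could call something like dax_mode_parse() from
its string handler and then translate the result into its own flag bits.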

--D

> 								Honza
> 
> >  	{Opt_dax, "dax"},
> >  	{Opt_stripe, "stripe=%u"},
> >  	{Opt_delalloc, "delalloc"},
> > @@ -1767,6 +1769,7 @@ static const struct mount_opts {
> >  	{Opt_min_batch_time, 0, MOPT_GTE0},
> >  	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
> >  	{Opt_init_itable, 0, MOPT_GTE0},
> > +	{Opt_dax_str, 0, MOPT_STRING},
> >  	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
> >  	{Opt_stripe, 0, MOPT_GTE0},
> >  	{Opt_resuid, 0, MOPT_GTE0},
> > @@ -2076,13 +2079,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
> >  		}
> >  		sbi->s_jquota_fmt = m->mount_opt;
> >  #endif
> > -	} else if (token == Opt_dax) {
> > +	} else if (token == Opt_dax || token == Opt_dax_str) {
> >  #ifdef CONFIG_FS_DAX
> > -		ext4_msg(sb, KERN_WARNING,
> > -		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> > -		sbi->s_mount_opt |= m->mount_opt;
> > +		char *tmp = match_strdup(&args[0]);
> > +
> > +		if (!tmp || !strcmp(tmp, "always")) {
> > +			ext4_msg(sb, KERN_WARNING,
> > +				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> > +			sbi->s_mount_opt |= EXT4_MOUNT_DAX_ALWAYS;
> > +			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
> > +		} else if (!strcmp(tmp, "never")) {
> > +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> > +			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
> > +		} else if (!strcmp(tmp, "inode")) {
> > +			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
> > +			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
> > +		} else {
> > +			ext4_msg(sb, KERN_WARNING, "DAX invalid option.");
> > +			kfree(tmp);
> > +			return -1;
> > +		}
> > +
> > +		kfree(tmp);
> >  #else
> >  		ext4_msg(sb, KERN_INFO, "dax option not supported");
> > +		sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> > +		sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
> >  		return -1;
> >  #endif
> >  	} else if (token == Opt_data_err_abort) {
> > @@ -2306,6 +2328,13 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
> >  	if (DUMMY_ENCRYPTION_ENABLED(sbi))
> >  		SEQ_OPTS_PUTS("test_dummy_encryption");
> >  
> > +	if (test_opt2(sb, DAX_NEVER))
> > +		SEQ_OPTS_PUTS("dax=never");
> > +	else if (test_opt(sb, DAX_ALWAYS))
> > +		SEQ_OPTS_PUTS("dax=always");
> > +	else
> > +		SEQ_OPTS_PUTS("dax=inode");
> > +
> >  	ext4_show_quota_options(seq, sb);
> >  	return 0;
> >  }
> > @@ -5425,10 +5454,12 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
> >  		goto restore_opts;
> >  	}
> >  
> > -	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
> > +	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS ||
> > +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_DAX_NEVER) {
> >  		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
> > -			"dax flag with busy inodes while remounting");
> > +			"dax mount option with busy inodes while remounting");
> >  		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
> > +		sbi->s_mount_opt2 ^= EXT4_MOUNT2_DAX_NEVER;
> >  	}
> >  
> >  	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
> > -- 
> > 2.25.1
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

* Re: [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state
  2020-05-13 18:17     ` Darrick J. Wong
@ 2020-05-13 19:53       ` Ira Weiny
  0 siblings, 0 replies; 29+ messages in thread
From: Ira Weiny @ 2020-05-13 19:53 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jan Kara, linux-ext4, Andreas Dilger, Theodore Y. Ts'o,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, linux-fsdevel, linux-kernel

On Wed, May 13, 2020 at 11:17:17AM -0700, Darrick J. Wong wrote:
> On Wed, May 13, 2020 at 04:35:26PM +0200, Jan Kara wrote:
> > On Tue 12-05-20 22:43:22, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
> > > operate the same.
> > > 
> > > Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
> > > it and EXT4_MOUNT_DAX_ALWAYS appropriately.
> > > 
> > > We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
> > > 
> > > https://lore.kernel.org/lkml/20200405061945.GA94792@iweiny-DESK2.sc.intel.com/
> > > 
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > ---
> > > Changes from RFC:
> > > 	Combine remount check for DAX_NEVER with DAX_ALWAYS
> > > 	Update ext4_should_enable_dax()
> > > ---
> > >  fs/ext4/ext4.h  |  1 +
> > >  fs/ext4/inode.c |  2 ++
> > >  fs/ext4/super.c | 43 +++++++++++++++++++++++++++++++++++++------
> > >  3 files changed, 40 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > > index 86a0994332ce..01d1de838896 100644
> > > --- a/fs/ext4/ext4.h
> > > +++ b/fs/ext4/ext4.h
> > > @@ -1168,6 +1168,7 @@ struct ext4_inode_info {
> > >  						      blocks */
> > >  #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
> > >  						      file systems */
> > > +#define EXT4_MOUNT2_DAX_NEVER		0x00000008 /* Do not allow Direct Access */
> > >  
> > >  #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
> > >  						specified journal checksum */
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index 23e42a223235..140b1930e2f4 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -4400,6 +4400,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
> > >  
> > >  static bool ext4_should_enable_dax(struct inode *inode)
> > >  {
> > > +	if (test_opt2(inode->i_sb, DAX_NEVER))
> > > +		return false;
> > >  	if (!S_ISREG(inode->i_mode))
> > >  		return false;
> > >  	if (ext4_should_journal_data(inode))
> > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > > index 5ec900fdf73c..e01a040a58a9 100644
> > > --- a/fs/ext4/super.c
> > > +++ b/fs/ext4/super.c
> > > @@ -1505,6 +1505,7 @@ enum {
> > >  	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
> > >  	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> > >  	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
> > > +	Opt_dax_str,
> > >  	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
> > >  	Opt_nowarn_on_error, Opt_mblk_io_submit,
> > >  	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
> > > @@ -1570,6 +1571,7 @@ static const match_table_t tokens = {
> > >  	{Opt_barrier, "barrier"},
> > >  	{Opt_nobarrier, "nobarrier"},
> > >  	{Opt_i_version, "i_version"},
> > > +	{Opt_dax_str, "dax=%s"},
> > 
> > Hum, maybe it would be easier to handle this like we do with e.g. 'data='
> > mount option? I.e. like:
> > 
> > 	{Opt_dax_always, "dax=always"},
> > 	{Opt_dax_never, "dax=never"},
> > 	{Opt_dax_inode, "dax=inode"},
> > 
> > and then handle these three tokens... Not that it would be a big difference
> > but that's why we usually handle mount options with small "enums" in ext4.

We could, but at this point it would need to be reworked for the new option
parsing code anyway...

I've been waiting to see whether another round of those patches would be
submitted, but it looks like they are taking more work.
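
For reference, a userspace sketch of the three-token variant, mapping each
token onto the two flag bits and back to the string shown in the mount
options.  The token names follow the suggestion quoted above; struct sb_opts,
the bit values, dax_apply_token() and dax_show() are made up for this example
and are not the ext4 definitions.

```c
#include <assert.h>

/* Tokens as suggested above; a bare "dax" stays a separate token. */
enum { Opt_dax, Opt_dax_always, Opt_dax_never, Opt_dax_inode };

/* Illustrative bit values only, not the ones from ext4.h. */
#define MOUNT_DAX_ALWAYS	0x1u	/* lives in s_mount_opt */
#define MOUNT2_DAX_NEVER	0x1u	/* lives in s_mount_opt2 */

/* Hypothetical stand-in for the two option words in ext4_sb_info. */
struct sb_opts {
	unsigned int s_mount_opt;
	unsigned int s_mount_opt2;
};

/* Apply one DAX token; returns 0 on success, -1 for any other token. */
static int dax_apply_token(struct sb_opts *o, int token)
{
	assert(o);

	switch (token) {
	case Opt_dax:
	case Opt_dax_always:
		o->s_mount_opt |= MOUNT_DAX_ALWAYS;
		o->s_mount_opt2 &= ~MOUNT2_DAX_NEVER;
		break;
	case Opt_dax_never:
		o->s_mount_opt2 |= MOUNT2_DAX_NEVER;
		o->s_mount_opt &= ~MOUNT_DAX_ALWAYS;
		break;
	case Opt_dax_inode:
		o->s_mount_opt &= ~MOUNT_DAX_ALWAYS;
		o->s_mount_opt2 &= ~MOUNT2_DAX_NEVER;
		break;
	default:
		return -1;
	}
	/* The two bits encode a tri-state, so both set is never valid. */
	assert(!((o->s_mount_opt & MOUNT_DAX_ALWAYS) &&
		 (o->s_mount_opt2 & MOUNT2_DAX_NEVER)));
	return 0;
}

/* The string a show_options helper would print for the current state. */
static const char *dax_show(const struct sb_opts *o)
{
	if (o->s_mount_opt2 & MOUNT2_DAX_NEVER)
		return "dax=never";
	if (o->s_mount_opt & MOUNT_DAX_ALWAYS)
		return "dax=always";
	return "dax=inode";
}
```

Compared with "dax=%s", this needs no match_strdup()/kfree() pair and no
"invalid option" branch, since unknown values never match a token.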

> 
> I was hoping that we could hoist the tristate enum bits out of XFS and
> simply share them across the three DAX filesystems, but I have no idea
> if that will work with a filesystem that hasn't been converted to the
> new mount option parsing api.  I'm betting no. :/
> 
> (FWIW see enum xfs_dax_mode and struct constant_table dax_param_enums in
> fs/xfs/xfs_super.c in the for-next tree.)
> 
> Hm, otoh I don't see any recent posting of an ext4 mount parsing
> conversion series, so yeah this is probably as good as can be done until
> that happens.
>

That is my thinking.

I wanted to get this series out because as a feature it would be nice if this
went in together with XFS for 5.8.  But I understand if we want to wait.

Ira

> 
> --D
> 
> > 								Honza
> > 
> > >  	{Opt_dax, "dax"},
> > >  	{Opt_stripe, "stripe=%u"},
> > >  	{Opt_delalloc, "delalloc"},
> > > @@ -1767,6 +1769,7 @@ static const struct mount_opts {
> > >  	{Opt_min_batch_time, 0, MOPT_GTE0},
> > >  	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
> > >  	{Opt_init_itable, 0, MOPT_GTE0},
> > > +	{Opt_dax_str, 0, MOPT_STRING},
> > >  	{Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
> > >  	{Opt_stripe, 0, MOPT_GTE0},
> > >  	{Opt_resuid, 0, MOPT_GTE0},
> > > @@ -2076,13 +2079,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
> > >  		}
> > >  		sbi->s_jquota_fmt = m->mount_opt;
> > >  #endif
> > > -	} else if (token == Opt_dax) {
> > > +	} else if (token == Opt_dax || token == Opt_dax_str) {
> > >  #ifdef CONFIG_FS_DAX
> > > -		ext4_msg(sb, KERN_WARNING,
> > > -		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> > > -		sbi->s_mount_opt |= m->mount_opt;
> > > +		char *tmp = match_strdup(&args[0]);
> > > +
> > > +		if (!tmp || !strcmp(tmp, "always")) {
> > > +			ext4_msg(sb, KERN_WARNING,
> > > +				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
> > > +			sbi->s_mount_opt |= EXT4_MOUNT_DAX_ALWAYS;
> > > +			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
> > > +		} else if (!strcmp(tmp, "never")) {
> > > +			sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> > > +			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
> > > +		} else if (!strcmp(tmp, "inode")) {
> > > +			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
> > > +			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_DAX_NEVER;
> > > +		} else {
> > > +			ext4_msg(sb, KERN_WARNING, "DAX invalid option.");
> > > +			kfree(tmp);
> > > +			return -1;
> > > +		}
> > > +
> > > +		kfree(tmp);
> > >  #else
> > >  		ext4_msg(sb, KERN_INFO, "dax option not supported");
> > > +		sbi->s_mount_opt2 |= EXT4_MOUNT2_DAX_NEVER;
> > > +		sbi->s_mount_opt &= ~EXT4_MOUNT_DAX_ALWAYS;
> > >  		return -1;
> > >  #endif
> > >  	} else if (token == Opt_data_err_abort) {
> > > @@ -2306,6 +2328,13 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
> > >  	if (DUMMY_ENCRYPTION_ENABLED(sbi))
> > >  		SEQ_OPTS_PUTS("test_dummy_encryption");
> > >  
> > > +	if (test_opt2(sb, DAX_NEVER))
> > > +		SEQ_OPTS_PUTS("dax=never");
> > > +	else if (test_opt(sb, DAX_ALWAYS))
> > > +		SEQ_OPTS_PUTS("dax=always");
> > > +	else
> > > +		SEQ_OPTS_PUTS("dax=inode");
> > > +
> > >  	ext4_show_quota_options(seq, sb);
> > >  	return 0;
> > >  }
> > > @@ -5425,10 +5454,12 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
> > >  		goto restore_opts;
> > >  	}
> > >  
> > > -	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
> > > +	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS ||
> > > +	    (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_DAX_NEVER) {
> > >  		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
> > > -			"dax flag with busy inodes while remounting");
> > > +			"dax mount option with busy inodes while remounting");
> > >  		sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
> > > +		sbi->s_mount_opt2 ^= EXT4_MOUNT2_DAX_NEVER;
> > >  	}
> > >  
> > >  	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
> > > -- 
> > > 2.25.1
> > > 
> > -- 
> > Jan Kara <jack@suse.com>
> > SUSE Labs, CR

* Re: [PATCH 8/9] fs/ext4: Introduce DAX inode flag
  2020-05-13 14:47   ` Jan Kara
@ 2020-05-13 21:41     ` Ira Weiny
  2020-05-14  6:43       ` Jan Kara
  0 siblings, 1 reply; 29+ messages in thread
From: Ira Weiny @ 2020-05-13 21:41 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Al Viro,
	Dan Williams, Dave Chinner, Christoph Hellwig, Jeff Moyer,
	Darrick J. Wong, linux-fsdevel, linux-kernel

On Wed, May 13, 2020 at 04:47:06PM +0200, Jan Kara wrote:
> On Tue 12-05-20 22:43:23, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> > 
> > Set the flag to be user visible and changeable.  Set the flag to be
> > inherited.  Allow applications to change the flag at any time.
> > 
> > Finally, on regular files, flag the inode to not be cached to facilitate
> > changing S_DAX on the next creation of the inode.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > ---
> > Change from RFC:
> > 	use new d_mark_dontcache()
> > 	Allow caching if ALWAYS/NEVER is set
> > 	Rebased to latest Linus master
> > 	Change flag to unused 0x01000000
> > 	update ext4_should_enable_dax()
> > ---
> >  fs/ext4/ext4.h  | 13 +++++++++----
> >  fs/ext4/inode.c |  4 +++-
> >  fs/ext4/ioctl.c | 25 ++++++++++++++++++++++++-
> >  3 files changed, 36 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > index 01d1de838896..715f8f2029b2 100644
> > --- a/fs/ext4/ext4.h
> > +++ b/fs/ext4/ext4.h
> > @@ -415,13 +415,16 @@ struct flex_groups {
> >  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
> >  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> >  /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
> > +
> > +#define EXT4_DAX_FL			0x01000000 /* Inode is DAX */
> > +
> >  #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
> >  #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
> >  #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
> >  #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
> >  
> > -#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
> > -#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
> > +#define EXT4_FL_USER_VISIBLE		0x715BDFFF /* User visible flags */
> > +#define EXT4_FL_USER_MODIFIABLE		0x614BC0FF /* User modifiable flags */
> 
> Hum, I think this was already mentioned but there are also definitions in
> include/uapi/linux/fs.h which should be kept in sync... Also if DAX flag
> gets modified through FS_IOC_SETFLAGS, we should call ext4_dax_dontcache() as
> well, shouldn't we?

Ah yes, it was mentioned.  Sorry.

> 
> > @@ -802,6 +807,21 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
> >  	return error;
> >  }
> >  
> > +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> > +{
> > +	struct ext4_inode_info *ei = EXT4_I(inode);
> > +
> > +	if (S_ISDIR(inode->i_mode))
> > +		return;
> > +
> > +	if (test_opt2(inode->i_sb, DAX_NEVER) ||
> > +	    test_opt(inode->i_sb, DAX_ALWAYS))
> > +		return;
> > +
> > +	if (((ei->i_flags ^ flags) & EXT4_DAX_FL) == EXT4_DAX_FL)
> > +		d_mark_dontcache(inode);
> > +}
> > +
> >  long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> >  {
> >  	struct inode *inode = file_inode(filp);
> > @@ -1267,6 +1287,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> >  			return err;
> >  
> >  		inode_lock(inode);
> > +
> > +		ext4_dax_dontcache(inode, flags);
> > +
> 
> I don't think we should set the dontcache flag when setting the DAX flag
> fails (it could even be a security issue).

good point.

>
> So I think you'll have to check
> whether DAX flag is being changed,

ext4_dax_dontcache() does check if the flag is being changed.

> call vfs_ioc_fssetxattr_check(), and
> only if it succeeded and the DAX flag was changing call ext4_dax_dontcache().

Yes, I think it would be better to ensure the whole ioctl succeeds before
marking the inode dontcache.  The logic is also easier to follow that way.
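
Roughly what I have in mind, as a userspace model rather than the real ioctl
path.  struct fake_inode, fake_setflags() and the check_ok parameter are
invented for this sketch; check_ok stands in for the result of the validity
checks, and the DAX_ALWAYS/DAX_NEVER mount option shortcut is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define DAX_FL	0x01000000u	/* same value as EXT4_DAX_FL in the patch */

/* Hypothetical, much-reduced inode: just the bits this sketch needs. */
struct fake_inode {
	unsigned int i_flags;	/* on-disk style flags */
	bool is_dir;
	bool dontcache;		/* stands in for d_mark_dontcache() */
};

/*
 * Model of a setflags path.  Returns 0 on success, -1 if the checks
 * reject the request.
 */
static int fake_setflags(struct fake_inode *inode, unsigned int flags,
			 bool check_ok)
{
	bool dax_changing;

	assert(inode);

	/* Decide whether DAX changes while the old flags are still there. */
	dax_changing = ((inode->i_flags ^ flags) & DAX_FL) != 0;

	if (!check_ok)
		return -1;	/* nothing modified, nothing marked */

	inode->i_flags = flags;

	/* Only a successful change on a non-directory drops the cache. */
	if (dax_changing && !inode->is_dir)
		inode->dontcache = true;

	return 0;
}
```

A failed request leaves both i_flags and the dontcache state untouched, which
is the property asked for above.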

Ira

> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

* Re: [PATCH 8/9] fs/ext4: Introduce DAX inode flag
  2020-05-13 21:41     ` Ira Weiny
@ 2020-05-14  6:43       ` Jan Kara
  2020-05-14  6:55         ` Ira Weiny
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2020-05-14  6:43 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Jan Kara, linux-ext4, Andreas Dilger, Theodore Y. Ts'o,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Wed 13-05-20 14:41:55, Ira Weiny wrote:
> On Wed, May 13, 2020 at 04:47:06PM +0200, Jan Kara wrote:
> >
> > So I think you'll have to check
> > whether DAX flag is being changed,
> 
> ext4_dax_dontcache() does check if the flag is being changed.

Yes, but if you call it after inode flags change, you cannot determine that
just from flags and EXT4_I(inode)->i_flags. So that logic needs to change.
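
To make the ordering problem concrete, here is the XOR test reduced to plain
integers in userspace.  The helper names are made up for this example; only
the expression itself is taken from ext4_dax_dontcache().

```c
#include <assert.h>
#include <stdbool.h>

#define DAX_FL	0x01000000u	/* EXT4_DAX_FL from the patch */

/* The test from ext4_dax_dontcache(), on plain integers. */
static bool dax_flag_differs(unsigned int inode_flags, unsigned int new_flags)
{
	return ((inode_flags ^ new_flags) & DAX_FL) == DAX_FL;
}

/* Check first, then update: the change is visible. */
static bool check_before_update(unsigned int *inode_flags,
				unsigned int new_flags)
{
	bool changed;

	assert(inode_flags);
	changed = dax_flag_differs(*inode_flags, new_flags);
	*inode_flags = new_flags;
	return changed;
}

/* Update first, then check: both operands are equal, so never true. */
static bool check_after_update(unsigned int *inode_flags,
			       unsigned int new_flags)
{
	assert(inode_flags);
	*inode_flags = new_flags;
	return dax_flag_differs(*inode_flags, new_flags);
}
```

So once i_flags has been updated, the "is it changing" answer has to come
from a value saved before the update.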

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH 8/9] fs/ext4: Introduce DAX inode flag
  2020-05-14  6:43       ` Jan Kara
@ 2020-05-14  6:55         ` Ira Weiny
  0 siblings, 0 replies; 29+ messages in thread
From: Ira Weiny @ 2020-05-14  6:55 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Al Viro,
	Dan Williams, Dave Chinner, Christoph Hellwig, Jeff Moyer,
	Darrick J. Wong, linux-fsdevel, linux-kernel

On Thu, May 14, 2020 at 08:43:35AM +0200, Jan Kara wrote:
> On Wed 13-05-20 14:41:55, Ira Weiny wrote:
> > On Wed, May 13, 2020 at 04:47:06PM +0200, Jan Kara wrote:
> > >
> > > So I think you'll have to check
> > > whether DAX flag is being changed,
> > 
> > ext4_dax_dontcache() does check if the flag is being changed.
> 
> Yes, but if you call it after inode flags change, you cannot determine that
> just from flags and EXT4_I(inode)->i_flags. So that logic needs to change.

I just caught this email... just after sending V1.

I've moved where ext4_dax_dontcache() is called.  I think it is ok now with the
current check.

LMK if I've messed it up...  :-/

Ira

> 
> 								Honza
> 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

* Re: [PATCH 2/9] fs/ext4: Disallow verity if inode is DAX
  2020-05-13  5:43 ` [PATCH 2/9] fs/ext4: Disallow verity if inode is DAX ira.weiny
@ 2020-05-16  1:49   ` Eric Biggers
  2020-05-18  5:32     ` Ira Weiny
  0 siblings, 1 reply; 29+ messages in thread
From: Eric Biggers @ 2020-05-16  1:49 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Tue, May 12, 2020 at 10:43:17PM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> flag change is wrong without a corresponding address_space_operations
> update.
> 
> Make the 2 options mutually exclusive by returning an error if DAX was
> set first.
> 
> (Setting DAX is already disabled if Verity is set first.)
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes:
> 	remove WARN_ON_ONCE
> 	Add documentation for DAX/Verity exclusivity
> ---
>  Documentation/filesystems/ext4/verity.rst | 7 +++++++
>  fs/ext4/verity.c                          | 3 +++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/Documentation/filesystems/ext4/verity.rst b/Documentation/filesystems/ext4/verity.rst
> index 3e4c0ee0e068..51ab1aa17e59 100644
> --- a/Documentation/filesystems/ext4/verity.rst
> +++ b/Documentation/filesystems/ext4/verity.rst
> @@ -39,3 +39,10 @@ is encrypted as well as the data itself.
>  
>  Verity files cannot have blocks allocated past the end of the verity
>  metadata.
> +
> +Verity and DAX
> +--------------
> +
> +Verity and DAX are not compatible and attempts to set both of these flags on a
> +file will fail.
> +

If you build the documentation, this shows up as its own subsection
"2.13. Verity and DAX" alongside "2.12. Verity files", which looks odd.
I think you should delete this new subsection header so that this paragraph goes
in the existing "Verity files" subsection.

Also, Documentation/filesystems/fsverity.rst already mentions DAX (similar to
fscrypt.rst).  Is it intentional that you added this to the ext4-specific
documentation instead?

- Eric

* Re: [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX
  2020-05-13  5:43 ` [PATCH 3/9] fs/ext4: Disallow encryption " ira.weiny
@ 2020-05-16  2:02   ` Eric Biggers
  2020-05-18  5:03     ` Ira Weiny
  0 siblings, 1 reply; 29+ messages in thread
From: Eric Biggers @ 2020-05-16  2:02 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Tue, May 12, 2020 at 10:43:18PM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Encryption and DAX are incompatible.  Changing the DAX mode due to a
> change in Encryption mode is wrong without a corresponding
> address_space_operations update.
> 
> Make the 2 options mutually exclusive by returning an error if DAX was
> set first.
> 
> Furthermore, clarify the documentation of the exclusivity and how that
> will work.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes:
> 	remove WARN_ON_ONCE
> 	Add documentation to the encrypt doc WRT DAX
> ---
>  Documentation/filesystems/fscrypt.rst |  4 +++-
>  fs/ext4/super.c                       | 10 +---------
>  2 files changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
> index aa072112cfff..1475b8d52fef 100644
> --- a/Documentation/filesystems/fscrypt.rst
> +++ b/Documentation/filesystems/fscrypt.rst
> @@ -1038,7 +1038,9 @@ astute users may notice some differences in behavior:
>  - The ext4 filesystem does not support data journaling with encrypted
>    regular files.  It will fall back to ordered data mode instead.
>  
> -- DAX (Direct Access) is not supported on encrypted files.
> +- DAX (Direct Access) is not supported on encrypted files.  Attempts to enable
> +  DAX on an encrypted file will fail.  Mount options will _not_ enable DAX on
> +  encrypted files.
>  
>  - The st_size of an encrypted symlink will not necessarily give the
>    length of the symlink target as required by POSIX.  It will actually
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index bf5fcb477f66..9873ab27e3fa 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1320,7 +1320,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  	if (inode->i_ino == EXT4_ROOT_INO)
>  		return -EPERM;
>  
> -	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> +	if (IS_DAX(inode))
>  		return -EINVAL;
>  
>  	res = ext4_convert_inline_data(inode);
> @@ -1344,10 +1344,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
>  			ext4_clear_inode_state(inode,
>  					EXT4_STATE_MAY_INLINE_DATA);
> -			/*
> -			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> -			 * S_DAX may be disabled
> -			 */
>  			ext4_set_inode_flags(inode);
>  		}
>  		return res;
> @@ -1371,10 +1367,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  				    ctx, len, 0);
>  	if (!res) {
>  		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> -		/*
> -		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> -		 * S_DAX may be disabled
> -		 */
>  		ext4_set_inode_flags(inode);
>  		res = ext4_mark_inode_dirty(handle, inode);
>  		if (res)

I'm confused by the ext4_set_context() change.

ext4_set_context() is only called when FS_IOC_SET_ENCRYPTION_POLICY sets an
encryption policy on an empty directory, *or* when a new inode (regular, dir, or
symlink) is created in an encrypted directory (thus inheriting encryption from
its parent).

So when is it reachable when IS_DAX()?  Is the issue that the DAX flag can now
be set on directories?  The commit message doesn't seem to be talking about
directories.  Is the behavior we want that, on an (empty) directory with the
DAX flag set, FS_IOC_SET_ENCRYPTION_POLICY should fail with EINVAL?

I don't see why the i_size_read(inode) check is there though, so I think you're
at least right to remove that.

- Eric

* Re: [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX
  2020-05-16  2:02   ` Eric Biggers
@ 2020-05-18  5:03     ` Ira Weiny
  2020-05-18 16:24       ` Eric Biggers
  0 siblings, 1 reply; 29+ messages in thread
From: Ira Weiny @ 2020-05-18  5:03 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Fri, May 15, 2020 at 07:02:53PM -0700, Eric Biggers wrote:
> On Tue, May 12, 2020 at 10:43:18PM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Encryption and DAX are incompatible.  Changing the DAX mode due to a
> > change in Encryption mode is wrong without a corresponding
> > address_space_operations update.
> > 
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> > 
> > Furthermore, clarify the documentation of the exclusivity and how that
> > will work.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > ---
> > Changes:
> > 	remove WARN_ON_ONCE
> > 	Add documentation to the encrypt doc WRT DAX
> > ---
> >  Documentation/filesystems/fscrypt.rst |  4 +++-
> >  fs/ext4/super.c                       | 10 +---------
> >  2 files changed, 4 insertions(+), 10 deletions(-)
> > 
> > diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
> > index aa072112cfff..1475b8d52fef 100644
> > --- a/Documentation/filesystems/fscrypt.rst
> > +++ b/Documentation/filesystems/fscrypt.rst
> > @@ -1038,7 +1038,9 @@ astute users may notice some differences in behavior:
> >  - The ext4 filesystem does not support data journaling with encrypted
> >    regular files.  It will fall back to ordered data mode instead.
> >  
> > -- DAX (Direct Access) is not supported on encrypted files.
> > +- DAX (Direct Access) is not supported on encrypted files.  Attempts to enable
> > +  DAX on an encrypted file will fail.  Mount options will _not_ enable DAX on
> > +  encrypted files.
> >  
> >  - The st_size of an encrypted symlink will not necessarily give the
> >    length of the symlink target as required by POSIX.  It will actually
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index bf5fcb477f66..9873ab27e3fa 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1320,7 +1320,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> >  	if (inode->i_ino == EXT4_ROOT_INO)
> >  		return -EPERM;
> >  
> > -	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> > +	if (IS_DAX(inode))
> >  		return -EINVAL;
> >  
> >  	res = ext4_convert_inline_data(inode);
> > @@ -1344,10 +1344,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> >  			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> >  			ext4_clear_inode_state(inode,
> >  					EXT4_STATE_MAY_INLINE_DATA);
> > -			/*
> > -			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > -			 * S_DAX may be disabled
> > -			 */
> >  			ext4_set_inode_flags(inode);
> >  		}
> >  		return res;
> > @@ -1371,10 +1367,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> >  				    ctx, len, 0);
> >  	if (!res) {
> >  		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> > -		/*
> > -		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > -		 * S_DAX may be disabled
> > -		 */
> >  		ext4_set_inode_flags(inode);
> >  		res = ext4_mark_inode_dirty(handle, inode);
> >  		if (res)
> 
> I'm confused by the ext4_set_context() change.
> 
> ext4_set_context() is only called when FS_IOC_SET_ENCRYPTION_POLICY sets an
> encryption policy on an empty directory, *or* when a new inode (regular, dir, or
> symlink) is created in an encrypted directory (thus inheriting encryption from
> its parent).

I don't see the check which prevents FS_IOC_SET_ENCRYPTION_POLICY on a file?

On inode creation, encryption will always usurp S_DAX...

> 
> So when is it reachable when IS_DAX()?  Is the issue that the DAX flag can now
> be set on directories?  The commit message doesn't seem to be talking about
> directories.  Is the behavior we want that, on an (empty) directory with the
> DAX flag set, FS_IOC_SET_ENCRYPTION_POLICY should fail with EINVAL?

We would want that, but AFAIK S_DAX is never set on directories.  Perhaps this
is another place where S_DAX needs to be changed to the new inode flag?
However, this would not be appropriate at this point in the series.  At this
point in the series S_DAX is still set based on the mount option and I'm 99%
sure that only happens on regular files, not directories.  So I'm confused now.

This is, AFAICS, not going to affect correctness.  It will only be confusing
because the user will be able to set both DAX and encryption on the directory
but files there will only see encryption being used...  :-(

Assuming you are correct about this call path only being valid on directories,
it seems this IS_DAX() check needs to be changed to check for EXT4_DAX_FL in
"fs/ext4: Introduce DAX inode flag"?  Then at that point we can prevent DAX and
encryption on a directory...  and the IS_DAX() check could be removed at this
point in the series???
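
To spell out the difference between the two checks, a userspace model.
Everything here is hypothetical (struct fake_inode and the three helpers), and
it ignores the mount options and the other conditions that feed into S_DAX; it
only assumes that S_DAX is derived for regular files while the inode flag can
also sit on a directory.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define DAX_FL	0x01000000u	/* EXT4_DAX_FL from patch 8/9 */

struct fake_inode {
	bool is_reg;			/* S_ISREG() */
	unsigned int disk_flags;	/* inode flags, may carry DAX_FL */
	bool s_dax;			/* in-memory state, what IS_DAX() tests */
};

/* In this model S_DAX is only ever derived for regular files. */
static void fake_set_inode_flags(struct fake_inode *inode)
{
	assert(inode);
	inode->s_dax = inode->is_reg && (inode->disk_flags & DAX_FL);
}

/* Check as in this patch: looks at the in-memory state. */
static int set_context_check_s_dax(const struct fake_inode *inode)
{
	return inode->s_dax ? -EINVAL : 0;
}

/* Possible later variant: looks at the inode flag instead. */
static int set_context_check_dax_fl(const struct fake_inode *inode)
{
	return (inode->disk_flags & DAX_FL) ? -EINVAL : 0;
}
```

For a directory carrying the flag, the first check lets the encryption policy
through and only the second one rejects it.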

> 
> I don't see why the i_size_read(inode) check is there though, so I think you're
> at least right to remove that.

Agreed.
Ira

> 
> - Eric

* Re: [PATCH 2/9] fs/ext4: Disallow verity if inode is DAX
  2020-05-16  1:49   ` Eric Biggers
@ 2020-05-18  5:32     ` Ira Weiny
  0 siblings, 0 replies; 29+ messages in thread
From: Ira Weiny @ 2020-05-18  5:32 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Fri, May 15, 2020 at 06:49:16PM -0700, Eric Biggers wrote:
> On Tue, May 12, 2020 at 10:43:17PM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> > flag change is wrong without a corresponding address_space_operations
> > update.
> > 
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> > 
> > (Setting DAX is already disabled if Verity is set first.)
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > ---
> > Changes:
> > 	remove WARN_ON_ONCE
> > 	Add documentation for DAX/Verity exclusivity
> > ---
> >  Documentation/filesystems/ext4/verity.rst | 7 +++++++
> >  fs/ext4/verity.c                          | 3 +++
> >  2 files changed, 10 insertions(+)
> > 
> > diff --git a/Documentation/filesystems/ext4/verity.rst b/Documentation/filesystems/ext4/verity.rst
> > index 3e4c0ee0e068..51ab1aa17e59 100644
> > --- a/Documentation/filesystems/ext4/verity.rst
> > +++ b/Documentation/filesystems/ext4/verity.rst
> > @@ -39,3 +39,10 @@ is encrypted as well as the data itself.
> >  
> >  Verity files cannot have blocks allocated past the end of the verity
> >  metadata.
> > +
> > +Verity and DAX
> > +--------------
> > +
> > +Verity and DAX are not compatible and attempts to set both of these flags on a
> > +file will fail.
> > +
> 
> If you build the documentation, this shows up as its own subsection
> "2.13. Verity and DAX" alongside "2.12. Verity files", which looks odd.
> I think you should delete this new subsection header so that this paragraph goes
> in the existing "Verity files" subsection.

Ok...  I'll fix it up...

> 
> Also, Documentation/filesystems/fsverity.rst already mentions DAX (similar to
> fscrypt.rst).  Is it intentional that you added this to the ext4-specific
> documentation instead?

I proposed this text[1] and there were no objections...  I was looking at ext4
because only ext4 supports both verity and DAX.  I think having this in both the
ext4 docs and the verity docs helps.

Ira

[1] https://lore.kernel.org/lkml/20200415191451.GA2305801@iweiny-DESK2.sc.intel.com/

> 
> - Eric


* Re: [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX
  2020-05-18  5:03     ` Ira Weiny
@ 2020-05-18 16:24       ` Eric Biggers
  2020-05-18 19:23         ` Ira Weiny
  2020-05-20  2:02         ` Ira Weiny
  0 siblings, 2 replies; 29+ messages in thread
From: Eric Biggers @ 2020-05-18 16:24 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Sun, May 17, 2020 at 10:03:15PM -0700, Ira Weiny wrote:
> On Fri, May 15, 2020 at 07:02:53PM -0700, Eric Biggers wrote:
> > On Tue, May 12, 2020 at 10:43:18PM -0700, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > Encryption and DAX are incompatible.  Changing the DAX mode due to a
> > > change in Encryption mode is wrong without a corresponding
> > > address_space_operations update.
> > > 
> > > Make the 2 options mutually exclusive by returning an error if DAX was
> > > set first.
> > > 
> > > Furthermore, clarify the documentation of the exclusivity and how that
> > > will work.
> > > 
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > ---
> > > Changes:
> > > 	remove WARN_ON_ONCE
> > > 	Add documentation to the encrypt doc WRT DAX
> > > ---
> > >  Documentation/filesystems/fscrypt.rst |  4 +++-
> > >  fs/ext4/super.c                       | 10 +---------
> > >  2 files changed, 4 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
> > > index aa072112cfff..1475b8d52fef 100644
> > > --- a/Documentation/filesystems/fscrypt.rst
> > > +++ b/Documentation/filesystems/fscrypt.rst
> > > @@ -1038,7 +1038,9 @@ astute users may notice some differences in behavior:
> > >  - The ext4 filesystem does not support data journaling with encrypted
> > >    regular files.  It will fall back to ordered data mode instead.
> > >  
> > > -- DAX (Direct Access) is not supported on encrypted files.
> > > +- DAX (Direct Access) is not supported on encrypted files.  Attempts to enable
> > > +  DAX on an encrypted file will fail.  Mount options will _not_ enable DAX on
> > > +  encrypted files.
> > >  
> > >  - The st_size of an encrypted symlink will not necessarily give the
> > >    length of the symlink target as required by POSIX.  It will actually
> > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > > index bf5fcb477f66..9873ab27e3fa 100644
> > > --- a/fs/ext4/super.c
> > > +++ b/fs/ext4/super.c
> > > @@ -1320,7 +1320,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > >  	if (inode->i_ino == EXT4_ROOT_INO)
> > >  		return -EPERM;
> > >  
> > > -	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> > > +	if (IS_DAX(inode))
> > >  		return -EINVAL;
> > >  
> > >  	res = ext4_convert_inline_data(inode);
> > > @@ -1344,10 +1344,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > >  			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> > >  			ext4_clear_inode_state(inode,
> > >  					EXT4_STATE_MAY_INLINE_DATA);
> > > -			/*
> > > -			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > > -			 * S_DAX may be disabled
> > > -			 */
> > >  			ext4_set_inode_flags(inode);
> > >  		}
> > >  		return res;
> > > @@ -1371,10 +1367,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > >  				    ctx, len, 0);
> > >  	if (!res) {
> > >  		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> > > -		/*
> > > -		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > > -		 * S_DAX may be disabled
> > > -		 */
> > >  		ext4_set_inode_flags(inode);
> > >  		res = ext4_mark_inode_dirty(handle, inode);
> > >  		if (res)
> > 
> > I'm confused by the ext4_set_context() change.
> > 
> > ext4_set_context() is only called when FS_IOC_SET_ENCRYPTION_POLICY sets an
> > encryption policy on an empty directory, *or* when a new inode (regular, dir, or
> > symlink) is created in an encrypted directory (thus inheriting encryption from
> > its parent).
> 
> I don't see the check which prevents FS_IOC_SET_ENCRYPTION_POLICY on a file?

It's in fscrypt_ioctl_set_policy().

> 
> On inode creation, encryption will always usurp S_DAX...
> 
> > 
> > So when is it reachable when IS_DAX()?  Is the issue that the DAX flag can now
> > be set on directories?  The commit message doesn't seem to be talking about
> > directories.  Is the behavior we want that, on an (empty) directory with the
> > DAX flag set, FS_IOC_SET_ENCRYPTION_POLICY should fail with EINVAL?
> 
> We would want that but AFAIK S_DAX is never set on directories.  Perhaps this
> is another place where S_DAX needs to be changed to the new inode flag?
> However, this would not be appropriate at this point in the series.  At this
> point in the series S_DAX is still set based on the mount option and I'm 99%
> sure that only happens on regular files, not directories.  So I'm confused now.

S_DAX is only set by ext4_set_inode_flags() which only sets it on regular files.

> 
> This is, AFAICS, not going to affect correctness.  It will only be confusing
> because the user will be able to set both DAX and encryption on the directory
> but files there will only see encryption being used...  :-(
> 
> Assuming you are correct about this call path only being valid on directories,
> it seems this IS_DAX() needs to be changed to check for EXT4_DAX_FL in
> "fs/ext4: Introduce DAX inode flag"?  Then at that point we can prevent DAX and
> encryption on a directory.  ...  and IS_DAX() could be removed at
> this point in the series???

I haven't read the whole series, but if you are indeed trying to prevent a
directory with EXT4_DAX_FL from being encrypted, then it does look like you'd
need to check EXT4_DAX_FL, not S_DAX.

The other question is what should happen when a file is created in an encrypted
directory when the filesystem is mounted with -o dax.  Actually, I think I
missed something there.  Currently (based on reading the code) the DAX flag will
get set first, and then ext4_set_context() will see IS_DAX() && i_size == 0 and
clear the DAX flag when setting the encrypt flag.  So, the i_size == 0 check is
actually needed.  Your patch (AFAICS) just makes creating an encrypted file fail
when '-o dax'.  Is that intended?  If not, maybe you should change it to check
S_NEW instead of i_size == 0 to make it clearer?
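
To make the difference concrete, here is a minimal userspace sketch (not kernel
code).  The struct and the set_context_old()/set_context_patched() names are
hypothetical; they only model the IS_DAX()/i_size test discussed above, under
the assumption that a newly created file on a '-o dax' mount arrives with S_DAX
already set and a size of 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few inode fields the check looks at. */
struct model_inode {
	bool is_dax;      /* models IS_DAX(inode), i.e. S_DAX set */
	long long size;   /* models i_size_read(inode) */
	bool encrypted;   /* models the encrypt inode flag */
};

/* Check before the patch: only a non-empty DAX inode is rejected. */
static int set_context_old(struct model_inode *inode)
{
	assert(inode != NULL);
	if (inode->is_dax && inode->size)
		return -EINVAL;
	inode->encrypted = true;
	inode->is_dax = false;	/* flags are recomputed; encryption wins */
	return 0;
}

/* Check as patched: any DAX inode is rejected, including empty new ones. */
static int set_context_patched(struct model_inode *inode)
{
	assert(inode != NULL);
	if (inode->is_dax)
		return -EINVAL;
	inode->encrypted = true;
	return 0;
}
```

In this model an empty inode with is_dax set is accepted by the old check (and
ends up encrypted with DAX cleared), while the patched check returns -EINVAL,
which is the creation failure described above.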

- Eric


* Re: [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX
  2020-05-18 16:24       ` Eric Biggers
@ 2020-05-18 19:23         ` Ira Weiny
  2020-05-18 19:44           ` Eric Biggers
  2020-05-20  2:02         ` Ira Weiny
  1 sibling, 1 reply; 29+ messages in thread
From: Ira Weiny @ 2020-05-18 19:23 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Mon, May 18, 2020 at 09:24:47AM -0700, Eric Biggers wrote:
> On Sun, May 17, 2020 at 10:03:15PM -0700, Ira Weiny wrote:
> > On Fri, May 15, 2020 at 07:02:53PM -0700, Eric Biggers wrote:
> > > On Tue, May 12, 2020 at 10:43:18PM -0700, ira.weiny@intel.com wrote:
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > Encryption and DAX are incompatible.  Changing the DAX mode due to a
> > > > change in Encryption mode is wrong without a corresponding
> > > > address_space_operations update.
> > > > 
> > > > Make the 2 options mutually exclusive by returning an error if DAX was
> > > > set first.
> > > > 
> > > > Furthermore, clarify the documentation of the exclusivity and how that
> > > > will work.
> > > > 
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > ---
> > > > Changes:
> > > > 	remove WARN_ON_ONCE
> > > > 	Add documentation to the encrypt doc WRT DAX
> > > > ---
> > > >  Documentation/filesystems/fscrypt.rst |  4 +++-
> > > >  fs/ext4/super.c                       | 10 +---------
> > > >  2 files changed, 4 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
> > > > index aa072112cfff..1475b8d52fef 100644
> > > > --- a/Documentation/filesystems/fscrypt.rst
> > > > +++ b/Documentation/filesystems/fscrypt.rst
> > > > @@ -1038,7 +1038,9 @@ astute users may notice some differences in behavior:
> > > >  - The ext4 filesystem does not support data journaling with encrypted
> > > >    regular files.  It will fall back to ordered data mode instead.
> > > >  
> > > > -- DAX (Direct Access) is not supported on encrypted files.
> > > > +- DAX (Direct Access) is not supported on encrypted files.  Attempts to enable
> > > > +  DAX on an encrypted file will fail.  Mount options will _not_ enable DAX on
> > > > +  encrypted files.
> > > >  
> > > >  - The st_size of an encrypted symlink will not necessarily give the
> > > >    length of the symlink target as required by POSIX.  It will actually
> > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > > > index bf5fcb477f66..9873ab27e3fa 100644
> > > > --- a/fs/ext4/super.c
> > > > +++ b/fs/ext4/super.c
> > > > @@ -1320,7 +1320,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > > >  	if (inode->i_ino == EXT4_ROOT_INO)
> > > >  		return -EPERM;
> > > >  
> > > > -	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> > > > +	if (IS_DAX(inode))
> > > >  		return -EINVAL;
> > > >  
> > > >  	res = ext4_convert_inline_data(inode);
> > > > @@ -1344,10 +1344,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > > >  			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> > > >  			ext4_clear_inode_state(inode,
> > > >  					EXT4_STATE_MAY_INLINE_DATA);
> > > > -			/*
> > > > -			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > > > -			 * S_DAX may be disabled
> > > > -			 */
> > > >  			ext4_set_inode_flags(inode);
> > > >  		}
> > > >  		return res;
> > > > @@ -1371,10 +1367,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> > > >  				    ctx, len, 0);
> > > >  	if (!res) {
> > > >  		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> > > > -		/*
> > > > -		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > > > -		 * S_DAX may be disabled
> > > > -		 */
> > > >  		ext4_set_inode_flags(inode);
> > > >  		res = ext4_mark_inode_dirty(handle, inode);
> > > >  		if (res)
> > > 
> > > I'm confused by the ext4_set_context() change.
> > > 
> > > ext4_set_context() is only called when FS_IOC_SET_ENCRYPTION_POLICY sets an
> > > encryption policy on an empty directory, *or* when a new inode (regular, dir, or
> > > symlink) is created in an encrypted directory (thus inheriting encryption from
> > > its parent).
> > 
> > I don't see the check which prevents FS_IOC_SET_ENCRYPTION_POLICY on a file?
> 
> It's in fscrypt_ioctl_set_policy().

I see...

> 
> > 
> > On inode creation, encryption will always usurp S_DAX...
> > 
> > > 
> > > So when is it reachable when IS_DAX()?  Is the issue that the DAX flag can now
> > > be set on directories?  The commit message doesn't seem to be talking about
> > > directories.  Is the behavior we want that, on an (empty) directory with the
> > > DAX flag set, FS_IOC_SET_ENCRYPTION_POLICY should fail with EINVAL?
> > 
> > We would want that but AFAIK S_DAX is never set on directories.  Perhaps this
> > is another place where S_DAX needs to be changed to the new inode flag?
> > However, this would not be appropriate at this point in the series.  At this
> > point in the series S_DAX is still set based on the mount option and I'm 99%
> > sure that only happens on regular files, not directories.  So I'm confused now.
> 
> S_DAX is only set by ext4_set_inode_flags() which only sets it on regular files.

Exactly...

> 
> > 
> > This is, AFAICS, not going to affect correctness.  It will only be confusing
> > because the user will be able to set both DAX and encryption on the directory
> > but files there will only see encryption being used...  :-(
> > 
> > Assuming you are correct about this call path only being valid on directories,
> > it seems this IS_DAX() needs to be changed to check for EXT4_DAX_FL in
> > "fs/ext4: Introduce DAX inode flag"?  Then at that point we can prevent DAX and
> > encryption on a directory.  ...  and IS_DAX() could be removed at
> > this point in the series???
> 
> I haven't read the whole series, but if you are indeed trying to prevent a
> directory with EXT4_DAX_FL from being encrypted, then it does look like you'd
> need to check EXT4_DAX_FL, not S_DAX.

Yep.

> 
> The other question is what should happen when a file is created in an encrypted
> directory when the filesystem is mounted with -o dax.  Actually, I think I
> missed something there.  Currently (based on reading the code) the DAX flag will
> get set first, and then ext4_set_context()

See this is where I am confused.  Above you said that ext4_set_context() is only
called on a directory.  And I agree with you now having seen the check in
fscrypt_ioctl_set_policy().  So what is the call path you are speaking of here?

> will see IS_DAX() && i_size == 0 and
> clear the DAX flag when setting the encrypt flag.  So, the i_size == 0 check is
> actually needed.  Your patch (AFAICS) just makes creating an encrypted file fail
> when '-o dax'.  Is that intended?

Yes that is what I intended for this patch.  At this point in the series the
file system is either all DAX (-o dax) or not.  I did not comprehend the
directory vs regular file complexity with fscrypt.

It seems this patch should be removing the IS_DAX() check completely but I'm
still not sure if a regular file inode could be passed to ext4_set_context()
and I think we need to protect if it has IS_DAX() set if it does...

An alternate solution would be to drop this patch entirely and change the code
later in the series once EXT4_DAX_FL is defined...

But I'm not even clear where EXT4_ENCRYPT_FL is set...

Ira

> If not, maybe you should change it to check
> S_NEW instead of i_size == 0 to make it clearer?
> 
> - Eric


* Re: [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX
  2020-05-18 19:23         ` Ira Weiny
@ 2020-05-18 19:44           ` Eric Biggers
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Biggers @ 2020-05-18 19:44 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Mon, May 18, 2020 at 12:23:57PM -0700, Ira Weiny wrote:
> > 
> > The other question is what should happen when a file is created in an encrypted
> > directory when the filesystem is mounted with -o dax.  Actually, I think I
> > missed something there.  Currently (based on reading the code) the DAX flag will
> > get set first, and then ext4_set_context()
> 
> See this is where I am confused.  Above you said that ext4_set_context() is only
> called on a directory.  And I agree with you now having seen the check in
> fscrypt_ioctl_set_policy().  So what is the call path you are speaking of here?

Here's what I actually said:

	ext4_set_context() is only called when FS_IOC_SET_ENCRYPTION_POLICY sets
	an encryption policy on an empty directory, *or* when a new inode
	(regular, dir, or symlink) is created in an encrypted directory (thus
	inheriting encryption from its parent).

Just find the places where ->set_context() is called and follow them backwards.

- Eric


* Re: [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX
  2020-05-18 16:24       ` Eric Biggers
  2020-05-18 19:23         ` Ira Weiny
@ 2020-05-20  2:02         ` Ira Weiny
  2020-05-20 13:11           ` Jan Kara
  1 sibling, 1 reply; 29+ messages in thread
From: Ira Weiny @ 2020-05-20  2:02 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-ext4, Andreas Dilger, Theodore Y. Ts'o, Jan Kara,
	Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Mon, May 18, 2020 at 09:24:47AM -0700, Eric Biggers wrote:
> On Sun, May 17, 2020 at 10:03:15PM -0700, Ira Weiny wrote:

First off...  OMG...

I'm seeing some possible user pitfalls which are complicating things IMO.  It
probably does not matter because most users don't care and have either enabled
DAX on _every_ mount or _not_ enabled DAX on _every_ mount.  And have _not_
used verity nor encryption while using DAX.

Verity is a bit easier because verity is not inherited and we only need to
protect against setting it if DAX is on.

However, it can be weird for the user thusly:

1) mount _without_ DAX
2) enable verity on individual inodes
3) unmount/mount _with_ DAX

Now the verity files are not enabled for DAX without any indication...  <sigh>
This is still true with my patch.  But at least it closes the hole of trying to
change the DAX flag after the fact (because verity was set).

Also both this check and the verity check need to be maintained to keep the
mount option working as it was before...

For encryption it is more complicated because encryption can be set on
directories and inherited so the IS_DAX() check does nothing while '-o dax' is
used.  Therefore users can:

1) mount _with_ DAX
2) enable encryption on a directory
3) files created in that directory will not have DAX set

And I now understand why the WARN_ON() was there...  To tell users about this
craziness.
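
Both scenarios come down to how the effective DAX state of a file is derived.
A simplified userspace model, assuming the decision depends only on the mount
option and the file's own features (the real ext4 logic checks more
conditions); struct model_file and model_should_use_dax() are hypothetical
names for this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file state; not the kernel's struct inode. */
struct model_file {
	bool regular;    /* regular file (directories never get S_DAX) */
	bool encrypted;  /* encryption set or inherited from the parent */
	bool verity;     /* verity enabled on this file */
};

/* Returns true if the file would effectively run in DAX mode. */
static bool model_should_use_dax(const struct model_file *f, bool mount_dax)
{
	assert(f != NULL);
	if (!mount_dax)
		return false;
	if (!f->regular)
		return false;
	/* Incompatible features silently win over the mount option. */
	if (f->encrypted || f->verity)
		return false;
	return true;
}
```

With mount_dax true, a verity or encrypted file evaluates to false while a
plain regular file evaluates to true, and nothing reports the difference to
the user.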

...

> > This is, AFAICS, not going to affect correctness.  It will only be confusing
> > because the user will be able to set both DAX and encryption on the directory
> > but files there will only see encryption being used...  :-(
> > 
> > Assuming you are correct about this call path only being valid on directories,
> > it seems this IS_DAX() needs to be changed to check for EXT4_DAX_FL in
> > "fs/ext4: Introduce DAX inode flag"?  Then at that point we can prevent DAX and
> > encryption on a directory.  ...  and IS_DAX() could be removed at
> > this point in the series???
> 
> I haven't read the whole series, but if you are indeed trying to prevent a
> directory with EXT4_DAX_FL from being encrypted, then it does look like you'd
> need to check EXT4_DAX_FL, not S_DAX.
> 
> The other question is what should happen when a file is created in an encrypted
> directory when the filesystem is mounted with -o dax.  Actually, I think I
> missed something there.  Currently (based on reading the code) the DAX flag will
> get set first, and then ext4_set_context() will see IS_DAX() && i_size == 0 and
> clear the DAX flag when setting the encrypt flag.

I think you are correct.

>
> So, the i_size == 0 check is actually needed.
> Your patch (AFAICS) just makes creating an encrypted file fail
> when '-o dax'.  Is that intended?

Yes that is what I intended but it is more complicated I see now.

The intent is that IS_DAX() should _never_ be true on an encrypted or verity
file...  even if -o dax is specified.  Because IS_DAX() should be a result of
the inode flags being checked.  The order of the setting of those flags is a
bit odd for the encrypted case.  I don't really like that DAX is set then
un-set.  It is convoluted but I'm not clear right now how to fix it.

> If not, maybe you should change it to check
> S_NEW instead of i_size == 0 to make it clearer?

The patch is completely unnecessary.

It is much easier to make (EXT4_ENCRYPT_FL | EXT4_VERITY_FL) incompatible with
EXT4_DAX_FL when it is introduced later in the series.  Furthermore this mutual
exclusion can be done on directories in the encrypt case.  Which I think will
be nicer for the user if they get an error when trying to set one when the other
is set.
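
A rough userspace sketch of that mutual exclusion, assuming it is enforced at
the point where a flag is set.  The MODEL_* bit values, the helper names and
the choice of -EINVAL are placeholders for illustration, not the definitions
the series will use:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder flag bits for this sketch only. */
#define MODEL_ENCRYPT_FL 0x00000800u
#define MODEL_VERITY_FL  0x00100000u
#define MODEL_DAX_FL     0x02000000u

/* Reject any flag set that combines DAX with encryption or verity. */
static int model_check_flags(uint32_t flags)
{
	if ((flags & MODEL_DAX_FL) &&
	    (flags & (MODEL_ENCRYPT_FL | MODEL_VERITY_FL)))
		return -EINVAL;
	return 0;
}

/* Set a flag only if the resulting combination is allowed. */
static int model_set_flag(uint32_t *flags, uint32_t new_flag)
{
	int err;

	assert(flags != NULL);
	err = model_check_flags(*flags | new_flag);
	if (err)
		return err;
	*flags |= new_flag;
	return 0;
}
```

Because the check works on the flag word rather than on S_DAX, it applies to
directories as well as regular files, and whichever flag is set second is the
one that gets the error.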

Ira



* Re: [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX
  2020-05-20  2:02         ` Ira Weiny
@ 2020-05-20 13:11           ` Jan Kara
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Kara @ 2020-05-20 13:11 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Eric Biggers, linux-ext4, Andreas Dilger, Theodore Y. Ts'o,
	Jan Kara, Al Viro, Dan Williams, Dave Chinner, Christoph Hellwig,
	Jeff Moyer, Darrick J. Wong, linux-fsdevel, linux-kernel

On Tue 19-05-20 19:02:33, Ira Weiny wrote:
> On Mon, May 18, 2020 at 09:24:47AM -0700, Eric Biggers wrote:
> > On Sun, May 17, 2020 at 10:03:15PM -0700, Ira Weiny wrote:
> 
> First off...  OMG...
> 
> I'm seeing some possible user pitfalls which are complicating things IMO.  It
> probably does not matter because most users don't care and have either enabled
> DAX on _every_ mount or _not_ enabled DAX on _every_ mount, and have _not_
> used verity or encryption while using DAX.
> 
> Verity is a bit easier because verity is not inherited and we only need to
> protect against setting it if DAX is on.
> 
> However, it can be weird for the user thusly:
> 
> 1) mount _without_ DAX
> 2) enable verity on individual inodes
> 3) unmount/mount _with_ DAX
> 
> Now the verity files are not enabled for DAX without any indication...
> <sigh> This is still true with my patch.  But at least it closes the hole
> of trying to change the DAX flag after the fact (because verity was set).
> 
> Also both this check and the verity check need to be maintained to keep the
> mount option working as it was before...
> 
> For encryption it is more complicated because encryption can be set on
> directories and inherited so the IS_DAX() check does nothing while '-o
> dax' is used.  Therefore users can:
> 
> 1) mount _with_ DAX
> 2) enable encryption on a directory
> 3) files created in that directory will not have DAX set
> 
> And I now understand why the WARN_ON() was there...  To tell users about this
> craziness.

Thanks for digging into this! I agree that just not setting S_DAX where
other inode features disallow that is probably the best.

> > > This is, AFAICS, not going to affect correctness.  It will only be confusing
> > > because the user will be able to set both DAX and encryption on the directory
> > > but files there will only see encryption being used...  :-(
> > > 
> > > Assuming you are correct about this call path only being valid on directories,
> > > it seems this IS_DAX() needs to be changed to check for EXT4_DAX_FL in
> > > "fs/ext4: Introduce DAX inode flag"?  Then at that point we can prevent DAX and
> > > encryption on a directory.  ...  and IS_DAX() could be removed at
> > > this point in the series???
> > 
> > I haven't read the whole series, but if you are indeed trying to prevent a
> > directory with EXT4_DAX_FL from being encrypted, then it does look like you'd
> > need to check EXT4_DAX_FL, not S_DAX.
> > 
> > The other question is what should happen when a file is created in an encrypted
> > directory when the filesystem is mounted with -o dax.  Actually, I think I
> > missed something there.  Currently (based on reading the code) the DAX flag will
> > get set first, and then ext4_set_context() will see IS_DAX() && i_size == 0 and
> > clear the DAX flag when setting the encrypt flag.
> 
> I think you are correct.
> 
> >
> > So, the i_size == 0 check is actually needed.
> > Your patch (AFAICS) just makes creating an encrypted file fail
> > when '-o dax'.  Is that intended?
> 
> Yes that is what I intended but it is more complicated I see now.
> 
> The intent is that IS_DAX() should _never_ be true on an encrypted or verity
> file...  even if -o dax is specified.  Because IS_DAX() should be a result of
> the inode flags being checked.  The order of the setting of those flags is a
> bit odd for the encrypted case.  I don't really like that DAX is set then
> un-set.  It is convoluted but I'm not clear right now how to fix it.
> 
> > If not, maybe you should change it to check
> > S_NEW instead of i_size == 0 to make it clearer?
> 
> The patch is completely unnecessary.
> 
> It is much easier to make (EXT4_ENCRYPT_FL | EXT4_VERITY_FL) incompatible
> with EXT4_DAX_FL when it is introduced later in the series.  Furthermore
> this mutual exclusion can be done on directories in the encrypt case.
> Which I think will be nicer for the user if they get an error when trying
> to set one when the other is set.

Agreed.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


Thread overview: 29+ messages
2020-05-13  5:43 [PATCH 0/9] Enable ext4 support for per-file/directory DAX operations ira.weiny
2020-05-13  5:43 ` [PATCH 1/9] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
2020-05-13  5:43 ` [PATCH 2/9] fs/ext4: Disallow verity if inode is DAX ira.weiny
2020-05-16  1:49   ` Eric Biggers
2020-05-18  5:32     ` Ira Weiny
2020-05-13  5:43 ` [PATCH 3/9] fs/ext4: Disallow encryption " ira.weiny
2020-05-16  2:02   ` Eric Biggers
2020-05-18  5:03     ` Ira Weiny
2020-05-18 16:24       ` Eric Biggers
2020-05-18 19:23         ` Ira Weiny
2020-05-18 19:44           ` Eric Biggers
2020-05-20  2:02         ` Ira Weiny
2020-05-20 13:11           ` Jan Kara
2020-05-13  5:43 ` [PATCH 4/9] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS ira.weiny
2020-05-13 11:25   ` Jan Kara
2020-05-13  5:43 ` [PATCH 5/9] fs/ext4: Update ext4_should_use_dax() ira.weiny
2020-05-13 11:30   ` Jan Kara
2020-05-13  5:43 ` [PATCH 6/9] fs/ext4: Only change S_DAX on inode load ira.weiny
2020-05-13 11:33   ` Jan Kara
2020-05-13  5:43 ` [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state ira.weiny
2020-05-13 14:35   ` Jan Kara
2020-05-13 18:17     ` Darrick J. Wong
2020-05-13 19:53       ` Ira Weiny
2020-05-13  5:43 ` [PATCH 8/9] fs/ext4: Introduce DAX inode flag ira.weiny
2020-05-13 14:47   ` Jan Kara
2020-05-13 21:41     ` Ira Weiny
2020-05-14  6:43       ` Jan Kara
2020-05-14  6:55         ` Ira Weiny
2020-05-13  5:43 ` [PATCH 9/9] Documentation/dax: Update DAX enablement for ext4 ira.weiny
