linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations
@ 2020-04-14  4:00 ira.weiny
  2020-04-14  4:00 ` [PATCH RFC 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
                   ` (7 more replies)
  0 siblings, 8 replies; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, linux-ext4, linux-xfs,
	linux-fsdevel, Jeff Moyer

From: Ira Weiny <ira.weiny@intel.com>

Enable the same per file DAX support to ext4 as was done for xfs.  This series
builds and depends on the V7 series for xfs.[1]

To summarize:

 1. There exists an in-kernel access mode flag S_DAX that is set when
    file accesses go directly to persistent memory, bypassing the page
    cache.  Applications must call statx to discover the current S_DAX
    state (STATX_ATTR_DAX).

 2. There exists an advisory file inode flag FS_XFLAG_DAX that is
    inherited from the parent directory FS_XFLAG_DAX inode flag at file
    creation time.  This advisory flag can be set or cleared at any
    time, but doing so does not immediately affect the S_DAX state.

    Unless overridden by mount options (see (3)), if FS_XFLAG_DAX is set
    and the fs is on pmem then it will enable S_DAX at inode load time;
    if FS_XFLAG_DAX is not set, it will not enable S_DAX.

 3. There exists a dax= mount option.

    "-o dax=never"  means "never set S_DAX, ignore FS_XFLAG_DAX."

    "-o dax=always" means "always set S_DAX (at least on pmem),
                    and ignore FS_XFLAG_DAX."

    "-o dax"        is an alias for "dax=always".

    "-o dax=inode"  means "follow FS_XFLAG_DAX" and is the default.

 4. There exists an advisory directory inode flag FS_XFLAG_DAX that can
    be set or cleared at any time.  The flag state is inherited by any files or
    subdirectories when they are created within that directory.

 5. Programs that require a specific file access mode (DAX or not DAX)
    can do one of the following:

    (a) Create files in directories that the FS_XFLAG_DAX flag set as
        needed; or

    (b) Have the administrator set an override via mount option; or

    (c) Set or clear the file's FS_XFLAG_DAX flag as needed.  Programs
        must then cause the kernel to evict the inode from memory.  This
        can be done by:

        i>  Closing the file and re-opening the file and using statx to
            see if the fs has changed the S_DAX flag; and

        ii> If the file still does not have the desired S_DAX access
            mode, either unmount and remount the filesystem, or close
            the file and use drop_caches.

 6. It is expected that users who want to squeeze every last bit of performance
    out of the particular rough and tumble bits of their storage will also be
    exposed to the difficulties of what happens when the operating system can't
    totally virtualize those hardware capabilities.  DAX is such a feature.


[1] https://lore.kernel.org/lkml/20200407182958.568475-1-ira.weiny@intel.com/

To: linux-kernel@vger.kernel.org
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org
Cc: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org


Ira Weiny (8):
  fs/ext4: Narrow scope of DAX check in setflags
  fs/ext4: Disallow verity if inode is DAX
  fs/ext4: Disallow encryption if inode is DAX
  fs/ext4: Introduce DAX inode flag
  fs/ext4: Make DAX mount option a tri-state
  fs/ext4: Update ext4_should_use_dax()
  fs/ext4: Only change S_DAX on inode load
  Documentation/dax: Update DAX enablement for ext4

 Documentation/filesystems/dax.txt | 13 +------
 fs/ext4/ext4.h                    | 16 ++++++---
 fs/ext4/ialloc.c                  |  2 +-
 fs/ext4/inode.c                   | 22 ++++++++----
 fs/ext4/ioctl.c                   | 28 ++++++++++++---
 fs/ext4/super.c                   | 57 +++++++++++++++++++++++--------
 fs/ext4/verity.c                  |  5 ++-
 7 files changed, 99 insertions(+), 44 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC 1/8] fs/ext4: Narrow scope of DAX check in setflags
  2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
@ 2020-04-14  4:00 ` ira.weiny
  2020-04-15 11:45   ` Jan Kara
  2020-04-14  4:00 ` [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX ira.weiny
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

When preventing DAX and journaling on an inode.  Use the effective DAX
check rather than the mount option.

This will be required to support per inode DAX flags.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/ioctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index a0ec750018dd..ee3401a32e79 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -405,9 +405,9 @@ static int ext4_ioctl_setflags(struct inode *inode,
 	if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
 		/*
 		 * Changes to the journaling mode can cause unsafe changes to
-		 * S_DAX if we are using the DAX mount option.
+		 * S_DAX if the inode is DAX
 		 */
-		if (test_opt(inode->i_sb, DAX)) {
+		if (IS_DAX(inode)) {
 			err = -EBUSY;
 			goto flags_out;
 		}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
  2020-04-14  4:00 ` [PATCH RFC 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
@ 2020-04-14  4:00 ` ira.weiny
  2020-04-15 11:58   ` Jan Kara
  2020-04-15 12:00   ` Jan Kara
  2020-04-14  4:00 ` [PATCH RFC 3/8] fs/ext4: Disallow encryption " ira.weiny
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

Verity and DAX are incompatible.  Changing the DAX mode due to a verity
flag change is wrong without a corresponding address_space_operations
update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

(Setting DAX is already disabled if Verity is set first.)

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/verity.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index dc5ec724d889..ce3f9a198d3b 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
 	handle_t *handle;
 	int err;
 
+	if (WARN_ON_ONCE(IS_DAX(inode)))
+		return -EINVAL;
+
 	if (ext4_verity_in_progress(inode))
 		return -EBUSY;
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX
  2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
  2020-04-14  4:00 ` [PATCH RFC 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
  2020-04-14  4:00 ` [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX ira.weiny
@ 2020-04-14  4:00 ` ira.weiny
  2020-04-15 12:02   ` Jan Kara
  2020-04-15 16:03   ` Theodore Y. Ts'o
  2020-04-14  4:00 ` [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag ira.weiny
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

Encryption and DAX are incompatible.  Changing the DAX mode due to a
change in Encryption mode is wrong without a corresponding
address_space_operations update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/super.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 0c7c4adb664e..b14863058115 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1325,7 +1325,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 	if (inode->i_ino == EXT4_ROOT_INO)
 		return -EPERM;
 
-	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
+	if (WARN_ON_ONCE(IS_DAX(inode)))
 		return -EINVAL;
 
 	res = ext4_convert_inline_data(inode);
@@ -1349,10 +1349,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
 			ext4_clear_inode_state(inode,
 					EXT4_STATE_MAY_INLINE_DATA);
-			/*
-			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
-			 * S_DAX may be disabled
-			 */
 			ext4_set_inode_flags(inode);
 		}
 		return res;
@@ -1376,10 +1372,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 				    ctx, len, 0);
 	if (!res) {
 		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
-		/*
-		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
-		 * S_DAX may be disabled
-		 */
 		ext4_set_inode_flags(inode);
 		res = ext4_mark_inode_dirty(handle, inode);
 		if (res)
-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (2 preceding siblings ...)
  2020-04-14  4:00 ` [PATCH RFC 3/8] fs/ext4: Disallow encryption " ira.weiny
@ 2020-04-14  4:00 ` ira.weiny
  2020-04-15 12:08   ` Jan Kara
  2020-04-16 16:25   ` Darrick J. Wong
  2020-04-14  4:00 ` [PATCH RFC 5/8] fs/ext4: Make DAX mount option a tri-state ira.weiny
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.

Set the flag to be user visible and changeable.  Set the flag to be
inherited.  Allow applications to change the flag at any time.

Finally, on regular files, flag the inode to not be cached to facilitate
changing S_DAX on the next creation of the inode.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/ext4.h  | 13 +++++++++----
 fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 61b37a052052..434021fcec88 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -415,13 +415,16 @@ struct flex_groups {
 #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
 #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
 #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
+
+#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */
+
 #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
 #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
 #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
 #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
 
-#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
-#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
+#define EXT4_FL_USER_VISIBLE		0x70DBDFFF /* User visible flags */
+#define EXT4_FL_USER_MODIFIABLE		0x60CBC0FF /* User modifiable flags */
 
 /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
 #define EXT4_FL_XFLAG_VISIBLE		(EXT4_SYNC_FL | \
@@ -429,14 +432,16 @@ struct flex_groups {
 					 EXT4_APPEND_FL | \
 					 EXT4_NODUMP_FL | \
 					 EXT4_NOATIME_FL | \
-					 EXT4_PROJINHERIT_FL)
+					 EXT4_PROJINHERIT_FL | \
+					 EXT4_DAX_FL)
 
 /* Flags that should be inherited by new inodes from their parent. */
 #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
 			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
 			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
 			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
-			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
+			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
+			   EXT4_DAX_FL)
 
 /* Flags that are appropriate for regular files (all but dir-specific ones). */
 #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index ee3401a32e79..ca07d5086f03 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -539,12 +539,15 @@ static inline __u32 ext4_iflags_to_xflags(unsigned long iflags)
 		xflags |= FS_XFLAG_NOATIME;
 	if (iflags & EXT4_PROJINHERIT_FL)
 		xflags |= FS_XFLAG_PROJINHERIT;
+	if (iflags & EXT4_DAX_FL)
+		xflags |= FS_XFLAG_DAX;
 	return xflags;
 }
 
 #define EXT4_SUPPORTED_FS_XFLAGS (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | \
 				  FS_XFLAG_APPEND | FS_XFLAG_NODUMP | \
-				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT)
+				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT | \
+				  FS_XFLAG_DAX)
 
 /* Transfer xflags flags to internal */
 static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
@@ -563,6 +566,8 @@ static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
 		iflags |= EXT4_NOATIME_FL;
 	if (xflags & FS_XFLAG_PROJINHERIT)
 		iflags |= EXT4_PROJINHERIT_FL;
+	if (xflags & FS_XFLAG_DAX)
+		iflags |= EXT4_DAX_FL;
 
 	return iflags;
 }
@@ -813,6 +818,17 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
 	return error;
 }
 
+static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
+{
+	struct ext4_inode_info *ei = EXT4_I(inode);
+
+	if (S_ISDIR(inode->i_mode))
+		return;
+
+	if ((ei->i_flags ^ flags) == EXT4_DAX_FL)
+		inode->i_state |= I_DONTCACHE;
+}
+
 long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 {
 	struct inode *inode = file_inode(filp);
@@ -1273,6 +1289,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 			return err;
 
 		inode_lock(inode);
+
+		ext4_dax_dontcache(inode, flags);
+
 		ext4_fill_fsxattr(inode, &old_fa);
 		err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa);
 		if (err)
-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC 5/8] fs/ext4: Make DAX mount option a tri-state
  2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (3 preceding siblings ...)
  2020-04-14  4:00 ` [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag ira.weiny
@ 2020-04-14  4:00 ` ira.weiny
  2020-04-15 12:51   ` Jan Kara
  2020-04-14  4:00 ` [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax() ira.weiny
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

We add 'always', 'never', and 'inode' (default).  '-o dax' continue to
operate the same.

Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT_NODAX and set
it and EXT4_MOUNT_DAX appropriately.

We also force EXT4_MOUNT_NODAX if !CONFIG_FS_DAX.

https://lore.kernel.org/lkml/20200405061945.GA94792@iweiny-DESK2.sc.intel.com/

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/ext4.h  |  1 +
 fs/ext4/super.c | 43 +++++++++++++++++++++++++++++++++++++++----
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 434021fcec88..aadf33d5b37d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1174,6 +1174,7 @@ struct ext4_inode_info {
 						      blocks */
 #define EXT4_MOUNT2_HURD_COMPAT		0x00000004 /* Support HURD-castrated
 						      file systems */
+#define EXT4_MOUNT2_NODAX		0x00000008 /* Do not allow Direct Access */
 
 #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM	0x00000008 /* User explicitly
 						specified journal checksum */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b14863058115..0175fbfeb2ea 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1510,6 +1510,7 @@ enum {
 	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
 	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
 	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
+	Opt_dax_str,
 	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
 	Opt_nowarn_on_error, Opt_mblk_io_submit,
 	Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
@@ -1575,6 +1576,7 @@ static const match_table_t tokens = {
 	{Opt_barrier, "barrier"},
 	{Opt_nobarrier, "nobarrier"},
 	{Opt_i_version, "i_version"},
+	{Opt_dax_str, "dax=%s"},
 	{Opt_dax, "dax"},
 	{Opt_stripe, "stripe=%u"},
 	{Opt_delalloc, "delalloc"},
@@ -1772,6 +1774,7 @@ static const struct mount_opts {
 	{Opt_min_batch_time, 0, MOPT_GTE0},
 	{Opt_inode_readahead_blks, 0, MOPT_GTE0},
 	{Opt_init_itable, 0, MOPT_GTE0},
+	{Opt_dax_str, 0, MOPT_STRING},
 	{Opt_dax, EXT4_MOUNT_DAX, MOPT_SET},
 	{Opt_stripe, 0, MOPT_GTE0},
 	{Opt_resuid, 0, MOPT_GTE0},
@@ -2081,13 +2084,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
 		}
 		sbi->s_jquota_fmt = m->mount_opt;
 #endif
-	} else if (token == Opt_dax) {
+	} else if (token == Opt_dax || token == Opt_dax_str) {
 #ifdef CONFIG_FS_DAX
-		ext4_msg(sb, KERN_WARNING,
-		"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
-		sbi->s_mount_opt |= m->mount_opt;
+		char *tmp = match_strdup(&args[0]);
+
+		if (!tmp || !strcmp(tmp, "always")) {
+			ext4_msg(sb, KERN_WARNING,
+				"DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
+			sbi->s_mount_opt |= EXT4_MOUNT_DAX;
+			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_NODAX;
+		} else if (!strcmp(tmp, "never")) {
+			sbi->s_mount_opt2 |= EXT4_MOUNT2_NODAX;
+			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
+		} else if (!strcmp(tmp, "inode")) {
+			sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
+			sbi->s_mount_opt2 &= ~EXT4_MOUNT2_NODAX;
+		} else {
+			ext4_msg(sb, KERN_WARNING, "DAX invalid option.");
+			kfree(tmp);
+			return -1;
+		}
+
+		kfree(tmp);
 #else
 		ext4_msg(sb, KERN_INFO, "dax option not supported");
+		sbi->s_mount_opt2 |= EXT4_MOUNT2_NODAX;
+		sbi->s_mount_opt &= ~EXT4_MOUNT_DAX;
 		return -1;
 #endif
 	} else if (token == Opt_data_err_abort) {
@@ -2303,6 +2325,13 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
 	if (DUMMY_ENCRYPTION_ENABLED(sbi))
 		SEQ_OPTS_PUTS("test_dummy_encryption");
 
+	if (test_opt2(sb, NODAX))
+		SEQ_OPTS_PUTS("dax=never");
+	else if (test_opt(sb, DAX))
+		SEQ_OPTS_PUTS("dax=always");
+	else
+		SEQ_OPTS_PUTS("dax=inode");
+
 	ext4_show_quota_options(seq, sb);
 	return 0;
 }
@@ -5424,6 +5453,12 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 		sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
 	}
 
+	if ((sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_NODAX) {
+		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
+			"non-dax flag with busy inodes while remounting");
+		sbi->s_mount_opt2 ^= EXT4_MOUNT2_NODAX;
+	}
+
 	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
 		ext4_abort(sb, "Abort forced by user");
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax()
  2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (4 preceding siblings ...)
  2020-04-14  4:00 ` [PATCH RFC 5/8] fs/ext4: Make DAX mount option a tri-state ira.weiny
@ 2020-04-14  4:00 ` ira.weiny
  2020-04-15 13:58   ` Jan Kara
  2020-04-14  4:00 ` [PATCH RFC 7/8] fs/ext4: Only change S_DAX on inode load ira.weiny
  2020-04-14  4:00 ` [PATCH RFC 8/8] Documentation/dax: Update DAX enablement for ext4 ira.weiny
  7 siblings, 1 reply; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

Change the logic of ext4_should_use_dax() to support using the inode dax
flag OR the overriding tri-state mount option.

While we are at it change the function to ext4_enable_dax() as this
better reflects the ask.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/inode.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index fa0ff78dc033..e9d582e516bc 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4383,9 +4383,11 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 		!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
 }
 
-static bool ext4_should_use_dax(struct inode *inode)
+static bool ext4_enable_dax(struct inode *inode)
 {
-	if (!test_opt(inode->i_sb, DAX))
+	unsigned int flags = EXT4_I(inode)->i_flags;
+
+	if (test_opt2(inode->i_sb, NODAX))
 		return false;
 	if (!S_ISREG(inode->i_mode))
 		return false;
@@ -4397,7 +4399,13 @@ static bool ext4_should_use_dax(struct inode *inode)
 		return false;
 	if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
 		return false;
-	return true;
+	if (!bdev_dax_supported(inode->i_sb->s_bdev,
+				inode->i_sb->s_blocksize))
+		return false;
+	if (test_opt(inode->i_sb, DAX))
+		return true;
+
+	return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;
 }
 
 void ext4_set_inode_flags(struct inode *inode)
@@ -4415,7 +4423,7 @@ void ext4_set_inode_flags(struct inode *inode)
 		new_fl |= S_NOATIME;
 	if (flags & EXT4_DIRSYNC_FL)
 		new_fl |= S_DIRSYNC;
-	if (ext4_should_use_dax(inode))
+	if (ext4_enable_dax(inode))
 		new_fl |= S_DAX;
 	if (flags & EXT4_ENCRYPT_FL)
 		new_fl |= S_ENCRYPTED;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC 7/8] fs/ext4: Only change S_DAX on inode load
  2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (5 preceding siblings ...)
  2020-04-14  4:00 ` [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax() ira.weiny
@ 2020-04-14  4:00 ` ira.weiny
  2020-04-15 14:03   ` Jan Kara
  2020-04-14  4:00 ` [PATCH RFC 8/8] Documentation/dax: Update DAX enablement for ext4 ira.weiny
  7 siblings, 1 reply; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

To prevent complications with in memory inodes we only set S_DAX on
inode load.  FS_XFLAG_DAX can be changed at any time and S_DAX will
change after inode eviction and reload.

Add init bool to ext4_set_inode_flags() to indicate if the inode is
being newly initialized.

Assert that S_DAX is not set on an inode which is just being loaded.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/ext4/ext4.h   | 2 +-
 fs/ext4/ialloc.c | 2 +-
 fs/ext4/inode.c  | 8 +++++---
 fs/ext4/ioctl.c  | 3 ++-
 fs/ext4/super.c  | 4 ++--
 fs/ext4/verity.c | 2 +-
 6 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index aadf33d5b37d..7f9234cce5b8 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2699,7 +2699,7 @@ extern int ext4_can_truncate(struct inode *inode);
 extern int ext4_truncate(struct inode *);
 extern int ext4_break_layouts(struct inode *);
 extern int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length);
-extern void ext4_set_inode_flags(struct inode *);
+extern void ext4_set_inode_flags(struct inode *, bool init);
 extern int ext4_alloc_da_blocks(struct inode *inode);
 extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index f95ee99091e4..1519a4bf42fa 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1104,7 +1104,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
 	ei->i_block_group = group;
 	ei->i_last_alloc_group = ~0;
 
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, true);
 	if (IS_DIRSYNC(inode))
 		ext4_handle_sync(handle);
 	if (insert_inode_locked(inode) < 0) {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index e9d582e516bc..e9dfdfd6c698 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4408,11 +4408,13 @@ static bool ext4_enable_dax(struct inode *inode)
 	return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;
 }
 
-void ext4_set_inode_flags(struct inode *inode)
+void ext4_set_inode_flags(struct inode *inode, bool init)
 {
 	unsigned int flags = EXT4_I(inode)->i_flags;
 	unsigned int new_fl = 0;
 
+	J_ASSERT(!(IS_DAX(inode) && init));
+
 	if (flags & EXT4_SYNC_FL)
 		new_fl |= S_SYNC;
 	if (flags & EXT4_APPEND_FL)
@@ -4423,7 +4425,7 @@ void ext4_set_inode_flags(struct inode *inode)
 		new_fl |= S_NOATIME;
 	if (flags & EXT4_DIRSYNC_FL)
 		new_fl |= S_DIRSYNC;
-	if (ext4_enable_dax(inode))
+	if (init && ext4_enable_dax(inode))
 		new_fl |= S_DAX;
 	if (flags & EXT4_ENCRYPT_FL)
 		new_fl |= S_ENCRYPTED;
@@ -4639,7 +4641,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 		 * not initialized on a new filesystem. */
 	}
 	ei->i_flags = le32_to_cpu(raw_inode->i_flags);
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, true);
 	inode->i_blocks = ext4_inode_blocks(raw_inode, ei);
 	ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo);
 	if (ext4_has_feature_64bit(sb))
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index ca07d5086f03..efe41dcc8807 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -393,7 +393,8 @@ static int ext4_ioctl_setflags(struct inode *inode,
 			ext4_clear_inode_flag(inode, i);
 	}
 
-	ext4_set_inode_flags(inode);
+	ext4_set_inode_flags(inode, false);
+
 	inode->i_ctime = current_time(inode);
 
 	err = ext4_mark_iloc_dirty(handle, inode, &iloc);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 0175fbfeb2ea..43cc52ea7918 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1349,7 +1349,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
 			ext4_clear_inode_state(inode,
 					EXT4_STATE_MAY_INLINE_DATA);
-			ext4_set_inode_flags(inode);
+			ext4_set_inode_flags(inode, false);
 		}
 		return res;
 	}
@@ -1372,7 +1372,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 				    ctx, len, 0);
 	if (!res) {
 		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
-		ext4_set_inode_flags(inode);
+		ext4_set_inode_flags(inode, false);
 		res = ext4_mark_inode_dirty(handle, inode);
 		if (res)
 			EXT4_ERROR_INODE(inode, "Failed to mark inode dirty");
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index ce3f9a198d3b..4f7d780ff6da 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -244,7 +244,7 @@ static int ext4_end_enable_verity(struct file *filp, const void *desc,
 		if (err)
 			goto out_stop;
 		ext4_set_inode_flag(inode, EXT4_INODE_VERITY);
-		ext4_set_inode_flags(inode);
+		ext4_set_inode_flags(inode, false);
 		err = ext4_mark_iloc_dirty(handle, inode, &iloc);
 	}
 out_stop:
-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC 8/8] Documentation/dax: Update DAX enablement for ext4
  2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
                   ` (6 preceding siblings ...)
  2020-04-14  4:00 ` [PATCH RFC 7/8] fs/ext4: Only change S_DAX on inode load ira.weiny
@ 2020-04-14  4:00 ` ira.weiny
  7 siblings, 0 replies; 42+ messages in thread
From: ira.weiny @ 2020-04-14  4:00 UTC (permalink / raw)
  To: linux-kernel, Jan Kara
  Cc: Ira Weiny, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

Update the document to reflect ext4 and xfs now behave the same.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 Documentation/filesystems/dax.txt | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/Documentation/filesystems/dax.txt b/Documentation/filesystems/dax.txt
index af14c1b330a9..5f28b22d2815 100644
--- a/Documentation/filesystems/dax.txt
+++ b/Documentation/filesystems/dax.txt
@@ -22,18 +22,7 @@ on it as usual.  The DAX code currently only supports files with a block
 size equal to your kernel's PAGE_SIZE, so you may need to specify a block
 size when creating the file system.
 
-Currently 2 filesystems support DAX, ext4 and xfs.  Enabling DAX on them is
-different at this time.
-
-Enabling DAX on ext4
---------------------
-
-When mounting the filesystem, use the "-o dax" option on the command line or
-add 'dax' to the options in /etc/fstab.
-
-
-Enabling DAX on xfs
--------------------
+Currently 2 filesystems support DAX, ext4 and xfs.
 
 Summary
 -------
-- 
2.25.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 1/8] fs/ext4: Narrow scope of DAX check in setflags
  2020-04-14  4:00 ` [PATCH RFC 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
@ 2020-04-15 11:45   ` Jan Kara
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kara @ 2020-04-15 11:45 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Mon 13-04-20 21:00:23, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> When preventing DAX and journaling on an inode.  Use the effective DAX
> check rather than the mount option.
> 
> This will be required to support per inode DAX flags.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/ioctl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index a0ec750018dd..ee3401a32e79 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -405,9 +405,9 @@ static int ext4_ioctl_setflags(struct inode *inode,
>  	if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
>  		/*
>  		 * Changes to the journaling mode can cause unsafe changes to
> -		 * S_DAX if we are using the DAX mount option.
> +		 * S_DAX if the inode is DAX
>  		 */
> -		if (test_opt(inode->i_sb, DAX)) {
> +		if (IS_DAX(inode)) {
>  			err = -EBUSY;
>  			goto flags_out;
>  		}
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-14  4:00 ` [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX ira.weiny
@ 2020-04-15 11:58   ` Jan Kara
  2020-04-15 12:00   ` Jan Kara
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Kara @ 2020-04-15 11:58 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Mon 13-04-20 21:00:24, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> flag change is wrong without a corresponding address_space_operations
> update.
> 
> Make the 2 options mutually exclusive by returning an error if DAX was
> set first.
> 
> (Setting DAX is already disabled if Verity is set first.)
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/verity.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> index dc5ec724d889..ce3f9a198d3b 100644
> --- a/fs/ext4/verity.c
> +++ b/fs/ext4/verity.c
> @@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
>  	handle_t *handle;
>  	int err;
>  
> +	if (WARN_ON_ONCE(IS_DAX(inode)))
> +		return -EINVAL;
> +
>  	if (ext4_verity_in_progress(inode))
>  		return -EBUSY;
>  
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-14  4:00 ` [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX ira.weiny
  2020-04-15 11:58   ` Jan Kara
@ 2020-04-15 12:00   ` Jan Kara
  2020-04-15 15:55     ` Theodore Y. Ts'o
                       ` (2 more replies)
  1 sibling, 3 replies; 42+ messages in thread
From: Jan Kara @ 2020-04-15 12:00 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Mon 13-04-20 21:00:24, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> flag change is wrong without a corresponding address_space_operations
> update.
> 
> Make the 2 options mutually exclusive by returning an error if DAX was
> set first.
> 
> (Setting DAX is already disabled if Verity is set first.)
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/ext4/verity.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> index dc5ec724d889..ce3f9a198d3b 100644
> --- a/fs/ext4/verity.c
> +++ b/fs/ext4/verity.c
> @@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
>  	handle_t *handle;
>  	int err;
>  
> +	if (WARN_ON_ONCE(IS_DAX(inode)))
> +		return -EINVAL;
> +

Hum, one question, is there a reason for WARN_ON_ONCE()? If I understand
correctly, user could normally trigger this, couldn't he?

								Honza

>  	if (ext4_verity_in_progress(inode))
>  		return -EBUSY;
>  
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX
  2020-04-14  4:00 ` [PATCH RFC 3/8] fs/ext4: Disallow encryption " ira.weiny
@ 2020-04-15 12:02   ` Jan Kara
  2020-04-15 20:35     ` Ira Weiny
  2020-04-15 16:03   ` Theodore Y. Ts'o
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Kara @ 2020-04-15 12:02 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Mon 13-04-20 21:00:25, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Encryption and DAX are incompatible.  Changing the DAX mode due to a
> change in Encryption mode is wrong without a corresponding
> address_space_operations update.
> 
> Make the 2 options mutually exclusive by returning an error if DAX was
> set first.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/ext4/super.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 0c7c4adb664e..b14863058115 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1325,7 +1325,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  	if (inode->i_ino == EXT4_ROOT_INO)
>  		return -EPERM;
>  
> -	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> +	if (WARN_ON_ONCE(IS_DAX(inode)))

Also here I don't think WARN_ON_ONCE() is warranted once we allow per-inode
setting of DAX. It will then become a regular error condition...

								Honza

>  		return -EINVAL;
>  
>  	res = ext4_convert_inline_data(inode);
> @@ -1349,10 +1349,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
>  			ext4_clear_inode_state(inode,
>  					EXT4_STATE_MAY_INLINE_DATA);
> -			/*
> -			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> -			 * S_DAX may be disabled
> -			 */
>  			ext4_set_inode_flags(inode);
>  		}
>  		return res;
> @@ -1376,10 +1372,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  				    ctx, len, 0);
>  	if (!res) {
>  		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> -		/*
> -		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> -		 * S_DAX may be disabled
> -		 */
>  		ext4_set_inode_flags(inode);
>  		res = ext4_mark_inode_dirty(handle, inode);
>  		if (res)
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-14  4:00 ` [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag ira.weiny
@ 2020-04-15 12:08   ` Jan Kara
  2020-04-15 20:39     ` Ira Weiny
  2020-04-16 16:25   ` Darrick J. Wong
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Kara @ 2020-04-15 12:08 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Mon 13-04-20 21:00:26, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> 
> Set the flag to be user visible and changeable.  Set the flag to be
> inherited.  Allow applications to change the flag at any time.
> 
> Finally, on regular files, flag the inode to not be cached to facilitate
> changing S_DAX on the next creation of the inode.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/ext4/ext4.h  | 13 +++++++++----
>  fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
>  2 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 61b37a052052..434021fcec88 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -415,13 +415,16 @@ struct flex_groups {
>  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
>  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
>  #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> +
> +#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */
> +

You seem to be using somewhat older kernel... EXT4_EOFBLOCKS_FL doesn't
exist anymore (but still it's good to leave it reserved for some time so
the value you've chosen is OK).

> @@ -813,6 +818,17 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
>  	return error;
>  }
>  
> +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> +{
> +	struct ext4_inode_info *ei = EXT4_I(inode);
> +
> +	if (S_ISDIR(inode->i_mode))
> +		return;
> +
> +	if ((ei->i_flags ^ flags) == EXT4_DAX_FL)
> +		inode->i_state |= I_DONTCACHE;
> +}
> +

You probably want to use the function you've introduced in the XFS series
here...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 5/8] fs/ext4: Make DAX mount option a tri-state
  2020-04-14  4:00 ` [PATCH RFC 5/8] fs/ext4: Make DAX mount option a tri-state ira.weiny
@ 2020-04-15 12:51   ` Jan Kara
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kara @ 2020-04-15 12:51 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Mon 13-04-20 21:00:27, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> We add 'always', 'never', and 'inode' (default).  '-o dax' continue to
> operate the same.
> 
> Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT_NODAX and set
> it and EXT4_MOUNT_DAX appropriately.
> 
> We also force EXT4_MOUNT_NODAX if !CONFIG_FS_DAX.
> 
> https://lore.kernel.org/lkml/20200405061945.GA94792@iweiny-DESK2.sc.intel.com/
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

...

> @@ -2303,6 +2325,13 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
>  	if (DUMMY_ENCRYPTION_ENABLED(sbi))
>  		SEQ_OPTS_PUTS("test_dummy_encryption");
>  
> +	if (test_opt2(sb, NODAX))
> +		SEQ_OPTS_PUTS("dax=never");
> +	else if (test_opt(sb, DAX))
> +		SEQ_OPTS_PUTS("dax=always");
> +	else
> +		SEQ_OPTS_PUTS("dax=inode");
> +

We try to show only mount options that were explicitely set by the user, or
that are different from defaults - e.g., see how 'data=' mount option
printing is handled.

>  	ext4_show_quota_options(seq, sb);
>  	return 0;
>  }
> @@ -5424,6 +5453,12 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
>  		sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
>  	}
>  
> +	if ((sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & EXT4_MOUNT2_NODAX) {
> +		ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
> +			"non-dax flag with busy inodes while remounting");
> +		sbi->s_mount_opt2 ^= EXT4_MOUNT2_NODAX;
> +	}
> +
>  	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
>  		ext4_abort(sb, "Abort forced by user");

I'd just merge this with the check whether EXT4_MOUNT_DAX changed.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax()
  2020-04-14  4:00 ` [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax() ira.weiny
@ 2020-04-15 13:58   ` Jan Kara
  2020-04-17 17:16     ` Ira Weiny
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2020-04-15 13:58 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Mon 13-04-20 21:00:28, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Change the logic of ext4_should_use_dax() to support using the inode dax
> flag OR the overriding tri-state mount option.
> 
> While we are at it change the function to ext4_enable_dax() as this
> better reflects the ask.

I disagree with the renaming. ext4_enable_dax() suggests it enables
something. It does not. I'd either leave ext4_should_use_dax() or maybe
change it to ext4_should_enable_dax() if you really like the "enable" word
:).

> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/ext4/inode.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index fa0ff78dc033..e9d582e516bc 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4383,9 +4383,11 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
>  		!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
>  }
>  
> -static bool ext4_should_use_dax(struct inode *inode)
> +static bool ext4_enable_dax(struct inode *inode)
>  {
> -	if (!test_opt(inode->i_sb, DAX))
> +	unsigned int flags = EXT4_I(inode)->i_flags;
> +
> +	if (test_opt2(inode->i_sb, NODAX))
>  		return false;
>  	if (!S_ISREG(inode->i_mode))
>  		return false;
> @@ -4397,7 +4399,13 @@ static bool ext4_should_use_dax(struct inode *inode)
>  		return false;
>  	if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
>  		return false;
> -	return true;
> +	if (!bdev_dax_supported(inode->i_sb->s_bdev,
> +				inode->i_sb->s_blocksize))
> +		return false;
> +	if (test_opt(inode->i_sb, DAX))
> +		return true;
> +
> +	return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;

flags & EXT4_DAX_FL is enough here, isn't it?

								Honza

>  }
>  
>  void ext4_set_inode_flags(struct inode *inode)
> @@ -4415,7 +4423,7 @@ void ext4_set_inode_flags(struct inode *inode)
>  		new_fl |= S_NOATIME;
>  	if (flags & EXT4_DIRSYNC_FL)
>  		new_fl |= S_DIRSYNC;
> -	if (ext4_should_use_dax(inode))
> +	if (ext4_enable_dax(inode))
>  		new_fl |= S_DAX;
>  	if (flags & EXT4_ENCRYPT_FL)
>  		new_fl |= S_ENCRYPTED;
> -- 
> 2.25.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 7/8] fs/ext4: Only change S_DAX on inode load
  2020-04-14  4:00 ` [PATCH RFC 7/8] fs/ext4: Only change S_DAX on inode load ira.weiny
@ 2020-04-15 14:03   ` Jan Kara
  2020-04-17 17:18     ` Ira Weiny
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kara @ 2020-04-15 14:03 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Mon 13-04-20 21:00:29, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> To prevent complications with in memory inodes we only set S_DAX on
> inode load.  FS_XFLAG_DAX can be changed at any time and S_DAX will
> change after inode eviction and reload.
> 
> Add init bool to ext4_set_inode_flags() to indicate if the inode is
> being newly initialized.
> 
> Assert that S_DAX is not set on an inode which is just being loaded.

> @@ -4408,11 +4408,13 @@ static bool ext4_enable_dax(struct inode *inode)
>  	return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;
>  }
>  
> -void ext4_set_inode_flags(struct inode *inode)
> +void ext4_set_inode_flags(struct inode *inode, bool init)
>  {
>  	unsigned int flags = EXT4_I(inode)->i_flags;
>  	unsigned int new_fl = 0;
>  
> +	J_ASSERT(!(IS_DAX(inode) && init));
> +

WARN_ON or BUG_ON here? J_ASSERT is for journalling assertions...

Otherwise the patch looks good.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-15 12:00   ` Jan Kara
@ 2020-04-15 15:55     ` Theodore Y. Ts'o
  2020-04-15 19:21       ` Ira Weiny
  2020-04-15 19:14     ` Ira Weiny
  2020-04-15 20:34     ` Ira Weiny
  2 siblings, 1 reply; 42+ messages in thread
From: Theodore Y. Ts'o @ 2020-04-15 15:55 UTC (permalink / raw)
  To: Jan Kara
  Cc: ira.weiny, linux-kernel, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 02:00:02PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:24, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> > flag change is wrong without a corresponding address_space_operations
> > update.
> > 
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> > 
> > (Setting DAX is already disabled if Verity is set first.)
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  fs/ext4/verity.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> > index dc5ec724d889..ce3f9a198d3b 100644
> > --- a/fs/ext4/verity.c
> > +++ b/fs/ext4/verity.c
> > @@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
> >  	handle_t *handle;
> >  	int err;
> >  
> > +	if (WARN_ON_ONCE(IS_DAX(inode)))
> > +		return -EINVAL;
> > +
> 
> Hum, one question, is there a reason for WARN_ON_ONCE()? If I understand
> correctly, user could normally trigger this, couldn't he?

Tes, the WARN_ON_ONCE isn't appropriate here.  We should also disallow
setting the DAX flag if the inode has the verity flag set already.

And if we need to decide what to if the file system is mounted with
"-o dax=always" and the verity file system feature is enabled.  We
could either (a) reject the mount with if the mount option is given
and the file system can have verity files, or (b) make "-o dax=always"
mean "-o dax=mostly_always" and treat verity files as not using dax
even when dax=always is selected.

Also, in theory, we *could* support dax and verity files, but
verifying the crypto checksums of all of the pages when the file is
first accessed, and then marking that in a flag in the in-inode flag.
Or we could have a per-page flag stored somewhere that indicates that
the page has been verified, so that we can on-demand verify the
integrity of the page.  Given that verity files are read-only, the
main reason why someone might want to use dax && verity would be to
reduce page cache overhead of system files; if executing out of dax
pages doesn't have significant performance impacts, this might be
something which might be a nice-to-have.  I don't think we need to
worry about this for now; if there are use cases where mobile devices
want to use dax && verity, we can let them figure out how to make it
work.  I'm just pointing out that it's not really a completely insane
combination.

Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX
  2020-04-14  4:00 ` [PATCH RFC 3/8] fs/ext4: Disallow encryption " ira.weiny
  2020-04-15 12:02   ` Jan Kara
@ 2020-04-15 16:03   ` Theodore Y. Ts'o
  2020-04-15 17:27     ` Dan Williams
  2020-04-15 19:54     ` Ira Weiny
  1 sibling, 2 replies; 42+ messages in thread
From: Theodore Y. Ts'o @ 2020-04-15 16:03 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Mon, Apr 13, 2020 at 09:00:25PM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Encryption and DAX are incompatible.  Changing the DAX mode due to a
> change in Encryption mode is wrong without a corresponding
> address_space_operations update.
> 
> Make the 2 options mutually exclusive by returning an error if DAX was
> set first.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

The encryption flag is inherited from the containing directory, and
directories can't have the DAX flag set, so anything we do in
ext4_set_context() will be safety belt / sanity checking in nature.

But we *do* need to figure out what we do with mount -o dax=always
when the file system might have encrypted files.  My previous comments
about the verity flag and dax flag applies here.

Also note that encrypted files are read/write so we must never allow
the combination of ENCRPYT_FL and DAX_FL.  So that may be something
where we should teach __ext4_iget() to check for this, and declare the
file system as corrupted if it sees this combination.  (For VERITY_FL
&& DAX_FL that is a combo that we might want to support in the future,
so that's probably a case where arguably, we should just ignore the
DAX_FL for now.)

					- Ted

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX
  2020-04-15 16:03   ` Theodore Y. Ts'o
@ 2020-04-15 17:27     ` Dan Williams
  2020-04-15 19:54     ` Ira Weiny
  1 sibling, 0 replies; 42+ messages in thread
From: Dan Williams @ 2020-04-15 17:27 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Weiny, Ira, Linux Kernel Mailing List, Jan Kara, Darrick J. Wong,
	Dave Chinner, Christoph Hellwig, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 9:03 AM Theodore Y. Ts'o <tytso@mit.edu> wrote:
>
> On Mon, Apr 13, 2020 at 09:00:25PM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> >
> > Encryption and DAX are incompatible.  Changing the DAX mode due to a
> > change in Encryption mode is wrong without a corresponding
> > address_space_operations update.
> >
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> >
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>
> The encryption flag is inherited from the containing directory, and
> directories can't have the DAX flag set, so anything we do in
> ext4_set_context() will be safety belt / sanity checking in nature.
>
> But we *do* need to figure out what we do with mount -o dax=always
> when the file system might have encrypted files.  My previous comments
> about the verity flag and dax flag applies here.
>
> Also note that encrypted files are read/write so we must never allow
> the combination of ENCRPYT_FL and DAX_FL.  So that may be something
> where we should teach __ext4_iget() to check for this, and declare the
> file system as corrupted if it sees this combination.  (For VERITY_FL
> && DAX_FL that is a combo that we might want to support in the future,
> so that's probably a case where arguably, we should just ignore the
> DAX_FL for now.)

We also have a pending consideration for what MKTME (inline memory
encryption with programmable hardware keys) means for file-encryption
+ dax. Certainly kernel based software encryption is incompatible with
dax, but one of the hallway track discussions I wanted to have at LSF
is whether fscrypt is the right interface for managing inline memory
encryption. For now, disallowing ENCRPYT_FL + DAX_FL seems ok if
ENCRPYT_FL always means software encryption, but this is something to
circle back to once we get MKTME implemented for volume encryption and
start to look at finer grained (per-directory key) encryption.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-15 12:00   ` Jan Kara
  2020-04-15 15:55     ` Theodore Y. Ts'o
@ 2020-04-15 19:14     ` Ira Weiny
  2020-04-16  1:29       ` Eric Biggers
  2020-04-15 20:34     ` Ira Weiny
  2 siblings, 1 reply; 42+ messages in thread
From: Ira Weiny @ 2020-04-15 19:14 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-kernel, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 02:00:02PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:24, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> > flag change is wrong without a corresponding address_space_operations
> > update.
> > 
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> > 
> > (Setting DAX is already disabled if Verity is set first.)
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  fs/ext4/verity.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> > index dc5ec724d889..ce3f9a198d3b 100644
> > --- a/fs/ext4/verity.c
> > +++ b/fs/ext4/verity.c
> > @@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
> >  	handle_t *handle;
> >  	int err;
> >  
> > +	if (WARN_ON_ONCE(IS_DAX(inode)))
> > +		return -EINVAL;
> > +
> 
> Hum, one question, is there a reason for WARN_ON_ONCE()? If I understand
> correctly, user could normally trigger this, couldn't he?

Ok.  I did not think this through but I did think about this.  I was following
the code from the encryption side which issues a warning and was thinking that
would be a good way to alert the user they are doing something wrong...

I think you are right about both of them but we also need to put something in
the verity, dax, and ...  (I can't find a file in Documentation which talks
about encryption right off) documentation files....  For verity something like.

<quote>
Verity and DAX
--------------

Verity and DAX are not compatible and attempts to set both of these flags on a
file will fail.
</quote>

And the same thing in the DAX doc?

And where would be appropriate for the encrypt doc?

Ira

> 
> 								Honza
> 
> >  	if (ext4_verity_in_progress(inode))
> >  		return -EBUSY;
> >  
> > -- 
> > 2.25.1
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-15 15:55     ` Theodore Y. Ts'o
@ 2020-04-15 19:21       ` Ira Weiny
  0 siblings, 0 replies; 42+ messages in thread
From: Ira Weiny @ 2020-04-15 19:21 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Jan Kara, linux-kernel, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 11:55:25AM -0400, Theodore Y. Ts'o wrote:
> On Wed, Apr 15, 2020 at 02:00:02PM +0200, Jan Kara wrote:
> > On Mon 13-04-20 21:00:24, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> > > flag change is wrong without a corresponding address_space_operations
> > > update.
> > > 
> > > Make the 2 options mutually exclusive by returning an error if DAX was
> > > set first.
> > > 
> > > (Setting DAX is already disabled if Verity is set first.)
> > > 
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  fs/ext4/verity.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> > > index dc5ec724d889..ce3f9a198d3b 100644
> > > --- a/fs/ext4/verity.c
> > > +++ b/fs/ext4/verity.c
> > > @@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
> > >  	handle_t *handle;
> > >  	int err;
> > >  
> > > +	if (WARN_ON_ONCE(IS_DAX(inode)))
> > > +		return -EINVAL;
> > > +
> > 
> > Hum, one question, is there a reason for WARN_ON_ONCE()? If I understand
> > correctly, user could normally trigger this, couldn't he?
> 
> Tes, the WARN_ON_ONCE isn't appropriate here.  We should also disallow
> setting the DAX flag if the inode has the verity flag set already.

This is taken care of and is part of ext4_enable_dax() after this series.

> 
> And if we need to decide what to if the file system is mounted with
> "-o dax=always" and the verity file system feature is enabled.  We
> could either (a) reject the mount with if the mount option is given
> and the file system can have verity files, or (b) make "-o dax=always"
> mean "-o dax=mostly_always" and treat verity files as not using dax
> even when dax=always is selected.

The later is implemented in this series...  Not the most explicit thing.  :-(

> 
> Also, in theory, we *could* support dax and verity files, but
> verifying the crypto checksums of all of the pages when the file is
> first accessed, and then marking that in a flag in the in-inode flag.
> Or we could have a per-page flag stored somewhere that indicates that
> the page has been verified, so that we can on-demand verify the
> integrity of the page.  Given that verity files are read-only, the
> main reason why someone might want to use dax && verity would be to
> reduce page cache overhead of system files; if executing out of dax
> pages doesn't have significant performance impacts, this might be
> something which might be a nice-to-have.  I don't think we need to
> worry about this for now; if there are use cases where mobile devices
> want to use dax && verity, we can let them figure out how to make it
> work.  I'm just pointing out that it's not really a completely insane
> combination.

Fair enough.  The main issue I need to correct here is to keep the 2 mutually
exclusive.  Which AFAICT is not true today.  This makes it so even without the
per-file enablement.

Ira

> 
> Cheers,
> 
> 						- Ted

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX
  2020-04-15 16:03   ` Theodore Y. Ts'o
  2020-04-15 17:27     ` Dan Williams
@ 2020-04-15 19:54     ` Ira Weiny
  2020-04-21 18:41       ` Ira Weiny
  1 sibling, 1 reply; 42+ messages in thread
From: Ira Weiny @ 2020-04-15 19:54 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 12:03:07PM -0400, Theodore Y. Ts'o wrote:
> On Mon, Apr 13, 2020 at 09:00:25PM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Encryption and DAX are incompatible.  Changing the DAX mode due to a
> > change in Encryption mode is wrong without a corresponding
> > address_space_operations update.
> > 
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> The encryption flag is inherited from the containing directory, and
> directories can't have the DAX flag set,

But they can have FS_XFLAG_DAX set.

> so anything we do in
> ext4_set_context() will be safety belt / sanity checking in nature.
> 
> But we *do* need to figure out what we do with mount -o dax=always
> when the file system might have encrypted files.  My previous comments
> about the verity flag and dax flag applies here.

:-( agreed.

FWIW without these patches an inode which has encrypt or verity set is already
turning off DAX...  So we already have a '-o dax' flag which is not "always".

:-(

Unfortunately the 'always' designation kind of breaks semantically but it is
equal to the current mount option.

> 
> Also note that encrypted files are read/write so we must never allow
> the combination of ENCRPYT_FL and DAX_FL.  So that may be something
> where we should teach __ext4_iget() to check for this, and declare the
> file system as corrupted if it sees this combination.

ok...

> (For VERITY_FL
> && DAX_FL that is a combo that we might want to support in the future,
> so that's probably a case where arguably, we should just ignore the
> DAX_FL for now.)

ok...

Ira


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-15 12:00   ` Jan Kara
  2020-04-15 15:55     ` Theodore Y. Ts'o
  2020-04-15 19:14     ` Ira Weiny
@ 2020-04-15 20:34     ` Ira Weiny
  2 siblings, 0 replies; 42+ messages in thread
From: Ira Weiny @ 2020-04-15 20:34 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-kernel, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 02:00:02PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:24, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> > flag change is wrong without a corresponding address_space_operations
> > update.
> > 
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> > 
> > (Setting DAX is already disabled if Verity is set first.)
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  fs/ext4/verity.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> > index dc5ec724d889..ce3f9a198d3b 100644
> > --- a/fs/ext4/verity.c
> > +++ b/fs/ext4/verity.c
> > @@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
> >  	handle_t *handle;
> >  	int err;
> >  
> > +	if (WARN_ON_ONCE(IS_DAX(inode)))
> > +		return -EINVAL;
> > +
> 
> Hum, one question, is there a reason for WARN_ON_ONCE()? If I understand
> correctly, user could normally trigger this, couldn't he?

Removed and added to the verity doc.
Ira

> 
> 								Honza
> 
> >  	if (ext4_verity_in_progress(inode))
> >  		return -EBUSY;
> >  
> > -- 
> > 2.25.1
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX
  2020-04-15 12:02   ` Jan Kara
@ 2020-04-15 20:35     ` Ira Weiny
  0 siblings, 0 replies; 42+ messages in thread
From: Ira Weiny @ 2020-04-15 20:35 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-kernel, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 02:02:41PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:25, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Encryption and DAX are incompatible.  Changing the DAX mode due to a
> > change in Encryption mode is wrong without a corresponding
> > address_space_operations update.
> > 
> > Make the 2 options mutually exclusive by returning an error if DAX was
> > set first.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  fs/ext4/super.c | 10 +---------
> >  1 file changed, 1 insertion(+), 9 deletions(-)
> > 
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 0c7c4adb664e..b14863058115 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1325,7 +1325,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> >  	if (inode->i_ino == EXT4_ROOT_INO)
> >  		return -EPERM;
> >  
> > -	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> > +	if (WARN_ON_ONCE(IS_DAX(inode)))
> 
> Also here I don't think WARN_ON_ONCE() is warranted once we allow per-inode
> setting of DAX. It will then become a regular error condition...

Removed.
Ira

> 
> 								Honza
> 
> >  		return -EINVAL;
> >  
> >  	res = ext4_convert_inline_data(inode);
> > @@ -1349,10 +1349,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> >  			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> >  			ext4_clear_inode_state(inode,
> >  					EXT4_STATE_MAY_INLINE_DATA);
> > -			/*
> > -			 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > -			 * S_DAX may be disabled
> > -			 */
> >  			ext4_set_inode_flags(inode);
> >  		}
> >  		return res;
> > @@ -1376,10 +1372,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> >  				    ctx, len, 0);
> >  	if (!res) {
> >  		ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
> > -		/*
> > -		 * Update inode->i_flags - S_ENCRYPTED will be enabled,
> > -		 * S_DAX may be disabled
> > -		 */
> >  		ext4_set_inode_flags(inode);
> >  		res = ext4_mark_inode_dirty(handle, inode);
> >  		if (res)
> > -- 
> > 2.25.1
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-15 12:08   ` Jan Kara
@ 2020-04-15 20:39     ` Ira Weiny
  2020-04-16 10:32       ` Jan Kara
  2020-04-16 18:01       ` Theodore Y. Ts'o
  0 siblings, 2 replies; 42+ messages in thread
From: Ira Weiny @ 2020-04-15 20:39 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-kernel, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 02:08:46PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:26, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> > 
> > Set the flag to be user visible and changeable.  Set the flag to be
> > inherited.  Allow applications to change the flag at any time.
> > 
> > Finally, on regular files, flag the inode to not be cached to facilitate
> > changing S_DAX on the next creation of the inode.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  fs/ext4/ext4.h  | 13 +++++++++----
> >  fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
> >  2 files changed, 29 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > index 61b37a052052..434021fcec88 100644
> > --- a/fs/ext4/ext4.h
> > +++ b/fs/ext4/ext4.h
> > @@ -415,13 +415,16 @@ struct flex_groups {
> >  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
> >  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> >  #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> > +
> > +#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */
> > +
> 
> You seem to be using somewhat older kernel... EXT4_EOFBLOCKS_FL doesn't
> exist anymore (but still it's good to leave it reserved for some time so
> the value you've chosen is OK).

I'm on top of 5.6 released.  Did this get removed for 5.7?  I've heard there are
some boot issues with 5.7-rc1 so I'm holding out for rc2.

> 
> > @@ -813,6 +818,17 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
> >  	return error;
> >  }
> >  
> > +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> > +{
> > +	struct ext4_inode_info *ei = EXT4_I(inode);
> > +
> > +	if (S_ISDIR(inode->i_mode))
> > +		return;
> > +
> > +	if ((ei->i_flags ^ flags) == EXT4_DAX_FL)
> > +		inode->i_state |= I_DONTCACHE;
> > +}
> > +
> 
> You probably want to use the function you've introduced in the XFS series
> here...

you mean:

flag_inode_dontcache()
???

Yes that is done.  I sent this prior to v8 (where that was added) of the other
series...

Ira

> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-15 19:14     ` Ira Weiny
@ 2020-04-16  1:29       ` Eric Biggers
  2020-04-16  3:48         ` Ira Weiny
  0 siblings, 1 reply; 42+ messages in thread
From: Eric Biggers @ 2020-04-16  1:29 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Jan Kara, linux-kernel, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 12:14:52PM -0700, Ira Weiny wrote:
> On Wed, Apr 15, 2020 at 02:00:02PM +0200, Jan Kara wrote:
> > On Mon 13-04-20 21:00:24, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> > > flag change is wrong without a corresponding address_space_operations
> > > update.
> > > 
> > > Make the 2 options mutually exclusive by returning an error if DAX was
> > > set first.
> > > 
> > > (Setting DAX is already disabled if Verity is set first.)
> > > 
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  fs/ext4/verity.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> > > index dc5ec724d889..ce3f9a198d3b 100644
> > > --- a/fs/ext4/verity.c
> > > +++ b/fs/ext4/verity.c
> > > @@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
> > >  	handle_t *handle;
> > >  	int err;
> > >  
> > > +	if (WARN_ON_ONCE(IS_DAX(inode)))
> > > +		return -EINVAL;
> > > +
> > 
> > Hum, one question, is there a reason for WARN_ON_ONCE()? If I understand
> > correctly, user could normally trigger this, couldn't he?
> 
> Ok.  I did not think this through but I did think about this.  I was following
> the code from the encryption side which issues a warning and was thinking that
> would be a good way to alert the user they are doing something wrong...
> 
> I think you are right about both of them but we also need to put something in
> the verity, dax, and ...  (I can't find a file in Documentation which talks
> about encryption right off) documentation files....  For verity something like.
> 
> <quote>
> Verity and DAX
> --------------
> 
> Verity and DAX are not compatible and attempts to set both of these flags on a
> file will fail.
> </quote>
> 
> And the same thing in the DAX doc?
> 
> And where would be appropriate for the encrypt doc?
> 

Documentation/filesystems/fscrypt.rst mentions that DAX isn't supported on
encrypted files, but it doesn't say what happens if someone tries to do it
anyway.  Feel free to improve the documentation.

- Eric

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX
  2020-04-16  1:29       ` Eric Biggers
@ 2020-04-16  3:48         ` Ira Weiny
  0 siblings, 0 replies; 42+ messages in thread
From: Ira Weiny @ 2020-04-16  3:48 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Jan Kara, linux-kernel, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 06:29:01PM -0700, Eric Biggers wrote:
> On Wed, Apr 15, 2020 at 12:14:52PM -0700, Ira Weiny wrote:
> > On Wed, Apr 15, 2020 at 02:00:02PM +0200, Jan Kara wrote:
> > > On Mon 13-04-20 21:00:24, ira.weiny@intel.com wrote:
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > Verity and DAX are incompatible.  Changing the DAX mode due to a verity
> > > > flag change is wrong without a corresponding address_space_operations
> > > > update.
> > > > 
> > > > Make the 2 options mutually exclusive by returning an error if DAX was
> > > > set first.
> > > > 
> > > > (Setting DAX is already disabled if Verity is set first.)
> > > > 
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > > ---
> > > >  fs/ext4/verity.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> > > > index dc5ec724d889..ce3f9a198d3b 100644
> > > > --- a/fs/ext4/verity.c
> > > > +++ b/fs/ext4/verity.c
> > > > @@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
> > > >  	handle_t *handle;
> > > >  	int err;
> > > >  
> > > > +	if (WARN_ON_ONCE(IS_DAX(inode)))
> > > > +		return -EINVAL;
> > > > +
> > > 
> > > Hum, one question, is there a reason for WARN_ON_ONCE()? If I understand
> > > correctly, user could normally trigger this, couldn't he?
> > 
> > Ok.  I did not think this through but I did think about this.  I was following
> > the code from the encryption side which issues a warning and was thinking that
> > would be a good way to alert the user they are doing something wrong...
> > 
> > I think you are right about both of them but we also need to put something in
> > the verity, dax, and ...  (I can't find a file in Documentation which talks
> > about encryption right off) documentation files....  For verity something like.
> > 
> > <quote>
> > Verity and DAX
> > --------------
> > 
> > Verity and DAX are not compatible and attempts to set both of these flags on a
> > file will fail.
> > </quote>
> > 
> > And the same thing in the DAX doc?
> > 
> > And where would be appropriate for the encrypt doc?
> > 
> 
> Documentation/filesystems/fscrypt.rst mentions that DAX isn't supported on
> encrypted files, but it doesn't say what happens if someone tries to do it
> anyway.  Feel free to improve the documentation.

Thanks for pointing this out!

Ira


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-15 20:39     ` Ira Weiny
@ 2020-04-16 10:32       ` Jan Kara
  2020-04-16 18:01       ` Theodore Y. Ts'o
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Kara @ 2020-04-16 10:32 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Jan Kara, linux-kernel, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Wed 15-04-20 13:39:25, Ira Weiny wrote:
> > > @@ -813,6 +818,17 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
> > >  	return error;
> > >  }
> > >  
> > > +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> > > +{
> > > +	struct ext4_inode_info *ei = EXT4_I(inode);
> > > +
> > > +	if (S_ISDIR(inode->i_mode))
> > > +		return;
> > > +
> > > +	if ((ei->i_flags ^ flags) == EXT4_DAX_FL)
> > > +		inode->i_state |= I_DONTCACHE;
> > > +}
> > > +
> > 
> > You probably want to use the function you've introduced in the XFS series
> > here...
> 
> you mean:
> 
> flag_inode_dontcache()
> ???

Yeah, that's what I meant.

> Yes that is done.  I sent this prior to v8 (where that was added) of the other
> series...

Yep, I thought that was the case but I wanted to mention it as a reminder.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-14  4:00 ` [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag ira.weiny
  2020-04-15 12:08   ` Jan Kara
@ 2020-04-16 16:25   ` Darrick J. Wong
  2020-04-16 22:33     ` Ira Weiny
  1 sibling, 1 reply; 42+ messages in thread
From: Darrick J. Wong @ 2020-04-16 16:25 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-kernel, Jan Kara, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Mon, Apr 13, 2020 at 09:00:26PM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> 
> Set the flag to be user visible and changeable.  Set the flag to be
> inherited.  Allow applications to change the flag at any time.
> 
> Finally, on regular files, flag the inode to not be cached to facilitate
> changing S_DAX on the next creation of the inode.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/ext4/ext4.h  | 13 +++++++++----
>  fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
>  2 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 61b37a052052..434021fcec88 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -415,13 +415,16 @@ struct flex_groups {
>  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
>  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
>  #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> +
> +#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */

Sooo, fun fact about ext4 vs. the world--

The GETFLAGS/SETFLAGS ioctl, since it came from ext2, shares the same
flag values as the ondisk inode flags in ext*.  Therefore, each of these
EXT4_[whatever]_FL values are supposed to have a FS_[whatever]_FL
equivalent in include/uapi/linux/fs.h.

(Note that the "[whatever]" is a straight translation since the same
uapi header also defines the FS_XFLAG_[xfswhatever] flag values; ignore
those.)

Evidently, FS_NOCOW_FL already took 0x800000, but ext4.h was never
updated to note that the value was taken.  I think Ted might be inclined
to reserve the ondisk inode bit just in case ext4 ever does support copy
on write, though that's his call. :)

Long story short - can you use 0x1000000 for this instead, and add the
corresponding value to the uapi fs.h?  I guess that also means that we
can change FS_XFLAG_DAX (in the form of FS_DAX_FL in FSSETFLAGS) after
that.

--D

> +
>  #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
>  #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
>  #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
>  #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
>  
> -#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
> -#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
> +#define EXT4_FL_USER_VISIBLE		0x70DBDFFF /* User visible flags */
> +#define EXT4_FL_USER_MODIFIABLE		0x60CBC0FF /* User modifiable flags */
>  
>  /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
>  #define EXT4_FL_XFLAG_VISIBLE		(EXT4_SYNC_FL | \
> @@ -429,14 +432,16 @@ struct flex_groups {
>  					 EXT4_APPEND_FL | \
>  					 EXT4_NODUMP_FL | \
>  					 EXT4_NOATIME_FL | \
> -					 EXT4_PROJINHERIT_FL)
> +					 EXT4_PROJINHERIT_FL | \
> +					 EXT4_DAX_FL)
>  
>  /* Flags that should be inherited by new inodes from their parent. */
>  #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
>  			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
>  			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
>  			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
> -			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
> +			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
> +			   EXT4_DAX_FL)
>  
>  /* Flags that are appropriate for regular files (all but dir-specific ones). */
>  #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index ee3401a32e79..ca07d5086f03 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -539,12 +539,15 @@ static inline __u32 ext4_iflags_to_xflags(unsigned long iflags)
>  		xflags |= FS_XFLAG_NOATIME;
>  	if (iflags & EXT4_PROJINHERIT_FL)
>  		xflags |= FS_XFLAG_PROJINHERIT;
> +	if (iflags & EXT4_DAX_FL)
> +		xflags |= FS_XFLAG_DAX;
>  	return xflags;
>  }
>  
>  #define EXT4_SUPPORTED_FS_XFLAGS (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | \
>  				  FS_XFLAG_APPEND | FS_XFLAG_NODUMP | \
> -				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT)
> +				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT | \
> +				  FS_XFLAG_DAX)
>  
>  /* Transfer xflags flags to internal */
>  static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
> @@ -563,6 +566,8 @@ static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
>  		iflags |= EXT4_NOATIME_FL;
>  	if (xflags & FS_XFLAG_PROJINHERIT)
>  		iflags |= EXT4_PROJINHERIT_FL;
> +	if (xflags & FS_XFLAG_DAX)
> +		iflags |= EXT4_DAX_FL;
>  
>  	return iflags;
>  }
> @@ -813,6 +818,17 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
>  	return error;
>  }
>  
> +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> +{
> +	struct ext4_inode_info *ei = EXT4_I(inode);
> +
> +	if (S_ISDIR(inode->i_mode))
> +		return;
> +
> +	if ((ei->i_flags ^ flags) == EXT4_DAX_FL)
> +		inode->i_state |= I_DONTCACHE;
> +}
> +
>  long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>  {
>  	struct inode *inode = file_inode(filp);
> @@ -1273,6 +1289,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>  			return err;
>  
>  		inode_lock(inode);
> +
> +		ext4_dax_dontcache(inode, flags);
> +
>  		ext4_fill_fsxattr(inode, &old_fa);
>  		err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa);
>  		if (err)
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-15 20:39     ` Ira Weiny
  2020-04-16 10:32       ` Jan Kara
@ 2020-04-16 18:01       ` Theodore Y. Ts'o
  1 sibling, 0 replies; 42+ messages in thread
From: Theodore Y. Ts'o @ 2020-04-16 18:01 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Jan Kara, linux-kernel, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 01:39:25PM -0700, Ira Weiny wrote:
> 
> I'm on top of 5.6 released.  Did this get removed for 5.7?  I've heard there are
> some boot issues with 5.7-rc1 so I'm holding out for rc2.

Yes, it got removed in 5.7-rc1 in commit 4337ecd1fe99.

The boot issues with 5.7-rc1 is why ext4.git tree is now based off of
v5.7-rc1-35-g00086336a8d9: Merge tag 'efi-urgent-2020-04-15'....

You might want to see if 00086336a8d9 works for you (and if not, let
the x86 and/or efi folks know).

					- Ted

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-16 16:25   ` Darrick J. Wong
@ 2020-04-16 22:33     ` Ira Weiny
  2020-04-16 22:49       ` Darrick J. Wong
  0 siblings, 1 reply; 42+ messages in thread
From: Ira Weiny @ 2020-04-16 22:33 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-kernel, Jan Kara, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Thu, Apr 16, 2020 at 09:25:04AM -0700, Darrick J. Wong wrote:
> On Mon, Apr 13, 2020 at 09:00:26PM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> > 
> > Set the flag to be user visible and changeable.  Set the flag to be
> > inherited.  Allow applications to change the flag at any time.
> > 
> > Finally, on regular files, flag the inode to not be cached to facilitate
> > changing S_DAX on the next creation of the inode.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  fs/ext4/ext4.h  | 13 +++++++++----
> >  fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
> >  2 files changed, 29 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > index 61b37a052052..434021fcec88 100644
> > --- a/fs/ext4/ext4.h
> > +++ b/fs/ext4/ext4.h
> > @@ -415,13 +415,16 @@ struct flex_groups {
> >  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
> >  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> >  #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> > +
> > +#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */
> 
> Sooo, fun fact about ext4 vs. the world--
> 
> The GETFLAGS/SETFLAGS ioctl, since it came from ext2, shares the same
> flag values as the ondisk inode flags in ext*.  Therefore, each of these
> EXT4_[whatever]_FL values are supposed to have a FS_[whatever]_FL
> equivalent in include/uapi/linux/fs.h.

Interesting...

> 
> (Note that the "[whatever]" is a straight translation since the same
> uapi header also defines the FS_XFLAG_[xfswhatever] flag values; ignore
> those.)
> 
> Evidently, FS_NOCOW_FL already took 0x800000, but ext4.h was never
> updated to note that the value was taken.  I think Ted might be inclined
> to reserve the ondisk inode bit just in case ext4 ever does support copy
> on write, though that's his call. :)

Seems like I should change this...  And I did not realize I was inherently
changing a bit definition which was exposed to other FS's...

> 
> Long story short - can you use 0x1000000 for this instead, and add the
> corresponding value to the uapi fs.h?  I guess that also means that we
> can change FS_XFLAG_DAX (in the form of FS_DAX_FL in FSSETFLAGS) after
> that.

:-/

Are there any potential users of FS_XFLAG_DAX now?

From what it looks like, changing FS_XFLAG_DAX to FS_DAX_FL would be pretty
straight forward.  Just to be sure, looks like XFS converts the FS_[xxx]_FL to
FS_XFLAGS_[xxx] in xfs_merge_ioc_xflags()?  But it does not look like all the
FS_[xxx]_FL flags are converted.  Is is that XFS does not support those
options?  Or is it depending on the VFS layer for some of them?

Ira

> 
> --D
> 
> > +
> >  #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
> >  #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
> >  #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
> >  #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
> >  
> > -#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
> > -#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
> > +#define EXT4_FL_USER_VISIBLE		0x70DBDFFF /* User visible flags */
> > +#define EXT4_FL_USER_MODIFIABLE		0x60CBC0FF /* User modifiable flags */
> >  
> >  /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
> >  #define EXT4_FL_XFLAG_VISIBLE		(EXT4_SYNC_FL | \
> > @@ -429,14 +432,16 @@ struct flex_groups {
> >  					 EXT4_APPEND_FL | \
> >  					 EXT4_NODUMP_FL | \
> >  					 EXT4_NOATIME_FL | \
> > -					 EXT4_PROJINHERIT_FL)
> > +					 EXT4_PROJINHERIT_FL | \
> > +					 EXT4_DAX_FL)
> >  
> >  /* Flags that should be inherited by new inodes from their parent. */
> >  #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
> >  			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
> >  			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
> >  			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
> > -			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
> > +			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
> > +			   EXT4_DAX_FL)
> >  
> >  /* Flags that are appropriate for regular files (all but dir-specific ones). */
> >  #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
> > diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> > index ee3401a32e79..ca07d5086f03 100644
> > --- a/fs/ext4/ioctl.c
> > +++ b/fs/ext4/ioctl.c
> > @@ -539,12 +539,15 @@ static inline __u32 ext4_iflags_to_xflags(unsigned long iflags)
> >  		xflags |= FS_XFLAG_NOATIME;
> >  	if (iflags & EXT4_PROJINHERIT_FL)
> >  		xflags |= FS_XFLAG_PROJINHERIT;
> > +	if (iflags & EXT4_DAX_FL)
> > +		xflags |= FS_XFLAG_DAX;
> >  	return xflags;
> >  }
> >  
> >  #define EXT4_SUPPORTED_FS_XFLAGS (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | \
> >  				  FS_XFLAG_APPEND | FS_XFLAG_NODUMP | \
> > -				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT)
> > +				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT | \
> > +				  FS_XFLAG_DAX)
> >  
> >  /* Transfer xflags flags to internal */
> >  static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
> > @@ -563,6 +566,8 @@ static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
> >  		iflags |= EXT4_NOATIME_FL;
> >  	if (xflags & FS_XFLAG_PROJINHERIT)
> >  		iflags |= EXT4_PROJINHERIT_FL;
> > +	if (xflags & FS_XFLAG_DAX)
> > +		iflags |= EXT4_DAX_FL;
> >  
> >  	return iflags;
> >  }
> > @@ -813,6 +818,17 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
> >  	return error;
> >  }
> >  
> > +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> > +{
> > +	struct ext4_inode_info *ei = EXT4_I(inode);
> > +
> > +	if (S_ISDIR(inode->i_mode))
> > +		return;
> > +
> > +	if ((ei->i_flags ^ flags) == EXT4_DAX_FL)
> > +		inode->i_state |= I_DONTCACHE;
> > +}
> > +
> >  long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> >  {
> >  	struct inode *inode = file_inode(filp);
> > @@ -1273,6 +1289,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> >  			return err;
> >  
> >  		inode_lock(inode);
> > +
> > +		ext4_dax_dontcache(inode, flags);
> > +
> >  		ext4_fill_fsxattr(inode, &old_fa);
> >  		err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa);
> >  		if (err)
> > -- 
> > 2.25.1
> > 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-16 22:33     ` Ira Weiny
@ 2020-04-16 22:49       ` Darrick J. Wong
  2020-04-17  0:37         ` Ira Weiny
  0 siblings, 1 reply; 42+ messages in thread
From: Darrick J. Wong @ 2020-04-16 22:49 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-kernel, Jan Kara, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Thu, Apr 16, 2020 at 03:33:27PM -0700, Ira Weiny wrote:
> On Thu, Apr 16, 2020 at 09:25:04AM -0700, Darrick J. Wong wrote:
> > On Mon, Apr 13, 2020 at 09:00:26PM -0700, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> > > 
> > > Set the flag to be user visible and changeable.  Set the flag to be
> > > inherited.  Allow applications to change the flag at any time.
> > > 
> > > Finally, on regular files, flag the inode to not be cached to facilitate
> > > changing S_DAX on the next creation of the inode.
> > > 
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  fs/ext4/ext4.h  | 13 +++++++++----
> > >  fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
> > >  2 files changed, 29 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > > index 61b37a052052..434021fcec88 100644
> > > --- a/fs/ext4/ext4.h
> > > +++ b/fs/ext4/ext4.h
> > > @@ -415,13 +415,16 @@ struct flex_groups {
> > >  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
> > >  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> > >  #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> > > +
> > > +#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */
> > 
> > Sooo, fun fact about ext4 vs. the world--
> > 
> > The GETFLAGS/SETFLAGS ioctl, since it came from ext2, shares the same
> > flag values as the ondisk inode flags in ext*.  Therefore, each of these
> > EXT4_[whatever]_FL values are supposed to have a FS_[whatever]_FL
> > equivalent in include/uapi/linux/fs.h.
> 
> Interesting...
> 
> > 
> > (Note that the "[whatever]" is a straight translation since the same
> > uapi header also defines the FS_XFLAG_[xfswhatever] flag values; ignore
> > those.)
> > 
> > Evidently, FS_NOCOW_FL already took 0x800000, but ext4.h was never
> > updated to note that the value was taken.  I think Ted might be inclined
> > to reserve the ondisk inode bit just in case ext4 ever does support copy
> > on write, though that's his call. :)
> 
> Seems like I should change this...  And I did not realize I was inherently
> changing a bit definition which was exposed to other FS's...

<nod> This whole thing is a mess, particularly now that we have two vfs
ioctls to set per-fs inode attributes, both of which were inherited from
other filesystems... :(

> > 
> > Long story short - can you use 0x1000000 for this instead, and add the
> > corresponding value to the uapi fs.h?  I guess that also means that we
> > can change FS_XFLAG_DAX (in the form of FS_DAX_FL in FSSETFLAGS) after
> > that.
> 
> :-/
> 
> Are there any potential users of FS_XFLAG_DAX now?

Yes, it's in the userspace ABI so we can't get rid of it.

(FWIW there are several flags that exist in both FS_XFLAG_* and FS_*_FL
form.)

> From what it looks like, changing FS_XFLAG_DAX to FS_DAX_FL would be pretty
> straight forward.  Just to be sure, looks like XFS converts the FS_[xxx]_FL to
> FS_XFLAGS_[xxx] in xfs_merge_ioc_xflags()?  But it does not look like all the
> FS_[xxx]_FL flags are converted.  Is is that XFS does not support those
> options?  Or is it depending on the VFS layer for some of them?

XFS doesn't support most of the FS_*_FL flags.

--D

> Ira
> 
> > 
> > --D
> > 
> > > +
> > >  #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
> > >  #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
> > >  #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded file */
> > >  #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
> > >  
> > > -#define EXT4_FL_USER_VISIBLE		0x705BDFFF /* User visible flags */
> > > -#define EXT4_FL_USER_MODIFIABLE		0x604BC0FF /* User modifiable flags */
> > > +#define EXT4_FL_USER_VISIBLE		0x70DBDFFF /* User visible flags */
> > > +#define EXT4_FL_USER_MODIFIABLE		0x60CBC0FF /* User modifiable flags */
> > >  
> > >  /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
> > >  #define EXT4_FL_XFLAG_VISIBLE		(EXT4_SYNC_FL | \
> > > @@ -429,14 +432,16 @@ struct flex_groups {
> > >  					 EXT4_APPEND_FL | \
> > >  					 EXT4_NODUMP_FL | \
> > >  					 EXT4_NOATIME_FL | \
> > > -					 EXT4_PROJINHERIT_FL)
> > > +					 EXT4_PROJINHERIT_FL | \
> > > +					 EXT4_DAX_FL)
> > >  
> > >  /* Flags that should be inherited by new inodes from their parent. */
> > >  #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
> > >  			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
> > >  			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
> > >  			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
> > > -			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
> > > +			   EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
> > > +			   EXT4_DAX_FL)
> > >  
> > >  /* Flags that are appropriate for regular files (all but dir-specific ones). */
> > >  #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
> > > diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> > > index ee3401a32e79..ca07d5086f03 100644
> > > --- a/fs/ext4/ioctl.c
> > > +++ b/fs/ext4/ioctl.c
> > > @@ -539,12 +539,15 @@ static inline __u32 ext4_iflags_to_xflags(unsigned long iflags)
> > >  		xflags |= FS_XFLAG_NOATIME;
> > >  	if (iflags & EXT4_PROJINHERIT_FL)
> > >  		xflags |= FS_XFLAG_PROJINHERIT;
> > > +	if (iflags & EXT4_DAX_FL)
> > > +		xflags |= FS_XFLAG_DAX;
> > >  	return xflags;
> > >  }
> > >  
> > >  #define EXT4_SUPPORTED_FS_XFLAGS (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | \
> > >  				  FS_XFLAG_APPEND | FS_XFLAG_NODUMP | \
> > > -				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT)
> > > +				  FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT | \
> > > +				  FS_XFLAG_DAX)
> > >  
> > >  /* Transfer xflags flags to internal */
> > >  static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
> > > @@ -563,6 +566,8 @@ static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
> > >  		iflags |= EXT4_NOATIME_FL;
> > >  	if (xflags & FS_XFLAG_PROJINHERIT)
> > >  		iflags |= EXT4_PROJINHERIT_FL;
> > > +	if (xflags & FS_XFLAG_DAX)
> > > +		iflags |= EXT4_DAX_FL;
> > >  
> > >  	return iflags;
> > >  }
> > > @@ -813,6 +818,17 @@ static int ext4_ioctl_get_es_cache(struct file *filp, unsigned long arg)
> > >  	return error;
> > >  }
> > >  
> > > +static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
> > > +{
> > > +	struct ext4_inode_info *ei = EXT4_I(inode);
> > > +
> > > +	if (S_ISDIR(inode->i_mode))
> > > +		return;
> > > +
> > > +	if ((ei->i_flags ^ flags) == EXT4_DAX_FL)
> > > +		inode->i_state |= I_DONTCACHE;
> > > +}
> > > +
> > >  long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> > >  {
> > >  	struct inode *inode = file_inode(filp);
> > > @@ -1273,6 +1289,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> > >  			return err;
> > >  
> > >  		inode_lock(inode);
> > > +
> > > +		ext4_dax_dontcache(inode, flags);
> > > +
> > >  		ext4_fill_fsxattr(inode, &old_fa);
> > >  		err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa);
> > >  		if (err)
> > > -- 
> > > 2.25.1
> > > 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-16 22:49       ` Darrick J. Wong
@ 2020-04-17  0:37         ` Ira Weiny
  2020-04-17  1:57           ` Darrick J. Wong
  0 siblings, 1 reply; 42+ messages in thread
From: Ira Weiny @ 2020-04-17  0:37 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-kernel, Jan Kara, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Thu, Apr 16, 2020 at 03:49:37PM -0700, Darrick J. Wong wrote:
> On Thu, Apr 16, 2020 at 03:33:27PM -0700, Ira Weiny wrote:
> > On Thu, Apr 16, 2020 at 09:25:04AM -0700, Darrick J. Wong wrote:
> > > On Mon, Apr 13, 2020 at 09:00:26PM -0700, ira.weiny@intel.com wrote:
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> > > > 
> > > > Set the flag to be user visible and changeable.  Set the flag to be
> > > > inherited.  Allow applications to change the flag at any time.
> > > > 
> > > > Finally, on regular files, flag the inode to not be cached to facilitate
> > > > changing S_DAX on the next creation of the inode.
> > > > 
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > > ---
> > > >  fs/ext4/ext4.h  | 13 +++++++++----
> > > >  fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
> > > >  2 files changed, 29 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > > > index 61b37a052052..434021fcec88 100644
> > > > --- a/fs/ext4/ext4.h
> > > > +++ b/fs/ext4/ext4.h
> > > > @@ -415,13 +415,16 @@ struct flex_groups {
> > > >  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
> > > >  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> > > >  #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> > > > +
> > > > +#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */
> > > 
> > > Sooo, fun fact about ext4 vs. the world--
> > > 
> > > The GETFLAGS/SETFLAGS ioctl, since it came from ext2, shares the same
> > > flag values as the ondisk inode flags in ext*.  Therefore, each of these
> > > EXT4_[whatever]_FL values are supposed to have a FS_[whatever]_FL
> > > equivalent in include/uapi/linux/fs.h.
> > 
> > Interesting...
> > 
> > > 
> > > (Note that the "[whatever]" is a straight translation since the same
> > > uapi header also defines the FS_XFLAG_[xfswhatever] flag values; ignore
> > > those.)
> > > 
> > > Evidently, FS_NOCOW_FL already took 0x800000, but ext4.h was never
> > > updated to note that the value was taken.  I think Ted might be inclined
> > > to reserve the ondisk inode bit just in case ext4 ever does support copy
> > > on write, though that's his call. :)
> > 
> > Seems like I should change this...  And I did not realize I was inherently
> > changing a bit definition which was exposed to other FS's...
> 
> <nod> This whole thing is a mess, particularly now that we have two vfs
> ioctls to set per-fs inode attributes, both of which were inherited from
> other filesystems... :(
>

Ok I've changed it.

> 
> > > 
> > > Long story short - can you use 0x1000000 for this instead, and add the
> > > corresponding value to the uapi fs.h?  I guess that also means that we
> > > can change FS_XFLAG_DAX (in the form of FS_DAX_FL in FSSETFLAGS) after
> > > that.
> > 
> > :-/
> > 
> > Are there any potential users of FS_XFLAG_DAX now?
> 
> Yes, it's in the userspace ABI so we can't get rid of it.
> 
> (FWIW there are several flags that exist in both FS_XFLAG_* and FS_*_FL
> form.)
> 
> > From what it looks like, changing FS_XFLAG_DAX to FS_DAX_FL would be pretty
> > straight forward.  Just to be sure, looks like XFS converts the FS_[xxx]_FL to
> > FS_XFLAGS_[xxx] in xfs_merge_ioc_xflags()?  But it does not look like all the
> > FS_[xxx]_FL flags are converted.  Is is that XFS does not support those
> > options?  Or is it depending on the VFS layer for some of them?
> 
> XFS doesn't support most of the FS_*_FL flags.

If FS_XFLAG_DAX needs to continue to be user visible I think we need to keep
that flag and we should not expose the EXT4_DAX_FL flag...

I think that works for XFS.

But for ext4 it looks like EXT4_FL_XFLAG_VISIBLE was intended to be used for
[GET|SET]XATTR where EXT4_FL_USER_VISIBLE was intended to for [GET|SET]FLAGS...
But if I don't add EXT4_DAX_FL in EXT4_FL_XFLAG_VISIBLE my test fails.

I've been playing with the flags and looking at the code and I _thought_ the
following patch would ensure that FS_XFLAG_DAX is the only one visible but for
some reason FS_XFLAG_DAX can't be set with this patch.  I still need the
EXT4_FL_USER_VISIBLE mask altered...  Which I believe would expose EXT4_DAX_FL
directly as well.

Jan, Ted?  Any ideas?  Or should we expose EXT4_DAX_FL and FS_XFLAG_DAX in
ext4?

Ira

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index fb7e66089a74..c3823f057755 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -423,7 +423,7 @@ struct flex_groups {
 #define EXT4_CASEFOLD_FL               0x40000000 /* Casefolded file */
 #define EXT4_RESERVED_FL               0x80000000 /* reserved for ext4 lib */
 
-#define EXT4_FL_USER_VISIBLE           0x715BDFFF /* User visible flags */
+#define EXT4_FL_USER_VISIBLE           0x705BDFFF /* User visible flags */
 #define EXT4_FL_USER_MODIFIABLE                0x614BC0FF /* User modifiable flags */
 
 /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index b3c6e891185e..8bd0d3f9ca0b 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -744,8 +744,8 @@ static void ext4_fill_fsxattr(struct inode *inode, struct fsxattr *fa)
 {
        struct ext4_inode_info *ei = EXT4_I(inode);
 
-       simple_fill_fsxattr(fa, ext4_iflags_to_xflags(ei->i_flags &
-                                                     EXT4_FL_USER_VISIBLE));
+       simple_fill_fsxattr(fa, (ext4_iflags_to_xflags(ei->i_flags) &
+                                                     EXT4_FL_XFLAG_VISIBLE));
 
        if (ext4_has_feature_project(inode->i_sb))
                fa->fsx_projid = from_kprojid(&init_user_ns, ei->i_projid);

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-17  0:37         ` Ira Weiny
@ 2020-04-17  1:57           ` Darrick J. Wong
  2020-04-17  2:20             ` Ira Weiny
  0 siblings, 1 reply; 42+ messages in thread
From: Darrick J. Wong @ 2020-04-17  1:57 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-kernel, Jan Kara, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Thu, Apr 16, 2020 at 05:37:19PM -0700, Ira Weiny wrote:
> On Thu, Apr 16, 2020 at 03:49:37PM -0700, Darrick J. Wong wrote:
> > On Thu, Apr 16, 2020 at 03:33:27PM -0700, Ira Weiny wrote:
> > > On Thu, Apr 16, 2020 at 09:25:04AM -0700, Darrick J. Wong wrote:
> > > > On Mon, Apr 13, 2020 at 09:00:26PM -0700, ira.weiny@intel.com wrote:
> > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > 
> > > > > Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> > > > > 
> > > > > Set the flag to be user visible and changeable.  Set the flag to be
> > > > > inherited.  Allow applications to change the flag at any time.
> > > > > 
> > > > > Finally, on regular files, flag the inode to not be cached to facilitate
> > > > > changing S_DAX on the next creation of the inode.
> > > > > 
> > > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > > > ---
> > > > >  fs/ext4/ext4.h  | 13 +++++++++----
> > > > >  fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
> > > > >  2 files changed, 29 insertions(+), 5 deletions(-)
> > > > > 
> > > > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > > > > index 61b37a052052..434021fcec88 100644
> > > > > --- a/fs/ext4/ext4.h
> > > > > +++ b/fs/ext4/ext4.h
> > > > > @@ -415,13 +415,16 @@ struct flex_groups {
> > > > >  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
> > > > >  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> > > > >  #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> > > > > +
> > > > > +#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */
> > > > 
> > > > Sooo, fun fact about ext4 vs. the world--
> > > > 
> > > > The GETFLAGS/SETFLAGS ioctl, since it came from ext2, shares the same
> > > > flag values as the ondisk inode flags in ext*.  Therefore, each of these
> > > > EXT4_[whatever]_FL values are supposed to have a FS_[whatever]_FL
> > > > equivalent in include/uapi/linux/fs.h.
> > > 
> > > Interesting...
> > > 
> > > > 
> > > > (Note that the "[whatever]" is a straight translation since the same
> > > > uapi header also defines the FS_XFLAG_[xfswhatever] flag values; ignore
> > > > those.)
> > > > 
> > > > Evidently, FS_NOCOW_FL already took 0x800000, but ext4.h was never
> > > > updated to note that the value was taken.  I think Ted might be inclined
> > > > to reserve the ondisk inode bit just in case ext4 ever does support copy
> > > > on write, though that's his call. :)
> > > 
> > > Seems like I should change this...  And I did not realize I was inherently
> > > changing a bit definition which was exposed to other FS's...
> > 
> > <nod> This whole thing is a mess, particularly now that we have two vfs
> > ioctls to set per-fs inode attributes, both of which were inherited from
> > other filesystems... :(
> >
> 
> Ok I've changed it.
> 
> > 
> > > > 
> > > > Long story short - can you use 0x1000000 for this instead, and add the
> > > > corresponding value to the uapi fs.h?  I guess that also means that we
> > > > can change FS_XFLAG_DAX (in the form of FS_DAX_FL in FSSETFLAGS) after
> > > > that.
> > > 
> > > :-/
> > > 
> > > Are there any potential users of FS_XFLAG_DAX now?
> > 
> > Yes, it's in the userspace ABI so we can't get rid of it.
> > 
> > (FWIW there are several flags that exist in both FS_XFLAG_* and FS_*_FL
> > form.)
> > 
> > > From what it looks like, changing FS_XFLAG_DAX to FS_DAX_FL would be pretty
> > > straight forward.  Just to be sure, looks like XFS converts the FS_[xxx]_FL to
> > > FS_XFLAGS_[xxx] in xfs_merge_ioc_xflags()?  But it does not look like all the
> > > FS_[xxx]_FL flags are converted.  Is is that XFS does not support those
> > > options?  Or is it depending on the VFS layer for some of them?
> > 
> > XFS doesn't support most of the FS_*_FL flags.
> 
> If FS_XFLAG_DAX needs to continue to be user visible I think we need to keep
> that flag and we should not expose the EXT4_DAX_FL flag...
> 
> I think that works for XFS.
> 
> But for ext4 it looks like EXT4_FL_XFLAG_VISIBLE was intended to be used for
> [GET|SET]XATTR where EXT4_FL_USER_VISIBLE was intended to for [GET|SET]FLAGS...
> But if I don't add EXT4_DAX_FL in EXT4_FL_XFLAG_VISIBLE my test fails.
> 
> I've been playing with the flags and looking at the code and I _thought_ the
> following patch would ensure that FS_XFLAG_DAX is the only one visible but for
> some reason FS_XFLAG_DAX can't be set with this patch.  I still need the
> EXT4_FL_USER_VISIBLE mask altered...  Which I believe would expose EXT4_DAX_FL
> directly as well.
> 
> Jan, Ted?  Any ideas?  Or should we expose EXT4_DAX_FL and FS_XFLAG_DAX in
> ext4?

Both flags should be exposed through their respective ioctl interfaces
in both filesystems.  That way we don't have to add even more verbiage
to the documentation to instruct userspace programmers on how to special
case ext4 and XFS for the same piece of functionality.

--D

> Ira
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index fb7e66089a74..c3823f057755 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -423,7 +423,7 @@ struct flex_groups {
>  #define EXT4_CASEFOLD_FL               0x40000000 /* Casefolded file */
>  #define EXT4_RESERVED_FL               0x80000000 /* reserved for ext4 lib */
>  
> -#define EXT4_FL_USER_VISIBLE           0x715BDFFF /* User visible flags */
> +#define EXT4_FL_USER_VISIBLE           0x705BDFFF /* User visible flags */
>  #define EXT4_FL_USER_MODIFIABLE                0x614BC0FF /* User modifiable flags */
>  
>  /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index b3c6e891185e..8bd0d3f9ca0b 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -744,8 +744,8 @@ static void ext4_fill_fsxattr(struct inode *inode, struct fsxattr *fa)
>  {
>         struct ext4_inode_info *ei = EXT4_I(inode);
>  
> -       simple_fill_fsxattr(fa, ext4_iflags_to_xflags(ei->i_flags &
> -                                                     EXT4_FL_USER_VISIBLE));
> +       simple_fill_fsxattr(fa, (ext4_iflags_to_xflags(ei->i_flags) &
> +                                                     EXT4_FL_XFLAG_VISIBLE));
>  
>         if (ext4_has_feature_project(inode->i_sb))
>                 fa->fsx_projid = from_kprojid(&init_user_ns, ei->i_projid);

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-17  1:57           ` Darrick J. Wong
@ 2020-04-17  2:20             ` Ira Weiny
  2020-04-17  6:43               ` Andreas Dilger
  0 siblings, 1 reply; 42+ messages in thread
From: Ira Weiny @ 2020-04-17  2:20 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-kernel, Jan Kara, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Thu, Apr 16, 2020 at 06:57:31PM -0700, Darrick J. Wong wrote:
> On Thu, Apr 16, 2020 at 05:37:19PM -0700, Ira Weiny wrote:
> > On Thu, Apr 16, 2020 at 03:49:37PM -0700, Darrick J. Wong wrote:
> > > On Thu, Apr 16, 2020 at 03:33:27PM -0700, Ira Weiny wrote:
> > > > On Thu, Apr 16, 2020 at 09:25:04AM -0700, Darrick J. Wong wrote:
> > > > > On Mon, Apr 13, 2020 at 09:00:26PM -0700, ira.weiny@intel.com wrote:
> > > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > > 
> > > > > > Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> > > > > > 
> > > > > > Set the flag to be user visible and changeable.  Set the flag to be
> > > > > > inherited.  Allow applications to change the flag at any time.
> > > > > > 
> > > > > > Finally, on regular files, flag the inode to not be cached to facilitate
> > > > > > changing S_DAX on the next creation of the inode.
> > > > > > 
> > > > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > > > > ---
> > > > > >  fs/ext4/ext4.h  | 13 +++++++++----
> > > > > >  fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
> > > > > >  2 files changed, 29 insertions(+), 5 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > > > > > index 61b37a052052..434021fcec88 100644
> > > > > > --- a/fs/ext4/ext4.h
> > > > > > +++ b/fs/ext4/ext4.h
> > > > > > @@ -415,13 +415,16 @@ struct flex_groups {
> > > > > >  #define EXT4_VERITY_FL			0x00100000 /* Verity protected inode */
> > > > > >  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> > > > > >  #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> > > > > > +
> > > > > > +#define EXT4_DAX_FL			0x00800000 /* Inode is DAX */
> > > > > 
> > > > > Sooo, fun fact about ext4 vs. the world--
> > > > > 
> > > > > The GETFLAGS/SETFLAGS ioctl, since it came from ext2, shares the same
> > > > > flag values as the ondisk inode flags in ext*.  Therefore, each of these
> > > > > EXT4_[whatever]_FL values are supposed to have a FS_[whatever]_FL
> > > > > equivalent in include/uapi/linux/fs.h.
> > > > 
> > > > Interesting...
> > > > 
> > > > > 
> > > > > (Note that the "[whatever]" is a straight translation since the same
> > > > > uapi header also defines the FS_XFLAG_[xfswhatever] flag values; ignore
> > > > > those.)
> > > > > 
> > > > > Evidently, FS_NOCOW_FL already took 0x800000, but ext4.h was never
> > > > > updated to note that the value was taken.  I think Ted might be inclined
> > > > > to reserve the ondisk inode bit just in case ext4 ever does support copy
> > > > > on write, though that's his call. :)
> > > > 
> > > > Seems like I should change this...  And I did not realize I was inherently
> > > > changing a bit definition which was exposed to other FS's...
> > > 
> > > <nod> This whole thing is a mess, particularly now that we have two vfs
> > > ioctls to set per-fs inode attributes, both of which were inherited from
> > > other filesystems... :(
> > >
> > 
> > Ok I've changed it.
> > 
> > > 
> > > > > 
> > > > > Long story short - can you use 0x1000000 for this instead, and add the
> > > > > corresponding value to the uapi fs.h?  I guess that also means that we
> > > > > can change FS_XFLAG_DAX (in the form of FS_DAX_FL in FSSETFLAGS) after
> > > > > that.
> > > > 
> > > > :-/
> > > > 
> > > > Are there any potential users of FS_XFLAG_DAX now?
> > > 
> > > Yes, it's in the userspace ABI so we can't get rid of it.
> > > 
> > > (FWIW there are several flags that exist in both FS_XFLAG_* and FS_*_FL
> > > form.)
> > > 
> > > > From what it looks like, changing FS_XFLAG_DAX to FS_DAX_FL would be pretty
> > > > straight forward.  Just to be sure, looks like XFS converts the FS_[xxx]_FL to
> > > > FS_XFLAGS_[xxx] in xfs_merge_ioc_xflags()?  But it does not look like all the
> > > > FS_[xxx]_FL flags are converted.  Is is that XFS does not support those
> > > > options?  Or is it depending on the VFS layer for some of them?
> > > 
> > > XFS doesn't support most of the FS_*_FL flags.
> > 
> > If FS_XFLAG_DAX needs to continue to be user visible I think we need to keep
> > that flag and we should not expose the EXT4_DAX_FL flag...
> > 
> > I think that works for XFS.
> > 
> > But for ext4 it looks like EXT4_FL_XFLAG_VISIBLE was intended to be used for
> > [GET|SET]XATTR where EXT4_FL_USER_VISIBLE was intended to for [GET|SET]FLAGS...
> > But if I don't add EXT4_DAX_FL in EXT4_FL_XFLAG_VISIBLE my test fails.
> > 
> > I've been playing with the flags and looking at the code and I _thought_ the
> > following patch would ensure that FS_XFLAG_DAX is the only one visible but for
> > some reason FS_XFLAG_DAX can't be set with this patch.  I still need the
> > EXT4_FL_USER_VISIBLE mask altered...  Which I believe would expose EXT4_DAX_FL
> > directly as well.
> > 
> > Jan, Ted?  Any ideas?  Or should we expose EXT4_DAX_FL and FS_XFLAG_DAX in
> > ext4?
> 
> Both flags should be exposed through their respective ioctl interfaces
> in both filesystems.  That way we don't have to add even more verbiage
> to the documentation to instruct userspace programmers on how to special
> case ext4 and XFS for the same piece of functionality.

Wouldn't it be more confusing for the user to have 2 different flags which do
the same thing?

I would think that using FS_XFLAG_DAX _only_ (for both ext4 and xfs) would be
easier without special cases?

Ira


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-17  2:20             ` Ira Weiny
@ 2020-04-17  6:43               ` Andreas Dilger
  2020-04-17 17:19                 ` Ira Weiny
  0 siblings, 1 reply; 42+ messages in thread
From: Andreas Dilger @ 2020-04-17  6:43 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Darrick J. Wong, linux-kernel, Jan Kara, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

We still need to store an on-disk DAX flag for Ext4, and at that point it
doesn't make sense not to expose it via the standard Ext4 chattr utility.

So having EXT4_DAX_FL (== FS_DAX_FL) is no extra effort to add.

Cheers, Andreas

> On Apr 16, 2020, at 20:20, Ira Weiny <ira.weiny@intel.com> wrote:
> 
> On Thu, Apr 16, 2020 at 06:57:31PM -0700, Darrick J. Wong wrote:
>>> On Thu, Apr 16, 2020 at 05:37:19PM -0700, Ira Weiny wrote:
>>> On Thu, Apr 16, 2020 at 03:49:37PM -0700, Darrick J. Wong wrote:
>>>> On Thu, Apr 16, 2020 at 03:33:27PM -0700, Ira Weiny wrote:
>>>>> On Thu, Apr 16, 2020 at 09:25:04AM -0700, Darrick J. Wong wrote:
>>>>>> On Mon, Apr 13, 2020 at 09:00:26PM -0700, ira.weiny@intel.com wrote:
>>>>>>> From: Ira Weiny <ira.weiny@intel.com>
>>>>>>> 
>>>>>>> Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
>>>>>>> 
>>>>>>> Set the flag to be user visible and changeable.  Set the flag to be
>>>>>>> inherited.  Allow applications to change the flag at any time.
>>>>>>> 
>>>>>>> Finally, on regular files, flag the inode to not be cached to facilitate
>>>>>>> changing S_DAX on the next creation of the inode.
>>>>>>> 
>>>>>>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>>>>>>> ---
>>>>>>> fs/ext4/ext4.h  | 13 +++++++++----
>>>>>>> fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
>>>>>>> 2 files changed, 29 insertions(+), 5 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>>>>>>> index 61b37a052052..434021fcec88 100644
>>>>>>> --- a/fs/ext4/ext4.h
>>>>>>> +++ b/fs/ext4/ext4.h
>>>>>>> @@ -415,13 +415,16 @@ struct flex_groups {
>>>>>>> #define EXT4_VERITY_FL            0x00100000 /* Verity protected inode */
>>>>>>> #define EXT4_EA_INODE_FL            0x00200000 /* Inode used for large EA */
>>>>>>> #define EXT4_EOFBLOCKS_FL        0x00400000 /* Blocks allocated beyond EOF */
>>>>>>> +
>>>>>>> +#define EXT4_DAX_FL            0x00800000 /* Inode is DAX */
>>>>>> 
>>>>>> Sooo, fun fact about ext4 vs. the world--
>>>>>> 
>>>>>> The GETFLAGS/SETFLAGS ioctl, since it came from ext2, shares the same
>>>>>> flag values as the ondisk inode flags in ext*.  Therefore, each of these
>>>>>> EXT4_[whatever]_FL values are supposed to have a FS_[whatever]_FL
>>>>>> equivalent in include/uapi/linux/fs.h.
>>>>> 
>>>>> Interesting...
>>>>> 
>>>>>> 
>>>>>> (Note that the "[whatever]" is a straight translation since the same
>>>>>> uapi header also defines the FS_XFLAG_[xfswhatever] flag values; ignore
>>>>>> those.)
>>>>>> 
>>>>>> Evidently, FS_NOCOW_FL already took 0x800000, but ext4.h was never
>>>>>> updated to note that the value was taken.  I think Ted might be inclined
>>>>>> to reserve the ondisk inode bit just in case ext4 ever does support copy
>>>>>> on write, though that's his call. :)
>>>>> 
>>>>> Seems like I should change this...  And I did not realize I was inherently
>>>>> changing a bit definition which was exposed to other FS's...
>>>> 
>>>> <nod> This whole thing is a mess, particularly now that we have two vfs
>>>> ioctls to set per-fs inode attributes, both of which were inherited from
>>>> other filesystems... :(
>>>> 
>>> 
>>> Ok I've changed it.
>>> 
>>>> 
>>>>>> 
>>>>>> Long story short - can you use 0x1000000 for this instead, and add the
>>>>>> corresponding value to the uapi fs.h?  I guess that also means that we
>>>>>> can change FS_XFLAG_DAX (in the form of FS_DAX_FL in FSSETFLAGS) after
>>>>>> that.
>>>>> 
>>>>> :-/
>>>>> 
>>>>> Are there any potential users of FS_XFLAG_DAX now?
>>>> 
>>>> Yes, it's in the userspace ABI so we can't get rid of it.
>>>> 
>>>> (FWIW there are several flags that exist in both FS_XFLAG_* and FS_*_FL
>>>> form.)
>>>> 
>>>>> From what it looks like, changing FS_XFLAG_DAX to FS_DAX_FL would be pretty
>>>>> straight forward.  Just to be sure, looks like XFS converts the FS_[xxx]_FL to
>>>>> FS_XFLAGS_[xxx] in xfs_merge_ioc_xflags()?  But it does not look like all the
>>>>> FS_[xxx]_FL flags are converted.  Is is that XFS does not support those
>>>>> options?  Or is it depending on the VFS layer for some of them?
>>>> 
>>>> XFS doesn't support most of the FS_*_FL flags.
>>> 
>>> If FS_XFLAG_DAX needs to continue to be user visible I think we need to keep
>>> that flag and we should not expose the EXT4_DAX_FL flag...
>>> 
>>> I think that works for XFS.
>>> 
>>> But for ext4 it looks like EXT4_FL_XFLAG_VISIBLE was intended to be used for
>>> [GET|SET]XATTR where EXT4_FL_USER_VISIBLE was intended to for [GET|SET]FLAGS...
>>> But if I don't add EXT4_DAX_FL in EXT4_FL_XFLAG_VISIBLE my test fails.
>>> 
>>> I've been playing with the flags and looking at the code and I _thought_ the
>>> following patch would ensure that FS_XFLAG_DAX is the only one visible but for
>>> some reason FS_XFLAG_DAX can't be set with this patch.  I still need the
>>> EXT4_FL_USER_VISIBLE mask altered...  Which I believe would expose EXT4_DAX_FL
>>> directly as well.
>>> 
>>> Jan, Ted?  Any ideas?  Or should we expose EXT4_DAX_FL and FS_XFLAG_DAX in
>>> ext4?
>> 
>> Both flags should be exposed through their respective ioctl interfaces
>> in both filesystems.  That way we don't have to add even more verbiage
>> to the documentation to instruct userspace programmers on how to special
>> case ext4 and XFS for the same piece of functionality.
> 
> Wouldn't it be more confusing for the user to have 2 different flags which do
> the same thing?
> 
> I would think that using FS_XFLAG_DAX _only_ (for both ext4 and xfs) would be
> easier without special cases?
> 
> Ira
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax()
  2020-04-15 13:58   ` Jan Kara
@ 2020-04-17 17:16     ` Ira Weiny
  0 siblings, 0 replies; 42+ messages in thread
From: Ira Weiny @ 2020-04-17 17:16 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-kernel, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 03:58:34PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:28, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Change the logic of ext4_should_use_dax() to support using the inode dax
> > flag OR the overriding tri-state mount option.
> > 
> > While we are at it change the function to ext4_enable_dax() as this
> > better reflects the ask.
> 
> I disagree with the renaming. ext4_enable_dax() suggests it enables
> something. It does not. I'd either leave ext4_should_use_dax() or maybe
> change it to ext4_should_enable_dax() if you really like the "enable" word
> :).

Ok that does sound better.  And I've changed it in the xfs series as well.

but I kept Darrick's review on that patch...

> 
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  fs/ext4/inode.c | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index fa0ff78dc033..e9d582e516bc 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -4383,9 +4383,11 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
> >  		!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
> >  }
> >  
> > -static bool ext4_should_use_dax(struct inode *inode)
> > +static bool ext4_enable_dax(struct inode *inode)
> >  {
> > -	if (!test_opt(inode->i_sb, DAX))
> > +	unsigned int flags = EXT4_I(inode)->i_flags;
> > +
> > +	if (test_opt2(inode->i_sb, NODAX))
> >  		return false;
> >  	if (!S_ISREG(inode->i_mode))
> >  		return false;
> > @@ -4397,7 +4399,13 @@ static bool ext4_should_use_dax(struct inode *inode)
> >  		return false;
> >  	if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
> >  		return false;
> > -	return true;
> > +	if (!bdev_dax_supported(inode->i_sb->s_bdev,
> > +				inode->i_sb->s_blocksize))
> > +		return false;
> > +	if (test_opt(inode->i_sb, DAX))
> > +		return true;
> > +
> > +	return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;
> 
> flags & EXT4_DAX_FL is enough here, isn't it?

Yes, changed.

Ira

> 
> 								Honza
> 
> >  }
> >  
> >  void ext4_set_inode_flags(struct inode *inode)
> > @@ -4415,7 +4423,7 @@ void ext4_set_inode_flags(struct inode *inode)
> >  		new_fl |= S_NOATIME;
> >  	if (flags & EXT4_DIRSYNC_FL)
> >  		new_fl |= S_DIRSYNC;
> > -	if (ext4_should_use_dax(inode))
> > +	if (ext4_enable_dax(inode))
> >  		new_fl |= S_DAX;
> >  	if (flags & EXT4_ENCRYPT_FL)
> >  		new_fl |= S_ENCRYPTED;
> > -- 
> > 2.25.1
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 7/8] fs/ext4: Only change S_DAX on inode load
  2020-04-15 14:03   ` Jan Kara
@ 2020-04-17 17:18     ` Ira Weiny
  0 siblings, 0 replies; 42+ messages in thread
From: Ira Weiny @ 2020-04-17 17:18 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-kernel, Darrick J. Wong, Dan Williams, Dave Chinner,
	Christoph Hellwig, Theodore Y. Ts'o, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 04:03:08PM +0200, Jan Kara wrote:
> On Mon 13-04-20 21:00:29, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > To prevent complications with in memory inodes we only set S_DAX on
> > inode load.  FS_XFLAG_DAX can be changed at any time and S_DAX will
> > change after inode eviction and reload.
> > 
> > Add init bool to ext4_set_inode_flags() to indicate if the inode is
> > being newly initialized.
> > 
> > Assert that S_DAX is not set on an inode which is just being loaded.
> 
> > @@ -4408,11 +4408,13 @@ static bool ext4_enable_dax(struct inode *inode)
> >  	return (flags & EXT4_DAX_FL) == EXT4_DAX_FL;
> >  }
> >  
> > -void ext4_set_inode_flags(struct inode *inode)
> > +void ext4_set_inode_flags(struct inode *inode, bool init)
> >  {
> >  	unsigned int flags = EXT4_I(inode)->i_flags;
> >  	unsigned int new_fl = 0;
> >  
> > +	J_ASSERT(!(IS_DAX(inode) && init));
> > +
> 
> WARN_ON or BUG_ON here? J_ASSERT is for journalling assertions...

Ah sorry, did not realize that J_ was specific.

Changed to WARN_ON_ONCE()

Ira

> 
> Otherwise the patch looks good.
> 
> 								Honza
> 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag
  2020-04-17  6:43               ` Andreas Dilger
@ 2020-04-17 17:19                 ` Ira Weiny
  0 siblings, 0 replies; 42+ messages in thread
From: Ira Weiny @ 2020-04-17 17:19 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Darrick J. Wong, linux-kernel, Jan Kara, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o,
	Jeff Moyer, linux-ext4, linux-xfs, linux-fsdevel

On Fri, Apr 17, 2020 at 12:43:39AM -0600, Andreas Dilger wrote:
> We still need to store an on-disk DAX flag for Ext4, and at that point it
> doesn't make sense not to expose it via the standard Ext4 chattr utility.
> 
> So having EXT4_DAX_FL (== FS_DAX_FL) is no extra effort to add.

I'll leave it exposed then.

Thanks,
Ira

> 
> Cheers, Andreas
> 
> > On Apr 16, 2020, at 20:20, Ira Weiny <ira.weiny@intel.com> wrote:
> > 
> > On Thu, Apr 16, 2020 at 06:57:31PM -0700, Darrick J. Wong wrote:
> >>> On Thu, Apr 16, 2020 at 05:37:19PM -0700, Ira Weiny wrote:
> >>> On Thu, Apr 16, 2020 at 03:49:37PM -0700, Darrick J. Wong wrote:
> >>>> On Thu, Apr 16, 2020 at 03:33:27PM -0700, Ira Weiny wrote:
> >>>>> On Thu, Apr 16, 2020 at 09:25:04AM -0700, Darrick J. Wong wrote:
> >>>>>> On Mon, Apr 13, 2020 at 09:00:26PM -0700, ira.weiny@intel.com wrote:
> >>>>>>> From: Ira Weiny <ira.weiny@intel.com>
> >>>>>>> 
> >>>>>>> Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.
> >>>>>>> 
> >>>>>>> Set the flag to be user visible and changeable.  Set the flag to be
> >>>>>>> inherited.  Allow applications to change the flag at any time.
> >>>>>>> 
> >>>>>>> Finally, on regular files, flag the inode to not be cached to facilitate
> >>>>>>> changing S_DAX on the next creation of the inode.
> >>>>>>> 
> >>>>>>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> >>>>>>> ---
> >>>>>>> fs/ext4/ext4.h  | 13 +++++++++----
> >>>>>>> fs/ext4/ioctl.c | 21 ++++++++++++++++++++-
> >>>>>>> 2 files changed, 29 insertions(+), 5 deletions(-)
> >>>>>>> 
> >>>>>>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> >>>>>>> index 61b37a052052..434021fcec88 100644
> >>>>>>> --- a/fs/ext4/ext4.h
> >>>>>>> +++ b/fs/ext4/ext4.h
> >>>>>>> @@ -415,13 +415,16 @@ struct flex_groups {
> >>>>>>> #define EXT4_VERITY_FL            0x00100000 /* Verity protected inode */
> >>>>>>> #define EXT4_EA_INODE_FL            0x00200000 /* Inode used for large EA */
> >>>>>>> #define EXT4_EOFBLOCKS_FL        0x00400000 /* Blocks allocated beyond EOF */
> >>>>>>> +
> >>>>>>> +#define EXT4_DAX_FL            0x00800000 /* Inode is DAX */
> >>>>>> 
> >>>>>> Sooo, fun fact about ext4 vs. the world--
> >>>>>> 
> >>>>>> The GETFLAGS/SETFLAGS ioctl, since it came from ext2, shares the same
> >>>>>> flag values as the ondisk inode flags in ext*.  Therefore, each of these
> >>>>>> EXT4_[whatever]_FL values are supposed to have a FS_[whatever]_FL
> >>>>>> equivalent in include/uapi/linux/fs.h.
> >>>>> 
> >>>>> Interesting...
> >>>>> 
> >>>>>> 
> >>>>>> (Note that the "[whatever]" is a straight translation since the same
> >>>>>> uapi header also defines the FS_XFLAG_[xfswhatever] flag values; ignore
> >>>>>> those.)
> >>>>>> 
> >>>>>> Evidently, FS_NOCOW_FL already took 0x800000, but ext4.h was never
> >>>>>> updated to note that the value was taken.  I think Ted might be inclined
> >>>>>> to reserve the ondisk inode bit just in case ext4 ever does support copy
> >>>>>> on write, though that's his call. :)
> >>>>> 
> >>>>> Seems like I should change this...  And I did not realize I was inherently
> >>>>> changing a bit definition which was exposed to other FS's...
> >>>> 
> >>>> <nod> This whole thing is a mess, particularly now that we have two vfs
> >>>> ioctls to set per-fs inode attributes, both of which were inherited from
> >>>> other filesystems... :(
> >>>> 
> >>> 
> >>> Ok I've changed it.
> >>> 
> >>>> 
> >>>>>> 
> >>>>>> Long story short - can you use 0x1000000 for this instead, and add the
> >>>>>> corresponding value to the uapi fs.h?  I guess that also means that we
> >>>>>> can change FS_XFLAG_DAX (in the form of FS_DAX_FL in FSSETFLAGS) after
> >>>>>> that.
> >>>>> 
> >>>>> :-/
> >>>>> 
> >>>>> Are there any potential users of FS_XFLAG_DAX now?
> >>>> 
> >>>> Yes, it's in the userspace ABI so we can't get rid of it.
> >>>> 
> >>>> (FWIW there are several flags that exist in both FS_XFLAG_* and FS_*_FL
> >>>> form.)
> >>>> 
> >>>>> From what it looks like, changing FS_XFLAG_DAX to FS_DAX_FL would be pretty
> >>>>> straight forward.  Just to be sure, looks like XFS converts the FS_[xxx]_FL to
> >>>>> FS_XFLAGS_[xxx] in xfs_merge_ioc_xflags()?  But it does not look like all the
> >>>>> FS_[xxx]_FL flags are converted.  Is is that XFS does not support those
> >>>>> options?  Or is it depending on the VFS layer for some of them?
> >>>> 
> >>>> XFS doesn't support most of the FS_*_FL flags.
> >>> 
> >>> If FS_XFLAG_DAX needs to continue to be user visible I think we need to keep
> >>> that flag and we should not expose the EXT4_DAX_FL flag...
> >>> 
> >>> I think that works for XFS.
> >>> 
> >>> But for ext4 it looks like EXT4_FL_XFLAG_VISIBLE was intended to be used for
> >>> [GET|SET]XATTR where EXT4_FL_USER_VISIBLE was intended to for [GET|SET]FLAGS...
> >>> But if I don't add EXT4_DAX_FL in EXT4_FL_XFLAG_VISIBLE my test fails.
> >>> 
> >>> I've been playing with the flags and looking at the code and I _thought_ the
> >>> following patch would ensure that FS_XFLAG_DAX is the only one visible but for
> >>> some reason FS_XFLAG_DAX can't be set with this patch.  I still need the
> >>> EXT4_FL_USER_VISIBLE mask altered...  Which I believe would expose EXT4_DAX_FL
> >>> directly as well.
> >>> 
> >>> Jan, Ted?  Any ideas?  Or should we expose EXT4_DAX_FL and FS_XFLAG_DAX in
> >>> ext4?
> >> 
> >> Both flags should be exposed through their respective ioctl interfaces
> >> in both filesystems.  That way we don't have to add even more verbiage
> >> to the documentation to instruct userspace programmers on how to special
> >> case ext4 and XFS for the same piece of functionality.
> > 
> > Wouldn't it be more confusing for the user to have 2 different flags which do
> > the same thing?
> > 
> > I would think that using FS_XFLAG_DAX _only_ (for both ext4 and xfs) would be
> > easier without special cases?
> > 
> > Ira
> > 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX
  2020-04-15 19:54     ` Ira Weiny
@ 2020-04-21 18:41       ` Ira Weiny
  2020-04-21 18:56         ` Darrick J. Wong
  0 siblings, 1 reply; 42+ messages in thread
From: Ira Weiny @ 2020-04-21 18:41 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: linux-kernel, Jan Kara, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Wed, Apr 15, 2020 at 12:54:34PM -0700, 'Ira Weiny' wrote:
> On Wed, Apr 15, 2020 at 12:03:07PM -0400, Theodore Y. Ts'o wrote:
> > On Mon, Apr 13, 2020 at 09:00:25PM -0700, ira.weiny@intel.com wrote:
 
[snip]

> > 
> > Also note that encrypted files are read/write so we must never allow
> > the combination of ENCRPYT_FL and DAX_FL.  So that may be something
> > where we should teach __ext4_iget() to check for this, and declare the
> > file system as corrupted if it sees this combination.
> 
> ok...

After thinking about this...

Do we really want to declare the FS corrupted?

If so, I think we need to return errors when such a configuration is attempted.
If in the future we have an encrypted mode which can co-exist with DAX (such as
Dan mentioned) we can change this.

FWIW I think we should return errors when such a configuration is attempted but
_not_ declare the FS corrupted.  That allows users to enable this configuration
later if we can figure out how to support it.

> 
> > (For VERITY_FL
> > && DAX_FL that is a combo that we might want to support in the future,
> > so that's probably a case where arguably, we should just ignore the
> > DAX_FL for now.)
> 
> ok...

I think this should work the same.

It looks like VERITY_FL and ENCRYPT_FL are _not_ user modifiable?  Is that
correct?

You said that ENCRPYT_FL is set from the parent directory?  But I'm not seeing
where that occurs?

Similarly I don't see where VERITY_FL is being set either?  :-/

I think to make this work correctly we should restrict setting those flags if
DAX_FL is set and vice versa.  But I'm not finding where to do that.  :-/

Ira

> 
> Ira
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC 3/8] fs/ext4: Disallow encryption if inode is DAX
  2020-04-21 18:41       ` Ira Weiny
@ 2020-04-21 18:56         ` Darrick J. Wong
  0 siblings, 0 replies; 42+ messages in thread
From: Darrick J. Wong @ 2020-04-21 18:56 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Theodore Y. Ts'o, linux-kernel, Jan Kara, Dan Williams,
	Dave Chinner, Christoph Hellwig, Jeff Moyer, linux-ext4,
	linux-xfs, linux-fsdevel

On Tue, Apr 21, 2020 at 11:41:43AM -0700, Ira Weiny wrote:
> On Wed, Apr 15, 2020 at 12:54:34PM -0700, 'Ira Weiny' wrote:
> > On Wed, Apr 15, 2020 at 12:03:07PM -0400, Theodore Y. Ts'o wrote:
> > > On Mon, Apr 13, 2020 at 09:00:25PM -0700, ira.weiny@intel.com wrote:
>  
> [snip]
> 
> > > 
> > > Also note that encrypted files are read/write so we must never allow
> > > the combination of ENCRPYT_FL and DAX_FL.  So that may be something
> > > where we should teach __ext4_iget() to check for this, and declare the
> > > file system as corrupted if it sees this combination.
> > 
> > ok...
> 
> After thinking about this...
> 
> Do we really want to declare the FS corrupted?

Seeing as we're defining the dax inode flag to be advisory (since its
value is ignored if the fs isn't on pmem, or the administrator overrode
with dax=never mount option), I don't see why that's filesystem
corruption.

I can see a case for returning errors if you're trying to change ENCRYPT
or VERITY on a file that's has S_DAX set.  We can't encrypt or set
verity data for a file that could be changed behind our backs, so the
kernel cannot satisfy /that/ request.

As for changing FS_DAX_FL on an encrypted/verity'd file, the API says
that it might not have an immedate effect on S_DAX and that programs
have to check S_DAX after changing FS_DAX_FL.  It'll never result in
S_DAX being set, but the current spec never guarantees that. ;)

(If FS_DAX_FL were *mandatory* then yes that would be corruption.)

--D

> If so, I think we need to return errors when such a configuration is attempted.
> If in the future we have an encrypted mode which can co-exist with DAX (such as
> Dan mentioned) we can change this.
> 
> FWIW I think we should return errors when such a configuration is attempted but
> _not_ declare the FS corrupted.  That allows users to enable this configuration
> later if we can figure out how to support it.
> 
> > 
> > > (For VERITY_FL
> > > && DAX_FL that is a combo that we might want to support in the future,
> > > so that's probably a case where arguably, we should just ignore the
> > > DAX_FL for now.)
> > 
> > ok...
> 
> I think this should work the same.
> 
> It looks like VERITY_FL and ENCRYPT_FL are _not_ user modifiable?  Is that
> correct?
> 
> You said that ENCRPYT_FL is set from the parent directory?  But I'm not seeing
> where that occurs?
> 
> Similarly I don't see where VERITY_FL is being set either?  :-/
> 
> I think to make this work correctly we should restrict setting those flags if
> DAX_FL is set and vice versa.  But I'm not finding where to do that.  :-/
> 
> Ira
> 
> > 
> > Ira
> > 

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2020-04-21 18:57 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-14  4:00 [PATCH RFC 0/8] Enable ext4 support for per-file/directory DAX operations ira.weiny
2020-04-14  4:00 ` [PATCH RFC 1/8] fs/ext4: Narrow scope of DAX check in setflags ira.weiny
2020-04-15 11:45   ` Jan Kara
2020-04-14  4:00 ` [PATCH RFC 2/8] fs/ext4: Disallow verity if inode is DAX ira.weiny
2020-04-15 11:58   ` Jan Kara
2020-04-15 12:00   ` Jan Kara
2020-04-15 15:55     ` Theodore Y. Ts'o
2020-04-15 19:21       ` Ira Weiny
2020-04-15 19:14     ` Ira Weiny
2020-04-16  1:29       ` Eric Biggers
2020-04-16  3:48         ` Ira Weiny
2020-04-15 20:34     ` Ira Weiny
2020-04-14  4:00 ` [PATCH RFC 3/8] fs/ext4: Disallow encryption " ira.weiny
2020-04-15 12:02   ` Jan Kara
2020-04-15 20:35     ` Ira Weiny
2020-04-15 16:03   ` Theodore Y. Ts'o
2020-04-15 17:27     ` Dan Williams
2020-04-15 19:54     ` Ira Weiny
2020-04-21 18:41       ` Ira Weiny
2020-04-21 18:56         ` Darrick J. Wong
2020-04-14  4:00 ` [PATCH RFC 4/8] fs/ext4: Introduce DAX inode flag ira.weiny
2020-04-15 12:08   ` Jan Kara
2020-04-15 20:39     ` Ira Weiny
2020-04-16 10:32       ` Jan Kara
2020-04-16 18:01       ` Theodore Y. Ts'o
2020-04-16 16:25   ` Darrick J. Wong
2020-04-16 22:33     ` Ira Weiny
2020-04-16 22:49       ` Darrick J. Wong
2020-04-17  0:37         ` Ira Weiny
2020-04-17  1:57           ` Darrick J. Wong
2020-04-17  2:20             ` Ira Weiny
2020-04-17  6:43               ` Andreas Dilger
2020-04-17 17:19                 ` Ira Weiny
2020-04-14  4:00 ` [PATCH RFC 5/8] fs/ext4: Make DAX mount option a tri-state ira.weiny
2020-04-15 12:51   ` Jan Kara
2020-04-14  4:00 ` [PATCH RFC 6/8] fs/ext4: Update ext4_should_use_dax() ira.weiny
2020-04-15 13:58   ` Jan Kara
2020-04-17 17:16     ` Ira Weiny
2020-04-14  4:00 ` [PATCH RFC 7/8] fs/ext4: Only change S_DAX on inode load ira.weiny
2020-04-15 14:03   ` Jan Kara
2020-04-17 17:18     ` Ira Weiny
2020-04-14  4:00 ` [PATCH RFC 8/8] Documentation/dax: Update DAX enablement for ext4 ira.weiny

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).