linux-api.vger.kernel.org archive mirror
* [PATCH v4 0/9] make statx() return DIO alignment information
@ 2022-07-22  7:12 Eric Biggers
  2022-07-22  7:12 ` [PATCH v4 1/9] statx: add direct I/O " Eric Biggers
                   ` (9 more replies)
  0 siblings, 10 replies; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

This patchset makes the statx() system call return direct I/O (DIO)
alignment information.  This allows userspace to easily determine
whether a file supports DIO, and if so with what alignment restrictions.
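
As a rough illustration of what an application might do with the reported
values, the sketch below allocates a DIO buffer and rounds an I/O length up
to the required alignment.  The helper names round_up_pow2() and
alloc_dio_buffer() are hypothetical and not part of this patchset; the sketch
assumes the reported alignments are powers of two, and that a reported
alignment of 0 means DIO is unsupported.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for posix_memalign() */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round value up to a multiple of align (assumed to be a power of two). */
static uint64_t round_up_pow2(uint64_t value, uint32_t align)
{
	return (value + align - 1) & ~((uint64_t)align - 1);
}

/*
 * Hypothetical helper: allocate a buffer usable for DIO given the memory
 * alignment reported by statx().  Returns NULL if DIO is unsupported
 * (mem_align == 0) or if the allocation fails.  Free with free().
 */
static void *alloc_dio_buffer(uint32_t mem_align, size_t len)
{
	size_t align = mem_align;
	void *buf = NULL;

	if (mem_align == 0)
		return NULL;
	/* posix_memalign() needs a multiple of sizeof(void *). */
	if (align < sizeof(void *))
		align = sizeof(void *);
	if (posix_memalign(&buf, align, len) != 0)
		return NULL;
	return buf;
}
```

A caller would typically round the I/O length with the offset alignment and
allocate the buffer with the memory alignment, since the two values are
allowed to differ.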

Patch 1 adds the basic VFS support for STATX_DIOALIGN.  Patch 2 wires it
up for all block device files.  The remaining patches wire it up for
regular files on ext4, f2fs, and xfs.  Support for regular files on
other filesystems can be added later.

I've also written a man-pages patch, which I'm sending separately.

Note, f2fs has one corner case where DIO reads are allowed but not DIO
writes.  The proposed statx fields can't represent this.  My proposal
(patch 6) is to just eliminate this case, as it seems much too weird.
But I'd appreciate any feedback on that part.

This patchset applies to v5.19-rc7.

Changed v3 => v4:
   - Added xfs support.

   - Moved the helper function for block devices into block/bdev.c.
   
   - Adjusted the ext4 patch to not introduce a bug where misaligned DIO
     starts being allowed on encrypted files when it gets combined with
     the patch "iomap: add support for dma aligned direct-io" that is
     queued in the block tree for 5.20.

   - Made a simplification in fscrypt_dio_supported().

Changed v2 => v3:
   - Dropped the stx_offset_align_optimal field, since its purpose
     wasn't clearly distinguished from the existing stx_blksize.

   - Renamed STATX_IOALIGN to STATX_DIOALIGN, to reflect the new focus
     on DIO only.

   - Similarly, renamed stx_{mem,offset}_align_dio to
     stx_dio_{mem,offset}_align, to reflect the new focus on DIO only.

   - Wired up STATX_DIOALIGN on block device files.

Changed v1 => v2:
   - No changes.

Eric Biggers (9):
  statx: add direct I/O alignment information
  vfs: support STATX_DIOALIGN on block devices
  fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN
  ext4: support STATX_DIOALIGN
  f2fs: move f2fs_force_buffered_io() into file.c
  f2fs: don't allow DIO reads but not DIO writes
  f2fs: simplify f2fs_force_buffered_io()
  f2fs: support STATX_DIOALIGN
  xfs: support STATX_DIOALIGN

 block/bdev.c              | 25 ++++++++++++++++++++
 fs/crypto/inline_crypt.c  | 49 +++++++++++++++++++--------------------
 fs/ext4/ext4.h            |  1 +
 fs/ext4/file.c            | 37 ++++++++++++++++++++---------
 fs/ext4/inode.c           | 36 ++++++++++++++++++++++++++++
 fs/f2fs/f2fs.h            | 45 -----------------------------------
 fs/f2fs/file.c            | 45 ++++++++++++++++++++++++++++++++++-
 fs/stat.c                 | 14 +++++++++++
 fs/xfs/xfs_iops.c         |  9 +++++++
 include/linux/blkdev.h    |  4 ++++
 include/linux/fscrypt.h   |  7 ++----
 include/linux/stat.h      |  2 ++
 include/uapi/linux/stat.h |  4 +++-
 13 files changed, 190 insertions(+), 88 deletions(-)

base-commit: ff6992735ade75aae3e35d16b17da1008d753d28
-- 
2.37.0



* [PATCH v4 1/9] statx: add direct I/O alignment information
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-22 16:32   ` Darrick J. Wong
  2022-07-22 17:31   ` Martin K. Petersen
  2022-07-22  7:12 ` [PATCH v4 2/9] vfs: support STATX_DIOALIGN on block devices Eric Biggers
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch,
	Christoph Hellwig

From: Eric Biggers <ebiggers@google.com>

Traditionally, the conditions for when DIO (direct I/O) is supported
were fairly simple.  For both block devices and regular files, DIO had
to be aligned to the logical block size of the block device.

However, due to filesystem features that have been added over time (e.g.
multi-device support, data journalling, inline data, encryption, verity,
compression, checkpoint disabling, log-structured mode), the conditions
for when DIO is allowed on a regular file have gotten increasingly
complex.  Whether a particular regular file supports DIO, and with what
alignment, can depend on various file attributes and filesystem mount
options, as well as which block device(s) the file's data is located on.

Moreover, the general rule of DIO needing to be aligned to the block
device's logical block size is being relaxed to allow user buffers (but
not file offsets) aligned to the DMA alignment instead
(https://lore.kernel.org/linux-block/20220610195830.3574005-1-kbusch@fb.com/T/#u).

XFS has an ioctl XFS_IOC_DIOINFO that exposes DIO alignment information.
Uplifting this to the VFS is one possibility.  However, as discussed
(https://lore.kernel.org/linux-fsdevel/20220120071215.123274-1-ebiggers@kernel.org/T/#u),
this ioctl is rarely used and not known to be used outside of
XFS-specific code.  It was also never intended to indicate when a file
doesn't support DIO at all, nor was it intended for block devices.

Therefore, let's expose this information via statx().  Add the
STATX_DIOALIGN flag and two new statx fields associated with it:

* stx_dio_mem_align: the alignment (in bytes) required for user memory
  buffers for DIO, or 0 if DIO is not supported on the file.

* stx_dio_offset_align: the alignment (in bytes) required for file
  offsets and I/O segment lengths for DIO, or 0 if DIO is not supported
  on the file.  This will only be nonzero if stx_dio_mem_align is
  nonzero, and vice versa.

Note that as with other statx() extensions, if STATX_DIOALIGN isn't set
in the returned statx struct, then these new fields won't be filled in.
This will happen if the file is neither a regular file nor a block
device, or if the file is a regular file and the filesystem doesn't
support STATX_DIOALIGN.  It might also happen if the caller didn't
include STATX_DIOALIGN in the request mask, since statx() isn't required
to return unrequested information.
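
The request/result-mask behavior described above could be exercised from
userspace roughly as follows.  This is only a sketch: query_dio_align() and
parse_dio_align() are hypothetical helpers, the STATX_DIOALIGN value and the
field offsets (0x98 and 0x9c) are taken from the uapi hunk in this patch, and
the raw syscall with a byte buffer is used only so that the sketch builds
against libc headers that predate the new fields.  With new enough headers
one would simply read stx_dio_mem_align and stx_dio_offset_align.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>		/* AT_FDCWD */
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value from this patch; older headers may not define it. */
#ifndef STATX_DIOALIGN
#define STATX_DIOALIGN 0x00002000U
#endif

/* Layout of struct statx, per the offset comments in the uapi header. */
#define STX_MASK_OFF		 0x00
#define STX_DIO_MEM_ALIGN_OFF	 0x98
#define STX_DIO_OFFSET_ALIGN_OFF 0x9c
#define STATX_BUF_SIZE		 0x100

/*
 * Hypothetical helper: decode a raw statx buffer.  Returns 1 and fills in
 * both alignments if STATX_DIOALIGN is set in stx_mask, otherwise returns 0
 * and leaves the outputs untouched.
 */
static int parse_dio_align(const unsigned char *buf, uint32_t *mem_align,
			   uint32_t *offset_align)
{
	uint32_t mask;

	memcpy(&mask, buf + STX_MASK_OFF, sizeof(mask));
	if (!(mask & STATX_DIOALIGN))
		return 0;	/* not reported; the fields are meaningless */
	memcpy(mem_align, buf + STX_DIO_MEM_ALIGN_OFF, sizeof(*mem_align));
	memcpy(offset_align, buf + STX_DIO_OFFSET_ALIGN_OFF,
	       sizeof(*offset_align));
	return 1;
}

/*
 * Hypothetical helper: returns 1 if the kernel reported DIO alignment info
 * for path, 0 if it did not, or -1 if statx() itself failed (see errno).
 * A reported alignment of 0 means DIO is not supported on the file.
 */
static int query_dio_align(const char *path, uint32_t *mem_align,
			   uint32_t *offset_align)
{
	unsigned char buf[STATX_BUF_SIZE];

	memset(buf, 0, sizeof(buf));
	if (syscall(SYS_statx, AT_FDCWD, path, 0, STATX_DIOALIGN, buf) != 0)
		return -1;
	return parse_dio_align(buf, mem_align, offset_align);
}
```

Note that the caller has to distinguish three outcomes: the information was
not reported at all, it was reported with zero alignments (no DIO), or it was
reported with nonzero alignments.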

This commit only adds the VFS-level plumbing for STATX_DIOALIGN.  For
regular files, individual filesystems will still need to add code to
support it.  For block devices, a separate commit will wire it up too.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/stat.c                 | 2 ++
 include/linux/stat.h      | 2 ++
 include/uapi/linux/stat.h | 4 +++-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/stat.c b/fs/stat.c
index 9ced8860e0f35d..a7930d74448304 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -611,6 +611,8 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
 	tmp.stx_dev_major = MAJOR(stat->dev);
 	tmp.stx_dev_minor = MINOR(stat->dev);
 	tmp.stx_mnt_id = stat->mnt_id;
+	tmp.stx_dio_mem_align = stat->dio_mem_align;
+	tmp.stx_dio_offset_align = stat->dio_offset_align;
 
 	return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
 }
diff --git a/include/linux/stat.h b/include/linux/stat.h
index 7df06931f25d85..ff277ced50e9fd 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -50,6 +50,8 @@ struct kstat {
 	struct timespec64 btime;			/* File creation time */
 	u64		blocks;
 	u64		mnt_id;
+	u32		dio_mem_align;
+	u32		dio_offset_align;
 };
 
 #endif
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index 1500a0f58041ae..7cab2c65d3d7fc 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -124,7 +124,8 @@ struct statx {
 	__u32	stx_dev_minor;
 	/* 0x90 */
 	__u64	stx_mnt_id;
-	__u64	__spare2;
+	__u32	stx_dio_mem_align;	/* Memory buffer alignment for direct I/O */
+	__u32	stx_dio_offset_align;	/* File offset alignment for direct I/O */
 	/* 0xa0 */
 	__u64	__spare3[12];	/* Spare space for future expansion */
 	/* 0x100 */
@@ -152,6 +153,7 @@ struct statx {
 #define STATX_BASIC_STATS	0x000007ffU	/* The stuff in the normal stat struct */
 #define STATX_BTIME		0x00000800U	/* Want/got stx_btime */
 #define STATX_MNT_ID		0x00001000U	/* Got stx_mnt_id */
+#define STATX_DIOALIGN		0x00002000U	/* Want/got direct I/O alignment info */
 
 #define STATX__RESERVED		0x80000000U	/* Reserved for future struct statx expansion */
 
-- 
2.37.0



* [PATCH v4 2/9] vfs: support STATX_DIOALIGN on block devices
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
  2022-07-22  7:12 ` [PATCH v4 1/9] statx: add direct I/O " Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-22  8:10   ` Christoph Hellwig
  2022-07-22 17:32   ` Martin K. Petersen
  2022-07-22  7:12 ` [PATCH v4 3/9] fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN Eric Biggers
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

From: Eric Biggers <ebiggers@google.com>

Add support for STATX_DIOALIGN to block devices, so that direct I/O
alignment restrictions are exposed to userspace in a generic way.

Note that this breaks the tradition of stat operating only on the block
device node, not the block device itself.  However, it was felt that
doing this is preferable, in order to make the interface useful and
avoid needing separate interfaces for regular files and block devices.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 block/bdev.c           | 25 +++++++++++++++++++++++++
 fs/stat.c              | 12 ++++++++++++
 include/linux/blkdev.h |  4 ++++
 3 files changed, 41 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index 5fe06c1f2def41..cee0951e27a82a 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -26,6 +26,7 @@
 #include <linux/namei.h>
 #include <linux/part_stat.h>
 #include <linux/uaccess.h>
+#include <linux/stat.h>
 #include "../fs/internal.h"
 #include "blk.h"
 
@@ -1071,3 +1072,27 @@ void sync_bdevs(bool wait)
 	spin_unlock(&blockdev_superblock->s_inode_list_lock);
 	iput(old_inode);
 }
+
+/*
+ * Handle STATX_DIOALIGN for block devices.
+ *
+ * Note that the inode passed to this is the inode of a block device node file,
+ * not the block device's internal inode.  Therefore it is *not* valid to use
+ * I_BDEV() here; the block device has to be looked up by i_rdev instead.
+ */
+void bdev_statx_dioalign(struct inode *inode, struct kstat *stat)
+{
+	struct block_device *bdev;
+	unsigned int lbs;
+
+	bdev = blkdev_get_no_open(inode->i_rdev);
+	if (!bdev)
+		return;
+
+	lbs = bdev_logical_block_size(bdev);
+	stat->dio_mem_align = lbs;
+	stat->dio_offset_align = lbs;
+	stat->result_mask |= STATX_DIOALIGN;
+
+	blkdev_put_no_open(bdev);
+}
diff --git a/fs/stat.c b/fs/stat.c
index a7930d74448304..ef50573c72a269 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -5,6 +5,7 @@
  *  Copyright (C) 1991, 1992  Linus Torvalds
  */
 
+#include <linux/blkdev.h>
 #include <linux/export.h>
 #include <linux/mm.h>
 #include <linux/errno.h>
@@ -230,11 +231,22 @@ static int vfs_statx(int dfd, struct filename *filename, int flags,
 		goto out;
 
 	error = vfs_getattr(&path, stat, request_mask, flags);
+
 	stat->mnt_id = real_mount(path.mnt)->mnt_id;
 	stat->result_mask |= STATX_MNT_ID;
+
 	if (path.mnt->mnt_root == path.dentry)
 		stat->attributes |= STATX_ATTR_MOUNT_ROOT;
 	stat->attributes_mask |= STATX_ATTR_MOUNT_ROOT;
+
+	/* Handle STATX_DIOALIGN for block devices. */
+	if (request_mask & STATX_DIOALIGN) {
+		struct inode *inode = d_backing_inode(path.dentry);
+
+		if (S_ISBLK(inode->i_mode))
+			bdev_statx_dioalign(inode, stat);
+	}
+
 	path_put(&path);
 	if (retry_estale(error, lookup_flags)) {
 		lookup_flags |= LOOKUP_REVAL;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2f7b43444c5f8d..d75151bd43b541 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1538,6 +1538,7 @@ int sync_blockdev(struct block_device *bdev);
 int sync_blockdev_range(struct block_device *bdev, loff_t lstart, loff_t lend);
 int sync_blockdev_nowait(struct block_device *bdev);
 void sync_bdevs(bool wait);
+void bdev_statx_dioalign(struct inode *inode, struct kstat *stat);
 void printk_all_partitions(void);
 #else
 static inline void invalidate_bdev(struct block_device *bdev)
@@ -1554,6 +1555,9 @@ static inline int sync_blockdev_nowait(struct block_device *bdev)
 static inline void sync_bdevs(bool wait)
 {
 }
+static inline void bdev_statx_dioalign(struct inode *inode, struct kstat *stat)
+{
+}
 static inline void printk_all_partitions(void)
 {
 }
-- 
2.37.0



* [PATCH v4 3/9] fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
  2022-07-22  7:12 ` [PATCH v4 1/9] statx: add direct I/O " Eric Biggers
  2022-07-22  7:12 ` [PATCH v4 2/9] vfs: support STATX_DIOALIGN on block devices Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-22  8:10   ` Christoph Hellwig
  2022-07-22  7:12 ` [PATCH v4 4/9] ext4: support STATX_DIOALIGN Eric Biggers
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

From: Eric Biggers <ebiggers@google.com>

To prepare for STATX_DIOALIGN support, make two changes to
fscrypt_dio_supported().

First, remove the filesystem-block-alignment check and make the
filesystems handle it instead.  It previously made sense to have it in
fs/crypto/; however, to support STATX_DIOALIGN the alignment restriction
would have to be returned to filesystems.  It ends up being simpler if
filesystems handle this part themselves, especially for f2fs which only
allows fs-block-aligned DIO in the first place.

Second, make fscrypt_dio_supported() work on inodes whose encryption key
hasn't been set up yet, by making it set up the key if needed.  This is
required for statx(), since statx() doesn't require a file descriptor.
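
For readers unfamiliar with the alignment check that moves into the
filesystems, here is a userspace model of it.  It is a sketch only:
is_aligned() stands in for the kernel's IS_ALIGNED(), and
dio_request_aligned() is a hypothetical single-buffer analogue of
IS_ALIGNED(iocb->ki_pos | iov_iter_alignment(iter), blocksize).  The block
size is assumed to be a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for IS_ALIGNED(); align must be a power of two. */
static bool is_aligned(uint64_t value, uint64_t align)
{
	return (value & (align - 1)) == 0;
}

/*
 * Hypothetical model of the check for a request with one buffer.  OR-ing
 * the file position, buffer address, and length together lets a single
 * mask test reject the request if any one of them is misaligned.
 */
static bool dio_request_aligned(uint64_t file_pos, const void *buf,
				size_t len, uint32_t blocksize)
{
	uint64_t bits = file_pos | (uint64_t)(uintptr_t)buf | (uint64_t)len;

	return is_aligned(bits, blocksize);
}
```

The OR trick works because a low bit set in any of the operands survives
into the combined value, so one test covers all three properties.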

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/crypto/inline_crypt.c | 49 ++++++++++++++++++++--------------------
 fs/ext4/file.c           |  9 ++++++--
 fs/f2fs/f2fs.h           |  2 +-
 include/linux/fscrypt.h  |  7 ++----
 4 files changed, 34 insertions(+), 33 deletions(-)

diff --git a/fs/crypto/inline_crypt.c b/fs/crypto/inline_crypt.c
index 90f3e68f166e39..8d4bee5bccbf42 100644
--- a/fs/crypto/inline_crypt.c
+++ b/fs/crypto/inline_crypt.c
@@ -401,46 +401,45 @@ bool fscrypt_mergeable_bio_bh(struct bio *bio,
 EXPORT_SYMBOL_GPL(fscrypt_mergeable_bio_bh);
 
 /**
- * fscrypt_dio_supported() - check whether a DIO (direct I/O) request is
- *			     supported as far as encryption is concerned
- * @iocb: the file and position the I/O is targeting
- * @iter: the I/O data segment(s)
+ * fscrypt_dio_supported() - check whether DIO (direct I/O) is supported on an
+ *			     inode, as far as encryption is concerned
+ * @inode: the inode in question
  *
  * Return: %true if there are no encryption constraints that prevent DIO from
  *	   being supported; %false if DIO is unsupported.  (Note that in the
  *	   %true case, the filesystem might have other, non-encryption-related
- *	   constraints that prevent DIO from actually being supported.)
+ *	   constraints that prevent DIO from actually being supported.  Also, on
+ *	   encrypted files the filesystem is still responsible for only allowing
+ *	   DIO when requests are filesystem-block-aligned.)
  */
-bool fscrypt_dio_supported(struct kiocb *iocb, struct iov_iter *iter)
+bool fscrypt_dio_supported(struct inode *inode)
 {
-	const struct inode *inode = file_inode(iocb->ki_filp);
-	const unsigned int blocksize = i_blocksize(inode);
+	int err;
 
 	/* If the file is unencrypted, no veto from us. */
 	if (!fscrypt_needs_contents_encryption(inode))
 		return true;
 
-	/* We only support DIO with inline crypto, not fs-layer crypto. */
-	if (!fscrypt_inode_uses_inline_crypto(inode))
-		return false;
-
 	/*
-	 * Since the granularity of encryption is filesystem blocks, the file
-	 * position and total I/O length must be aligned to the filesystem block
-	 * size -- not just to the block device's logical block size as is
-	 * traditionally the case for DIO on many filesystems.
+	 * We only support DIO with inline crypto, not fs-layer crypto.
 	 *
-	 * We require that the user-provided memory buffers be filesystem block
-	 * aligned too.  It is simpler to have a single alignment value required
-	 * for all properties of the I/O, as is normally the case for DIO.
-	 * Also, allowing less aligned buffers would imply that data units could
-	 * cross bvecs, which would greatly complicate the I/O stack, which
-	 * assumes that bios can be split at any bvec boundary.
+	 * To determine whether the inode is using inline crypto, we have to set
+	 * up the key if it wasn't already done.  This is because in the current
+	 * design of fscrypt, the decision of whether to use inline crypto or
+	 * not isn't made until the inode's encryption key is being set up.  In
+	 * the DIO read/write case, the key will always be set up already, since
+	 * the file will be open.  But in the case of statx(), the key might not
+	 * be set up yet, as the file might not have been opened yet.
 	 */
-	if (!IS_ALIGNED(iocb->ki_pos | iov_iter_alignment(iter), blocksize))
+	err = fscrypt_require_key(inode);
+	if (err) {
+		/*
+		 * Key unavailable or couldn't be set up.  This edge case isn't
+		 * worth worrying about; just report that DIO is unsupported.
+		 */
 		return false;
-
-	return true;
+	}
+	return fscrypt_inode_uses_inline_crypto(inode);
 }
 EXPORT_SYMBOL_GPL(fscrypt_dio_supported);
 
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 109d07629f81fb..26d7426208970d 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -40,8 +40,13 @@ static bool ext4_dio_supported(struct kiocb *iocb, struct iov_iter *iter)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
 
-	if (!fscrypt_dio_supported(iocb, iter))
-		return false;
+	if (IS_ENCRYPTED(inode)) {
+		if (!fscrypt_dio_supported(inode))
+			return false;
+		if (!IS_ALIGNED(iocb->ki_pos | iov_iter_alignment(iter),
+				i_blocksize(inode)))
+			return false;
+	}
 	if (fsverity_active(inode))
 		return false;
 	if (ext4_should_journal_data(inode))
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index d9bbecd008d22a..7869e749700fc2 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4453,7 +4453,7 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	int rw = iov_iter_rw(iter);
 
-	if (!fscrypt_dio_supported(iocb, iter))
+	if (!fscrypt_dio_supported(inode))
 		return true;
 	if (fsverity_active(inode))
 		return true;
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index e60d57c99cb6f2..0f9f5ed5b34d35 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -763,7 +763,7 @@ bool fscrypt_mergeable_bio(struct bio *bio, const struct inode *inode,
 bool fscrypt_mergeable_bio_bh(struct bio *bio,
 			      const struct buffer_head *next_bh);
 
-bool fscrypt_dio_supported(struct kiocb *iocb, struct iov_iter *iter);
+bool fscrypt_dio_supported(struct inode *inode);
 
 u64 fscrypt_limit_io_blocks(const struct inode *inode, u64 lblk, u64 nr_blocks);
 
@@ -796,11 +796,8 @@ static inline bool fscrypt_mergeable_bio_bh(struct bio *bio,
 	return true;
 }
 
-static inline bool fscrypt_dio_supported(struct kiocb *iocb,
-					 struct iov_iter *iter)
+static inline bool fscrypt_dio_supported(struct inode *inode)
 {
-	const struct inode *inode = file_inode(iocb->ki_filp);
-
 	return !fscrypt_needs_contents_encryption(inode);
 }
 
-- 
2.37.0



* [PATCH v4 4/9] ext4: support STATX_DIOALIGN
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
                   ` (2 preceding siblings ...)
  2022-07-22  7:12 ` [PATCH v4 3/9] fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-22 17:05   ` Theodore Ts'o
  2022-07-22  7:12 ` [PATCH v4 5/9] f2fs: move f2fs_force_buffered_io() into file.c Eric Biggers
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

From: Eric Biggers <ebiggers@google.com>

Add support for STATX_DIOALIGN to ext4, so that direct I/O alignment
restrictions are exposed to userspace in a generic way.
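
The convention used in the diff below (0 means no DIO, 1 means "use the
iomap defaults", anything else is an explicit alignment) can be summarized
with a small userspace model.  This is a sketch, not kernel code: struct
file_model and both functions are hypothetical, and the boolean fields merely
stand in for the fsverity, data journalling, inline data, and fscrypt checks
that ext4_dio_alignment() performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode state that the ext4 code consults. */
struct file_model {
	bool verity;		/* fsverity_active() */
	bool journal_data;	/* ext4_should_journal_data() */
	bool inline_data;	/* ext4_has_inline_data() */
	bool encrypted;		/* IS_ENCRYPTED() */
	bool inline_crypto;	/* fscrypt_dio_supported() on encrypted file */
	uint32_t fs_block_size;		/* i_blocksize() */
	uint32_t logical_block_size;	/* bdev_logical_block_size() */
};

/* Models ext4_dio_alignment(): 0 = no DIO, 1 = iomap defaults. */
static uint32_t model_dio_alignment(const struct file_model *f)
{
	if (f->verity || f->journal_data || f->inline_data)
		return 0;
	if (f->encrypted)
		return f->inline_crypto ? f->fs_block_size : 0;
	return 1;
}

/* Models the value that ext4_getattr() stores in both statx fields. */
static uint32_t model_reported_align(const struct file_model *f)
{
	uint32_t align = model_dio_alignment(f);

	return align == 1 ? f->logical_block_size : align;
}
```

In this model a plain file reports the logical block size, an encrypted file
using inline crypto reports the filesystem block size, and any file with a
DIO-incompatible feature reports 0.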

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/ext4/ext4.h  |  1 +
 fs/ext4/file.c  | 42 ++++++++++++++++++++++++++----------------
 fs/ext4/inode.c | 36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 75b8d81b24692c..68e964394e9173 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2968,6 +2968,7 @@ extern struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 extern int  ext4_write_inode(struct inode *, struct writeback_control *);
 extern int  ext4_setattr(struct user_namespace *, struct dentry *,
 			 struct iattr *);
+extern u32  ext4_dio_alignment(struct inode *inode);
 extern int  ext4_getattr(struct user_namespace *, const struct path *,
 			 struct kstat *, u32, unsigned int);
 extern void ext4_evict_inode(struct inode *);
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 26d7426208970d..8bb1c35fd6dd5a 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -36,24 +36,34 @@
 #include "acl.h"
 #include "truncate.h"
 
-static bool ext4_dio_supported(struct kiocb *iocb, struct iov_iter *iter)
+/*
+ * Returns %true if the given DIO request should be attempted with DIO, or
+ * %false if it should fall back to buffered I/O.
+ *
+ * DIO isn't well specified; when it's unsupported (either due to the request
+ * being misaligned, or due to the file not supporting DIO at all), filesystems
+ * either fall back to buffered I/O or return EINVAL.  For files that don't use
+ * any special features like encryption or verity, ext4 has traditionally
+ * returned EINVAL for misaligned DIO.  iomap_dio_rw() uses this convention too.
+ * In this case, we should attempt the DIO, *not* fall back to buffered I/O.
+ *
+ * In contrast, in cases where DIO is unsupported due to ext4 features, ext4
+ * traditionally falls back to buffered I/O.
+ *
+ * This function implements the traditional ext4 behavior in all these cases.
+ */
+static bool ext4_should_use_dio(struct kiocb *iocb, struct iov_iter *iter)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
+	u32 dio_align = ext4_dio_alignment(inode);
 
-	if (IS_ENCRYPTED(inode)) {
-		if (!fscrypt_dio_supported(inode))
-			return false;
-		if (!IS_ALIGNED(iocb->ki_pos | iov_iter_alignment(iter),
-				i_blocksize(inode)))
-			return false;
-	}
-	if (fsverity_active(inode))
+	if (dio_align == 0)
 		return false;
-	if (ext4_should_journal_data(inode))
-		return false;
-	if (ext4_has_inline_data(inode))
-		return false;
-	return true;
+
+	if (dio_align == 1)
+		return true;
+
+	return IS_ALIGNED(iocb->ki_pos | iov_iter_alignment(iter), dio_align);
 }
 
 static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
@@ -68,7 +78,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		inode_lock_shared(inode);
 	}
 
-	if (!ext4_dio_supported(iocb, to)) {
+	if (!ext4_should_use_dio(iocb, to)) {
 		inode_unlock_shared(inode);
 		/*
 		 * Fallback to buffered I/O if the operation being performed on
@@ -516,7 +526,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	}
 
 	/* Fallback to buffered I/O if the inode does not support direct I/O. */
-	if (!ext4_dio_supported(iocb, from)) {
+	if (!ext4_should_use_dio(iocb, from)) {
 		if (ilock_shared)
 			inode_unlock_shared(inode);
 		else
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 84c0eb55071d65..75dd332e9da57b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5536,6 +5536,22 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 	return error;
 }
 
+u32 ext4_dio_alignment(struct inode *inode)
+{
+	if (fsverity_active(inode))
+		return 0;
+	if (ext4_should_journal_data(inode))
+		return 0;
+	if (ext4_has_inline_data(inode))
+		return 0;
+	if (IS_ENCRYPTED(inode)) {
+		if (!fscrypt_dio_supported(inode))
+			return 0;
+		return i_blocksize(inode);
+	}
+	return 1; /* use the iomap defaults */
+}
+
 int ext4_getattr(struct user_namespace *mnt_userns, const struct path *path,
 		 struct kstat *stat, u32 request_mask, unsigned int query_flags)
 {
@@ -5551,6 +5567,26 @@ int ext4_getattr(struct user_namespace *mnt_userns, const struct path *path,
 		stat->btime.tv_nsec = ei->i_crtime.tv_nsec;
 	}
 
+	/*
+	 * Return the DIO alignment restrictions if requested.  We only return
+	 * this information when requested, since on encrypted files it might
+	 * take a fair bit of work to get if the file wasn't opened recently.
+	 */
+	if ((request_mask & STATX_DIOALIGN) && S_ISREG(inode->i_mode)) {
+		u32 dio_align = ext4_dio_alignment(inode);
+		unsigned int lbs = bdev_logical_block_size(inode->i_sb->s_bdev);
+
+		stat->result_mask |= STATX_DIOALIGN;
+		if (dio_align == 1) {
+			/* iomap defaults */
+			stat->dio_mem_align = lbs;
+			stat->dio_offset_align = lbs;
+		} else {
+			stat->dio_mem_align = dio_align;
+			stat->dio_offset_align = dio_align;
+		}
+	}
+
 	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
 	if (flags & EXT4_APPEND_FL)
 		stat->attributes |= STATX_ATTR_APPEND;
-- 
2.37.0



* [PATCH v4 5/9] f2fs: move f2fs_force_buffered_io() into file.c
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
                   ` (3 preceding siblings ...)
  2022-07-22  7:12 ` [PATCH v4 4/9] ext4: support STATX_DIOALIGN Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-22  7:12 ` [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes Eric Biggers
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

From: Eric Biggers <ebiggers@google.com>

f2fs_force_buffered_io() is only used in file.c, so move it there.
No behavior change.  This makes it easier to review later patches.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/f2fs/f2fs.h | 45 ---------------------------------------------
 fs/f2fs/file.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 7869e749700fc2..d187b7d7ed2435 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4426,17 +4426,6 @@ static inline void f2fs_i_compr_blocks_update(struct inode *inode,
 	f2fs_mark_inode_dirty_sync(inode, true);
 }
 
-static inline int block_unaligned_IO(struct inode *inode,
-				struct kiocb *iocb, struct iov_iter *iter)
-{
-	unsigned int i_blkbits = READ_ONCE(inode->i_blkbits);
-	unsigned int blocksize_mask = (1 << i_blkbits) - 1;
-	loff_t offset = iocb->ki_pos;
-	unsigned long align = offset | iov_iter_alignment(iter);
-
-	return align & blocksize_mask;
-}
-
 static inline bool f2fs_allow_multi_device_dio(struct f2fs_sb_info *sbi,
 								int flag)
 {
@@ -4447,40 +4436,6 @@ static inline bool f2fs_allow_multi_device_dio(struct f2fs_sb_info *sbi,
 	return sbi->aligned_blksize;
 }
 
-static inline bool f2fs_force_buffered_io(struct inode *inode,
-				struct kiocb *iocb, struct iov_iter *iter)
-{
-	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-	int rw = iov_iter_rw(iter);
-
-	if (!fscrypt_dio_supported(inode))
-		return true;
-	if (fsverity_active(inode))
-		return true;
-	if (f2fs_compressed_file(inode))
-		return true;
-
-	/* disallow direct IO if any of devices has unaligned blksize */
-	if (f2fs_is_multi_device(sbi) && !sbi->aligned_blksize)
-		return true;
-	/*
-	 * for blkzoned device, fallback direct IO to buffered IO, so
-	 * all IOs can be serialized by log-structured write.
-	 */
-	if (f2fs_sb_has_blkzoned(sbi))
-		return true;
-	if (f2fs_lfs_mode(sbi) && (rw == WRITE)) {
-		if (block_unaligned_IO(inode, iocb, iter))
-			return true;
-		if (F2FS_IO_ALIGNED(sbi))
-			return true;
-	}
-	if (is_sbi_flag_set(F2FS_I_SB(inode), SBI_CP_DISABLED))
-		return true;
-
-	return false;
-}
-
 static inline bool f2fs_need_verity(const struct inode *inode, pgoff_t idx)
 {
 	return fsverity_active(inode) &&
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index bd14cef1b08fd2..5e5c97fccfb4ee 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -808,6 +808,51 @@ int f2fs_truncate(struct inode *inode)
 	return 0;
 }
 
+static int block_unaligned_IO(struct inode *inode, struct kiocb *iocb,
+			      struct iov_iter *iter)
+{
+	unsigned int i_blkbits = READ_ONCE(inode->i_blkbits);
+	unsigned int blocksize_mask = (1 << i_blkbits) - 1;
+	loff_t offset = iocb->ki_pos;
+	unsigned long align = offset | iov_iter_alignment(iter);
+
+	return align & blocksize_mask;
+}
+
+static inline bool f2fs_force_buffered_io(struct inode *inode,
+				struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+	int rw = iov_iter_rw(iter);
+
+	if (!fscrypt_dio_supported(inode))
+		return true;
+	if (fsverity_active(inode))
+		return true;
+	if (f2fs_compressed_file(inode))
+		return true;
+
+	/* disallow direct IO if any of devices has unaligned blksize */
+	if (f2fs_is_multi_device(sbi) && !sbi->aligned_blksize)
+		return true;
+	/*
+	 * for blkzoned device, fallback direct IO to buffered IO, so
+	 * all IOs can be serialized by log-structured write.
+	 */
+	if (f2fs_sb_has_blkzoned(sbi))
+		return true;
+	if (f2fs_lfs_mode(sbi) && (rw == WRITE)) {
+		if (block_unaligned_IO(inode, iocb, iter))
+			return true;
+		if (F2FS_IO_ALIGNED(sbi))
+			return true;
+	}
+	if (is_sbi_flag_set(F2FS_I_SB(inode), SBI_CP_DISABLED))
+		return true;
+
+	return false;
+}
+
 int f2fs_getattr(struct user_namespace *mnt_userns, const struct path *path,
 		 struct kstat *stat, u32 request_mask, unsigned int query_flags)
 {
-- 
2.37.0



* [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
                   ` (4 preceding siblings ...)
  2022-07-22  7:12 ` [PATCH v4 5/9] f2fs: move f2fs_force_buffered_io() into file.c Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-24  2:01   ` Jaegeuk Kim
  2022-07-22  7:12 ` [PATCH v4 7/9] f2fs: simplify f2fs_force_buffered_io() Eric Biggers
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

From: Eric Biggers <ebiggers@google.com>

Currently, if an f2fs filesystem is mounted with the mode=lfs and
io_bits mount options, DIO reads are allowed but DIO writes are not.
Allowing DIO reads but not DIO writes is an unusual restriction, which
is likely to be surprising to applications, namely any application that
both reads and writes from a file (using O_DIRECT).  This behavior is
also incompatible with the proposed STATX_DIOALIGN extension to statx.
Given this, let's drop the support for DIO reads in this configuration.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/f2fs/file.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 5e5c97fccfb4ee..ad0212848a1ab9 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -823,7 +823,6 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
 				struct kiocb *iocb, struct iov_iter *iter)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-	int rw = iov_iter_rw(iter);
 
 	if (!fscrypt_dio_supported(inode))
 		return true;
@@ -841,7 +840,7 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
 	 */
 	if (f2fs_sb_has_blkzoned(sbi))
 		return true;
-	if (f2fs_lfs_mode(sbi) && (rw == WRITE)) {
+	if (f2fs_lfs_mode(sbi)) {
 		if (block_unaligned_IO(inode, iocb, iter))
 			return true;
 		if (F2FS_IO_ALIGNED(sbi))
-- 
2.37.0



* [PATCH v4 7/9] f2fs: simplify f2fs_force_buffered_io()
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
                   ` (5 preceding siblings ...)
  2022-07-22  7:12 ` [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-22  7:12 ` [PATCH v4 8/9] f2fs: support STATX_DIOALIGN Eric Biggers
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

From: Eric Biggers <ebiggers@google.com>

f2fs only allows direct I/O that is aligned to the filesystem block
size.  Given that fact, simplify f2fs_force_buffered_io() by removing
the redundant call to block_unaligned_IO().

This makes it easier to reuse this code for STATX_DIOALIGN.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/f2fs/file.c | 24 ++++--------------------
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ad0212848a1ab9..1b452bb75af29e 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -808,19 +808,7 @@ int f2fs_truncate(struct inode *inode)
 	return 0;
 }
 
-static int block_unaligned_IO(struct inode *inode, struct kiocb *iocb,
-			      struct iov_iter *iter)
-{
-	unsigned int i_blkbits = READ_ONCE(inode->i_blkbits);
-	unsigned int blocksize_mask = (1 << i_blkbits) - 1;
-	loff_t offset = iocb->ki_pos;
-	unsigned long align = offset | iov_iter_alignment(iter);
-
-	return align & blocksize_mask;
-}
-
-static inline bool f2fs_force_buffered_io(struct inode *inode,
-				struct kiocb *iocb, struct iov_iter *iter)
+static bool f2fs_force_buffered_io(struct inode *inode)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 
@@ -840,12 +828,8 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
 	 */
 	if (f2fs_sb_has_blkzoned(sbi))
 		return true;
-	if (f2fs_lfs_mode(sbi)) {
-		if (block_unaligned_IO(inode, iocb, iter))
-			return true;
-		if (F2FS_IO_ALIGNED(sbi))
-			return true;
-	}
+	if (f2fs_lfs_mode(sbi) && F2FS_IO_ALIGNED(sbi))
+		return true;
 	if (is_sbi_flag_set(F2FS_I_SB(inode), SBI_CP_DISABLED))
 		return true;
 
@@ -4205,7 +4189,7 @@ static bool f2fs_should_use_dio(struct inode *inode, struct kiocb *iocb,
 	if (!(iocb->ki_flags & IOCB_DIRECT))
 		return false;
 
-	if (f2fs_force_buffered_io(inode, iocb, iter))
+	if (f2fs_force_buffered_io(inode))
 		return false;
 
 	/*
-- 
2.37.0
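
For illustration only, the alignment test that the removed
block_unaligned_IO() helper performed can be sketched in userspace C.
ORing the file offset, buffer address, and length together keeps every
low bit that is set in any of them, so a single mask test covers all
three.  The function name and the single-buffer signature are assumptions
made for this sketch; the kernel helper works on a kiocb and an iov_iter
instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue of block_unaligned_IO().
 * blkbits is log2 of the block size (e.g. 12 for 4096-byte blocks).
 * Returns true if the offset, the buffer address, or the length is not
 * a multiple of the block size.
 */
static bool io_is_block_unaligned(uint64_t offset, const void *buf,
				  size_t len, unsigned int blkbits)
{
	uint64_t mask = ((uint64_t)1 << blkbits) - 1;
	/* Any low bit set in any operand survives the OR. */
	uint64_t align = offset | (uint64_t)(uintptr_t)buf | (uint64_t)len;

	return (align & mask) != 0;
}
```

The trick relies on the block size being a power of two, which is why
the mask can be built from i_blkbits.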



* [PATCH v4 8/9] f2fs: support STATX_DIOALIGN
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
                   ` (6 preceding siblings ...)
  2022-07-22  7:12 ` [PATCH v4 7/9] f2fs: simplify f2fs_force_buffered_io() Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-22  7:12 ` [PATCH v4 9/9] xfs: " Eric Biggers
  2022-08-26 17:19 ` [PATCH v4 0/9] make statx() return DIO alignment information Jeff Layton
  9 siblings, 0 replies; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

From: Eric Biggers <ebiggers@google.com>

Add support for STATX_DIOALIGN to f2fs, so that direct I/O alignment
restrictions are exposed to userspace in a generic way.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/f2fs/file.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 1b452bb75af29e..11d75aa3da185a 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -852,6 +852,21 @@ int f2fs_getattr(struct user_namespace *mnt_userns, const struct path *path,
 		stat->btime.tv_nsec = fi->i_crtime.tv_nsec;
 	}
 
+	/*
+	 * Return the DIO alignment restrictions if requested.  We only return
+	 * this information when requested, since on encrypted files it might
+	 * take a fair bit of work to get if the file wasn't opened recently.
+	 */
+	if ((request_mask & STATX_DIOALIGN) && S_ISREG(inode->i_mode)) {
+		unsigned int bsize = i_blocksize(inode);
+
+		stat->result_mask |= STATX_DIOALIGN;
+		if (!f2fs_force_buffered_io(inode)) {
+			stat->dio_mem_align = bsize;
+			stat->dio_offset_align = bsize;
+		}
+	}
+
 	flags = fi->i_flags;
 	if (flags & F2FS_COMPR_FL)
 		stat->attributes |= STATX_ATTR_COMPRESSED;
-- 
2.37.0



* [PATCH v4 9/9] xfs: support STATX_DIOALIGN
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
                   ` (7 preceding siblings ...)
  2022-07-22  7:12 ` [PATCH v4 8/9] f2fs: support STATX_DIOALIGN Eric Biggers
@ 2022-07-22  7:12 ` Eric Biggers
  2022-07-22  8:11   ` Christoph Hellwig
  2022-07-22 16:24   ` Darrick J. Wong
  2022-08-26 17:19 ` [PATCH v4 0/9] make statx() return DIO alignment information Jeff Layton
  9 siblings, 2 replies; 32+ messages in thread
From: Eric Biggers @ 2022-07-22  7:12 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

From: Eric Biggers <ebiggers@google.com>

Add support for STATX_DIOALIGN to xfs, so that direct I/O alignment
restrictions are exposed to userspace in a generic way.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/xfs/xfs_iops.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 29f5b8b8aca69a..bac3f56141801e 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -605,6 +605,15 @@ xfs_vn_getattr(
 		stat->blksize = BLKDEV_IOSIZE;
 		stat->rdev = inode->i_rdev;
 		break;
+	case S_IFREG:
+		if (request_mask & STATX_DIOALIGN) {
+			struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
+
+			stat->result_mask |= STATX_DIOALIGN;
+			stat->dio_mem_align = target->bt_logical_sectorsize;
+			stat->dio_offset_align = target->bt_logical_sectorsize;
+		}
+		fallthrough;
 	default:
 		stat->blksize = xfs_stat_blksize(ip);
 		stat->rdev = 0;
-- 
2.37.0



* Re: [PATCH v4 2/9] vfs: support STATX_DIOALIGN on block devices
  2022-07-22  7:12 ` [PATCH v4 2/9] vfs: support STATX_DIOALIGN on block devices Eric Biggers
@ 2022-07-22  8:10   ` Christoph Hellwig
  2022-07-22 17:32   ` Martin K. Petersen
  1 sibling, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-07-22  8:10 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 3/9] fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN
  2022-07-22  7:12 ` [PATCH v4 3/9] fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN Eric Biggers
@ 2022-07-22  8:10   ` Christoph Hellwig
  0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-07-22  8:10 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 9/9] xfs: support STATX_DIOALIGN
  2022-07-22  7:12 ` [PATCH v4 9/9] xfs: " Eric Biggers
@ 2022-07-22  8:11   ` Christoph Hellwig
  2022-07-22 16:24   ` Darrick J. Wong
  1 sibling, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-07-22  8:11 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On Fri, Jul 22, 2022 at 12:12:28AM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> Add support for STATX_DIOALIGN to xfs, so that direct I/O alignment
> restrictions are exposed to userspace in a generic way.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 9/9] xfs: support STATX_DIOALIGN
  2022-07-22  7:12 ` [PATCH v4 9/9] xfs: " Eric Biggers
  2022-07-22  8:11   ` Christoph Hellwig
@ 2022-07-22 16:24   ` Darrick J. Wong
  1 sibling, 0 replies; 32+ messages in thread
From: Darrick J. Wong @ 2022-07-22 16:24 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On Fri, Jul 22, 2022 at 12:12:28AM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> Add support for STATX_DIOALIGN to xfs, so that direct I/O alignment
> restrictions are exposed to userspace in a generic way.
> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>

LGTM
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_iops.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 29f5b8b8aca69a..bac3f56141801e 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -605,6 +605,15 @@ xfs_vn_getattr(
>  		stat->blksize = BLKDEV_IOSIZE;
>  		stat->rdev = inode->i_rdev;
>  		break;
> +	case S_IFREG:
> +		if (request_mask & STATX_DIOALIGN) {
> +			struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
> +
> +			stat->result_mask |= STATX_DIOALIGN;
> +			stat->dio_mem_align = target->bt_logical_sectorsize;
> +			stat->dio_offset_align = target->bt_logical_sectorsize;
> +		}
> +		fallthrough;
>  	default:
>  		stat->blksize = xfs_stat_blksize(ip);
>  		stat->rdev = 0;
> -- 
> 2.37.0
> 


* Re: [PATCH v4 1/9] statx: add direct I/O alignment information
  2022-07-22  7:12 ` [PATCH v4 1/9] statx: add direct I/O " Eric Biggers
@ 2022-07-22 16:32   ` Darrick J. Wong
  2022-07-22 17:31   ` Martin K. Petersen
  1 sibling, 0 replies; 32+ messages in thread
From: Darrick J. Wong @ 2022-07-22 16:32 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch,
	Christoph Hellwig

On Fri, Jul 22, 2022 at 12:12:20AM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> Traditionally, the conditions for when DIO (direct I/O) is supported
> were fairly simple.  For both block devices and regular files, DIO had
> to be aligned to the logical block size of the block device.
> 
> However, due to filesystem features that have been added over time (e.g.
> multi-device support, data journalling, inline data, encryption, verity,
> compression, checkpoint disabling, log-structured mode), the conditions
> for when DIO is allowed on a regular file have gotten increasingly
> complex.  Whether a particular regular file supports DIO, and with what
> alignment, can depend on various file attributes and filesystem mount
> options, as well as which block device(s) the file's data is located on.
> 
> Moreover, the general rule of DIO needing to be aligned to the block
> device's logical block size is being relaxed to allow user buffers (but
> not file offsets) aligned to the DMA alignment instead
> (https://lore.kernel.org/linux-block/20220610195830.3574005-1-kbusch@fb.com/T/#u).
> 
> XFS has an ioctl XFS_IOC_DIOINFO that exposes DIO alignment information.
> Uplifting this to the VFS is one possibility.  However, as discussed
> (https://lore.kernel.org/linux-fsdevel/20220120071215.123274-1-ebiggers@kernel.org/T/#u),
> this ioctl is rarely used and not known to be used outside of
> XFS-specific code.  It was also never intended to indicate when a file
> doesn't support DIO at all, nor was it intended for block devices.
> 
> Therefore, let's expose this information via statx().  Add the
> STATX_DIOALIGN flag and two new statx fields associated with it:
> 
> * stx_dio_mem_align: the alignment (in bytes) required for user memory
>   buffers for DIO, or 0 if DIO is not supported on the file.
> 
> * stx_dio_offset_align: the alignment (in bytes) required for file
>   offsets and I/O segment lengths for DIO, or 0 if DIO is not supported
>   on the file.  This will only be nonzero if stx_dio_mem_align is
>   nonzero, and vice versa.
> 
> Note that as with other statx() extensions, if STATX_DIOALIGN isn't set
> in the returned statx struct, then these new fields won't be filled in.
> This will happen if the file is neither a regular file nor a block
> device, or if the file is a regular file and the filesystem doesn't
> support STATX_DIOALIGN.  It might also happen if the caller didn't
> include STATX_DIOALIGN in the request mask, since statx() isn't required
> to return unrequested information.
> 
> This commit only adds the VFS-level plumbing for STATX_DIOALIGN.  For
> regular files, individual filesystems will still need to add code to
> support it.  For block devices, a separate commit will wire it up too.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Looks good to me,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/stat.c                 | 2 ++
>  include/linux/stat.h      | 2 ++
>  include/uapi/linux/stat.h | 4 +++-
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/stat.c b/fs/stat.c
> index 9ced8860e0f35d..a7930d74448304 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -611,6 +611,8 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
>  	tmp.stx_dev_major = MAJOR(stat->dev);
>  	tmp.stx_dev_minor = MINOR(stat->dev);
>  	tmp.stx_mnt_id = stat->mnt_id;
> +	tmp.stx_dio_mem_align = stat->dio_mem_align;
> +	tmp.stx_dio_offset_align = stat->dio_offset_align;
>  
>  	return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
>  }
> diff --git a/include/linux/stat.h b/include/linux/stat.h
> index 7df06931f25d85..ff277ced50e9fd 100644
> --- a/include/linux/stat.h
> +++ b/include/linux/stat.h
> @@ -50,6 +50,8 @@ struct kstat {
>  	struct timespec64 btime;			/* File creation time */
>  	u64		blocks;
>  	u64		mnt_id;
> +	u32		dio_mem_align;
> +	u32		dio_offset_align;
>  };
>  
>  #endif
> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> index 1500a0f58041ae..7cab2c65d3d7fc 100644
> --- a/include/uapi/linux/stat.h
> +++ b/include/uapi/linux/stat.h
> @@ -124,7 +124,8 @@ struct statx {
>  	__u32	stx_dev_minor;
>  	/* 0x90 */
>  	__u64	stx_mnt_id;
> -	__u64	__spare2;
> +	__u32	stx_dio_mem_align;	/* Memory buffer alignment for direct I/O */
> +	__u32	stx_dio_offset_align;	/* File offset alignment for direct I/O */
>  	/* 0xa0 */
>  	__u64	__spare3[12];	/* Spare space for future expansion */
>  	/* 0x100 */
> @@ -152,6 +153,7 @@ struct statx {
>  #define STATX_BASIC_STATS	0x000007ffU	/* The stuff in the normal stat struct */
>  #define STATX_BTIME		0x00000800U	/* Want/got stx_btime */
>  #define STATX_MNT_ID		0x00001000U	/* Got stx_mnt_id */
> +#define STATX_DIOALIGN		0x00002000U	/* Want/got direct I/O alignment info */
>  
>  #define STATX__RESERVED		0x80000000U	/* Reserved for future struct statx expansion */
>  
> -- 
> 2.37.0
> 
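
As a rough illustration of how userspace might consume the new fields,
the sketch below queries STATX_DIOALIGN through the raw statx system
call.  The helper name query_dio_align() and its return convention are
assumptions made for this example, not part of the patch.  It also
assumes the installed kernel headers provide struct statx; if they
predate STATX_DIOALIGN, the sketch simply reports that the information
is unavailable.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>		/* AT_FDCWD */
#include <stdint.h>
#include <linux/stat.h>		/* struct statx, STATX_* */
#include <sys/syscall.h>	/* SYS_statx */
#include <unistd.h>		/* syscall() */

/*
 * Hypothetical helper: returns 1 and fills in the alignments if the
 * kernel reported STATX_DIOALIGN, 0 if it did not, -1 on error.
 * A reported alignment of 0 means DIO is not supported on the file.
 */
static int query_dio_align(const char *path, uint32_t *mem_align,
			   uint32_t *offset_align)
{
#if defined(STATX_DIOALIGN) && defined(SYS_statx)
	struct statx stx;

	if (syscall(SYS_statx, AT_FDCWD, path, 0, STATX_DIOALIGN, &stx) != 0)
		return -1;
	/* The fields are only valid if the bit is set in the result mask. */
	if (!(stx.stx_mask & STATX_DIOALIGN))
		return 0;
	*mem_align = stx.stx_dio_mem_align;
	*offset_align = stx.stx_dio_offset_align;
	return 1;
#else
	/* Headers too old to know about STATX_DIOALIGN. */
	(void)path;
	(void)mem_align;
	(void)offset_align;
	return 0;
#endif
}
```

Checking stx_mask before reading the fields matters because, as the
commit message notes, the fields are not filled in when the file type or
filesystem does not support STATX_DIOALIGN.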


* Re: [PATCH v4 4/9] ext4: support STATX_DIOALIGN
  2022-07-22  7:12 ` [PATCH v4 4/9] ext4: support STATX_DIOALIGN Eric Biggers
@ 2022-07-22 17:05   ` Theodore Ts'o
  0 siblings, 0 replies; 32+ messages in thread
From: Theodore Ts'o @ 2022-07-22 17:05 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On Fri, Jul 22, 2022 at 12:12:23AM -0700, Eric Biggers wrote:
> -static bool ext4_dio_supported(struct kiocb *iocb, struct iov_iter *iter)
> +/*
> + * Returns %true if the given DIO request should be attempted with DIO, or
> + * %false if it should fall back to buffered I/O.
> + *
> + * DIO isn't well specified; when it's unsupported (either due to the request
> + * being misaligned, or due to the file not supporting DIO at all), filesystems
> + * either fall back to buffered I/O or return EINVAL.  For files that don't use
> + * any special features like encryption or verity, ext4 has traditionally
> + * returned EINVAL for misaligned DIO.  iomap_dio_rw() uses this convention too.
> + * In this case, we should attempt the DIO, *not* fall back to buffered I/O.
> + *
> + * In contrast, in cases where DIO is unsupported due to ext4 features, ext4
> + * traditionally falls back to buffered I/O.
> + *
> + * This function implements the traditional ext4 behavior in all these cases.

Heh.  I had been under the impression that misaligned I/O fell back to
buffered I/O for ext4, since that's what a lot of historical Unix
systems did.  Obviously, it's not something I've tested since "you
should never do that".

There's actually some interesting discussion about what Linux *should*
be doing in the future in this thread:

https://patchwork.ozlabs.org/project/linux-ext4/patch/1461472078-20104-1-git-send-email-tytso@mit.edu/

Including the following from Christoph Hellwig:

https://patchwork.ozlabs.org/project/linux-ext4/patch/1461472078-20104-1-git-send-email-tytso@mit.edu/#1335016

> I've been doing an audit of our direct I/O implementations, and most
> of them do some form of transparent fallback, including some that
> only pretend to support O_DIRECT, but don't do anything special for it at
> all, while at the same time we go through great efforts to check a file
> system actually supports direct I/O, leading to nasty no-op ->direct_IO
> implementations as we even got that abstraction wrong.
> 
> At this point I wonder if we should simply treat O_DIRECT as a hint
> and always allow it, and just let the file system optimize for it
> (skip buffering, require alignment, relaxed POSIX atomicity requirements)
> if it is set.

The thread also mentioned XFS_IOC_DIOINFO and how We Really Should
have something with equivalent functionality to the VFS --- six years
ago.  :-)


Anyway, this change to ext4 looks good.

Acked-by: Theodore Ts'o <tytso@mit.edu>

							- Ted


* Re: [PATCH v4 1/9] statx: add direct I/O alignment information
  2022-07-22  7:12 ` [PATCH v4 1/9] statx: add direct I/O " Eric Biggers
  2022-07-22 16:32   ` Darrick J. Wong
@ 2022-07-22 17:31   ` Martin K. Petersen
  1 sibling, 0 replies; 32+ messages in thread
From: Martin K. Petersen @ 2022-07-22 17:31 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch,
	Christoph Hellwig


Eric,

> Therefore, let's expose this information via statx().  Add the
> STATX_DIOALIGN flag and two new statx fields associated with it:
>
> * stx_dio_mem_align: the alignment (in bytes) required for user memory
>   buffers for DIO, or 0 if DIO is not supported on the file.
>
> * stx_dio_offset_align: the alignment (in bytes) required for file
>   offsets and I/O segment lengths for DIO, or 0 if DIO is not supported
>   on the file.  This will only be nonzero if stx_dio_mem_align is
>   nonzero, and vice versa.

Nice to finally have a generic interface for this!

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering
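
To make the field semantics quoted above concrete, here is a minimal
sketch of how an application could validate one request against them.
The function name dio_request_ok() is an assumption made for this
example.  It follows the description literally: stx_dio_mem_align
constrains the user buffer address, stx_dio_offset_align constrains both
the file offset and the I/O length, and 0 means DIO is not supported.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check of a single-buffer DIO request against the values
 * reported in stx_dio_mem_align and stx_dio_offset_align.
 */
static bool dio_request_ok(uint32_t mem_align, uint32_t offset_align,
			   const void *buf, uint64_t offset, size_t len)
{
	/* Zero in either field means DIO is not supported on the file. */
	if (mem_align == 0 || offset_align == 0)
		return false;
	if ((uintptr_t)buf % mem_align != 0)
		return false;
	if (offset % offset_align != 0)
		return false;
	if (len % offset_align != 0)
		return false;
	return true;
}
```

Modulo is used rather than a bit mask so that the sketch does not
depend on the reported alignments being powers of two.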


* Re: [PATCH v4 2/9] vfs: support STATX_DIOALIGN on block devices
  2022-07-22  7:12 ` [PATCH v4 2/9] vfs: support STATX_DIOALIGN on block devices Eric Biggers
  2022-07-22  8:10   ` Christoph Hellwig
@ 2022-07-22 17:32   ` Martin K. Petersen
  1 sibling, 0 replies; 32+ messages in thread
From: Martin K. Petersen @ 2022-07-22 17:32 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch


Eric,

> Add support for STATX_DIOALIGN to block devices, so that direct I/O
> alignment restrictions are exposed to userspace in a generic way.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-07-22  7:12 ` [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes Eric Biggers
@ 2022-07-24  2:01   ` Jaegeuk Kim
  2022-07-25 18:12     ` Eric Biggers
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2022-07-24  2:01 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On 07/22, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> Currently, if an f2fs filesystem is mounted with the mode=lfs and
> io_bits mount options, DIO reads are allowed but DIO writes are not.
> Allowing DIO reads but not DIO writes is an unusual restriction, which
> is likely to be surprising to applications, namely any application that
> both reads and writes from a file (using O_DIRECT).  This behavior is
> also incompatible with the proposed STATX_DIOALIGN extension to statx.
> Given this, let's drop the support for DIO reads in this configuration.

IIRC, we allowed DIO reads since applications complained about lower
performance.  So, I'm afraid this change will cause more confusion for users.
Could you please apply the new behavior only for STATX_DIOALIGN?

> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  fs/f2fs/file.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 5e5c97fccfb4ee..ad0212848a1ab9 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -823,7 +823,6 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
>  				struct kiocb *iocb, struct iov_iter *iter)
>  {
>  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> -	int rw = iov_iter_rw(iter);
>  
>  	if (!fscrypt_dio_supported(inode))
>  		return true;
> @@ -841,7 +840,7 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
>  	 */
>  	if (f2fs_sb_has_blkzoned(sbi))
>  		return true;
> -	if (f2fs_lfs_mode(sbi) && (rw == WRITE)) {
> +	if (f2fs_lfs_mode(sbi)) {
>  		if (block_unaligned_IO(inode, iocb, iter))
>  			return true;
>  		if (F2FS_IO_ALIGNED(sbi))
> -- 
> 2.37.0


* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-07-24  2:01   ` Jaegeuk Kim
@ 2022-07-25 18:12     ` Eric Biggers
  2022-07-25 23:58       ` Andreas Dilger
  2022-07-31  3:08       ` Jaegeuk Kim
  0 siblings, 2 replies; 32+ messages in thread
From: Eric Biggers @ 2022-07-25 18:12 UTC (permalink / raw)
  To: Jaegeuk Kim
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> On 07/22, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > is likely to be surprising to applications, namely any application that
> > both reads and writes from a file (using O_DIRECT).  This behavior is
> > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > Given this, let's drop the support for DIO reads in this configuration.
> 
> > IIRC, we allowed DIO reads since applications complained about lower
> > performance.  So, I'm afraid this change will cause more confusion for
> > users.  Could you please apply the new behavior only for STATX_DIOALIGN?
> 

Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
weird case where DIO reads are allowed but not DIO writes.  So the question is
whether this case actually matters, in which case we should make STATX_DIOALIGN
distinguish between DIO reads and DIO writes, or whether it's some odd edge case
that doesn't really matter, in which case we could just fix it or make
STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
insight here.  What sort of applications want DIO reads but not DIO writes?
Is this common at all?

- Eric
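
The representability problem can be shown with a small sketch.  A single
pair of alignment values has no room for "reads yes, writes no", so one
of the options mentioned above is to report DIO as unsupported unless
both directions are allowed.  The struct and function names here are
hypothetical, and this models only that one option, not a decision
reached in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two STATX_DIOALIGN fields. */
struct dio_align {
	uint32_t mem_align;	/* 0 means DIO unsupported */
	uint32_t offset_align;	/* 0 means DIO unsupported */
};

/*
 * Collapse per-direction DIO support into the single pair of fields.
 * If either direction is unsupported, both fields stay 0.
 */
static struct dio_align report_dio_align(bool read_ok, bool write_ok,
					 uint32_t bsize)
{
	struct dio_align a = { 0, 0 };

	if (read_ok && write_ok) {
		a.mem_align = bsize;
		a.offset_align = bsize;
	}
	return a;
}
```

Under this model a file that allows only DIO reads is indistinguishable
from one that allows no DIO at all, which is the information loss being
discussed.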


* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-07-25 18:12     ` Eric Biggers
@ 2022-07-25 23:58       ` Andreas Dilger
  2022-07-31  3:08       ` Jaegeuk Kim
  1 sibling, 0 replies; 32+ messages in thread
From: Andreas Dilger @ 2022-07-25 23:58 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Jaegeuk Kim, linux-fsdevel, linux-ext4, linux-f2fs-devel,
	linux-xfs, linux-api, linux-fscrypt, linux-block, linux-kernel,
	Keith Busch

On Jul 25, 2022, at 12:12 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> 
> On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
>> On 07/22, Eric Biggers wrote:
>>> From: Eric Biggers <ebiggers@google.com>
>>> 
>>> Currently, if an f2fs filesystem is mounted with the mode=lfs and
>>> io_bits mount options, DIO reads are allowed but DIO writes are not.
>>> Allowing DIO reads but not DIO writes is an unusual restriction, which
>>> is likely to be surprising to applications, namely any application that
>>> both reads and writes from a file (using O_DIRECT).  This behavior is
>>> also incompatible with the proposed STATX_DIOALIGN extension to statx.
>>> Given this, let's drop the support for DIO reads in this configuration.
>> 
>> IIRC, we allowed DIO reads since applications complained about lower
>> performance.  So, I'm afraid this change will cause more confusion for
>> users.  Could you please apply the new behavior only for STATX_DIOALIGN?
>> 
> 
> Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> weird case where DIO reads are allowed but not DIO writes.  So the question is
> whether this case actually matters, in which case we should make STATX_DIOALIGN
> distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> that doesn't really matter, in which case we could just fix it or make
> STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> insight here.  What sort of applications want DIO reads but not DIO writes?
> Is this common at all?

I don't think this is f2fs related, but some backup applications I'm aware
of are using DIO reads to avoid polluting the page cache when reading large
numbers of files. They don't care about DIO writes, since that is usually
slower than async writes due to the sync before returning from the syscall.

Also, IMHO it doesn't make sense to remove useful functionality because the
new STATX_DIOALIGN fields don't handle this.  At worst the application will
still get an error when trying a DIO write, but in most cases they will
not use the brand new STATX call in the first place, and if this is documented
then any application that starts to use it should be able to handle it.

Cheers, Andreas








* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-07-25 18:12     ` Eric Biggers
  2022-07-25 23:58       ` Andreas Dilger
@ 2022-07-31  3:08       ` Jaegeuk Kim
  2022-08-16  0:55         ` Eric Biggers
  1 sibling, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2022-07-31  3:08 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On 07/25, Eric Biggers wrote:
> On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > On 07/22, Eric Biggers wrote:
> > > From: Eric Biggers <ebiggers@google.com>
> > > 
> > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > is likely to be surprising to applications, namely any application that
> > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > Given this, let's drop the support for DIO reads in this configuration.
> > 
> > > IIRC, we allowed DIO reads since applications complained about lower
> > > performance.  So, I'm afraid this change will cause more confusion for
> > > users.  Could you please apply the new behavior only for STATX_DIOALIGN?
> > 
> 
> Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> weird case where DIO reads are allowed but not DIO writes.  So the question is
> whether this case actually matters, in which case we should make STATX_DIOALIGN
> distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> that doesn't really matter, in which case we could just fix it or make
> STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> insight here.  What sort of applications want DIO reads but not DIO writes?
> Is this common at all?

I don't think any specific application uses LFS mode at the moment,
but I'd like to allow DIO reads for zoned devices, which will be
used in Android devices.

> 
> - Eric


* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-07-31  3:08       ` Jaegeuk Kim
@ 2022-08-16  0:55         ` Eric Biggers
  2022-08-16  9:03           ` Dave Chinner
                             ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Eric Biggers @ 2022-08-16  0:55 UTC (permalink / raw)
  To: Jaegeuk Kim
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
> On 07/25, Eric Biggers wrote:
> > On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > > On 07/22, Eric Biggers wrote:
> > > > From: Eric Biggers <ebiggers@google.com>
> > > > 
> > > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > > is likely to be surprising to applications, namely any application that
> > > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > > Given this, let's drop the support for DIO reads in this configuration.
> > > 
> > > IIRC, we allowed DIO reads since applications complained a lower performance.
> > > So, I'm afraid this change will make another confusion to users. Could
> > > > you please apply the new behavior only for STATX_DIOALIGN?
> > > 
> > 
> > Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> > weird case where DIO reads are allowed but not DIO writes.  So the question is
> > whether this case actually matters, in which case we should make STATX_DIOALIGN
> > distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> > that doesn't really matter, in which case we could just fix it or make
> > STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> > insight here.  What sort of applications want DIO reads but not DIO writes?
> > Is this common at all?
> 
> I think there's no specific application to use the LFS mode at this
> moment, but I'd like to allow DIO read for zoned device which will be
> used for Android devices.
> 

So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
be useless on all Android devices?  That sounds undesirable.  Are you sure that
supporting DIO reads but not DIO writes actually works?  Does it not cause
problems for existing applications?

What we need to do is make a decision about whether this means we should build
in a stx_dio_direction field (indicating no support / readonly support /
writeonly support / readwrite support) into the API from the beginning.  If we
don't do that, then I don't think we could simply add such a field later, as the
statx_dio_*_align fields will have already been assigned their meaning.  I think
we'd instead have to "duplicate" the API, with STATX_DIOROALIGN and
statx_dio_ro_*_align fields.  That seems uglier than building a directional
indicator into the API from the beginning.  On the other hand, requiring all
programs to check stx_dio_direction would add complexity to using the API.
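
For illustration only, the directional indicator described above might look
like the following from userspace.  This is a minimal sketch: the enum values
and the helper dio_direction_allows() are hypothetical names, and no
stx_dio_direction field exists in struct statx.

```c
#include <stdbool.h>

/*
 * Hypothetical values for the stx_dio_direction field discussed above.
 * These names are illustrative only; nothing like this is defined in
 * any kernel header.
 */
enum dio_direction {
	DIO_DIR_NONE      = 0,	/* no DIO support */
	DIO_DIR_READONLY  = 1,	/* DIO reads only */
	DIO_DIR_WRITEONLY = 2,	/* DIO writes only */
	DIO_DIR_READWRITE = 3,	/* DIO_DIR_READONLY | DIO_DIR_WRITEONLY */
};

/* Return true if DIO in every requested direction would be permitted. */
static bool dio_direction_allows(enum dio_direction dir,
				 bool want_read, bool want_write)
{
	if (want_read && !(dir & DIO_DIR_READONLY))
		return false;
	if (want_write && !(dir & DIO_DIR_WRITEONLY))
		return false;
	return true;
}
```

The extra complexity mentioned above shows up here: every caller would have to
consult the direction before trusting the alignment fields.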

Any thoughts on this?

- Eric

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-08-16  0:55         ` Eric Biggers
@ 2022-08-16  9:03           ` Dave Chinner
  2022-08-16 16:42             ` Andreas Dilger
  2022-08-20  0:06           ` Jaegeuk Kim
  2022-08-21  8:53           ` Christoph Hellwig
  2 siblings, 1 reply; 32+ messages in thread
From: Dave Chinner @ 2022-08-16  9:03 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Jaegeuk Kim, linux-fsdevel, linux-ext4, linux-f2fs-devel,
	linux-xfs, linux-api, linux-fscrypt, linux-block, linux-kernel,
	Keith Busch

On Mon, Aug 15, 2022 at 05:55:45PM -0700, Eric Biggers wrote:
> On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
> > On 07/25, Eric Biggers wrote:
> > > On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > > > On 07/22, Eric Biggers wrote:
> > > > > From: Eric Biggers <ebiggers@google.com>
> > > > > 
> > > > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > > > is likely to be surprising to applications, namely any application that
> > > > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > > > Given this, let's drop the support for DIO reads in this configuration.
> > > > 
> > > > IIRC, we allowed DIO reads since applications complained a lower performance.
> > > > So, I'm afraid this change will make another confusion to users. Could
> > > > > you please apply the new behavior only for STATX_DIOALIGN?
> > > > 
> > > 
> > > Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> > > weird case where DIO reads are allowed but not DIO writes.  So the question is
> > > whether this case actually matters, in which case we should make STATX_DIOALIGN
> > > distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> > > that doesn't really matter, in which case we could just fix it or make
> > > STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> > > insight here.  What sort of applications want DIO reads but not DIO writes?
> > > Is this common at all?
> > 
> > I think there's no specific application to use the LFS mode at this
> > moment, but I'd like to allow DIO read for zoned device which will be
> > used for Android devices.
> > 
> 
> So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
> be useless on all Android devices?  That sounds undesirable.  Are you sure that
> supporting DIO reads but not DIO writes actually works?  Does it not cause
> problems for existing applications?

What purpose does DIO in only one direction actually serve? All it
means is that we're forcibly mixing buffered and direct IO to the
same file and that simply never ends well from a data coherency POV.

Hence I'd suggest that mixing DIO reads and buffered writes like
this ends up exposing users to the worst of both worlds - all of the
problems with none of the benefits...

> What we need to do is make a decision about whether this means we should build
> in a stx_dio_direction field (indicating no support / readonly support /
> writeonly support / readwrite support) into the API from the beginning.  If we
> don't do that, then I don't think we could simply add such a field later, as the
> statx_dio_*_align fields will have already been assigned their meaning.  I think
> we'd instead have to "duplicate" the API, with STATX_DIOROALIGN and
> statx_dio_ro_*_align fields.  That seems uglier than building a directional
> indicator into the API from the beginning.  On the other hand, requiring all
> programs to check stx_dio_direction would add complexity to using the API.
> 
> Any thoughts on this?

Decide whether partial, single direction DIO serves a useful purpose
before trying to work out what is needed in the API to indicate that
this sort of crazy will be supported....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-08-16  9:03           ` Dave Chinner
@ 2022-08-16 16:42             ` Andreas Dilger
  2022-08-19 23:09               ` Eric Biggers
  0 siblings, 1 reply; 32+ messages in thread
From: Andreas Dilger @ 2022-08-16 16:42 UTC (permalink / raw)
  To: Dave Chinner, Eric Biggers
  Cc: Jaegeuk Kim, linux-fsdevel, Ext4 Developers List,
	linux-f2fs-devel, xfs, linux-api, linux-fscrypt, linux-block,
	Linux Kernel Mailing List, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 5260 bytes --]

On Aug 16, 2022, at 3:03 AM, Dave Chinner <david@fromorbit.com> wrote:
> 
> On Mon, Aug 15, 2022 at 05:55:45PM -0700, Eric Biggers wrote:
>> On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
>>> On 07/25, Eric Biggers wrote:
>>>> On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
>>>>> On 07/22, Eric Biggers wrote:
>>>>>> From: Eric Biggers <ebiggers@google.com>
>>>>>> 
>>>>>> Currently, if an f2fs filesystem is mounted with the mode=lfs and
>>>>>> io_bits mount options, DIO reads are allowed but DIO writes are not.
>>>>>> Allowing DIO reads but not DIO writes is an unusual restriction, which
>>>>>> is likely to be surprising to applications, namely any application that
>>>>>> both reads and writes from a file (using O_DIRECT).  This behavior is
>>>>>> also incompatible with the proposed STATX_DIOALIGN extension to statx.
>>>>>> Given this, let's drop the support for DIO reads in this configuration.
>>>>> 
>>>>> IIRC, we allowed DIO reads since applications complained a lower performance.
>>>>> So, I'm afraid this change will make another confusion to users. Could
>>>>> you please apply the new behavior only for STATX_DIOALIGN?
>>>>> 
>>>> 
>>>> Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
>>>> weird case where DIO reads are allowed but not DIO writes.  So the question is
>>>> whether this case actually matters, in which case we should make STATX_DIOALIGN
>>>> distinguish between DIO reads and DIO writes, or whether it's some odd edge case
>>>> that doesn't really matter, in which case we could just fix it or make
>>>> STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
>>>> insight here.  What sort of applications want DIO reads but not DIO writes?
>>>> Is this common at all?
>>> 
>>> I think there's no specific application to use the LFS mode at this
>>> moment, but I'd like to allow DIO read for zoned device which will be
>>> used for Android devices.
>>> 
>> 
>> So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
>> be useless on all Android devices?  That sounds undesirable.  Are you sure that
>> supporting DIO reads but not DIO writes actually works?  Does it not cause
>> problems for existing applications?
> 
> What purpose does DIO in only one direction actually serve? All it
> means is that we're forcibly mixing buffered and direct IO to the
> same file and that simply never ends well from a data coherency POV.
> 
> Hence I'd suggest that mixing DIO reads and buffered writes like
> this ends up exposing users to the worst of both worlds - all of the
> problems with none of the benefits...
> 
>> What we need to do is make a decision about whether this means we should
>> build in a stx_dio_direction field (indicating no support / readonly
>> support / writeonly support / readwrite support) into the API from the
>> beginning.  If we don't do that, then I don't think we could simply add
>> such a field later, as the statx_dio_*_align fields will have already
>> been assigned their meaning.  I think we'd instead have to "duplicate"
>> the API, with STATX_DIOROALIGN and statx_dio_ro_*_align fields.  That
>> seems uglier than building a directional indicator into the API from the
>> beginning.  On the other hand, requiring all programs to check
>> stx_dio_direction would add complexity to using the API.
>> 
>> Any thoughts on this?
> 
> Decide whether partial, single direction DIO serves a useful purpose
> before trying to work out what is needed in the API to indicate that
> this sort of crazy will be supported....

Using read-only O_DIRECT makes sense for backup and other filesystem
scanning tools that don't want to pollute the page cache of a system
(which may be in use by other programs) while reading many files once.

Using interfaces like posix_fadvise(POSIX_FADV_DONTNEED) to drop file cache
afterward is both a hassle and problematic when reading very large files
that would push out more important pages from cache before the large
file's pages can be dropped.
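
As a rough sketch of the buffered alternative described above, a scanning
tool might read a file through the page cache and then ask the kernel to drop
the cached pages.  The function name scan_file_then_drop_cache() is made up
for this example, and the advice is only a hint that the kernel may ignore.

```c
#define _POSIX_C_SOURCE 200809L	/* for posix_fadvise() */
#include <fcntl.h>
#include <unistd.h>

/*
 * Read a whole file with ordinary buffered I/O, then advise the kernel
 * that the cached pages are no longer needed.  Returns the number of
 * bytes read, or -1 on error.
 */
static long long scan_file_then_drop_cache(const char *path)
{
	static char buf[1 << 16];
	long long total = 0;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;

	/* Offset 0 with length 0 means "the whole file"; result is advisory. */
	(void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

	close(fd);
	return n < 0 ? -1 : total;
}
```

Note that the pages are only dropped after the whole file has been read, which
is exactly the window in which a very large file can evict more important
pages.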


IMHO, this whole discussion is putting the cart before the horse.
Changing existing (and useful) IO behavior to accommodate an API that
nobody has ever used, and is unlikely to even be widely used, doesn't
make sense to me.  Most applications won't check or care about the new
DIO size fields, since they've lived this long without statx() returning
this info, and will just pick a "large enough" size (4KB, 1MB, whatever)
that gives them the performance they need.  They *WILL* care if the app
is suddenly unable to read data from a file in ways that have worked for
a long time.

Even if apps are modified to check these new DIO size fields, and then
try to DIO write to a file in f2fs that doesn't allow it, then f2fs will
return an error, which is what it would have done without the statx()
changes, so no harm done AFAICS.

Even with a more-complex DIO status return that handles a "direction"
field (which IMHO is needlessly complex), there is always the potential
for a TOCTOU race where a file changes between checking and access, so
the userspace code would need to handle this.

Cheers, Andreas


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-08-16 16:42             ` Andreas Dilger
@ 2022-08-19 23:09               ` Eric Biggers
  2022-08-23  3:22                 ` Andreas Dilger
  0 siblings, 1 reply; 32+ messages in thread
From: Eric Biggers @ 2022-08-19 23:09 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Dave Chinner, Jaegeuk Kim, linux-fsdevel, Ext4 Developers List,
	linux-f2fs-devel, xfs, linux-api, linux-fscrypt, linux-block,
	Linux Kernel Mailing List, Keith Busch

On Tue, Aug 16, 2022 at 10:42:29AM -0600, Andreas Dilger wrote:
> 
> IMHO, this whole discussion is putting the cart before the horse.
> Changing existing (and useful) IO behavior to accommodate an API that
> nobody has ever used, and is unlikely to even be widely used, doesn't
> make sense to me.  Most applications won't check or care about the new
> DIO size fields, since they've lived this long without statx() returning
> this info, and will just pick a "large enough" size (4KB, 1MB, whatever)
> that gives them the performance they need.  They *WILL* care if the app
> is suddenly unable to read data from a file in ways that have worked for
> a long time.
> 
> Even if apps are modified to check these new DIO size fields, and then
> try to DIO write to a file in f2fs that doesn't allow it, then f2fs will
> return an error, which is what it would have done without the statx()
> changes, so no harm done AFAICS.
> 
> Even with a more-complex DIO status return that handles a "direction"
> field (which IMHO is needlessly complex), there is always the potential
> for a TOCTOU race where a file changes between checking and access, so
> the userspace code would need to handle this.
> 

I'm having trouble making sense of your argument here; you seem to be saying
that STATX_DIOALIGN isn't useful, so it doesn't matter if we design it
correctly?  That line of reasoning is concerning, as it's certainly intended to
be useful, and if it's not useful there's no point in adding it.

Are there any specific concerns that you have, besides TOCTOU races and the lack
of support for read-only DIO?

I don't think that TOCTOU races are a real concern here.  Generally DIO
constraints would only change if the application doing DIO intentionally does
something to the file, or if there are changes that involve the filesystem being
taken offline, e.g. the filesystem being mounted with significantly different
options or being moved to a different block device.  And, well, everything else
in stat()/statx() is subject to TOCTOU as well, but is still used...

- Eric

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-08-16  0:55         ` Eric Biggers
  2022-08-16  9:03           ` Dave Chinner
@ 2022-08-20  0:06           ` Jaegeuk Kim
  2022-08-20  0:33             ` Eric Biggers
  2022-08-21  8:53           ` Christoph Hellwig
  2 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2022-08-20  0:06 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On 08/15, Eric Biggers wrote:
> On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
> > On 07/25, Eric Biggers wrote:
> > > On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > > > On 07/22, Eric Biggers wrote:
> > > > > From: Eric Biggers <ebiggers@google.com>
> > > > > 
> > > > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > > > is likely to be surprising to applications, namely any application that
> > > > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > > > Given this, let's drop the support for DIO reads in this configuration.
> > > > 
> > > > IIRC, we allowed DIO reads since applications complained a lower performance.
> > > > So, I'm afraid this change will make another confusion to users. Could
> > > > > you please apply the new behavior only for STATX_DIOALIGN?
> > > > 
> > > 
> > > Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> > > weird case where DIO reads are allowed but not DIO writes.  So the question is
> > > whether this case actually matters, in which case we should make STATX_DIOALIGN
> > > distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> > > that doesn't really matter, in which case we could just fix it or make
> > > STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> > > insight here.  What sort of applications want DIO reads but not DIO writes?
> > > Is this common at all?
> > 
> > I think there's no specific application to use the LFS mode at this
> > moment, but I'd like to allow DIO read for zoned device which will be
> > used for Android devices.
> > 
> 
> So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
> be useless on all Android devices?  That sounds undesirable. 

Do you have a plan to adopt STATX_DIOALIGN in Android?

> Are you sure that
> supporting DIO reads but not DIO writes actually works?  Does it not cause
> problems for existing applications?

I haven't heard of any issues so far.

> 
> What we need to do is make a decision about whether this means we should build
> in a stx_dio_direction field (indicating no support / readonly support /
> writeonly support / readwrite support) into the API from the beginning.  If we
> don't do that, then I don't think we could simply add such a field later, as the
> statx_dio_*_align fields will have already been assigned their meaning.  I think
> we'd instead have to "duplicate" the API, with STATX_DIOROALIGN and
> statx_dio_ro_*_align fields.  That seems uglier than building a directional
> indicator into the API from the beginning.  On the other hand, requiring all
> programs to check stx_dio_direction would add complexity to using the API.
> 
> Any thoughts on this?

I haven't looked at the details of the implementation, but why not support it
only if the filesystem has the same DIO policy for reads and writes?

> 
> - Eric

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-08-20  0:06           ` Jaegeuk Kim
@ 2022-08-20  0:33             ` Eric Biggers
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Biggers @ 2022-08-20  0:33 UTC (permalink / raw)
  To: Jaegeuk Kim
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On Fri, Aug 19, 2022 at 05:06:06PM -0700, Jaegeuk Kim wrote:
> On 08/15, Eric Biggers wrote:
> > On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
> > > On 07/25, Eric Biggers wrote:
> > > > On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > > > > On 07/22, Eric Biggers wrote:
> > > > > > From: Eric Biggers <ebiggers@google.com>
> > > > > > 
> > > > > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > > > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > > > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > > > > is likely to be surprising to applications, namely any application that
> > > > > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > > > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > > > > Given this, let's drop the support for DIO reads in this configuration.
> > > > > 
> > > > > IIRC, we allowed DIO reads since applications complained a lower performance.
> > > > > So, I'm afraid this change will make another confusion to users. Could
> > > > > > you please apply the new behavior only for STATX_DIOALIGN?
> > > > > 
> > > > 
> > > > Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> > > > weird case where DIO reads are allowed but not DIO writes.  So the question is
> > > > whether this case actually matters, in which case we should make STATX_DIOALIGN
> > > > distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> > > > that doesn't really matter, in which case we could just fix it or make
> > > > STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> > > > insight here.  What sort of applications want DIO reads but not DIO writes?
> > > > Is this common at all?
> > > 
> > > I think there's no specific application to use the LFS mode at this
> > > moment, but I'd like to allow DIO read for zoned device which will be
> > > used for Android devices.
> > > 
> > 
> > So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
> > be useless on all Android devices?  That sounds undesirable. 
> 
> Do you have a plan to adopt STATX_DIOALIGN in android?

Nothing specific, but statx() is among the system calls that are supported by
Android's libc and that apps are allowed to use.  So STATX_DIOALIGN would become
available as well.  I'd prefer if it actually worked properly if apps, or
Android system components, do actually try to use it (or need to use it)...

> > What we need to do is make a decision about whether this means we should build
> > in a stx_dio_direction field (indicating no support / readonly support /
> > writeonly support / readwrite support) into the API from the beginning.  If we
> > don't do that, then I don't think we could simply add such a field later, as the
> > statx_dio_*_align fields will have already been assigned their meaning.  I think
> > we'd instead have to "duplicate" the API, with STATX_DIOROALIGN and
> > statx_dio_ro_*_align fields.  That seems uglier than building a directional
> > indicator into the API from the beginning.  On the other hand, requiring all
> > programs to check stx_dio_direction would add complexity to using the API.
> > 
> > Any thoughts on this?
> 
> I haven't seen the details of the implementation tho, why not supporting it
> only if filesystem has the same DIO RW policy?

As I've mentioned, we could of course make STATX_DIOALIGN report that DIO is
unsupported when the DIO support is read-only.
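
To make the "report that DIO is unsupported" option concrete, here is a
minimal sketch of how userspace might interpret the result.  It assumes that
zero alignment values together with STATX_DIOALIGN set in stx_mask mean "DIO
unsupported", and that the offset alignment also applies to I/O lengths; the
mask value, the enum, and both helper names are assumptions made for this
example rather than part of the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of STATX_DIOALIGN; check the uapi header before relying on it. */
#define SKETCH_STATX_DIOALIGN 0x00002000U

enum dio_support {
	DIO_UNKNOWN,		/* filesystem did not fill in the fields */
	DIO_UNSUPPORTED,	/* fields valid, DIO not available */
	DIO_SUPPORTED,		/* fields valid, alignments usable */
};

/* Classify a statx() result from stx_mask and the two alignment fields. */
static enum dio_support classify_dio(uint32_t stx_mask, uint32_t mem_align,
				     uint32_t offset_align)
{
	if (!(stx_mask & SKETCH_STATX_DIOALIGN))
		return DIO_UNKNOWN;
	if (mem_align == 0 || offset_align == 0)
		return DIO_UNSUPPORTED;
	return DIO_SUPPORTED;
}

/* Check a buffer address, file offset, and length against the alignments. */
static bool dio_request_aligned(uint32_t mem_align, uint32_t offset_align,
				uintptr_t buf_addr, uint64_t offset,
				uint64_t len)
{
	if (mem_align == 0 || offset_align == 0)
		return false;
	return buf_addr % mem_align == 0 &&
	       offset % offset_align == 0 &&
	       len % offset_align == 0;
}
```

Under this scheme a file with read-only DIO support would simply land in the
DIO_UNSUPPORTED case, even though O_DIRECT reads would still work on it.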

The thing that confuses me based on the responses so far is that there seem to
be two camps of people: (1) people who really want STATX_DIOALIGN, and who don't
think that read-only DIO support should exist so they don't want STATX_DIOALIGN
to support it; and (2) people who feel that read-only DIO support is perfectly
reasonable and useful, and who don't care whether STATX_DIOALIGN supports it
because they don't care about STATX_DIOALIGN in the first place.

While both camps seem to agree that STATX_DIOALIGN shouldn't support read-only
DIO, it is for totally contradictory reasons, so it's not very convincing.  We
should ensure that we have rock-solid reasoning before committing to a new UAPI
that will have to be permanently supported...

- Eric

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-08-16  0:55         ` Eric Biggers
  2022-08-16  9:03           ` Dave Chinner
  2022-08-20  0:06           ` Jaegeuk Kim
@ 2022-08-21  8:53           ` Christoph Hellwig
  2 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-21  8:53 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Jaegeuk Kim, linux-fsdevel, linux-ext4, linux-f2fs-devel,
	linux-xfs, linux-api, linux-fscrypt, linux-block, linux-kernel,
	Keith Busch

On Mon, Aug 15, 2022 at 05:55:45PM -0700, Eric Biggers wrote:
> So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
> be useless on all Android devices?  That sounds undesirable.  Are you sure that

We just need to fix f2fs to support direct I/O on zoned devices.  There
is no good reason not to support it; in fact, the way zoned devices
require appends with Zone Append semantics makes direct I/O way
safer than how f2fs currently does direct I/O on non-zoned devices.

Until then just supporting direct I/O reads on zoned devices for f2fs
seems like a really bad choice given that it will lead to nasty cache
incoherency.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes
  2022-08-19 23:09               ` Eric Biggers
@ 2022-08-23  3:22                 ` Andreas Dilger
  0 siblings, 0 replies; 32+ messages in thread
From: Andreas Dilger @ 2022-08-23  3:22 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Dave Chinner, Jaegeuk Kim, linux-fsdevel, Ext4 Developers List,
	linux-f2fs-devel, xfs, linux-api, linux-fscrypt, linux-block,
	Linux Kernel Mailing List, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 3055 bytes --]

On Aug 19, 2022, at 5:09 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> 
> On Tue, Aug 16, 2022 at 10:42:29AM -0600, Andreas Dilger wrote:
>> 
>> IMHO, this whole discussion is putting the cart before the horse.
>> Changing existing (and useful) IO behavior to accommodate an API that
>> nobody has ever used, and is unlikely to even be widely used, doesn't
>> make sense to me.  Most applications won't check or care about the new
>> DIO size fields, since they've lived this long without statx() returning
>> this info, and will just pick a "large enough" size (4KB, 1MB, whatever)
>> that gives them the performance they need.  They *WILL* care if the app
>> is suddenly unable to read data from a file in ways that have worked for
>> a long time.
>> 
>> Even if apps are modified to check these new DIO size fields, and then
>> try to DIO write to a file in f2fs that doesn't allow it, then f2fs will
>> return an error, which is what it would have done without the statx()
>> changes, so no harm done AFAICS.
>> 
>> Even with a more-complex DIO status return that handles a "direction"
>> field (which IMHO is needlessly complex), there is always the potential
>> for a TOCTOU race where a file changes between checking and access, so
>> the userspace code would need to handle this.
> 
> I'm having trouble making sense of your argument here; you seem to be saying
> that STATX_DIOALIGN isn't useful, so it doesn't matter if we design it
> correctly?  That line of reasoning is concerning, as it's certainly intended
> to be useful, and if it's not useful there's no point in adding it.
> 
> Are there any specific concerns that you have, besides TOCTOU races and the
> lack of support for read-only DIO?

My main concern is disabling useful functionality that exists today to appease
the new DIO size API.  Whether STATX_DIOALIGN will become widely used by
applications or not is hard to say at this point.

If there were separate STATX_DIOREAD and STATX_DIOWRITE flags in the returned
data, and the alignment is provided as it is today, that would be enough IMHO
to address the original use case without significant complexity.
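
A rough sketch of what that could look like to a caller is below.  The flag
bits, the struct, and the helper are all hypothetical: STATX_DIOREAD and
STATX_DIOWRITE are only a suggestion in this message and are not defined in
any kernel header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the suggested STATX_DIOREAD/STATX_DIOWRITE. */
#define SKETCH_DIOREAD  0x1U
#define SKETCH_DIOWRITE 0x2U

/* Hypothetical container: per-direction flags plus one shared alignment. */
struct dio_info {
	uint32_t flags;
	uint32_t mem_align;
	uint32_t offset_align;
};

/*
 * Return the offset alignment to use for the requested direction, or 0
 * when DIO in that direction is not available.
 */
static uint32_t dio_offset_align_for(const struct dio_info *info,
				     bool is_write)
{
	uint32_t needed = is_write ? SKETCH_DIOWRITE : SKETCH_DIOREAD;

	return (info->flags & needed) ? info->offset_align : 0;
}
```

With this shape, the f2fs case would be expressed as SKETCH_DIOREAD set and
SKETCH_DIOWRITE clear, with the alignment fields unchanged.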

> I don't think that TOCTOU races are a real concern here.  Generally DIO
> constraints would only change if the application doing DIO intentionally does
> something to the file, or if there are changes that involve the filesystem
> being taken offline, e.g. the filesystem being mounted with significantly
> different options or being moved to a different block device.  And, well,
> everything else in stat()/statx() is subject to TOCTOU as well, but is still
> used...

I was thinking of background filesystem operations like compression, LVM
migration to new storage with a different sector size, etc. that may change
the DIO characteristics of the file even while it is open.  Not that I think
this will happen frequently, but it is possible, and applications shouldn't
explode if the DIO parameters change and they get an error.
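
One way an application could tolerate such a change is to treat EINVAL from
a direct I/O request as a signal to fall back to buffered I/O.  The following
is a minimal sketch under that assumption; read_file_prefer_dio(), the 1 MiB
chunk size, and the 4096-byte buffer alignment are choices made for this
example, not values taken from any filesystem.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for O_DIRECT */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define SKETCH_CHUNK     (1 << 20)	/* read size; an arbitrary choice */
#define SKETCH_BUF_ALIGN 4096		/* assumed to satisfy the memory alignment */

/*
 * Read a whole file, preferring O_DIRECT.  If the open or a later read
 * is rejected with EINVAL, clear O_DIRECT and carry on with buffered
 * I/O at the same offset.  Returns bytes read, or -1 on error.
 */
static long long read_file_prefer_dio(const char *path)
{
	long long total = 0;
	void *buf;
	int fd;

	if (posix_memalign(&buf, SKETCH_BUF_ALIGN, SKETCH_CHUNK) != 0)
		return -1;

	fd = open(path, O_RDONLY | O_DIRECT);
	if (fd < 0 && errno == EINVAL)
		fd = open(path, O_RDONLY);	/* DIO rejected at open time */
	if (fd < 0) {
		free(buf);
		return -1;
	}

	for (;;) {
		ssize_t n = pread(fd, buf, SKETCH_CHUNK, total);
		int err, flags;

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)
			break;			/* end of file */

		err = errno;			/* save before fcntl() */
		if (err == EINTR)
			continue;
		flags = fcntl(fd, F_GETFL);
		if (err == EINVAL && flags != -1 && (flags & O_DIRECT) &&
		    fcntl(fd, F_SETFL, flags & ~O_DIRECT) == 0)
			continue;		/* retry this offset buffered */
		total = -1;
		break;
	}

	close(fd);
	free(buf);
	return total;
}
```

The fallback also covers the case where a short read leaves the next offset
misaligned, since that request is simply retried without O_DIRECT.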

Cheers, Andreas


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 0/9] make statx() return DIO alignment information
  2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
                   ` (8 preceding siblings ...)
  2022-07-22  7:12 ` [PATCH v4 9/9] xfs: " Eric Biggers
@ 2022-08-26 17:19 ` Jeff Layton
  2022-08-27  7:07   ` Eric Biggers
  9 siblings, 1 reply; 32+ messages in thread
From: Jeff Layton @ 2022-08-26 17:19 UTC (permalink / raw)
  To: Eric Biggers, linux-fsdevel
  Cc: linux-ext4, linux-f2fs-devel, linux-xfs, linux-api,
	linux-fscrypt, linux-block, linux-kernel, Keith Busch

On Fri, 2022-07-22 at 00:12 -0700, Eric Biggers wrote:
> This patchset makes the statx() system call return direct I/O (DIO)
> alignment information.  This allows userspace to easily determine
> whether a file supports DIO, and if so with what alignment restrictions.
> 
> Patch 1 adds the basic VFS support for STATX_DIOALIGN.  Patch 2 wires it
> up for all block device files.  The remaining patches wire it up for
> regular files on ext4, f2fs, and xfs.  Support for regular files on
> other filesystems can be added later.
> 
> I've also written a man-pages patch, which I'm sending separately.
> 
> Note, f2fs has one corner case where DIO reads are allowed but not DIO
> writes.  The proposed statx fields can't represent this.  My proposal
> (patch 6) is to just eliminate this case, as it seems much too weird.
> But I'd appreciate any feedback on that part.
> 
> This patchset applies to v5.19-rc7.
> 
> Changed v3 => v4:
>    - Added xfs support.
> 
>    - Moved the helper function for block devices into block/bdev.c.
>    
>    - Adjusted the ext4 patch to not introduce a bug where misaligned DIO
>      starts being allowed on encrypted files when it gets combined with
>      the patch "iomap: add support for dma aligned direct-io" that is
>      queued in the block tree for 5.20.
> 
>    - Made a simplification in fscrypt_dio_supported().
> 
> Changed v2 => v3:
>    - Dropped the stx_offset_align_optimal field, since its purpose
>      wasn't clearly distinguished from the existing stx_blksize.
> 
>    - Renamed STATX_IOALIGN to STATX_DIOALIGN, to reflect the new focus
>      on DIO only.
> 
>    - Similarly, renamed stx_{mem,offset}_align_dio to
>      stx_dio_{mem,offset}_align, to reflect the new focus on DIO only.
> 
>    - Wired up STATX_DIOALIGN on block device files.
> 
> Changed v1 => v2:
>    - No changes.
> 
> Eric Biggers (9):
>   statx: add direct I/O alignment information
>   vfs: support STATX_DIOALIGN on block devices
>   fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN
>   ext4: support STATX_DIOALIGN
>   f2fs: move f2fs_force_buffered_io() into file.c
>   f2fs: don't allow DIO reads but not DIO writes
>   f2fs: simplify f2fs_force_buffered_io()
>   f2fs: support STATX_DIOALIGN
>   xfs: support STATX_DIOALIGN
> 
>  block/bdev.c              | 25 ++++++++++++++++++++
>  fs/crypto/inline_crypt.c  | 49 +++++++++++++++++++--------------------
>  fs/ext4/ext4.h            |  1 +
>  fs/ext4/file.c            | 37 ++++++++++++++++++++---------
>  fs/ext4/inode.c           | 36 ++++++++++++++++++++++++++++
>  fs/f2fs/f2fs.h            | 45 -----------------------------------
>  fs/f2fs/file.c            | 45 ++++++++++++++++++++++++++++++++++-
>  fs/stat.c                 | 14 +++++++++++
>  fs/xfs/xfs_iops.c         |  9 +++++++
>  include/linux/blkdev.h    |  4 ++++
>  include/linux/fscrypt.h   |  7 ++----
>  include/linux/stat.h      |  2 ++
>  include/uapi/linux/stat.h |  4 +++-
>  13 files changed, 190 insertions(+), 88 deletions(-)
> 
> base-commit: ff6992735ade75aae3e35d16b17da1008d753d28

Hi Eric,

Can I ask what your plans are with this set? I didn't see it in
linux-next yet, so I wasn't sure when you were looking to get it merged.
I'm working on patches to add a new statx field for the i_version
counter as well and I want to make sure that our work doesn't collide.

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>

* Re: [PATCH v4 0/9] make statx() return DIO alignment information
  2022-08-26 17:19 ` [PATCH v4 0/9] make statx() return DIO alignment information Jeff Layton
@ 2022-08-27  7:07   ` Eric Biggers
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Biggers @ 2022-08-27  7:07 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	linux-api, linux-fscrypt, linux-block, linux-kernel, Keith Busch

On Fri, Aug 26, 2022 at 01:19:37PM -0400, Jeff Layton wrote:
> On Fri, 2022-07-22 at 00:12 -0700, Eric Biggers wrote:
> > [...]
> 
> Hi Eric,
> 
> Can I ask what your plans are with this set? I didn't see it in
> linux-next yet, so I wasn't sure when you were looking to get it merged.
> I'm working on patches to add a new statx field for the i_version
> counter as well and I want to make sure that our work doesn't collide.
> 

I've just sent v5.  I guess I'll try to get it merged for 6.1.  We were a bit
stuck on the read-only DIO issue.  All things considered though, including that
Christoph thinks it's possible for f2fs to support DIO writes on zoned block
devices, I'm willing to bet that read-only DIO doesn't matter enough to justify
adding a direction field to STATX_DIOALIGN (which would make STATX_DIOALIGN
harder to use, as the field would always have to be checked).

- Eric
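
To illustrate the usability point above, here is a minimal sketch of the check
userspace would make with the two values discussed in this thread
(stx_dio_mem_align and stx_dio_offset_align).  The struct dio_align type and
both helper functions are hypothetical names invented for this example; they
are not part of the patchset.  The sketch assumes that a value of zero means
DIO is not supported on the file, and that the offset alignment applies to
both the file offset and the I/O length.  A real program would first call
statx() with STATX_DIOALIGN and confirm the bit is set in stx_mask before
copying the fields.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical holder for the two alignment values.  In a real program
 * these would be copied from stx_dio_mem_align and stx_dio_offset_align
 * after checking that STATX_DIOALIGN is set in stx_mask.
 */
struct dio_align {
	uint32_t mem_align;	/* required alignment of the user buffer */
	uint32_t offset_align;	/* required alignment of offset and length */
};

/* Assumption: zero in either field means DIO is not supported. */
static bool dio_supported(const struct dio_align *a)
{
	return a->mem_align != 0 && a->offset_align != 0;
}

/*
 * Return true if a DIO request with this buffer, file offset, and length
 * satisfies the reported alignment restrictions.  There is no direction
 * argument: the same answer applies to reads and writes.
 */
static bool dio_request_ok(const struct dio_align *a, const void *buf,
			   uint64_t offset, size_t len)
{
	if (!dio_supported(a))
		return false;
	if ((uintptr_t)buf % a->mem_align != 0)
		return false;
	if (offset % a->offset_align != 0)
		return false;
	if (len % a->offset_align != 0)
		return false;
	return true;
}
```

With a separate direction field, every caller of a helper like
dio_request_ok() would also have to pass in whether it intends to read or
write, which is the extra burden described in the reply above.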

Thread overview: 32+ messages
2022-07-22  7:12 [PATCH v4 0/9] make statx() return DIO alignment information Eric Biggers
2022-07-22  7:12 ` [PATCH v4 1/9] statx: add direct I/O " Eric Biggers
2022-07-22 16:32   ` Darrick J. Wong
2022-07-22 17:31   ` Martin K. Petersen
2022-07-22  7:12 ` [PATCH v4 2/9] vfs: support STATX_DIOALIGN on block devices Eric Biggers
2022-07-22  8:10   ` Christoph Hellwig
2022-07-22 17:32   ` Martin K. Petersen
2022-07-22  7:12 ` [PATCH v4 3/9] fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN Eric Biggers
2022-07-22  8:10   ` Christoph Hellwig
2022-07-22  7:12 ` [PATCH v4 4/9] ext4: support STATX_DIOALIGN Eric Biggers
2022-07-22 17:05   ` Theodore Ts'o
2022-07-22  7:12 ` [PATCH v4 5/9] f2fs: move f2fs_force_buffered_io() into file.c Eric Biggers
2022-07-22  7:12 ` [PATCH v4 6/9] f2fs: don't allow DIO reads but not DIO writes Eric Biggers
2022-07-24  2:01   ` Jaegeuk Kim
2022-07-25 18:12     ` Eric Biggers
2022-07-25 23:58       ` Andreas Dilger
2022-07-31  3:08       ` Jaegeuk Kim
2022-08-16  0:55         ` Eric Biggers
2022-08-16  9:03           ` Dave Chinner
2022-08-16 16:42             ` Andreas Dilger
2022-08-19 23:09               ` Eric Biggers
2022-08-23  3:22                 ` Andreas Dilger
2022-08-20  0:06           ` Jaegeuk Kim
2022-08-20  0:33             ` Eric Biggers
2022-08-21  8:53           ` Christoph Hellwig
2022-07-22  7:12 ` [PATCH v4 7/9] f2fs: simplify f2fs_force_buffered_io() Eric Biggers
2022-07-22  7:12 ` [PATCH v4 8/9] f2fs: support STATX_DIOALIGN Eric Biggers
2022-07-22  7:12 ` [PATCH v4 9/9] xfs: " Eric Biggers
2022-07-22  8:11   ` Christoph Hellwig
2022-07-22 16:24   ` Darrick J. Wong
2022-08-26 17:19 ` [PATCH v4 0/9] make statx() return DIO alignment information Jeff Layton
2022-08-27  7:07   ` Eric Biggers
