* [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
From: Eric Biggers @ 2022-01-20  7:12 UTC
  To: linux-fscrypt
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu

Encrypted files traditionally haven't supported direct I/O (DIO), due to
the need to encrypt/decrypt the data.  However, when the encryption is
implemented using inline encryption (blk-crypto) instead of the
traditional filesystem-layer encryption, it is straightforward to
support DIO.

This series adds that support.  There are multiple use cases for DIO on
encrypted files, but the main one currently is avoiding double caching
on loop devices whose backing files are in an encrypted directory.

Previous versions of this series were sent out by Satya Tangirala.
I've cleaned up a few things since Satya's last version, v9
(https://lore.kernel.org/all/20210604210908.2105870-1-satyat@google.com/T/#u).
More notably, I've made a couple of simplifications.

First, since f2fs has now been converted to use iomap for DIO, I've
dropped the patch which added fscrypt support to fs/direct-io.c.

Second, I've returned to the original design where DIO requests must be
fully aligned to the FS block size in terms of file position, length,
and memory buffers.  Satya previously was pursuing a slightly different
design, where the memory buffers (but not the file position and length)
were allowed to be aligned to just the block device logical block size.
This was at the request of Dave Chinner on v4 and v6 of the patchset
(https://lore.kernel.org/linux-fscrypt/20200720233739.824943-1-satyat@google.com/T/#u
and
https://lore.kernel.org/linux-fscrypt/20200724184501.1651378-1-satyat@google.com/T/#u).

I believe that approach is a dead end, for two reasons.  First, it
necessarily allows crypto data units to span bvecs.  Bios cannot be
split at a bvec boundary that falls inside a data unit; however, the
block layer currently assumes that bios can be split at any bvec
boundary.  Changing that is quite difficult, as Satya's v9 patchset
demonstrated.  This is not an issue if we require FS block aligned
buffers instead.  Second, it
doesn't change the fact that FS block alignment is still required for
the file position and I/O length; this is unavoidable due to the
granularity of encryption being the FS block size.  So, it seems that
relaxing the memory buffer alignment requirement wouldn't make things
meaningfully easier for applications, which raises the question of why
we would bother with it in the first place.
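
As a rough illustration (not part of this series), the full-alignment
rule discussed above can be modeled in userspace C.  The helper name
dio_fully_aligned() is hypothetical, and the block size is assumed to be
a power of two:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper, not a kernel or libc API.  Returns true if
 * the file position, the I/O length, and the buffer address are all
 * multiples of blocksize.  blocksize is assumed to be a power of two.
 */
static bool dio_fully_aligned(uint64_t pos, size_t len, const void *buf,
			      unsigned int blocksize)
{
	/* OR the three values so that one mask test covers all of them. */
	uint64_t bits = pos | (uint64_t)len | (uint64_t)(uintptr_t)buf;

	return (bits & ((uint64_t)blocksize - 1)) == 0;
}
```

With a 4096-byte FS block size, a request at file position 512 fails this
check even though it would satisfy a 512-byte logical block size.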

Christoph Hellwig also said that he much prefers that fscrypt DIO be
supported without sector-only alignment to start:
https://lore.kernel.org/r/YPu+88KReGlt94o3@infradead.org

Given the above, as far as I know the only remaining objection to this
patchset would be that DIO constraints aren't sufficiently discoverable
by userspace.  Now, to put this in context, this is a longstanding issue
with all Linux filesystems, except XFS, which has XFS_IOC_DIOINFO.  It's
not specific to this feature, and it doesn't actually seem to be too
important in practice; many other filesystem features place constraints
on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
(And for better or worse, many systems using fscrypt already have
out-of-tree patches that enable DIO support, and people don't seem to
have trouble with the FS block size alignment requirement.)

I plan to propose a new generic ioctl to address the issue of DIO
constraints being insufficiently discoverable.  But until then, I'm
wondering if people are willing to consider this patchset again, or
whether it is considered blocked by this issue alone.  (And if this
patchset is still unacceptable, would it be acceptable with f2fs support
only, given that f2fs *already* only allows FS block size aligned DIO?)

Eric Biggers (5):
  fscrypt: add functions for direct I/O support
  iomap: support direct I/O with fscrypt using blk-crypto
  ext4: support direct I/O with fscrypt using blk-crypto
  f2fs: support direct I/O with fscrypt using blk-crypto
  fscrypt: update documentation for direct I/O support

 Documentation/filesystems/fscrypt.rst | 25 +++++++-
 fs/crypto/crypto.c                    |  8 +++
 fs/crypto/inline_crypt.c              | 90 +++++++++++++++++++++++++++
 fs/ext4/file.c                        | 10 +--
 fs/ext4/inode.c                       |  7 +++
 fs/f2fs/data.c                        |  7 +++
 fs/f2fs/f2fs.h                        |  6 +-
 fs/iomap/direct-io.c                  |  6 ++
 include/linux/fscrypt.h               | 18 ++++++
 9 files changed, 170 insertions(+), 7 deletions(-)


base-commit: 1d1df41c5a33359a00e919d54eaebfb789711fdc
-- 
2.34.1



* [PATCH v10 1/5] fscrypt: add functions for direct I/O support
From: Eric Biggers @ 2022-01-20  7:12 UTC
  To: linux-fscrypt
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data.  However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.

In preparation for supporting this, add the following functions:

- fscrypt_dio_unsupported() checks whether a DIO request is unsupported
  due to encryption constraints.  Encrypted files will only support DIO
  when inline encryption is used and the I/O request is properly
  aligned; this function checks these preconditions.

- fscrypt_limit_io_blocks() limits the length of a bio to avoid crossing
  a place in the file that a bio with an encryption context cannot
  cross due to a DUN discontiguity.  This function is needed by
  filesystems that use the iomap DIO implementation (which operates
  directly on logical ranges, so it won't use fscrypt_mergeable_bio())
  and that support FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32.
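
The clamp that fscrypt_limit_io_blocks() applies for IV_INO_LBLK_32 can
be sketched as a standalone userspace model.  This is an illustration
only: limit_io_blocks_model() is a hypothetical name, and hashed_ino is
passed as a plain parameter standing in for ci->ci_hashed_ino.

```c
#include <stdint.h>

/*
 * Userspace model (assumption: not the kernel function itself) of limiting
 * an I/O so that its 32-bit DUN never wraps in the middle of a bio.
 */
static uint64_t limit_io_blocks_model(uint32_t hashed_ino, uint64_t lblk,
				      uint64_t nr_blocks)
{
	/* The DUN is the sum truncated to 32 bits, so it wraps to 0. */
	uint32_t dun = (uint32_t)(hashed_ino + lblk);
	/* Number of blocks from this DUN up to and including UINT32_MAX. */
	uint64_t until_wrap = (uint64_t)UINT32_MAX + 1 - dun;

	/* A single block can never straddle the wraparound point. */
	if (nr_blocks <= 1)
		return nr_blocks;

	return nr_blocks < until_wrap ? nr_blocks : until_wrap;
}
```

For example, with hashed_ino = 0xFFFFFFF0 and lblk = 0, only 16 blocks
remain before the DUN wraps, so a 32-block request is limited to 16.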

Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/crypto/crypto.c       |  8 ++++
 fs/crypto/inline_crypt.c | 90 ++++++++++++++++++++++++++++++++++++++++
 include/linux/fscrypt.h  | 18 ++++++++
 3 files changed, 116 insertions(+)

diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index 4ef3f714046aa..4fcca79f39aeb 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -69,6 +69,14 @@ void fscrypt_free_bounce_page(struct page *bounce_page)
 }
 EXPORT_SYMBOL(fscrypt_free_bounce_page);
 
+/*
+ * Generate the IV for the given logical block number within the given file.
+ * For filename encryption, lblk_num == 0.
+ *
+ * Keep this in sync with fscrypt_limit_io_blocks().  fscrypt_limit_io_blocks()
+ * needs to know about any IV generation methods where the low bits of IV don't
+ * simply contain the lblk_num (e.g., IV_INO_LBLK_32).
+ */
 void fscrypt_generate_iv(union fscrypt_iv *iv, u64 lblk_num,
 			 const struct fscrypt_info *ci)
 {
diff --git a/fs/crypto/inline_crypt.c b/fs/crypto/inline_crypt.c
index c57bebfa48fea..304ae414cbbf2 100644
--- a/fs/crypto/inline_crypt.c
+++ b/fs/crypto/inline_crypt.c
@@ -17,6 +17,7 @@
 #include <linux/buffer_head.h>
 #include <linux/sched/mm.h>
 #include <linux/slab.h>
+#include <linux/uio.h>
 
 #include "fscrypt_private.h"
 
@@ -315,6 +316,10 @@ EXPORT_SYMBOL_GPL(fscrypt_set_bio_crypt_ctx_bh);
  *
  * fscrypt_set_bio_crypt_ctx() must have already been called on the bio.
  *
+ * This function isn't required in cases where crypto-mergeability is ensured in
+ * another way, such as I/O targeting only a single file (and thus a single key)
+ * combined with fscrypt_limit_io_blocks() to ensure DUN contiguity.
+ *
  * Return: true iff the I/O is mergeable
  */
 bool fscrypt_mergeable_bio(struct bio *bio, const struct inode *inode,
@@ -363,3 +368,88 @@ bool fscrypt_mergeable_bio_bh(struct bio *bio,
 	return fscrypt_mergeable_bio(bio, inode, next_lblk);
 }
 EXPORT_SYMBOL_GPL(fscrypt_mergeable_bio_bh);
+
+/**
+ * fscrypt_dio_unsupported() - check whether a DIO (direct I/O) request is
+ *			       unsupported due to encryption constraints
+ * @iocb: the file and position the I/O is targeting
+ * @iter: the I/O data segment(s)
+ *
+ * Return: true if DIO is unsupported
+ */
+bool fscrypt_dio_unsupported(struct kiocb *iocb, struct iov_iter *iter)
+{
+	const struct inode *inode = file_inode(iocb->ki_filp);
+	const unsigned int blocksize = i_blocksize(inode);
+
+	/* If the file is unencrypted, no veto from us. */
+	if (!fscrypt_needs_contents_encryption(inode))
+		return false;
+
+	/* We only support DIO with inline crypto, not fs-layer crypto. */
+	if (!fscrypt_inode_uses_inline_crypto(inode))
+		return true;
+
+	/*
+	 * Since the granularity of encryption is filesystem blocks, the file
+	 * position and total I/O length must be aligned to the filesystem block
+	 * size -- not just to the block device's logical block size as is
+	 * traditionally the case for DIO on many filesystems (not including
+	 * f2fs, which only allows filesystem block aligned DIO anyway).
+	 *
+	 * We also require that the user-provided memory buffers be block
+	 * aligned too.  It is simpler to have a single alignment value required
+	 * for all properties of the I/O, as is normally the case for DIO.
+	 * Also, allowing less-aligned buffers would imply that a data unit
+	 * could cross bvecs, which would greatly complicate the I/O stack,
+	 * which assumes that bios can be split at any bvec boundary.
+	 */
+	if (!IS_ALIGNED(iocb->ki_pos | iov_iter_alignment(iter), blocksize))
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(fscrypt_dio_unsupported);
+
+/**
+ * fscrypt_limit_io_blocks() - limit I/O blocks to avoid discontiguous DUNs
+ * @inode: the file on which I/O is being done
+ * @lblk: the logical block at which the I/O starts
+ * @nr_blocks: the number of blocks we want to submit starting at @lblk
+ *
+ * Determine the limit to the number of blocks that can be submitted in a bio
+ * targeting @lblk without causing a data unit number (DUN) discontiguity.
+ *
+ * This is normally just @nr_blocks, as normally the DUNs just increment along
+ * with the logical blocks.  (Or the file is not encrypted.)
+ *
+ * In rare cases, fscrypt can be using an IV generation method that allows the
+ * DUN to wrap around within logically contiguous blocks, and that wraparound
+ * will occur.  If this happens, a value less than @nr_blocks will be returned
+ * so that the wraparound doesn't occur in the middle of a bio, which would
+ * cause encryption/decryption to produce the wrong results.
+ *
+ * Return: the actual number of blocks that can be submitted
+ */
+u64 fscrypt_limit_io_blocks(const struct inode *inode, u64 lblk, u64 nr_blocks)
+{
+	const struct fscrypt_info *ci = inode->i_crypt_info;
+	u32 dun;
+
+	if (!fscrypt_inode_uses_inline_crypto(inode))
+		return nr_blocks;
+
+	if (nr_blocks <= 1)
+		return nr_blocks;
+
+	if (!(fscrypt_policy_flags(&ci->ci_policy) &
+	      FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32))
+		return nr_blocks;
+
+	/* With IV_INO_LBLK_32, the DUN can wrap around from U32_MAX to 0. */
+
+	dun = ci->ci_hashed_ino + lblk;
+
+	return min_t(u64, nr_blocks, (u64)U32_MAX + 1 - dun);
+}
+EXPORT_SYMBOL_GPL(fscrypt_limit_io_blocks);
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 91ea9477e9bd2..87ec5f63b0a82 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -714,6 +714,10 @@ bool fscrypt_mergeable_bio(struct bio *bio, const struct inode *inode,
 bool fscrypt_mergeable_bio_bh(struct bio *bio,
 			      const struct buffer_head *next_bh);
 
+bool fscrypt_dio_unsupported(struct kiocb *iocb, struct iov_iter *iter);
+
+u64 fscrypt_limit_io_blocks(const struct inode *inode, u64 lblk, u64 nr_blocks);
+
 #else /* CONFIG_FS_ENCRYPTION_INLINE_CRYPT */
 
 static inline bool __fscrypt_inode_uses_inline_crypto(const struct inode *inode)
@@ -742,6 +746,20 @@ static inline bool fscrypt_mergeable_bio_bh(struct bio *bio,
 {
 	return true;
 }
+
+static inline bool fscrypt_dio_unsupported(struct kiocb *iocb,
+					   struct iov_iter *iter)
+{
+	const struct inode *inode = file_inode(iocb->ki_filp);
+
+	return fscrypt_needs_contents_encryption(inode);
+}
+
+static inline u64 fscrypt_limit_io_blocks(const struct inode *inode, u64 lblk,
+					  u64 nr_blocks)
+{
+	return nr_blocks;
+}
 #endif /* !CONFIG_FS_ENCRYPTION_INLINE_CRYPT */
 
 /**
-- 
2.34.1



* [PATCH v10 2/5] iomap: support direct I/O with fscrypt using blk-crypto
From: Eric Biggers @ 2022-01-20  7:12 UTC
  To: linux-fscrypt
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data.  However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.

Add support for this to the iomap DIO implementation by calling
fscrypt_set_bio_crypt_ctx() to set encryption contexts on the bios.

Don't check for the rare case where a DUN (crypto data unit number)
discontiguity creates a boundary that bios must not cross.  Instead,
filesystems are expected to handle this in ->iomap_begin() by limiting
the length of the mapping so that iomap doesn't have to worry about it.

Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/iomap/direct-io.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 03ea367df19a4..20325b3926fa3 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -6,6 +6,7 @@
 #include <linux/module.h>
 #include <linux/compiler.h>
 #include <linux/fs.h>
+#include <linux/fscrypt.h>
 #include <linux/pagemap.h>
 #include <linux/iomap.h>
 #include <linux/backing-dev.h>
@@ -179,11 +180,14 @@ static void iomap_dio_bio_end_io(struct bio *bio)
 static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 		loff_t pos, unsigned len)
 {
+	struct inode *inode = file_inode(dio->iocb->ki_filp);
 	struct page *page = ZERO_PAGE(0);
 	int flags = REQ_SYNC | REQ_IDLE;
 	struct bio *bio;
 
 	bio = bio_alloc(GFP_KERNEL, 1);
+	fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
+				  GFP_KERNEL);
 	bio_set_dev(bio, iter->iomap.bdev);
 	bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
 	bio->bi_private = dio;
@@ -310,6 +314,8 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 		}
 
 		bio = bio_alloc(GFP_KERNEL, nr_pages);
+		fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
+					  GFP_KERNEL);
 		bio_set_dev(bio, iomap->bdev);
 		bio->bi_iter.bi_sector = iomap_sector(iomap, pos);
 		bio->bi_write_hint = dio->iocb->ki_hint;
-- 
2.34.1



* [PATCH v10 3/5] ext4: support direct I/O with fscrypt using blk-crypto
  2022-01-20  7:12 ` [f2fs-dev] " Eric Biggers
@ 2022-01-20  7:12   ` Eric Biggers
  -1 siblings, 0 replies; 44+ messages in thread
From: Eric Biggers @ 2022-01-20  7:12 UTC (permalink / raw)
  To: linux-fscrypt
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data.  However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.

Therefore, make ext4 support DIO on files that are using inline
encryption.  Since ext4 uses iomap for DIO, and fscrypt support was
already added to iomap DIO, this just requires two small changes:

- Let DIO proceed when supported, by using fscrypt_dio_unsupported()
  instead of assuming that encrypted files never support DIO.

- In ext4_iomap_begin(), use fscrypt_limit_io_blocks() to limit the
  length of the mapping in the rare case where a DUN discontiguity
  occurs in the middle of an extent.  The iomap DIO implementation
  requires this, since it assumes that it can submit a bio covering (up
  to) the whole mapping, without checking fscrypt constraints itself.

Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/ext4/file.c  | 10 ++++++----
 fs/ext4/inode.c |  7 +++++++
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8cc11715518ac..2b520e99bee74 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -36,9 +36,11 @@
 #include "acl.h"
 #include "truncate.h"
 
-static bool ext4_dio_supported(struct inode *inode)
+static bool ext4_dio_supported(struct kiocb *iocb, struct iov_iter *iter)
 {
-	if (IS_ENABLED(CONFIG_FS_ENCRYPTION) && IS_ENCRYPTED(inode))
+	struct inode *inode = file_inode(iocb->ki_filp);
+
+	if (fscrypt_dio_unsupported(iocb, iter))
 		return false;
 	if (fsverity_active(inode))
 		return false;
@@ -61,7 +63,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		inode_lock_shared(inode);
 	}
 
-	if (!ext4_dio_supported(inode)) {
+	if (!ext4_dio_supported(iocb, to)) {
 		inode_unlock_shared(inode);
 		/*
 		 * Fallback to buffered I/O if the operation being performed on
@@ -509,7 +511,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	}
 
 	/* Fallback to buffered I/O if the inode does not support direct I/O. */
-	if (!ext4_dio_supported(inode)) {
+	if (!ext4_dio_supported(iocb, from)) {
 		if (ilock_shared)
 			inode_unlock_shared(inode);
 		else
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5f79d265d06a0..7af1bba34b8b8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3409,6 +3409,13 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	if (ret < 0)
 		return ret;
 out:
+	/*
+	 * When inline encryption is enabled, sometimes I/O to an encrypted file
+	 * has to be broken up to guarantee DUN contiguity.  Handle this by
+	 * limiting the length of the mapping returned.
+	 */
+	map.m_len = fscrypt_limit_io_blocks(inode, map.m_lblk, map.m_len);
+
 	ext4_set_iomap(inode, iomap, &map, offset, length, flags);
 
 	return 0;
-- 
2.34.1
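
To illustrate what limiting the mapping length at a DUN discontiguity
means, here is a simplified userspace model.  It assumes a policy in which
the DUN is the low 32 bits of (hashed inode number + logical block number),
so the DUN can wrap from UINT32_MAX to 0 in the middle of an extent.  The
function name and parameters are hypothetical; this is a sketch of the
idea, not the kernel's fscrypt_limit_io_blocks().

```c
#include <stdint.h>

/*
 * Hypothetical model: return how many of nr_blocks, starting at logical
 * block lblk, can be covered before the 32-bit DUN wraps around.
 * hashed_ino stands in for the per-inode value added to the block number.
 */
static uint64_t limit_io_blocks_model(uint32_t hashed_ino, uint64_t lblk,
				      uint64_t nr_blocks)
{
	uint32_t dun;
	uint64_t blocks_until_wrap;

	/* A single block can never straddle a DUN boundary. */
	if (nr_blocks <= 1)
		return nr_blocks;

	/* Truncation to 32 bits models the wraparound. */
	dun = (uint32_t)(hashed_ino + lblk);
	blocks_until_wrap = (uint64_t)UINT32_MAX + 1 - dun;

	return nr_blocks < blocks_until_wrap ? nr_blocks : blocks_until_wrap;
}
```

In this model a 100-block mapping whose first DUN is 0xFFFFFFF0 is cut to
16 blocks; the caller is then expected to ask for a new mapping for the
remainder, which starts at DUN 0.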


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v10 4/5] f2fs: support direct I/O with fscrypt using blk-crypto
  2022-01-20  7:12 ` [f2fs-dev] " Eric Biggers
@ 2022-01-20  7:12   ` Eric Biggers
  -1 siblings, 0 replies; 44+ messages in thread
From: Eric Biggers @ 2022-01-20  7:12 UTC (permalink / raw)
  To: linux-fscrypt
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data.  However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.

Therefore, make f2fs support DIO on files that are using inline
encryption.  Since f2fs uses iomap for DIO, and fscrypt support was
already added to iomap DIO, this just requires two small changes:

- Let DIO proceed when supported, by using fscrypt_dio_unsupported()
  instead of assuming that encrypted files never support DIO.

- In f2fs_iomap_begin(), use fscrypt_limit_io_blocks() to limit the
  length of the mapping in the rare case where a DUN discontiguity
  occurs in the middle of an extent.  The iomap DIO implementation
  requires this, since it assumes that it can submit a bio covering (up
  to) the whole mapping, without checking fscrypt constraints itself.

Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/f2fs/data.c | 7 +++++++
 fs/f2fs/f2fs.h | 6 +++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 0a1d236212f85..90669c0d16c37 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -4057,6 +4057,13 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 
 	iomap->offset = blks_to_bytes(inode, map.m_lblk);
 
+	/*
+	 * When inline encryption is enabled, sometimes I/O to an encrypted file
+	 * has to be broken up to guarantee DUN contiguity.  Handle this by
+	 * limiting the length of the mapping returned.
+	 */
+	map.m_len = fscrypt_limit_io_blocks(inode, map.m_lblk, map.m_len);
+
 	if (map.m_flags & (F2FS_MAP_MAPPED | F2FS_MAP_UNWRITTEN)) {
 		iomap->length = blks_to_bytes(inode, map.m_len);
 		if (map.m_flags & F2FS_MAP_MAPPED) {
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index eb22fa91c2b26..97f9e53969ece 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4371,7 +4371,11 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	int rw = iov_iter_rw(iter);
 
-	if (f2fs_post_read_required(inode))
+	if (fscrypt_dio_unsupported(iocb, iter))
+		return true;
+	if (fsverity_active(inode))
+		return true;
+	if (f2fs_compressed_file(inode))
 		return true;
 
 	/* disallow direct IO if any of devices has unaligned blksize */
-- 
2.34.1
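
The f2fs.h hunk above replaces a single blanket check with three separate
reasons for forcing buffered I/O, so that encryption by itself no longer
rules out DIO.  The control flow can be modelled in userspace as below.
The struct and function names are hypothetical, each flag stands in for the
result of the corresponding kernel check, and the real function goes on to
apply further checks that are omitted here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the results of the three kernel checks. */
struct dio_request_model {
	bool fscrypt_dio_unsupported;	/* encryption constraints not met */
	bool verity_active;		/* fs-verity is enabled on the file */
	bool compressed;		/* file uses f2fs compression */
};

/* Return true if the request must fall back to buffered I/O. */
static bool force_buffered_io_model(const struct dio_request_model *req)
{
	if (req->fscrypt_dio_unsupported)
		return true;
	if (req->verity_active)
		return true;
	if (req->compressed)
		return true;
	return false;
}
```

Each condition is an independent reason to fall back, and DIO proceeds
only when none of them applies.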


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v10 5/5] fscrypt: update documentation for direct I/O support
  2022-01-20  7:12 ` [f2fs-dev] " Eric Biggers
@ 2022-01-20  7:12   ` Eric Biggers
  -1 siblings, 0 replies; 44+ messages in thread
From: Eric Biggers @ 2022-01-20  7:12 UTC (permalink / raw)
  To: linux-fscrypt
  Cc: linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu

From: Eric Biggers <ebiggers@google.com>

Now that direct I/O is supported on encrypted files in some cases,
document what these cases are.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 Documentation/filesystems/fscrypt.rst | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 4d5d50dca65c6..6ccd5efb25b77 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -1047,8 +1047,8 @@ astute users may notice some differences in behavior:
   may be used to overwrite the source files but isn't guaranteed to be
   effective on all filesystems and storage devices.
 
-- Direct I/O is not supported on encrypted files.  Attempts to use
-  direct I/O on such files will fall back to buffered I/O.
+- Direct I/O is supported on encrypted files only under some
+  circumstances.  For details, see `Direct I/O support`_.
 
 - The fallocate operations FALLOC_FL_COLLAPSE_RANGE and
   FALLOC_FL_INSERT_RANGE are not supported on encrypted files and will
@@ -1179,6 +1179,27 @@ Inline encryption doesn't affect the ciphertext or other aspects of
 the on-disk format, so users may freely switch back and forth between
 using "inlinecrypt" and not using "inlinecrypt".
 
+Direct I/O support
+==================
+
+For direct I/O on an encrypted file to work, the following conditions
+must be met (in addition to the conditions for direct I/O on an
+unencrypted file):
+
+* The file must be using inline encryption.  Usually this means that
+  the filesystem must be mounted with ``-o inlinecrypt`` and inline
+  encryption hardware must be present.  However, a software fallback
+  is also available.  For details, see `Inline encryption support`_.
+
+* The I/O request must be fully aligned to the filesystem block size.
+  This means that the file position the I/O is targeting, the lengths
+  of all I/O segments, and the memory addresses of all I/O buffers
+  must be multiples of this value.  Note that the filesystem block
+  size may be greater than the logical block size of the block device.
+
+If either of the above conditions is not met, then direct I/O on the
+encrypted file will fall back to buffered I/O.
+
 Implementation details
 ======================
 
-- 
2.34.1
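
As an illustration of the alignment condition documented above, a
userspace program could check one I/O segment before issuing an O_DIRECT
request roughly as follows.  This is a minimal sketch: the helper name is
hypothetical, the filesystem block size is assumed to be supplied by the
caller and to be a power of two, and a multi-segment request would need
the same check on every segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if the file position, the length and
 * the buffer address of one I/O segment are all multiples of blocksize.
 */
static bool dio_segment_aligned(uint64_t pos, size_t len, const void *buf,
				size_t blocksize)
{
	/* The mask trick below is only valid for powers of two. */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

	return (((uint64_t)(uintptr_t)buf | pos | (uint64_t)len) &
		(uint64_t)(blocksize - 1)) == 0;
}
```

A buffer obtained from posix_memalign() with the block size as the
alignment satisfies the address part of the check.  A request that fails
the check is not rejected by the kernel; per the documentation above, it
falls back to buffered I/O.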


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 1/5] fscrypt: add functions for direct I/O support
  2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
@ 2022-01-20  8:27     ` Christoph Hellwig
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2022-01-20  8:27 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fscrypt, linux-fsdevel, linux-ext4, linux-f2fs-devel,
	linux-xfs, Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu, Satya Tangirala

> +/**
> + * fscrypt_dio_unsupported() - check whether a DIO (direct I/O) request is
> + *			       unsupported due to encryption constraints
> + * @iocb: the file and position the I/O is targeting
> + * @iter: the I/O data segment(s)
> + *
> + * Return: true if DIO is unsupported
> + */
> +bool fscrypt_dio_unsupported(struct kiocb *iocb, struct iov_iter *iter)

I always find non-negated functions easier to follow, i.e. turn this
into fscrypt_dio_supported().

> +	/*
> +	 * Since the granularity of encryption is filesystem blocks, the file
> +	 * position and total I/O length must be aligned to the filesystem block
> +	 * size -- not just to the block device's logical block size as is
> +	 * traditionally the case for DIO on many filesystems (not including
> +	 * f2fs, which only allows filesystem block aligned DIO anyway).

I would not really mention a specific file system here.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 2/5] iomap: support direct I/O with fscrypt using blk-crypto
  2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
@ 2022-01-20  8:28     ` Christoph Hellwig
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2022-01-20  8:28 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fscrypt, linux-fsdevel, linux-ext4, linux-f2fs-devel,
	linux-xfs, Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu, Satya Tangirala

On Wed, Jan 19, 2022 at 11:12:12PM -0800, Eric Biggers wrote:
>  	bio = bio_alloc(GFP_KERNEL, 1);
> +	fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
> +				  GFP_KERNEL);

Note that this will create a (harmless) conflict with my
"improve the bio allocation interface" series.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
@ 2022-01-20  8:30   ` Christoph Hellwig
  0 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2022-01-20  8:30 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fscrypt, linux-fsdevel, linux-ext4, linux-f2fs-devel,
	linux-xfs, Christoph Hellwig, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu

On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> 
> Given the above, as far as I know the only remaining objection to this
> patchset would be that DIO constraints aren't sufficiently discoverable
> by userspace.  Now, to put this in context, this is a longstanding issue
> with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> not specific to this feature, and it doesn't actually seem to be too
> important in practice; many other filesystem features place constraints
> on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> (And for better or worse, many systems using fscrypt already have
> out-of-tree patches that enable DIO support, and people don't seem to
> have trouble with the FS block size alignment requirement.)

It might make sense to use this as an opportunity to implement
XFS_IOC_DIOINFO for ext4 and f2fs.

> I plan to propose a new generic ioctl to address the issue of DIO
> constraints being insufficiently discoverable.  But until then, I'm
> wondering if people are willing to consider this patchset again, or
> whether it is considered blocked by this issue alone.  (And if this
> patchset is still unacceptable, would it be acceptable with f2fs support
> only, given that f2fs *already* only allows FS block size aligned DIO?)

I think the patchset looks fine, but I'd really love to have a way for
the alignment restrictions to be discoverable from the start.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 1/5] fscrypt: add functions for direct I/O support
  2022-01-20  8:27     ` [f2fs-dev] " Christoph Hellwig
@ 2022-01-20  9:04       ` Eric Biggers
  -1 siblings, 0 replies; 44+ messages in thread
From: Eric Biggers @ 2022-01-20  9:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fscrypt, linux-fsdevel, linux-ext4, linux-f2fs-devel,
	linux-xfs, Dave Chinner, Darrick J . Wong, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu, Satya Tangirala

On Thu, Jan 20, 2022 at 12:27:45AM -0800, Christoph Hellwig wrote:
> > +/**
> > + * fscrypt_dio_unsupported() - check whether a DIO (direct I/O) request is
> > + *			       unsupported due to encryption constraints
> > + * @iocb: the file and position the I/O is targeting
> > + * @iter: the I/O data segment(s)
> > + *
> > + * Return: true if DIO is unsupported
> > + */
> > +bool fscrypt_dio_unsupported(struct kiocb *iocb, struct iov_iter *iter)
> 
> I always find non-negated functions easier to follow, i.e. turn this
> into fscrypt_dio_supported().
> 

I actually had changed this from v9 because fscrypt_dio_supported() seemed
backwards, given that its purpose is to check whether DIO is unsupported, not
whether it's supported per se (and the function's comment reflected this).  What
ext4 and f2fs do is check a list of reasons why DIO would *not* be supported,
and if none apply, then it is supported.  This is just one of those reasons.

This is subjective though, so if people prefer the old way, I'll change it back.

- Eric

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-20  8:30   ` Christoph Hellwig
@ 2022-01-20 17:10     ` Darrick J. Wong
  -1 siblings, 0 replies; 44+ messages in thread
From: Darrick J. Wong @ 2022-01-20 17:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Eric Biggers, linux-fscrypt, linux-fsdevel, linux-ext4,
	linux-f2fs-devel, linux-xfs, Dave Chinner, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > 
> > Given the above, as far as I know the only remaining objection to this
> > patchset would be that DIO constraints aren't sufficiently discoverable
> > by userspace.  Now, to put this in context, this is a longstanding issue
> > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > not specific to this feature, and it doesn't actually seem to be too
> > important in practice; many other filesystem features place constraints
> > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > (And for better or worse, many systems using fscrypt already have
> > out-of-tree patches that enable DIO support, and people don't seem to
> > have trouble with the FS block size alignment requirement.)
> 
> It might make sense to use this as an opportunity to implement
> XFS_IOC_DIOINFO for ext4 and f2fs.

Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
list the /file/ position alignment requirement:

struct dioattr {
	__u32		d_mem;		/* data buffer memory alignment */
	__u32		d_miniosz;	/* min xfer size		*/
	__u32		d_maxiosz;	/* max xfer size		*/
};

Since I /think/ fscrypt requires that directio writes be aligned to file
block size, right?

> > I plan to propose a new generic ioctl to address the issue of DIO
> > constraints being insufficiently discoverable.  But until then, I'm

Which is what I suspect Eric meant by this sentence. :)

> > wondering if people are willing to consider this patchset again, or
> > whether it is considered blocked by this issue alone.  (And if this
> > patchset is still unacceptable, would it be acceptable with f2fs support
> > only, given that f2fs *already* only allows FS block size aligned DIO?)
> 
> I think the patchset looks fine, but I'd really love to have a way for
> the alignment restrictions to be discoverable from the start.

I agree.  The mechanics of the patchset look ok to me, but it's very
unfortunate that there's no way for userspace programs to ask the kernel
about the directio geometry for a file.

Ever since we added reflink to XFS I've wanted to add a way to tell
userspace that direct writes to a reflink(able) file will be much more
efficient if they can align the io request to 1 fs block instead of 1
sector.

How about something like this:

struct dioattr2 {
	__u32		d_mem;		/* data buffer memory alignment */
	__u32		d_miniosz;	/* min xfer size		*/
	__u32		d_maxiosz;	/* max xfer size		*/

	/* file range must be aligned to this value */
	__u32		d_min_fpos;

	/* for optimal performance, align file range to this */
	__u32		d_opt_fpos;

	__u32		d_padding[11];
};

--D

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-20 17:10     ` [f2fs-dev] " Darrick J. Wong
@ 2022-01-20 20:39       ` Eric Biggers
  -1 siblings, 0 replies; 44+ messages in thread
From: Eric Biggers @ 2022-01-20 20:39 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, linux-fscrypt, linux-fsdevel, linux-ext4,
	linux-f2fs-devel, linux-xfs, Dave Chinner, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > 
> > > Given the above, as far as I know the only remaining objection to this
> > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > not specific to this feature, and it doesn't actually seem to be too
> > > important in practice; many other filesystem features place constraints
> > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > (And for better or worse, many systems using fscrypt already have
> > > out-of-tree patches that enable DIO support, and people don't seem to
> > > have trouble with the FS block size alignment requirement.)
> > 
> > It might make sense to use this as an opportunity to implement
> > XFS_IOC_DIOINFO for ext4 and f2fs.
> 
> Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> list the /file/ position alignment requirement:
> 
> struct dioattr {
> 	__u32		d_mem;		/* data buffer memory alignment */
> 	__u32		d_miniosz;	/* min xfer size		*/
> 	__u32		d_maxiosz;	/* max xfer size		*/
> };

Well, the comment above struct dioattr says:

	/*
	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
	 * alignment.
	 */

So d_miniosz serves that purpose already.

> 
> Since I /think/ fscrypt requires that directio writes be aligned to file
> block size, right?

The file position must be a multiple of the filesystem block size, yes.
Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
good enough for the fscrypt direct I/O case.

The real question is whether there are any direct I/O implementations where
XFS_IOC_DIOINFO would *not* be good enough, for example due to "xfer size
multiple" != "file seek offset alignment" being allowed.  In that case we would
need to define a new ioctl that is more general (like the one you described
below) rather than simply uplifting XFS_IOC_DIOINFO.

More general is nice, but it's not helpful if no one will actually use the extra
information.  So we need to figure out what is actually useful.

> How about something like this:
> 
> struct dioattr2 {
> 	__u32		d_mem;		/* data buffer memory alignment */
> 	__u32		d_miniosz;	/* min xfer size		*/
> 	__u32		d_maxiosz;	/* max xfer size		*/
> 
> 	/* file range must be aligned to this value */
> 	__u32		d_min_fpos;
> 
> 	/* for optimal performance, align file range to this */
> 	__u32		d_opt_fpos;
> 
> 	__u32		d_padding[11];
> };
> 

- Eric

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-20 20:39       ` [f2fs-dev] " Eric Biggers
@ 2022-01-20 21:00         ` Darrick J. Wong
  -1 siblings, 0 replies; 44+ messages in thread
From: Darrick J. Wong @ 2022-01-20 21:00 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Christoph Hellwig, linux-fscrypt, linux-fsdevel, linux-ext4,
	linux-f2fs-devel, linux-xfs, Dave Chinner, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Thu, Jan 20, 2022 at 12:39:14PM -0800, Eric Biggers wrote:
> On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > > 
> > > > Given the above, as far as I know the only remaining objection to this
> > > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > > not specific to this feature, and it doesn't actually seem to be too
> > > > important in practice; many other filesystem features place constraints
> > > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > > (And for better or worse, many systems using fscrypt already have
> > > > out-of-tree patches that enable DIO support, and people don't seem to
> > > > have trouble with the FS block size alignment requirement.)
> > > 
> > > It might make sense to use this as an opportunity to implement
> > > XFS_IOC_DIOINFO for ext4 and f2fs.
> > 
> > Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> > list the /file/ position alignment requirement:
> > 
> > struct dioattr {
> > 	__u32		d_mem;		/* data buffer memory alignment */
> > 	__u32		d_miniosz;	/* min xfer size		*/
> > 	__u32		d_maxiosz;	/* max xfer size		*/
> > };
> 
> Well, the comment above struct dioattr says:
> 
> 	/*
> 	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
> 	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
> 	 * alignment.
> 	 */
> 
> So d_miniosz serves that purpose already.
> 
> > 
> > Since I /think/ fscrypt requires that directio writes be aligned to file
> > block size, right?
> 
> The file position must be a multiple of the filesystem block size, yes.
> Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
> buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
> good enough for the fscrypt direct I/O case.

Oh, ok then.  In that case, just hoist XFS_IOC_DIOINFO to the VFS and
add a couple of implementations for ext4 and f2fs, and I think that'll
be enough to get the fscrypt patchset moving again.

> The real question is whether there are any direct I/O implementations where
> XFS_IOC_DIOINFO would *not* be good enough, for example due to "xfer size
> multiple" != "file seek offset alignment" being allowed.  In that case we would
> need to define a new ioctl that is more general (like the one you described
> below) rather than simply uplifting XFS_IOC_DIOINFO.

I don't think there are any currently, but if anyone ever redesigns
DIOINFO we might as well make all those pieces explicit.

> More general is nice, but it's not helpful if no one will actually use the extra
> information.  So we need to figure out what is actually useful.

<nod> Clearly I haven't wanted d_opt_fpos badly enough to propose
revving the ioctl. ;)

--D

> 
> > How about something like this:
> > 
> > struct dioattr2 {
> > 	__u32		d_mem;		/* data buffer memory alignment */
> > 	__u32		d_miniosz;	/* min xfer size		*/
> > 	__u32		d_maxiosz;	/* max xfer size		*/
> > 
> > 	/* file range must be aligned to this value */
> > 	__u32		d_min_fpos;
> > 
> > 	/* for optimal performance, align file range to this */
> > 	__u32		d_opt_fpos;
> > 
> > 	__u32		d_padding[11];
> > };
> > 
> 
> - Eric

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-20 21:00         ` [f2fs-dev] " Darrick J. Wong
@ 2022-01-20 22:04           ` Dave Chinner
  -1 siblings, 0 replies; 44+ messages in thread
From: Dave Chinner @ 2022-01-20 22:04 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Eric Biggers, Christoph Hellwig, linux-fscrypt, linux-fsdevel,
	linux-ext4, linux-f2fs-devel, linux-xfs, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Thu, Jan 20, 2022 at 01:00:27PM -0800, Darrick J. Wong wrote:
> On Thu, Jan 20, 2022 at 12:39:14PM -0800, Eric Biggers wrote:
> > On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> > > On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > > > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > > > 
> > > > > Given the above, as far as I know the only remaining objection to this
> > > > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > > > not specific to this feature, and it doesn't actually seem to be too
> > > > > important in practice; many other filesystem features place constraints
> > > > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > > > (And for better or worse, many systems using fscrypt already have
> > > > > out-of-tree patches that enable DIO support, and people don't seem to
> > > > > have trouble with the FS block size alignment requirement.)
> > > > 
> > > > It might make sense to use this as an opportunity to implement
> > > > XFS_IOC_DIOINFO for ext4 and f2fs.
> > > 
> > > Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> > > list the /file/ position alignment requirement:
> > > 
> > > struct dioattr {
> > > 	__u32		d_mem;		/* data buffer memory alignment */
> > > 	__u32		d_miniosz;	/* min xfer size		*/
> > > 	__u32		d_maxiosz;	/* max xfer size		*/
> > > };
> > 
> > Well, the comment above struct dioattr says:
> > 
> > 	/*
> > 	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
> > 	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
> > 	 * alignment.
> > 	 */
> > 
> > So d_miniosz serves that purpose already.
> > 
> > > 
> > > Since I /think/ fscrypt requires that directio writes be aligned to file
> > > block size, right?
> > 
> > The file position must be a multiple of the filesystem block size, yes.
> > Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
> > buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
> > good enough for the fscrypt direct I/O case.
> 
> Oh, ok then.  In that case, just hoist XFS_IOC_DIOINFO to the VFS and
> add a couple of implementations for ext4 and f2fs, and I think that'll
> be enough to get the fscrypt patchset moving again.

On the contrary, I'd much prefer to see this information added to
statx(). The file offset alignment info is a property of the current
file (e.g. XFS can have different per-file requirements depending on
whether the file data is hosted on the data or RT device, etc) and
so it's not a fixed property of the filesystem.

statx() was designed to be extended with per-file property
information, and we already have stuff like filesystem block size in
that syscall. Hence I would much prefer that we extend it with the
DIO properties we need to support rather than "create" a new VFS
ioctl to extract this information. We already have statx(), so let's
use it for what it was intended for.

> > The real question is whether there are any direct I/O implementations where
> > XFS_IOC_DIOINFO would *not* be good enough, for example due to "xfer size
> > multiple" != "file seek offset alignment" being allowed.  In that case we would
> > need to define a new ioctl that is more general (like the one you described
> > below) rather than simply uplifting XFS_IOC_DIOINFO.
> 
> I don't think there are any currently, but if anyone ever redesigns
> DIOINFO we might as well make all those pieces explicit.
> 
> > More general is nice, but it's not helpful if no one will actually use the extra
> > information.  So we need to figure out what is actually useful.
> 
> <nod> Clearly I haven't wanted d_opt_fpos badly enough to propose
> revving the ioctl. ;)

I think the number of applications that use DIOINFO outside of
xfsprogs/xfsdump/fstests can probably be counted on one hand.

Debian code search tells me:
-qemu (under ifdef CONFIG_XFS)
-ceph 16.2 (seastar database support?)
-diod contains a copy of fsstress
-e2fsprogs contains a copy of fsstress
-openmpi (under ifdef SGIMPI)
-partclone - actually, that has a complete copy of the xfsprogs
	     libxfs/ and include/ directories in it, so it's using
	     the old libxfs_device_alignment() call that uses
	     XFS_IOC_DIOINFO, and only when building the xfsclone
	     binary.

Yup, I can count them on one six-fingered hand, and their only use is
when XFS filesystems are specifically discovered. :)

Hence I think it would be much more useful to application developers
to include the IO alignment information in statx(), not to lift an
ioctl that is pretty much unused and unknown outside the core XFS
development environment....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-20 22:04           ` [f2fs-dev] " Dave Chinner
@ 2022-01-20 22:48             ` Eric Biggers
  -1 siblings, 0 replies; 44+ messages in thread
From: Eric Biggers @ 2022-01-20 22:48 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Darrick J. Wong, Christoph Hellwig, linux-fscrypt, linux-fsdevel,
	linux-ext4, linux-f2fs-devel, linux-xfs, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Fri, Jan 21, 2022 at 09:04:14AM +1100, Dave Chinner wrote:
> On Thu, Jan 20, 2022 at 01:00:27PM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 20, 2022 at 12:39:14PM -0800, Eric Biggers wrote:
> > > On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> > > > On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > > > > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > > > > 
> > > > > > Given the above, as far as I know the only remaining objection to this
> > > > > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > > > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > > > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > > > > not specific to this feature, and it doesn't actually seem to be too
> > > > > > important in practice; many other filesystem features place constraints
> > > > > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > > > > (And for better or worse, many systems using fscrypt already have
> > > > > > out-of-tree patches that enable DIO support, and people don't seem to
> > > > > > have trouble with the FS block size alignment requirement.)
> > > > > 
> > > > > It might make sense to use this as an opportunity to implement
> > > > > XFS_IOC_DIOINFO for ext4 and f2fs.
> > > > 
> > > > Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> > > > list the /file/ position alignment requirement:
> > > > 
> > > > struct dioattr {
> > > > 	__u32		d_mem;		/* data buffer memory alignment */
> > > > 	__u32		d_miniosz;	/* min xfer size		*/
> > > > 	__u32		d_maxiosz;	/* max xfer size		*/
> > > > };
> > > 
> > > Well, the comment above struct dioattr says:
> > > 
> > > 	/*
> > > 	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
> > > 	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
> > > 	 * alignment.
> > > 	 */
> > > 
> > > So d_miniosz serves that purpose already.
> > > 
> > > > 
> > > > Since I /think/ fscrypt requires that directio writes be aligned to file
> > > > block size, right?
> > > 
> > > The file position must be a multiple of the filesystem block size, yes.
> > > Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
> > > buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
> > > good enough for the fscrypt direct I/O case.
> > 
> > Oh, ok then.  In that case, just hoist XFS_IOC_DIOINFO to the VFS and
> > add a couple of implementations for ext4 and f2fs, and I think that'll
> > be enough to get the fscrypt patchset moving again.
> 
> On the contrary, I'd much prefer to see this information added to
> statx(). The file offset alignment info is a property of the current
> file (e.g. XFS can have different per-file requirements depending on
> whether the file data is hosted on the data or RT device, etc) and
> so it's not a fixed property of the filesystem.
> 
> statx() was designed to be extended with per-file property
> information, and we already have stuff like filesystem block size in
> that syscall. Hence I would much prefer that we extend it with the
> DIO properties we need to support rather than "create" a new VFS
> ioctl to extract this information. We already have statx(), so let's
> use it for what it was intended for.
> 

I assumed that XFS_IOC_DIOINFO *was* per-file.  XFS's *implementation* of it
looks at the filesystem only, but that would be the expected implementation if
the DIO constraints don't currently vary between different files in XFS.

If DIO constraints do in fact already vary between different files in XFS, is
this just a bug in the XFS implementation of XFS_IOC_DIOINFO?  Or was
XFS_IOC_DIOINFO only ever intended to report per-filesystem state?  If the
latter, then yes, that would mean it wouldn't really be suitable to reuse to
start reporting per-file state.  (Per-file state is required for encrypted
files.  It's also required for other filesystem features; e.g., files that use
compression or fs-verity don't support direct I/O at all.)

- Eric

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-20 22:48             ` [f2fs-dev] " Eric Biggers
@ 2022-01-20 23:57               ` Dave Chinner
  -1 siblings, 0 replies; 44+ messages in thread
From: Dave Chinner @ 2022-01-20 23:57 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Darrick J. Wong, Christoph Hellwig, linux-fscrypt, linux-fsdevel,
	linux-ext4, linux-f2fs-devel, linux-xfs, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Thu, Jan 20, 2022 at 02:48:52PM -0800, Eric Biggers wrote:
> On Fri, Jan 21, 2022 at 09:04:14AM +1100, Dave Chinner wrote:
> > On Thu, Jan 20, 2022 at 01:00:27PM -0800, Darrick J. Wong wrote:
> > > On Thu, Jan 20, 2022 at 12:39:14PM -0800, Eric Biggers wrote:
> > > > On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> > > > > On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > > > > > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > > > > > 
> > > > > > > Given the above, as far as I know the only remaining objection to this
> > > > > > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > > > > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > > > > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > > > > > not specific to this feature, and it doesn't actually seem to be too
> > > > > > > important in practice; many other filesystem features place constraints
> > > > > > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > > > > > (And for better or worse, many systems using fscrypt already have
> > > > > > > out-of-tree patches that enable DIO support, and people don't seem to
> > > > > > > have trouble with the FS block size alignment requirement.)
> > > > > > 
> > > > > > It might make sense to use this as an opportunity to implement
> > > > > > XFS_IOC_DIOINFO for ext4 and f2fs.
> > > > > 
> > > > > Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> > > > > list the /file/ position alignment requirement:
> > > > > 
> > > > > struct dioattr {
> > > > > 	__u32		d_mem;		/* data buffer memory alignment */
> > > > > 	__u32		d_miniosz;	/* min xfer size		*/
> > > > > 	__u32		d_maxiosz;	/* max xfer size		*/
> > > > > };
> > > > 
> > > > Well, the comment above struct dioattr says:
> > > > 
> > > > 	/*
> > > > 	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
> > > > 	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
> > > > 	 * alignment.
> > > > 	 */
> > > > 
> > > > So d_miniosz serves that purpose already.
> > > > 
> > > > > 
> > > > > Since I /think/ fscrypt requires that directio writes be aligned to file
> > > > > block size, right?
> > > > 
> > > > The file position must be a multiple of the filesystem block size, yes.
> > > > Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
> > > > buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
> > > > good enough for the fscrypt direct I/O case.
> > > 
> > > Oh, ok then.  In that case, just hoist XFS_IOC_DIOINFO to the VFS and
> > > add a couple of implementations for ext4 and f2fs, and I think that'll
> > > be enough to get the fscrypt patchset moving again.
> > 
> > On the contrary, I'd much prefer to see this information added to
> > statx(). The file offset alignment info is a property of the current
> > file (e.g. XFS can have different per-file requirements depending on
> > whether the file data is hosted on the data or RT device, etc) and
> > so it's not a fixed property of the filesystem.
> > 
> > statx() was designed to be extended with per-file property
> > information, and we already have stuff like filesystem block size in
> > that syscall. Hence I would much prefer that we extend it with the
> > DIO properties we need to support rather than "create" a new VFS
> > ioctl to extract this information. We already have statx(), so let's
> > use it for what it was intended for.
> > 
> 
> I assumed that XFS_IOC_DIOINFO *was* per-file.  XFS's *implementation* of it
> looks at the filesystem only,

You've got that wrong.

        case XFS_IOC_DIOINFO: {
>>>>>>          struct xfs_buftarg      *target = xfs_inode_buftarg(ip);
                struct dioattr          da;

                da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;

xfs_inode_buftarg() is determining which block device the inode is
storing its data on, so the returned dioattr values can be
different for different inodes in the filesystem...

It's always been that way since the early Irix days - XFS RT devices
could have very different IO constraints than the data device and
DIO had to conform to the hardware limits underlying the filesystem.
Hence the dioattr information has -always- been per-inode
information.
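
As a sketch of how an application might apply the per-inode values
once it has them, the check below follows the header comment quoted
earlier in this thread: d_miniosz is treated as the minimum transfer
size, the transfer size multiple and the file offset alignment. The
struct is a local copy of the layout quoted above (real code would
take it from the XFS headers), and dio_request_ok() is a hypothetical
helper, not something XFS or xfsprogs provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the layout quoted in this thread, for illustration. */
struct dioattr {
	uint32_t d_mem;		/* data buffer memory alignment */
	uint32_t d_miniosz;	/* min xfer size */
	uint32_t d_maxiosz;	/* max xfer size */
};

/*
 * Hypothetical helper: would a direct I/O request for @len bytes at
 * file offset @pos, using buffer @buf, satisfy the constraints in @da?
 */
static bool dio_request_ok(const struct dioattr *da, const void *buf,
			   uint64_t pos, uint64_t len)
{
	assert(da != NULL && da->d_mem != 0 && da->d_miniosz != 0);

	if ((uintptr_t)buf % da->d_mem)
		return false;	/* buffer not aligned in memory */
	if (pos % da->d_miniosz)
		return false;	/* file offset not aligned */
	if (len < da->d_miniosz || len > da->d_maxiosz)
		return false;	/* transfer too small or too large */
	if (len % da->d_miniosz)
		return false;	/* transfer not a multiple of the minimum */

	return true;
}
```

Because the values come from the inode's own buffer target, the same
request can pass for a file on the data device and fail for a file on
the RT device.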

> (Per-file state is required for encrypted
> files.  It's also required for other filesystem features; e.g., files that use
> compression or fs-verity don't support direct I/O at all.)

Which is exactly why it should be a property of statx(), rather than
try to re-use a ~30 year old filesystem specific API from a
different OS that was never intended to indicate things like "DIO
not supported on this file at all"....

We've been bitten many times by this "lift a rarely used filesystem
specific ioctl to the VFS because it exists" method of API
promotion. It almost always ends up in us discovering further down
the track that there's something wrong with the API, it doesn't
quite do what we need, we have to extend it anyway, or it's just
plain borken, etc. And then we have to create a new, fit for purpose
API anyway, and there's two VFS APIs we have to maintain forever
instead of just one...

Can we learn from past mistakes this time instead of repeating them
yet again?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-20 23:57               ` [f2fs-dev] " Dave Chinner
@ 2022-01-21  2:36                 ` Darrick J. Wong
  -1 siblings, 0 replies; 44+ messages in thread
From: Darrick J. Wong @ 2022-01-21  2:36 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Eric Biggers, Christoph Hellwig, linux-fscrypt, linux-fsdevel,
	linux-ext4, linux-f2fs-devel, linux-xfs, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Fri, Jan 21, 2022 at 10:57:55AM +1100, Dave Chinner wrote:
> On Thu, Jan 20, 2022 at 02:48:52PM -0800, Eric Biggers wrote:
> > On Fri, Jan 21, 2022 at 09:04:14AM +1100, Dave Chinner wrote:
> > > On Thu, Jan 20, 2022 at 01:00:27PM -0800, Darrick J. Wong wrote:
> > > > On Thu, Jan 20, 2022 at 12:39:14PM -0800, Eric Biggers wrote:
> > > > > On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> > > > > > On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > > > > > > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > > > > > > 
> > > > > > > > Given the above, as far as I know the only remaining objection to this
> > > > > > > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > > > > > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > > > > > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > > > > > > not specific to this feature, and it doesn't actually seem to be too
> > > > > > > > important in practice; many other filesystem features place constraints
> > > > > > > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > > > > > > (And for better or worse, many systems using fscrypt already have
> > > > > > > > out-of-tree patches that enable DIO support, and people don't seem to
> > > > > > > > have trouble with the FS block size alignment requirement.)
> > > > > > > 
> > > > > > > It might make sense to use this as an opportunity to implement
> > > > > > > XFS_IOC_DIOINFO for ext4 and f2fs.
> > > > > > 
> > > > > > Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> > > > > > list the /file/ position alignment requirement:
> > > > > > 
> > > > > > struct dioattr {
> > > > > > 	__u32		d_mem;		/* data buffer memory alignment */
> > > > > > 	__u32		d_miniosz;	/* min xfer size		*/
> > > > > > 	__u32		d_maxiosz;	/* max xfer size		*/
> > > > > > };
> > > > > 
> > > > > Well, the comment above struct dioattr says:
> > > > > 
> > > > > 	/*
> > > > > 	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
> > > > > 	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
> > > > > 	 * alignment.
> > > > > 	 */
> > > > > 
> > > > > So d_miniosz serves that purpose already.
> > > > > 
> > > > > > 
> > > > > > Since I /think/ fscrypt requires that directio writes be aligned to file
> > > > > > block size, right?
> > > > > 
> > > > > The file position must be a multiple of the filesystem block size, yes.
> > > > > Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
> > > > > buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
> > > > > good enough for the fscrypt direct I/O case.
> > > > 
> > > > Oh, ok then.  In that case, just hoist XFS_IOC_DIOINFO to the VFS and
> > > > add a couple of implementations for ext4 and f2fs, and I think that'll
> > > > be enough to get the fscrypt patchset moving again.
> > > 
> > > On the contrary, I'd much prefer to see this information added to
> > > statx(). The file offset alignment info is a property of the current
> > > file (e.g. XFS can have different per-file requirements depending on
> > > whether the file data is hosted on the data or RT device, etc) and
> > > so it's not a fixed property of the filesystem.
> > > 
> > > statx() was designed to be extended with per-file property
> > > information, and we already have stuff like filesystem block size in
> > > that syscall. Hence I would much prefer that we extend it with the
> > > DIO properties we need to support rather than "create" a new VFS
> > > ioctl to extract this information. We already have statx(), so let's
> > > use it for what it was intended for.

Eh, ok.  Let's do that instead.

> > > 
> > 
> > I assumed that XFS_IOC_DIOINFO *was* per-file.  XFS's *implementation* of it
> > looks at the filesystem only,
> 
> You've got that wrong.
> 
>         case XFS_IOC_DIOINFO: {
> >>>>>>          struct xfs_buftarg      *target = xfs_inode_buftarg(ip);
>                 struct dioattr          da;
> 
>                 da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
> 
> xfs_inode_buftarg() is determining which block device the inode is
> storing its data on, so the returned dioattr values can be
> different for different inodes in the filesystem...
> 
> It's always been that way since the early Irix days - XFS RT devices
> could have very different IO constraints than the data device and
> DIO had to conform to the hardware limits underlying the filesystem.
> Hence the dioattr information has -always- been per-inode
> information.
> 
> > (Per-file state is required for encrypted
> > files.  It's also required for other filesystem features; e.g., files that use
> > compression or fs-verity don't support direct I/O at all.)
> 
> Which is exactly why it should be a property of statx(), rather than
> try to re-use a ~30 year old filesystem specific API from a
> different OS that was never intended to indicate things like "DIO
> not supported on this file at all"....

Heh.  You mean like ALLOCSP?  Ok ok point taken.

> We've been bitten many times by this "lift a rarely used filesystem
> specific ioctl to the VFS because it exists" method of API
> promotion. It almost always ends up in us discovering further down
> the track that there's something wrong with the API, it doesn't
> quite do what we need, we have to extend it anyway, or it's just
> plain borken, etc. And then we have to create a new, fit for purpose
> API anyway, and there's two VFS APIs we have to maintain forever
> instead of just one...
> 
> Can we learn from past mistakes this time instead of repeating them
> yet again?

Sure.  How's this?  I couldn't think of a real case of directio
requiring different alignments for pos and bytecount, so the only real
addition here is the alignment requirements for best performance.

struct statx {
...
	/* 0x90 */
	__u64	stx_mnt_id;

	/* Memory buffer alignment required for directio, in bytes. */
	__u32	stx_dio_mem_align;

	/* File range alignment required for directio, in bytes. */
	__u32	stx_dio_fpos_align_min;

	/* 0xa0 */

	/* File range alignment needed for best performance, in bytes. */
	__u32	stx_dio_fpos_align_opt;

	/* Maximum size of a directio request, in bytes. */
	__u32	stx_dio_max_iosize;

	__u64	__spare3[11];	/* Spare space for future expansion */
	/* 0x100 */
};

Along with:

#define STATX_DIRECTIO	0x00001000U	/* Want/got directio geometry */

How about that?
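
For a rough idea of how userspace might consume these fields, here is
a sketch under stated assumptions. struct dio_geometry is a made-up
stand-in holding only the four proposed fields (it is not the real
struct statx), the helper names are hypothetical, and treating a zero
alignment as "DIO not supported on this file" is an assumption of the
sketch; the proposal above does not define that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the four fields proposed above. */
struct dio_geometry {
	uint32_t stx_dio_mem_align;	 /* buffer alignment, bytes */
	uint32_t stx_dio_fpos_align_min; /* required file range alignment */
	uint32_t stx_dio_fpos_align_opt; /* file range alignment for best perf */
	uint32_t stx_dio_max_iosize;	 /* largest single request, bytes */
};

/* Assumption: a zero alignment means the file cannot do DIO at all. */
static bool dio_supported(const struct dio_geometry *g)
{
	assert(g != NULL);
	return g->stx_dio_mem_align != 0 && g->stx_dio_fpos_align_min != 0;
}

/* Does the request meet the required (minimum) constraints? */
static bool dio_range_ok(const struct dio_geometry *g, const void *buf,
			 uint64_t pos, uint64_t len)
{
	if (!dio_supported(g))
		return false;
	if ((uintptr_t)buf % g->stx_dio_mem_align)
		return false;
	if (pos % g->stx_dio_fpos_align_min || len % g->stx_dio_fpos_align_min)
		return false;
	return len != 0 && len <= g->stx_dio_max_iosize;
}

/* Is the file range also aligned for best performance? */
static bool dio_range_optimal(const struct dio_geometry *g,
			      uint64_t pos, uint64_t len)
{
	uint32_t a = g->stx_dio_fpos_align_opt;

	return a != 0 && pos % a == 0 && len % a == 0;
}
```

A caller would fill the structure from a statx() call that requests
STATX_DIRECTIO, then fall back to buffered I/O when dio_supported()
returns false.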

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [f2fs-dev] [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
@ 2022-01-21  2:36                 ` Darrick J. Wong
  0 siblings, 0 replies; 44+ messages in thread
From: Darrick J. Wong @ 2022-01-21  2:36 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Theodore Ts'o, linux-f2fs-devel,
	Eric Biggers, linux-fscrypt, linux-fsdevel, Jaegeuk Kim,
	linux-ext4, linux-xfs

On Fri, Jan 21, 2022 at 10:57:55AM +1100, Dave Chinner wrote:
> On Thu, Jan 20, 2022 at 02:48:52PM -0800, Eric Biggers wrote:
> > On Fri, Jan 21, 2022 at 09:04:14AM +1100, Dave Chinner wrote:
> > > On Thu, Jan 20, 2022 at 01:00:27PM -0800, Darrick J. Wong wrote:
> > > > On Thu, Jan 20, 2022 at 12:39:14PM -0800, Eric Biggers wrote:
> > > > > On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> > > > > > On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > > > > > > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > > > > > > 
> > > > > > > > Given the above, as far as I know the only remaining objection to this
> > > > > > > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > > > > > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > > > > > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > > > > > > not specific to this feature, and it doesn't actually seem to be too
> > > > > > > > important in practice; many other filesystem features place constraints
> > > > > > > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > > > > > > (And for better or worse, many systems using fscrypt already have
> > > > > > > > out-of-tree patches that enable DIO support, and people don't seem to
> > > > > > > > have trouble with the FS block size alignment requirement.)
> > > > > > > 
> > > > > > > It might make sense to use this as an opportunity to implement
> > > > > > > XFS_IOC_DIOINFO for ext4 and f2fs.
> > > > > > 
> > > > > > Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> > > > > > list the /file/ position alignment requirement:
> > > > > > 
> > > > > > struct dioattr {
> > > > > > 	__u32		d_mem;		/* data buffer memory alignment */
> > > > > > 	__u32		d_miniosz;	/* min xfer size		*/
> > > > > > 	__u32		d_maxiosz;	/* max xfer size		*/
> > > > > > };
> > > > > 
> > > > > Well, the comment above struct dioattr says:
> > > > > 
> > > > > 	/*
> > > > > 	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
> > > > > 	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
> > > > > 	 * alignment.
> > > > > 	 */
> > > > > 
> > > > > So d_miniosz serves that purpose already.
> > > > > 
> > > > > > 
> > > > > > Since I /think/ fscrypt requires that directio writes be aligned to file
> > > > > > block size, right?
> > > > > 
> > > > > The file position must be a multiple of the filesystem block size, yes.
> > > > > Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
> > > > > buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
> > > > > good enough for the fscrypt direct I/O case.
> > > > 
> > > > Oh, ok then.  In that case, just hoist XFS_IOC_DIOINFO to the VFS and
> > > > add a couple of implementations for ext4 and f2fs, and I think that'll
> > > > be enough to get the fscrypt patchset moving again.
> > > 
> > > On the contrary, I'd much prefer to see this information added to
> > > statx(). The file offset alignment info is a property of the current
> > > file (e.g. XFS can have different per-file requirements depending on
> > > whether the file data is hosted on the data or RT device, etc) and
> > > so it's not a fixed property of the filesystem.
> > > 
> > > statx() was designed to be extended with per-file property
> > > information, and we already have stuff like filesystem block size in
> > > that syscall. Hence I would much prefer that we extend it with the
> > > DIO properties we need to support rather than "create" a new VFS
> > > ioctl to extract this information. We already have statx(), so let's
> > > use it for what it was intended for.

Eh, ok.  Let's do that instead.

> > > 
> > 
> > I assumed that XFS_IOC_DIOINFO *was* per-file.  XFS's *implementation* of it
> > looks at the filesystem only,
> 
> You've got that wrong.
> 
>         case XFS_IOC_DIOINFO: {
> >>>>>>          struct xfs_buftarg      *target = xfs_inode_buftarg(ip);
>                 struct dioattr          da;
> 
>                 da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
> 
> xfs_inode_buftarg() is determining which block device the inode is
> storing its data on, so the returned dioattr values can be
> different for different inodes in the filesystem...
> 
> It's always been that way since the early Irix days - XFS RT devices
> could have very different IO constraints than the data device and
> DIO had to conform to the hardware limits underlying the filesystem.
> Hence the dioattr information has -always- been per-inode
> information.
> 
> > (Per-file state is required for encrypted
> > files.  It's also required for other filesystem features; e.g., files that use
> > compression or fs-verity don't support direct I/O at all.)
> 
> Which is exactly why it should be a property of statx(), rather than
> try to re-use a ~30 year old filesystem specific API from a
> different OS that was never intended to indicate things like "DIO
> not supported on this file at all"....

Heh.  You mean like ALLOCSP?  Ok ok point taken.

> We've been bitten many times by this "lift a rarely used filesystem
> specific ioctl to the VFS because it exists" method of API
> promotion. It almost always ends up in us discovering further down
> the track that there's something wrong with the API, it doesn't
> quite do what we need, we have to extend it anyway, or it's just
> plain borken, etc. And then we have to create a new, fit for purpose
> API anyway, and there's two VFS APIs we have to maintain forever
> instead of just one...
> 
> Can we learn from past mistakes this time instead of repeating them
> yet again?

Sure.  How's this?  I couldn't think of a real case of directio
requiring different alignments for pos and bytecount, so the only real
addition here is the alignment requirements for best performance.

struct statx {
...
	/* 0x90 */
	__u64	stx_mnt_id;

	/* Memory buffer alignment required for directio, in bytes. */
	__u32	stx_dio_mem_align;

	/* File range alignment required for directio, in bytes. */
	__u32	stx_dio_fpos_align_min;

	/* 0xa0 */

	/* File range alignment needed for best performance, in bytes. */
	__u32	stx_dio_fpos_align_opt;

	/* Maximum size of a directio request, in bytes. */
	__u32	stx_dio_max_iosize;

	__u64	__spare3[11];	/* Spare space for future expansion */
	/* 0x100 */
};

Along with:

#define STATX_DIRECTIO	0x00001000U	/* Want/got directio geometry */

How about that?
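
For illustration only, here is a minimal userspace sketch of how a caller
might validate one request against the two "required" fields above.  The
struct dio_align type and the dio_request_ok() helper are hypothetical names
invented for this sketch, and treating a zero alignment as "DIO unsupported"
is an assumption, not something the proposal specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace mirror of the two "required" fields proposed
 * above; this is not an existing kernel or libc structure.
 */
struct dio_align {
	uint32_t mem_align;	/* cf. stx_dio_mem_align */
	uint32_t fpos_align;	/* cf. stx_dio_fpos_align_min */
};

/* Return true if one DIO request satisfies both alignment requirements. */
static bool dio_request_ok(const struct dio_align *a, const void *buf,
			   uint64_t pos, uint64_t len)
{
	assert(a != NULL);

	/* Assumption of this sketch: zero means DIO is not supported. */
	if (a->mem_align == 0 || a->fpos_align == 0)
		return false;

	/* The memory buffer has its own alignment requirement. */
	if ((uintptr_t)buf % a->mem_align != 0)
		return false;

	/* File position and byte count share a single alignment value. */
	return pos % a->fpos_align == 0 && len % a->fpos_align == 0;
}
```

A caller that gets false back would fall back to buffered I/O or bounce the
data through a suitably aligned buffer.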

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 1/5] fscrypt: add functions for direct I/O support
  2022-01-20  9:04       ` [f2fs-dev] " Eric Biggers
@ 2022-01-21  7:10         ` Christoph Hellwig
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2022-01-21  7:10 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Christoph Hellwig, linux-fscrypt, linux-fsdevel, linux-ext4,
	linux-f2fs-devel, linux-xfs, Dave Chinner, Darrick J . Wong,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu, Satya Tangirala

On Thu, Jan 20, 2022 at 01:04:17AM -0800, Eric Biggers wrote:
> I actually had changed this from v9 because fscrypt_dio_supported() seemed
> backwards, given that its purpose is to check whether DIO is unsupported, not
> whether it's supported per se (and the function's comment reflected this).  What
> ext4 and f2fs do is check a list of reasons why DIO would *not* be supported,
> and if none apply, then it is supported.  This is just one of those reasons.
> 
> This is subjective though, so if people prefer the old way, I'll change it back.

I find a non-negated API much better, and it would also help with unwinding
the ext4/f2fs mess.  But I'm not going to block the series on such a
minor detail, of course.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-21  2:36                 ` [f2fs-dev] " Darrick J. Wong
@ 2022-01-21  7:12                   ` Christoph Hellwig
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2022-01-21  7:12 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Dave Chinner, Eric Biggers, Christoph Hellwig, linux-fscrypt,
	linux-fsdevel, linux-ext4, linux-f2fs-devel, linux-xfs,
	Theodore Ts'o, Jaegeuk Kim, Chao Yu

On Thu, Jan 20, 2022 at 06:36:03PM -0800, Darrick J. Wong wrote:
> Sure.  How's this?  I couldn't think of a real case of directio
> requiring different alignments for pos and bytecount, so the only real
> addition here is the alignment requirements for best performance.

While I see some benefits of adding the information to a catchall like
statx we really need to be careful to not bloat the structure like
crazy.

> struct statx {
> ...
> 	/* 0x90 */
> 	__u64	stx_mnt_id;
> 
> 	/* Memory buffer alignment required for directio, in bytes. */
> 	__u32	stx_dio_mem_align;
> 
> 	/* File range alignment required for directio, in bytes. */
> 	__u32	stx_dio_fpos_align_min;

So this really needs a good explanation of why we need both, given that we
had no real use case for this.

> 	/* File range alignment needed for best performance, in bytes. */
> 	__u32	stx_dio_fpos_align_opt;

And why we really care about this.  I guess you want to allow sector
size dio in reflink setups, but discourage it.  But is this really as
important?

> 	/* Maximum size of a directio request, in bytes. */
> 	__u32	stx_dio_max_iosize;

I know XFS_IOC_DIOINFO had this, but does it really make much sense?
Why do we need it for direct I/O and not buffered I/O?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-21  2:36                 ` [f2fs-dev] " Darrick J. Wong
@ 2022-01-23 23:03                   ` Dave Chinner
  -1 siblings, 0 replies; 44+ messages in thread
From: Dave Chinner @ 2022-01-23 23:03 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Eric Biggers, Christoph Hellwig, linux-fscrypt, linux-fsdevel,
	linux-ext4, linux-f2fs-devel, linux-xfs, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Thu, Jan 20, 2022 at 06:36:03PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 21, 2022 at 10:57:55AM +1100, Dave Chinner wrote:
> Sure.  How's this?  I couldn't think of a real case of directio
> requiring different alignments for pos and bytecount, so the only real
> addition here is the alignment requirements for best performance.
> 
> struct statx {
> ...
> 	/* 0x90 */
> 	__u64	stx_mnt_id;
> 
> 	/* Memory buffer alignment required for directio, in bytes. */
> 	__u32	stx_dio_mem_align;

	__u32	stx_mem_align_dio;

(for consistency with suggestions below)

> 
> 	/* File range alignment required for directio, in bytes. */
> 	__u32	stx_dio_fpos_align_min;

"fpos" is not really a user term - "offset" is the userspace term for
file position, and it's much less of a random letter salad if it's
named that way. Also, we don't need "min" in the name; the
description of the field in the man page can give all the gory
details about it being the minimum required alignment.

	__u32	stx_offset_align_dio;

> 
> 	/* 0xa0 */
> 
> 	/* File range alignment needed for best performance, in bytes. */
> 	__u32	stx_dio_fpos_align_opt;

This is a common property of both DIO and buffered IO, so no need
for it to be dio-only property.

	__u32	stx_offset_align_optimal;

> 
> 	/* Maximum size of a directio request, in bytes. */
> 	__u32	stx_dio_max_iosize;

Unnecessary, it will always be the syscall max IO size, because the
internal DIO code will slice and dice it down to the max sizes the
hardware supports.

> #define STATX_DIRECTIO	0x00001000U	/* Want/got directio geometry */
> 
> How about that?

Mostly seems reasonable at a first look.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [f2fs-dev] [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-01-23 23:03                   ` [f2fs-dev] " Dave Chinner
@ 2022-02-09  1:10                     ` Eric Biggers
  -1 siblings, 0 replies; 44+ messages in thread
From: Eric Biggers @ 2022-02-09  1:10 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-xfs, Theodore Ts'o, Darrick J. Wong, linux-f2fs-devel,
	Christoph Hellwig, linux-fscrypt, linux-fsdevel, Jaegeuk Kim,
	linux-ext4

On Mon, Jan 24, 2022 at 10:03:32AM +1100, Dave Chinner wrote:
> > 
> > 	/* 0xa0 */
> > 
> > 	/* File range alignment needed for best performance, in bytes. */
> > 	__u32	stx_dio_fpos_align_opt;
> 
> This is a common property of both DIO and buffered IO, so no need
> for it to be dio-only property.
> 
> 	__u32	stx_offset_align_optimal;
> 

Looking at this more closely: will stx_offset_align_optimal actually be useful,
given that st[x]_blksize already exists?

From the stat(2) and statx(2) man pages:

	st_blksize
		This field  gives  the  "preferred"  block  size  for  efficient
		filesystem I/O.

	stx_blksize
		The "preferred" block size for efficient filesystem I/O.  (Writ‐
		ing  to  a file in smaller chunks may cause an inefficient read-
		modify-rewrite.)

File offsets aren't explicitly mentioned, but I think it's implied they should
be a multiple of st[x]_blksize, just like the I/O size.  Otherwise, the I/O
would obviously require reading/writing partial blocks.
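
To make the partial-block point concrete, here is a small sketch that
computes the block-aligned range a filesystem would have to touch for an
arbitrary request.  The struct io_range type and covering_blocks() are
hypothetical names for this example only, and the sketch ignores overflow
of pos + len.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open byte range [start, end); hypothetical type for this sketch. */
struct io_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Expand [pos, pos + len) outward to multiples of blksize.  If the result
 * differs from the input range, the request covers partial blocks.
 */
static struct io_range covering_blocks(uint64_t pos, uint64_t len,
				       uint32_t blksize)
{
	struct io_range r;
	uint64_t end = pos + len;
	uint64_t rem;

	assert(blksize != 0);

	r.start = pos - pos % blksize;		/* round down */
	rem = end % blksize;
	r.end = rem ? end + (blksize - rem) : end;	/* round up */
	return r;
}
```

With a 4096-byte block size, a 50-byte request at offset 100 expands to the
whole first block, while a 4096-byte request at offset 4096 is returned
unchanged.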

So, the proposed stx_offset_align_optimal field sounds like the same thing to
me.  Is there anything I'm misunderstanding?

Putting stx_offset_align_optimal behind the STATX_DIRECTIO flag would also be
confusing if it would apply to both direct and buffered I/O.

- Eric


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
  2022-02-09  1:10                     ` Eric Biggers
@ 2022-02-10  4:03                       ` Dave Chinner
  -1 siblings, 0 replies; 44+ messages in thread
From: Dave Chinner @ 2022-02-10  4:03 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Darrick J. Wong, Christoph Hellwig, linux-fscrypt, linux-fsdevel,
	linux-ext4, linux-f2fs-devel, linux-xfs, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu

On Tue, Feb 08, 2022 at 05:10:03PM -0800, Eric Biggers wrote:
> On Mon, Jan 24, 2022 at 10:03:32AM +1100, Dave Chinner wrote:
> > > 
> > > 	/* 0xa0 */
> > > 
> > > 	/* File range alignment needed for best performance, in bytes. */
> > > 	__u32	stx_dio_fpos_align_opt;
> > 
> > This is a common property of both DIO and buffered IO, so no need
> > for it to be dio-only property.
> > 
> > 	__u32	stx_offset_align_optimal;
> > 
> 
> Looking at this more closely: will stx_offset_align_optimal actually be useful,
> given that st[x]_blksize already exists?

Yes, because....

> From the stat(2) and statx(2) man pages:
> 
> 	st_blksize
> 		This field  gives  the  "preferred"  block  size  for  efficient
> 		filesystem I/O.
> 
> 	stx_blksize
> 		The "preferred" block size for efficient filesystem I/O.  (Writ‐
> 		ing  to  a file in smaller chunks may cause an inefficient read-
> 		modify-rewrite.)

... historically speaking, this is intended to avoid RMW cycles for
sub-block and/or sub-PAGE_SIZE write() IOs. i.e. the practical
definition of st_blksize is the *minimum* IO size needed to
avoid page cache RMW cycles.

However, XFS has a "-o largeio" mount option that sets this value
to internal optimal filesystem alignment values such as stripe unit
or even stripe width (-o largeio,swalloc). This means it can be up
to 2GB (maybe larger?) in size.

The problem with this is that many applications are not prepared to
see a value of, say, 16MB in st_blksize rather than 4096 bytes. An
example of such problems is applications sizing their IO buffers as
a multiple of st_blksize - we've had applications fail because they
try to use multi-GB sized IO buffers as a result of setting
st_blksize to the filesystem/storage idea of optimal IO size rather
than PAGE_SIZE.

Hence, we can't really change the value of st_blksize without
risking random breakage in userspace. So the practical definition
of st_blksize is the *minimum* IO size that avoids RMW cycles for an
individual write() syscall, not the most efficient IO size.
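
As an illustration of the defensive sizing this implies for applications,
here is a minimal sketch that derives an IO buffer size from st_blksize
without trusting it blindly.  The function name and the IOBUF_MIN and
IOBUF_MAX bounds are assumptions chosen for this example, not values taken
from any filesystem or library.

```c
#include <assert.h>
#include <stddef.h>

#define IOBUF_MIN	((size_t)4096)		/* assumed floor for this sketch */
#define IOBUF_MAX	((size_t)1 << 20)	/* assumed 1 MiB cap for this sketch */

/*
 * Size a buffer as nblocks * blksize, but clamp the result so that a very
 * large st_blksize (e.g. a stripe width) cannot produce a multi-GB buffer.
 */
static size_t pick_iobuf_size(size_t blksize, size_t nblocks)
{
	assert(nblocks != 0);

	if (blksize < IOBUF_MIN)
		blksize = IOBUF_MIN;

	/* Compare by division so blksize * nblocks cannot overflow. */
	if (blksize > IOBUF_MAX / nblocks)
		return IOBUF_MAX;

	return blksize * nblocks;
}
```

With these bounds, a 4096-byte st_blksize and 16 blocks gives a 64 KiB
buffer, while a 16MB st_blksize is capped at 1 MiB instead of 256MB.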

> File offsets aren't explicitly mentioned, but I think it's implied they should
> be a multiple of st[x]_blksize, just like the I/O size.  Otherwise, the I/O
> would obviously require reading/writing partial blocks.

Of course it implies aligned file offsets - block aligned IO is
absolutely necessary for efficient filesystem IO. It has been for pretty
much the entirety of Unix history...

> So, the proposed stx_offset_align_optimal field sounds like the same thing to
> me.  Is there anything I'm misunderstanding?
>
> Putting stx_offset_align_optimal behind the STATX_DIRECTIO flag would also be
> confusing if it would apply to both direct and buffered I/O.

So just name the flag STATX_IOALIGN so that it can cover generic,
buffered specific and DIO specific parameters in one hit. Simple,
yes?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2022-02-10  4:22 UTC | newest]

Thread overview: 44+ messages
2022-01-20  7:12 [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto Eric Biggers
2022-01-20  7:12 ` [f2fs-dev] " Eric Biggers
2022-01-20  7:12 ` [PATCH v10 1/5] fscrypt: add functions for direct I/O support Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  8:27   ` Christoph Hellwig
2022-01-20  8:27     ` [f2fs-dev] " Christoph Hellwig
2022-01-20  9:04     ` Eric Biggers
2022-01-20  9:04       ` [f2fs-dev] " Eric Biggers
2022-01-21  7:10       ` Christoph Hellwig
2022-01-21  7:10         ` [f2fs-dev] " Christoph Hellwig
2022-01-20  7:12 ` [PATCH v10 2/5] iomap: support direct I/O with fscrypt using blk-crypto Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  8:28   ` Christoph Hellwig
2022-01-20  8:28     ` [f2fs-dev] " Christoph Hellwig
2022-01-20  7:12 ` [PATCH v10 3/5] ext4: " Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  7:12 ` [PATCH v10 4/5] f2fs: " Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  7:12 ` [PATCH v10 5/5] fscrypt: update documentation for direct I/O support Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  8:30 ` [f2fs-dev] [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto Christoph Hellwig
2022-01-20  8:30   ` Christoph Hellwig
2022-01-20 17:10   ` Darrick J. Wong
2022-01-20 17:10     ` [f2fs-dev] " Darrick J. Wong
2022-01-20 20:39     ` Eric Biggers
2022-01-20 20:39       ` [f2fs-dev] " Eric Biggers
2022-01-20 21:00       ` Darrick J. Wong
2022-01-20 21:00         ` [f2fs-dev] " Darrick J. Wong
2022-01-20 22:04         ` Dave Chinner
2022-01-20 22:04           ` [f2fs-dev] " Dave Chinner
2022-01-20 22:48           ` Eric Biggers
2022-01-20 22:48             ` [f2fs-dev] " Eric Biggers
2022-01-20 23:57             ` Dave Chinner
2022-01-20 23:57               ` [f2fs-dev] " Dave Chinner
2022-01-21  2:36               ` Darrick J. Wong
2022-01-21  2:36                 ` [f2fs-dev] " Darrick J. Wong
2022-01-21  7:12                 ` Christoph Hellwig
2022-01-21  7:12                   ` [f2fs-dev] " Christoph Hellwig
2022-01-23 23:03                 ` Dave Chinner
2022-01-23 23:03                   ` [f2fs-dev] " Dave Chinner
2022-02-09  1:10                   ` Eric Biggers
2022-02-09  1:10                     ` Eric Biggers
2022-02-10  4:03                     ` Dave Chinner
2022-02-10  4:03                       ` [f2fs-dev] " Dave Chinner
