linux-kernel.vger.kernel.org archive mirror
* [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto
@ 2020-11-17 14:07 Satya Tangirala
  2020-11-17 14:07 ` [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit Satya Tangirala
                   ` (9 more replies)
  0 siblings, 10 replies; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Satya Tangirala

This patch series adds support for direct I/O with fscrypt using
blk-crypto. It has been rebased on fscrypt/master (i.e. the "master"
branch of the fscrypt tree at
https://git.kernel.org/pub/scm/fs/fscrypt/fscrypt.git)

Patch 1 ensures that bios are not split in the middle of a crypto data
unit.

Until now, blk-crypto has expected the offset and length of each bvec in a
bio to be aligned to the crypto data unit size. Patch 2 enables
blk-crypto-fallback to work without this requirement. It also relaxes the
alignment check done by blk-crypto: now the only requirement is that the
total length of the I/O be aligned to the crypto data unit size.

Patch 3 adds two functions to fscrypt: one that filesystems call to
determine whether direct I/O is supported on a request, and one that limits
the number of blocks added to a bio so that DUNs remain contiguous.

Patches 4 and 5 modify direct-io and iomap respectively to set bio crypt
contexts on bios when appropriate by calling into fscrypt.

Patches 6 and 7 allow ext4 and f2fs direct I/O to support fscrypt without
falling back to buffered I/O.

Patch 8 updates the fscrypt documentation for direct I/O support.
The documentation now notes the required conditions for inline encryption
and direct I/O on encrypted files.

This patch series was tested by running xfstests with test_dummy_encryption,
both with and without the 'inlinecrypt' mount option, and there were no
meaningful regressions. The one regression was generic/587 on ext4, but that
test isn't compatible with test_dummy_encryption in the first place, and it
"incorrectly" passes without the 'inlinecrypt' mount option. A patch will be
sent out to exclude that test when test_dummy_encryption is enabled on ext4
(like the other quota-related tests that use user-visible quota files).

Changes v6 => v7:
 - add patches 1 and 2 to allow blk-crypto to work with user buffers not
   aligned to crypto data unit size, so that direct I/O doesn't require
   that alignment either.
 - some cleanups

Changes v5 => v6:
 - fix bug with fscrypt_limit_io_blocks() and make it ready for 64 bit
   block numbers.
 - remove Reviewed-by for Patch 1 due to significant changes from when
   the Reviewed-by was given.

Changes v4 => v5:
 - replace fscrypt_limit_io_pages() with fscrypt_limit_io_blocks(), which
   is now called by individual filesystems (currently only ext4) instead
   of the iomap code. This new function serves the same end purpose as
   the one it replaces (ensuring that DUNs within a bio are contiguous)
   but operates purely with blocks instead of with pages.
 - make iomap_dio_zero() set bio_crypt_ctx's again, instead of just a
   WARN_ON() since some folks prefer that instead.
 - add Reviewed-by's

Changes v3 => v4:
 - Fix bug in iomap_dio_bio_actor() where fscrypt_limit_io_pages() was
   being called too early (thanks Eric!)
 - Improve comments and fix formatting in documentation
 - iomap_dio_zero() is only called to zero out partial blocks, but
   direct I/O is only supported on encrypted files when I/O is
   blocksize aligned, so it doesn't need to set encryption contexts on
   bios. Replace setting the encryption context with a WARN_ON(). (Eric)

Changes v2 => v3:
 - add changelog to coverletter

Changes v1 => v2:
 - Fix bug in f2fs caused by replacing f2fs_post_read_required() with
   !fscrypt_dio_supported() since the latter doesn't check for
   compressed inodes unlike the former.
 - Add patches 6 and 7 for fscrypt documentation
 - cleanups and comments

Eric Biggers (5):
  fscrypt: add functions for direct I/O support
  direct-io: add support for fscrypt using blk-crypto
  iomap: support direct I/O with fscrypt using blk-crypto
  ext4: support direct I/O with fscrypt using blk-crypto
  f2fs: support direct I/O with fscrypt using blk-crypto

Satya Tangirala (3):
  block: ensure bios are not split in middle of crypto data unit
  blk-crypto: don't require user buffer alignment
  fscrypt: update documentation for direct I/O support

 Documentation/filesystems/fscrypt.rst |  21 ++-
 block/bio.c                           |   1 +
 block/blk-crypto-fallback.c           | 212 +++++++++++++++++++-------
 block/blk-crypto-internal.h           |  18 +++
 block/blk-crypto.c                    |  19 +--
 block/blk-merge.c                     |  96 ++++++++++--
 block/blk-mq.c                        |   3 +
 block/bounce.c                        |   4 +
 fs/crypto/crypto.c                    |   8 +
 fs/crypto/inline_crypt.c              |  74 +++++++++
 fs/direct-io.c                        |  15 +-
 fs/ext4/file.c                        |  10 +-
 fs/ext4/inode.c                       |   7 +
 fs/f2fs/f2fs.h                        |   6 +-
 fs/iomap/direct-io.c                  |   6 +
 include/linux/fscrypt.h               |  18 +++
 16 files changed, 431 insertions(+), 87 deletions(-)

-- 
2.29.2.299.gdc1121823c-goog



* [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
@ 2020-11-17 14:07 ` Satya Tangirala
  2020-11-17 23:31   ` Eric Biggers
  2020-11-25 22:15   ` Eric Biggers
  2020-11-17 14:07 ` [PATCH v7 2/8] blk-crypto: don't require user buffer alignment Satya Tangirala
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Satya Tangirala

Introduce blk_crypto_bio_sectors_alignment() that returns the required
alignment for the number of sectors in a bio. Any bio split must ensure
that the number of sectors in the resulting bios is aligned to that
returned value. This patch also updates __blk_queue_split(),
__blk_queue_bounce() and blk_crypto_split_bio_if_needed() to respect
blk_crypto_bio_sectors_alignment() when splitting bios.
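
For illustration only (not part of the diff below, and the helper name is
hypothetical), the rule every splitter now has to follow boils down to:

	/*
	 * Sketch: before splitting a bio at 'sectors', round the split
	 * point down to the crypto data unit alignment so that no data
	 * unit is cut in half.
	 */
	static struct bio *split_aligned_to_crypto_du(struct bio *bio,
						      unsigned int sectors,
						      struct bio_set *bs)
	{
		sectors = round_down(sectors,
				     blk_crypto_bio_sectors_alignment(bio));
		if (!sectors)
			return NULL;	/* caller must fail the bio (BLK_STS_IOERR) */
		return bio_split(bio, sectors, GFP_NOIO, bs);
	}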

Signed-off-by: Satya Tangirala <satyat@google.com>
---
 block/bio.c                 |  1 +
 block/blk-crypto-fallback.c | 10 ++--
 block/blk-crypto-internal.h | 18 +++++++
 block/blk-merge.c           | 96 ++++++++++++++++++++++++++++++++-----
 block/blk-mq.c              |  3 ++
 block/bounce.c              |  4 ++
 6 files changed, 117 insertions(+), 15 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fa01bef35bb1..259cef126df3 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1472,6 +1472,7 @@ struct bio *bio_split(struct bio *bio, int sectors,
 
 	BUG_ON(sectors <= 0);
 	BUG_ON(sectors >= bio_sectors(bio));
+	WARN_ON(!IS_ALIGNED(sectors, blk_crypto_bio_sectors_alignment(bio)));
 
 	/* Zone append commands cannot be split */
 	if (WARN_ON_ONCE(bio_op(bio) == REQ_OP_ZONE_APPEND))
diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
index c162b754efbd..db2d2c67b308 100644
--- a/block/blk-crypto-fallback.c
+++ b/block/blk-crypto-fallback.c
@@ -209,20 +209,22 @@ static bool blk_crypto_alloc_cipher_req(struct blk_ksm_keyslot *slot,
 static bool blk_crypto_split_bio_if_needed(struct bio **bio_ptr)
 {
 	struct bio *bio = *bio_ptr;
+	struct bio_crypt_ctx *bc = bio->bi_crypt_context;
 	unsigned int i = 0;
-	unsigned int num_sectors = 0;
+	unsigned int len = 0;
 	struct bio_vec bv;
 	struct bvec_iter iter;
 
 	bio_for_each_segment(bv, bio, iter) {
-		num_sectors += bv.bv_len >> SECTOR_SHIFT;
+		len += bv.bv_len;
 		if (++i == BIO_MAX_PAGES)
 			break;
 	}
-	if (num_sectors < bio_sectors(bio)) {
+	if (len < bio->bi_iter.bi_size) {
 		struct bio *split_bio;
 
-		split_bio = bio_split(bio, num_sectors, GFP_NOIO, NULL);
+		len = round_down(len, bc->bc_key->crypto_cfg.data_unit_size);
+		split_bio = bio_split(bio, len >> SECTOR_SHIFT, GFP_NOIO, NULL);
 		if (!split_bio) {
 			bio->bi_status = BLK_STS_RESOURCE;
 			return false;
diff --git a/block/blk-crypto-internal.h b/block/blk-crypto-internal.h
index 0d36aae538d7..304e90ed99f5 100644
--- a/block/blk-crypto-internal.h
+++ b/block/blk-crypto-internal.h
@@ -60,6 +60,19 @@ static inline bool blk_crypto_rq_is_encrypted(struct request *rq)
 	return rq->crypt_ctx;
 }
 
+/*
+ * Returns the alignment requirement for the number of sectors in this bio based
+ * on its bi_crypt_context. Any bios split from this bio must follow this
+ * alignment requirement as well.
+ */
+static inline unsigned int blk_crypto_bio_sectors_alignment(struct bio *bio)
+{
+	if (!bio_has_crypt_ctx(bio))
+		return 1;
+	return bio->bi_crypt_context->bc_key->crypto_cfg.data_unit_size >>
+								SECTOR_SHIFT;
+}
+
 #else /* CONFIG_BLK_INLINE_ENCRYPTION */
 
 static inline bool bio_crypt_rq_ctx_compatible(struct request *rq,
@@ -93,6 +106,11 @@ static inline bool blk_crypto_rq_is_encrypted(struct request *rq)
 	return false;
 }
 
+static inline unsigned int blk_crypto_bio_sectors_alignment(struct bio *bio)
+{
+	return 1;
+}
+
 #endif /* CONFIG_BLK_INLINE_ENCRYPTION */
 
 void __bio_crypt_advance(struct bio *bio, unsigned int bytes);
diff --git a/block/blk-merge.c b/block/blk-merge.c
index bcf5e4580603..f34dda7132f9 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -149,13 +149,15 @@ static inline unsigned get_max_io_size(struct request_queue *q,
 	unsigned pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
 	unsigned lbs = queue_logical_block_size(q) >> SECTOR_SHIFT;
 	unsigned start_offset = bio->bi_iter.bi_sector & (pbs - 1);
+	unsigned int bio_sectors_alignment =
+					blk_crypto_bio_sectors_alignment(bio);
 
 	max_sectors += start_offset;
 	max_sectors &= ~(pbs - 1);
-	if (max_sectors > start_offset)
-		return max_sectors - start_offset;
+	if (max_sectors - start_offset >= bio_sectors_alignment)
+		return round_down(max_sectors - start_offset, bio_sectors_alignment);
 
-	return sectors & ~(lbs - 1);
+	return round_down(sectors & ~(lbs - 1), bio_sectors_alignment);
 }
 
 static inline unsigned get_max_segment_size(const struct request_queue *q,
@@ -174,6 +176,41 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
 			(unsigned long)queue_max_segment_size(q));
 }
 
+/**
+ * update_aligned_sectors_and_segs() - Ensures that *@aligned_sectors is aligned
+ *				       to @bio_sectors_alignment, and that
+ *				       *@aligned_segs is the value of nsegs
+ *				       when sectors reached/first exceeded that
+ *				       value of *@aligned_sectors.
+ *
+ * @nsegs: [in] The current number of segs
+ * @sectors: [in] The current number of sectors
+ * @aligned_segs: [in,out] The number of segments that make up @aligned_sectors
+ * @aligned_sectors: [in,out] The largest number of sectors <= @sectors that is
+ *		     aligned to @bio_sectors_alignment
+ * @bio_sectors_alignment: [in] The alignment requirement for the number of
+ *			  sectors
+ *
+ * Updates *@aligned_sectors to the largest number <= @sectors that is also a
+ * multiple of @bio_sectors_alignment. This is done by updating *@aligned_sectors
+ * whenever @sectors is at least @bio_sectors_alignment more than
+ * *@aligned_sectors, since that means we can increment *@aligned_sectors while
+ * still keeping it aligned to @bio_sectors_alignment and also keeping it <=
+ * @sectors. *@aligned_segs is updated to the value of nsegs when @sectors first
+ * reaches/exceeds any value that causes *@aligned_sectors to be updated.
+ */
+static inline void update_aligned_sectors_and_segs(const unsigned int nsegs,
+						   const unsigned int sectors,
+						   unsigned int *aligned_segs,
+				unsigned int *aligned_sectors,
+				const unsigned int bio_sectors_alignment)
+{
+	if (sectors - *aligned_sectors < bio_sectors_alignment)
+		return;
+	*aligned_sectors = round_down(sectors, bio_sectors_alignment);
+	*aligned_segs = nsegs;
+}
+
 /**
  * bvec_split_segs - verify whether or not a bvec should be split in the middle
  * @q:        [in] request queue associated with the bio associated with @bv
@@ -195,9 +232,12 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
  * the block driver.
  */
 static bool bvec_split_segs(const struct request_queue *q,
-			    const struct bio_vec *bv, unsigned *nsegs,
-			    unsigned *sectors, unsigned max_segs,
-			    unsigned max_sectors)
+			    const struct bio_vec *bv, unsigned int *nsegs,
+			    unsigned int *sectors, unsigned int *aligned_segs,
+			    unsigned int *aligned_sectors,
+			    unsigned int bio_sectors_alignment,
+			    unsigned int max_segs,
+			    unsigned int max_sectors)
 {
 	unsigned max_len = (min(max_sectors, UINT_MAX >> 9) - *sectors) << 9;
 	unsigned len = min(bv->bv_len, max_len);
@@ -211,6 +251,11 @@ static bool bvec_split_segs(const struct request_queue *q,
 
 		(*nsegs)++;
 		total_len += seg_size;
+		update_aligned_sectors_and_segs(*nsegs,
+						*sectors + (total_len >> 9),
+						aligned_segs,
+						aligned_sectors,
+						bio_sectors_alignment);
 		len -= seg_size;
 
 		if ((bv->bv_offset + total_len) & queue_virt_boundary(q))
@@ -235,6 +280,8 @@ static bool bvec_split_segs(const struct request_queue *q,
  * following is guaranteed for the cloned bio:
  * - That it has at most get_max_io_size(@q, @bio) sectors.
  * - That it has at most queue_max_segments(@q) segments.
+ * - That the number of sectors in the returned bio is aligned to
+ *   blk_crypto_bio_sectors_alignment(@bio)
  *
  * Except for discard requests the cloned bio will point at the bi_io_vec of
  * the original bio. It is the responsibility of the caller to ensure that the
@@ -252,6 +299,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 	unsigned nsegs = 0, sectors = 0;
 	const unsigned max_sectors = get_max_io_size(q, bio);
 	const unsigned max_segs = queue_max_segments(q);
+	const unsigned int bio_sectors_alignment =
+					blk_crypto_bio_sectors_alignment(bio);
+	unsigned int aligned_segs = 0, aligned_sectors = 0;
 
 	bio_for_each_bvec(bv, bio, iter) {
 		/*
@@ -266,8 +316,14 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
 			nsegs++;
 			sectors += bv.bv_len >> 9;
-		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors, max_segs,
-					 max_sectors)) {
+			update_aligned_sectors_and_segs(nsegs, sectors,
+							&aligned_segs,
+							&aligned_sectors,
+							bio_sectors_alignment);
+		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors,
+					   &aligned_segs, &aligned_sectors,
+					   bio_sectors_alignment, max_segs,
+					   max_sectors)) {
 			goto split;
 		}
 
@@ -275,11 +331,24 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 		bvprvp = &bvprv;
 	}
 
+	/*
+	 * The input bio's number of sectors is assumed to be aligned to
+	 * bio_sectors_alignment. If that's the case, then this function should
+	 * ensure that aligned_segs == nsegs and aligned_sectors == sectors if
+	 * the bio is not going to be split.
+	 */
+	WARN_ON(aligned_segs != nsegs || aligned_sectors != sectors);
 	*segs = nsegs;
 	return NULL;
 split:
-	*segs = nsegs;
-	return bio_split(bio, sectors, GFP_NOIO, bs);
+	*segs = aligned_segs;
+	if (WARN_ON(aligned_sectors == 0))
+		goto err;
+	return bio_split(bio, aligned_sectors, GFP_NOIO, bs);
+err:
+	bio->bi_status = BLK_STS_IOERR;
+	bio_endio(bio);
+	return bio;
 }
 
 /**
@@ -366,6 +435,9 @@ unsigned int blk_recalc_rq_segments(struct request *rq)
 {
 	unsigned int nr_phys_segs = 0;
 	unsigned int nr_sectors = 0;
+	unsigned int nr_aligned_phys_segs = 0;
+	unsigned int nr_aligned_sectors = 0;
+	unsigned int bio_sectors_alignment;
 	struct req_iterator iter;
 	struct bio_vec bv;
 
@@ -381,9 +453,11 @@ unsigned int blk_recalc_rq_segments(struct request *rq)
 		return 1;
 	}
 
+	bio_sectors_alignment = blk_crypto_bio_sectors_alignment(rq->bio);
 	rq_for_each_bvec(bv, rq, iter)
 		bvec_split_segs(rq->q, &bv, &nr_phys_segs, &nr_sectors,
-				UINT_MAX, UINT_MAX);
+				&nr_aligned_phys_segs, &nr_aligned_sectors,
+				bio_sectors_alignment, UINT_MAX, UINT_MAX);
 	return nr_phys_segs;
 }
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 55bcee5dc032..de5c97ab8e5a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2161,6 +2161,9 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio)
 	blk_queue_bounce(q, &bio);
 	__blk_queue_split(&bio, &nr_segs);
 
+	if (bio->bi_status != BLK_STS_OK)
+		goto queue_exit;
+
 	if (!bio_integrity_prep(bio))
 		goto queue_exit;
 
diff --git a/block/bounce.c b/block/bounce.c
index 162a6eee8999..b15224799008 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -295,6 +295,7 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
 	bool bounce = false;
 	int sectors = 0;
 	bool passthrough = bio_is_passthrough(*bio_orig);
+	unsigned int bio_sectors_alignment;
 
 	bio_for_each_segment(from, *bio_orig, iter) {
 		if (i++ < BIO_MAX_PAGES)
@@ -305,6 +306,9 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
 	if (!bounce)
 		return;
 
+	bio_sectors_alignment = blk_crypto_bio_sectors_alignment(bio);
+	sectors = round_down(sectors, bio_sectors_alignment);
+
 	if (!passthrough && sectors < bio_sectors(*bio_orig)) {
 		bio = bio_split(*bio_orig, sectors, GFP_NOIO, &bounce_bio_split);
 		bio_chain(bio, *bio_orig);
-- 
2.29.2.299.gdc1121823c-goog



* [PATCH v7 2/8] blk-crypto: don't require user buffer alignment
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
  2020-11-17 14:07 ` [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit Satya Tangirala
@ 2020-11-17 14:07 ` Satya Tangirala
  2020-11-17 23:51   ` Eric Biggers
  2020-11-17 14:07 ` [PATCH v7 3/8] fscrypt: add functions for direct I/O support Satya Tangirala
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Satya Tangirala, Eric Biggers

Previously, blk-crypto-fallback required the offset and length of each bvec
in a bio to be aligned to the crypto data unit size. This patch enables
blk-crypto-fallback to work even if that's not the case - the requirement
now is only that the total length of the data in the bio is aligned to
the crypto data unit size.

Now that blk-crypto-fallback can handle bvecs not aligned to crypto data
units, and we've ensured that bios are not split in the middle of a
crypto data unit, we can relax the alignment check done by blk-crypto.

Co-developed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
---
 block/blk-crypto-fallback.c | 202 +++++++++++++++++++++++++++---------
 block/blk-crypto.c          |  19 +---
 2 files changed, 157 insertions(+), 64 deletions(-)

diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
index db2d2c67b308..619f0746ce02 100644
--- a/block/blk-crypto-fallback.c
+++ b/block/blk-crypto-fallback.c
@@ -251,6 +251,65 @@ static void blk_crypto_dun_to_iv(const u64 dun[BLK_CRYPTO_DUN_ARRAY_SIZE],
 		iv->dun[i] = cpu_to_le64(dun[i]);
 }
 
+/*
+ * If the length of any bio segment isn't a multiple of data_unit_size
+ * (which can happen if data_unit_size > logical_block_size), then each
+ * encryption/decryption might need to be passed multiple scatterlist elements.
+ * If that will be the case, this function allocates and initializes src and dst
+ * scatterlists (or a combined src/dst scatterlist) with the needed length.
+ *
+ * If 1 element is guaranteed to be enough (which is usually the case, and is
+ * guaranteed when data_unit_size <= logical_block_size), then this function
+ * just initializes the on-stack scatterlist(s).
+ */
+static bool blk_crypto_alloc_sglists(struct bio *bio,
+				     const struct bvec_iter *start_iter,
+				     unsigned int data_unit_size,
+				     struct scatterlist **src_p,
+				     struct scatterlist **dst_p)
+{
+	struct bio_vec bv;
+	struct bvec_iter iter;
+	bool aligned = true;
+	unsigned int count = 0;
+
+	__bio_for_each_segment(bv, bio, iter, *start_iter) {
+		count++;
+		aligned &= IS_ALIGNED(bv.bv_len, data_unit_size);
+	}
+	if (aligned) {
+		count = 1;
+	} else {
+		/*
+		 * We can't need more elements than bio segments, and we can't
+		 * need more than the number of sectors per data unit.  This may
+		 * overestimate the required length by a bit, but that's okay.
+		 */
+		count = min(count, data_unit_size >> SECTOR_SHIFT);
+	}
+
+	if (count > 1) {
+		*src_p = kmalloc_array(count, sizeof(struct scatterlist),
+				       GFP_NOIO);
+		if (!*src_p)
+			return false;
+		if (dst_p) {
+			*dst_p = kmalloc_array(count,
+					       sizeof(struct scatterlist),
+					       GFP_NOIO);
+			if (!*dst_p) {
+				kfree(*src_p);
+				*src_p = NULL;
+				return false;
+			}
+		}
+	}
+	sg_init_table(*src_p, count);
+	if (dst_p)
+		sg_init_table(*dst_p, count);
+	return true;
+}
+
 /*
  * The crypto API fallback's encryption routine.
  * Allocate a bounce bio for encryption, encrypt the input bio using crypto API,
@@ -267,9 +326,12 @@ static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr)
 	struct skcipher_request *ciph_req = NULL;
 	DECLARE_CRYPTO_WAIT(wait);
 	u64 curr_dun[BLK_CRYPTO_DUN_ARRAY_SIZE];
-	struct scatterlist src, dst;
+	struct scatterlist _src, *src = &_src;
+	struct scatterlist _dst, *dst = &_dst;
 	union blk_crypto_iv iv;
-	unsigned int i, j;
+	unsigned int i;
+	unsigned int sg_idx = 0;
+	unsigned int du_filled = 0;
 	bool ret = false;
 	blk_status_t blk_st;
 
@@ -281,11 +343,18 @@ static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr)
 	bc = src_bio->bi_crypt_context;
 	data_unit_size = bc->bc_key->crypto_cfg.data_unit_size;
 
+	/* Allocate scatterlists if needed */
+	if (!blk_crypto_alloc_sglists(src_bio, &src_bio->bi_iter,
+				      data_unit_size, &src, &dst)) {
+		src_bio->bi_status = BLK_STS_RESOURCE;
+		return false;
+	}
+
 	/* Allocate bounce bio for encryption */
 	enc_bio = blk_crypto_clone_bio(src_bio);
 	if (!enc_bio) {
 		src_bio->bi_status = BLK_STS_RESOURCE;
-		return false;
+		goto out_free_sglists;
 	}
 
 	/*
@@ -305,45 +374,57 @@ static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr)
 	}
 
 	memcpy(curr_dun, bc->bc_dun, sizeof(curr_dun));
-	sg_init_table(&src, 1);
-	sg_init_table(&dst, 1);
 
-	skcipher_request_set_crypt(ciph_req, &src, &dst, data_unit_size,
+	skcipher_request_set_crypt(ciph_req, src, dst, data_unit_size,
 				   iv.bytes);
 
-	/* Encrypt each page in the bounce bio */
+	/*
+	 * Encrypt each data unit in the bounce bio.
+	 *
+	 * Take care to handle the case where a data unit spans bio segments.
+	 * This can happen when data_unit_size > logical_block_size.
+	 */
 	for (i = 0; i < enc_bio->bi_vcnt; i++) {
-		struct bio_vec *enc_bvec = &enc_bio->bi_io_vec[i];
-		struct page *plaintext_page = enc_bvec->bv_page;
+		struct bio_vec *bv = &enc_bio->bi_io_vec[i];
+		struct page *plaintext_page = bv->bv_page;
 		struct page *ciphertext_page =
 			mempool_alloc(blk_crypto_bounce_page_pool, GFP_NOIO);
+		unsigned int offset_in_bv = 0;
 
-		enc_bvec->bv_page = ciphertext_page;
+		bv->bv_page = ciphertext_page;
 
 		if (!ciphertext_page) {
 			src_bio->bi_status = BLK_STS_RESOURCE;
 			goto out_free_bounce_pages;
 		}
 
-		sg_set_page(&src, plaintext_page, data_unit_size,
-			    enc_bvec->bv_offset);
-		sg_set_page(&dst, ciphertext_page, data_unit_size,
-			    enc_bvec->bv_offset);
-
-		/* Encrypt each data unit in this page */
-		for (j = 0; j < enc_bvec->bv_len; j += data_unit_size) {
-			blk_crypto_dun_to_iv(curr_dun, &iv);
-			if (crypto_wait_req(crypto_skcipher_encrypt(ciph_req),
-					    &wait)) {
-				i++;
-				src_bio->bi_status = BLK_STS_IOERR;
-				goto out_free_bounce_pages;
+		while (offset_in_bv < bv->bv_len) {
+			unsigned int n = min(bv->bv_len - offset_in_bv,
+					     data_unit_size - du_filled);
+			sg_set_page(&src[sg_idx], plaintext_page, n,
+				    bv->bv_offset + offset_in_bv);
+			sg_set_page(&dst[sg_idx], ciphertext_page, n,
+				    bv->bv_offset + offset_in_bv);
+			sg_idx++;
+			offset_in_bv += n;
+			du_filled += n;
+			if (du_filled == data_unit_size) {
+				blk_crypto_dun_to_iv(curr_dun, &iv);
+				if (crypto_wait_req(crypto_skcipher_encrypt(ciph_req),
+						    &wait)) {
+					src_bio->bi_status = BLK_STS_IOERR;
+					goto out_free_bounce_pages;
+				}
+				bio_crypt_dun_increment(curr_dun, 1);
+				sg_idx = 0;
+				du_filled = 0;
 			}
-			bio_crypt_dun_increment(curr_dun, 1);
-			src.offset += data_unit_size;
-			dst.offset += data_unit_size;
 		}
 	}
+	if (WARN_ON_ONCE(du_filled != 0)) {
+		src_bio->bi_status = BLK_STS_IOERR;
+		goto out_free_bounce_pages;
+	}
 
 	enc_bio->bi_private = src_bio;
 	enc_bio->bi_end_io = blk_crypto_fallback_encrypt_endio;
@@ -364,7 +445,11 @@ static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr)
 out_put_enc_bio:
 	if (enc_bio)
 		bio_put(enc_bio);
-
+out_free_sglists:
+	if (src != &_src)
+		kfree(src);
+	if (dst != &_dst)
+		kfree(dst);
 	return ret;
 }
 
@@ -383,13 +468,21 @@ static void blk_crypto_fallback_decrypt_bio(struct work_struct *work)
 	DECLARE_CRYPTO_WAIT(wait);
 	u64 curr_dun[BLK_CRYPTO_DUN_ARRAY_SIZE];
 	union blk_crypto_iv iv;
-	struct scatterlist sg;
+	struct scatterlist _sg, *sg = &_sg;
 	struct bio_vec bv;
 	struct bvec_iter iter;
 	const int data_unit_size = bc->bc_key->crypto_cfg.data_unit_size;
-	unsigned int i;
+	unsigned int sg_idx = 0;
+	unsigned int du_filled = 0;
 	blk_status_t blk_st;
 
+	/* Allocate scatterlist if needed */
+	if (!blk_crypto_alloc_sglists(bio, &f_ctx->crypt_iter, data_unit_size,
+				      &sg, NULL)) {
+		bio->bi_status = BLK_STS_RESOURCE;
+		goto out_no_sglists;
+	}
+
 	/*
 	 * Use the crypto API fallback keyslot manager to get a crypto_skcipher
 	 * for the algorithm and key specified for this bio.
@@ -407,33 +500,48 @@ static void blk_crypto_fallback_decrypt_bio(struct work_struct *work)
 	}
 
 	memcpy(curr_dun, bc->bc_dun, sizeof(curr_dun));
-	sg_init_table(&sg, 1);
-	skcipher_request_set_crypt(ciph_req, &sg, &sg, data_unit_size,
-				   iv.bytes);
+	skcipher_request_set_crypt(ciph_req, sg, sg, data_unit_size, iv.bytes);
 
-	/* Decrypt each segment in the bio */
+	/*
+	 * Decrypt each data unit in the bio.
+	 *
+	 * Take care to handle the case where a data unit spans bio segments.
+	 * This can happen when data_unit_size > logical_block_size.
+	 */
 	__bio_for_each_segment(bv, bio, iter, f_ctx->crypt_iter) {
-		struct page *page = bv.bv_page;
-
-		sg_set_page(&sg, page, data_unit_size, bv.bv_offset);
-
-		/* Decrypt each data unit in the segment */
-		for (i = 0; i < bv.bv_len; i += data_unit_size) {
-			blk_crypto_dun_to_iv(curr_dun, &iv);
-			if (crypto_wait_req(crypto_skcipher_decrypt(ciph_req),
-					    &wait)) {
-				bio->bi_status = BLK_STS_IOERR;
-				goto out;
+		unsigned int offset_in_bv = 0;
+
+		while (offset_in_bv < bv.bv_len) {
+			unsigned int n = min(bv.bv_len - offset_in_bv,
+					     data_unit_size - du_filled);
+			sg_set_page(&sg[sg_idx++], bv.bv_page, n,
+				    bv.bv_offset + offset_in_bv);
+			offset_in_bv += n;
+			du_filled += n;
+			if (du_filled == data_unit_size) {
+				blk_crypto_dun_to_iv(curr_dun, &iv);
+				if (crypto_wait_req(crypto_skcipher_decrypt(ciph_req),
+						    &wait)) {
+					bio->bi_status = BLK_STS_IOERR;
+					goto out;
+				}
+				bio_crypt_dun_increment(curr_dun, 1);
+				sg_idx = 0;
+				du_filled = 0;
 			}
-			bio_crypt_dun_increment(curr_dun, 1);
-			sg.offset += data_unit_size;
 		}
 	}
-
+	if (WARN_ON_ONCE(du_filled != 0)) {
+		bio->bi_status = BLK_STS_IOERR;
+		goto out;
+	}
 out:
 	skcipher_request_free(ciph_req);
 	blk_ksm_put_slot(slot);
 out_no_keyslot:
+	if (sg != &_sg)
+		kfree(sg);
+out_no_sglists:
 	mempool_free(f_ctx, bio_fallback_crypt_ctx_pool);
 	bio_endio(bio);
 }
diff --git a/block/blk-crypto.c b/block/blk-crypto.c
index 5da43f0973b4..fcee0038f7e0 100644
--- a/block/blk-crypto.c
+++ b/block/blk-crypto.c
@@ -200,22 +200,6 @@ bool bio_crypt_ctx_mergeable(struct bio_crypt_ctx *bc1, unsigned int bc1_bytes,
 	return !bc1 || bio_crypt_dun_is_contiguous(bc1, bc1_bytes, bc2->bc_dun);
 }
 
-/* Check that all I/O segments are data unit aligned. */
-static bool bio_crypt_check_alignment(struct bio *bio)
-{
-	const unsigned int data_unit_size =
-		bio->bi_crypt_context->bc_key->crypto_cfg.data_unit_size;
-	struct bvec_iter iter;
-	struct bio_vec bv;
-
-	bio_for_each_segment(bv, bio, iter) {
-		if (!IS_ALIGNED(bv.bv_len | bv.bv_offset, data_unit_size))
-			return false;
-	}
-
-	return true;
-}
-
 blk_status_t __blk_crypto_init_request(struct request *rq)
 {
 	return blk_ksm_get_slot_for_key(rq->q->ksm, rq->crypt_ctx->bc_key,
@@ -271,7 +255,8 @@ bool __blk_crypto_bio_prep(struct bio **bio_ptr)
 		goto fail;
 	}
 
-	if (!bio_crypt_check_alignment(bio)) {
+	if (!IS_ALIGNED(bio->bi_iter.bi_size,
+			bc_key->crypto_cfg.data_unit_size)) {
 		bio->bi_status = BLK_STS_IOERR;
 		goto fail;
 	}
-- 
2.29.2.299.gdc1121823c-goog



* [PATCH v7 3/8] fscrypt: add functions for direct I/O support
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
  2020-11-17 14:07 ` [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit Satya Tangirala
  2020-11-17 14:07 ` [PATCH v7 2/8] blk-crypto: don't require user buffer alignment Satya Tangirala
@ 2020-11-17 14:07 ` Satya Tangirala
  2020-11-17 14:07 ` [PATCH v7 4/8] direct-io: add support for fscrypt using blk-crypto Satya Tangirala
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Eric Biggers, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Introduce fscrypt_dio_supported() to check whether a direct I/O request
is unsupported due to encryption constraints.

Also introduce fscrypt_limit_io_blocks() to limit how many blocks can be
added to a bio being prepared for direct I/O. This is needed for
filesystems that use the iomap direct I/O implementation to avoid DUN
wraparound in the middle of a bio (which is possible with the
IV_INO_LBLK_32 IV generation method). Elsewhere fscrypt_mergeable_bio()
is used for this, but iomap operates on logical ranges directly, so
filesystems using iomap won't have a chance to call fscrypt_mergeable_bio()
on every block added to a bio. So we need this function which limits a
logical range in one go.
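
For illustration only (not part of this patch, and the surrounding code is
hypothetical), a filesystem would typically use the two helpers roughly
like this:

	/* In the direct I/O entry point: */
	if (!fscrypt_dio_supported(iocb, iter))
		return -ENOTBLK;	/* or fall back to buffered I/O */

	/* In ->iomap_begin, after mapping nr_blocks blocks starting at lblk: */
	nr_blocks = fscrypt_limit_io_blocks(inode, lblk, nr_blocks);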

Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
---
 fs/crypto/crypto.c       |  8 +++++
 fs/crypto/inline_crypt.c | 74 ++++++++++++++++++++++++++++++++++++++++
 include/linux/fscrypt.h  | 18 ++++++++++
 3 files changed, 100 insertions(+)

diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index 4ef3f714046a..4fcca79f39ae 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -69,6 +69,14 @@ void fscrypt_free_bounce_page(struct page *bounce_page)
 }
 EXPORT_SYMBOL(fscrypt_free_bounce_page);
 
+/*
+ * Generate the IV for the given logical block number within the given file.
+ * For filenames encryption, lblk_num == 0.
+ *
+ * Keep this in sync with fscrypt_limit_io_blocks().  fscrypt_limit_io_blocks()
+ * needs to know about any IV generation methods where the low bits of IV don't
+ * simply contain the lblk_num (e.g., IV_INO_LBLK_32).
+ */
 void fscrypt_generate_iv(union fscrypt_iv *iv, u64 lblk_num,
 			 const struct fscrypt_info *ci)
 {
diff --git a/fs/crypto/inline_crypt.c b/fs/crypto/inline_crypt.c
index c57bebfa48fe..956f5bfab7a0 100644
--- a/fs/crypto/inline_crypt.c
+++ b/fs/crypto/inline_crypt.c
@@ -17,6 +17,7 @@
 #include <linux/buffer_head.h>
 #include <linux/sched/mm.h>
 #include <linux/slab.h>
+#include <linux/uio.h>
 
 #include "fscrypt_private.h"
 
@@ -363,3 +364,76 @@ bool fscrypt_mergeable_bio_bh(struct bio *bio,
 	return fscrypt_mergeable_bio(bio, inode, next_lblk);
 }
 EXPORT_SYMBOL_GPL(fscrypt_mergeable_bio_bh);
+
+/**
+ * fscrypt_dio_supported() - check whether a direct I/O request is unsupported
+ *			     due to encryption constraints
+ * @iocb: the file and position the I/O is targeting
+ * @iter: the I/O data segment(s)
+ *
+ * Return: true if direct I/O is supported
+ */
+bool fscrypt_dio_supported(struct kiocb *iocb, struct iov_iter *iter)
+{
+	const struct inode *inode = file_inode(iocb->ki_filp);
+	const unsigned int blocksize = i_blocksize(inode);
+
+	/* If the file is unencrypted, no veto from us. */
+	if (!fscrypt_needs_contents_encryption(inode))
+		return true;
+
+	/* We only support direct I/O with inline crypto, not fs-layer crypto */
+	if (!fscrypt_inode_uses_inline_crypto(inode))
+		return false;
+
+	/*
+	 * Since the granularity of encryption is filesystem blocks, the I/O
+	 * must be block aligned -- not just disk sector aligned.
+	 */
+	if (!IS_ALIGNED(iocb->ki_pos | iov_iter_count(iter), blocksize))
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(fscrypt_dio_supported);
+
+/**
+ * fscrypt_limit_io_blocks() - limit I/O blocks to avoid discontiguous DUNs
+ * @inode: the file on which I/O is being done
+ * @lblk: the block at which the I/O is being started from
+ * @nr_blocks: the number of blocks we want to submit starting at @lblk
+ *
+ * Determine the limit to the number of blocks that can be submitted in the bio
+ * targeting @lblk without causing a data unit number (DUN) discontinuity.
+ *
+ * This is normally just @nr_blocks, as normally the DUNs just increment along
+ * with the logical blocks.  (Or the file is not encrypted.)
+ *
+ * In rare cases, fscrypt can be using an IV generation method that allows the
+ * DUN to wrap around within logically contiguous blocks, and that wraparound
+ * will occur.  If this happens, a value less than @nr_blocks will be returned
+ * so that the wraparound doesn't occur in the middle of the bio.
+ *
+ * Return: the actual number of blocks that can be submitted
+ */
+u64 fscrypt_limit_io_blocks(const struct inode *inode, u64 lblk, u64 nr_blocks)
+{
+	const struct fscrypt_info *ci = inode->i_crypt_info;
+	u32 dun;
+
+	if (!fscrypt_inode_uses_inline_crypto(inode))
+		return nr_blocks;
+
+	if (nr_blocks <= 1)
+		return nr_blocks;
+
+	if (!(fscrypt_policy_flags(&ci->ci_policy) &
+	      FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32))
+		return nr_blocks;
+
+	/* With IV_INO_LBLK_32, the DUN can wrap around from U32_MAX to 0. */
+
+	dun = ci->ci_hashed_ino + lblk;
+
+	return min_t(u64, nr_blocks, (u64)U32_MAX + 1 - dun);
+}
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index a8f7a43f031b..39cce302660b 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -567,6 +567,10 @@ bool fscrypt_mergeable_bio(struct bio *bio, const struct inode *inode,
 bool fscrypt_mergeable_bio_bh(struct bio *bio,
 			      const struct buffer_head *next_bh);
 
+bool fscrypt_dio_supported(struct kiocb *iocb, struct iov_iter *iter);
+
+u64 fscrypt_limit_io_blocks(const struct inode *inode, u64 lblk, u64 nr_blocks);
+
 #else /* CONFIG_FS_ENCRYPTION_INLINE_CRYPT */
 
 static inline bool __fscrypt_inode_uses_inline_crypto(const struct inode *inode)
@@ -595,6 +599,20 @@ static inline bool fscrypt_mergeable_bio_bh(struct bio *bio,
 {
 	return true;
 }
+
+static inline bool fscrypt_dio_supported(struct kiocb *iocb,
+					 struct iov_iter *iter)
+{
+	const struct inode *inode = file_inode(iocb->ki_filp);
+
+	return !fscrypt_needs_contents_encryption(inode);
+}
+
+static inline u64 fscrypt_limit_io_blocks(const struct inode *inode, u64 lblk,
+					  u64 nr_blocks)
+{
+	return nr_blocks;
+}
 #endif /* !CONFIG_FS_ENCRYPTION_INLINE_CRYPT */
 
 /**
-- 
2.29.2.299.gdc1121823c-goog



* [PATCH v7 4/8] direct-io: add support for fscrypt using blk-crypto
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
                   ` (2 preceding siblings ...)
  2020-11-17 14:07 ` [PATCH v7 3/8] fscrypt: add functions for direct I/O support Satya Tangirala
@ 2020-11-17 14:07 ` Satya Tangirala
  2020-11-17 14:07 ` [PATCH v7 5/8] iomap: support direct I/O with " Satya Tangirala
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Eric Biggers, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Set bio crypt contexts on bios by calling into fscrypt when required,
and explicitly check for DUN continuity when adding pages to the bio.
(While DUN continuity is usually implied by logical block contiguity,
this is not the case when using certain fscrypt IV generation methods
like IV_INO_LBLK_32).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/direct-io.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index d53fa92a1ab6..f6672c4030e3 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -24,6 +24,7 @@
 #include <linux/module.h>
 #include <linux/types.h>
 #include <linux/fs.h>
+#include <linux/fscrypt.h>
 #include <linux/mm.h>
 #include <linux/slab.h>
 #include <linux/highmem.h>
@@ -392,6 +393,7 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio,
 	      sector_t first_sector, int nr_vecs)
 {
 	struct bio *bio;
+	struct inode *inode = dio->inode;
 
 	/*
 	 * bio_alloc() is guaranteed to return a bio when allowed to sleep and
@@ -399,6 +401,9 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio,
 	 */
 	bio = bio_alloc(GFP_KERNEL, nr_vecs);
 
+	fscrypt_set_bio_crypt_ctx(bio, inode,
+				  sdio->cur_page_fs_offset >> inode->i_blkbits,
+				  GFP_KERNEL);
 	bio_set_dev(bio, bdev);
 	bio->bi_iter.bi_sector = first_sector;
 	bio_set_op_attrs(bio, dio->op, dio->op_flags);
@@ -763,9 +768,17 @@ static inline int dio_send_cur_page(struct dio *dio, struct dio_submit *sdio,
 		 * current logical offset in the file does not equal what would
 		 * be the next logical offset in the bio, submit the bio we
 		 * have.
+		 *
+		 * When fscrypt inline encryption is used, data unit number
+		 * (DUN) contiguity is also required.  Normally that's implied
+		 * by logical contiguity.  However, certain IV generation
+		 * methods (e.g. IV_INO_LBLK_32) don't guarantee it.  So, we
+		 * must explicitly check fscrypt_mergeable_bio() too.
 		 */
 		if (sdio->final_block_in_bio != sdio->cur_page_block ||
-		    cur_offset != bio_next_offset)
+		    cur_offset != bio_next_offset ||
+		    !fscrypt_mergeable_bio(sdio->bio, dio->inode,
+					   cur_offset >> dio->inode->i_blkbits))
 			dio_bio_submit(dio, sdio);
 	}
 
-- 
2.29.2.299.gdc1121823c-goog



* [PATCH v7 5/8] iomap: support direct I/O with fscrypt using blk-crypto
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
                   ` (3 preceding siblings ...)
  2020-11-17 14:07 ` [PATCH v7 4/8] direct-io: add support for fscrypt using blk-crypto Satya Tangirala
@ 2020-11-17 14:07 ` Satya Tangirala
  2020-11-17 14:07 ` [PATCH v7 6/8] ext4: " Satya Tangirala
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Eric Biggers, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Set bio crypt contexts on bios by calling into fscrypt when required.
No DUN contiguity checks are done here; callers are expected to set up the
iomap correctly (by calling fscrypt_limit_io_blocks() appropriately) to
ensure that each bio submitted by iomap will not contain blocks with
discontiguous DUNs.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
---
 fs/iomap/direct-io.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 933f234d5bec..b4240cc3c9f9 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -6,6 +6,7 @@
 #include <linux/module.h>
 #include <linux/compiler.h>
 #include <linux/fs.h>
+#include <linux/fscrypt.h>
 #include <linux/iomap.h>
 #include <linux/backing-dev.h>
 #include <linux/uio.h>
@@ -185,11 +186,14 @@ static void
 iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 		unsigned len)
 {
+	struct inode *inode = file_inode(dio->iocb->ki_filp);
 	struct page *page = ZERO_PAGE(0);
 	int flags = REQ_SYNC | REQ_IDLE;
 	struct bio *bio;
 
 	bio = bio_alloc(GFP_KERNEL, 1);
+	fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
+				  GFP_KERNEL);
 	bio_set_dev(bio, iomap->bdev);
 	bio->bi_iter.bi_sector = iomap_sector(iomap, pos);
 	bio->bi_private = dio;
@@ -272,6 +276,8 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		}
 
 		bio = bio_alloc(GFP_KERNEL, nr_pages);
+		fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
+					  GFP_KERNEL);
 		bio_set_dev(bio, iomap->bdev);
 		bio->bi_iter.bi_sector = iomap_sector(iomap, pos);
 		bio->bi_write_hint = dio->iocb->ki_hint;
-- 
2.29.2.299.gdc1121823c-goog



* [PATCH v7 6/8] ext4: support direct I/O with fscrypt using blk-crypto
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
                   ` (4 preceding siblings ...)
  2020-11-17 14:07 ` [PATCH v7 5/8] iomap: support direct I/O with " Satya Tangirala
@ 2020-11-17 14:07 ` Satya Tangirala
  2020-12-03 13:45   ` Theodore Y. Ts'o
  2020-11-17 14:07 ` [PATCH v7 7/8] f2fs: " Satya Tangirala
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Eric Biggers, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Wire up ext4 with fscrypt direct I/O support. Direct I/O with fscrypt is
only supported through blk-crypto (i.e. CONFIG_BLK_INLINE_ENCRYPTION must
have been enabled, the 'inlinecrypt' mount option must have been specified,
and either hardware inline encryption support must be present or
CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK must have been enabled). Further,
direct I/O on encrypted files is only supported when the *length* of the
I/O is aligned to the filesystem block size (which is *not* necessarily the
same as the block device's block size).

fscrypt_limit_io_blocks() is called before setting up the iomap to ensure
that the blocks of each bio that iomap will submit will have contiguous
DUNs. Note that fscrypt_limit_io_blocks() is normally a no-op, as normally
the DUNs simply increment along with the logical blocks. But it's needed
to handle an edge case in one of the fscrypt IV generation methods.
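
To illustrate the length alignment rule from userspace (a sketch only;
error handling is omitted and the file path is made up):

	/*
	 * With a 4K filesystem block size, direct I/O on an encrypted file
	 * must use offsets and lengths that are multiples of 4096, even if
	 * the block device advertises 512-byte logical blocks.
	 */
	int fd = open("/mnt/enc/file", O_RDONLY | O_DIRECT);
	void *buf;

	posix_memalign(&buf, 4096, 8192);
	pread(fd, buf, 4096, 0);	/* stays direct I/O */
	pread(fd, buf, 512, 4096);	/* falls back to buffered I/O */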

Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/ext4/file.c  | 10 ++++++----
 fs/ext4/inode.c |  7 +++++++
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 3ed8c048fb12..be77b7732c8e 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -36,9 +36,11 @@
 #include "acl.h"
 #include "truncate.h"
 
-static bool ext4_dio_supported(struct inode *inode)
+static bool ext4_dio_supported(struct kiocb *iocb, struct iov_iter *iter)
 {
-	if (IS_ENABLED(CONFIG_FS_ENCRYPTION) && IS_ENCRYPTED(inode))
+	struct inode *inode = file_inode(iocb->ki_filp);
+
+	if (!fscrypt_dio_supported(iocb, iter))
 		return false;
 	if (fsverity_active(inode))
 		return false;
@@ -61,7 +63,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		inode_lock_shared(inode);
 	}
 
-	if (!ext4_dio_supported(inode)) {
+	if (!ext4_dio_supported(iocb, to)) {
 		inode_unlock_shared(inode);
 		/*
 		 * Fallback to buffered I/O if the operation being performed on
@@ -495,7 +497,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	}
 
 	/* Fallback to buffered I/O if the inode does not support direct I/O. */
-	if (!ext4_dio_supported(inode)) {
+	if (!ext4_dio_supported(iocb, from)) {
 		if (ilock_shared)
 			inode_unlock_shared(inode);
 		else
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0d8385aea898..0ef3d805bb8c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3473,6 +3473,13 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	if (ret < 0)
 		return ret;
 out:
+	/*
+	 * When inline encryption is enabled, sometimes I/O to an encrypted file
+	 * has to be broken up to guarantee DUN contiguity. Handle this by
+	 * limiting the length of the mapping returned.
+	 */
+	map.m_len = fscrypt_limit_io_blocks(inode, map.m_lblk, map.m_len);
+
 	ext4_set_iomap(inode, iomap, &map, offset, length);
 
 	return 0;
-- 
2.29.2.299.gdc1121823c-goog



* [PATCH v7 7/8] f2fs: support direct I/O with fscrypt using blk-crypto
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
                   ` (5 preceding siblings ...)
  2020-11-17 14:07 ` [PATCH v7 6/8] ext4: " Satya Tangirala
@ 2020-11-17 14:07 ` Satya Tangirala
  2020-11-17 14:07 ` [PATCH v7 8/8] fscrypt: update documentation for direct I/O support Satya Tangirala
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Eric Biggers, Satya Tangirala

From: Eric Biggers <ebiggers@google.com>

Wire up f2fs with fscrypt direct I/O support. Direct I/O with fscrypt is
only supported through blk-crypto (i.e. CONFIG_BLK_INLINE_ENCRYPTION must
have been enabled, the 'inlinecrypt' mount option must have been specified,
and either hardware inline encryption support must be present or
CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK must have been enabled). Further,
direct I/O on encrypted files is only supported when the *length* of the
I/O is aligned to the filesystem block size (which is *not* necessarily the
same as the block device's block size).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/f2fs.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index cb700d797296..d518e668618e 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4120,7 +4120,11 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	int rw = iov_iter_rw(iter);
 
-	if (f2fs_post_read_required(inode))
+	if (!fscrypt_dio_supported(iocb, iter))
+		return true;
+	if (fsverity_active(inode))
+		return true;
+	if (f2fs_compressed_file(inode))
 		return true;
 	if (f2fs_is_multi_device(sbi))
 		return true;
-- 
2.29.2.299.gdc1121823c-goog



* [PATCH v7 8/8] fscrypt: update documentation for direct I/O support
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
                   ` (6 preceding siblings ...)
  2020-11-17 14:07 ` [PATCH v7 7/8] f2fs: " Satya Tangirala
@ 2020-11-17 14:07 ` Satya Tangirala
  2020-11-18  2:38   ` Eric Biggers
  2020-11-17 17:15 ` [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Theodore Y. Ts'o
  2020-11-18  2:51 ` Eric Biggers
  9 siblings, 1 reply; 21+ messages in thread
From: Satya Tangirala @ 2020-11-17 14:07 UTC (permalink / raw)
  To: Theodore Y . Ts'o, Jaegeuk Kim, Eric Biggers, Chao Yu,
	Jens Axboe, Darrick J . Wong
  Cc: linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Satya Tangirala, Eric Biggers

Update fscrypt documentation to reflect the addition of direct I/O support
and document the necessary conditions for direct I/O on encrypted files.

Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 Documentation/filesystems/fscrypt.rst | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 44b67ebd6e40..757b8aa2af9b 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -1047,8 +1047,10 @@ astute users may notice some differences in behavior:
   may be used to overwrite the source files but isn't guaranteed to be
   effective on all filesystems and storage devices.
 
-- Direct I/O is not supported on encrypted files.  Attempts to use
-  direct I/O on such files will fall back to buffered I/O.
+- Direct I/O is supported on encrypted files only under some
+  circumstances (see `Direct I/O support`_ for details). When these
+  circumstances are not met, attempts to use direct I/O on encrypted
+  files will fall back to buffered I/O.
 
 - The fallocate operations FALLOC_FL_COLLAPSE_RANGE and
   FALLOC_FL_INSERT_RANGE are not supported on encrypted files and will
@@ -1121,6 +1123,21 @@ It is not currently possible to backup and restore encrypted files
 without the encryption key.  This would require special APIs which
 have not yet been implemented.
 
+Direct I/O support
+==================
+
+Direct I/O on encrypted files is supported through blk-crypto. In
+particular, this means the kernel must have CONFIG_BLK_INLINE_ENCRYPTION
+enabled, the filesystem must have had the 'inlinecrypt' mount option
+specified, and either hardware inline encryption must be present, or
+CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK must have been enabled. Further,
+the length of any I/O must be aligned to the filesystem block size
+(*not* necessarily the same as the block device's block size). If any of
+these conditions isn't met, attempts to do direct I/O on an encrypted file
+will fall back to buffered I/O. However, there aren't any additional
+requirements on user buffer alignment (apart from those already present
+when using direct I/O on unencrypted files).
+
 Encryption policy enforcement
 =============================
 
-- 
2.29.2.299.gdc1121823c-goog



* Re: [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
                   ` (7 preceding siblings ...)
  2020-11-17 14:07 ` [PATCH v7 8/8] fscrypt: update documentation for direct I/O support Satya Tangirala
@ 2020-11-17 17:15 ` Theodore Y. Ts'o
  2020-11-17 17:20   ` Darrick J. Wong
  2020-12-03 23:57   ` Satya Tangirala
  2020-11-18  2:51 ` Eric Biggers
  9 siblings, 2 replies; 21+ messages in thread
From: Theodore Y. Ts'o @ 2020-11-17 17:15 UTC (permalink / raw)
  To: Satya Tangirala
  Cc: Jaegeuk Kim, Eric Biggers, Chao Yu, Jens Axboe, Darrick J . Wong,
	linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4

What is the expected use case for Direct I/O using fscrypt?  This
isn't a problem which is unique to fscrypt, but one of the really
unfortunate aspects of the DIO interface is the silent fallback to
buffered I/O.  We've lived with this because DIO goes back decades,
and the original use case was to keep enterprise databases happy, and
the rules around what is necessary for DIO to work were relatively well
understood.

But with fscrypt, there are going to be some additional requirements
(e.g., using inline crypto), or else DIO will silently fall back to
buffered I/O for encrypted files.  Depending on the intended use case
of DIO with fscrypt, this caveat might or might not be unfortunately
surprising for applications.

I wonder if we should have some kind of interface so we can more
explicitly allow applications to query exactly what the requirements
might be for a particular file vis-a-vis Direct I/O.  What are the
memory alignment requirements, what are the file offset alignment
requirements, what are the write size requirements, for a particular
file.

						- Ted


* Re: [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto
  2020-11-17 17:15 ` [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Theodore Y. Ts'o
@ 2020-11-17 17:20   ` Darrick J. Wong
  2020-12-03 23:57   ` Satya Tangirala
  1 sibling, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2020-11-17 17:20 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Satya Tangirala, Jaegeuk Kim, Eric Biggers, Chao Yu, Jens Axboe,
	linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4

On Tue, Nov 17, 2020 at 12:15:26PM -0500, Theodore Y. Ts'o wrote:
> What is the expected use case for Direct I/O using fscrypt?  This
> isn't a problem which is unique to fscrypt, but one of the really
> unfortunate aspects of the DIO interface is the silent fallback to
> buffered I/O.  We've lived with this because DIO goes back decades,
> and the original use case was to keep enterprise databases happy, and
> the rules around what is necessary for DIO to work was relatively well
> understood.
> 
> But with fscrypt, there's going to be some additional requirements
> (e.g., using inline crypto) required or else DIO silently fall back to
> buffered I/O for encrypted files.  Depending on the intended use case
> of DIO with fscrypt, this caveat might or might not be unfortunately
> surprising for applications.
> 
> I wonder if we should have some kind of interface so we can more
> explicitly allow applications to query exactly what the requirements
> might be for a particular file vis-a-vis Direct I/O.  What are the
> memory alignment requirements, what are the file offset alignment
> requirements, what are the write size requirements, for a particular
> file.

In Ye Olde days there was XFS_IOC_DIOINFO to communicate all that (xfs
hardcodes 512b file offset alignment), but in this modern era perhaps
it's time to shovel that into statx...
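
(For reference, a rough sketch of what that interface already gives
userspace today; struct dioattr and XFS_IOC_DIOINFO are from the XFS
uapi header:)

	struct dioattr da;

	if (ioctl(fd, XFS_IOC_DIOINFO, &da) == 0)
		printf("mem align %u, offset/size align %u, max io size %u\n",
		       da.d_mem, da.d_miniosz, da.d_maxiosz);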

--D

> 
> 						- Ted


* Re: [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit
  2020-11-17 14:07 ` [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit Satya Tangirala
@ 2020-11-17 23:31   ` Eric Biggers
  2020-11-18  0:38     ` Satya Tangirala
  2020-11-25 22:15   ` Eric Biggers
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Biggers @ 2020-11-17 23:31 UTC (permalink / raw)
  To: Satya Tangirala
  Cc: Theodore Y . Ts'o, Jaegeuk Kim, Chao Yu, Jens Axboe,
	Darrick J . Wong, linux-kernel, linux-fscrypt, linux-f2fs-devel,
	linux-xfs, linux-block, linux-ext4

On Tue, Nov 17, 2020 at 02:07:01PM +0000, Satya Tangirala wrote:
> Introduce blk_crypto_bio_sectors_alignment() that returns the required
> alignment for the number of sectors in a bio. Any bio split must ensure
> that the number of sectors in the resulting bios is aligned to that
> returned value. This patch also updates __blk_queue_split(),
> __blk_queue_bounce() and blk_crypto_split_bio_if_needed() to respect
> blk_crypto_bio_sectors_alignment() when splitting bios.
> 
> Signed-off-by: Satya Tangirala <satyat@google.com>
> ---
>  block/bio.c                 |  1 +
>  block/blk-crypto-fallback.c | 10 ++--
>  block/blk-crypto-internal.h | 18 +++++++
>  block/blk-merge.c           | 96 ++++++++++++++++++++++++++++++++-----
>  block/blk-mq.c              |  3 ++
>  block/bounce.c              |  4 ++
>  6 files changed, 117 insertions(+), 15 deletions(-)
> 

I feel like this should be split into multiple patches: one patch that
introduces blk_crypto_bio_sectors_alignment(), and a patch for each place that
needs to take blk_crypto_bio_sectors_alignment() into account.

It would also help to give a real-world example of why support for
data_unit_size > logical_block_size is needed.  E.g. ext4 or f2fs encryption
with a 4096-byte filesystem block size, using eMMC inline encryption hardware
that has logical_block_size=512.

Also, is this needed even without the fscrypt direct I/O support?  If so, it
should be sent out separately.

> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index bcf5e4580603..f34dda7132f9 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -149,13 +149,15 @@ static inline unsigned get_max_io_size(struct request_queue *q,
>  	unsigned pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
>  	unsigned lbs = queue_logical_block_size(q) >> SECTOR_SHIFT;
>  	unsigned start_offset = bio->bi_iter.bi_sector & (pbs - 1);
> +	unsigned int bio_sectors_alignment =
> +					blk_crypto_bio_sectors_alignment(bio);
>  
>  	max_sectors += start_offset;
>  	max_sectors &= ~(pbs - 1);
> -	if (max_sectors > start_offset)
> -		return max_sectors - start_offset;
> +	if (max_sectors - start_offset >= bio_sectors_alignment)
> +		return round_down(max_sectors - start_offset, bio_sectors_alignment);
>  
> -	return sectors & ~(lbs - 1);
> +	return round_down(sectors & ~(lbs - 1), bio_sectors_alignment);
>  }

'max_sectors - start_offset >= bio_sectors_alignment' looks wrong, as
'max_sectors - start_offset' underflows if 'max_sectors < start_offset'.

Maybe consider something like the below?

static inline unsigned get_max_io_size(struct request_queue *q,
				       struct bio *bio)
{
	unsigned sectors = blk_max_size_offset(q, bio->bi_iter.bi_sector);
	unsigned pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
	unsigned lbs = queue_logical_block_size(q) >> SECTOR_SHIFT;
	sector_t pb_aligned_sector =
		round_down(bio->bi_iter.bi_sector + sectors, pbs);

	lbs = max(lbs, blk_crypto_bio_sectors_alignment(bio));

	if (pb_aligned_sector >= bio->bi_iter.bi_sector + lbs)
		sectors = pb_aligned_sector - bio->bi_iter.bi_sector;

	return round_down(sectors, lbs);
}

Maybe it would be useful to have a helper function bio_required_alignment() that
returns the crypto data unit size if the bio has an encryption context, and the
logical block size if it doesn't?
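
Something like the below, perhaps?  (Untested sketch -- the field names
assume the current blk-crypto definitions, and the result is in sectors
so it could be used directly in the max()/round_down() calls above.)

static inline unsigned int bio_required_alignment(struct bio *bio)
{
#ifdef CONFIG_BLK_INLINE_ENCRYPTION
	/* If the bio is encrypted, splits must preserve crypto data units. */
	if (bio_has_crypt_ctx(bio))
		return bio->bi_crypt_context->bc_key->crypto_cfg.data_unit_size
			>> SECTOR_SHIFT;
#endif
	/* Otherwise, logical block alignment is all that's needed. */
	return queue_logical_block_size(bio->bi_disk->queue) >> SECTOR_SHIFT;
}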

>  
>  static inline unsigned get_max_segment_size(const struct request_queue *q,
> @@ -174,6 +176,41 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
>  			(unsigned long)queue_max_segment_size(q));
>  }
>  
> +/**
> + * update_aligned_sectors_and_segs() - Ensures that *@aligned_sectors is aligned
> + *				       to @bio_sectors_alignment, and that
> + *				       *@aligned_segs is the value of nsegs
> + *				       when sectors reached/first exceeded that
> + *				       value of *@aligned_sectors.
> + *
> + * @nsegs: [in] The current number of segs
> + * @sectors: [in] The current number of sectors
> + * @aligned_segs: [in,out] The number of segments that make up @aligned_sectors
> + * @aligned_sectors: [in,out] The largest number of sectors <= @sectors that is
> + *		     aligned to @sectors
> + * @bio_sectors_alignment: [in] The alignment requirement for the number of
> + *			  sectors
> + *
> + * Updates *@aligned_sectors to the largest number <= @sectors that is also a
> + * multiple of @bio_sectors_alignment. This is done by updating *@aligned_sectors
> + * whenever @sectors is at least @bio_sectors_alignment more than
> + * *@aligned_sectors, since that means we can increment *@aligned_sectors while
> + * still keeping it aligned to @bio_sectors_alignment and also keeping it <=
> + * @sectors. *@aligned_segs is updated to the value of nsegs when @sectors first
> + * reaches/exceeds any value that causes *@aligned_sectors to be updated.
> + */
> +static inline void update_aligned_sectors_and_segs(const unsigned int nsegs,
> +						   const unsigned int sectors,
> +						   unsigned int *aligned_segs,
> +				unsigned int *aligned_sectors,
> +				const unsigned int bio_sectors_alignment)
> +{
> +	if (sectors - *aligned_sectors < bio_sectors_alignment)
> +		return;
> +	*aligned_sectors = round_down(sectors, bio_sectors_alignment);
> +	*aligned_segs = nsegs;
> +}
> +
>  /**
>   * bvec_split_segs - verify whether or not a bvec should be split in the middle
>   * @q:        [in] request queue associated with the bio associated with @bv
> @@ -195,9 +232,12 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
>   * the block driver.
>   */
>  static bool bvec_split_segs(const struct request_queue *q,
> -			    const struct bio_vec *bv, unsigned *nsegs,
> -			    unsigned *sectors, unsigned max_segs,
> -			    unsigned max_sectors)
> +			    const struct bio_vec *bv, unsigned int *nsegs,
> +			    unsigned int *sectors, unsigned int *aligned_segs,
> +			    unsigned int *aligned_sectors,
> +			    unsigned int bio_sectors_alignment,
> +			    unsigned int max_segs,
> +			    unsigned int max_sectors)
>  {
>  	unsigned max_len = (min(max_sectors, UINT_MAX >> 9) - *sectors) << 9;
>  	unsigned len = min(bv->bv_len, max_len);
> @@ -211,6 +251,11 @@ static bool bvec_split_segs(const struct request_queue *q,
>  
>  		(*nsegs)++;
>  		total_len += seg_size;
> +		update_aligned_sectors_and_segs(*nsegs,
> +						*sectors + (total_len >> 9),
> +						aligned_segs,
> +						aligned_sectors,
> +						bio_sectors_alignment);
>  		len -= seg_size;
>  
>  		if ((bv->bv_offset + total_len) & queue_virt_boundary(q))
> @@ -235,6 +280,8 @@ static bool bvec_split_segs(const struct request_queue *q,
>   * following is guaranteed for the cloned bio:
>   * - That it has at most get_max_io_size(@q, @bio) sectors.
>   * - That it has at most queue_max_segments(@q) segments.
> + * - That the number of sectors in the returned bio is aligned to
> + *   blk_crypto_bio_sectors_alignment(@bio)
>   *
>   * Except for discard requests the cloned bio will point at the bi_io_vec of
>   * the original bio. It is the responsibility of the caller to ensure that the
> @@ -252,6 +299,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  	unsigned nsegs = 0, sectors = 0;
>  	const unsigned max_sectors = get_max_io_size(q, bio);
>  	const unsigned max_segs = queue_max_segments(q);
> +	const unsigned int bio_sectors_alignment =
> +					blk_crypto_bio_sectors_alignment(bio);
> +	unsigned int aligned_segs = 0, aligned_sectors = 0;
>  
>  	bio_for_each_bvec(bv, bio, iter) {
>  		/*
> @@ -266,8 +316,14 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
>  			nsegs++;
>  			sectors += bv.bv_len >> 9;
> -		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors, max_segs,
> -					 max_sectors)) {
> +			update_aligned_sectors_and_segs(nsegs, sectors,
> +							&aligned_segs,
> +							&aligned_sectors,
> +							bio_sectors_alignment);
> +		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors,
> +					   &aligned_segs, &aligned_sectors,
> +					   bio_sectors_alignment, max_segs,
> +					   max_sectors)) {
>  			goto split;
>  		}
>  
> @@ -275,11 +331,24 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  		bvprvp = &bvprv;
>  	}
>  
> +	/*
> +	 * The input bio's number of sectors is assumed to be aligned to
> +	 * bio_sectors_alignment. If that's the case, then this function should
> +	 * ensure that aligned_segs == nsegs and aligned_sectors == sectors if
> +	 * the bio is not going to be split.
> +	 */
> +	WARN_ON(aligned_segs != nsegs || aligned_sectors != sectors);
>  	*segs = nsegs;
>  	return NULL;
>  split:
> -	*segs = nsegs;
> -	return bio_split(bio, sectors, GFP_NOIO, bs);
> +	*segs = aligned_segs;
> +	if (WARN_ON(aligned_sectors == 0))
> +		goto err;
> +	return bio_split(bio, aligned_sectors, GFP_NOIO, bs);
> +err:
> +	bio->bi_status = BLK_STS_IOERR;
> +	bio_endio(bio);
> +	return bio;
>  }

This part is pretty complex.  Are you sure it's needed?  How was alignment to
logical_block_size ensured before?

> diff --git a/block/bounce.c b/block/bounce.c
> index 162a6eee8999..b15224799008 100644
> --- a/block/bounce.c
> +++ b/block/bounce.c
> @@ -295,6 +295,7 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
>  	bool bounce = false;
>  	int sectors = 0;
>  	bool passthrough = bio_is_passthrough(*bio_orig);
> +	unsigned int bio_sectors_alignment;
>  
>  	bio_for_each_segment(from, *bio_orig, iter) {
>  		if (i++ < BIO_MAX_PAGES)
> @@ -305,6 +306,9 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
>  	if (!bounce)
>  		return;
>  
> +	bio_sectors_alignment = blk_crypto_bio_sectors_alignment(bio);
> +	sectors = round_down(sectors, bio_sectors_alignment);
> +

This can be one line:

	sectors = round_down(sectors, blk_crypto_bio_sectors_alignment(bio));

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 2/8] blk-crypto: don't require user buffer alignment
  2020-11-17 14:07 ` [PATCH v7 2/8] blk-crypto: don't require user buffer alignment Satya Tangirala
@ 2020-11-17 23:51   ` Eric Biggers
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Biggers @ 2020-11-17 23:51 UTC (permalink / raw)
  To: Satya Tangirala
  Cc: Theodore Y . Ts'o, Jaegeuk Kim, Chao Yu, Jens Axboe,
	Darrick J . Wong, linux-kernel, linux-fscrypt, linux-f2fs-devel,
	linux-xfs, linux-block, linux-ext4

On Tue, Nov 17, 2020 at 02:07:02PM +0000, Satya Tangirala wrote:
> Previously, blk-crypto-fallback required the offset and length of each bvec
> in a bio to be aligned to the crypto data unit size. This patch enables
> blk-crypto-fallback to work even if that's not the case - the requirement
> now is only that the total length of the data in the bio is aligned to
> the crypto data unit size.
> 
> Now that blk-crypto-fallback can handle bvecs not aligned to crypto data
> units, and we've ensured that bios are not split in the middle of a
> crypto data unit, we can relax the alignment check done by blk-crypto.

I think the blk-crypto.c and blk-crypto-fallback.c changes belong in different
patches.

There should also be a brief explanation of why this is needed (to make
the alignment requirements for direct I/O on encrypted files more similar
to those for standard unencrypted files, right?).

Also, how were the blk-crypto-fallback changes tested?

> @@ -305,45 +374,57 @@ static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr)
>  	}
>  
>  	memcpy(curr_dun, bc->bc_dun, sizeof(curr_dun));
> -	sg_init_table(&src, 1);
> -	sg_init_table(&dst, 1);
>  
> -	skcipher_request_set_crypt(ciph_req, &src, &dst, data_unit_size,
> +	skcipher_request_set_crypt(ciph_req, src, dst, data_unit_size,
>  				   iv.bytes);
>  
> -	/* Encrypt each page in the bounce bio */
> +	/*
> +	 * Encrypt each data unit in the bounce bio.
> +	 *
> +	 * Take care to handle the case where a data unit spans bio segments.
> +	 * This can happen when data_unit_size > logical_block_size.
> +	 */
>  	for (i = 0; i < enc_bio->bi_vcnt; i++) {
> -		struct bio_vec *enc_bvec = &enc_bio->bi_io_vec[i];
> -		struct page *plaintext_page = enc_bvec->bv_page;
> +		struct bio_vec *bv = &enc_bio->bi_io_vec[i];
> +		struct page *plaintext_page = bv->bv_page;
>  		struct page *ciphertext_page =
>  			mempool_alloc(blk_crypto_bounce_page_pool, GFP_NOIO);
> +		unsigned int offset_in_bv = 0;
>  
> -		enc_bvec->bv_page = ciphertext_page;
> +		bv->bv_page = ciphertext_page;
>  
>  		if (!ciphertext_page) {
>  			src_bio->bi_status = BLK_STS_RESOURCE;
>  			goto out_free_bounce_pages;
>  		}
>  
> -		sg_set_page(&src, plaintext_page, data_unit_size,
> -			    enc_bvec->bv_offset);
> -		sg_set_page(&dst, ciphertext_page, data_unit_size,
> -			    enc_bvec->bv_offset);
> -
> -		/* Encrypt each data unit in this page */
> -		for (j = 0; j < enc_bvec->bv_len; j += data_unit_size) {
> -			blk_crypto_dun_to_iv(curr_dun, &iv);
> -			if (crypto_wait_req(crypto_skcipher_encrypt(ciph_req),
> -					    &wait)) {
> -				i++;
> -				src_bio->bi_status = BLK_STS_IOERR;
> -				goto out_free_bounce_pages;
> +		while (offset_in_bv < bv->bv_len) {
> +			unsigned int n = min(bv->bv_len - offset_in_bv,
> +					     data_unit_size - du_filled);
> +			sg_set_page(&src[sg_idx], plaintext_page, n,
> +				    bv->bv_offset + offset_in_bv);
> +			sg_set_page(&dst[sg_idx], ciphertext_page, n,
> +				    bv->bv_offset + offset_in_bv);
> +			sg_idx++;
> +			offset_in_bv += n;
> +			du_filled += n;
> +			if (du_filled == data_unit_size) {
> +				blk_crypto_dun_to_iv(curr_dun, &iv);
> +				if (crypto_wait_req(crypto_skcipher_encrypt(ciph_req),
> +						    &wait)) {
> +					src_bio->bi_status = BLK_STS_IOERR;
> +					goto out_free_bounce_pages;
> +				}
> +				bio_crypt_dun_increment(curr_dun, 1);
> +				sg_idx = 0;
> +				du_filled = 0;

This is leaking a bounce page if crypto_skcipher_encrypt() fails.  This can be
fixed either by keeping the 'i++' that was on the error path before, or by
moving the i++ in the for statement to just below where the bounce page was
successfully allocated.
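
With the first option, the error path in the quoted loop would look
something like this (sketch only):

		if (du_filled == data_unit_size) {
			blk_crypto_dun_to_iv(curr_dun, &iv);
			if (crypto_wait_req(crypto_skcipher_encrypt(ciph_req),
					    &wait)) {
				/*
				 * Count the bounce page allocated for this
				 * bvec so out_free_bounce_pages frees it too.
				 */
				i++;
				src_bio->bi_status = BLK_STS_IOERR;
				goto out_free_bounce_pages;
			}
			bio_crypt_dun_increment(curr_dun, 1);
			sg_idx = 0;
			du_filled = 0;
		}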

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit
  2020-11-17 23:31   ` Eric Biggers
@ 2020-11-18  0:38     ` Satya Tangirala
  2020-11-25 22:12       ` Eric Biggers
  0 siblings, 1 reply; 21+ messages in thread
From: Satya Tangirala @ 2020-11-18  0:38 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Theodore Y . Ts'o, Jaegeuk Kim, Chao Yu, Jens Axboe,
	Darrick J . Wong, linux-kernel, linux-fscrypt, linux-f2fs-devel,
	linux-xfs, linux-block, linux-ext4

On Tue, Nov 17, 2020 at 03:31:23PM -0800, Eric Biggers wrote:
> On Tue, Nov 17, 2020 at 02:07:01PM +0000, Satya Tangirala wrote:
> > Introduce blk_crypto_bio_sectors_alignment() that returns the required
> > alignment for the number of sectors in a bio. Any bio split must ensure
> > that the number of sectors in the resulting bios is aligned to that
> > returned value. This patch also updates __blk_queue_split(),
> > __blk_queue_bounce() and blk_crypto_split_bio_if_needed() to respect
> > blk_crypto_bio_sectors_alignment() when splitting bios.
> > 
> > Signed-off-by: Satya Tangirala <satyat@google.com>
> > ---
> >  block/bio.c                 |  1 +
> >  block/blk-crypto-fallback.c | 10 ++--
> >  block/blk-crypto-internal.h | 18 +++++++
> >  block/blk-merge.c           | 96 ++++++++++++++++++++++++++++++++-----
> >  block/blk-mq.c              |  3 ++
> >  block/bounce.c              |  4 ++
> >  6 files changed, 117 insertions(+), 15 deletions(-)
> > 
> 
> I feel like this should be split into multiple patches: one patch that
> introduces blk_crypto_bio_sectors_alignment(), and a patch for each place that
> needs to take blk_crypto_bio_sectors_alignment() into account.
> 
> It would also help to give a real-world example of why support for
> data_unit_size > logical_block_size is needed.  E.g. ext4 or f2fs encryption
> with a 4096-byte filesystem block size, using eMMC inline encryption hardware
> that has logical_block_size=512.
> 
> Also, is this needed even without the fscrypt direct I/O support?  If so, it
> should be sent out separately.
> 
Yes, I think it's needed even without the fscrypt direct I/O support.
And ok, I'll send it out separately then :)
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index bcf5e4580603..f34dda7132f9 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -149,13 +149,15 @@ static inline unsigned get_max_io_size(struct request_queue *q,
> >  	unsigned pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
> >  	unsigned lbs = queue_logical_block_size(q) >> SECTOR_SHIFT;
> >  	unsigned start_offset = bio->bi_iter.bi_sector & (pbs - 1);
> > +	unsigned int bio_sectors_alignment =
> > +					blk_crypto_bio_sectors_alignment(bio);
> >  
> >  	max_sectors += start_offset;
> >  	max_sectors &= ~(pbs - 1);
> > -	if (max_sectors > start_offset)
> > -		return max_sectors - start_offset;
> > +	if (max_sectors - start_offset >= bio_sectors_alignment)
> > +		return round_down(max_sectors - start_offset, bio_sectors_alignment);
> >  
> > -	return sectors & ~(lbs - 1);
> > +	return round_down(sectors & ~(lbs - 1), bio_sectors_alignment);
> >  }
> 
> 'max_sectors - start_offset >= bio_sectors_alignment' looks wrong, as
> 'max_sectors - start_offset' underflows if 'max_sectors < start_offset'.
> 
> Maybe consider something like the below?
> 
> static inline unsigned get_max_io_size(struct request_queue *q,
> 				       struct bio *bio)
> {
> 	unsigned sectors = blk_max_size_offset(q, bio->bi_iter.bi_sector);
> 	unsigned pbs = queue_physical_block_size(q) >> SECTOR_SHIFT;
> 	unsigned lbs = queue_logical_block_size(q) >> SECTOR_SHIFT;
> 	sector_t pb_aligned_sector =
> 		round_down(bio->bi_iter.bi_sector + sectors, pbs);
> 
> 	lbs = max(lbs, blk_crypto_bio_sectors_alignment(bio));
> 
> 	if (pb_aligned_sector >= bio->bi_iter.bi_sector + lbs)
> 		sectors = pb_aligned_sector - bio->bi_iter.bi_sector;
> 
> 	return round_down(sectors, lbs);
> }
> 
> Maybe it would be useful to have a helper function bio_required_alignment() that
> returns the crypto data unit size if the bio has an encryption context, and the
> logical block size if it doesn't?
>
> >  
> >  static inline unsigned get_max_segment_size(const struct request_queue *q,
> > @@ -174,6 +176,41 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
> >  			(unsigned long)queue_max_segment_size(q));
> >  }
> >  
> > +/**
> > + * update_aligned_sectors_and_segs() - Ensures that *@aligned_sectors is aligned
> > + *				       to @bio_sectors_alignment, and that
> > + *				       *@aligned_segs is the value of nsegs
> > + *				       when sectors reached/first exceeded that
> > + *				       value of *@aligned_sectors.
> > + *
> > + * @nsegs: [in] The current number of segs
> > + * @sectors: [in] The current number of sectors
> > + * @aligned_segs: [in,out] The number of segments that make up @aligned_sectors
> > + * @aligned_sectors: [in,out] The largest number of sectors <= @sectors that is
> > + *		     aligned to @sectors
> > + * @bio_sectors_alignment: [in] The alignment requirement for the number of
> > + *			  sectors
> > + *
> > + * Updates *@aligned_sectors to the largest number <= @sectors that is also a
> > + * multiple of @bio_sectors_alignment. This is done by updating *@aligned_sectors
> > + * whenever @sectors is at least @bio_sectors_alignment more than
> > + * *@aligned_sectors, since that means we can increment *@aligned_sectors while
> > + * still keeping it aligned to @bio_sectors_alignment and also keeping it <=
> > + * @sectors. *@aligned_segs is updated to the value of nsegs when @sectors first
> > + * reaches/exceeds any value that causes *@aligned_sectors to be updated.
> > + */
> > +static inline void update_aligned_sectors_and_segs(const unsigned int nsegs,
> > +						   const unsigned int sectors,
> > +						   unsigned int *aligned_segs,
> > +				unsigned int *aligned_sectors,
> > +				const unsigned int bio_sectors_alignment)
> > +{
> > +	if (sectors - *aligned_sectors < bio_sectors_alignment)
> > +		return;
> > +	*aligned_sectors = round_down(sectors, bio_sectors_alignment);
> > +	*aligned_segs = nsegs;
> > +}
> > +
> >  /**
> >   * bvec_split_segs - verify whether or not a bvec should be split in the middle
> >   * @q:        [in] request queue associated with the bio associated with @bv
> > @@ -195,9 +232,12 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
> >   * the block driver.
> >   */
> >  static bool bvec_split_segs(const struct request_queue *q,
> > -			    const struct bio_vec *bv, unsigned *nsegs,
> > -			    unsigned *sectors, unsigned max_segs,
> > -			    unsigned max_sectors)
> > +			    const struct bio_vec *bv, unsigned int *nsegs,
> > +			    unsigned int *sectors, unsigned int *aligned_segs,
> > +			    unsigned int *aligned_sectors,
> > +			    unsigned int bio_sectors_alignment,
> > +			    unsigned int max_segs,
> > +			    unsigned int max_sectors)
> >  {
> >  	unsigned max_len = (min(max_sectors, UINT_MAX >> 9) - *sectors) << 9;
> >  	unsigned len = min(bv->bv_len, max_len);
> > @@ -211,6 +251,11 @@ static bool bvec_split_segs(const struct request_queue *q,
> >  
> >  		(*nsegs)++;
> >  		total_len += seg_size;
> > +		update_aligned_sectors_and_segs(*nsegs,
> > +						*sectors + (total_len >> 9),
> > +						aligned_segs,
> > +						aligned_sectors,
> > +						bio_sectors_alignment);
> >  		len -= seg_size;
> >  
> >  		if ((bv->bv_offset + total_len) & queue_virt_boundary(q))
> > @@ -235,6 +280,8 @@ static bool bvec_split_segs(const struct request_queue *q,
> >   * following is guaranteed for the cloned bio:
> >   * - That it has at most get_max_io_size(@q, @bio) sectors.
> >   * - That it has at most queue_max_segments(@q) segments.
> > + * - That the number of sectors in the returned bio is aligned to
> > + *   blk_crypto_bio_sectors_alignment(@bio)
> >   *
> >   * Except for discard requests the cloned bio will point at the bi_io_vec of
> >   * the original bio. It is the responsibility of the caller to ensure that the
> > @@ -252,6 +299,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> >  	unsigned nsegs = 0, sectors = 0;
> >  	const unsigned max_sectors = get_max_io_size(q, bio);
> >  	const unsigned max_segs = queue_max_segments(q);
> > +	const unsigned int bio_sectors_alignment =
> > +					blk_crypto_bio_sectors_alignment(bio);
> > +	unsigned int aligned_segs = 0, aligned_sectors = 0;
> >  
> >  	bio_for_each_bvec(bv, bio, iter) {
> >  		/*
> > @@ -266,8 +316,14 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> >  		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
> >  			nsegs++;
> >  			sectors += bv.bv_len >> 9;
> > -		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors, max_segs,
> > -					 max_sectors)) {
> > +			update_aligned_sectors_and_segs(nsegs, sectors,
> > +							&aligned_segs,
> > +							&aligned_sectors,
> > +							bio_sectors_alignment);
> > +		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors,
> > +					   &aligned_segs, &aligned_sectors,
> > +					   bio_sectors_alignment, max_segs,
> > +					   max_sectors)) {
> >  			goto split;
> >  		}
> >  
> > @@ -275,11 +331,24 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> >  		bvprvp = &bvprv;
> >  	}
> >  
> > +	/*
> > +	 * The input bio's number of sectors is assumed to be aligned to
> > +	 * bio_sectors_alignment. If that's the case, then this function should
> > +	 * ensure that aligned_segs == nsegs and aligned_sectors == sectors if
> > +	 * the bio is not going to be split.
> > +	 */
> > +	WARN_ON(aligned_segs != nsegs || aligned_sectors != sectors);
> >  	*segs = nsegs;
> >  	return NULL;
> >  split:
> > -	*segs = nsegs;
> > -	return bio_split(bio, sectors, GFP_NOIO, bs);
> > +	*segs = aligned_segs;
> > +	if (WARN_ON(aligned_sectors == 0))
> > +		goto err;
> > +	return bio_split(bio, aligned_sectors, GFP_NOIO, bs);
> > +err:
> > +	bio->bi_status = BLK_STS_IOERR;
> > +	bio_endio(bio);
> > +	return bio;
> >  }
> 
> This part is pretty complex.  Are you sure it's needed?  How was alignment to
> logical_block_size ensured before?
> 
Afaict, alignment to logical_block_size (lbs) is done by assuming that
bv->bv_len is always lbs aligned (among other things). Is that not the
case?

If it is the case, that's what we're trying to avoid with this patch (we
want to be able to submit bios that have 2 bvecs that together make up a
single crypto data unit, for example). And this is complex because
multiple segments could "add up" to make up a single crypto data unit,
but this function's job is to limit both the number of segments *and*
the number of sectors - so when ensuring that the number of sectors is
aligned to crypto data unit size, we also want the smallest number of
segments that can make up that aligned number of sectors.
> > diff --git a/block/bounce.c b/block/bounce.c
> > index 162a6eee8999..b15224799008 100644
> > --- a/block/bounce.c
> > +++ b/block/bounce.c
> > @@ -295,6 +295,7 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
> >  	bool bounce = false;
> >  	int sectors = 0;
> >  	bool passthrough = bio_is_passthrough(*bio_orig);
> > +	unsigned int bio_sectors_alignment;
> >  
> >  	bio_for_each_segment(from, *bio_orig, iter) {
> >  		if (i++ < BIO_MAX_PAGES)
> > @@ -305,6 +306,9 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
> >  	if (!bounce)
> >  		return;
> >  
> > +	bio_sectors_alignment = blk_crypto_bio_sectors_alignment(bio);
> > +	sectors = round_down(sectors, bio_sectors_alignment);
> > +
> 
> This can be one line:
> 
> 	sectors = round_down(sectors, blk_crypto_bio_sectors_alignment(bio));
> 
Sure thing. I also messed up the argument being passed - it should've
been *bio_orig, not bio :(. Would you have any recommendations on how to
test code in bounce.c?
> - Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 8/8] fscrypt: update documentation for direct I/O support
  2020-11-17 14:07 ` [PATCH v7 8/8] fscrypt: update documentation for direct I/O support Satya Tangirala
@ 2020-11-18  2:38   ` Eric Biggers
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Biggers @ 2020-11-18  2:38 UTC (permalink / raw)
  To: Satya Tangirala
  Cc: Theodore Y . Ts'o, Jaegeuk Kim, Chao Yu, Jens Axboe,
	Darrick J . Wong, linux-kernel, linux-fscrypt, linux-f2fs-devel,
	linux-xfs, linux-block, linux-ext4

On Tue, Nov 17, 2020 at 02:07:08PM +0000, Satya Tangirala wrote:
> +Direct I/O support
> +==================
> +
> +Direct I/O on encrypted files is supported through blk-crypto. In
> +particular, this means the kernel must have CONFIG_BLK_INLINE_ENCRYPTION
> +enabled, the filesystem must have had the 'inlinecrypt' mount option
> +specified, and either hardware inline encryption must be present, or
> +CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK must have been enabled. Further,
> +the length of any I/O must be aligned to the filesystem block size
> +(*not* necessarily the same as the block device's block size). If any of
> +these conditions isn't met, attempts to do direct I/O on an encrypted file
> +will fall back to buffered I/O. However, there aren't any additional
> +requirements on user buffer alignment (apart from those already present
> +when using direct I/O on unencrypted files).

Actually the position in the file the I/O is targeting must be fs-block aligned
too, not just the length of the I/O.

It's only the pointer to the user data buffer that no longer needs to be
fs-block aligned (this changed between v6 and v7).
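
To make the combined requirements concrete, a direct read that satisfies
them would look roughly like the below (illustrative only; assumes a
4096-byte filesystem block size and a 512-byte logical block size):

#define _GNU_SOURCE	/* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Read the filesystem block at index 'blk' from an encrypted file. */
static int read_fs_block(const char *path, off_t blk)
{
	void *buf = NULL;
	ssize_t ret = -1;
	int fd = open(path, O_RDONLY | O_DIRECT);

	if (fd < 0)
		return -1;
	/* The buffer only needs logical_block_size (512) alignment... */
	if (posix_memalign(&buf, 512, 4096) == 0)
		/* ...but the file position and length must be fs-block aligned. */
		ret = pread(fd, buf, 4096, blk * 4096);
	free(buf);
	close(fd);
	return ret == 4096 ? 0 : -1;
}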

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto
  2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
                   ` (8 preceding siblings ...)
  2020-11-17 17:15 ` [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Theodore Y. Ts'o
@ 2020-11-18  2:51 ` Eric Biggers
  9 siblings, 0 replies; 21+ messages in thread
From: Eric Biggers @ 2020-11-18  2:51 UTC (permalink / raw)
  To: Satya Tangirala
  Cc: Theodore Y . Ts'o, Jaegeuk Kim, Chao Yu, Jens Axboe,
	Darrick J . Wong, linux-kernel, linux-fscrypt, linux-f2fs-devel,
	linux-xfs, linux-block, linux-ext4

On Tue, Nov 17, 2020 at 02:07:00PM +0000, Satya Tangirala wrote:
> This patch series was tested by running xfstests with test_dummy_encryption
> with and without the 'inlinecrypt' mount option, and there were no
> meaningful regressions. One regression was for generic/587 on ext4,
> but that test isn't compatible with test_dummy_encryption in the first
> place, and the test "incorrectly" passes without the 'inlinecrypt' mount
> option - a patch will be sent out to exclude that test when
> test_dummy_encryption is turned on with ext4 (like the other quota related
> tests that use user visible quota files).

It would be helpful to have some more testing results that show that the direct
I/O support is really working as intended, especially in the new case where
logical_block_size < data_unit_size and buffers are only logical_block_size
aligned --- both with real hardware and with blk-crypto-fallback.  Using my
patchset https://lkml.kernel.org/r/20201112194011.103774-1-ebiggers@kernel.org
it should be possible to test with real eMMC inline encryption hardware on
Snapdragon 630; it has logical_block_size=512.

Also note, generic/587 was already added to the ext4/encrypt and ext4/encrypt_1k
exclusion lists by xfstests-bld commit 02e4bfe628b4.

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit
  2020-11-18  0:38     ` Satya Tangirala
@ 2020-11-25 22:12       ` Eric Biggers
  2020-12-02 22:52         ` Satya Tangirala
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Biggers @ 2020-11-25 22:12 UTC (permalink / raw)
  To: Satya Tangirala
  Cc: Theodore Y . Ts'o, Jaegeuk Kim, Chao Yu, Jens Axboe,
	Darrick J . Wong, linux-kernel, linux-fscrypt, linux-f2fs-devel,
	linux-xfs, linux-block, linux-ext4

On Wed, Nov 18, 2020 at 12:38:15AM +0000, Satya Tangirala wrote:
> > > +/**
> > > + * update_aligned_sectors_and_segs() - Ensures that *@aligned_sectors is aligned
> > > + *				       to @bio_sectors_alignment, and that
> > > + *				       *@aligned_segs is the value of nsegs
> > > + *				       when sectors reached/first exceeded that
> > > + *				       value of *@aligned_sectors.
> > > + *
> > > + * @nsegs: [in] The current number of segs
> > > + * @sectors: [in] The current number of sectors
> > > + * @aligned_segs: [in,out] The number of segments that make up @aligned_sectors
> > > + * @aligned_sectors: [in,out] The largest number of sectors <= @sectors that is
> > > + *		     aligned to @sectors
> > > + * @bio_sectors_alignment: [in] The alignment requirement for the number of
> > > + *			  sectors
> > > + *
> > > + * Updates *@aligned_sectors to the largest number <= @sectors that is also a
> > > + * multiple of @bio_sectors_alignment. This is done by updating *@aligned_sectors
> > > + * whenever @sectors is at least @bio_sectors_alignment more than
> > > + * *@aligned_sectors, since that means we can increment *@aligned_sectors while
> > > + * still keeping it aligned to @bio_sectors_alignment and also keeping it <=
> > > + * @sectors. *@aligned_segs is updated to the value of nsegs when @sectors first
> > > + * reaches/exceeds any value that causes *@aligned_sectors to be updated.
> > > + */
> > > +static inline void update_aligned_sectors_and_segs(const unsigned int nsegs,
> > > +						   const unsigned int sectors,
> > > +						   unsigned int *aligned_segs,
> > > +				unsigned int *aligned_sectors,
> > > +				const unsigned int bio_sectors_alignment)
> > > +{
> > > +	if (sectors - *aligned_sectors < bio_sectors_alignment)
> > > +		return;
> > > +	*aligned_sectors = round_down(sectors, bio_sectors_alignment);
> > > +	*aligned_segs = nsegs;
> > > +}
> > > +
> > >  /**
> > >   * bvec_split_segs - verify whether or not a bvec should be split in the middle
> > >   * @q:        [in] request queue associated with the bio associated with @bv
> > > @@ -195,9 +232,12 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
> > >   * the block driver.
> > >   */
> > >  static bool bvec_split_segs(const struct request_queue *q,
> > > -			    const struct bio_vec *bv, unsigned *nsegs,
> > > -			    unsigned *sectors, unsigned max_segs,
> > > -			    unsigned max_sectors)
> > > +			    const struct bio_vec *bv, unsigned int *nsegs,
> > > +			    unsigned int *sectors, unsigned int *aligned_segs,
> > > +			    unsigned int *aligned_sectors,
> > > +			    unsigned int bio_sectors_alignment,
> > > +			    unsigned int max_segs,
> > > +			    unsigned int max_sectors)
> > >  {
> > >  	unsigned max_len = (min(max_sectors, UINT_MAX >> 9) - *sectors) << 9;
> > >  	unsigned len = min(bv->bv_len, max_len);
> > > @@ -211,6 +251,11 @@ static bool bvec_split_segs(const struct request_queue *q,
> > >  
> > >  		(*nsegs)++;
> > >  		total_len += seg_size;
> > > +		update_aligned_sectors_and_segs(*nsegs,
> > > +						*sectors + (total_len >> 9),
> > > +						aligned_segs,
> > > +						aligned_sectors,
> > > +						bio_sectors_alignment);
> > >  		len -= seg_size;
> > >  
> > >  		if ((bv->bv_offset + total_len) & queue_virt_boundary(q))
> > > @@ -235,6 +280,8 @@ static bool bvec_split_segs(const struct request_queue *q,
> > >   * following is guaranteed for the cloned bio:
> > >   * - That it has at most get_max_io_size(@q, @bio) sectors.
> > >   * - That it has at most queue_max_segments(@q) segments.
> > > + * - That the number of sectors in the returned bio is aligned to
> > > + *   blk_crypto_bio_sectors_alignment(@bio)
> > >   *
> > >   * Except for discard requests the cloned bio will point at the bi_io_vec of
> > >   * the original bio. It is the responsibility of the caller to ensure that the
> > > @@ -252,6 +299,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> > >  	unsigned nsegs = 0, sectors = 0;
> > >  	const unsigned max_sectors = get_max_io_size(q, bio);
> > >  	const unsigned max_segs = queue_max_segments(q);
> > > +	const unsigned int bio_sectors_alignment =
> > > +					blk_crypto_bio_sectors_alignment(bio);
> > > +	unsigned int aligned_segs = 0, aligned_sectors = 0;
> > >  
> > >  	bio_for_each_bvec(bv, bio, iter) {
> > >  		/*
> > > @@ -266,8 +316,14 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> > >  		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
> > >  			nsegs++;
> > >  			sectors += bv.bv_len >> 9;
> > > -		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors, max_segs,
> > > -					 max_sectors)) {
> > > +			update_aligned_sectors_and_segs(nsegs, sectors,
> > > +							&aligned_segs,
> > > +							&aligned_sectors,
> > > +							bio_sectors_alignment);
> > > +		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors,
> > > +					   &aligned_segs, &aligned_sectors,
> > > +					   bio_sectors_alignment, max_segs,
> > > +					   max_sectors)) {
> > >  			goto split;
> > >  		}
> > >  
> > > @@ -275,11 +331,24 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> > >  		bvprvp = &bvprv;
> > >  	}
> > >  
> > > +	/*
> > > +	 * The input bio's number of sectors is assumed to be aligned to
> > > +	 * bio_sectors_alignment. If that's the case, then this function should
> > > +	 * ensure that aligned_segs == nsegs and aligned_sectors == sectors if
> > > +	 * the bio is not going to be split.
> > > +	 */
> > > +	WARN_ON(aligned_segs != nsegs || aligned_sectors != sectors);
> > >  	*segs = nsegs;
> > >  	return NULL;
> > >  split:
> > > -	*segs = nsegs;
> > > -	return bio_split(bio, sectors, GFP_NOIO, bs);
> > > +	*segs = aligned_segs;
> > > +	if (WARN_ON(aligned_sectors == 0))
> > > +		goto err;
> > > +	return bio_split(bio, aligned_sectors, GFP_NOIO, bs);
> > > +err:
> > > +	bio->bi_status = BLK_STS_IOERR;
> > > +	bio_endio(bio);
> > > +	return bio;
> > >  }
> > 
> > This part is pretty complex.  Are you sure it's needed?  How was alignment to
> > logical_block_size ensured before?
> > 
> Afaict, alignment to logical_block_size (lbs) is done by assuming that
> bv->bv_len is always lbs aligned (among other things). Is that not the
> case?

I believe that's the case; bvecs are logical_block_size aligned.

So the new thing (with data_unit_size > logical_block_size) is that
bvec boundaries aren't necessarily valid split points anymore.

> 
> If it is the case, that's what we're trying to avoid with this patch (we
> want to be able to submit bios that have 2 bvecs that together make up a
> single crypto data unit, for example). And this is complex because
> multiple segments could "add up" to make up a single crypto data unit,
> but this function's job is to limit both the number of segments *and*
> the number of sectors - so when ensuring that the number of sectors is
> aligned to crypto data unit size, we also want the smallest number of
> segments that can make up that aligned number of sectors.

Does the number of physical segments that is calculated have to be exact, or
could it be a slight overestimate?  If the purpose of the calculation is just to
size scatterlists and to avoid exceeding the hardware limit on the number of
physical segments (and at a quick glance that seems to be the purpose, though I
didn't look at everything), it seems that a slight overestimate would be okay.

If so, couldn't the number of sectors simply be rounded down to
blk_crypto_bio_sectors_alignment(bio) when blk_bio_segment_split() actually
calls bio_split()?  That would be much simpler; why doesn't that work?
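
I.e., keep nsegs as the (possibly slightly overestimated) segment count
and just round the split point down, something like (sketch):

split:
	*segs = nsegs;
	/* Keep the front bio's size aligned to the crypto data unit. */
	sectors = round_down(sectors, blk_crypto_bio_sectors_alignment(bio));
	return bio_split(bio, sectors, GFP_NOIO, bs);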

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit
  2020-11-17 14:07 ` [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit Satya Tangirala
  2020-11-17 23:31   ` Eric Biggers
@ 2020-11-25 22:15   ` Eric Biggers
  1 sibling, 0 replies; 21+ messages in thread
From: Eric Biggers @ 2020-11-25 22:15 UTC (permalink / raw)
  To: Satya Tangirala
  Cc: Theodore Y . Ts'o, Jaegeuk Kim, Chao Yu, Jens Axboe,
	Darrick J . Wong, linux-kernel, linux-fscrypt, linux-f2fs-devel,
	linux-xfs, linux-block, linux-ext4

On Tue, Nov 17, 2020 at 02:07:01PM +0000, Satya Tangirala wrote:
> @@ -275,11 +331,24 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  		bvprvp = &bvprv;
>  	}
>  
> +	/*
> +	 * The input bio's number of sectors is assumed to be aligned to
> +	 * bio_sectors_alignment. If that's the case, then this function should
> +	 * ensure that aligned_segs == nsegs and aligned_sectors == sectors if
> +	 * the bio is not going to be split.
> +	 */
> +	WARN_ON(aligned_segs != nsegs || aligned_sectors != sectors);
>  	*segs = nsegs;
>  	return NULL;
>  split:
> -	*segs = nsegs;
> -	return bio_split(bio, sectors, GFP_NOIO, bs);
> +	*segs = aligned_segs;
> +	if (WARN_ON(aligned_sectors == 0))
> +		goto err;
> +	return bio_split(bio, aligned_sectors, GFP_NOIO, bs);
> +err:
> +	bio->bi_status = BLK_STS_IOERR;
> +	bio_endio(bio);
> +	return bio;
>  }
[...]
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 55bcee5dc032..de5c97ab8e5a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2161,6 +2161,9 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio)
>  	blk_queue_bounce(q, &bio);
>  	__blk_queue_split(&bio, &nr_segs);
>  
> +	if (bio->bi_status != BLK_STS_OK)
> +		goto queue_exit;
> +

Note that as soon as bio_endio() is called, the bio may be freed.

So accessing the bio after that is not correct.

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit
  2020-11-25 22:12       ` Eric Biggers
@ 2020-12-02 22:52         ` Satya Tangirala
  0 siblings, 0 replies; 21+ messages in thread
From: Satya Tangirala @ 2020-12-02 22:52 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Theodore Y . Ts'o, Jaegeuk Kim, Chao Yu, Jens Axboe,
	Darrick J . Wong, linux-kernel, linux-fscrypt, linux-f2fs-devel,
	linux-xfs, linux-block, linux-ext4

On Wed, Nov 25, 2020 at 02:12:35PM -0800, Eric Biggers wrote:
> On Wed, Nov 18, 2020 at 12:38:15AM +0000, Satya Tangirala wrote:
> > > > +/**
> > > > + * update_aligned_sectors_and_segs() - Ensures that *@aligned_sectors is aligned
> > > > + *				       to @bio_sectors_alignment, and that
> > > > + *				       *@aligned_segs is the value of nsegs
> > > > + *				       when sectors reached/first exceeded that
> > > > + *				       value of *@aligned_sectors.
> > > > + *
> > > > + * @nsegs: [in] The current number of segs
> > > > + * @sectors: [in] The current number of sectors
> > > > + * @aligned_segs: [in,out] The number of segments that make up @aligned_sectors
> > > > + * @aligned_sectors: [in,out] The largest number of sectors <= @sectors that is
> > > > + *		     aligned to @sectors
> > > > + * @bio_sectors_alignment: [in] The alignment requirement for the number of
> > > > + *			  sectors
> > > > + *
> > > > + * Updates *@aligned_sectors to the largest number <= @sectors that is also a
> > > > + * multiple of @bio_sectors_alignment. This is done by updating *@aligned_sectors
> > > > + * whenever @sectors is at least @bio_sectors_alignment more than
> > > > + * *@aligned_sectors, since that means we can increment *@aligned_sectors while
> > > > + * still keeping it aligned to @bio_sectors_alignment and also keeping it <=
> > > > + * @sectors. *@aligned_segs is updated to the value of nsegs when @sectors first
> > > > + * reaches/exceeds any value that causes *@aligned_sectors to be updated.
> > > > + */
> > > > +static inline void update_aligned_sectors_and_segs(const unsigned int nsegs,
> > > > +						   const unsigned int sectors,
> > > > +						   unsigned int *aligned_segs,
> > > > +				unsigned int *aligned_sectors,
> > > > +				const unsigned int bio_sectors_alignment)
> > > > +{
> > > > +	if (sectors - *aligned_sectors < bio_sectors_alignment)
> > > > +		return;
> > > > +	*aligned_sectors = round_down(sectors, bio_sectors_alignment);
> > > > +	*aligned_segs = nsegs;
> > > > +}
> > > > +
> > > >  /**
> > > >   * bvec_split_segs - verify whether or not a bvec should be split in the middle
> > > >   * @q:        [in] request queue associated with the bio associated with @bv
> > > > @@ -195,9 +232,12 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
> > > >   * the block driver.
> > > >   */
> > > >  static bool bvec_split_segs(const struct request_queue *q,
> > > > -			    const struct bio_vec *bv, unsigned *nsegs,
> > > > -			    unsigned *sectors, unsigned max_segs,
> > > > -			    unsigned max_sectors)
> > > > +			    const struct bio_vec *bv, unsigned int *nsegs,
> > > > +			    unsigned int *sectors, unsigned int *aligned_segs,
> > > > +			    unsigned int *aligned_sectors,
> > > > +			    unsigned int bio_sectors_alignment,
> > > > +			    unsigned int max_segs,
> > > > +			    unsigned int max_sectors)
> > > >  {
> > > >  	unsigned max_len = (min(max_sectors, UINT_MAX >> 9) - *sectors) << 9;
> > > >  	unsigned len = min(bv->bv_len, max_len);
> > > > @@ -211,6 +251,11 @@ static bool bvec_split_segs(const struct request_queue *q,
> > > >  
> > > >  		(*nsegs)++;
> > > >  		total_len += seg_size;
> > > > +		update_aligned_sectors_and_segs(*nsegs,
> > > > +						*sectors + (total_len >> 9),
> > > > +						aligned_segs,
> > > > +						aligned_sectors,
> > > > +						bio_sectors_alignment);
> > > >  		len -= seg_size;
> > > >  
> > > >  		if ((bv->bv_offset + total_len) & queue_virt_boundary(q))
> > > > @@ -235,6 +280,8 @@ static bool bvec_split_segs(const struct request_queue *q,
> > > >   * following is guaranteed for the cloned bio:
> > > >   * - That it has at most get_max_io_size(@q, @bio) sectors.
> > > >   * - That it has at most queue_max_segments(@q) segments.
> > > > + * - That the number of sectors in the returned bio is aligned to
> > > > + *   blk_crypto_bio_sectors_alignment(@bio)
> > > >   *
> > > >   * Except for discard requests the cloned bio will point at the bi_io_vec of
> > > >   * the original bio. It is the responsibility of the caller to ensure that the
> > > > @@ -252,6 +299,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> > > >  	unsigned nsegs = 0, sectors = 0;
> > > >  	const unsigned max_sectors = get_max_io_size(q, bio);
> > > >  	const unsigned max_segs = queue_max_segments(q);
> > > > +	const unsigned int bio_sectors_alignment =
> > > > +					blk_crypto_bio_sectors_alignment(bio);
> > > > +	unsigned int aligned_segs = 0, aligned_sectors = 0;
> > > >  
> > > >  	bio_for_each_bvec(bv, bio, iter) {
> > > >  		/*
> > > > @@ -266,8 +316,14 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> > > >  		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
> > > >  			nsegs++;
> > > >  			sectors += bv.bv_len >> 9;
> > > > -		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors, max_segs,
> > > > -					 max_sectors)) {
> > > > +			update_aligned_sectors_and_segs(nsegs, sectors,
> > > > +							&aligned_segs,
> > > > +							&aligned_sectors,
> > > > +							bio_sectors_alignment);
> > > > +		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors,
> > > > +					   &aligned_segs, &aligned_sectors,
> > > > +					   bio_sectors_alignment, max_segs,
> > > > +					   max_sectors)) {
> > > >  			goto split;
> > > >  		}
> > > >  
> > > > @@ -275,11 +331,24 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> > > >  		bvprvp = &bvprv;
> > > >  	}
> > > >  
> > > > +	/*
> > > > +	 * The input bio's number of sectors is assumed to be aligned to
> > > > +	 * bio_sectors_alignment. If that's the case, then this function should
> > > > +	 * ensure that aligned_segs == nsegs and aligned_sectors == sectors if
> > > > +	 * the bio is not going to be split.
> > > > +	 */
> > > > +	WARN_ON(aligned_segs != nsegs || aligned_sectors != sectors);
> > > >  	*segs = nsegs;
> > > >  	return NULL;
> > > >  split:
> > > > -	*segs = nsegs;
> > > > -	return bio_split(bio, sectors, GFP_NOIO, bs);
> > > > +	*segs = aligned_segs;
> > > > +	if (WARN_ON(aligned_sectors == 0))
> > > > +		goto err;
> > > > +	return bio_split(bio, aligned_sectors, GFP_NOIO, bs);
> > > > +err:
> > > > +	bio->bi_status = BLK_STS_IOERR;
> > > > +	bio_endio(bio);
> > > > +	return bio;
> > > >  }
> > > 
> > > This part is pretty complex.  Are you sure it's needed?  How was alignment to
> > > logical_block_size ensured before?
> > > 
> > Afaict, alignment to logical_block_size (lbs) is done by assuming that
> > bv->bv_len is always lbs aligned (among other things). Is that not the
> > case?
> 
> I believe that's the case; bvecs are logical_block_size aligned.
> 
> So the new thing (with data_unit_size > logical_block_size) is that
> bvec boundaries aren't necessarily valid split points anymore.
> 
> > 
> > If it is the case, that's what we're trying to avoid with this patch (we
> > want to be able to submit bios that have 2 bvecs that together make up a
> > single crypto data unit, for example). And this is complex because
> > multiple segments could "add up" to make up a single crypto data unit,
> > but this function's job is to limit both the number of segments *and*
> > the number of sectors - so when ensuring that the number of sectors is
> > aligned to crypto data unit size, we also want the smallest number of
> > segments that can make up that aligned number of sectors.
> 
> Does the number of physical segments that is calculated have to be exact, or
> could it be a slight overestimate?  If the purpose of the calculation is just to
> size scatterlists and to avoid exceeding the hardware limit on the number of
> physical segments (and at a quick glance that seems to be the purpose, though I
> didn't look at everything), it seems that a slight overestimate would be okay.
> 
> If so, couldn't the number of sectors simply be rounded down to
> blk_crypto_bio_sectors_alignment(bio) when blk_bio_segment_split() actually
> calls bio_split()?  That would be much simpler; why doesn't that work?
> 
I was assuming we'd prefer the tighter bound, but yeah, it would be much
simpler if an overestimate were alright.

I'll look through the users of that estimate to get a better sense of
whether overestimating is ok (although if someone can already
authoritatively say that it's ok/not ok to overestimate, that would be
awesome too :)).
> - Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 6/8] ext4: support direct I/O with fscrypt using blk-crypto
  2020-11-17 14:07 ` [PATCH v7 6/8] ext4: " Satya Tangirala
@ 2020-12-03 13:45   ` Theodore Y. Ts'o
  0 siblings, 0 replies; 21+ messages in thread
From: Theodore Y. Ts'o @ 2020-12-03 13:45 UTC (permalink / raw)
  To: Satya Tangirala
  Cc: Jaegeuk Kim, Eric Biggers, Chao Yu, Jens Axboe, Darrick J . Wong,
	linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4, Eric Biggers

On Tue, Nov 17, 2020 at 02:07:06PM +0000, Satya Tangirala wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> Wire up ext4 with fscrypt direct I/O support. Direct I/O with fscrypt is
> only supported through blk-crypto (i.e. CONFIG_BLK_INLINE_ENCRYPTION must
> have been enabled, the 'inlinecrypt' mount option must have been specified,
> and either hardware inline encryption support must be present or
> CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK must have been enabled). Further,
> direct I/O on encrypted files is only supported when the *length* of the
> I/O is aligned to the filesystem block size (which is *not* necessarily the
> same as the block device's block size).
> 
> fscrypt_limit_io_blocks() is called before setting up the iomap to ensure
> that the blocks of each bio that iomap will submit will have contiguous
> DUNs. Note that fscrypt_limit_io_blocks() is normally a no-op, as normally
> the DUNs simply increment along with the logical blocks. But it's needed
> to handle an edge case in one of the fscrypt IV generation methods.
> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> Co-developed-by: Satya Tangirala <satyat@google.com>
> Signed-off-by: Satya Tangirala <satyat@google.com>
> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
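
(For readers following the thread: the call described above boils down to
roughly the following in ext4's ->iomap_begin(), clamping the mapping so
every bio built from it has contiguous DUNs -- sketch only, assuming the
fscrypt_limit_io_blocks() signature from this series:)

	map.m_len = fscrypt_limit_io_blocks(inode, map.m_lblk, map.m_len);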

Acked-by: Theodore Ts'o <tytso@mit.edu>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto
  2020-11-17 17:15 ` [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Theodore Y. Ts'o
  2020-11-17 17:20   ` Darrick J. Wong
@ 2020-12-03 23:57   ` Satya Tangirala
  1 sibling, 0 replies; 21+ messages in thread
From: Satya Tangirala @ 2020-12-03 23:57 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Jaegeuk Kim, Eric Biggers, Chao Yu, Jens Axboe, Darrick J . Wong,
	linux-kernel, linux-fscrypt, linux-f2fs-devel, linux-xfs,
	linux-block, linux-ext4

On Tue, Nov 17, 2020 at 12:15:26PM -0500, Theodore Y. Ts'o wrote:
> What is the expected use case for Direct I/O using fscrypt?  This
> isn't a problem which is unique to fscrypt, but one of the really
> unfortunate aspects of the DIO interface is the silent fallback to
> buffered I/O.  We've lived with this because DIO goes back decades,
> and the original use case was to keep enterprise databases happy, and
> the rules around what is necessary for DIO to work were relatively well
> understood.
> 
> But with fscrypt, there are going to be some additional requirements
> (e.g., using inline crypto), or else DIO will silently fall back to
> buffered I/O for encrypted files.  Depending on the intended use case
> of DIO with fscrypt, this caveat might or might not be unfortunately
> surprising for applications.
> 
> I wonder if we should have some kind of interface so we can more
> explicitly allow applications to query exactly what the requirements
> might be for a particular file vis-a-vis Direct I/O.  What are the
> memory alignment requirements, what are the file offset alignment
> requirements, what are the write size requirements, for a particular
> file.
> 
(Credit to Eric for the description of use cases that I'm
copying/summarizing here).
The primary motivation for this patch series is Android - some devices use
zram with cold page writeback to an encrypted swap file, so direct I/O is
needed to avoid double-caching the data in the swap file. In general, this
patch series is useful for avoiding double-caching any time a loopback
device is created in an encrypted directory. We also expect this to be
useful for databases that want to use direct I/O but also want to encrypt
data at the FS level.

I do think a good way to tell userspace about the DIO requirements would
be great to have. Userspace does have ways to access most, but not all, of
the information it needs to figure out the DIO requirements (I don't think
userspace currently has any way of figuring out whether inline encryption
hardware is available), so it would be nice if there were a good/unified
API for getting those requirements.

Do you think we'll need that before these patches can go in, though? I do
think the patches as-is are useful for their primary use case even without
the ability to explicitly query for the DIO requirements, because Android
devices are predictable w.r.t. inline encryption support (devices either
ship with blk-crypto-fallback enabled or have inline encryption hardware,
and the patchset's requirements are met in either case). And even when used
on machines without such predictability, this patch series is at worst the
same as the current situation, and at best an improvement.
> 						- Ted

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-12-03 23:58 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-17 14:07 [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Satya Tangirala
2020-11-17 14:07 ` [PATCH v7 1/8] block: ensure bios are not split in middle of crypto data unit Satya Tangirala
2020-11-17 23:31   ` Eric Biggers
2020-11-18  0:38     ` Satya Tangirala
2020-11-25 22:12       ` Eric Biggers
2020-12-02 22:52         ` Satya Tangirala
2020-11-25 22:15   ` Eric Biggers
2020-11-17 14:07 ` [PATCH v7 2/8] blk-crypto: don't require user buffer alignment Satya Tangirala
2020-11-17 23:51   ` Eric Biggers
2020-11-17 14:07 ` [PATCH v7 3/8] fscrypt: add functions for direct I/O support Satya Tangirala
2020-11-17 14:07 ` [PATCH v7 4/8] direct-io: add support for fscrypt using blk-crypto Satya Tangirala
2020-11-17 14:07 ` [PATCH v7 5/8] iomap: support direct I/O with " Satya Tangirala
2020-11-17 14:07 ` [PATCH v7 6/8] ext4: " Satya Tangirala
2020-12-03 13:45   ` Theodore Y. Ts'o
2020-11-17 14:07 ` [PATCH v7 7/8] f2fs: " Satya Tangirala
2020-11-17 14:07 ` [PATCH v7 8/8] fscrypt: update documentation for direct I/O support Satya Tangirala
2020-11-18  2:38   ` Eric Biggers
2020-11-17 17:15 ` [PATCH v7 0/8] add support for direct I/O with fscrypt using blk-crypto Theodore Y. Ts'o
2020-11-17 17:20   ` Darrick J. Wong
2020-12-03 23:57   ` Satya Tangirala
2020-11-18  2:51 ` Eric Biggers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).