* [PATCH v4 0/4] ceph: size handling for the fscrypt
@ 2021-11-01  2:04 xiubli
  2021-11-01  2:04 ` [PATCH v4 1/4] Revert "ceph: make client zero partial trailing block on truncate" xiubli
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: xiubli @ 2021-11-01  2:04 UTC (permalink / raw)
  To: jlayton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

This patch series is based on the "fscrypt_size_handling" branch in
https://github.com/lxbsz/linux.git, which is based on Jeff's
"ceph-fscrypt-content-experimental" branch in
https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
with two upstream commits added, which should already be merged.

These two upstream commits should be removed after Jeff rebases
his "ceph-fscrypt-content-experimental" branch onto the upstream code.

====

This approach is based on the discussion from V1 and V2: pass the
encrypted last block contents to the MDS along with the truncate
request.

The client will send the encrypted last block contents to the MDS
along with the truncate request when truncating to a smaller size
that is not aligned to the BLOCK SIZE.
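
Roughly, the client-side check looks like this (a minimal sketch of
the __ceph_setattr() hunk in patch 4; fill_fscrypt is the local flag
that patch uses to mark that the last block must be attached):

	/* RMW only for a shrinking truncate to an unaligned size */
	if (IS_ENCRYPTED(inode) && attr->ia_size < isize &&
	    (attr->ia_size % CEPH_FSCRYPT_BLOCK_SIZE)) {
		/* read, re-encrypt and attach the last block later */
		fill_fscrypt = true;
	}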

The MDS side patch is raised in PR
https://github.com/ceph/ceph/pull/43588, which is also based on Jeff's
previous great work in PR https://github.com/ceph/ceph/pull/41284.

The MDS will use filer.write_trunc(), which can update and truncate
the file in one shot, instead of filer.truncate().

This assumes the kclient won't support the inline data feature, which
will be removed soon; for more detail, see:
https://tracker.ceph.com/issues/52916

Changed in V4:
- Retry the truncate request 20 times before failing it with -EAGAIN.
- Remove the "fill_last_block" label and move the code to the else branch.
- Remove the #3 patch from the V3 series, which has already been sent
  out separately.
- Improve some comments in the code.


Changed in V3:
- Fix possible file corruption when, just before the MDS acquires the
  xlock for the FILE lock, another client has updated the file.
- Flush the pagecache buffer before reading the last block when
  filling the truncate request.
- Some other minor fixes.

Xiubo Li (4):
  Revert "ceph: make client zero partial trailing block on truncate"
  ceph: add __ceph_get_caps helper support
  ceph: add __ceph_sync_read helper support
  ceph: add truncate size handling support for fscrypt

 fs/ceph/caps.c              |  21 ++--
 fs/ceph/file.c              |  44 +++++---
 fs/ceph/inode.c             | 203 ++++++++++++++++++++++++++++++------
 fs/ceph/super.h             |   6 +-
 include/linux/ceph/crypto.h |  28 +++++
 5 files changed, 251 insertions(+), 51 deletions(-)
 create mode 100644 include/linux/ceph/crypto.h

-- 
2.27.0



* [PATCH v4 1/4] Revert "ceph: make client zero partial trailing block on truncate"
  2021-11-01  2:04 [PATCH v4 0/4] ceph: size handling for the fscrypt xiubli
@ 2021-11-01  2:04 ` xiubli
  2021-11-01  2:04 ` [PATCH v4 2/4] ceph: add __ceph_get_caps helper support xiubli
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: xiubli @ 2021-11-01  2:04 UTC (permalink / raw)
  To: jlayton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

This reverts commit c97968122078ce0380cd8db405b8505a8b0a55d8.
---
 fs/ceph/file.c  |  3 ++-
 fs/ceph/inode.c | 23 ++---------------------
 fs/ceph/super.h |  1 -
 3 files changed, 4 insertions(+), 23 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index ee13512b610d..af58be73ce1c 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -2250,7 +2250,8 @@ static void ceph_zero_pagecache_range(struct inode *inode, loff_t offset,
 		ceph_zero_partial_page(inode, offset, length);
 }
 
-int ceph_zero_partial_object(struct inode *inode, loff_t offset, loff_t *length)
+static int ceph_zero_partial_object(struct inode *inode,
+				    loff_t offset, loff_t *length)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 5d47b98b61af..9b798690fdc9 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2393,6 +2393,7 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 					cpu_to_le64(round_up(isize,
 							     CEPH_FSCRYPT_BLOCK_SIZE));
 				req->r_fscrypt_file = attr->ia_size;
+				/* FIXME: client must zero out any partial blocks! */
 			} else {
 				req->r_args.setattr.size = cpu_to_le64(attr->ia_size);
 				req->r_args.setattr.old_size = cpu_to_le64(isize);
@@ -2481,28 +2482,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 	ceph_mdsc_put_request(req);
 	ceph_free_cap_flush(prealloc_cf);
 
-	if (err >= 0 && (mask & (CEPH_SETATTR_SIZE|CEPH_SETATTR_FSCRYPT_FILE))) {
+	if (err >= 0 && (mask & CEPH_SETATTR_SIZE))
 		__ceph_do_pending_vmtruncate(inode);
-		if (mask & CEPH_SETATTR_FSCRYPT_FILE) {
-			loff_t orig_len, len;
-
-			len = round_up(attr->ia_size, CEPH_FSCRYPT_BLOCK_SIZE) - attr->ia_size;
-			orig_len = len;
-
-			/*
-			 * FIXME: this is just doing the truncating the last OSD
-			 * 	  object, but for "real" fscrypt support, we need
-			 * 	  to do a RMW with the end of the block zeroed out.
-			 */
-			if (len) {
-				err = ceph_zero_partial_object(inode, attr->ia_size, &len);
-				/* This had better not be shortened */
-				WARN_ONCE(!err && len != orig_len,
-					  "attr->ia_size=%lld orig_len=%lld len=%lld\n",
-					  attr->ia_size, orig_len, len);
-			}
-		}
-	}
 
 	return err;
 }
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 6d4a22c6d32d..7f3976b3319d 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1236,7 +1236,6 @@ extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 extern int ceph_release(struct inode *inode, struct file *filp);
 extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page,
 				  char *data, size_t len);
-int ceph_zero_partial_object(struct inode *inode, loff_t offset, loff_t *length);
 
 /* dir.c */
 extern const struct file_operations ceph_dir_fops;
-- 
2.27.0



* [PATCH v4 2/4] ceph: add __ceph_get_caps helper support
  2021-11-01  2:04 [PATCH v4 0/4] ceph: size handling for the fscrypt xiubli
  2021-11-01  2:04 ` [PATCH v4 1/4] Revert "ceph: make client zero partial trailing block on truncate" xiubli
@ 2021-11-01  2:04 ` xiubli
  2021-11-01  2:04 ` [PATCH v4 3/4] ceph: add __ceph_sync_read " xiubli
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: xiubli @ 2021-11-01  2:04 UTC (permalink / raw)
  To: jlayton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/caps.c  | 19 +++++++++++++------
 fs/ceph/super.h |  2 ++
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index d628dcdbf869..4e2a588465c5 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2876,10 +2876,9 @@ int ceph_try_get_caps(struct inode *inode, int need, int want,
  * due to a small max_size, make sure we check_max_size (and possibly
  * ask the mds) so we don't get hung up indefinitely.
  */
-int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got)
+int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, int need,
+		    int want, loff_t endoff, int *got)
 {
-	struct ceph_file_info *fi = filp->private_data;
-	struct inode *inode = file_inode(filp);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	int ret, _got, flags;
@@ -2888,7 +2887,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 	if (ret < 0)
 		return ret;
 
-	if ((fi->fmode & CEPH_FILE_MODE_WR) &&
+	if (fi && (fi->fmode & CEPH_FILE_MODE_WR) &&
 	    fi->filp_gen != READ_ONCE(fsc->filp_gen))
 		return -EBADF;
 
@@ -2896,7 +2895,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 
 	while (true) {
 		flags &= CEPH_FILE_MODE_MASK;
-		if (atomic_read(&fi->num_locks))
+		if (fi && atomic_read(&fi->num_locks))
 			flags |= CHECK_FILELOCK;
 		_got = 0;
 		ret = try_get_cap_refs(inode, need, want, endoff,
@@ -2941,7 +2940,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 				continue;
 		}
 
-		if ((fi->fmode & CEPH_FILE_MODE_WR) &&
+		if (fi && (fi->fmode & CEPH_FILE_MODE_WR) &&
 		    fi->filp_gen != READ_ONCE(fsc->filp_gen)) {
 			if (ret >= 0 && _got)
 				ceph_put_cap_refs(ci, _got);
@@ -3004,6 +3003,14 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 	return 0;
 }
 
+int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got)
+{
+	struct ceph_file_info *fi = filp->private_data;
+	struct inode *inode = file_inode(filp);
+
+	return __ceph_get_caps(inode, fi, need, want, endoff, got);
+}
+
 /*
  * Take cap refs.  Caller must already know we hold at least one ref
  * on the caps in question or we don't know this is safe.
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 7f3976b3319d..027d5f579ba0 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1208,6 +1208,8 @@ extern int ceph_encode_dentry_release(void **p, struct dentry *dn,
 				      struct inode *dir,
 				      int mds, int drop, int unless);
 
+extern int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi,
+			   int need, int want, loff_t endoff, int *got);
 extern int ceph_get_caps(struct file *filp, int need, int want,
 			 loff_t endoff, int *got);
 extern int ceph_try_get_caps(struct inode *inode,
-- 
2.27.0



* [PATCH v4 3/4] ceph: add __ceph_sync_read helper support
  2021-11-01  2:04 [PATCH v4 0/4] ceph: size handling for the fscrypt xiubli
  2021-11-01  2:04 ` [PATCH v4 1/4] Revert "ceph: make client zero partial trailing block on truncate" xiubli
  2021-11-01  2:04 ` [PATCH v4 2/4] ceph: add __ceph_get_caps helper support xiubli
@ 2021-11-01  2:04 ` xiubli
  2021-11-01  2:04 ` [PATCH v4 4/4] ceph: add truncate size handling support for fscrypt xiubli
  2021-11-01 10:27 ` [PATCH v4 0/4] ceph: size handling for the fscrypt Jeff Layton
  4 siblings, 0 replies; 13+ messages in thread
From: xiubli @ 2021-11-01  2:04 UTC (permalink / raw)
  To: jlayton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/file.c  | 35 +++++++++++++++++++++++------------
 fs/ceph/super.h |  2 ++
 2 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index af58be73ce1c..9ce78c97de9a 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -901,21 +901,18 @@ static inline void fscrypt_adjust_off_and_len(struct inode *inode, u64 *off, u64
  * If we get a short result from the OSD, check against i_size; we need to
  * only return a short read to the caller if we hit EOF.
  */
-static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to,
-			      int *retry_op)
+ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
+			 struct iov_iter *to, int *retry_op)
 {
-	struct file *file = iocb->ki_filp;
-	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	struct ceph_osd_client *osdc = &fsc->client->osdc;
 	ssize_t ret;
-	u64 off = iocb->ki_pos;
+	u64 off = *ki_pos;
 	u64 len = iov_iter_count(to);
 	u64 i_size;
 
-	dout("sync_read on file %p %llu~%u %s\n", file, off, (unsigned)len,
-	     (file->f_flags & O_DIRECT) ? "O_DIRECT" : "");
+	dout("sync_read on inode %p %llu~%u\n", inode, *ki_pos, (unsigned)len);
 
 	if (!len)
 		return 0;
@@ -1058,14 +1055,14 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to,
 			break;
 	}
 
-	if (off > iocb->ki_pos) {
+	if (off > *ki_pos) {
 		if (off >= i_size) {
 			*retry_op = CHECK_EOF;
-			ret = i_size - iocb->ki_pos;
-			iocb->ki_pos = i_size;
+			ret = i_size - *ki_pos;
+			*ki_pos = i_size;
 		} else {
-			ret = off - iocb->ki_pos;
-			iocb->ki_pos = off;
+			ret = off - *ki_pos;
+			*ki_pos = off;
 		}
 	}
 out:
@@ -1073,6 +1070,20 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to,
 	return ret;
 }
 
+static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to,
+			      int *retry_op)
+{
+	struct file *file = iocb->ki_filp;
+	struct inode *inode = file_inode(file);
+
+	dout("sync_read on file %p %llu~%u %s\n", file, iocb->ki_pos,
+	     (unsigned)iov_iter_count(to),
+	     (file->f_flags & O_DIRECT) ? "O_DIRECT" : "");
+
+	return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op);
+}
+
 struct ceph_aio_request {
 	struct kiocb *iocb;
 	size_t total_len;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 027d5f579ba0..57bc952c54e1 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1235,6 +1235,8 @@ extern int ceph_renew_caps(struct inode *inode, int fmode);
 extern int ceph_open(struct inode *inode, struct file *file);
 extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 			    struct file *file, unsigned flags, umode_t mode);
+extern ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
+				struct iov_iter *to, int *retry_op);
 extern int ceph_release(struct inode *inode, struct file *filp);
 extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page,
 				  char *data, size_t len);
-- 
2.27.0



* [PATCH v4 4/4] ceph: add truncate size handling support for fscrypt
  2021-11-01  2:04 [PATCH v4 0/4] ceph: size handling for the fscrypt xiubli
                   ` (2 preceding siblings ...)
  2021-11-01  2:04 ` [PATCH v4 3/4] ceph: add __ceph_sync_read " xiubli
@ 2021-11-01  2:04 ` xiubli
  2021-11-01 10:27 ` [PATCH v4 0/4] ceph: size handling for the fscrypt Jeff Layton
  4 siblings, 0 replies; 13+ messages in thread
From: xiubli @ 2021-11-01  2:04 UTC (permalink / raw)
  To: jlayton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

This transfers the encrypted last block contents to the MDS along
with the truncate request, but only when the new size is smaller
and not aligned to the fscrypt BLOCK size. When the last block is
located in a file hole, the truncate request will only contain
the header.

The MDS could fail to do the truncate if another client or process
has already updated the RADOS object containing the last block; in
that case it will return -EAGAIN and the kclient needs to retry.
The RMW takes around 50ms, so let the client retry 20 times for now.
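
For reference, below is a minimal sketch of how the payload header is
filled in the two cases; it mirrors fill_fscrypt_truncate() in this
patch, with "last_block_is_hole" standing in for the real short-read
check done there:

	struct ceph_fscrypt_truncate_size_header header;

	header.ver = 1;
	header.compat = 1;
	if (last_block_is_hole) {
		/* header only: sizeof(assert_ver + file_offset + block_size) */
		header.data_len = cpu_to_le32(8 + 8 + 4);
		header.assert_ver = cpu_to_le64(0);
		header.file_offset = cpu_to_le64(0);
		header.block_size = cpu_to_le32(0);
	} else {
		/* header plus the re-encrypted last block */
		header.data_len = cpu_to_le32(8 + 8 + 4 + CEPH_FSCRYPT_BLOCK_SIZE);
		header.assert_ver = cpu_to_le64(assert_ver); /* from the read */
		header.file_offset = cpu_to_le64(orig_pos);
		header.block_size = cpu_to_le32(CEPH_FSCRYPT_BLOCK_SIZE);
	}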

Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/caps.c              |   2 -
 fs/ceph/file.c              |  10 +-
 fs/ceph/inode.c             | 182 ++++++++++++++++++++++++++++++++++--
 fs/ceph/super.h             |   3 +-
 include/linux/ceph/crypto.h |  28 ++++++
 5 files changed, 211 insertions(+), 14 deletions(-)
 create mode 100644 include/linux/ceph/crypto.h

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 4e2a588465c5..c9624b059eb0 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1299,8 +1299,6 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg)
 	 * fscrypt_auth holds the crypto context (if any). fscrypt_file
 	 * tracks the real i_size as an __le64 field (and we use a rounded-up
 	 * i_size in * the traditional size field).
-	 *
-	 * FIXME: should we encrypt fscrypt_file field?
 	 */
 	ceph_encode_32(&p, arg->fscrypt_auth_len);
 	ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len);
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 9ce78c97de9a..8673a4dc5538 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -902,7 +902,8 @@ static inline void fscrypt_adjust_off_and_len(struct inode *inode, u64 *off, u64
  * only return a short read to the caller if we hit EOF.
  */
 ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
-			 struct iov_iter *to, int *retry_op)
+			 struct iov_iter *to, int *retry_op,
+			 u64 *assert_ver)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
@@ -978,6 +979,9 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 					 req->r_end_latency,
 					 len, ret);
 
+		/* Grab assert version. It must be non-zero. */
+		*assert_ver = req->r_version;
+		WARN_ON_ONCE(*assert_ver == 0);
 		ceph_osdc_put_request(req);
 
 		i_size = i_size_read(inode);
@@ -1075,12 +1079,14 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to,
 {
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = file_inode(file);
+	u64 assert_ver;
 
 	dout("sync_read on file %p %llu~%u %s\n", file, iocb->ki_pos,
 	     (unsigned)iov_iter_count(to),
 	     (file->f_flags & O_DIRECT) ? "O_DIRECT" : "");
 
-	return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op);
+	return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op,
+				&assert_ver);
 
 }
 
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 9b798690fdc9..d84692d6609a 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -21,6 +21,7 @@
 #include "cache.h"
 #include "crypto.h"
 #include <linux/ceph/decode.h>
+#include <linux/ceph/crypto.h>
 
 /*
  * Ceph inode operations
@@ -1034,10 +1035,14 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
 		pool_ns = old_ns;
 
 		if (IS_ENCRYPTED(inode) && size &&
-		    (iinfo->fscrypt_file_len == sizeof(__le64))) {
-			size = __le64_to_cpu(*(__le64 *)iinfo->fscrypt_file);
-			if (info->size != round_up(size, CEPH_FSCRYPT_BLOCK_SIZE))
-				pr_warn("size=%llu fscrypt_file=%llu\n", info->size, size);
+		    (iinfo->fscrypt_file_len >= sizeof(__le64))) {
+			u64 fsize = __le64_to_cpu(*(__le64 *)iinfo->fscrypt_file);
+			if (fsize) {
+				size = fsize;
+				if (info->size != round_up(size, CEPH_FSCRYPT_BLOCK_SIZE))
+					pr_warn("size=%llu fscrypt_file=%llu\n",
+						info->size, size);
+			}
 		}
 
 		queue_trunc = ceph_fill_file_size(inode, issued,
@@ -2229,6 +2234,129 @@ static const struct inode_operations ceph_encrypted_symlink_iops = {
 	.listxattr = ceph_listxattr,
 };
 
+/*
+ * Transfer the encrypted last block to the MDS and the MDS
+ * will update the file when truncating to a smaller size.
+ *
+ * We don't support a PAGE_SIZE that is smaller than the
+ * CEPH_FSCRYPT_BLOCK_SIZE.
+ */
+static int fill_fscrypt_truncate(struct inode *inode,
+				 struct ceph_mds_request *req,
+				 struct iattr *attr)
+{
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	int boff = attr->ia_size % CEPH_FSCRYPT_BLOCK_SIZE;
+	loff_t pos, orig_pos = round_down(attr->ia_size, CEPH_FSCRYPT_BLOCK_SIZE);
+	u64 block = orig_pos >> CEPH_FSCRYPT_BLOCK_SHIFT;
+	struct ceph_pagelist *pagelist = NULL;
+	struct kvec iov = {};	/* zeroed so cleanup can check iov_base */
+	struct iov_iter iter;
+	struct page *page = NULL;
+	struct ceph_fscrypt_truncate_size_header header;
+	int retry_op = 0;
+	int len = CEPH_FSCRYPT_BLOCK_SIZE;
+	loff_t i_size = i_size_read(inode);
+	u64 assert_ver = 0;
+	int got, ret, issued;
+
+	ret = __ceph_get_caps(inode, NULL, CEPH_CAP_FILE_RD, 0, -1, &got);
+	if (ret < 0)
+		return ret;
+
+	dout("%s size %lld -> %lld got cap refs on %s\n", __func__,
+	     i_size, attr->ia_size, ceph_cap_string(got));
+
+	issued = __ceph_caps_issued(ci, NULL);
+
+	/* Try to writeback the dirty pagecaches */
+	if (issued & (CEPH_CAP_FILE_BUFFER))
+		filemap_fdatawrite(&inode->i_data);
+
+	page = __page_cache_alloc(GFP_KERNEL);
+	if (page == NULL) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	pagelist = ceph_pagelist_alloc(GFP_KERNEL);
+	if (!pagelist) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	iov.iov_base = kmap_local_page(page);
+	iov.iov_len = len;
+	iov_iter_kvec(&iter, READ, &iov, 1, len);
+
+	pos = orig_pos;
+	ret = __ceph_sync_read(inode, &pos, &iter, &retry_op, &assert_ver);
+	ceph_put_cap_refs(ci, got);
+
+	/* Fill in the header first */
+	header.ver = 1;
+	header.compat = 1;
+
+	/*
+	 * If we hit a hole here, just skip filling the last block
+	 * data for the request: once fscrypt is enabled, the file
+	 * will be split into many blocks of CEPH_FSCRYPT_BLOCK_SIZE,
+	 * so if there is a hole, its size should be a multiple of
+	 * the block size.
+	 */
+	if (pos < i_size && ret < len) {
+		dout("%s hit hole, ppos %lld < size %lld\n",
+		     __func__, pos, i_size);
+
+		header.data_len = cpu_to_le32(8 + 8 + 4);
+		header.assert_ver = cpu_to_le64(0);
+		header.file_offset = cpu_to_le64(0);
+		header.block_size = cpu_to_le32(0);
+		ret = 0;
+	} else {
+		header.data_len = cpu_to_le32(8 + 8 + 4 + CEPH_FSCRYPT_BLOCK_SIZE);
+		header.assert_ver = cpu_to_le64(assert_ver);
+		header.file_offset = cpu_to_le64(orig_pos);
+		header.block_size = cpu_to_le32(CEPH_FSCRYPT_BLOCK_SIZE);
+
+		/* truncate and zero out the extra contents for the last block */
+		memset(iov.iov_base + boff, 0, PAGE_SIZE - boff);
+
+		/* encrypt the last block */
+		ret = fscrypt_encrypt_block_inplace(inode, page,
+						    CEPH_FSCRYPT_BLOCK_SIZE,
+						    0, block,
+						    GFP_KERNEL);
+		if (ret)
+			goto out;
+
+	}
+
+	/* Insert the header */
+	ret = ceph_pagelist_append(pagelist, &header, sizeof(header));
+	if (ret)
+		goto out;
+
+	if (header.block_size) {
+		/* Append the last block contents to pagelist */
+		ret = ceph_pagelist_append(pagelist, iov.iov_base,
+					   CEPH_FSCRYPT_BLOCK_SIZE);
+		if (ret)
+			goto out;
+	}
+	req->r_pagelist = pagelist;
+out:
+	dout("%s %p size dropping cap refs on %s\n", __func__,
+	     inode, ceph_cap_string(got));
+	if (iov.iov_base)
+		kunmap_local(iov.iov_base);
+	if (page)
+		__free_pages(page, 0);
+	if (ret && pagelist)
+		ceph_pagelist_release(pagelist);
+	return ret;
+}
+
 int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *cia)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
@@ -2236,12 +2364,15 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 	struct ceph_mds_request *req;
 	struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc;
 	struct ceph_cap_flush *prealloc_cf;
+	loff_t isize = i_size_read(inode);
 	int issued;
 	int release = 0, dirtied = 0;
 	int mask = 0;
 	int err = 0;
 	int inode_dirty_flags = 0;
 	bool lock_snap_rwsem = false;
+	bool fill_fscrypt;
+	int truncate_retry = 20; /* The RMW will take around 50ms */
 
 	prealloc_cf = ceph_alloc_cap_flush();
 	if (!prealloc_cf)
@@ -2254,6 +2385,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 		return PTR_ERR(req);
 	}
 
+retry:
+	fill_fscrypt = false;
 	spin_lock(&ci->i_ceph_lock);
 	issued = __ceph_caps_issued(ci, NULL);
 
@@ -2367,10 +2500,27 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 		}
 	}
 	if (ia_valid & ATTR_SIZE) {
-		loff_t isize = i_size_read(inode);
-
 		dout("setattr %p size %lld -> %lld\n", inode, isize, attr->ia_size);
-		if ((issued & CEPH_CAP_FILE_EXCL) && attr->ia_size >= isize) {
+		/*
+		 * Only when the new size is smaller and not aligned to
+		 * CEPH_FSCRYPT_BLOCK_SIZE is the RMW needed.
+		 */
+		if (IS_ENCRYPTED(inode) && attr->ia_size < isize &&
+		    (attr->ia_size % CEPH_FSCRYPT_BLOCK_SIZE)) {
+			mask |= CEPH_SETATTR_SIZE;
+			release |= CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL |
+				   CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR;
+			set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags);
+			mask |= CEPH_SETATTR_FSCRYPT_FILE;
+			req->r_args.setattr.size =
+				cpu_to_le64(round_up(attr->ia_size,
+						     CEPH_FSCRYPT_BLOCK_SIZE));
+			req->r_args.setattr.old_size =
+				cpu_to_le64(round_up(isize,
+						     CEPH_FSCRYPT_BLOCK_SIZE));
+			req->r_fscrypt_file = attr->ia_size;
+			fill_fscrypt = true;
+		} else if ((issued & CEPH_CAP_FILE_EXCL) && attr->ia_size >= isize) {
 			if (attr->ia_size > isize) {
 				i_size_write(inode, attr->ia_size);
 				inode->i_blocks = calc_inode_blocks(attr->ia_size);
@@ -2393,7 +2543,6 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 					cpu_to_le64(round_up(isize,
 							     CEPH_FSCRYPT_BLOCK_SIZE));
 				req->r_fscrypt_file = attr->ia_size;
-				/* FIXME: client must zero out any partial blocks! */
 			} else {
 				req->r_args.setattr.size = cpu_to_le64(attr->ia_size);
 				req->r_args.setattr.old_size = cpu_to_le64(isize);
@@ -2465,7 +2614,6 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 	if (inode_dirty_flags)
 		__mark_inode_dirty(inode, inode_dirty_flags);
 
-
 	if (mask) {
 		req->r_inode = inode;
 		ihold(inode);
@@ -2473,7 +2621,23 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 		req->r_args.setattr.mask = cpu_to_le32(mask);
 		req->r_num_caps = 1;
 		req->r_stamp = attr->ia_ctime;
+		if (fill_fscrypt) {
+			err = fill_fscrypt_truncate(inode, req, attr);
+			if (err)
+				goto out;
+		}
+
+		/*
+		 * The truncate request will return -EAGAIN when someone
+		 * has updated the last block before the MDS holds the
+		 * xlock for the FILE lock. We need to retry it then.
+		 */
 		err = ceph_mdsc_do_request(mdsc, NULL, req);
+		if (err == -EAGAIN && truncate_retry--) {
+			dout("setattr %p result=%d (%s locally, %d remote), retry it!\n",
+			     inode, err, ceph_cap_string(dirtied), mask);
+			goto retry;
+		}
 	}
 out:
 	dout("setattr %p result=%d (%s locally, %d remote)\n", inode, err,
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 57bc952c54e1..c8144273ff28 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1236,7 +1236,8 @@ extern int ceph_open(struct inode *inode, struct file *file);
 extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 			    struct file *file, unsigned flags, umode_t mode);
 extern ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
-				struct iov_iter *to, int *retry_op);
+				struct iov_iter *to, int *retry_op,
+				u64 *assert_ver);
 extern int ceph_release(struct inode *inode, struct file *filp);
 extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page,
 				  char *data, size_t len);
diff --git a/include/linux/ceph/crypto.h b/include/linux/ceph/crypto.h
new file mode 100644
index 000000000000..2b0961902887
--- /dev/null
+++ b/include/linux/ceph/crypto.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _FS_CEPH_CRYPTO_H
+#define _FS_CEPH_CRYPTO_H
+
+#include <linux/types.h>
+
+/*
+ * Header for the encrypted file when truncating the size. This
+ * will be sent to the MDS, and the MDS will update the encrypted
+ * last block and then truncate the size.
+ */
+struct ceph_fscrypt_truncate_size_header {
+       __u8  ver;
+       __u8  compat;
+
+       /*
+        * It will be sizeof(assert_ver + file_offset + block_size)
+        * if the last block is empty, i.e. it is located in a file
+        * hole. Otherwise data_len will also include
+        * CEPH_FSCRYPT_BLOCK_SIZE.
+        */
+       __le32 data_len;
+
+       __le64 assert_ver;
+       __le64 file_offset;
+       __le32 block_size;
+} __packed;
+
+#endif
-- 
2.27.0



* Re: [PATCH v4 0/4] ceph: size handling for the fscrypt
  2021-11-01  2:04 [PATCH v4 0/4] ceph: size handling for the fscrypt xiubli
                   ` (3 preceding siblings ...)
  2021-11-01  2:04 ` [PATCH v4 4/4] ceph: add truncate size handling support for fscrypt xiubli
@ 2021-11-01 10:27 ` Jeff Layton
  2021-11-01 17:07   ` Jeff Layton
  2021-11-02  9:44   ` Xiubo Li
  4 siblings, 2 replies; 13+ messages in thread
From: Jeff Layton @ 2021-11-01 10:27 UTC (permalink / raw)
  To: xiubli; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel

On Mon, 2021-11-01 at 10:04 +0800, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> This patch series is based on the "fscrypt_size_handling" branch in
> https://github.com/lxbsz/linux.git, which is based on Jeff's
> "ceph-fscrypt-content-experimental" branch in
> https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
> with two upstream commits added, which should already be merged.
> 
> These two upstream commits should be removed after Jeff rebases
> his "ceph-fscrypt-content-experimental" branch onto the upstream code.
> 

I don't think I was clear last time. I'd like for you to post the
_entire_ stack of patches that is based on top of
ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
this point, so I think it's a reasonable place for you to base your
work. That way you're not beginning with a revert.

Again, feel free to cherry-pick any of the patches in any of my other
branches for your series, but I'd like to see a complete series of
patches.

Thanks,
Jeff


> ====
> 
> This approach is based on the discussion from V1 and V2: pass the
> encrypted last block contents to the MDS along with the truncate
> request.
> 
> The client will send the encrypted last block contents to the MDS
> along with the truncate request when truncating to a smaller size
> that is not aligned to the BLOCK SIZE.
> 
> The MDS side patch is raised in PR
> https://github.com/ceph/ceph/pull/43588, which is also based on Jeff's
> previous great work in PR https://github.com/ceph/ceph/pull/41284.
> 
> The MDS will use filer.write_trunc(), which can update and truncate
> the file in one shot, instead of filer.truncate().
> 
> This assumes the kclient won't support the inline data feature, which
> will be removed soon; for more detail, see:
> https://tracker.ceph.com/issues/52916
> 
> Changed in V4:
> - Retry the truncate request 20 times before failing it with -EAGAIN.
> - Remove the "fill_last_block" label and move the code to the else branch.
> - Remove the #3 patch from the V3 series, which has already been sent
>   out separately.
> - Improve some comments in the code.
> 
> 
> Changed in V3:
> - Fix possible file corruption when, just before the MDS acquires the
>   xlock for the FILE lock, another client has updated the file.
> - Flush the pagecache buffer before reading the last block when
>   filling the truncate request.
> - Some other minor fixes.
> 
> Xiubo Li (4):
>   Revert "ceph: make client zero partial trailing block on truncate"
>   ceph: add __ceph_get_caps helper support
>   ceph: add __ceph_sync_read helper support
>   ceph: add truncate size handling support for fscrypt
> 
>  fs/ceph/caps.c              |  21 ++--
>  fs/ceph/file.c              |  44 +++++---
>  fs/ceph/inode.c             | 203 ++++++++++++++++++++++++++++++------
>  fs/ceph/super.h             |   6 +-
>  include/linux/ceph/crypto.h |  28 +++++
>  5 files changed, 251 insertions(+), 51 deletions(-)
>  create mode 100644 include/linux/ceph/crypto.h
> 

-- 
Jeff Layton <jlayton@kernel.org>



* Re: [PATCH v4 0/4] ceph: size handling for the fscrypt
  2021-11-01 10:27 ` [PATCH v4 0/4] ceph: size handling for the fscrypt Jeff Layton
@ 2021-11-01 17:07   ` Jeff Layton
  2021-11-02  1:02     ` Xiubo Li
  2021-11-02  9:44   ` Xiubo Li
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2021-11-01 17:07 UTC (permalink / raw)
  To: xiubli; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel

On Mon, 2021-11-01 at 06:27 -0400, Jeff Layton wrote:
> On Mon, 2021-11-01 at 10:04 +0800, xiubli@redhat.com wrote:
> > From: Xiubo Li <xiubli@redhat.com>
> > 
> > This patch series is based on the "fscrypt_size_handling" branch in
> > https://github.com/lxbsz/linux.git, which is based on Jeff's
> > "ceph-fscrypt-content-experimental" branch in
> > https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
> > with two upstream commits added, which should already be merged.
> > 
> > These two upstream commits should be removed after Jeff rebases
> > his "ceph-fscrypt-content-experimental" branch onto the upstream code.
> > 
> 
> I don't think I was clear last time. I'd like for you to post the
> _entire_ stack of patches that is based on top of
> ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
> this point, so I think it's a reasonable place for you to base your
> work. That way you're not beginning with a revert.
> 
> Again, feel free to cherry-pick any of the patches in any of my other
> branches for your series, but I'd like to see a complete series of
> patches.
> 
> 

To be even more clear:

The main reason this patchset is not helpful is that the
ceph-fscrypt-content-experimental branch in my tree has bitrotted in the
face of other changes that have gone into the testing branch since it
was cut. Also, that branch had several patches that added in actual
encryption of the content, which we don't want to do at this point.

For the work you're doing, what I'd like to see is a patchset based on
top of the ceph-client/wip-fscrypt-fnames branch. That patchset should
make it so that when encryption is enabled, the size handling for the
inode is changed to use the new scheme we've added, but don't do any
actual content encryption yet. Feel free to pick any of the patches in
my trees that you need to do this, but it needs to be the whole series.

What we need to be able to do in this phase is validate that the size
handling is correct in both the encrypted and non-encrypted cases, but
without encrypting the data. Once that piece is working, then we should
be able to add in content encryption.


> 
> > ====
> > 
> > This approach is based on the discussion from V1 and V2: pass the
> > encrypted last block contents to the MDS along with the truncate
> > request.
> > 
> > The client will send the encrypted last block contents to the MDS
> > along with the truncate request when truncating to a smaller size
> > that is not aligned to the BLOCK SIZE.
> > 
> > The MDS side patch is raised in PR
> > https://github.com/ceph/ceph/pull/43588, which is also based on Jeff's
> > previous great work in PR https://github.com/ceph/ceph/pull/41284.
> > 
> > The MDS will use filer.write_trunc(), which can update and truncate
> > the file in one shot, instead of filer.truncate().
> > 
> > This assumes the kclient won't support the inline data feature, which
> > will be removed soon; for more detail, see:
> > https://tracker.ceph.com/issues/52916
> > 
> > Changed in V4:
> > - Retry the truncate request 20 times before failing it with -EAGAIN.
> > - Remove the "fill_last_block" label and move the code to the else branch.
> > - Remove the #3 patch from the V3 series, which has already been sent
> >   out separately.
> > - Improve some comments in the code.
> > 
> > 
> > Changed in V3:
> > - Fix possible file corruption when, just before the MDS acquires the
> >   xlock for the FILE lock, another client has updated the file.
> > - Flush the pagecache buffer before reading the last block when
> >   filling the truncate request.
> > - Some other minor fixes.
> > 
> > Xiubo Li (4):
> >   Revert "ceph: make client zero partial trailing block on truncate"
> >   ceph: add __ceph_get_caps helper support
> >   ceph: add __ceph_sync_read helper support
> >   ceph: add truncate size handling support for fscrypt
> > 
> >  fs/ceph/caps.c              |  21 ++--
> >  fs/ceph/file.c              |  44 +++++---
> >  fs/ceph/inode.c             | 203 ++++++++++++++++++++++++++++++------
> >  fs/ceph/super.h             |   6 +-
> >  include/linux/ceph/crypto.h |  28 +++++
> >  5 files changed, 251 insertions(+), 51 deletions(-)
> >  create mode 100644 include/linux/ceph/crypto.h
> > 
> 

-- 
Jeff Layton <jlayton@kernel.org>



* Re: [PATCH v4 0/4] ceph: size handling for the fscrypt
  2021-11-01 17:07   ` Jeff Layton
@ 2021-11-02  1:02     ` Xiubo Li
  0 siblings, 0 replies; 13+ messages in thread
From: Xiubo Li @ 2021-11-02  1:02 UTC (permalink / raw)
  To: Jeff Layton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel


On 11/2/21 1:07 AM, Jeff Layton wrote:
> On Mon, 2021-11-01 at 06:27 -0400, Jeff Layton wrote:
>> On Mon, 2021-11-01 at 10:04 +0800, xiubli@redhat.com wrote:
>>> From: Xiubo Li <xiubli@redhat.com>
>>>
>>> This patch series is based on the "fscrypt_size_handling" branch in
>>> https://github.com/lxbsz/linux.git, which is based on Jeff's
>>> "ceph-fscrypt-content-experimental" branch in
>>> https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
>>> with two upstream commits added, which should already be merged.
>>>
>>> These two upstream commits should be removed after Jeff rebases
>>> his "ceph-fscrypt-content-experimental" branch onto the upstream code.
>>>
>> I don't think I was clear last time. I'd like for you to post the
>> _entire_ stack of patches that is based on top of
>> ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
>> this point, so I think it's a reasonable place for you to base your
>> work. That way you're not beginning with a revert.
>>
>> Again, feel free to cherry-pick any of the patches in any of my other
>> branches for your series, but I'd like to see a complete series of
>> patches.
>>
>>
> To be even more clear:
>
> The main reason this patchset is not helpful is that the
> ceph-fscrypt-content-experimental branch in my tree has bitrotted in the
> face of other changes that have gone into the testing branch since it
> was cut. Also, that branch had several patches that added in actual
> encryption of the content, which we don't want to do at this point.
>
> For the work you're doing, what I'd like to see is a patchset based on
> top of the ceph-client/wip-fscrypt-fnames branch. That patchset should
> make it so that when encryption is enabled, the size handling for the
> inode is changed to use the new scheme we've added, but don't do any
> actual content encryption yet. Feel free to pick any of the patches in
> my trees that you need to do this, but it needs to be the whole series.
>
> What we need to be able to do in this phase is validate that the size
> handling is correct in both the encrypted and non-encrypted cases, but
> without encrypting the data. Once that piece is working, then we should
> be able to add in content encryption.

Okay, got it.

I will clean up the code and respin it later.

BRs

-- Xiubo


>
>>> ====
>>>
>>> This approach is based on the discussion from V1 and V2: pass the
>>> encrypted last block contents to the MDS along with the truncate
>>> request.
>>>
>>> The client will send the encrypted last block contents to the MDS
>>> along with the truncate request when truncating to a smaller size
>>> that is not aligned to the BLOCK SIZE.
>>>
>>> The MDS side patch is raised in PR
>>> https://github.com/ceph/ceph/pull/43588, which is also based on Jeff's
>>> previous great work in PR https://github.com/ceph/ceph/pull/41284.
>>>
>>> The MDS will use filer.write_trunc(), which can update and truncate
>>> the file in one shot, instead of filer.truncate().
>>>
>>> This assumes the kclient won't support the inline data feature, which
>>> will be removed soon; for more detail, see:
>>> https://tracker.ceph.com/issues/52916
>>>
>>> Changed in V4:
>>> - Retry the truncate request 20 times before failing it with -EAGAIN.
>>> - Remove the "fill_last_block" label and move the code to the else branch.
>>> - Remove the #3 patch from the V3 series, which has already been sent
>>>   out separately.
>>> - Improve some comments in the code.
>>>
>>>
>>> Changed in V3:
>>> - Fix possible file corruption when, just before the MDS acquires the
>>>   xlock for the FILE lock, another client has updated the file.
>>> - Flush the pagecache buffer before reading the last block when
>>>   filling the truncate request.
>>> - Some other minor fixes.
>>>
>>> Xiubo Li (4):
>>>    Revert "ceph: make client zero partial trailing block on truncate"
>>>    ceph: add __ceph_get_caps helper support
>>>    ceph: add __ceph_sync_read helper support
>>>    ceph: add truncate size handling support for fscrypt
>>>
>>>   fs/ceph/caps.c              |  21 ++--
>>>   fs/ceph/file.c              |  44 +++++---
>>>   fs/ceph/inode.c             | 203 ++++++++++++++++++++++++++++++------
>>>   fs/ceph/super.h             |   6 +-
>>>   include/linux/ceph/crypto.h |  28 +++++
>>>   5 files changed, 251 insertions(+), 51 deletions(-)
>>>   create mode 100644 include/linux/ceph/crypto.h
>>>



* Re: [PATCH v4 0/4] ceph: size handling for the fscrypt
  2021-11-01 10:27 ` [PATCH v4 0/4] ceph: size handling for the fscrypt Jeff Layton
  2021-11-01 17:07   ` Jeff Layton
@ 2021-11-02  9:44   ` Xiubo Li
  2021-11-02 10:52     ` Jeff Layton
  1 sibling, 1 reply; 13+ messages in thread
From: Xiubo Li @ 2021-11-02  9:44 UTC (permalink / raw)
  To: Jeff Layton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel


On 11/1/21 6:27 PM, Jeff Layton wrote:
> On Mon, 2021-11-01 at 10:04 +0800, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> This patch series is based on the "fscrypt_size_handling" branch in
>> https://github.com/lxbsz/linux.git, which is based on Jeff's
>> "ceph-fscrypt-content-experimental" branch in
>> https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
>> with two upstream commits added, which should already be merged.
>>
>> These two upstream commits should be removed after Jeff rebases
>> his "ceph-fscrypt-content-experimental" branch onto the upstream code.
>>
> I don't think I was clear last time. I'd like for you to post the
> _entire_ stack of patches that is based on top of
> ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
> this point, so I think it's a reasonable place for you to base your
> work. That way you're not beginning with a revert.

Hi Jeff,

BTW, have you tested with the CONFIG_FS_ENCRYPTION option disabled on
the ceph-client/wip-fscrypt-fnames branch?

I tried it today and the kernel always crashes with the following
script. I have tried many times, and the terminal running 'cat
/proc/kmsg' always gets stuck without showing any call trace.

# mkdir dir && echo "123" > dir/testfile

With CONFIG_FS_ENCRYPTION enabled, I haven't encountered any issue yet.

I am still debugging it.


BRs

--Xiubo


>
> Again, feel free to cherry-pick any of the patches in any of my other
> branches for your series, but I'd like to see a complete series of
> patches.
>
> Thanks,
> Jeff
>
>
>> ====
>>
>> This approach is based on the discussion from V1 and V2: pass the
>> encrypted last block contents to the MDS along with the truncate
>> request.
>>
>> The client will send the encrypted last block contents to the MDS
>> along with the truncate request when truncating to a smaller size
>> that is not aligned to the BLOCK SIZE.
>>
>> The MDS side patch is raised in PR
>> https://github.com/ceph/ceph/pull/43588, which is also based on Jeff's
>> previous great work in PR https://github.com/ceph/ceph/pull/41284.
>>
>> The MDS will use filer.write_trunc(), which can update and truncate
>> the file in one shot, instead of filer.truncate().
>>
>> This assumes the kclient won't support the inline data feature, which
>> will be removed soon; for more detail, see:
>> https://tracker.ceph.com/issues/52916
>>
>> Changed in V4:
>> - Retry the truncate request 20 times before failing it with -EAGAIN.
>> - Remove the "fill_last_block" label and move the code to the else branch.
>> - Remove the #3 patch from the V3 series, which has already been sent
>>   out separately.
>> - Improve some comments in the code.
>>
>>
>> Changed in V3:
>> - Fix possible file corruption when, just before the MDS acquires the
>>   xlock for the FILE lock, another client has updated the file.
>> - Flush the pagecache buffer before reading the last block when
>>   filling the truncate request.
>> - Some other minor fixes.
>>
>> Xiubo Li (4):
>>    Revert "ceph: make client zero partial trailing block on truncate"
>>    ceph: add __ceph_get_caps helper support
>>    ceph: add __ceph_sync_read helper support
>>    ceph: add truncate size handling support for fscrypt
>>
>>   fs/ceph/caps.c              |  21 ++--
>>   fs/ceph/file.c              |  44 +++++---
>>   fs/ceph/inode.c             | 203 ++++++++++++++++++++++++++++++------
>>   fs/ceph/super.h             |   6 +-
>>   include/linux/ceph/crypto.h |  28 +++++
>>   5 files changed, 251 insertions(+), 51 deletions(-)
>>   create mode 100644 include/linux/ceph/crypto.h
>>



* Re: [PATCH v4 0/4] ceph: size handling for the fscrypt
  2021-11-02  9:44   ` Xiubo Li
@ 2021-11-02 10:52     ` Jeff Layton
  2021-11-02 11:29       ` Jeff Layton
  2021-11-02 11:31       ` Xiubo Li
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff Layton @ 2021-11-02 10:52 UTC (permalink / raw)
  To: Xiubo Li; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel

On Tue, 2021-11-02 at 17:44 +0800, Xiubo Li wrote:
> On 11/1/21 6:27 PM, Jeff Layton wrote:
> > On Mon, 2021-11-01 at 10:04 +0800, xiubli@redhat.com wrote:
> > > From: Xiubo Li <xiubli@redhat.com>
> > > 
> > > This patch series is based on the "fscrypt_size_handling" branch in
> > > https://github.com/lxbsz/linux.git, which is based on Jeff's
> > > "ceph-fscrypt-content-experimental" branch in
> > > https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
> > > with two upstream commits added, which should already be merged.
> > > 
> > > These two upstream commits should be removed after Jeff rebases
> > > his "ceph-fscrypt-content-experimental" branch onto the upstream code.
> > > 
> > I don't think I was clear last time. I'd like for you to post the
> > _entire_ stack of patches that is based on top of
> > ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
> > this point, so I think it's a reasonable place for you to base your
> > work. That way you're not beginning with a revert.
> 
> Hi Jeff,
> 
> BTW, have you tested with the CONFIG_FS_ENCRYPTION option disabled on
> the ceph-client/wip-fscrypt-fnames branch?
> 
> I tried it today and the kernel always crashes with the following
> script. I have tried many times, and the terminal running 'cat
> /proc/kmsg' always gets stuck without showing any call trace.
> 
> # mkdir dir && echo "123" > dir/testfile
> 
> With CONFIG_FS_ENCRYPTION enabled, I haven't encountered any issue yet.
> 
> I am still debugging it.
> 
> 


No, I hadn't noticed that, but I can reproduce it too. AFAICT, bash is
sitting in a pselect() call:

[jlayton@client1 ~]$ sudo cat /proc/1163/stack
[<0>] poll_schedule_timeout.constprop.0+0x53/0xa0
[<0>] do_select+0xb51/0xc70
[<0>] core_sys_select+0x2ac/0x620
[<0>] do_pselect.constprop.0+0x101/0x1b0
[<0>] __x64_sys_pselect6+0x9a/0xc0
[<0>] do_syscall_64+0x3b/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae

After playing around a bit more, I saw this KASAN pop, which may be
related:

[ 1046.013880] ==================================================================
[ 1046.017053] BUG: KASAN: out-of-bounds in encode_cap_msg+0x76c/0xa80 [ceph]
[ 1046.019441] Read of size 18446744071716025685 at addr ffff8881011bf558 by task kworker/7:1/82
[ 1046.022243] 
[ 1046.022785] CPU: 7 PID: 82 Comm: kworker/7:1 Tainted: G            E     5.15.0-rc6+ #43
[ 1046.025421] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
[ 1046.028159] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 1046.030111] Call Trace:
[ 1046.030983]  dump_stack_lvl+0x57/0x72
[ 1046.032177]  ? __mutex_unlock_slowpath+0x105/0x3c0
[ 1046.033864]  print_address_description.constprop.0+0x1f/0x140
[ 1046.035807]  ? __mutex_unlock_slowpath+0x105/0x3c0
[ 1046.037221]  ? encode_cap_msg+0x76c/0xa80 [ceph]
[ 1046.038680]  kasan_report.cold+0x7f/0x11b
[ 1046.039853]  ? __mutex_unlock_slowpath+0x105/0x3c0
[ 1046.041317]  ? encode_cap_msg+0x76c/0xa80 [ceph]
[ 1046.042782]  ? __mutex_unlock_slowpath+0x105/0x3c0
[ 1046.044168]  kasan_check_range+0xf5/0x1d0
[ 1046.045325]  ? __mutex_unlock_slowpath+0x105/0x3c0
[ 1046.046679]  memcpy+0x20/0x60
[ 1046.047555]  ? __mutex_unlock_slowpath+0x105/0x3c0
[ 1046.048930]  encode_cap_msg+0x76c/0xa80 [ceph]
[ 1046.050383]  ? ceph_kvmalloc+0xdd/0x110 [libceph]
[ 1046.051888]  ? ceph_msg_new2+0xf7/0x210 [libceph]
[ 1046.053395]  __send_cap+0x40/0x180 [ceph]
[ 1046.054696]  ceph_check_caps+0x5a2/0xc50 [ceph]
[ 1046.056482]  ? deref_stack_reg+0xb0/0xb0
[ 1046.057786]  ? ceph_con_workfn+0x224/0x8b0 [libceph]
[ 1046.059471]  ? __ceph_should_report_size+0x90/0x90 [ceph]
[ 1046.061190]  ? lock_is_held_type+0xe0/0x110
[ 1046.062453]  ? find_held_lock+0x85/0xa0
[ 1046.063684]  ? __mutex_unlock_slowpath+0x105/0x3c0
[ 1046.065089]  ? lock_release+0x1c7/0x3e0
[ 1046.066225]  ? wait_for_completion+0x150/0x150
[ 1046.067570]  ? __ceph_caps_file_wanted+0x25a/0x380 [ceph]
[ 1046.069319]  handle_cap_grant+0x113c/0x13a0 [ceph]
[ 1046.070962]  ? ceph_kick_flushing_inode_caps+0x240/0x240 [ceph]
[ 1046.081699]  ? __cap_is_valid+0x82/0x100 [ceph]
[ 1046.091755]  ? rb_next+0x1e/0x80
[ 1046.096640]  ? __ceph_caps_issued+0xe0/0x130 [ceph]
[ 1046.101331]  ceph_handle_caps+0x10f9/0x2280 [ceph]
[ 1046.106003]  ? mds_dispatch+0x134/0x2470 [ceph]
[ 1046.110416]  ? ceph_remove_capsnap+0x90/0x90 [ceph]
[ 1046.114901]  ? __mutex_lock+0x180/0xc10
[ 1046.119178]  ? release_sock+0x1d/0xf0
[ 1046.123331]  ? mds_dispatch+0xaf/0x2470 [ceph]
[ 1046.127588]  ? __mutex_unlock_slowpath+0x105/0x3c0
[ 1046.131845]  mds_dispatch+0x6fb/0x2470 [ceph]
[ 1046.136002]  ? tcp_recvmsg+0xe0/0x2c0
[ 1046.140038]  ? ceph_mdsc_handle_mdsmap+0x3c0/0x3c0 [ceph]
[ 1046.144255]  ? wait_for_completion+0x150/0x150
[ 1046.148235]  ceph_con_process_message+0xd9/0x240 [libceph]
[ 1046.152387]  ? iov_iter_advance+0x8e/0x480
[ 1046.156239]  process_message+0xf/0x100 [libceph]
[ 1046.160219]  ceph_con_v2_try_read+0x1561/0x1b00 [libceph]
[ 1046.164317]  ? __handle_control+0x1730/0x1730 [libceph]
[ 1046.168345]  ? __lock_acquire+0x830/0x2c60
[ 1046.172183]  ? __mutex_lock+0x180/0xc10
[ 1046.175910]  ? ceph_con_workfn+0x41/0x8b0 [libceph]
[ 1046.179814]  ? lockdep_hardirqs_on_prepare+0x220/0x220
[ 1046.183688]  ? mutex_lock_io_nested+0xba0/0xba0
[ 1046.187559]  ? lock_release+0x3e0/0x3e0
[ 1046.191422]  ceph_con_workfn+0x224/0x8b0 [libceph]
[ 1046.195464]  process_one_work+0x4fd/0x9a0
[ 1046.199281]  ? pwq_dec_nr_in_flight+0x100/0x100
[ 1046.203075]  ? rwlock_bug.part.0+0x60/0x60
[ 1046.206787]  worker_thread+0x2d4/0x6e0
[ 1046.210488]  ? process_one_work+0x9a0/0x9a0
[ 1046.214254]  kthread+0x1e3/0x210
[ 1046.217911]  ? set_kthread_struct+0x80/0x80
[ 1046.221694]  ret_from_fork+0x22/0x30
[ 1046.225553] 
[ 1046.228927] The buggy address belongs to the page:
[ 1046.232690] page:000000001ee14099 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1011bf
[ 1046.237195] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 1046.241352] raw: 0017ffffc0000000 ffffea0004046fc8 ffffea0004046fc8 0000000000000000
[ 1046.245998] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 1046.250612] page dumped because: kasan: bad access detected
[ 1046.254948] 
[ 1046.258789] addr ffff8881011bf558 is located in stack of task kworker/7:1/82 at offset 296 in frame:
[ 1046.263501]  ceph_check_caps+0x0/0xc50 [ceph]
[ 1046.267766] 
[ 1046.271643] this frame has 3 objects:
[ 1046.275934]  [32, 36) 'implemented'
[ 1046.275941]  [48, 56) 'oldest_flush_tid'
[ 1046.280091]  [80, 352) 'arg'
[ 1046.284281] 
[ 1046.291847] Memory state around the buggy address:
[ 1046.295874]  ffff8881011bf400: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 f2 f2 f2
[ 1046.300247]  ffff8881011bf480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1046.304752] >ffff8881011bf500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1046.309172]                                                     ^
[ 1046.313414]  ffff8881011bf580: 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00
[ 1046.318113]  ffff8881011bf600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1046.322543] ==================================================================

I'll keep investigating too.
-- 
Jeff Layton <jlayton@kernel.org>



* Re: [PATCH v4 0/4] ceph: size handling for the fscrypt
  2021-11-02 10:52     ` Jeff Layton
@ 2021-11-02 11:29       ` Jeff Layton
  2021-11-02 12:13         ` Xiubo Li
  2021-11-02 11:31       ` Xiubo Li
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2021-11-02 11:29 UTC (permalink / raw)
  To: Xiubo Li; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel

On Tue, 2021-11-02 at 06:52 -0400, Jeff Layton wrote:
> On Tue, 2021-11-02 at 17:44 +0800, Xiubo Li wrote:
> > On 11/1/21 6:27 PM, Jeff Layton wrote:
> > > On Mon, 2021-11-01 at 10:04 +0800, xiubli@redhat.com wrote:
> > > > From: Xiubo Li <xiubli@redhat.com>
> > > > 
> > > > This patch series is based on the "fscrypt_size_handling" branch in
> > > > https://github.com/lxbsz/linux.git, which is based on Jeff's
> > > > "ceph-fscrypt-content-experimental" branch in
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
> > > > with two upstream commits added, which should already be merged.
> > > > 
> > > > These two upstream commits should be removed after Jeff rebases
> > > > his "ceph-fscrypt-content-experimental" branch onto the upstream code.
> > > > 
> > > I don't think I was clear last time. I'd like for you to post the
> > > _entire_ stack of patches that is based on top of
> > > ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
> > > this point, so I think it's a reasonable place for you to base your
> > > work. That way you're not beginning with a revert.
> > 
> > Hi Jeff,
> > 
> > BTW, have you tested with the CONFIG_FS_ENCRYPTION option disabled on
> > the ceph-client/wip-fscrypt-fnames branch?
> > 
> > I tried it today and the kernel always crashes with the following
> > script. I have tried many times, and the terminal running 'cat
> > /proc/kmsg' always gets stuck without showing any call trace.
> > 
> > # mkdir dir && echo "123" > dir/testfile
> > 
> > With CONFIG_FS_ENCRYPTION enabled, I haven't encountered any issue yet.
> > 
> > I am still debugging it.
> > 
> > 
> 
> 
> No, I hadn't noticed that, but I can reproduce it too. AFAICT, bash is
> sitting in a pselect() call:
> 
> [jlayton@client1 ~]$ sudo cat /proc/1163/stack
> [<0>] poll_schedule_timeout.constprop.0+0x53/0xa0
> [<0>] do_select+0xb51/0xc70
> [<0>] core_sys_select+0x2ac/0x620
> [<0>] do_pselect.constprop.0+0x101/0x1b0
> [<0>] __x64_sys_pselect6+0x9a/0xc0
> [<0>] do_syscall_64+0x3b/0x90
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> After playing around a bit more, I saw this KASAN pop, which may be
> related:
> 
> [ 1046.013880] ==================================================================
> [ 1046.017053] BUG: KASAN: out-of-bounds in encode_cap_msg+0x76c/0xa80 [ceph]
> [ 1046.019441] Read of size 18446744071716025685 at addr ffff8881011bf558 by task kworker/7:1/82
> [ 1046.022243] 
> [ 1046.022785] CPU: 7 PID: 82 Comm: kworker/7:1 Tainted: G            E     5.15.0-rc6+ #43
> [ 1046.025421] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
> [ 1046.028159] Workqueue: ceph-msgr ceph_con_workfn [libceph]
> [ 1046.030111] Call Trace:
> [ 1046.030983]  dump_stack_lvl+0x57/0x72
> [ 1046.032177]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.033864]  print_address_description.constprop.0+0x1f/0x140
> [ 1046.035807]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.037221]  ? encode_cap_msg+0x76c/0xa80 [ceph]
> [ 1046.038680]  kasan_report.cold+0x7f/0x11b
> [ 1046.039853]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.041317]  ? encode_cap_msg+0x76c/0xa80 [ceph]
> [ 1046.042782]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.044168]  kasan_check_range+0xf5/0x1d0
> [ 1046.045325]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.046679]  memcpy+0x20/0x60
> [ 1046.047555]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.048930]  encode_cap_msg+0x76c/0xa80 [ceph]
> [ 1046.050383]  ? ceph_kvmalloc+0xdd/0x110 [libceph]
> [ 1046.051888]  ? ceph_msg_new2+0xf7/0x210 [libceph]
> [ 1046.053395]  __send_cap+0x40/0x180 [ceph]
> [ 1046.054696]  ceph_check_caps+0x5a2/0xc50 [ceph]
> [ 1046.056482]  ? deref_stack_reg+0xb0/0xb0
> [ 1046.057786]  ? ceph_con_workfn+0x224/0x8b0 [libceph]
> [ 1046.059471]  ? __ceph_should_report_size+0x90/0x90 [ceph]
> [ 1046.061190]  ? lock_is_held_type+0xe0/0x110
> [ 1046.062453]  ? find_held_lock+0x85/0xa0
> [ 1046.063684]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.065089]  ? lock_release+0x1c7/0x3e0
> [ 1046.066225]  ? wait_for_completion+0x150/0x150
> [ 1046.067570]  ? __ceph_caps_file_wanted+0x25a/0x380 [ceph]
> [ 1046.069319]  handle_cap_grant+0x113c/0x13a0 [ceph]
> [ 1046.070962]  ? ceph_kick_flushing_inode_caps+0x240/0x240 [ceph]
> [ 1046.081699]  ? __cap_is_valid+0x82/0x100 [ceph]
> [ 1046.091755]  ? rb_next+0x1e/0x80
> [ 1046.096640]  ? __ceph_caps_issued+0xe0/0x130 [ceph]
> [ 1046.101331]  ceph_handle_caps+0x10f9/0x2280 [ceph]
> [ 1046.106003]  ? mds_dispatch+0x134/0x2470 [ceph]
> [ 1046.110416]  ? ceph_remove_capsnap+0x90/0x90 [ceph]
> [ 1046.114901]  ? __mutex_lock+0x180/0xc10
> [ 1046.119178]  ? release_sock+0x1d/0xf0
> [ 1046.123331]  ? mds_dispatch+0xaf/0x2470 [ceph]
> [ 1046.127588]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.131845]  mds_dispatch+0x6fb/0x2470 [ceph]
> [ 1046.136002]  ? tcp_recvmsg+0xe0/0x2c0
> [ 1046.140038]  ? ceph_mdsc_handle_mdsmap+0x3c0/0x3c0 [ceph]
> [ 1046.144255]  ? wait_for_completion+0x150/0x150
> [ 1046.148235]  ceph_con_process_message+0xd9/0x240 [libceph]
> [ 1046.152387]  ? iov_iter_advance+0x8e/0x480
> [ 1046.156239]  process_message+0xf/0x100 [libceph]
> [ 1046.160219]  ceph_con_v2_try_read+0x1561/0x1b00 [libceph]
> [ 1046.164317]  ? __handle_control+0x1730/0x1730 [libceph]
> [ 1046.168345]  ? __lock_acquire+0x830/0x2c60
> [ 1046.172183]  ? __mutex_lock+0x180/0xc10
> [ 1046.175910]  ? ceph_con_workfn+0x41/0x8b0 [libceph]
> [ 1046.179814]  ? lockdep_hardirqs_on_prepare+0x220/0x220
> [ 1046.183688]  ? mutex_lock_io_nested+0xba0/0xba0
> [ 1046.187559]  ? lock_release+0x3e0/0x3e0
> [ 1046.191422]  ceph_con_workfn+0x224/0x8b0 [libceph]
> [ 1046.195464]  process_one_work+0x4fd/0x9a0
> [ 1046.199281]  ? pwq_dec_nr_in_flight+0x100/0x100
> [ 1046.203075]  ? rwlock_bug.part.0+0x60/0x60
> [ 1046.206787]  worker_thread+0x2d4/0x6e0
> [ 1046.210488]  ? process_one_work+0x9a0/0x9a0
> [ 1046.214254]  kthread+0x1e3/0x210
> [ 1046.217911]  ? set_kthread_struct+0x80/0x80
> [ 1046.221694]  ret_from_fork+0x22/0x30
> [ 1046.225553] 
> [ 1046.228927] The buggy address belongs to the page:
> [ 1046.232690] page:000000001ee14099 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1011bf
> [ 1046.237195] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
> [ 1046.241352] raw: 0017ffffc0000000 ffffea0004046fc8 ffffea0004046fc8 0000000000000000
> [ 1046.245998] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [ 1046.250612] page dumped because: kasan: bad access detected
> [ 1046.254948] 
> [ 1046.258789] addr ffff8881011bf558 is located in stack of task kworker/7:1/82 at offset 296 in frame:
> [ 1046.263501]  ceph_check_caps+0x0/0xc50 [ceph]
> [ 1046.267766] 
> [ 1046.271643] this frame has 3 objects:
> [ 1046.275934]  [32, 36) 'implemented'
> [ 1046.275941]  [48, 56) 'oldest_flush_tid'
> [ 1046.280091]  [80, 352) 'arg'
> [ 1046.284281] 
> [ 1046.291847] Memory state around the buggy address:
> [ 1046.295874]  ffff8881011bf400: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 f2 f2 f2
> [ 1046.300247]  ffff8881011bf480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1046.304752] >ffff8881011bf500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1046.309172]                                                     ^
> [ 1046.313414]  ffff8881011bf580: 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00
> [ 1046.318113]  ffff8881011bf600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1046.322543] ==================================================================
> 
> I'll keep investigating too.

Found it -- this patch seems to fix it. I plan to roll it into the
earlier patch that caused the bug, and will push an updated branch to
wip-fscrypt-fnames.

Good catch!

--------------------------------------8<-------------------------------

[PATCH] SQUASH: fix cap encoding when fscrypt is disabled

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/caps.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 6e9f4de883d1..80f521dd7254 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1312,11 +1312,16 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg)
 	ceph_encode_64(&p, 0);
 	ceph_encode_64(&p, 0);
 
+#if IS_ENABLED(CONFIG_FS_ENCRYPTION)
 	/* fscrypt_auth and fscrypt_file (version 12) */
 	ceph_encode_32(&p, arg->fscrypt_auth_len);
 	ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len);
 	ceph_encode_32(&p, arg->fscrypt_file_len);
 	ceph_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len);
+#else /* CONFIG_FS_ENCRYPTION */
+	ceph_encode_32(&p, 0);
+	ceph_encode_32(&p, 0);
+#endif /* CONFIG_FS_ENCRYPTION */
 }
 
 /*
-- 
2.31.1
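
Why this squash fixes the crash: with CONFIG_FS_ENCRYPTION disabled,
apparently nothing fills in the fscrypt fields of the on-stack
cap_msg_args, yet encode_cap_msg() still fed their lengths to
ceph_encode_copy(). That matches the KASAN splat above: the bad read
starts at offset 296 of the frame, i.e. 216 bytes into the 'arg'
object at [80, 352), and the absurd "size 18446744071716025685" is
just uninitialized stack interpreted as a length. Since version 12 of
the cap message always carries the two length-prefixed fields on the
wire, the encoder must emit explicit zero lengths rather than skip the
fields. A minimal userspace sketch of that invariant (the helpers
below are simplified stand-ins for the kernel's ceph_encode_* API,
with endianness conversion omitted):

#include <stdint.h>
#include <string.h>

/* Simplified stand-in for ceph_encode_32(); the kernel converts to
 * little-endian with cpu_to_le32() first. */
static void encode_32(uint8_t **p, uint32_t v)
{
	memcpy(*p, &v, sizeof(v));
	*p += sizeof(v);
}

/* Simplified stand-in for ceph_encode_copy(). */
static void encode_copy(uint8_t **p, const void *buf, size_t len)
{
	memcpy(*p, buf, len);
	*p += len;
}

/* Tail of a v12 cap message: two length-prefixed fscrypt blobs.
 * The receiver expects the length fields in every configuration. */
static void encode_fscrypt_tail(uint8_t **p,
				uint32_t auth_len, const void *auth,
				uint32_t file_len, const void *file)
{
#ifdef CONFIG_FS_ENCRYPTION
	encode_32(p, auth_len);
	encode_copy(p, auth, auth_len);
	encode_32(p, file_len);
	encode_copy(p, file, file_len);
#else
	/* The blobs were never initialized; encode two zero lengths so
	 * the receiver still finds valid (empty) fields. */
	encode_32(p, 0);
	encode_32(p, 0);
#endif
}

(The sketch uses a plain #ifdef so it compiles standalone; the patch
itself uses #if IS_ENABLED(CONFIG_FS_ENCRYPTION), the kernel's
preferred spelling, which also handles tristate options.)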




^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/4] ceph: size handling for the fscrypt
  2021-11-02 10:52     ` Jeff Layton
  2021-11-02 11:29       ` Jeff Layton
@ 2021-11-02 11:31       ` Xiubo Li
  1 sibling, 0 replies; 13+ messages in thread
From: Xiubo Li @ 2021-11-02 11:31 UTC (permalink / raw)
  To: Jeff Layton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel


On 11/2/21 6:52 PM, Jeff Layton wrote:
> On Tue, 2021-11-02 at 17:44 +0800, Xiubo Li wrote:
>> On 11/1/21 6:27 PM, Jeff Layton wrote:
>>> On Mon, 2021-11-01 at 10:04 +0800, xiubli@redhat.com wrote:
>>>> From: Xiubo Li <xiubli@redhat.com>
>>>>
>>>> This patch series is based on the "fscrypt_size_handling" branch in
>>>> https://github.com/lxbsz/linux.git, which is based on Jeff's
>>>> "ceph-fscrypt-content-experimental" branch in
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
>>>> and added two upstream commits, which should already be merged.
>>>>
>>>> These two upstream commits should be dropped once Jeff rebases
>>>> his "ceph-fscrypt-content-experimental" branch onto the upstream code.
>>>>
>>> I don't think I was clear last time. I'd like for you to post the
>>> _entire_ stack of patches that is based on top of
>>> ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
>>> this point, so I think it's a reasonable place for you to base your
>>> work. That way you're not beginning with a revert.
>> Hi Jeff,
>>
>> BTW, have you tested with the CONFIG_FS_ENCRYPTION option disabled for
>> the ceph-client/wip-fscrypt-fnames branch?
>>
>> I tried it today and the kernel always crashes with the following
>> script. I have tried many times; the terminal running 'cat /proc/kmsg'
>> always gets stuck without printing any call trace.
>>
>> # mkdir dir && echo "123" > dir/testfile
>>
>> With CONFIG_FS_ENCRYPTION enabled, I haven't encountered any issue yet.
>>
>> I am still debugging it.
>>
>>
>
> No, I hadn't noticed that, but I can reproduce it too. AFAICT, bash is
> sitting in a pselect() call:
>
> [jlayton@client1 ~]$ sudo cat /proc/1163/stack
> [<0>] poll_schedule_timeout.constprop.0+0x53/0xa0
> [<0>] do_select+0xb51/0xc70
> [<0>] core_sys_select+0x2ac/0x620
> [<0>] do_pselect.constprop.0+0x101/0x1b0
> [<0>] __x64_sys_pselect6+0x9a/0xc0
> [<0>] do_syscall_64+0x3b/0x90
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> After playing around a bit more, I saw this KASAN pop, which may be
> related:
>
> [ ... same KASAN splat as quoted in full above, snipped ... ]

Cool.

This looks like the issue I mentioned months ago -- not enough space
being allocated for the msg structure -- but I can't remember in which
thread.



> I'll keep investigating too.
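
The allocation concern is worth making concrete: the messenger
invariant is that the space reserved for a message's front buffer has
to match what the encoder later writes, byte for byte, in every
configuration. A hedged sketch of that rule (cap_msg_fscrypt_size()
is a hypothetical helper for illustration, not a function in the ceph
tree):

#include <stdint.h>
#include <stddef.h>

/* Bytes the v12 fscrypt tail occupies on the wire. Whatever sizes the
 * message front must use the same arithmetic as the encoder, or the
 * encoder will run past (or fall short of) the allocation. */
static size_t cap_msg_fscrypt_size(uint32_t auth_len, uint32_t file_len)
{
#ifdef CONFIG_FS_ENCRYPTION
	return 4 + auth_len + 4 + file_len;	/* two length-prefixed blobs */
#else
	return 4 + 4;				/* two zero lengths */
#endif
}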


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/4] ceph: size handling for the fscrypt
  2021-11-02 11:29       ` Jeff Layton
@ 2021-11-02 12:13         ` Xiubo Li
  0 siblings, 0 replies; 13+ messages in thread
From: Xiubo Li @ 2021-11-02 12:13 UTC (permalink / raw)
  To: Jeff Layton; +Cc: idryomov, vshankar, pdonnell, khiremat, ceph-devel


On 11/2/21 7:29 PM, Jeff Layton wrote:
> On Tue, 2021-11-02 at 06:52 -0400, Jeff Layton wrote:
>> On Tue, 2021-11-02 at 17:44 +0800, Xiubo Li wrote:
>>> On 11/1/21 6:27 PM, Jeff Layton wrote:
>>>> On Mon, 2021-11-01 at 10:04 +0800, xiubli@redhat.com wrote:
>>>>> From: Xiubo Li <xiubli@redhat.com>
>>>>>
>>>>> This patch series is based on the "fscrypt_size_handling" branch in
>>>>> https://github.com/lxbsz/linux.git, which is based on Jeff's
>>>>> "ceph-fscrypt-content-experimental" branch in
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
>>>>> and added two upstream commits, which should already be merged.
>>>>>
>>>>> These two upstream commits should be dropped once Jeff rebases
>>>>> his "ceph-fscrypt-content-experimental" branch onto the upstream code.
>>>>>
>>>> I don't think I was clear last time. I'd like for you to post the
>>>> _entire_ stack of patches that is based on top of
>>>> ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
>>>> this point, so I think it's a reasonable place for you to base your
>>>> work. That way you're not beginning with a revert.
>>> Hi Jeff,
>>>
>>> BTW, have you tested with the CONFIG_FS_ENCRYPTION option disabled for
>>> the ceph-client/wip-fscrypt-fnames branch?
>>>
>>> I tried it today and the kernel always crashes with the following
>>> script. I have tried many times; the terminal running 'cat /proc/kmsg'
>>> always gets stuck without printing any call trace.
>>>
>>> # mkdir dir && echo "123" > dir/testfile
>>>
>>> With CONFIG_FS_ENCRYPTION enabled, I haven't encountered any issue yet.
>>>
>>> I am still debugging it.
>>>
>>>
>>
>> No, I hadn't noticed that, but I can reproduce it too. AFAICT, bash is
>> sitting in a pselect() call:
>>
>> [jlayton@client1 ~]$ sudo cat /proc/1163/stack
>> [<0>] poll_schedule_timeout.constprop.0+0x53/0xa0
>> [<0>] do_select+0xb51/0xc70
>> [<0>] core_sys_select+0x2ac/0x620
>> [<0>] do_pselect.constprop.0+0x101/0x1b0
>> [<0>] __x64_sys_pselect6+0x9a/0xc0
>> [<0>] do_syscall_64+0x3b/0x90
>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> After playing around a bit more, I saw this KASAN pop, which may be
>> related:
>>
>> [ ... same KASAN splat as quoted in full above, snipped ... ]
>>
>> I'll keep investigating too.
> Found it -- this patch seems to fix it. I plan to roll it into the
> earlier patch that caused the bug, and will push an updated branch to
> wip-fscrypt-fnames.
>
> Good catch!
>
> --------------------------------------8<-------------------------------
>
> [PATCH] SQUASH: fix cap encoding when fscrypt is disabled
>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>   fs/ceph/caps.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index 6e9f4de883d1..80f521dd7254 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -1312,11 +1312,16 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg)
>   	ceph_encode_64(&p, 0);
>   	ceph_encode_64(&p, 0);
>   
> +#if IS_ENABLED(CONFIG_FS_ENCRYPTION)
>   	/* fscrypt_auth and fscrypt_file (version 12) */
>   	ceph_encode_32(&p, arg->fscrypt_auth_len);
>   	ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len);
>   	ceph_encode_32(&p, arg->fscrypt_file_len);
>   	ceph_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len);
> +#else /* CONFIG_FS_ENCRYPTION */
> +	ceph_encode_32(&p, 0);
> +	ceph_encode_32(&p, 0);
> +#endif /* CONFIG_FS_ENCRYPTION */
>   }
>   
>   /*

Yeah, this should be the root cause.



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-11-02 12:14 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-01  2:04 [PATCH v4 0/4] ceph: size handling for the fscrypt xiubli
2021-11-01  2:04 ` [PATCH v4 1/4] Revert "ceph: make client zero partial trailing block on truncate" xiubli
2021-11-01  2:04 ` [PATCH v4 2/4] ceph: add __ceph_get_caps helper support xiubli
2021-11-01  2:04 ` [PATCH v4 3/4] ceph: add __ceph_sync_read " xiubli
2021-11-01  2:04 ` [PATCH v4 4/4] ceph: add truncate size handling support for fscrypt xiubli
2021-11-01 10:27 ` [PATCH v4 0/4] ceph: size handling for the fscrypt Jeff Layton
2021-11-01 17:07   ` Jeff Layton
2021-11-02  1:02     ` Xiubo Li
2021-11-02  9:44   ` Xiubo Li
2021-11-02 10:52     ` Jeff Layton
2021-11-02 11:29       ` Jeff Layton
2021-11-02 12:13         ` Xiubo Li
2021-11-02 11:31       ` Xiubo Li
