* [PATCH 0/4] ceph: implement later versions of MClientRequest headers
@ 2020-12-09 18:53 Jeff Layton
  2020-12-09 18:53 ` [PATCH 1/4] ceph: don't reach into request header for readdir info Jeff Layton
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Jeff Layton @ 2020-12-09 18:53 UTC (permalink / raw)
  To: ceph-devel; +Cc: pdonnell, xiubli, idryomov

A few years ago, userland ceph added support for changing the birthtime
via setattr, as well as support for sending supplementary groups in an
MDS request.

This patchset updates the kclient to use the newer protocol. The
necessary structures are extended, and the code is changed to use the
newer formats when it detects that the MDS supports them.

Supplementary groups will now be transmitted in the request, but for now
the setting of btime is not implemented.

This is a prerequisite step to adding support for the new "alternate
name" field that Xiubo has been working on, which we'll need for
proper fscrypt support.
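
The gating model throughout the series is: inspect the feature bits the
peer advertised at connection time, and only emit the extended header
when the MDS understands it. A minimal userspace sketch of that check
(the struct layouts and the feature-bit value below are illustrative
placeholders, not the real definitions from ceph_features.h):

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel definitions; the real feature
 * bit and header layouts live in ceph_features.h and ceph_fs.h. */
#define CEPH_FEATURE_FS_BTIME   (1ULL << 9)   /* illustrative bit position */

struct old_head { uint64_t oldest_client_tid; /* ... */ };
struct new_head { uint16_t version; uint64_t oldest_client_tid; /* ... */ };

/* Pick the wire format based on what the peer advertises: clients must
 * keep speaking the legacy encoding to MDS daemons that predate btime. */
static size_t request_head_len(uint64_t peer_features)
{
	int legacy = !(peer_features & CEPH_FEATURE_FS_BTIME);

	return legacy ? sizeof(struct old_head) : sizeof(struct new_head);
}
```

The real per-session decision is made in create_request_message() using
session->s_con.peer_features.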

Jeff Layton (4):
  ceph: don't reach into request header for readdir info
  ceph: take a cred reference instead of tracking individual uid/gid
  ceph: clean up argument lists to __prepare_send_request and
    __send_request
  ceph: implement updated ceph_mds_request_head structure

 fs/ceph/inode.c              |  5 +-
 fs/ceph/mds_client.c         | 98 ++++++++++++++++++++++++++----------
 fs/ceph/mds_client.h         |  3 +-
 include/linux/ceph/ceph_fs.h | 32 +++++++++++-
 4 files changed, 106 insertions(+), 32 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/4] ceph: don't reach into request header for readdir info
  2020-12-09 18:53 [PATCH 0/4] ceph: implement later versions of MClientRequest headers Jeff Layton
@ 2020-12-09 18:53 ` Jeff Layton
  2020-12-09 18:53 ` [PATCH 2/4] ceph: take a cred reference instead of tracking individual uid/gid Jeff Layton
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2020-12-09 18:53 UTC (permalink / raw)
  To: ceph-devel; +Cc: pdonnell, xiubli, idryomov

We already have a pointer to the argument struct in req->r_args. Use that
instead of groveling around in the ceph_mds_request_head.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/inode.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 9b85d86d8efb..93633cb4a905 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -1594,8 +1594,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req,
 	struct dentry *dn;
 	struct inode *in;
 	int err = 0, skipped = 0, ret, i;
-	struct ceph_mds_request_head *rhead = req->r_request->front.iov_base;
-	u32 frag = le32_to_cpu(rhead->args.readdir.frag);
+	u32 frag = le32_to_cpu(req->r_args.readdir.frag);
 	u32 last_hash = 0;
 	u32 fpos_offset;
 	struct ceph_readdir_cache_control cache_ctl = {};
@@ -1612,7 +1611,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req,
 		} else if (rinfo->offset_hash) {
 			/* mds understands offset_hash */
 			WARN_ON_ONCE(req->r_readdir_offset != 2);
-			last_hash = le32_to_cpu(rhead->args.readdir.offset_hash);
+			last_hash = le32_to_cpu(req->r_args.readdir.offset_hash);
 		}
 	}
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/4] ceph: take a cred reference instead of tracking individual uid/gid
  2020-12-09 18:53 [PATCH 0/4] ceph: implement later versions of MClientRequest headers Jeff Layton
  2020-12-09 18:53 ` [PATCH 1/4] ceph: don't reach into request header for readdir info Jeff Layton
@ 2020-12-09 18:53 ` Jeff Layton
  2020-12-09 18:53 ` [PATCH 3/4] ceph: clean up argument lists to __prepare_send_request and __send_request Jeff Layton
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2020-12-09 18:53 UTC (permalink / raw)
  To: ceph-devel; +Cc: pdonnell, xiubli, idryomov

Replace req->r_uid/r_gid with an r_cred pointer and take a reference to
that at the point where we previously would sample the two.  Use that to
populate the uid and gid in the header and release the reference when
the request is freed.

This should enable us to later add support for sending supplementary
group lists in MDS requests.
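
The lifecycle follows the usual kernel refcounting pattern: take the
reference once when the request is registered, and drop it when the
request is freed. A toy userspace analogue of that pairing (the toy_*
names are invented for illustration; the kernel side uses
get_current_cred() and put_cred()):

```c
#include <assert.h>

/* Userspace model of cred refcounting: get_current_cred() bumps a
 * refcount on the task's credentials, put_cred() drops it. This toy
 * struct mirrors only the fields the MDS client cares about. */
struct toy_cred {
	int refcount;
	unsigned int fsuid, fsgid;
};

static struct toy_cred *toy_get_cred(struct toy_cred *c)
{
	c->refcount++;			/* get_current_cred() analogue */
	return c;
}

static void toy_put_cred(struct toy_cred *c)
{
	assert(c->refcount > 0);	/* catch an unbalanced put */
	c->refcount--;			/* put_cred() analogue */
}
```

Holding the reference for the request's lifetime is what makes the
supplementary group list safe to read at message-encoding time.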

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 8 ++++----
 fs/ceph/mds_client.h | 3 +--
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 7d354d4e7933..1f1c5e490596 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -833,6 +833,7 @@ void ceph_mdsc_release_request(struct kref *kref)
 	}
 	kfree(req->r_path1);
 	kfree(req->r_path2);
+	put_cred(req->r_cred);
 	if (req->r_pagelist)
 		ceph_pagelist_release(req->r_pagelist);
 	put_request_session(req);
@@ -888,8 +889,7 @@ static void __register_request(struct ceph_mds_client *mdsc,
 	ceph_mdsc_get_request(req);
 	insert_request(&mdsc->request_tree, req);
 
-	req->r_uid = current_fsuid();
-	req->r_gid = current_fsgid();
+	req->r_cred = get_current_cred();
 
 	if (mdsc->oldest_tid == 0 && req->r_op != CEPH_MDS_OP_SETFILELOCK)
 		mdsc->oldest_tid = req->r_tid;
@@ -2542,8 +2542,8 @@ static struct ceph_msg *create_request_message(struct ceph_mds_client *mdsc,
 
 	head->mdsmap_epoch = cpu_to_le32(mdsc->mdsmap->m_epoch);
 	head->op = cpu_to_le32(req->r_op);
-	head->caller_uid = cpu_to_le32(from_kuid(&init_user_ns, req->r_uid));
-	head->caller_gid = cpu_to_le32(from_kgid(&init_user_ns, req->r_gid));
+	head->caller_uid = cpu_to_le32(from_kuid(&init_user_ns, req->r_cred->fsuid));
+	head->caller_gid = cpu_to_le32(from_kgid(&init_user_ns, req->r_cred->fsgid));
 	head->ino = cpu_to_le64(req->r_deleg_ino);
 	head->args = req->r_args;
 
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index f5adbebcb38e..eaa7c5422116 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -275,8 +275,7 @@ struct ceph_mds_request {
 
 	union ceph_mds_request_args r_args;
 	int r_fmode;        /* file mode, if expecting cap */
-	kuid_t r_uid;
-	kgid_t r_gid;
+	const struct cred *r_cred;
 	int r_request_release_offset;
 	struct timespec64 r_stamp;
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/4] ceph: clean up argument lists to __prepare_send_request and __send_request
  2020-12-09 18:53 [PATCH 0/4] ceph: implement later versions of MClientRequest headers Jeff Layton
  2020-12-09 18:53 ` [PATCH 1/4] ceph: don't reach into request header for readdir info Jeff Layton
  2020-12-09 18:53 ` [PATCH 2/4] ceph: take a cred reference instead of tracking individual uid/gid Jeff Layton
@ 2020-12-09 18:53 ` Jeff Layton
  2020-12-09 18:53 ` [PATCH 4/4] ceph: implement updated ceph_mds_request_head structure Jeff Layton
  2020-12-10  2:19 ` [PATCH 0/4] ceph: implement later versions of MClientRequest headers Xiubo Li
  4 siblings, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2020-12-09 18:53 UTC (permalink / raw)
  To: ceph-devel; +Cc: pdonnell, xiubli, idryomov

We can always get the mdsc from the session, so there's no need to pass
it in as a separate argument. Pass the session to __prepare_send_request
as well, to prepare for later patches that will need to access it.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 1f1c5e490596..f76ae9e7d4c1 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2634,10 +2634,12 @@ static void complete_request(struct ceph_mds_client *mdsc,
 /*
  * called under mdsc->mutex
  */
-static int __prepare_send_request(struct ceph_mds_client *mdsc,
+static int __prepare_send_request(struct ceph_mds_session *session,
 				  struct ceph_mds_request *req,
-				  int mds, bool drop_cap_releases)
+				  bool drop_cap_releases)
 {
+	int mds = session->s_mds;
+	struct ceph_mds_client *mdsc = session->s_mdsc;
 	struct ceph_mds_request_head *rhead;
 	struct ceph_msg *msg;
 	int flags = 0;
@@ -2721,15 +2723,13 @@ static int __prepare_send_request(struct ceph_mds_client *mdsc,
 /*
  * called under mdsc->mutex
  */
-static int __send_request(struct ceph_mds_client *mdsc,
-			  struct ceph_mds_session *session,
+static int __send_request(struct ceph_mds_session *session,
 			  struct ceph_mds_request *req,
 			  bool drop_cap_releases)
 {
 	int err;
 
-	err = __prepare_send_request(mdsc, req, session->s_mds,
-				     drop_cap_releases);
+	err = __prepare_send_request(session, req, drop_cap_releases);
 	if (!err) {
 		ceph_msg_get(req->r_request);
 		ceph_con_send(&session->s_con, req->r_request);
@@ -2856,7 +2856,7 @@ static void __do_request(struct ceph_mds_client *mdsc,
 	if (req->r_request_started == 0)   /* note request start time */
 		req->r_request_started = jiffies;
 
-	err = __send_request(mdsc, session, req, false);
+	err = __send_request(session, req, false);
 
 out_session:
 	ceph_put_mds_session(session);
@@ -3535,7 +3535,7 @@ static void replay_unsafe_requests(struct ceph_mds_client *mdsc,
 
 	mutex_lock(&mdsc->mutex);
 	list_for_each_entry_safe(req, nreq, &session->s_unsafe, r_unsafe_item)
-		__send_request(mdsc, session, req, true);
+		__send_request(session, req, true);
 
 	/*
 	 * also re-send old requests when MDS enters reconnect stage. So that MDS
@@ -3556,7 +3556,7 @@ static void replay_unsafe_requests(struct ceph_mds_client *mdsc,
 
 		ceph_mdsc_release_dir_caps_no_check(req);
 
-		__send_request(mdsc, session, req, true);
+		__send_request(session, req, true);
 	}
 	mutex_unlock(&mdsc->mutex);
 }
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/4] ceph: implement updated ceph_mds_request_head structure
  2020-12-09 18:53 [PATCH 0/4] ceph: implement later versions of MClientRequest headers Jeff Layton
                   ` (2 preceding siblings ...)
  2020-12-09 18:53 ` [PATCH 3/4] ceph: clean up argument lists to __prepare_send_request and __send_request Jeff Layton
@ 2020-12-09 18:53 ` Jeff Layton
  2020-12-16 15:02   ` Jeff Layton
  2020-12-10  2:19 ` [PATCH 0/4] ceph: implement later versions of MClientRequest headers Xiubo Li
  4 siblings, 1 reply; 7+ messages in thread
From: Jeff Layton @ 2020-12-09 18:53 UTC (permalink / raw)
  To: ceph-devel; +Cc: pdonnell, xiubli, idryomov

When we added the btime feature in mainline ceph, we had to extend
struct ceph_mds_request_args so that it could be set. Implement the same
in the kernel client.

Rename ceph_mds_request_head to ceph_mds_request_head_old, and add a
union ceph_mds_request_args_ext to allow for the extended size of the
new header format.

Add the appropriate code to handle both formats in
create_request_message() and key the behavior on whether the peer
supports CEPH_FEATURE_FS_BTIME.

The gid_list field in the payload is now populated from the saved
credential. For now, we don't add any support for setting the btime via
setattr, but this does enable us to add that in the future.
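
On the wire, the new gid_list section is just a 32-bit count followed by
one 64-bit gid per supplementary group. A standalone sketch of that
little-endian encoding (the put_le* helpers below are stand-ins for the
kernel's ceph_encode_32()/ceph_encode_64()):

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian encoding helpers in the spirit of ceph_encode_32/64(). */
static void put_le32(uint8_t **p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		*(*p)++ = (uint8_t)(v >> (8 * i));
}

static void put_le64(uint8_t **p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		*(*p)++ = (uint8_t)(v >> (8 * i));
}

/* Encode the gid_list payload: a 32-bit count followed by one 64-bit
 * gid per supplementary group, mirroring the !legacy loop in
 * create_request_message(). Returns the number of bytes written. */
static size_t encode_gid_list(uint8_t *buf, const uint64_t *gids, uint32_t n)
{
	uint8_t *p = buf;

	put_le32(&p, n);
	for (uint32_t i = 0; i < n; i++)
		put_le64(&p, gids[i]);
	return (size_t)(p - buf);
}
```

The sizeof(u32) + ngroups * sizeof(u64) term in the length calculation
above corresponds exactly to this layout.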

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c         | 72 +++++++++++++++++++++++++++++-------
 include/linux/ceph/ceph_fs.h | 32 +++++++++++++++-
 2 files changed, 90 insertions(+), 14 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index f76ae9e7d4c1..e9db2d1e0020 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2478,21 +2478,24 @@ static int set_request_path_attr(struct inode *rinode, struct dentry *rdentry,
 /*
  * called under mdsc->mutex
  */
-static struct ceph_msg *create_request_message(struct ceph_mds_client *mdsc,
+static struct ceph_msg *create_request_message(struct ceph_mds_session *session,
 					       struct ceph_mds_request *req,
-					       int mds, bool drop_cap_releases)
+					       bool drop_cap_releases)
 {
+	int mds = session->s_mds;
+	struct ceph_mds_client *mdsc = session->s_mdsc;
 	struct ceph_msg *msg;
-	struct ceph_mds_request_head *head;
+	struct ceph_mds_request_head_old *head;
 	const char *path1 = NULL;
 	const char *path2 = NULL;
 	u64 ino1 = 0, ino2 = 0;
 	int pathlen1 = 0, pathlen2 = 0;
 	bool freepath1 = false, freepath2 = false;
-	int len;
+	int len, i;
 	u16 releases;
 	void *p, *end;
 	int ret;
+	bool legacy = !(session->s_con.peer_features & CEPH_FEATURE_FS_BTIME);
 
 	ret = set_request_path_attr(req->r_inode, req->r_dentry,
 			      req->r_parent, req->r_path1, req->r_ino1.ino,
@@ -2514,14 +2517,23 @@ static struct ceph_msg *create_request_message(struct ceph_mds_client *mdsc,
 		goto out_free1;
 	}
 
-	len = sizeof(*head) +
-		pathlen1 + pathlen2 + 2*(1 + sizeof(u32) + sizeof(u64)) +
+	if (legacy) {
+		/* Old style */
+		len = sizeof(*head);
+	} else {
+		/* New style: add gid_list and any later fields */
+		len = sizeof(struct ceph_mds_request_head) +
+		      sizeof(u32) + (sizeof(u64) * req->r_cred->group_info->ngroups);
+	}
+
+	len += pathlen1 + pathlen2 + 2*(1 + sizeof(u32) + sizeof(u64)) +
 		sizeof(struct ceph_timespec);
 
 	/* calculate (max) length for cap releases */
 	len += sizeof(struct ceph_mds_request_release) *
 		(!!req->r_inode_drop + !!req->r_dentry_drop +
 		 !!req->r_old_inode_drop + !!req->r_old_dentry_drop);
+
 	if (req->r_dentry_drop)
 		len += pathlen1;
 	if (req->r_old_dentry_drop)
@@ -2533,11 +2545,25 @@ static struct ceph_msg *create_request_message(struct ceph_mds_client *mdsc,
 		goto out_free2;
 	}
 
-	msg->hdr.version = cpu_to_le16(3);
 	msg->hdr.tid = cpu_to_le64(req->r_tid);
 
-	head = msg->front.iov_base;
-	p = msg->front.iov_base + sizeof(*head);
+	/*
+	 * The old ceph_mds_request_head didn't contain a version field, and
+	 * one was added when we moved the message version from 3->4.
+	 */
+	if (legacy) {
+		msg->hdr.version = cpu_to_le16(3);
+		head = msg->front.iov_base;
+		p = msg->front.iov_base + sizeof(*head);
+	} else {
+		struct ceph_mds_request_head *new_head = msg->front.iov_base;
+
+		msg->hdr.version = cpu_to_le16(4);
+		new_head->version = cpu_to_le16(CEPH_MDS_REQUEST_HEAD_VERSION);
+		head = (struct ceph_mds_request_head_old *)&new_head->oldest_client_tid;
+		p = msg->front.iov_base + sizeof(*new_head);
+	}
+
 	end = msg->front.iov_base + msg->front.iov_len;
 
 	head->mdsmap_epoch = cpu_to_le32(mdsc->mdsmap->m_epoch);
@@ -2588,6 +2614,14 @@ static struct ceph_msg *create_request_message(struct ceph_mds_client *mdsc,
 		ceph_encode_copy(&p, &ts, sizeof(ts));
 	}
 
+	/* gid list */
+	if (!legacy) {
+		ceph_encode_32(&p, req->r_cred->group_info->ngroups);
+		for (i = 0; i < req->r_cred->group_info->ngroups; i++)
+			ceph_encode_64(&p, from_kgid(&init_user_ns,
+				       req->r_cred->group_info->gid[i]));
+	}
+
 	if (WARN_ON_ONCE(p > end)) {
 		ceph_msg_put(msg);
 		msg = ERR_PTR(-ERANGE);
@@ -2631,6 +2665,17 @@ static void complete_request(struct ceph_mds_client *mdsc,
 	complete_all(&req->r_completion);
 }
 
+static struct ceph_mds_request_head_old *find_old_request_head(void *p, u64 features)
+{
+	bool legacy = !(features & CEPH_FEATURE_FS_BTIME);
+	struct ceph_mds_request_head *new_head;
+
+	if (legacy)
+		return (struct ceph_mds_request_head_old *)p;
+	new_head = (struct ceph_mds_request_head *)p;
+	return (struct ceph_mds_request_head_old *)&new_head->oldest_client_tid;
+}
+
 /*
  * called under mdsc->mutex
  */
@@ -2640,7 +2685,7 @@ static int __prepare_send_request(struct ceph_mds_session *session,
 {
 	int mds = session->s_mds;
 	struct ceph_mds_client *mdsc = session->s_mdsc;
-	struct ceph_mds_request_head *rhead;
+	struct ceph_mds_request_head_old *rhead;
 	struct ceph_msg *msg;
 	int flags = 0;
 
@@ -2659,6 +2704,7 @@ static int __prepare_send_request(struct ceph_mds_session *session,
 
 	if (test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags)) {
 		void *p;
+
 		/*
 		 * Replay.  Do not regenerate message (and rebuild
 		 * paths, etc.); just use the original message.
@@ -2666,7 +2712,7 @@ static int __prepare_send_request(struct ceph_mds_session *session,
 		 * d_move mangles the src name.
 		 */
 		msg = req->r_request;
-		rhead = msg->front.iov_base;
+		rhead = find_old_request_head(msg->front.iov_base, session->s_con.peer_features);
 
 		flags = le32_to_cpu(rhead->flags);
 		flags |= CEPH_MDS_FLAG_REPLAY;
@@ -2697,14 +2743,14 @@ static int __prepare_send_request(struct ceph_mds_session *session,
 		ceph_msg_put(req->r_request);
 		req->r_request = NULL;
 	}
-	msg = create_request_message(mdsc, req, mds, drop_cap_releases);
+	msg = create_request_message(session, req, drop_cap_releases);
 	if (IS_ERR(msg)) {
 		req->r_err = PTR_ERR(msg);
 		return PTR_ERR(msg);
 	}
 	req->r_request = msg;
 
-	rhead = msg->front.iov_base;
+	rhead = find_old_request_head(msg->front.iov_base, session->s_con.peer_features);
 	rhead->oldest_client_tid = cpu_to_le64(__get_oldest_tid(mdsc));
 	if (test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags))
 		flags |= CEPH_MDS_FLAG_REPLAY;
diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
index c0f1b921ec69..d44d98033d58 100644
--- a/include/linux/ceph/ceph_fs.h
+++ b/include/linux/ceph/ceph_fs.h
@@ -446,11 +446,25 @@ union ceph_mds_request_args {
 	} __attribute__ ((packed)) lookupino;
 } __attribute__ ((packed));
 
+union ceph_mds_request_args_ext {
+	union ceph_mds_request_args old;
+	struct {
+		__le32 mode;
+		__le32 uid;
+		__le32 gid;
+		struct ceph_timespec mtime;
+		struct ceph_timespec atime;
+		__le64 size, old_size;       /* old_size needed by truncate */
+		__le32 mask;                 /* CEPH_SETATTR_* */
+		struct ceph_timespec btime;
+	} __attribute__ ((packed)) setattr_ext;
+};
+
 #define CEPH_MDS_FLAG_REPLAY		1 /* this is a replayed op */
 #define CEPH_MDS_FLAG_WANT_DENTRY	2 /* want dentry in reply */
 #define CEPH_MDS_FLAG_ASYNC		4 /* request is asynchronous */
 
-struct ceph_mds_request_head {
+struct ceph_mds_request_head_old {
 	__le64 oldest_client_tid;
 	__le32 mdsmap_epoch;           /* on client */
 	__le32 flags;                  /* CEPH_MDS_FLAG_* */
@@ -463,6 +477,22 @@ struct ceph_mds_request_head {
 	union ceph_mds_request_args args;
 } __attribute__ ((packed));
 
+#define CEPH_MDS_REQUEST_HEAD_VERSION  1
+
+struct ceph_mds_request_head {
+	__le16 version;                /* struct version */
+	__le64 oldest_client_tid;
+	__le32 mdsmap_epoch;           /* on client */
+	__le32 flags;                  /* CEPH_MDS_FLAG_* */
+	__u8 num_retry, num_fwd;       /* count retry, fwd attempts */
+	__le16 num_releases;           /* # include cap/lease release records */
+	__le32 op;                     /* mds op code */
+	__le32 caller_uid, caller_gid;
+	__le64 ino;                    /* use this ino for openc, mkdir, mknod,
+					  etc. (if replaying) */
+	union ceph_mds_request_args_ext args;
+} __attribute__ ((packed));
+
 /* cap/lease release record */
 struct ceph_mds_request_release {
 	__le64 ino, cap_id;            /* ino and unique cap id */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/4] ceph: implement later versions of MClientRequest headers
  2020-12-09 18:53 [PATCH 0/4] ceph: implement later versions of MClientRequest headers Jeff Layton
                   ` (3 preceding siblings ...)
  2020-12-09 18:53 ` [PATCH 4/4] ceph: implement updated ceph_mds_request_head structure Jeff Layton
@ 2020-12-10  2:19 ` Xiubo Li
  4 siblings, 0 replies; 7+ messages in thread
From: Xiubo Li @ 2020-12-10  2:19 UTC (permalink / raw)
  To: Jeff Layton, ceph-devel; +Cc: pdonnell, idryomov

On 2020/12/10 2:53, Jeff Layton wrote:
> [...]

This series looks good to me.

Reviewed-by: Xiubo Li <xiubli@redhat.com>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 4/4] ceph: implement updated ceph_mds_request_head structure
  2020-12-09 18:53 ` [PATCH 4/4] ceph: implement updated ceph_mds_request_head structure Jeff Layton
@ 2020-12-16 15:02   ` Jeff Layton
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2020-12-16 15:02 UTC (permalink / raw)
  To: ceph-devel; +Cc: pdonnell, xiubli, idryomov

On Wed, 2020-12-09 at 13:53 -0500, Jeff Layton wrote:
> [...]

Patrick has hit some errors that look like this:

    failed to decode message of type 24 v4: End of buffer

I've not been able to reproduce it yet, but for now, I'm going to back
this patch out of the testing branch to validate that it is the problem.

See: https://tracker.ceph.com/issues/48618
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

