linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 00/14] ceph: support idmapped mounts
@ 2023-06-08 15:42 Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 01/14] fs: export mnt_idmap_get/mnt_idmap_put Alexander Mikhalitsyn
                   ` (14 more replies)
  0 siblings, 15 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Ilya Dryomov, Jeff Layton,
	ceph-devel, linux-kernel

Dear friends,

This patchset was originally developed by Christian Brauner but I'll continue
to push it forward. Christian allowed me to do that :)

This feature is already actively used/tested with LXD/LXC project.

Git tree (based on https://github.com/ceph/ceph-client.git master):
v5: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v5
current: https://github.com/mihalicyn/linux/tree/fs.idmapped.ceph

In the version 3 I've changed only two commits:
- fs: export mnt_idmap_get/mnt_idmap_put
- ceph: allow idmapped setattr inode op
and added a new one:
- ceph: pass idmap to __ceph_setattr

In the version 4 I've reworked the ("ceph: stash idmapping in mdsc request")
commit. Now we take idmap refcounter just in place where req->r_mnt_idmap
is filled. It's more safer approach and prevents possible refcounter underflow
on error paths where __register_request wasn't called but ceph_mdsc_release_request is
called.

Changelog for version 5:
- a few commits were squashed into one (as suggested by Xiubo Li)
- started passing an idmapping everywhere (if possible), so a caller
UID/GID-s will be mapped almost everywhere (as suggested by Xiubo Li)

I can confirm that this version passes xfstests.

Links to previous versions:
v1: https://lore.kernel.org/all/20220104140414.155198-1-brauner@kernel.org/
v2: https://lore.kernel.org/lkml/20230524153316.476973-1-aleksandr.mikhalitsyn@canonical.com/
tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v2
v3: https://lore.kernel.org/lkml/20230607152038.469739-1-aleksandr.mikhalitsyn@canonical.com/#t
v4: https://lore.kernel.org/lkml/20230607180958.645115-1-aleksandr.mikhalitsyn@canonical.com/#t
tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v4

Kind regards,
Alex

Original description from Christian:
========================================================================
This patch series enables cephfs to support idmapped mounts, i.e. the
ability to alter ownership information on a per-mount basis.

Container managers such as LXD support sharaing data via cephfs between
the host and unprivileged containers and between unprivileged containers.
They may all use different idmappings. Idmapped mounts can be used to
create mounts with the idmapping used for the container (or a different
one specific to the use-case).

There are in fact more use-cases such as remapping ownership for
mountpoints on the host itself to grant or restrict access to different
users or to make it possible to enforce that programs running as root
will write with a non-zero {g,u}id to disk.

The patch series is simple overall and few changes are needed to cephfs.
There is one cephfs specific issue that I would like to discuss and
solve which I explain in detail in:

[PATCH 02/12] ceph: handle idmapped mounts in create_request_message()

It has to do with how to handle mds serves which have id-based access
restrictions configured. I would ask you to please take a look at the
explanation in the aforementioned patch.

The patch series passes the vfs and idmapped mount testsuite as part of
xfstests. To run it you will need a config like:

[ceph]
export FSTYP=ceph
export TEST_DIR=/mnt/test
export TEST_DEV=10.103.182.10:6789:/
export TEST_FS_MOUNT_OPTS="-o name=admin,secret=$password

and then simply call

sudo ./check -g idmapped

========================================================================

Alexander Mikhalitsyn (5):
  fs: export mnt_idmap_get/mnt_idmap_put
  ceph: pass idmap to __ceph_setattr
  ceph: pass idmap to ceph_do_getattr
  ceph: pass idmap to __ceph_setxattr
  ceph: pass idmap to ceph_open/ioctl_set_layout

Christian Brauner (9):
  ceph: stash idmapping in mdsc request
  ceph: handle idmapped mounts in create_request_message()
  ceph: pass an idmapping to mknod/symlink/mkdir/rename
  ceph: allow idmapped getattr inode op
  ceph: allow idmapped permission inode op
  ceph: allow idmapped setattr inode op
  ceph/acl: allow idmapped set_acl inode op
  ceph/file: allow idmapped atomic_open inode op
  ceph: allow idmapped mounts

 fs/ceph/acl.c                 |  8 ++++----
 fs/ceph/addr.c                |  3 ++-
 fs/ceph/caps.c                |  3 ++-
 fs/ceph/dir.c                 |  4 ++++
 fs/ceph/export.c              |  2 +-
 fs/ceph/file.c                | 21 ++++++++++++++-----
 fs/ceph/inode.c               | 38 +++++++++++++++++++++--------------
 fs/ceph/ioctl.c               |  9 +++++++--
 fs/ceph/mds_client.c          | 27 +++++++++++++++++++++----
 fs/ceph/mds_client.h          |  1 +
 fs/ceph/quota.c               |  2 +-
 fs/ceph/super.c               |  6 +++---
 fs/ceph/super.h               | 14 ++++++++-----
 fs/ceph/xattr.c               | 18 +++++++++--------
 fs/mnt_idmapping.c            |  2 ++
 include/linux/mnt_idmapping.h |  3 +++
 16 files changed, 111 insertions(+), 50 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v5 01/14] fs: export mnt_idmap_get/mnt_idmap_put
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 02/14] ceph: stash idmapping in mdsc request Alexander Mikhalitsyn
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Alexander Mikhalitsyn, Alexander Viro, Seth Forshee,
	linux-kernel

These helpers are required to support idmapped mounts in the Cephfs.

Cc: Christian Brauner <brauner@kernel.org>
Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
---
v3:
	- EXPORT_SYMBOL -> EXPORT_SYMBOL_GPL as Christoph Hellwig suggested
---
 fs/mnt_idmapping.c            | 2 ++
 include/linux/mnt_idmapping.h | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/fs/mnt_idmapping.c b/fs/mnt_idmapping.c
index 4905665c47d0..57d1dedf3f8f 100644
--- a/fs/mnt_idmapping.c
+++ b/fs/mnt_idmapping.c
@@ -256,6 +256,7 @@ struct mnt_idmap *mnt_idmap_get(struct mnt_idmap *idmap)
 
 	return idmap;
 }
+EXPORT_SYMBOL_GPL(mnt_idmap_get);
 
 /**
  * mnt_idmap_put - put a reference to an idmapping
@@ -271,3 +272,4 @@ void mnt_idmap_put(struct mnt_idmap *idmap)
 		kfree(idmap);
 	}
 }
+EXPORT_SYMBOL_GPL(mnt_idmap_put);
diff --git a/include/linux/mnt_idmapping.h b/include/linux/mnt_idmapping.h
index 057c89867aa2..b8da2db4ecd2 100644
--- a/include/linux/mnt_idmapping.h
+++ b/include/linux/mnt_idmapping.h
@@ -115,6 +115,9 @@ static inline bool vfsgid_eq_kgid(vfsgid_t vfsgid, kgid_t kgid)
 
 int vfsgid_in_group_p(vfsgid_t vfsgid);
 
+struct mnt_idmap *mnt_idmap_get(struct mnt_idmap *idmap);
+void mnt_idmap_put(struct mnt_idmap *idmap);
+
 vfsuid_t make_vfsuid(struct mnt_idmap *idmap,
 		     struct user_namespace *fs_userns, kuid_t kuid);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 02/14] ceph: stash idmapping in mdsc request
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 01/14] fs: export mnt_idmap_get/mnt_idmap_put Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 03/14] ceph: handle idmapped mounts in create_request_message() Alexander Mikhalitsyn
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

When sending a mds request cephfs will send relevant data for the
requested operation. For creation requests the caller's fs{g,u}id is
used to set the ownership of the newly created filesystem object. For
setattr requests the caller can pass in arbitrary {g,u}id values to
which the relevant filesystem object is supposed to be changed.

If the caller is performing the relevant operation via an idmapped mount
cephfs simply needs to take the idmapping into account when it sends the
relevant mds request.

In order to support idmapped mounts for cephfs we stash the idmapping
whenever they are relevant for the operation for the duration of the
request. Since mds requests can be queued and performed asynchronously
we make sure to keep the idmapping around and release it once the
request has finished.

In follow-up patches we will use this to send correct ownership
information over the wire. This patch just adds the basic infrastructure
to keep the idmapping around. The actual conversion patches are all
fairly minimal.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
v4:
	- don't call mnt_idmap_get(..) in __register_request
---
 fs/ceph/mds_client.c | 5 +++++
 fs/ceph/mds_client.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 4c0f22acf53d..05a99a8eb292 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -12,6 +12,7 @@
 #include <linux/bits.h>
 #include <linux/ktime.h>
 #include <linux/bitmap.h>
+#include <linux/mnt_idmapping.h>
 
 #include "super.h"
 #include "mds_client.h"
@@ -962,6 +963,8 @@ void ceph_mdsc_release_request(struct kref *kref)
 	kfree(req->r_path1);
 	kfree(req->r_path2);
 	put_cred(req->r_cred);
+	if (req->r_mnt_idmap)
+		mnt_idmap_put(req->r_mnt_idmap);
 	if (req->r_pagelist)
 		ceph_pagelist_release(req->r_pagelist);
 	put_request_session(req);
@@ -1018,6 +1021,8 @@ static void __register_request(struct ceph_mds_client *mdsc,
 	insert_request(&mdsc->request_tree, req);
 
 	req->r_cred = get_current_cred();
+	if (!req->r_mnt_idmap)
+		req->r_mnt_idmap = &nop_mnt_idmap;
 
 	if (mdsc->oldest_tid == 0 && req->r_op != CEPH_MDS_OP_SETFILELOCK)
 		mdsc->oldest_tid = req->r_tid;
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index 724307ff89cd..32001ade1ea7 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -280,6 +280,7 @@ struct ceph_mds_request {
 	int r_fmode;        /* file mode, if expecting cap */
 	int r_request_release_offset;
 	const struct cred *r_cred;
+	struct mnt_idmap *r_mnt_idmap;
 	struct timespec64 r_stamp;
 
 	/* for choosing which mds to send this request to */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 03/14] ceph: handle idmapped mounts in create_request_message()
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 01/14] fs: export mnt_idmap_get/mnt_idmap_put Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 02/14] ceph: stash idmapping in mdsc request Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 04/14] ceph: pass an idmapping to mknod/symlink/mkdir/rename Alexander Mikhalitsyn
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

Inode operations that create a new filesystem object such as ->mknod,
->create, ->mkdir() and others don't take a {g,u}id argument explicitly.
Instead the caller's fs{g,u}id is used for the {g,u}id of the new
filesystem object.

Cephfs mds creation request argument structures mirror this filesystem
behavior. They don't encode a {g,u}id explicitly. Instead the caller's
fs{g,u}id that is always sent as part of any mds request is used by the
servers to set the {g,u}id of the new filesystem object.

In order to ensure that the correct {g,u}id is used map the caller's
fs{g,u}id for creation requests. This doesn't require complex changes.
It suffices to pass in the relevant idmapping recorded in the request
message. If this request message was triggered from an inode operation
that creates filesystem objects it will have passed down the relevant
idmaping. If this is a request message that was triggered from an inode
operation that doens't need to take idmappings into account the initial
idmapping is passed down which is an identity mapping and thus is
guaranteed to leave the caller's fs{g,u}id unchanged.,u}id is sent.

The last few weeks before Christmas 2021 I have spent time not just
reading and poking the cephfs kernel code but also took a look at the
ceph mds server userspace to ensure I didn't miss some subtlety.

This made me aware of one complication to solve. All requests send the
caller's fs{g,u}id over the wire. The caller's fs{g,u}id matters for the
server in exactly two cases:

1. to set the ownership for creation requests
2. to determine whether this client is allowed access on this server

Case 1. we already covered and explained. Case 2. is only relevant for
servers where an explicit uid access restriction has been set. That is
to say the mds server restricts access to requests coming from a
specific uid. Servers without uid restrictions will grant access to
requests from any uid by setting MDS_AUTH_UID_ANY.

Case 2. introduces the complication because the caller's fs{g,u}id is
not just used to record ownership but also serves as the {g,u}id used
when checking access to the server.

Consider a user mounting a cephfs client and creating an idmapped mount
from it that maps files owned by uid 1000 to be owned uid 0:

mount -t cephfs -o [...] /unmapped
mount-idmapped --map-mount 1000:0:1 /idmapped

That is to say if the mounted cephfs filesystem contains a file "file1"
which is owned by uid 1000:

- looking at it via /unmapped/file1 will report it as owned by uid 1000
  (One can think of this as the on-disk value.)
- looking at it via /idmapped/file1 will report it as owned by uid 0

Now, consider creating new files via the idmapped mount at /idmapped.
When a caller with fs{g,u}id 1000 creates a file "file2" by going
through the idmapped mount mounted at /idmapped it will create a file
that is owned by uid 1000 on-disk, i.e.:

- looking at it via /unmapped/file2 will report it as owned by uid 1000
- looking at it via /idmapped/file2 will report it as owned by uid 0

Now consider an mds server that has a uid access restriction set and
only grants access to requests from uid 0.

If the client sends a creation request for a file e.g. /idmapped/file2
it will send the caller's fs{g,u}id idmapped according to the idmapped
mount. So if the caller has fs{g,u}id 1000 it will be mapped to {g,u}id
0 in the idmapped mount and will be sent over the wire allowing the
caller access to the mds server.

However, if the caller is not issuing a creation request the caller's
fs{g,u}id will be send without the mount's idmapping applied. So if the
caller that just successfully created a new file on the restricted mds
server sends a request as fs{g,u}id 1000 access will be refused. This
however is inconsistent.

From my perspective the root of the problem lies in the fact that
creation requests implicitly infer the ownership from the {g,u}id that
gets sent along with every mds request.

I have thought of multiple ways of addressing this problem but the one I
prefer is to give all mds requests that create a filesystem object a
proper, separate {g,u}id field entry in the argument struct. This is,
for example how ->setattr mds requests work.

This way the caller's fs{g,u}id can be used consistenly for server
access checks and is separated from the ownership for new filesystem
objects.

Servers could then be updated to refuse creation requests whenever the
{g,u}id used for access checking doesn't match the {g,u}id used for
creating the filesystem object just as is done for setattr requests on a
uid restricted server. But I am, of course, open to other suggestions.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/mds_client.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 05a99a8eb292..8826be3c209f 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2581,6 +2581,8 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session,
 	void *p, *end;
 	int ret;
 	bool legacy = !(session->s_con.peer_features & CEPH_FEATURE_FS_BTIME);
+	kuid_t caller_fsuid;
+	kgid_t caller_fsgid;
 
 	ret = set_request_path_attr(req->r_inode, req->r_dentry,
 			      req->r_parent, req->r_path1, req->r_ino1.ino,
@@ -2649,10 +2651,22 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session,
 
 	head->mdsmap_epoch = cpu_to_le32(mdsc->mdsmap->m_epoch);
 	head->op = cpu_to_le32(req->r_op);
-	head->caller_uid = cpu_to_le32(from_kuid(&init_user_ns,
-						 req->r_cred->fsuid));
-	head->caller_gid = cpu_to_le32(from_kgid(&init_user_ns,
-						 req->r_cred->fsgid));
+	/*
+	 * Inode operations that create filesystem objects based on the
+	 * caller's fs{g,u}id like ->mknod(), ->create(), ->mkdir() etc. don't
+	 * have separate {g,u}id fields in their respective structs in the
+	 * ceph_mds_request_args union. Instead the caller_{g,u}id field is
+	 * used to set ownership of the newly created inode by the mds server.
+	 * For these inode operations we need to send the mapped fs{g,u}id over
+	 * the wire. For other cases we simple set req->r_mnt_idmap to the
+	 * initial idmapping meaning the unmapped fs{g,u}id is sent.
+	 */
+	caller_fsuid = from_vfsuid(req->r_mnt_idmap, &init_user_ns,
+					VFSUIDT_INIT(req->r_cred->fsuid));
+	caller_fsgid = from_vfsgid(req->r_mnt_idmap, &init_user_ns,
+					VFSGIDT_INIT(req->r_cred->fsgid));
+	head->caller_uid = cpu_to_le32(from_kuid(&init_user_ns, caller_fsuid));
+	head->caller_gid = cpu_to_le32(from_kgid(&init_user_ns, caller_fsgid));
 	head->ino = cpu_to_le64(req->r_deleg_ino);
 	head->args = req->r_args;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 04/14] ceph: pass an idmapping to mknod/symlink/mkdir/rename
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (2 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 03/14] ceph: handle idmapped mounts in create_request_message() Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 05/14] ceph: allow idmapped getattr inode op Alexander Mikhalitsyn
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable mknod/symlink/mkdir/rename iops to handle idmapped mounts.
This is just a matter of passing down the mount's idmapping.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
v4:
	- call mnt_idmap_get
---
 fs/ceph/dir.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index cb67ac821f0e..355c5574ad27 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -884,6 +884,7 @@ static int ceph_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	req->r_parent = dir;
 	ihold(dir);
 	set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 	req->r_args.mknod.mode = cpu_to_le32(mode);
 	req->r_args.mknod.rdev = cpu_to_le32(rdev);
 	req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL;
@@ -955,6 +956,7 @@ static int ceph_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	req->r_num_caps = 2;
 	req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL;
 	req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 	if (as_ctx.pagelist) {
 		req->r_pagelist = as_ctx.pagelist;
 		as_ctx.pagelist = NULL;
@@ -1022,6 +1024,7 @@ static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	ihold(dir);
 	set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
 	req->r_args.mkdir.mode = cpu_to_le32(mode);
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 	req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL;
 	req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
 	if (as_ctx.pagelist) {
@@ -1324,6 +1327,7 @@ static int ceph_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 	req->r_old_dentry_unless = CEPH_CAP_FILE_EXCL;
 	req->r_dentry_drop = CEPH_CAP_FILE_SHARED;
 	req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 	/* release LINK_RDCACHE on source inode (mds will lock it) */
 	req->r_old_inode_drop = CEPH_CAP_LINK_SHARED | CEPH_CAP_LINK_EXCL;
 	if (d_really_is_positive(new_dentry)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 05/14] ceph: allow idmapped getattr inode op
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (3 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 04/14] ceph: pass an idmapping to mknod/symlink/mkdir/rename Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 06/14] ceph: allow idmapped permission " Alexander Mikhalitsyn
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable ceph_getattr() to handle idmapped mounts. This is just a matter
of passing down the mount's idmapping.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 8e5f41d45283..2e988612ed6c 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2465,7 +2465,7 @@ int ceph_getattr(struct mnt_idmap *idmap, const struct path *path,
 			return err;
 	}
 
-	generic_fillattr(&nop_mnt_idmap, inode, stat);
+	generic_fillattr(idmap, inode, stat);
 	stat->ino = ceph_present_inode(inode);
 
 	/*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 06/14] ceph: allow idmapped permission inode op
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (4 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 05/14] ceph: allow idmapped getattr inode op Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 07/14] ceph: pass idmap to __ceph_setattr Alexander Mikhalitsyn
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable ceph_permission() to handle idmapped mounts. This is just a
matter of passing down the mount's idmapping.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 2e988612ed6c..37e1cbfc7c89 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2408,7 +2408,7 @@ int ceph_permission(struct mnt_idmap *idmap, struct inode *inode,
 	err = ceph_do_getattr(inode, CEPH_CAP_AUTH_SHARED, false);
 
 	if (!err)
-		err = generic_permission(&nop_mnt_idmap, inode, mask);
+		err = generic_permission(idmap, inode, mask);
 	return err;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 07/14] ceph: pass idmap to __ceph_setattr
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (5 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 06/14] ceph: allow idmapped permission " Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 08/14] ceph: allow idmapped setattr inode op Alexander Mikhalitsyn
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Alexander Mikhalitsyn, linux-kernel

Just pass down the mount's idmapping to __ceph_setattr,
because we will need it later.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: brauner@kernel.org
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/acl.c   | 4 ++--
 fs/ceph/inode.c | 6 ++++--
 fs/ceph/super.h | 3 ++-
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c
index 6945a938d396..51ffef848429 100644
--- a/fs/ceph/acl.c
+++ b/fs/ceph/acl.c
@@ -140,7 +140,7 @@ int ceph_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 		newattrs.ia_ctime = current_time(inode);
 		newattrs.ia_mode = new_mode;
 		newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-		ret = __ceph_setattr(inode, &newattrs);
+		ret = __ceph_setattr(idmap, inode, &newattrs);
 		if (ret)
 			goto out_free;
 	}
@@ -151,7 +151,7 @@ int ceph_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 			newattrs.ia_ctime = old_ctime;
 			newattrs.ia_mode = old_mode;
 			newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-			__ceph_setattr(inode, &newattrs);
+			__ceph_setattr(idmap, inode, &newattrs);
 		}
 		goto out_free;
 	}
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 37e1cbfc7c89..bface707c9bb 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2009,7 +2009,8 @@ static const struct inode_operations ceph_symlink_iops = {
 	.listxattr = ceph_listxattr,
 };
 
-int __ceph_setattr(struct inode *inode, struct iattr *attr)
+int __ceph_setattr(struct mnt_idmap *idmap, struct inode *inode,
+		   struct iattr *attr)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	unsigned int ia_valid = attr->ia_valid;
@@ -2206,6 +2207,7 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr)
 	if (mask) {
 		req->r_inode = inode;
 		ihold(inode);
+		req->r_mnt_idmap = mnt_idmap_get(idmap);
 		req->r_inode_drop = release;
 		req->r_args.setattr.mask = cpu_to_le32(mask);
 		req->r_num_caps = 1;
@@ -2252,7 +2254,7 @@ int ceph_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 	    ceph_quota_is_max_bytes_exceeded(inode, attr->ia_size))
 		return -EDQUOT;
 
-	err = __ceph_setattr(inode, attr);
+	err = __ceph_setattr(idmap, inode, attr);
 
 	if (err >= 0 && (attr->ia_valid & ATTR_MODE))
 		err = posix_acl_chmod(&nop_mnt_idmap, dentry, attr->ia_mode);
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index d24bf0db5234..d9cc27307cb7 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1052,7 +1052,8 @@ static inline int ceph_do_getattr(struct inode *inode, int mask, bool force)
 }
 extern int ceph_permission(struct mnt_idmap *idmap,
 			   struct inode *inode, int mask);
-extern int __ceph_setattr(struct inode *inode, struct iattr *attr);
+extern int __ceph_setattr(struct mnt_idmap *idmap, struct inode *inode,
+			  struct iattr *attr);
 extern int ceph_setattr(struct mnt_idmap *idmap,
 			struct dentry *dentry, struct iattr *attr);
 extern int ceph_getattr(struct mnt_idmap *idmap,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 08/14] ceph: allow idmapped setattr inode op
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (6 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 07/14] ceph: pass idmap to __ceph_setattr Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 09/14] ceph/acl: allow idmapped set_acl " Alexander Mikhalitsyn
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable __ceph_setattr() to handle idmapped mounts. This is just a matter
of passing down the mount's idmapping.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
[ adapted to b27c82e12965 ("attr: port attribute changes to new types") ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
v4:
	- introduced fsuid/fsgid local variables
v3:
	- reworked as Christian suggested here:
	https://lore.kern
---
 fs/ceph/inode.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index bface707c9bb..58ec603a55af 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2052,31 +2052,35 @@ int __ceph_setattr(struct mnt_idmap *idmap, struct inode *inode,
 	dout("setattr %p issued %s\n", inode, ceph_cap_string(issued));
 
 	if (ia_valid & ATTR_UID) {
+		kuid_t fsuid = from_vfsuid(idmap, i_user_ns(inode), attr->ia_vfsuid);
+
 		dout("setattr %p uid %d -> %d\n", inode,
 		     from_kuid(&init_user_ns, inode->i_uid),
 		     from_kuid(&init_user_ns, attr->ia_uid));
 		if (issued & CEPH_CAP_AUTH_EXCL) {
-			inode->i_uid = attr->ia_uid;
+			inode->i_uid = fsuid;
 			dirtied |= CEPH_CAP_AUTH_EXCL;
 		} else if ((issued & CEPH_CAP_AUTH_SHARED) == 0 ||
-			   !uid_eq(attr->ia_uid, inode->i_uid)) {
+			   !uid_eq(fsuid, inode->i_uid)) {
 			req->r_args.setattr.uid = cpu_to_le32(
-				from_kuid(&init_user_ns, attr->ia_uid));
+				from_kuid(&init_user_ns, fsuid));
 			mask |= CEPH_SETATTR_UID;
 			release |= CEPH_CAP_AUTH_SHARED;
 		}
 	}
 	if (ia_valid & ATTR_GID) {
+		kgid_t fsgid = from_vfsgid(idmap, i_user_ns(inode), attr->ia_vfsgid);
+
 		dout("setattr %p gid %d -> %d\n", inode,
 		     from_kgid(&init_user_ns, inode->i_gid),
 		     from_kgid(&init_user_ns, attr->ia_gid));
 		if (issued & CEPH_CAP_AUTH_EXCL) {
-			inode->i_gid = attr->ia_gid;
+			inode->i_gid = fsgid;
 			dirtied |= CEPH_CAP_AUTH_EXCL;
 		} else if ((issued & CEPH_CAP_AUTH_SHARED) == 0 ||
-			   !gid_eq(attr->ia_gid, inode->i_gid)) {
+			   !gid_eq(fsgid, inode->i_gid)) {
 			req->r_args.setattr.gid = cpu_to_le32(
-				from_kgid(&init_user_ns, attr->ia_gid));
+				from_kgid(&init_user_ns, fsgid));
 			mask |= CEPH_SETATTR_GID;
 			release |= CEPH_CAP_AUTH_SHARED;
 		}
@@ -2242,7 +2246,7 @@ int ceph_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 	if (ceph_inode_is_shutdown(inode))
 		return -ESTALE;
 
-	err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
+	err = setattr_prepare(idmap, dentry, attr);
 	if (err != 0)
 		return err;
 
@@ -2257,7 +2261,7 @@ int ceph_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 	err = __ceph_setattr(idmap, inode, attr);
 
 	if (err >= 0 && (attr->ia_valid & ATTR_MODE))
-		err = posix_acl_chmod(&nop_mnt_idmap, dentry, attr->ia_mode);
+		err = posix_acl_chmod(idmap, dentry, attr->ia_mode);
 
 	return err;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 09/14] ceph/acl: allow idmapped set_acl inode op
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (7 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 08/14] ceph: allow idmapped setattr inode op Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 10/14] ceph/file: allow idmapped atomic_open " Alexander Mikhalitsyn
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable ceph_set_acl() to handle idmapped mounts. This is just a matter
of passing down the mount's idmapping.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/acl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c
index 51ffef848429..d0ca5a0060d8 100644
--- a/fs/ceph/acl.c
+++ b/fs/ceph/acl.c
@@ -105,7 +105,7 @@ int ceph_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 	case ACL_TYPE_ACCESS:
 		name = XATTR_NAME_POSIX_ACL_ACCESS;
 		if (acl) {
-			ret = posix_acl_update_mode(&nop_mnt_idmap, inode,
+			ret = posix_acl_update_mode(idmap, inode,
 						    &new_mode, &acl);
 			if (ret)
 				goto out;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 10/14] ceph/file: allow idmapped atomic_open inode op
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (8 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 09/14] ceph/acl: allow idmapped set_acl " Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 11/14] ceph: pass idmap to ceph_do_getattr Alexander Mikhalitsyn
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable ceph_atomic_open() to handle idmapped mounts. This is just a
matter of passing down the mount's idmapping.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
[ adapted to 5fadbd9929 ("ceph: rely on vfs for setgid stripping") ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
v4:
	- call mnt_idmap_get
---
 fs/ceph/file.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index f4d8bf7dec88..d46b6b8b5fcb 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -654,7 +654,9 @@ static int ceph_finish_async_create(struct inode *dir, struct dentry *dentry,
 	in.truncate_seq = cpu_to_le32(1);
 	in.truncate_size = cpu_to_le64(-1ULL);
 	in.xattr_version = cpu_to_le64(1);
-	in.uid = cpu_to_le32(from_kuid(&init_user_ns, current_fsuid()));
+	in.uid = cpu_to_le32(from_kuid(&init_user_ns,
+				       mapped_fsuid(req->r_mnt_idmap,
+						    &init_user_ns)));
 	if (dir->i_mode & S_ISGID) {
 		in.gid = cpu_to_le32(from_kgid(&init_user_ns, dir->i_gid));
 
@@ -662,7 +664,9 @@ static int ceph_finish_async_create(struct inode *dir, struct dentry *dentry,
 		if (S_ISDIR(mode))
 			mode |= S_ISGID;
 	} else {
-		in.gid = cpu_to_le32(from_kgid(&init_user_ns, current_fsgid()));
+		in.gid = cpu_to_le32(from_kgid(&init_user_ns,
+				     mapped_fsgid(req->r_mnt_idmap,
+						  &init_user_ns)));
 	}
 	in.mode = cpu_to_le32((u32)mode);
 
@@ -731,6 +735,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 		     struct file *file, unsigned flags, umode_t mode)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb);
+	struct mnt_idmap *idmap = file_mnt_idmap(file);
 	struct ceph_mds_client *mdsc = fsc->mdsc;
 	struct ceph_mds_request *req;
 	struct dentry *dn;
@@ -786,6 +791,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 		mask |= CEPH_CAP_XATTR_SHARED;
 	req->r_args.open.mask = cpu_to_le32(mask);
 	req->r_parent = dir;
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 	ihold(dir);
 
 	if (flags & O_CREAT) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 11/14] ceph: pass idmap to ceph_do_getattr
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (9 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 10/14] ceph/file: allow idmapped atomic_open " Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 12/14] ceph: pass idmap to __ceph_setxattr Alexander Mikhalitsyn
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Alexander Mikhalitsyn, linux-kernel

Just pass down the mount's idmapping to *ceph_do_getattr,
everywhere when possible, because we will need it later.

Here we have two cases:
- filemap_fault/read/write/lseek (when idmap is accessible)
- export_ops/list_xattr/get_xattr (when idmap is not accessible)
in this case we pass &nop_mnt_idmap.

So we can meet permission issue when MDS UID/GID-based path
restriction is used.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: brauner@kernel.org
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/addr.c   | 3 ++-
 fs/ceph/caps.c   | 3 ++-
 fs/ceph/export.c | 2 +-
 fs/ceph/file.c   | 9 ++++++---
 fs/ceph/inode.c  | 8 +++++---
 fs/ceph/ioctl.c  | 6 ++++--
 fs/ceph/quota.c  | 2 +-
 fs/ceph/super.c  | 4 ++--
 fs/ceph/super.h  | 8 +++++---
 fs/ceph/xattr.c  | 6 +++---
 10 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 6bb251a4d613..757e8e170c48 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1471,6 +1471,7 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf)
 		/* does not support inline data > PAGE_SIZE */
 		ret = VM_FAULT_SIGBUS;
 	} else {
+		struct mnt_idmap *idmap = file_mnt_idmap(vma->vm_file);
 		struct address_space *mapping = inode->i_mapping;
 		struct page *page;
 
@@ -1481,7 +1482,7 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf)
 			ret = VM_FAULT_OOM;
 			goto out_inline;
 		}
-		err = __ceph_do_getattr(inode, page,
+		err = __ceph_do_getattr(idmap, inode, page,
 					 CEPH_STAT_CAP_INLINE_DATA, true);
 		if (err < 0 || off >= i_size_read(inode)) {
 			unlock_page(page);
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 2321e5ddb664..d083ec5fda36 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2906,6 +2906,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 	struct inode *inode = file_inode(filp);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct mnt_idmap *idmap = file_mnt_idmap(filp);
 	int ret, _got, flags;
 
 	ret = ceph_pool_perm_check(inode, need);
@@ -3015,7 +3016,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 			 * getattr request will bring inline data into
 			 * page cache
 			 */
-			ret = __ceph_do_getattr(inode, NULL,
+			ret = __ceph_do_getattr(idmap, inode, NULL,
 						CEPH_STAT_CAP_INLINE_DATA,
 						true);
 			if (ret < 0)
diff --git a/fs/ceph/export.c b/fs/ceph/export.c
index f780e4e0d062..9f3c6e911ae6 100644
--- a/fs/ceph/export.c
+++ b/fs/ceph/export.c
@@ -187,7 +187,7 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
 	if (IS_ERR(inode))
 		return ERR_CAST(inode);
 	/* We need LINK caps to reliably check i_nlink */
-	err = ceph_do_getattr(inode, CEPH_CAP_LINK_SHARED, false);
+	err = ceph_do_getattr(&nop_mnt_idmap, inode, CEPH_CAP_LINK_SHARED, false);
 	if (err) {
 		iput(inode);
 		return ERR_PTR(err);
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index d46b6b8b5fcb..0019d5b4ae3c 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1613,6 +1613,7 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	size_t len = iov_iter_count(to);
 	struct inode *inode = file_inode(filp);
 	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct mnt_idmap *idmap = file_mnt_idmap(filp);
 	bool direct_lock = iocb->ki_flags & IOCB_DIRECT;
 	ssize_t ret;
 	int want = 0, got = 0;
@@ -1693,7 +1694,7 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to)
 				return -ENOMEM;
 		}
 
-		statret = __ceph_do_getattr(inode, page,
+		statret = __ceph_do_getattr(idmap, inode, page,
 					    CEPH_STAT_CAP_INLINE_DATA, !!page);
 		if (statret < 0) {
 			if (page)
@@ -1768,6 +1769,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct mnt_idmap *idmap = file_mnt_idmap(file);
 	struct ceph_osd_client *osdc = &fsc->client->osdc;
 	struct ceph_cap_flush *prealloc_cf;
 	ssize_t count, written = 0;
@@ -1801,7 +1803,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	current->backing_dev_info = inode_to_bdi(inode);
 
 	if (iocb->ki_flags & IOCB_APPEND) {
-		err = ceph_do_getattr(inode, CEPH_STAT_CAP_SIZE, false);
+		err = ceph_do_getattr(idmap, inode, CEPH_STAT_CAP_SIZE, false);
 		if (err < 0)
 			goto out;
 	}
@@ -1957,9 +1959,10 @@ static loff_t ceph_llseek(struct file *file, loff_t offset, int whence)
 {
 	if (whence == SEEK_END || whence == SEEK_DATA || whence == SEEK_HOLE) {
 		struct inode *inode = file_inode(file);
+		struct mnt_idmap *idmap = file_mnt_idmap(file);
 		int ret;
 
-		ret = ceph_do_getattr(inode, CEPH_STAT_CAP_SIZE, false);
+		ret = ceph_do_getattr(idmap, inode, CEPH_STAT_CAP_SIZE, false);
 		if (ret < 0)
 			return ret;
 	}
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 58ec603a55af..3838d7dd7cd7 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2300,7 +2300,8 @@ int ceph_try_to_choose_auth_mds(struct inode *inode, int mask)
  * Verify that we have a lease on the given mask.  If not,
  * do a getattr against an mds.
  */
-int __ceph_do_getattr(struct inode *inode, struct page *locked_page,
+int __ceph_do_getattr(struct mnt_idmap *idmap, struct inode *inode,
+		      struct page *locked_page,
 		      int mask, bool force)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_client(inode->i_sb);
@@ -2325,6 +2326,7 @@ int __ceph_do_getattr(struct inode *inode, struct page *locked_page,
 		return PTR_ERR(req);
 	req->r_inode = inode;
 	ihold(inode);
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 	req->r_num_caps = 1;
 	req->r_args.getattr.mask = cpu_to_le32(mask);
 	req->r_locked_page = locked_page;
@@ -2411,7 +2413,7 @@ int ceph_permission(struct mnt_idmap *idmap, struct inode *inode,
 	if (mask & MAY_NOT_BLOCK)
 		return -ECHILD;
 
-	err = ceph_do_getattr(inode, CEPH_CAP_AUTH_SHARED, false);
+	err = ceph_do_getattr(idmap, inode, CEPH_CAP_AUTH_SHARED, false);
 
 	if (!err)
 		err = generic_permission(idmap, inode, mask);
@@ -2464,7 +2466,7 @@ int ceph_getattr(struct mnt_idmap *idmap, const struct path *path,
 
 	/* Skip the getattr altogether if we're asked not to sync */
 	if ((flags & AT_STATX_SYNC_TYPE) != AT_STATX_DONT_SYNC) {
-		err = ceph_do_getattr(inode,
+		err = ceph_do_getattr(idmap, inode,
 				statx_to_caps(request_mask, inode->i_mode),
 				flags & AT_STATX_FORCE_SYNC);
 		if (err)
diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c
index deac817647eb..07be54ecc94d 100644
--- a/fs/ceph/ioctl.c
+++ b/fs/ceph/ioctl.c
@@ -17,10 +17,11 @@
 static long ceph_ioctl_get_layout(struct file *file, void __user *arg)
 {
 	struct ceph_inode_info *ci = ceph_inode(file_inode(file));
+	struct mnt_idmap *idmap = file_mnt_idmap(file);
 	struct ceph_ioctl_layout l;
 	int err;
 
-	err = ceph_do_getattr(file_inode(file), CEPH_STAT_CAP_LAYOUT, false);
+	err = ceph_do_getattr(idmap, file_inode(file), CEPH_STAT_CAP_LAYOUT, false);
 	if (!err) {
 		l.stripe_unit = ci->i_layout.stripe_unit;
 		l.stripe_count = ci->i_layout.stripe_count;
@@ -64,6 +65,7 @@ static long __validate_layout(struct ceph_mds_client *mdsc,
 static long ceph_ioctl_set_layout(struct file *file, void __user *arg)
 {
 	struct inode *inode = file_inode(file);
+	struct mnt_idmap *idmap = file_mnt_idmap(file);
 	struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc;
 	struct ceph_mds_request *req;
 	struct ceph_ioctl_layout l;
@@ -75,7 +77,7 @@ static long ceph_ioctl_set_layout(struct file *file, void __user *arg)
 		return -EFAULT;
 
 	/* validate changed params against current layout */
-	err = ceph_do_getattr(file_inode(file), CEPH_STAT_CAP_LAYOUT, false);
+	err = ceph_do_getattr(idmap, file_inode(file), CEPH_STAT_CAP_LAYOUT, false);
 	if (err)
 		return err;
 
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index 64592adfe48f..aea122ac3cbe 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -150,7 +150,7 @@ static struct inode *lookup_quotarealm_inode(struct ceph_mds_client *mdsc,
 	}
 	if (qri->inode) {
 		/* get caps */
-		int ret = __ceph_do_getattr(qri->inode, NULL,
+		int ret = __ceph_do_getattr(&nop_mnt_idmap, qri->inode, NULL,
 					    CEPH_STAT_CAP_INODE, true);
 		if (ret >= 0)
 			in = qri->inode;
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 3fc48b43cab0..797a6cb3733c 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -1415,8 +1415,8 @@ int ceph_force_reconnect(struct super_block *sb)
 	fsc->mount_state = CEPH_MOUNT_MOUNTED;
 
 	if (sb->s_root) {
-		err = __ceph_do_getattr(d_inode(sb->s_root), NULL,
-					CEPH_STAT_CAP_INODE, true);
+		err = __ceph_do_getattr(&nop_mnt_idmap, d_inode(sb->s_root),
+					NULL, CEPH_STAT_CAP_INODE, true);
 	}
 	return err;
 }
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index d9cc27307cb7..ccef4a6bac52 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1044,11 +1044,13 @@ static inline void ceph_queue_flush_snaps(struct inode *inode)
 }
 
 extern int ceph_try_to_choose_auth_mds(struct inode *inode, int mask);
-extern int __ceph_do_getattr(struct inode *inode, struct page *locked_page,
+extern int __ceph_do_getattr(struct mnt_idmap *idmap, struct inode *inode,
+			     struct page *locked_page,
 			     int mask, bool force);
-static inline int ceph_do_getattr(struct inode *inode, int mask, bool force)
+static inline int ceph_do_getattr(struct mnt_idmap *idmap, struct inode *inode,
+				  int mask, bool force)
 {
-	return __ceph_do_getattr(inode, NULL, mask, force);
+	return __ceph_do_getattr(idmap, inode, NULL, mask, force);
 }
 extern int ceph_permission(struct mnt_idmap *idmap,
 			   struct inode *inode, int mask);
diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
index 806183959c47..d3ac854bc11f 100644
--- a/fs/ceph/xattr.c
+++ b/fs/ceph/xattr.c
@@ -952,7 +952,7 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
 			mask |= CEPH_STAT_RSTAT;
 		if (vxattr->flags & VXATTR_FLAG_DIRSTAT)
 			mask |= CEPH_CAP_FILE_SHARED;
-		err = ceph_do_getattr(inode, mask, true);
+		err = ceph_do_getattr(&nop_mnt_idmap, inode, mask, true);
 		if (err)
 			return err;
 		err = -ENODATA;
@@ -989,7 +989,7 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
 		}
 
 		/* get xattrs from mds (if we don't already have them) */
-		err = ceph_do_getattr(inode, CEPH_STAT_CAP_XATTR, true);
+		err = ceph_do_getattr(&nop_mnt_idmap, inode, CEPH_STAT_CAP_XATTR, true);
 		if (err)
 			return err;
 		spin_lock(&ci->i_ceph_lock);
@@ -1038,7 +1038,7 @@ ssize_t ceph_listxattr(struct dentry *dentry, char *names, size_t size)
 	if (ci->i_xattrs.version == 0 ||
 	    !__ceph_caps_issued_mask_metric(ci, CEPH_CAP_XATTR_SHARED, 1)) {
 		spin_unlock(&ci->i_ceph_lock);
-		err = ceph_do_getattr(inode, CEPH_STAT_CAP_XATTR, true);
+		err = ceph_do_getattr(&nop_mnt_idmap, inode, CEPH_STAT_CAP_XATTR, true);
 		if (err)
 			return err;
 		spin_lock(&ci->i_ceph_lock);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 12/14] ceph: pass idmap to __ceph_setxattr
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (10 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 11/14] ceph: pass idmap to ceph_do_getattr Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 13/14] ceph: pass idmap to ceph_open/ioctl_set_layout Alexander Mikhalitsyn
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Alexander Mikhalitsyn, linux-kernel

Just pass down the mount's idmapping to __ceph_setxattr.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: brauner@kernel.org
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/acl.c   |  2 +-
 fs/ceph/super.h |  3 ++-
 fs/ceph/xattr.c | 12 +++++++-----
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c
index d0ca5a0060d8..bb02776e3df2 100644
--- a/fs/ceph/acl.c
+++ b/fs/ceph/acl.c
@@ -145,7 +145,7 @@ int ceph_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 			goto out_free;
 	}
 
-	ret = __ceph_setxattr(inode, name, value, size, 0);
+	ret = __ceph_setxattr(idmap, inode, name, value, size, 0);
 	if (ret) {
 		if (new_mode != old_mode) {
 			newattrs.ia_ctime = old_ctime;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index ccef4a6bac52..e23aec9554b3 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1073,7 +1073,8 @@ static inline bool ceph_inode_is_shutdown(struct inode *inode)
 }
 
 /* xattr.c */
-int __ceph_setxattr(struct inode *, const char *, const void *, size_t, int);
+int __ceph_setxattr(struct mnt_idmap *, struct inode *,
+		    const char *, const void *, size_t, int);
 int ceph_do_getvxattr(struct inode *inode, const char *name, void *value, size_t size);
 ssize_t __ceph_getxattr(struct inode *, const char *, void *, size_t);
 extern ssize_t ceph_listxattr(struct dentry *, char *, size_t);
diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
index d3ac854bc11f..0acb292f600d 100644
--- a/fs/ceph/xattr.c
+++ b/fs/ceph/xattr.c
@@ -1064,7 +1064,8 @@ ssize_t ceph_listxattr(struct dentry *dentry, char *names, size_t size)
 	return err;
 }
 
-static int ceph_sync_setxattr(struct inode *inode, const char *name,
+static int ceph_sync_setxattr(struct mnt_idmap *idmap,
+			      struct inode *inode, const char *name,
 			      const char *value, size_t size, int flags)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_client(inode->i_sb);
@@ -1118,6 +1119,7 @@ static int ceph_sync_setxattr(struct inode *inode, const char *name,
 
 	req->r_inode = inode;
 	ihold(inode);
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 	req->r_num_caps = 1;
 	req->r_inode_drop = CEPH_CAP_XATTR_SHARED;
 
@@ -1132,8 +1134,8 @@ static int ceph_sync_setxattr(struct inode *inode, const char *name,
 	return err;
 }
 
-int __ceph_setxattr(struct inode *inode, const char *name,
-			const void *value, size_t size, int flags)
+int __ceph_setxattr(struct mnt_idmap *idmap, struct inode *inode,
+		    const char *name, const void *value, size_t size, int flags)
 {
 	struct ceph_vxattr *vxattr;
 	struct ceph_inode_info *ci = ceph_inode(inode);
@@ -1262,7 +1264,7 @@ int __ceph_setxattr(struct inode *inode, const char *name,
 				    "during filling trace\n", inode);
 		err = -EBUSY;
 	} else {
-		err = ceph_sync_setxattr(inode, name, value, size, flags);
+		err = ceph_sync_setxattr(idmap, inode, name, value, size, flags);
 		if (err >= 0 && check_realm) {
 			/* check if snaprealm was created for quota inode */
 			spin_lock(&ci->i_ceph_lock);
@@ -1298,7 +1300,7 @@ static int ceph_set_xattr_handler(const struct xattr_handler *handler,
 {
 	if (!ceph_is_valid_xattr(name))
 		return -EOPNOTSUPP;
-	return __ceph_setxattr(inode, name, value, size, flags);
+	return __ceph_setxattr(idmap, inode, name, value, size, flags);
 }
 
 static const struct xattr_handler ceph_other_xattr_handler = {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 13/14] ceph: pass idmap to ceph_open/ioctl_set_layout
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (11 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 12/14] ceph: pass idmap to __ceph_setxattr Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-08 15:42 ` [PATCH v5 14/14] ceph: allow idmapped mounts Alexander Mikhalitsyn
  2023-06-09  1:57 ` [PATCH v5 00/14] ceph: support " Xiubo Li
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Alexander Mikhalitsyn, linux-kernel

Pass an idmapping to:
- ceph_open
- ceph_ioctl_set_layout

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: brauner@kernel.org
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/file.c  | 2 ++
 fs/ceph/ioctl.c | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 0019d5b4ae3c..3c3aacbf900b 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -356,6 +356,7 @@ int ceph_open(struct inode *inode, struct file *file)
 	struct ceph_mds_client *mdsc = fsc->mdsc;
 	struct ceph_mds_request *req;
 	struct ceph_file_info *fi = file->private_data;
+	struct mnt_idmap *idmap = file_mnt_idmap(file);
 	int err;
 	int flags, fmode, wanted;
 
@@ -426,6 +427,7 @@ int ceph_open(struct inode *inode, struct file *file)
 	ihold(inode);
 
 	req->r_num_caps = 1;
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 	err = ceph_mdsc_do_request(mdsc, NULL, req);
 	if (!err)
 		err = ceph_init_file(inode, file, req->r_fmode);
diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c
index 07be54ecc94d..d3568643d0af 100644
--- a/fs/ceph/ioctl.c
+++ b/fs/ceph/ioctl.c
@@ -113,6 +113,7 @@ static long ceph_ioctl_set_layout(struct file *file, void __user *arg)
 	req->r_inode = inode;
 	ihold(inode);
 	req->r_num_caps = 1;
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 
 	req->r_inode_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL;
 
@@ -138,6 +139,7 @@ static long ceph_ioctl_set_layout(struct file *file, void __user *arg)
 static long ceph_ioctl_set_layout_policy (struct file *file, void __user *arg)
 {
 	struct inode *inode = file_inode(file);
+	struct mnt_idmap *idmap = file_mnt_idmap(file);
 	struct ceph_mds_request *req;
 	struct ceph_ioctl_layout l;
 	int err;
@@ -159,6 +161,7 @@ static long ceph_ioctl_set_layout_policy (struct file *file, void __user *arg)
 	req->r_inode = inode;
 	ihold(inode);
 	req->r_num_caps = 1;
+	req->r_mnt_idmap = mnt_idmap_get(idmap);
 
 	req->r_args.setlayout.layout.fl_stripe_unit =
 			cpu_to_le32(l.stripe_unit);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v5 14/14] ceph: allow idmapped mounts
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (12 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 13/14] ceph: pass idmap to ceph_open/ioctl_set_layout Alexander Mikhalitsyn
@ 2023-06-08 15:42 ` Alexander Mikhalitsyn
  2023-06-09  1:57 ` [PATCH v5 00/14] ceph: support " Xiubo Li
  14 siblings, 0 replies; 47+ messages in thread
From: Alexander Mikhalitsyn @ 2023-06-08 15:42 UTC (permalink / raw)
  To: xiubli
  Cc: brauner, stgraber, linux-fsdevel, Jeff Layton, Ilya Dryomov,
	ceph-devel, Christian Brauner, Alexander Mikhalitsyn,
	linux-kernel

From: Christian Brauner <christian.brauner@ubuntu.com>

Now that we converted cephfs internally to account for idmapped mounts
allow the creation of idmapped mounts on by setting the FS_ALLOW_IDMAP
flag.

Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
 fs/ceph/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 797a6cb3733c..a72adc21f489 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -1389,7 +1389,7 @@ static struct file_system_type ceph_fs_type = {
 	.name		= "ceph",
 	.init_fs_context = ceph_init_fs_context,
 	.kill_sb	= ceph_kill_sb,
-	.fs_flags	= FS_RENAME_DOES_D_MOVE,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE | FS_ALLOW_IDMAP,
 };
 MODULE_ALIAS_FS("ceph");
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
                   ` (13 preceding siblings ...)
  2023-06-08 15:42 ` [PATCH v5 14/14] ceph: allow idmapped mounts Alexander Mikhalitsyn
@ 2023-06-09  1:57 ` Xiubo Li
  2023-06-09  8:59   ` Aleksandr Mikhalitsyn
  14 siblings, 1 reply; 47+ messages in thread
From: Xiubo Li @ 2023-06-09  1:57 UTC (permalink / raw)
  To: Alexander Mikhalitsyn
  Cc: brauner, stgraber, linux-fsdevel, Ilya Dryomov, Jeff Layton,
	ceph-devel, linux-kernel


On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> Dear friends,
>
> This patchset was originally developed by Christian Brauner but I'll continue
> to push it forward. Christian allowed me to do that :)
>
> This feature is already actively used/tested with LXD/LXC project.
>
> Git tree (based on https://github.com/ceph/ceph-client.git master):

Could you rebase these patches to 'testing' branch ?

And you still have missed several places, for example the following cases:


    1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
              req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR, 
mode);
    2    389  fs/ceph/dir.c <<ceph_readdir>>
              req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
    3    789  fs/ceph/dir.c <<ceph_lookup>>
              req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
    ...


For this requests you also need to set the real idmap.


Thanks

- Xiubo



> v5: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v5
> current: https://github.com/mihalicyn/linux/tree/fs.idmapped.ceph
>
> In the version 3 I've changed only two commits:
> - fs: export mnt_idmap_get/mnt_idmap_put
> - ceph: allow idmapped setattr inode op
> and added a new one:
> - ceph: pass idmap to __ceph_setattr
>
> In the version 4 I've reworked the ("ceph: stash idmapping in mdsc request")
> commit. Now we take idmap refcounter just in place where req->r_mnt_idmap
> is filled. It's more safer approach and prevents possible refcounter underflow
> on error paths where __register_request wasn't called but ceph_mdsc_release_request is
> called.
>
> Changelog for version 5:
> - a few commits were squashed into one (as suggested by Xiubo Li)
> - started passing an idmapping everywhere (if possible), so a caller
> UID/GID-s will be mapped almost everywhere (as suggested by Xiubo Li)
>
> I can confirm that this version passes xfstests.
>
> Links to previous versions:
> v1: https://lore.kernel.org/all/20220104140414.155198-1-brauner@kernel.org/
> v2: https://lore.kernel.org/lkml/20230524153316.476973-1-aleksandr.mikhalitsyn@canonical.com/
> tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v2
> v3: https://lore.kernel.org/lkml/20230607152038.469739-1-aleksandr.mikhalitsyn@canonical.com/#t
> v4: https://lore.kernel.org/lkml/20230607180958.645115-1-aleksandr.mikhalitsyn@canonical.com/#t
> tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v4
>
> Kind regards,
> Alex
>
> Original description from Christian:
> ========================================================================
> This patch series enables cephfs to support idmapped mounts, i.e. the
> ability to alter ownership information on a per-mount basis.
>
> Container managers such as LXD support sharaing data via cephfs between
> the host and unprivileged containers and between unprivileged containers.
> They may all use different idmappings. Idmapped mounts can be used to
> create mounts with the idmapping used for the container (or a different
> one specific to the use-case).
>
> There are in fact more use-cases such as remapping ownership for
> mountpoints on the host itself to grant or restrict access to different
> users or to make it possible to enforce that programs running as root
> will write with a non-zero {g,u}id to disk.
>
> The patch series is simple overall and few changes are needed to cephfs.
> There is one cephfs specific issue that I would like to discuss and
> solve which I explain in detail in:
>
> [PATCH 02/12] ceph: handle idmapped mounts in create_request_message()
>
> It has to do with how to handle mds serves which have id-based access
> restrictions configured. I would ask you to please take a look at the
> explanation in the aforementioned patch.
>
> The patch series passes the vfs and idmapped mount testsuite as part of
> xfstests. To run it you will need a config like:
>
> [ceph]
> export FSTYP=ceph
> export TEST_DIR=/mnt/test
> export TEST_DEV=10.103.182.10:6789:/
> export TEST_FS_MOUNT_OPTS="-o name=admin,secret=$password
>
> and then simply call
>
> sudo ./check -g idmapped
>
> ========================================================================
>
> Alexander Mikhalitsyn (5):
>    fs: export mnt_idmap_get/mnt_idmap_put
>    ceph: pass idmap to __ceph_setattr
>    ceph: pass idmap to ceph_do_getattr
>    ceph: pass idmap to __ceph_setxattr
>    ceph: pass idmap to ceph_open/ioctl_set_layout
>
> Christian Brauner (9):
>    ceph: stash idmapping in mdsc request
>    ceph: handle idmapped mounts in create_request_message()
>    ceph: pass an idmapping to mknod/symlink/mkdir/rename
>    ceph: allow idmapped getattr inode op
>    ceph: allow idmapped permission inode op
>    ceph: allow idmapped setattr inode op
>    ceph/acl: allow idmapped set_acl inode op
>    ceph/file: allow idmapped atomic_open inode op
>    ceph: allow idmapped mounts
>
>   fs/ceph/acl.c                 |  8 ++++----
>   fs/ceph/addr.c                |  3 ++-
>   fs/ceph/caps.c                |  3 ++-
>   fs/ceph/dir.c                 |  4 ++++
>   fs/ceph/export.c              |  2 +-
>   fs/ceph/file.c                | 21 ++++++++++++++-----
>   fs/ceph/inode.c               | 38 +++++++++++++++++++++--------------
>   fs/ceph/ioctl.c               |  9 +++++++--
>   fs/ceph/mds_client.c          | 27 +++++++++++++++++++++----
>   fs/ceph/mds_client.h          |  1 +
>   fs/ceph/quota.c               |  2 +-
>   fs/ceph/super.c               |  6 +++---
>   fs/ceph/super.h               | 14 ++++++++-----
>   fs/ceph/xattr.c               | 18 +++++++++--------
>   fs/mnt_idmapping.c            |  2 ++
>   include/linux/mnt_idmapping.h |  3 +++
>   16 files changed, 111 insertions(+), 50 deletions(-)
>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-09  1:57 ` [PATCH v5 00/14] ceph: support " Xiubo Li
@ 2023-06-09  8:59   ` Aleksandr Mikhalitsyn
  2023-06-09  9:59     ` Christian Brauner
  0 siblings, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-09  8:59 UTC (permalink / raw)
  To: Xiubo Li
  Cc: brauner, stgraber, linux-fsdevel, Ilya Dryomov, Jeff Layton,
	ceph-devel, linux-kernel

On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> > Dear friends,
> >
> > This patchset was originally developed by Christian Brauner but I'll continue
> > to push it forward. Christian allowed me to do that :)
> >
> > This feature is already actively used/tested with LXD/LXC project.
> >
> > Git tree (based on https://github.com/ceph/ceph-client.git master):

Hi Xiubo!

>
> Could you rebase these patches to 'testing' branch ?

Will do in -v6.

>
> And you still have missed several places, for example the following cases:
>
>
>     1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
>               req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
> mode);

+

>     2    389  fs/ceph/dir.c <<ceph_readdir>>
>               req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);

+

>     3    789  fs/ceph/dir.c <<ceph_lookup>>
>               req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);

We don't have an idmapping passed to lookup from the VFS layer. As I
mentioned before, it's just impossible now.

I've checked all places with ceph_mdsc_create_request and passed
idmapping everywhere if possible (in v6, that I will send soon).

>     ...
>
>
> For this requests you also need to set the real idmap.

Thanks,
Alex

>
>
> Thanks
>
> - Xiubo
>
>
>
> > v5: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v5
> > current: https://github.com/mihalicyn/linux/tree/fs.idmapped.ceph
> >
> > In the version 3 I've changed only two commits:
> > - fs: export mnt_idmap_get/mnt_idmap_put
> > - ceph: allow idmapped setattr inode op
> > and added a new one:
> > - ceph: pass idmap to __ceph_setattr
> >
> > In the version 4 I've reworked the ("ceph: stash idmapping in mdsc request")
> > commit. Now we take idmap refcounter just in place where req->r_mnt_idmap
> > is filled. It's more safer approach and prevents possible refcounter underflow
> > on error paths where __register_request wasn't called but ceph_mdsc_release_request is
> > called.
> >
> > Changelog for version 5:
> > - a few commits were squashed into one (as suggested by Xiubo Li)
> > - started passing an idmapping everywhere (if possible), so a caller
> > UID/GID-s will be mapped almost everywhere (as suggested by Xiubo Li)
> >
> > I can confirm that this version passes xfstests.
> >
> > Links to previous versions:
> > v1: https://lore.kernel.org/all/20220104140414.155198-1-brauner@kernel.org/
> > v2: https://lore.kernel.org/lkml/20230524153316.476973-1-aleksandr.mikhalitsyn@canonical.com/
> > tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v2
> > v3: https://lore.kernel.org/lkml/20230607152038.469739-1-aleksandr.mikhalitsyn@canonical.com/#t
> > v4: https://lore.kernel.org/lkml/20230607180958.645115-1-aleksandr.mikhalitsyn@canonical.com/#t
> > tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v4
> >
> > Kind regards,
> > Alex
> >
> > Original description from Christian:
> > ========================================================================
> > This patch series enables cephfs to support idmapped mounts, i.e. the
> > ability to alter ownership information on a per-mount basis.
> >
> > Container managers such as LXD support sharaing data via cephfs between
> > the host and unprivileged containers and between unprivileged containers.
> > They may all use different idmappings. Idmapped mounts can be used to
> > create mounts with the idmapping used for the container (or a different
> > one specific to the use-case).
> >
> > There are in fact more use-cases such as remapping ownership for
> > mountpoints on the host itself to grant or restrict access to different
> > users or to make it possible to enforce that programs running as root
> > will write with a non-zero {g,u}id to disk.
> >
> > The patch series is simple overall and few changes are needed to cephfs.
> > There is one cephfs specific issue that I would like to discuss and
> > solve which I explain in detail in:
> >
> > [PATCH 02/12] ceph: handle idmapped mounts in create_request_message()
> >
> > It has to do with how to handle mds serves which have id-based access
> > restrictions configured. I would ask you to please take a look at the
> > explanation in the aforementioned patch.
> >
> > The patch series passes the vfs and idmapped mount testsuite as part of
> > xfstests. To run it you will need a config like:
> >
> > [ceph]
> > export FSTYP=ceph
> > export TEST_DIR=/mnt/test
> > export TEST_DEV=10.103.182.10:6789:/
> > export TEST_FS_MOUNT_OPTS="-o name=admin,secret=$password
> >
> > and then simply call
> >
> > sudo ./check -g idmapped
> >
> > ========================================================================
> >
> > Alexander Mikhalitsyn (5):
> >    fs: export mnt_idmap_get/mnt_idmap_put
> >    ceph: pass idmap to __ceph_setattr
> >    ceph: pass idmap to ceph_do_getattr
> >    ceph: pass idmap to __ceph_setxattr
> >    ceph: pass idmap to ceph_open/ioctl_set_layout
> >
> > Christian Brauner (9):
> >    ceph: stash idmapping in mdsc request
> >    ceph: handle idmapped mounts in create_request_message()
> >    ceph: pass an idmapping to mknod/symlink/mkdir/rename
> >    ceph: allow idmapped getattr inode op
> >    ceph: allow idmapped permission inode op
> >    ceph: allow idmapped setattr inode op
> >    ceph/acl: allow idmapped set_acl inode op
> >    ceph/file: allow idmapped atomic_open inode op
> >    ceph: allow idmapped mounts
> >
> >   fs/ceph/acl.c                 |  8 ++++----
> >   fs/ceph/addr.c                |  3 ++-
> >   fs/ceph/caps.c                |  3 ++-
> >   fs/ceph/dir.c                 |  4 ++++
> >   fs/ceph/export.c              |  2 +-
> >   fs/ceph/file.c                | 21 ++++++++++++++-----
> >   fs/ceph/inode.c               | 38 +++++++++++++++++++++--------------
> >   fs/ceph/ioctl.c               |  9 +++++++--
> >   fs/ceph/mds_client.c          | 27 +++++++++++++++++++++----
> >   fs/ceph/mds_client.h          |  1 +
> >   fs/ceph/quota.c               |  2 +-
> >   fs/ceph/super.c               |  6 +++---
> >   fs/ceph/super.h               | 14 ++++++++-----
> >   fs/ceph/xattr.c               | 18 +++++++++--------
> >   fs/mnt_idmapping.c            |  2 ++
> >   include/linux/mnt_idmapping.h |  3 +++
> >   16 files changed, 111 insertions(+), 50 deletions(-)
> >
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-09  8:59   ` Aleksandr Mikhalitsyn
@ 2023-06-09  9:59     ` Christian Brauner
  2023-06-09 10:12       ` Aleksandr Mikhalitsyn
  0 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2023-06-09  9:59 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Xiubo Li, stgraber, linux-fsdevel, Ilya Dryomov, Jeff Layton,
	ceph-devel, linux-kernel

On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> >
> >
> > On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> > > Dear friends,
> > >
> > > This patchset was originally developed by Christian Brauner but I'll continue
> > > to push it forward. Christian allowed me to do that :)
> > >
> > > This feature is already actively used/tested with LXD/LXC project.
> > >
> > > Git tree (based on https://github.com/ceph/ceph-client.git master):
> 
> Hi Xiubo!
> 
> >
> > Could you rebase these patches to 'testing' branch ?
> 
> Will do in -v6.
> 
> >
> > And you still have missed several places, for example the following cases:
> >
> >
> >     1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> >               req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
> > mode);
> 
> +
> 
> >     2    389  fs/ceph/dir.c <<ceph_readdir>>
> >               req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> 
> +
> 
> >     3    789  fs/ceph/dir.c <<ceph_lookup>>
> >               req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
> 
> We don't have an idmapping passed to lookup from the VFS layer. As I
> mentioned before, it's just impossible now.

->lookup() doesn't deal with idmappings and really can't otherwise you
risk ending up with inode aliasing which is really not something you
want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
value. So better not even risk exposing the idmapping in there at all.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-09  9:59     ` Christian Brauner
@ 2023-06-09 10:12       ` Aleksandr Mikhalitsyn
  2023-06-13  1:43         ` Xiubo Li
  0 siblings, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-09 10:12 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Xiubo Li, stgraber, linux-fsdevel, Ilya Dryomov, Jeff Layton,
	ceph-devel, linux-kernel

On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
> > On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> > >
> > >
> > > On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> > > > Dear friends,
> > > >
> > > > This patchset was originally developed by Christian Brauner but I'll continue
> > > > to push it forward. Christian allowed me to do that :)
> > > >
> > > > This feature is already actively used/tested with LXD/LXC project.
> > > >
> > > > Git tree (based on https://github.com/ceph/ceph-client.git master):
> >
> > Hi Xiubo!
> >
> > >
> > > Could you rebase these patches to 'testing' branch ?
> >
> > Will do in -v6.
> >
> > >
> > > And you still have missed several places, for example the following cases:
> > >
> > >
> > >     1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> > >               req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
> > > mode);
> >
> > +
> >
> > >     2    389  fs/ceph/dir.c <<ceph_readdir>>
> > >               req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> >
> > +
> >
> > >     3    789  fs/ceph/dir.c <<ceph_lookup>>
> > >               req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
> >
> > We don't have an idmapping passed to lookup from the VFS layer. As I
> > mentioned before, it's just impossible now.
>
> ->lookup() doesn't deal with idmappings and really can't otherwise you
> risk ending up with inode aliasing which is really not something you
> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
> value. So better not even risk exposing the idmapping in there at all.

Thanks for adding, Christian!

I agree, every time when we use an idmapping we need to be careful with
what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem
idmapping (not mount),
but in this case, Xiubo want's current_fs{u,g}id to be mapped
according to an idmapping.
Anyway, it's impossible at now and IMHO, until we don't have any
practical use case where
UID/GID-based path restriction is used in combination with idmapped
mounts it's not worth to
make such big changes in the VFS layer.

May be I'm not right, but it seems like UID/GID-based path restriction
is not a widespread
feature and I can hardly imagine it to be used with the container
workloads (for instance),
because it will require to always keep in sync MDS permissions
configuration with the
possible UID/GID ranges on the client. It looks like a nightmare for sysadmin.
It is useful when cephfs is used as an external storage on the host, but if you
share cephfs with a few containers with different user namespaces idmapping...

Kind regards,
Alex

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-09 10:12       ` Aleksandr Mikhalitsyn
@ 2023-06-13  1:43         ` Xiubo Li
  2023-06-13 12:46           ` Aleksandr Mikhalitsyn
  2023-06-13 14:53           ` Gregory Farnum
  0 siblings, 2 replies; 47+ messages in thread
From: Xiubo Li @ 2023-06-13  1:43 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn, Christian Brauner, Gregory Farnum
  Cc: stgraber, linux-fsdevel, Ilya Dryomov, Jeff Layton, ceph-devel,
	linux-kernel


On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
> On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
>> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
>>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
>>>>
>>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
>>>>> Dear friends,
>>>>>
>>>>> This patchset was originally developed by Christian Brauner but I'll continue
>>>>> to push it forward. Christian allowed me to do that :)
>>>>>
>>>>> This feature is already actively used/tested with LXD/LXC project.
>>>>>
>>>>> Git tree (based on https://github.com/ceph/ceph-client.git master):
>>> Hi Xiubo!
>>>
>>>> Could you rebase these patches to 'testing' branch ?
>>> Will do in -v6.
>>>
>>>> And you still have missed several places, for example the following cases:
>>>>
>>>>
>>>>      1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
>>>>                req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
>>>> mode);
>>> +
>>>
>>>>      2    389  fs/ceph/dir.c <<ceph_readdir>>
>>>>                req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
>>> +
>>>
>>>>      3    789  fs/ceph/dir.c <<ceph_lookup>>
>>>>                req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
>>> We don't have an idmapping passed to lookup from the VFS layer. As I
>>> mentioned before, it's just impossible now.
>> ->lookup() doesn't deal with idmappings and really can't otherwise you
>> risk ending up with inode aliasing which is really not something you
>> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
>> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
>> value. So better not even risk exposing the idmapping in there at all.
> Thanks for adding, Christian!
>
> I agree, every time when we use an idmapping we need to be careful with
> what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem
> idmapping (not mount),
> but in this case, Xiubo want's current_fs{u,g}id to be mapped
> according to an idmapping.
> Anyway, it's impossible at now and IMHO, until we don't have any
> practical use case where
> UID/GID-based path restriction is used in combination with idmapped
> mounts it's not worth to
> make such big changes in the VFS layer.
>
> May be I'm not right, but it seems like UID/GID-based path restriction
> is not a widespread
> feature and I can hardly imagine it to be used with the container
> workloads (for instance),
> because it will require to always keep in sync MDS permissions
> configuration with the
> possible UID/GID ranges on the client. It looks like a nightmare for sysadmin.
> It is useful when cephfs is used as an external storage on the host, but if you
> share cephfs with a few containers with different user namespaces idmapping...

Hmm, while this will break the MDS permission check in cephfs then in 
lookup case. If we really couldn't support it we should make it to 
escape the check anyway or some OPs may fail and won't work as expected.

@Greg

For the lookup requests the idmapping couldn't get the mapped UID/GID 
just like all the other requests, which is needed by the MDS permission 
check. Is that okay to make it disable the check for this case ? I am 
afraid this will break the MDS permssions logic.

Any idea ?

Thanks

- Xiubo


> Kind regards,
> Alex
>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-13  1:43         ` Xiubo Li
@ 2023-06-13 12:46           ` Aleksandr Mikhalitsyn
  2023-06-14  9:45             ` Christian Brauner
  2023-06-13 14:53           ` Gregory Farnum
  1 sibling, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-13 12:46 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Christian Brauner, Gregory Farnum, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Tue, Jun 13, 2023 at 3:43 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
> > On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
> >> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
> >>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> >>>>
> >>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> >>>>> Dear friends,
> >>>>>
> >>>>> This patchset was originally developed by Christian Brauner but I'll continue
> >>>>> to push it forward. Christian allowed me to do that :)
> >>>>>
> >>>>> This feature is already actively used/tested with LXD/LXC project.
> >>>>>
> >>>>> Git tree (based on https://github.com/ceph/ceph-client.git master):
> >>> Hi Xiubo!
> >>>
> >>>> Could you rebase these patches to 'testing' branch ?
> >>> Will do in -v6.
> >>>
> >>>> And you still have missed several places, for example the following cases:
> >>>>
> >>>>
> >>>>      1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> >>>>                req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
> >>>> mode);
> >>> +
> >>>
> >>>>      2    389  fs/ceph/dir.c <<ceph_readdir>>
> >>>>                req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> >>> +
> >>>
> >>>>      3    789  fs/ceph/dir.c <<ceph_lookup>>
> >>>>                req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
> >>> We don't have an idmapping passed to lookup from the VFS layer. As I
> >>> mentioned before, it's just impossible now.
> >> ->lookup() doesn't deal with idmappings and really can't otherwise you
> >> risk ending up with inode aliasing which is really not something you
> >> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
> >> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
> >> value. So better not even risk exposing the idmapping in there at all.
> > Thanks for adding, Christian!
> >
> > I agree, every time when we use an idmapping we need to be careful with
> > what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem
> > idmapping (not mount),
> > but in this case, Xiubo want's current_fs{u,g}id to be mapped
> > according to an idmapping.
> > Anyway, it's impossible at now and IMHO, until we don't have any
> > practical use case where
> > UID/GID-based path restriction is used in combination with idmapped
> > mounts it's not worth to
> > make such big changes in the VFS layer.
> >
> > May be I'm not right, but it seems like UID/GID-based path restriction
> > is not a widespread
> > feature and I can hardly imagine it to be used with the container
> > workloads (for instance),
> > because it will require to always keep in sync MDS permissions
> > configuration with the
> > possible UID/GID ranges on the client. It looks like a nightmare for sysadmin.
> > It is useful when cephfs is used as an external storage on the host, but if you
> > share cephfs with a few containers with different user namespaces idmapping...
>
> Hmm, while this will break the MDS permission check in cephfs then in
> lookup case. If we really couldn't support it we should make it to
> escape the check anyway or some OPs may fail and won't work as expected.

Hi Xiubo!

Disabling UID/GID checks on the MDS side looks reasonable. IMHO the
most important checks are:
- open
- mknod/mkdir/symlink/rename
and for these checks we already have an idmapping.

Also, I want to add that it's a little bit unusual when permission
checks are done against the caller UID/GID.
Usually, if we have opened a file descriptor and, for instance, passed
this file descriptor through a unix socket then
file descriptor holder will be able to use it in accordance with the
flags (O_RDONLY, O_RDWR, ...).
We also have ->f_cred on the struct file that contains credentials of
the file opener and permission checks are usually done
based on this. But in cephfs we are always using syscall caller's
credentials. It makes cephfs file descriptor "not transferable"
in terms of permission checks.

Kind regards,
Alex

>
> @Greg
>
> For the lookup requests the idmapping couldn't get the mapped UID/GID
> just like all the other requests, which is needed by the MDS permission
> check. Is that okay to make it disable the check for this case ? I am
> afraid this will break the MDS permssions logic.
>
> Any idea ?
>
> Thanks
>
> - Xiubo
>
>
> > Kind regards,
> > Alex
> >
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-13  1:43         ` Xiubo Li
  2023-06-13 12:46           ` Aleksandr Mikhalitsyn
@ 2023-06-13 14:53           ` Gregory Farnum
  2023-06-13 16:27             ` Aleksandr Mikhalitsyn
  2023-06-14  1:52             ` Xiubo Li
  1 sibling, 2 replies; 47+ messages in thread
From: Gregory Farnum @ 2023-06-13 14:53 UTC (permalink / raw)
  To: Xiubo Li, Aleksandr Mikhalitsyn, Christian Brauner
  Cc: stgraber, linux-fsdevel, Ilya Dryomov, Jeff Layton, ceph-devel,
	linux-kernel

On Mon, Jun 12, 2023 at 6:43 PM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
> > On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
> >> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
> >>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> >>>>
> >>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> >>>>> Dear friends,
> >>>>>
> >>>>> This patchset was originally developed by Christian Brauner but I'll continue
> >>>>> to push it forward. Christian allowed me to do that :)
> >>>>>
> >>>>> This feature is already actively used/tested with LXD/LXC project.
> >>>>>
> >>>>> Git tree (based on https://github.com/ceph/ceph-client.git master):
> >>> Hi Xiubo!
> >>>
> >>>> Could you rebase these patches to 'testing' branch ?
> >>> Will do in -v6.
> >>>
> >>>> And you still have missed several places, for example the following cases:
> >>>>
> >>>>
> >>>>      1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> >>>>                req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
> >>>> mode);
> >>> +
> >>>
> >>>>      2    389  fs/ceph/dir.c <<ceph_readdir>>
> >>>>                req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> >>> +
> >>>
> >>>>      3    789  fs/ceph/dir.c <<ceph_lookup>>
> >>>>                req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
> >>> We don't have an idmapping passed to lookup from the VFS layer. As I
> >>> mentioned before, it's just impossible now.
> >> ->lookup() doesn't deal with idmappings and really can't otherwise you
> >> risk ending up with inode aliasing which is really not something you
> >> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
> >> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
> >> value. So better not even risk exposing the idmapping in there at all.
> > Thanks for adding, Christian!
> >
> > I agree, every time when we use an idmapping we need to be careful with
> > what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem
> > idmapping (not mount),
> > but in this case, Xiubo want's current_fs{u,g}id to be mapped
> > according to an idmapping.
> > Anyway, it's impossible at now and IMHO, until we don't have any
> > practical use case where
> > UID/GID-based path restriction is used in combination with idmapped
> > mounts it's not worth to
> > make such big changes in the VFS layer.
> >
> > May be I'm not right, but it seems like UID/GID-based path restriction
> > is not a widespread
> > feature and I can hardly imagine it to be used with the container
> > workloads (for instance),
> > because it will require to always keep in sync MDS permissions
> > configuration with the
> > possible UID/GID ranges on the client. It looks like a nightmare for sysadmin.
> > It is useful when cephfs is used as an external storage on the host, but if you
> > share cephfs with a few containers with different user namespaces idmapping...
>
> Hmm, while this will break the MDS permission check in cephfs then in
> lookup case. If we really couldn't support it we should make it to
> escape the check anyway or some OPs may fail and won't work as expected.

I don't pretend to know the details of the VFS (or even our linux
client implementation), but I'm confused that this is apparently so
hard. It looks to me like we currently always fill in the "caller_uid"
with "from_kuid(&init_user_ns, req->r_cred->fsuid))". Is this actually
valid to begin with? If it is, why can't the uid mapping be applied on
that?

As both the client and the server share authority over the inode's
state (including things like mode bits and owners), and need to do
permission checking, being able to tell the server the relevant actor
is inherently necessary. We also let admins restrict keys to
particular UID/GID combinations as they wish, and it's not the most
popular feature but it does get deployed. I would really expect a user
of UID mapping to be one of the *most* likely to employ such a
facility...maybe not with containers, but certainly end-user homedirs
and shared spaces.

Disabling the MDS auth checks is really not an option. I guess we
could require any user employing idmapping to not be uid-restricted,
and set the anonymous UID (does that work, Xiubo, or was it the broken
one? In which case we'd have to default to root?). But that seems a
bit janky to me.
-Greg

> @Greg
>
> For the lookup requests the idmapping couldn't get the mapped UID/GID
> just like all the other requests, which is needed by the MDS permission
> check. Is that okay to make it disable the check for this case ? I am
> afraid this will break the MDS permssions logic.
>
> Any idea ?
>
> Thanks
>
> - Xiubo
>
>
> > Kind regards,
> > Alex
> >
>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-13 14:53           ` Gregory Farnum
@ 2023-06-13 16:27             ` Aleksandr Mikhalitsyn
  2023-06-14  1:52             ` Xiubo Li
  1 sibling, 0 replies; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-13 16:27 UTC (permalink / raw)
  To: Gregory Farnum
  Cc: Xiubo Li, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Tue, Jun 13, 2023 at 4:54 PM Gregory Farnum <gfarnum@redhat.com> wrote:
>
> On Mon, Jun 12, 2023 at 6:43 PM Xiubo Li <xiubli@redhat.com> wrote:
> >
> >
> > On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
> > > On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
> > >> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
> > >>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> > >>>>
> > >>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> > >>>>> Dear friends,
> > >>>>>
> > >>>>> This patchset was originally developed by Christian Brauner but I'll continue
> > >>>>> to push it forward. Christian allowed me to do that :)
> > >>>>>
> > >>>>> This feature is already actively used/tested with LXD/LXC project.
> > >>>>>
> > >>>>> Git tree (based on https://github.com/ceph/ceph-client.git master):
> > >>> Hi Xiubo!
> > >>>
> > >>>> Could you rebase these patches to 'testing' branch ?
> > >>> Will do in -v6.
> > >>>
> > >>>> And you still have missed several places, for example the following cases:
> > >>>>
> > >>>>
> > >>>>      1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> > >>>>                req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
> > >>>> mode);
> > >>> +
> > >>>
> > >>>>      2    389  fs/ceph/dir.c <<ceph_readdir>>
> > >>>>                req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> > >>> +
> > >>>
> > >>>>      3    789  fs/ceph/dir.c <<ceph_lookup>>
> > >>>>                req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
> > >>> We don't have an idmapping passed to lookup from the VFS layer. As I
> > >>> mentioned before, it's just impossible now.
> > >> ->lookup() doesn't deal with idmappings and really can't otherwise you
> > >> risk ending up with inode aliasing which is really not something you
> > >> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
> > >> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
> > >> value. So better not even risk exposing the idmapping in there at all.
> > > Thanks for adding, Christian!
> > >
> > > I agree, every time when we use an idmapping we need to be careful with
> > > what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem
> > > idmapping (not mount),
> > > but in this case, Xiubo want's current_fs{u,g}id to be mapped
> > > according to an idmapping.
> > > Anyway, it's impossible at now and IMHO, until we don't have any
> > > practical use case where
> > > UID/GID-based path restriction is used in combination with idmapped
> > > mounts it's not worth to
> > > make such big changes in the VFS layer.
> > >
> > > May be I'm not right, but it seems like UID/GID-based path restriction
> > > is not a widespread
> > > feature and I can hardly imagine it to be used with the container
> > > workloads (for instance),
> > > because it will require to always keep in sync MDS permissions
> > > configuration with the
> > > possible UID/GID ranges on the client. It looks like a nightmare for sysadmin.
> > > It is useful when cephfs is used as an external storage on the host, but if you
> > > share cephfs with a few containers with different user namespaces idmapping...
> >
> > Hmm, while this will break the MDS permission check in cephfs then in
> > lookup case. If we really couldn't support it we should make it to
> > escape the check anyway or some OPs may fail and won't work as expected.

Dear Gregory,

Thanks for the fast reply!

>
> I don't pretend to know the details of the VFS (or even our linux
> client implementation), but I'm confused that this is apparently so
> hard. It looks to me like we currently always fill in the "caller_uid"
> with "from_kuid(&init_user_ns, req->r_cred->fsuid))". Is this actually
> valid to begin with? If it is, why can't the uid mapping be applied on
> that?

Applying an idmapping is not hard, it's as simple as replacing
from_kuid(&init_user_ns, req->r_cred->fsuid)
to
from_vfsuid(req->r_mnt_idmap, &init_user_ns, VFSUIDT_INIT(req->r_cred->fsuid))

but the problem is that we don't have req->r_mnt_idmap for all the requests.
For instance, we don't have idmap arguments (that come from the VFS
layer) for ->lookup
operation and many others. There are some reasons for that (Christian
has covered some of them).
So, it's not about my laziness to implement that. It's a real pain ;-)

>
> As both the client and the server share authority over the inode's
> state (including things like mode bits and owners), and need to do
> permission checking, being able to tell the server the relevant actor
> is inherently necessary. We also let admins restrict keys to
> particular UID/GID combinations as they wish, and it's not the most
> popular feature but it does get deployed. I would really expect a user
> of UID mapping to be one of the *most* likely to employ such a
> facility...maybe not with containers, but certainly end-user homedirs
> and shared spaces.
>
> Disabling the MDS auth checks is really not an option. I guess we
> could require any user employing idmapping to not be uid-restricted,
> and set the anonymous UID (does that work, Xiubo, or was it the broken
> one? In which case we'd have to default to root?). But that seems a
> bit janky to me.

That's an interesting point about anonymous UID, but at the same time,
We use these caller's fs UID/GID values as an owner's UID/GID for
newly created inodes.
It means that we can't use anonymous UID everywhere in this case
otherwise all new files/directories
will be owned by an anonymous user.

> -Greg

Kind regards,
Alex

>
> > @Greg
> >
> > For the lookup requests the idmapping couldn't get the mapped UID/GID
> > just like all the other requests, which is needed by the MDS permission
> > check. Is that okay to make it disable the check for this case ? I am
> > afraid this will break the MDS permssions logic.
> >
> > Any idea ?
> >
> > Thanks
> >
> > - Xiubo
> >
> >
> > > Kind regards,
> > > Alex
> > >
> >
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-13 14:53           ` Gregory Farnum
  2023-06-13 16:27             ` Aleksandr Mikhalitsyn
@ 2023-06-14  1:52             ` Xiubo Li
  2023-06-14 12:39               ` Aleksandr Mikhalitsyn
       [not found]               ` <CAEivzxcr99sERxZX17rZ5jW9YSzAWYvAjOOhBH+FqRoso2=yng@mail.gmail.com>
  1 sibling, 2 replies; 47+ messages in thread
From: Xiubo Li @ 2023-06-14  1:52 UTC (permalink / raw)
  To: Gregory Farnum, Aleksandr Mikhalitsyn, Christian Brauner
  Cc: stgraber, linux-fsdevel, Ilya Dryomov, Jeff Layton, ceph-devel,
	linux-kernel


On 6/13/23 22:53, Gregory Farnum wrote:
> On Mon, Jun 12, 2023 at 6:43 PM Xiubo Li <xiubli@redhat.com> wrote:
>>
>> On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
>>> On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
>>>> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
>>>>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
>>>>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
>>>>>>> Dear friends,
>>>>>>>
>>>>>>> This patchset was originally developed by Christian Brauner but I'll continue
>>>>>>> to push it forward. Christian allowed me to do that :)
>>>>>>>
>>>>>>> This feature is already actively used/tested with LXD/LXC project.
>>>>>>>
>>>>>>> Git tree (based on https://github.com/ceph/ceph-client.git master):
>>>>> Hi Xiubo!
>>>>>
>>>>>> Could you rebase these patches to 'testing' branch ?
>>>>> Will do in -v6.
>>>>>
>>>>>> And you still have missed several places, for example the following cases:
>>>>>>
>>>>>>
>>>>>>       1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
>>>>>>                 req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
>>>>>> mode);
>>>>> +
>>>>>
>>>>>>       2    389  fs/ceph/dir.c <<ceph_readdir>>
>>>>>>                 req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
>>>>> +
>>>>>
>>>>>>       3    789  fs/ceph/dir.c <<ceph_lookup>>
>>>>>>                 req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
>>>>> We don't have an idmapping passed to lookup from the VFS layer. As I
>>>>> mentioned before, it's just impossible now.
>>>> ->lookup() doesn't deal with idmappings and really can't otherwise you
>>>> risk ending up with inode aliasing which is really not something you
>>>> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
>>>> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
>>>> value. So better not even risk exposing the idmapping in there at all.
>>> Thanks for adding, Christian!
>>>
>>> I agree, every time when we use an idmapping we need to be careful with
>>> what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem
>>> idmapping (not mount),
>>> but in this case, Xiubo want's current_fs{u,g}id to be mapped
>>> according to an idmapping.
>>> Anyway, it's impossible at now and IMHO, until we don't have any
>>> practical use case where
>>> UID/GID-based path restriction is used in combination with idmapped
>>> mounts it's not worth to
>>> make such big changes in the VFS layer.
>>>
>>> May be I'm not right, but it seems like UID/GID-based path restriction
>>> is not a widespread
>>> feature and I can hardly imagine it to be used with the container
>>> workloads (for instance),
>>> because it will require to always keep in sync MDS permissions
>>> configuration with the
>>> possible UID/GID ranges on the client. It looks like a nightmare for sysadmin.
>>> It is useful when cephfs is used as an external storage on the host, but if you
>>> share cephfs with a few containers with different user namespaces idmapping...
>> Hmm, while this will break the MDS permission check in cephfs then in
>> lookup case. If we really couldn't support it we should make it to
>> escape the check anyway or some OPs may fail and won't work as expected.
> I don't pretend to know the details of the VFS (or even our linux
> client implementation), but I'm confused that this is apparently so
> hard. It looks to me like we currently always fill in the "caller_uid"
> with "from_kuid(&init_user_ns, req->r_cred->fsuid))". Is this actually
> valid to begin with? If it is, why can't the uid mapping be applied on
> that?
>
> As both the client and the server share authority over the inode's
> state (including things like mode bits and owners), and need to do
> permission checking, being able to tell the server the relevant actor
> is inherently necessary. We also let admins restrict keys to
> particular UID/GID combinations as they wish, and it's not the most
> popular feature but it does get deployed. I would really expect a user
> of UID mapping to be one of the *most* likely to employ such a
> facility...maybe not with containers, but certainly end-user homedirs
> and shared spaces.
>
> Disabling the MDS auth checks is really not an option. I guess we
> could require any user employing idmapping to not be uid-restricted,
> and set the anonymous UID (does that work, Xiubo, or was it the broken
> one? In which case we'd have to default to root?). But that seems a
> bit janky to me.

Yeah, this also seems risky.

Instead disabling the MDS auth checks there is another option, which is 
we can prevent  the kclient to be mounted or the idmapping to be 
applied. But this still have issues, such as what if admins set the MDS 
auth caps after idmap applied to the kclients ?

IMO there have 2 options: the best way is to fix this in VFS if 
possible. Else to add one option to disable the corresponding MDS auth 
caps in ceph if users want to support the idmap feature.

Thanks

- Xiubo

> -Greg
>
>> @Greg
>>
>> For the lookup requests the idmapping couldn't get the mapped UID/GID
>> just like all the other requests, which is needed by the MDS permission
>> check. Is that okay to make it disable the check for this case ? I am
>> afraid this will break the MDS permssions logic.
>>
>> Any idea ?
>>
>> Thanks
>>
>> - Xiubo
>>
>>
>>> Kind regards,
>>> Alex
>>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-13 12:46           ` Aleksandr Mikhalitsyn
@ 2023-06-14  9:45             ` Christian Brauner
  0 siblings, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2023-06-14  9:45 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Xiubo Li, Gregory Farnum, stgraber, linux-fsdevel, Ilya Dryomov,
	Jeff Layton, ceph-devel, linux-kernel

On Tue, Jun 13, 2023 at 02:46:02PM +0200, Aleksandr Mikhalitsyn wrote:
> On Tue, Jun 13, 2023 at 3:43 AM Xiubo Li <xiubli@redhat.com> wrote:
> >
> >
> > On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
> > > On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
> > >> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
> > >>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> > >>>>
> > >>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> > >>>>> Dear friends,
> > >>>>>
> > >>>>> This patchset was originally developed by Christian Brauner but I'll continue
> > >>>>> to push it forward. Christian allowed me to do that :)
> > >>>>>
> > >>>>> This feature is already actively used/tested with LXD/LXC project.
> > >>>>>
> > >>>>> Git tree (based on https://github.com/ceph/ceph-client.git master):
> > >>> Hi Xiubo!
> > >>>
> > >>>> Could you rebase these patches to 'testing' branch ?
> > >>> Will do in -v6.
> > >>>
> > >>>> And you still have missed several places, for example the following cases:
> > >>>>
> > >>>>
> > >>>>      1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> > >>>>                req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
> > >>>> mode);
> > >>> +
> > >>>
> > >>>>      2    389  fs/ceph/dir.c <<ceph_readdir>>
> > >>>>                req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> > >>> +
> > >>>
> > >>>>      3    789  fs/ceph/dir.c <<ceph_lookup>>
> > >>>>                req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
> > >>> We don't have an idmapping passed to lookup from the VFS layer. As I
> > >>> mentioned before, it's just impossible now.
> > >> ->lookup() doesn't deal with idmappings and really can't otherwise you
> > >> risk ending up with inode aliasing which is really not something you
> > >> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
> > >> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
> > >> value. So better not even risk exposing the idmapping in there at all.
> > > Thanks for adding, Christian!
> > >
> > > I agree, every time when we use an idmapping we need to be careful with
> > > what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem
> > > idmapping (not mount),
> > > but in this case, Xiubo want's current_fs{u,g}id to be mapped
> > > according to an idmapping.
> > > Anyway, it's impossible at now and IMHO, until we don't have any
> > > practical use case where
> > > UID/GID-based path restriction is used in combination with idmapped
> > > mounts it's not worth to
> > > make such big changes in the VFS layer.
> > >
> > > May be I'm not right, but it seems like UID/GID-based path restriction
> > > is not a widespread
> > > feature and I can hardly imagine it to be used with the container
> > > workloads (for instance),
> > > because it will require to always keep in sync MDS permissions
> > > configuration with the
> > > possible UID/GID ranges on the client. It looks like a nightmare for sysadmin.
> > > It is useful when cephfs is used as an external storage on the host, but if you
> > > share cephfs with a few containers with different user namespaces idmapping...
> >
> > Hmm, while this will break the MDS permission check in cephfs then in
> > lookup case. If we really couldn't support it we should make it to
> > escape the check anyway or some OPs may fail and won't work as expected.
> 
> Hi Xiubo!
> 
> Disabling UID/GID checks on the MDS side looks reasonable. IMHO the
> most important checks are:
> - open
> - mknod/mkdir/symlink/rename
> and for these checks we already have an idmapping.
> 
> Also, I want to add that it's a little bit unusual when permission
> checks are done against the caller UID/GID.

The server side permission checking based on the sender's fs{g,u}id is
rather esoteric imho. So I would just disable it for idmapped mounts.

> Usually, if we have opened a file descriptor and, for instance, passed
> this file descriptor through a unix socket then
> file descriptor holder will be able to use it in accordance with the
> flags (O_RDONLY, O_RDWR, ...).
> We also have ->f_cred on the struct file that contains credentials of
> the file opener and permission checks are usually done
> based on this. But in cephfs we are always using syscall caller's
> credentials. It makes cephfs file descriptor "not transferable"
> in terms of permission checks.

Yeah, that's another good point.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-14  1:52             ` Xiubo Li
@ 2023-06-14 12:39               ` Aleksandr Mikhalitsyn
       [not found]               ` <CAEivzxcr99sERxZX17rZ5jW9YSzAWYvAjOOhBH+FqRoso2=yng@mail.gmail.com>
  1 sibling, 0 replies; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-14 12:39 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Wed, Jun 14, 2023 at 3:53 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 6/13/23 22:53, Gregory Farnum wrote:
> > On Mon, Jun 12, 2023 at 6:43 PM Xiubo Li <xiubli@redhat.com> wrote:
> >>
> >> On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
> >>> On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner <brauner@kernel.org> wrote:
> >>>> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote:
> >>>>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> >>>>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> >>>>>>> Dear friends,
> >>>>>>>
> >>>>>>> This patchset was originally developed by Christian Brauner but I'll continue
> >>>>>>> to push it forward. Christian allowed me to do that :)
> >>>>>>>
> >>>>>>> This feature is already actively used/tested with LXD/LXC project.
> >>>>>>>
> >>>>>>> Git tree (based on https://github.com/ceph/ceph-client.git master):
> >>>>> Hi Xiubo!
> >>>>>
> >>>>>> Could you rebase these patches to 'testing' branch ?
> >>>>> Will do in -v6.
> >>>>>
> >>>>>> And you still have missed several places, for example the following cases:
> >>>>>>
> >>>>>>
> >>>>>>       1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> >>>>>>                 req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR,
> >>>>>> mode);
> >>>>> +
> >>>>>
> >>>>>>       2    389  fs/ceph/dir.c <<ceph_readdir>>
> >>>>>>                 req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> >>>>> +
> >>>>>
> >>>>>>       3    789  fs/ceph/dir.c <<ceph_lookup>>
> >>>>>>                 req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
> >>>>> We don't have an idmapping passed to lookup from the VFS layer. As I
> >>>>> mentioned before, it's just impossible now.
> >>>> ->lookup() doesn't deal with idmappings and really can't otherwise you
> >>>> risk ending up with inode aliasing which is really not something you
> >>>> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
> >>>> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide
> >>>> value. So better not even risk exposing the idmapping in there at all.
> >>> Thanks for adding, Christian!
> >>>
> >>> I agree, every time when we use an idmapping we need to be careful with
> >>> what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem
> >>> idmapping (not mount),
> >>> but in this case, Xiubo want's current_fs{u,g}id to be mapped
> >>> according to an idmapping.
> >>> Anyway, it's impossible at now and IMHO, until we don't have any
> >>> practical use case where
> >>> UID/GID-based path restriction is used in combination with idmapped
> >>> mounts it's not worth to
> >>> make such big changes in the VFS layer.
> >>>
> >>> May be I'm not right, but it seems like UID/GID-based path restriction
> >>> is not a widespread
> >>> feature and I can hardly imagine it to be used with the container
> >>> workloads (for instance),
> >>> because it will require to always keep in sync MDS permissions
> >>> configuration with the
> >>> possible UID/GID ranges on the client. It looks like a nightmare for sysadmin.
> >>> It is useful when cephfs is used as an external storage on the host, but if you
> >>> share cephfs with a few containers with different user namespaces idmapping...
> >> Hmm, while this will break the MDS permission check in cephfs then in
> >> lookup case. If we really couldn't support it we should make it to
> >> escape the check anyway or some OPs may fail and won't work as expected.
> > I don't pretend to know the details of the VFS (or even our linux
> > client implementation), but I'm confused that this is apparently so
> > hard. It looks to me like we currently always fill in the "caller_uid"
> > with "from_kuid(&init_user_ns, req->r_cred->fsuid))". Is this actually
> > valid to begin with? If it is, why can't the uid mapping be applied on
> > that?
> >
> > As both the client and the server share authority over the inode's
> > state (including things like mode bits and owners), and need to do
> > permission checking, being able to tell the server the relevant actor
> > is inherently necessary. We also let admins restrict keys to
> > particular UID/GID combinations as they wish, and it's not the most
> > popular feature but it does get deployed. I would really expect a user
> > of UID mapping to be one of the *most* likely to employ such a
> > facility...maybe not with containers, but certainly end-user homedirs
> > and shared spaces.
> >
> > Disabling the MDS auth checks is really not an option. I guess we
> > could require any user employing idmapping to not be uid-restricted,
> > and set the anonymous UID (does that work, Xiubo, or was it the broken
> > one? In which case we'd have to default to root?). But that seems a
> > bit janky to me.
>
> Yeah, this also seems risky.
>
> Instead disabling the MDS auth checks there is another option, which is
> we can prevent  the kclient to be mounted or the idmapping to be
> applied. But this still have issues, such as what if admins set the MDS
> auth caps after idmap applied to the kclients ?

Hi Xiubo,

I thought about this too and came to the same conclusion, that UID/GID based
restriction can be applied dynamically, so detecting it on mount-time
helps not so much.

>
> IMO there have 2 options: the best way is to fix this in VFS if
> possible. Else to add one option to disable the corresponding MDS auth
> caps in ceph if users want to support the idmap feature.

Dear colleagues,
Dear Xiubo,

Let me try to summarize the previous discussions about cephfs idmapped
mount support.

This discussion about the need of caller's UID/GID mapping is started
from the first
version of this patchset in this [1] thread. Let'me quote Christian here:
> Since the idmapping is a property of the mount and not a property of the
> caller the caller's fs{g,u}id aren't mapped. What is mapped are the
> inode's i{g,u}id when accessed from a particular mount.
>
> The fs{g,u}id are only ever mapped when a new filesystem object is
> created. So if I have an idmapped mount that makes it so that files
> owned by 1000 on-disk appear to be owned by uid 0 then a user with uid 0
> creating a new file will create files with uid 1000 on-disk when going
> through that mount. For cephfs that'd be the uid we would be sending
> with creation requests as I've currently written it.

This is a key part of this discussion. Idmapped mounts is not a way to proxify
caller's UID/GID, but idmapped mounts are designed to perform UID/GID mapping
of inode's owner's UID/GID. Yes, these concepts look really-really
close and from
the first glance it looks like it's just an equivalent thing. But they are not.

From my understanding, if someone wants to verify caller UID/GID then he should
take an unmapped UID/GID and verify it. It's not important if the
caller does something
through an idmapped mount or not, from_kuid(&init_user_ns, req->r_cred->fsuid))
literally "UID of the caller in a root user namespace". But cephfs
mount can be used
from any user namespace (yes, cephfs can't be mounted in user namespaces, but it
can be inherited during CLONE_NEWNS, or used as a detached mount with
open_tree/move_mount).
What I want to say by providing this example is that even now, without
idmapped mounts
we have kinda close problem, that UID/GID based restriction will be
based on the host's (!),
root user namespace, UID/GID-s even if the caller sits inside the user
namespace. And we don't care,
right? Why it's a problem with an idmapped mounts? If someone wants to
control caller's UID/GID
on the MDS side he just needs to take hosts UID/GIDs and use them in
permission rules. That's it.

Next point is that technically idmapped mounts don't break anything,
if someone starts using
idmapped mounts with UID/GID-based restrictions he will get -EACCESS.
Why is this a problem?
A user will check configuration, read the clarification in the
documentation about idmapped mounts
in cephfs and find a warning that these are not fully compatible
things right now.

IMHO, there is only one real problem (which makes UID/GID-based
restrictions is not fully compatible with
an idmapped mounts). Is that we have to map caller's UID/GID according
to a mount idmapping when we
creating a new inode (mknod, mkdir, symlink, open(O_CREAT)). But it's
only because the caller's UID/GIDs are
used as the owner's UID/GID for newly created inode. Ideally, we need
to have two fields in ceph request,
one for a caller's UID/GID and another one for inode owner UID/GID.
But this requires cephfs protocol modification
(yes, it's a bit painful. But global VFS changes are painful too!). As
Christian pointed this is a reason why
he went this way in the first patchset version.

Maybe I'm not right, but both options to properly fix that VFS API
changes or cephfs protocol modification
are too expensive until we don't have a real requestors with a good
use case for idmapped mounts + UID/GID
based permissions. We already have a real and good use case for
idmapped mounts in Cephfs for LXD/LXC.
IMHO, it's better to move this thing forward step by step, because VFS
API/cephfs protocol changes will
take a really big amount of time and it's not obvious that it's worth
it, moreover it's not even clear that VFS API
change is the right way to deal with this problem. It seems to me that
Cephfs protocol change seems like a
more proper way here. At the same time I fully understand that you are
not happy about this option.

Just to conclude, we don't have any kind of cephfs degradation here,
all users without idmapping will not be affected,
all users who start using mount idmappings with cephfs will be aware
of this limitation.

[1] https://lore.kernel.org/all/20220105141023.vrrbfhti5apdvkz7@wittgenstein/

Kind regards,
Alex

>
> Thanks
>
> - Xiubo
>
> > -Greg
> >
> >> @Greg
> >>
> >> For the lookup requests the idmapping couldn't get the mapped UID/GID
> >> just like all the other requests, which is needed by the MDS permission
> >> check. Is that okay to make it disable the check for this case ? I am
> >> afraid this will break the MDS permssions logic.
> >>
> >> Any idea ?
> >>
> >> Thanks
> >>
> >> - Xiubo
> >>
> >>
> >>> Kind regards,
> >>> Alex
> >>>
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
       [not found]               ` <CAEivzxcr99sERxZX17rZ5jW9YSzAWYvAjOOhBH+FqRoso2=yng@mail.gmail.com>
@ 2023-06-15  5:08                 ` Xiubo Li
  2023-06-15 11:05                   ` Aleksandr Mikhalitsyn
                                     ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Xiubo Li @ 2023-06-15  5:08 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel


On 6/14/23 20:34, Aleksandr Mikhalitsyn wrote:
> On Wed, Jun 14, 2023 at 3:53 AM Xiubo Li <xiubli@redhat.com> wrote:
> >
> >
> > On 6/13/23 22:53, Gregory Farnum wrote:
> > > On Mon, Jun 12, 2023 at 6:43 PM Xiubo Li <xiubli@redhat.com> wrote:
> > >>
> > >> On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
> > >>> On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner 
> <brauner@kernel.org> wrote:
> > >>>> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn 
> wrote:
> > >>>>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> > >>>>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> > >>>>>>> Dear friends,
> > >>>>>>>
> > >>>>>>> This patchset was originally developed by Christian Brauner 
> but I'll continue
> > >>>>>>> to push it forward. Christian allowed me to do that :)
> > >>>>>>>
> > >>>>>>> This feature is already actively used/tested with LXD/LXC 
> project.
> > >>>>>>>
> > >>>>>>> Git tree (based on https://github.com/ceph/ceph-client.git 
> master):
> > >>>>> Hi Xiubo!
> > >>>>>
> > >>>>>> Could you rebase these patches to 'testing' branch ?
> > >>>>> Will do in -v6.
> > >>>>>
> > >>>>>> And you still have missed several places, for example the 
> following cases:
> > >>>>>>
> > >>>>>>
> > >>>>>>       1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> > >>>>>>                 req = ceph_mdsc_create_request(mdsc, 
> CEPH_MDS_OP_GETATTR,
> > >>>>>> mode);
> > >>>>> +
> > >>>>>
> > >>>>>>       2    389  fs/ceph/dir.c <<ceph_readdir>>
> > >>>>>>                 req = ceph_mdsc_create_request(mdsc, op, 
> USE_AUTH_MDS);
> > >>>>> +
> > >>>>>
> > >>>>>>       3    789  fs/ceph/dir.c <<ceph_lookup>>
> > >>>>>>                 req = ceph_mdsc_create_request(mdsc, op, 
> USE_ANY_MDS);
> > >>>>> We don't have an idmapping passed to lookup from the VFS 
> layer. As I
> > >>>>> mentioned before, it's just impossible now.
> > >>>> ->lookup() doesn't deal with idmappings and really can't 
> otherwise you
> > >>>> risk ending up with inode aliasing which is really not 
> something you
> > >>>> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
> > >>>> idmapping as inode->i_{g,u}id absolutely needs to be a 
> filesystem wide
> > >>>> value. So better not even risk exposing the idmapping in there 
> at all.
> > >>> Thanks for adding, Christian!
> > >>>
> > >>> I agree, every time when we use an idmapping we need to be 
> careful with
> > >>> what we map. AFAIU, inode->i_{g,u}id should be based on the 
> filesystem
> > >>> idmapping (not mount),
> > >>> but in this case, Xiubo want's current_fs{u,g}id to be mapped
> > >>> according to an idmapping.
> > >>> Anyway, it's impossible at now and IMHO, until we don't have any
> > >>> practical use case where
> > >>> UID/GID-based path restriction is used in combination with idmapped
> > >>> mounts it's not worth to
> > >>> make such big changes in the VFS layer.
> > >>>
> > >>> May be I'm not right, but it seems like UID/GID-based path 
> restriction
> > >>> is not a widespread
> > >>> feature and I can hardly imagine it to be used with the container
> > >>> workloads (for instance),
> > >>> because it will require to always keep in sync MDS permissions
> > >>> configuration with the
> > >>> possible UID/GID ranges on the client. It looks like a nightmare 
> for sysadmin.
> > >>> It is useful when cephfs is used as an external storage on the 
> host, but if you
> > >>> share cephfs with a few containers with different user 
> namespaces idmapping...
> > >> Hmm, while this will break the MDS permission check in cephfs then in
> > >> lookup case. If we really couldn't support it we should make it to
> > >> escape the check anyway or some OPs may fail and won't work as 
> expected.
> > > I don't pretend to know the details of the VFS (or even our linux
> > > client implementation), but I'm confused that this is apparently so
> > > hard. It looks to me like we currently always fill in the "caller_uid"
> > > with "from_kuid(&init_user_ns, req->r_cred->fsuid))". Is this actually
> > > valid to begin with? If it is, why can't the uid mapping be applied on
> > > that?
> > >
> > > As both the client and the server share authority over the inode's
> > > state (including things like mode bits and owners), and need to do
> > > permission checking, being able to tell the server the relevant actor
> > > is inherently necessary. We also let admins restrict keys to
> > > particular UID/GID combinations as they wish, and it's not the most
> > > popular feature but it does get deployed. I would really expect a user
> > > of UID mapping to be one of the *most* likely to employ such a
> > > facility...maybe not with containers, but certainly end-user homedirs
> > > and shared spaces.
> > >
> > > Disabling the MDS auth checks is really not an option. I guess we
> > > could require any user employing idmapping to not be uid-restricted,
> > > and set the anonymous UID (does that work, Xiubo, or was it the broken
> > > one? In which case we'd have to default to root?). But that seems a
> > > bit janky to me.
> >
> > Yeah, this also seems risky.
> >
> > Instead disabling the MDS auth checks there is another option, which is
> > we can prevent  the kclient to be mounted or the idmapping to be
> > applied. But this still have issues, such as what if admins set the MDS
> > auth caps after idmap applied to the kclients ?
>
> Hi Xiubo,
>
> I thought about this too and came to the same conclusion, that UID/GID 
> based
> restriction can be applied dynamically, so detecting it on mount-time 
> helps not so much.
>
For this you please raise one PR to ceph first to support this, and in 
the PR we can discuss more for the MDS auth caps. And after the PR 
getting merged then in this patch series you need to check the 
corresponding option or flag to determine whether could the idmap 
mounting succeed.

Thanks

- Xiubo


> >
> > IMO there have 2 options: the best way is to fix this in VFS if
> > possible. Else to add one option to disable the corresponding MDS auth
> > caps in ceph if users want to support the idmap feature.
>
> Dear colleagues,
> Dear Xiubo,
>
> Let me try to summarize the previous discussions about cephfs idmapped 
> mount support.
>
> This discussion about the need of caller's UID/GID mapping is started 
> from the first
> version of this patchset in this [1] thread. Let'me quote Christian here:
> > Since the idmapping is a property of the mount and not a property of the
> > caller the caller's fs{g,u}id aren't mapped. What is mapped are the
> > inode's i{g,u}id when accessed from a particular mount.
> >
> > The fs{g,u}id are only ever mapped when a new filesystem object is
> > created. So if I have an idmapped mount that makes it so that files
> > owned by 1000 on-disk appear to be owned by uid 0 then a user with uid 0
> > creating a new file will create files with uid 1000 on-disk when going
> > through that mount. For cephfs that'd be the uid we would be sending
> > with creation requests as I've currently written it.
>
> This is a key part of this discussion. Idmapped mounts is not a way to 
> proxify
> caller's UID/GID, but idmapped mounts are designed to perform UID/GID 
> mapping
> of inode's owner's UID/GID. Yes, these concepts look really-really 
> close and from
> the first glance it looks like it's just an equivalent thing. But they 
> are not.
>
> From my understanding, if someone wants to verify caller UID/GID then 
> he should
> take an unmapped UID/GID and verify it. It's not important if the 
> caller does something
> through an idmapped mount or not, from_kuid(&init_user_ns, 
> req->r_cred->fsuid))
> literally "UID of the caller in a root user namespace". But cephfs 
> mount can be used
> from any user namespace (yes, cephfs can't be mounted in user 
> namespaces, but it
> can be inherited during CLONE_NEWNS, or used as a detached mount with 
> open_tree/move_mount).
> What I want to say by providing this example is that even now, without 
> idmapped mounts
> we have kinda close problem, that UID/GID based restriction will be 
> based on the host's (!),
> root user namespace, UID/GID-s even if the caller sits inside the user 
> namespace. And we don't care,
> right? Why it's a problem with an idmapped mounts? If someone wants to 
> control caller's UID/GID
> on the MDS side he just needs to take hosts UID/GIDs and use them in 
> permission rules. That's it.
>
> Next point is that technically idmapped mounts don't break anything, 
> if someone starts using
> idmapped mounts with UID/GID-based restrictions he will get -EACCESS. 
> Why is this a problem?
> A user will check configuration, read the clarification in the 
> documentation about idmapped mounts
> in cephfs and find a warning that these are not fully compatible 
> things right now.
>
> IMHO, there is only one real problem (which makes UID/GID-based 
> restrictions is not fully compatible with
> an idmapped mounts). Is that we have to map caller's UID/GID according 
> to a mount idmapping when we
> creating a new inode (mknod, mkdir, symlink, open(O_CREAT)). But it's 
> only because the caller's UID/GIDs are
> used as the owner's UID/GID for newly created inode. Ideally, we need 
> to have two fields in ceph request,
> one for a caller's UID/GID and another one for inode owner UID/GID. 
> But this requires cephfs protocol modification
> (yes, it's a bit painful. But global VFS changes are painful too!). As 
> Christian pointed this is a reason why
> he went this way in the first patchset version.
>
> Maybe I'm not right, but both options to properly fix that VFS API 
> changes or cephfs protocol modification
> are too expensive until we don't have a real requestors with a good 
> use case for idmapped mounts + UID/GID
> based permissions. We already have a real and good use case for 
> idmapped mounts in Cephfs for LXD/LXC.
> IMHO, it's better to move this thing forward step by step, because VFS 
> API/cephfs protocol changes will
> take a really big amount of time and it's not obvious that it's worth 
> it, moreover it's not even clear that VFS API
> change is the right way to deal with this problem. It seems to me that 
> Cephfs protocol change seems like a
> more proper way here. At the same time I fully understand that you are 
> not happy about this option.
>
> Just to conclude, we don't have any kind of cephfs degradation here, 
> all users without idmapping will not be affected,
> all users who start using mount idmappings with cephfs will be aware 
> of this limitation.
>
> [1] 
> https://lore.kernel.org/all/20220105141023.vrrbfhti5apdvkz7@wittgenstein/
>
> Kind regards,
> Alex
>
> >
> > Thanks
> >
> > - Xiubo
> >
> > > -Greg
> > >
> > >> @Greg
> > >>
> > >> For the lookup requests the idmapping couldn't get the mapped UID/GID
> > >> just like all the other requests, which is needed by the MDS 
> permission
> > >> check. Is that okay to make it disable the check for this case ? I am
> > >> afraid this will break the MDS permssions logic.
> > >>
> > >> Any idea ?
> > >>
> > >> Thanks
> > >>
> > >> - Xiubo
> > >>
> > >>
> > >>> Kind regards,
> > >>> Alex
> > >>>
> >


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-15  5:08                 ` Xiubo Li
@ 2023-06-15 11:05                   ` Aleksandr Mikhalitsyn
  2023-06-15 12:29                   ` Xiubo Li
  2023-06-24  1:36                   ` Xiubo Li
  2 siblings, 0 replies; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-15 11:05 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Thu, Jun 15, 2023 at 7:08 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 6/14/23 20:34, Aleksandr Mikhalitsyn wrote:
> > On Wed, Jun 14, 2023 at 3:53 AM Xiubo Li <xiubli@redhat.com> wrote:
> > >
> > >
> > > On 6/13/23 22:53, Gregory Farnum wrote:
> > > > On Mon, Jun 12, 2023 at 6:43 PM Xiubo Li <xiubli@redhat.com> wrote:
> > > >>
> > > >> On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote:
> > > >>> On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner
> > <brauner@kernel.org> wrote:
> > > >>>> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn
> > wrote:
> > > >>>>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li <xiubli@redhat.com> wrote:
> > > >>>>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote:
> > > >>>>>>> Dear friends,
> > > >>>>>>>
> > > >>>>>>> This patchset was originally developed by Christian Brauner
> > but I'll continue
> > > >>>>>>> to push it forward. Christian allowed me to do that :)
> > > >>>>>>>
> > > >>>>>>> This feature is already actively used/tested with LXD/LXC
> > project.
> > > >>>>>>>
> > > >>>>>>> Git tree (based on https://github.com/ceph/ceph-client.git
> > master):
> > > >>>>> Hi Xiubo!
> > > >>>>>
> > > >>>>>> Could you rebase these patches to 'testing' branch ?
> > > >>>>> Will do in -v6.
> > > >>>>>
> > > >>>>>> And you still have missed several places, for example the
> > following cases:
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>       1    269  fs/ceph/addr.c <<ceph_netfs_issue_op_inline>>
> > > >>>>>>                 req = ceph_mdsc_create_request(mdsc,
> > CEPH_MDS_OP_GETATTR,
> > > >>>>>> mode);
> > > >>>>> +
> > > >>>>>
> > > >>>>>>       2    389  fs/ceph/dir.c <<ceph_readdir>>
> > > >>>>>>                 req = ceph_mdsc_create_request(mdsc, op,
> > USE_AUTH_MDS);
> > > >>>>> +
> > > >>>>>
> > > >>>>>>       3    789  fs/ceph/dir.c <<ceph_lookup>>
> > > >>>>>>                 req = ceph_mdsc_create_request(mdsc, op,
> > USE_ANY_MDS);
> > > >>>>> We don't have an idmapping passed to lookup from the VFS
> > layer. As I
> > > >>>>> mentioned before, it's just impossible now.
> > > >>>> ->lookup() doesn't deal with idmappings and really can't
> > otherwise you
> > > >>>> risk ending up with inode aliasing which is really not
> > something you
> > > >>>> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's
> > > >>>> idmapping as inode->i_{g,u}id absolutely needs to be a
> > filesystem wide
> > > >>>> value. So better not even risk exposing the idmapping in there
> > at all.
> > > >>> Thanks for adding, Christian!
> > > >>>
> > > >>> I agree, every time when we use an idmapping we need to be
> > careful with
> > > >>> what we map. AFAIU, inode->i_{g,u}id should be based on the
> > filesystem
> > > >>> idmapping (not mount),
> > > >>> but in this case, Xiubo want's current_fs{u,g}id to be mapped
> > > >>> according to an idmapping.
> > > >>> Anyway, it's impossible at now and IMHO, until we don't have any
> > > >>> practical use case where
> > > >>> UID/GID-based path restriction is used in combination with idmapped
> > > >>> mounts it's not worth to
> > > >>> make such big changes in the VFS layer.
> > > >>>
> > > >>> May be I'm not right, but it seems like UID/GID-based path
> > restriction
> > > >>> is not a widespread
> > > >>> feature and I can hardly imagine it to be used with the container
> > > >>> workloads (for instance),
> > > >>> because it will require to always keep in sync MDS permissions
> > > >>> configuration with the
> > > >>> possible UID/GID ranges on the client. It looks like a nightmare
> > for sysadmin.
> > > >>> It is useful when cephfs is used as an external storage on the
> > host, but if you
> > > >>> share cephfs with a few containers with different user
> > namespaces idmapping...
> > > >> Hmm, while this will break the MDS permission check in cephfs then in
> > > >> lookup case. If we really couldn't support it we should make it to
> > > >> escape the check anyway or some OPs may fail and won't work as
> > expected.
> > > > I don't pretend to know the details of the VFS (or even our linux
> > > > client implementation), but I'm confused that this is apparently so
> > > > hard. It looks to me like we currently always fill in the "caller_uid"
> > > > with "from_kuid(&init_user_ns, req->r_cred->fsuid))". Is this actually
> > > > valid to begin with? If it is, why can't the uid mapping be applied on
> > > > that?
> > > >
> > > > As both the client and the server share authority over the inode's
> > > > state (including things like mode bits and owners), and need to do
> > > > permission checking, being able to tell the server the relevant actor
> > > > is inherently necessary. We also let admins restrict keys to
> > > > particular UID/GID combinations as they wish, and it's not the most
> > > > popular feature but it does get deployed. I would really expect a user
> > > > of UID mapping to be one of the *most* likely to employ such a
> > > > facility...maybe not with containers, but certainly end-user homedirs
> > > > and shared spaces.
> > > >
> > > > Disabling the MDS auth checks is really not an option. I guess we
> > > > could require any user employing idmapping to not be uid-restricted,
> > > > and set the anonymous UID (does that work, Xiubo, or was it the broken
> > > > one? In which case we'd have to default to root?). But that seems a
> > > > bit janky to me.
> > >
> > > Yeah, this also seems risky.
> > >
> > > Instead disabling the MDS auth checks there is another option, which is
> > > we can prevent  the kclient to be mounted or the idmapping to be
> > > applied. But this still have issues, such as what if admins set the MDS
> > > auth caps after idmap applied to the kclients ?
> >
> > Hi Xiubo,
> >
> > I thought about this too and came to the same conclusion, that UID/GID
> > based
> > restriction can be applied dynamically, so detecting it on mount-time
> > helps not so much.
> >
> For this you please raise one PR to ceph first to support this, and in
> the PR we can discuss more for the MDS auth caps. And after the PR
> getting merged then in this patch series you need to check the
> corresponding option or flag to determine whether could the idmap
> mounting succeed.

I'm sorry but I don't understand what we want to support here. Do we want to
add some new ceph request that allows to check if UID/GID-based
permissions are applied for
a particular ceph client user?

Thanks,
Alex

>
> Thanks
>
> - Xiubo
>
>
> > >
> > > IMO there have 2 options: the best way is to fix this in VFS if
> > > possible. Else to add one option to disable the corresponding MDS auth
> > > caps in ceph if users want to support the idmap feature.
> >
> > Dear colleagues,
> > Dear Xiubo,
> >
> > Let me try to summarize the previous discussions about cephfs idmapped
> > mount support.
> >
> > This discussion about the need of caller's UID/GID mapping is started
> > from the first
> > version of this patchset in this [1] thread. Let'me quote Christian here:
> > > Since the idmapping is a property of the mount and not a property of the
> > > caller the caller's fs{g,u}id aren't mapped. What is mapped are the
> > > inode's i{g,u}id when accessed from a particular mount.
> > >
> > > The fs{g,u}id are only ever mapped when a new filesystem object is
> > > created. So if I have an idmapped mount that makes it so that files
> > > owned by 1000 on-disk appear to be owned by uid 0 then a user with uid 0
> > > creating a new file will create files with uid 1000 on-disk when going
> > > through that mount. For cephfs that'd be the uid we would be sending
> > > with creation requests as I've currently written it.
> >
> > This is a key part of this discussion. Idmapped mounts is not a way to
> > proxify
> > caller's UID/GID, but idmapped mounts are designed to perform UID/GID
> > mapping
> > of inode's owner's UID/GID. Yes, these concepts look really-really
> > close and from
> > the first glance it looks like it's just an equivalent thing. But they
> > are not.
> >
> > From my understanding, if someone wants to verify caller UID/GID then
> > he should
> > take an unmapped UID/GID and verify it. It's not important if the
> > caller does something
> > through an idmapped mount or not, from_kuid(&init_user_ns,
> > req->r_cred->fsuid))
> > literally "UID of the caller in a root user namespace". But cephfs
> > mount can be used
> > from any user namespace (yes, cephfs can't be mounted in user
> > namespaces, but it
> > can be inherited during CLONE_NEWNS, or used as a detached mount with
> > open_tree/move_mount).
> > What I want to say by providing this example is that even now, without
> > idmapped mounts
> > we have kinda close problem, that UID/GID based restriction will be
> > based on the host's (!),
> > root user namespace, UID/GID-s even if the caller sits inside the user
> > namespace. And we don't care,
> > right? Why it's a problem with an idmapped mounts? If someone wants to
> > control caller's UID/GID
> > on the MDS side he just needs to take hosts UID/GIDs and use them in
> > permission rules. That's it.
> >
> > Next point is that technically idmapped mounts don't break anything,
> > if someone starts using
> > idmapped mounts with UID/GID-based restrictions he will get -EACCESS.
> > Why is this a problem?
> > A user will check configuration, read the clarification in the
> > documentation about idmapped mounts
> > in cephfs and find a warning that these are not fully compatible
> > things right now.
> >
> > IMHO, there is only one real problem (which makes UID/GID-based
> > restrictions is not fully compatible with
> > an idmapped mounts). Is that we have to map caller's UID/GID according
> > to a mount idmapping when we
> > creating a new inode (mknod, mkdir, symlink, open(O_CREAT)). But it's
> > only because the caller's UID/GIDs are
> > used as the owner's UID/GID for newly created inode. Ideally, we need
> > to have two fields in ceph request,
> > one for a caller's UID/GID and another one for inode owner UID/GID.
> > But this requires cephfs protocol modification
> > (yes, it's a bit painful. But global VFS changes are painful too!). As
> > Christian pointed this is a reason why
> > he went this way in the first patchset version.
> >
> > Maybe I'm not right, but both options to properly fix that VFS API
> > changes or cephfs protocol modification
> > are too expensive until we don't have a real requestors with a good
> > use case for idmapped mounts + UID/GID
> > based permissions. We already have a real and good use case for
> > idmapped mounts in Cephfs for LXD/LXC.
> > IMHO, it's better to move this thing forward step by step, because VFS
> > API/cephfs protocol changes will
> > take a really big amount of time and it's not obvious that it's worth
> > it, moreover it's not even clear that VFS API
> > change is the right way to deal with this problem. It seems to me that
> > Cephfs protocol change seems like a
> > more proper way here. At the same time I fully understand that you are
> > not happy about this option.
> >
> > Just to conclude, we don't have any kind of cephfs degradation here,
> > all users without idmapping will not be affected,
> > all users who start using mount idmappings with cephfs will be aware
> > of this limitation.
> >
> > [1]
> > https://lore.kernel.org/all/20220105141023.vrrbfhti5apdvkz7@wittgenstein/
> >
> > Kind regards,
> > Alex
> >
> > >
> > > Thanks
> > >
> > > - Xiubo
> > >
> > > > -Greg
> > > >
> > > >> @Greg
> > > >>
> > > >> For the lookup requests the idmapping couldn't get the mapped UID/GID
> > > >> just like all the other requests, which is needed by the MDS
> > permission
> > > >> check. Is that okay to make it disable the check for this case ? I am
> > > >> afraid this will break the MDS permssions logic.
> > > >>
> > > >> Any idea ?
> > > >>
> > > >> Thanks
> > > >>
> > > >> - Xiubo
> > > >>
> > > >>
> > > >>> Kind regards,
> > > >>> Alex
> > > >>>
> > >
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-15  5:08                 ` Xiubo Li
  2023-06-15 11:05                   ` Aleksandr Mikhalitsyn
@ 2023-06-15 12:29                   ` Xiubo Li
  2023-06-15 12:54                     ` Aleksandr Mikhalitsyn
  2023-06-24  1:36                   ` Xiubo Li
  2 siblings, 1 reply; 47+ messages in thread
From: Xiubo Li @ 2023-06-15 12:29 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

[...]

 > > >
 > > > I thought about this too and came to the same conclusion, that 
UID/GID
 > > > based
 > > > restriction can be applied dynamically, so detecting it on mount-time
 > > > helps not so much.
 > > >
 > > For this you please raise one PR to ceph first to support this, and in
 > > the PR we can discuss more for the MDS auth caps. And after the PR
 > > getting merged then in this patch series you need to check the
 > > corresponding option or flag to determine whether could the idmap
 > > mounting succeed.
 >
 > I'm sorry but I don't understand what we want to support here. Do we 
want to
 > add some new ceph request that allows to check if UID/GID-based
 > permissions are applied for
 > a particular ceph client user?

IMO we should prevent users to set UID/GID-based MDS auth caps from ceph 
side. And users should know what has happened.

Once users want to support the idmap mounts they should know that the 
MDS auth caps won't work anymore.

Thanks

- Xiubo


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-15 12:29                   ` Xiubo Li
@ 2023-06-15 12:54                     ` Aleksandr Mikhalitsyn
  2023-06-21 16:55                       ` Aleksandr Mikhalitsyn
  2023-06-26  1:04                       ` Xiubo Li
  0 siblings, 2 replies; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-15 12:54 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Thu, Jun 15, 2023 at 2:29 PM Xiubo Li <xiubli@redhat.com> wrote:
>
> [...]
>
>  > > >
>  > > > I thought about this too and came to the same conclusion, that
> UID/GID
>  > > > based
>  > > > restriction can be applied dynamically, so detecting it on mount-time
>  > > > helps not so much.
>  > > >
>  > > For this you please raise one PR to ceph first to support this, and in
>  > > the PR we can discuss more for the MDS auth caps. And after the PR
>  > > getting merged then in this patch series you need to check the
>  > > corresponding option or flag to determine whether could the idmap
>  > > mounting succeed.
>  >
>  > I'm sorry but I don't understand what we want to support here. Do we
> want to
>  > add some new ceph request that allows to check if UID/GID-based
>  > permissions are applied for
>  > a particular ceph client user?
>
> IMO we should prevent users to set UID/GID-based MDS auth caps from ceph
> side. And users should know what has happened.

ok, we want to restrict setting of UID/GID-based permissions if there is an
idmapped mount on the client. IMHO, idmapping mounts is truly a
client-side feature
and server modification looks a bit strange to me.

>
> Once users want to support the idmap mounts they should know that the
> MDS auth caps won't work anymore.

They will work, but permission rule configuration should include
non-mapped UID/GID-s.
As I mentioned here [1] it's already the case even without mount idmappings.

It would be great to discuss this thing as a concept and synchronize
our understanding of this
before going into modification of a server side.

[1] https://lore.kernel.org/lkml/CAEivzxcBBJV6DOGzy5S7=TUjrXZfVaGaJX5z7WFzYq1w4MdtiA@mail.gmail.com/

Kind regards,
Alex

>
> Thanks
>
> - Xiubo
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-15 12:54                     ` Aleksandr Mikhalitsyn
@ 2023-06-21 16:55                       ` Aleksandr Mikhalitsyn
  2023-06-26  1:04                       ` Xiubo Li
  1 sibling, 0 replies; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-21 16:55 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Thu, Jun 15, 2023 at 2:54 PM Aleksandr Mikhalitsyn
<aleksandr.mikhalitsyn@canonical.com> wrote:
>
> On Thu, Jun 15, 2023 at 2:29 PM Xiubo Li <xiubli@redhat.com> wrote:
> >
> > [...]
> >
> >  > > >
> >  > > > I thought about this too and came to the same conclusion, that
> > UID/GID
> >  > > > based
> >  > > > restriction can be applied dynamically, so detecting it on mount-time
> >  > > > helps not so much.
> >  > > >
> >  > > For this you please raise one PR to ceph first to support this, and in
> >  > > the PR we can discuss more for the MDS auth caps. And after the PR
> >  > > getting merged then in this patch series you need to check the
> >  > > corresponding option or flag to determine whether could the idmap
> >  > > mounting succeed.
> >  >
> >  > I'm sorry but I don't understand what we want to support here. Do we
> > want to
> >  > add some new ceph request that allows to check if UID/GID-based
> >  > permissions are applied for
> >  > a particular ceph client user?
> >
> > IMO we should prevent users to set UID/GID-based MDS auth caps from ceph
> > side. And users should know what has happened.
>
> ok, we want to restrict setting of UID/GID-based permissions if there is an
> idmapped mount on the client. IMHO, idmapping mounts is truly a
> client-side feature
> and server modification looks a bit strange to me.
>
> >
> > Once users want to support the idmap mounts they should know that the
> > MDS auth caps won't work anymore.
>
> They will work, but permission rule configuration should include
> non-mapped UID/GID-s.
> As I mentioned here [1] it's already the case even without mount idmappings.
>
> It would be great to discuss this thing as a concept and synchronize
> our understanding of this
> before going into modification of a server side.

Hi everyone,

I've spent some extra time analyzing this issue with UID/GID-based
path restriction feature and idmapped mounts
one more time and am still fully sure that we have two ways here:
I. Extend Cephfs protocol (add new fields to request arguments in the
"union ceph_mds_request_args")
There should be 2 new fields for the file/directory owner's UID and
GID respectively. With the help of these
new fields, we will be able to split the permission check logic (that
is based on the caller's UID/GID and should not be affected by
the mounts idmapping at all!) and file owner concept, which involves
mounts' idmapping.
II. ignore this issue as non-critical, because:
- idmapped mounts can be created only by privileged users
(CAP_SYS_ADMIN in the superblock owner's user namespace (currently,
it's always the initial user namespace!))
- the surface of the problem is really small (combination of idmapped
mount + UID/GID path-based restriction)
- problem *can* be workarounded by appropriate permission
configuration (UID/GID permissions should be configured to
include both the mount's idmapping UIDs/GIDs and the host ones).

Before that I've highlighted some existing problems of this UID/GID
path-based restriction feature:
- [kernel client] UID/GIDs are sent to the server always from the
initial user namespace (while the caller can be from inside the
container with a non-initial user namespace)
- [fuse client] UID/GIDs are always mapped to the fuse mount's
superblock user namespace
(https://github.com/ceph/ceph-client/blob/409e873ea3c1fd3079909718bbeb06ac1ec7f38b/fs/fuse/dev.c#L138)
It means that we already have analogical inconsistency between clients
(userspace one and kernel).
- [kernel client] We always take current user credentials instead of
using (struct file)->f_cred as it has usually done for other
filesystems

Please understand me in the right way, I'm not trying to say that we
need to be lazy and ignore the issue at all, but I'm
just trying to say that this issue is not local and is not caused by
an idmapped mounts, but it there is something to do on the cephfs
side,
we need to extend protocol and it's not obvious that it is worth it.
My understanding is that we need to clarify this limitation in
cephfs kernel client documentation and explain how to configure
UID/GID path-based permissions with idmapped mounts to work around
this.
And if we get requests from our users that this is interesting to
someone to support it in the right way then we can do all of this
crazy stuff
by extending ceph protocol. Btw, I've checked when "union
ceph_mds_request_args" was extended last time. It was 6 (!) years ago
:)

Kind regards,
Alex

>
> [1] https://lore.kernel.org/lkml/CAEivzxcBBJV6DOGzy5S7=TUjrXZfVaGaJX5z7WFzYq1w4MdtiA@mail.gmail.com/
>
> Kind regards,
> Alex
>
> >
> > Thanks
> >
> > - Xiubo
> >

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-15  5:08                 ` Xiubo Li
  2023-06-15 11:05                   ` Aleksandr Mikhalitsyn
  2023-06-15 12:29                   ` Xiubo Li
@ 2023-06-24  1:36                   ` Xiubo Li
  2023-06-24  7:11                     ` Aleksandr Mikhalitsyn
  2 siblings, 1 reply; 47+ messages in thread
From: Xiubo Li @ 2023-06-24  1:36 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

[...]

 > > >
 > > > I thought about this too and came to the same conclusion, that 
UID/GID
 > > > based
 > > > restriction can be applied dynamically, so detecting it on mount-time
 > > > helps not so much.
 > > >
 > > For this you please raise one PR to ceph first to support this, and in
 > > the PR we can discuss more for the MDS auth caps. And after the PR
 > > getting merged then in this patch series you need to check the
 > > corresponding option or flag to determine whether could the idmap
 > > mounting succeed.
 >
 > I'm sorry but I don't understand what we want to support here. Do we 
want to
 > add some new ceph request that allows to check if UID/GID-based
 > permissions are applied for
 > a particular ceph client user?

IMO we should prevent user to set UID/GID-based permisions caps from 
ceph side.

As I know currently there is no way to prevent users to set MDS auth 
caps, IMO in ceph side at least we need one flag or option to disable 
this once users want this fs cluster sever for idmap mounts use case.

Thanks

- Xiubo


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-24  1:36                   ` Xiubo Li
@ 2023-06-24  7:11                     ` Aleksandr Mikhalitsyn
  2023-06-26  2:12                       ` Xiubo Li
  0 siblings, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-24  7:11 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li <xiubli@redhat.com> wrote:
>
> [...]
>
>  > > >
>  > > > I thought about this too and came to the same conclusion, that
> UID/GID
>  > > > based
>  > > > restriction can be applied dynamically, so detecting it on mount-time
>  > > > helps not so much.
>  > > >
>  > > For this you please raise one PR to ceph first to support this, and in
>  > > the PR we can discuss more for the MDS auth caps. And after the PR
>  > > getting merged then in this patch series you need to check the
>  > > corresponding option or flag to determine whether could the idmap
>  > > mounting succeed.
>  >
>  > I'm sorry but I don't understand what we want to support here. Do we
> want to
>  > add some new ceph request that allows to check if UID/GID-based
>  > permissions are applied for
>  > a particular ceph client user?
>
> IMO we should prevent user to set UID/GID-based permisions caps from
> ceph side.
>
> As I know currently there is no way to prevent users to set MDS auth
> caps, IMO in ceph side at least we need one flag or option to disable
> this once users want this fs cluster sever for idmap mounts use case.

How this should be visible from the user side? We introducing a new
kernel client mount option,
like "nomdscaps", then pass flag to the MDS and MDS should check that
MDS auth permissions
are not applied (on the mount time) and prevent them from being
applied later while session is active. Like that?

At the same time I'm thinking about protocol extension that adds 2
additional fields for UID/GID. This will allow to correctly
handle everything. I wanted to avoid any changes to the protocol or
server-side things. But if we want to change MDS side,
maybe it's better then to go this way?

Thanks,
Alex

>
> Thanks
>
> - Xiubo
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-15 12:54                     ` Aleksandr Mikhalitsyn
  2023-06-21 16:55                       ` Aleksandr Mikhalitsyn
@ 2023-06-26  1:04                       ` Xiubo Li
  1 sibling, 0 replies; 47+ messages in thread
From: Xiubo Li @ 2023-06-26  1:04 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel


On 6/15/23 20:54, Aleksandr Mikhalitsyn wrote:
> On Thu, Jun 15, 2023 at 2:29 PM Xiubo Li <xiubli@redhat.com> wrote:
>> [...]
>>
>>   > > >
>>   > > > I thought about this too and came to the same conclusion, that
>> UID/GID
>>   > > > based
>>   > > > restriction can be applied dynamically, so detecting it on mount-time
>>   > > > helps not so much.
>>   > > >
>>   > > For this you please raise one PR to ceph first to support this, and in
>>   > > the PR we can discuss more for the MDS auth caps. And after the PR
>>   > > getting merged then in this patch series you need to check the
>>   > > corresponding option or flag to determine whether could the idmap
>>   > > mounting succeed.
>>   >
>>   > I'm sorry but I don't understand what we want to support here. Do we
>> want to
>>   > add some new ceph request that allows to check if UID/GID-based
>>   > permissions are applied for
>>   > a particular ceph client user?
>>
>> IMO we should prevent users to set UID/GID-based MDS auth caps from ceph
>> side. And users should know what has happened.
> ok, we want to restrict setting of UID/GID-based permissions if there is an
> idmapped mount on the client. IMHO, idmapping mounts is truly a
> client-side feature
> and server modification looks a bit strange to me.

Yeah, agree.

But without fixing the lookup issue in kclient side it will be buggy and 
may make some tests fail too.

We need to support this more smoothly.

Thanks

- Xiubo

>> Once users want to support the idmap mounts they should know that the
>> MDS auth caps won't work anymore.
> They will work, but permission rule configuration should include
> non-mapped UID/GID-s.
> As I mentioned here [1] it's already the case even without mount idmappings.
>
> It would be great to discuss this thing as a concept and synchronize
> our understanding of this
> before going into modification of a server side.
>
> [1] https://lore.kernel.org/lkml/CAEivzxcBBJV6DOGzy5S7=TUjrXZfVaGaJX5z7WFzYq1w4MdtiA@mail.gmail.com/
>
> Kind regards,
> Alex
>
>> Thanks
>>
>> - Xiubo
>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-24  7:11                     ` Aleksandr Mikhalitsyn
@ 2023-06-26  2:12                       ` Xiubo Li
  2023-06-26 11:23                         ` Aleksandr Mikhalitsyn
  0 siblings, 1 reply; 47+ messages in thread
From: Xiubo Li @ 2023-06-26  2:12 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel


On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
> On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li <xiubli@redhat.com> wrote:
>> [...]
>>
>>   > > >
>>   > > > I thought about this too and came to the same conclusion, that
>> UID/GID
>>   > > > based
>>   > > > restriction can be applied dynamically, so detecting it on mount-time
>>   > > > helps not so much.
>>   > > >
>>   > > For this you please raise one PR to ceph first to support this, and in
>>   > > the PR we can discuss more for the MDS auth caps. And after the PR
>>   > > getting merged then in this patch series you need to check the
>>   > > corresponding option or flag to determine whether could the idmap
>>   > > mounting succeed.
>>   >
>>   > I'm sorry but I don't understand what we want to support here. Do we
>> want to
>>   > add some new ceph request that allows to check if UID/GID-based
>>   > permissions are applied for
>>   > a particular ceph client user?
>>
>> IMO we should prevent user to set UID/GID-based permisions caps from
>> ceph side.
>>
>> As I know currently there is no way to prevent users to set MDS auth
>> caps, IMO in ceph side at least we need one flag or option to disable
>> this once users want this fs cluster sever for idmap mounts use case.
> How this should be visible from the user side? We introducing a new
> kernel client mount option,
> like "nomdscaps", then pass flag to the MDS and MDS should check that
> MDS auth permissions
> are not applied (on the mount time) and prevent them from being
> applied later while session is active. Like that?
>
> At the same time I'm thinking about protocol extension that adds 2
> additional fields for UID/GID. This will allow to correctly
> handle everything. I wanted to avoid any changes to the protocol or
> server-side things. But if we want to change MDS side,
> maybe it's better then to go this way?

There is another way:

For each client it will have a dedicated client auth caps, something like:

client.foo
   key: *key*
   caps: [mds] allow r, allow rw path=/bar
   caps: [mon] allow r
   caps: [osd] allow rw tag cephfs data=cephfs_a

When mounting this client with idmap enabled, then we can just check the 
above [mds] caps, if there has any UID/GID based permissions set, then 
fail the mounting.

That means this kind client couldn't be mounted with idmap enabled.

Also we need to make sure that once there is a mount with idmap enabled, 
the corresponding client caps couldn't be append the UID/GID based 
permissions. This need a patch in ceph anyway IMO.

Thanks

- Xiubo





>
> Thanks,
> Alex
>
>> Thanks
>>
>> - Xiubo
>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-26  2:12                       ` Xiubo Li
@ 2023-06-26 11:23                         ` Aleksandr Mikhalitsyn
  2023-06-26 11:49                           ` Aleksandr Mikhalitsyn
  2023-07-04  1:08                           ` Xiubo Li
  0 siblings, 2 replies; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-26 11:23 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Mon, Jun 26, 2023 at 4:12 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
> > On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li <xiubli@redhat.com> wrote:
> >> [...]
> >>
> >>   > > >
> >>   > > > I thought about this too and came to the same conclusion, that
> >> UID/GID
> >>   > > > based
> >>   > > > restriction can be applied dynamically, so detecting it on mount-time
> >>   > > > helps not so much.
> >>   > > >
> >>   > > For this you please raise one PR to ceph first to support this, and in
> >>   > > the PR we can discuss more for the MDS auth caps. And after the PR
> >>   > > getting merged then in this patch series you need to check the
> >>   > > corresponding option or flag to determine whether could the idmap
> >>   > > mounting succeed.
> >>   >
> >>   > I'm sorry but I don't understand what we want to support here. Do we
> >> want to
> >>   > add some new ceph request that allows to check if UID/GID-based
> >>   > permissions are applied for
> >>   > a particular ceph client user?
> >>
> >> IMO we should prevent user to set UID/GID-based permisions caps from
> >> ceph side.
> >>
> >> As I know currently there is no way to prevent users to set MDS auth
> >> caps, IMO in ceph side at least we need one flag or option to disable
> >> this once users want this fs cluster sever for idmap mounts use case.
> > How this should be visible from the user side? We introducing a new
> > kernel client mount option,
> > like "nomdscaps", then pass flag to the MDS and MDS should check that
> > MDS auth permissions
> > are not applied (on the mount time) and prevent them from being
> > applied later while session is active. Like that?
> >
> > At the same time I'm thinking about protocol extension that adds 2
> > additional fields for UID/GID. This will allow to correctly
> > handle everything. I wanted to avoid any changes to the protocol or
> > server-side things. But if we want to change MDS side,
> > maybe it's better then to go this way?

Hi Xiubo,

>
> There is another way:
>
> For each client it will have a dedicated client auth caps, something like:
>
> client.foo
>    key: *key*
>    caps: [mds] allow r, allow rw path=/bar
>    caps: [mon] allow r
>    caps: [osd] allow rw tag cephfs data=cephfs_a

Do we have any infrastructure to get this caps list on the client side
right now?
(I've taken a quick look through the code and can't find anything
related to this.)

>
> When mounting this client with idmap enabled, then we can just check the
> above [mds] caps, if there has any UID/GID based permissions set, then
> fail the mounting.

understood

>
> That means this kind client couldn't be mounted with idmap enabled.
>
> Also we need to make sure that once there is a mount with idmap enabled,
> the corresponding client caps couldn't be append the UID/GID based
> permissions. This need a patch in ceph anyway IMO.

So, yeah we will need to effectively block cephx permission changes if
there is a client mounted with
an active idmapped mount. Sounds as something that require massive
changes on the server side.

At the same time this will just block users from using idmapped mounts
along with UID/GID restrictions.

If you want me to change server-side anyways, isn't it better just to
extend cephfs protocol to properly
handle UID/GIDs with idmapped mounts? (It was originally proposed by Christian.)
What we need to do here is to add a separate UID/GID fields for ceph
requests those are creating a new inodes
(like mknod, symlink, etc).

Kind regards,
Alex

>
> Thanks
>
> - Xiubo
>
>
>
>
>
> >
> > Thanks,
> > Alex
> >
> >> Thanks
> >>
> >> - Xiubo
> >>
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-26 11:23                         ` Aleksandr Mikhalitsyn
@ 2023-06-26 11:49                           ` Aleksandr Mikhalitsyn
  2023-07-04  1:10                             ` Xiubo Li
  2023-07-04  1:08                           ` Xiubo Li
  1 sibling, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-06-26 11:49 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Mon, Jun 26, 2023 at 1:23 PM Aleksandr Mikhalitsyn
<aleksandr.mikhalitsyn@canonical.com> wrote:
>
> On Mon, Jun 26, 2023 at 4:12 AM Xiubo Li <xiubli@redhat.com> wrote:
> >
> >
> > On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
> > > On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li <xiubli@redhat.com> wrote:
> > >> [...]
> > >>
> > >>   > > >
> > >>   > > > I thought about this too and came to the same conclusion, that
> > >> UID/GID
> > >>   > > > based
> > >>   > > > restriction can be applied dynamically, so detecting it on mount-time
> > >>   > > > helps not so much.
> > >>   > > >
> > >>   > > For this you please raise one PR to ceph first to support this, and in
> > >>   > > the PR we can discuss more for the MDS auth caps. And after the PR
> > >>   > > getting merged then in this patch series you need to check the
> > >>   > > corresponding option or flag to determine whether could the idmap
> > >>   > > mounting succeed.
> > >>   >
> > >>   > I'm sorry but I don't understand what we want to support here. Do we
> > >> want to
> > >>   > add some new ceph request that allows to check if UID/GID-based
> > >>   > permissions are applied for
> > >>   > a particular ceph client user?
> > >>
> > >> IMO we should prevent user to set UID/GID-based permisions caps from
> > >> ceph side.
> > >>
> > >> As I know currently there is no way to prevent users to set MDS auth
> > >> caps, IMO in ceph side at least we need one flag or option to disable
> > >> this once users want this fs cluster sever for idmap mounts use case.
> > > How this should be visible from the user side? We introducing a new
> > > kernel client mount option,
> > > like "nomdscaps", then pass flag to the MDS and MDS should check that
> > > MDS auth permissions
> > > are not applied (on the mount time) and prevent them from being
> > > applied later while session is active. Like that?
> > >
> > > At the same time I'm thinking about protocol extension that adds 2
> > > additional fields for UID/GID. This will allow to correctly
> > > handle everything. I wanted to avoid any changes to the protocol or
> > > server-side things. But if we want to change MDS side,
> > > maybe it's better then to go this way?
>
> Hi Xiubo,
>
> >
> > There is another way:
> >
> > For each client it will have a dedicated client auth caps, something like:
> >
> > client.foo
> >    key: *key*
> >    caps: [mds] allow r, allow rw path=/bar
> >    caps: [mon] allow r
> >    caps: [osd] allow rw tag cephfs data=cephfs_a
>
> Do we have any infrastructure to get this caps list on the client side
> right now?
> (I've taken a quick look through the code and can't find anything
> related to this.)

I've found your PR that looks related https://github.com/ceph/ceph/pull/48027

>
> >
> > When mounting this client with idmap enabled, then we can just check the
> > above [mds] caps, if there has any UID/GID based permissions set, then
> > fail the mounting.
>
> understood
>
> >
> > That means this kind client couldn't be mounted with idmap enabled.
> >
> > Also we need to make sure that once there is a mount with idmap enabled,
> > the corresponding client caps couldn't be append the UID/GID based
> > permissions. This need a patch in ceph anyway IMO.
>
> So, yeah we will need to effectively block cephx permission changes if
> there is a client mounted with
> an active idmapped mount. Sounds as something that require massive
> changes on the server side.
>
> At the same time this will just block users from using idmapped mounts
> along with UID/GID restrictions.
>
> If you want me to change server-side anyways, isn't it better just to
> extend cephfs protocol to properly
> handle UID/GIDs with idmapped mounts? (It was originally proposed by Christian.)
> What we need to do here is to add a separate UID/GID fields for ceph
> requests those are creating a new inodes
> (like mknod, symlink, etc).
>
> Kind regards,
> Alex
>
> >
> > Thanks
> >
> > - Xiubo
> >
> >
> >
> >
> >
> > >
> > > Thanks,
> > > Alex
> > >
> > >> Thanks
> > >>
> > >> - Xiubo
> > >>
> >

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-26 11:23                         ` Aleksandr Mikhalitsyn
  2023-06-26 11:49                           ` Aleksandr Mikhalitsyn
@ 2023-07-04  1:08                           ` Xiubo Li
  2023-07-14 12:57                             ` Aleksandr Mikhalitsyn
  1 sibling, 1 reply; 47+ messages in thread
From: Xiubo Li @ 2023-07-04  1:08 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

Sorry, not sure, why my last reply wasn't sent out.

Do it again.


On 6/26/23 19:23, Aleksandr Mikhalitsyn wrote:
> On Mon, Jun 26, 2023 at 4:12 AM Xiubo Li<xiubli@redhat.com>  wrote:
>> On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
>>> On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li<xiubli@redhat.com>  wrote:
>>>> [...]
>>>>
>>>>    > > >
>>>>    > > > I thought about this too and came to the same conclusion, that
>>>> UID/GID
>>>>    > > > based
>>>>    > > > restriction can be applied dynamically, so detecting it on mount-time
>>>>    > > > helps not so much.
>>>>    > > >
>>>>    > > For this you please raise one PR to ceph first to support this, and in
>>>>    > > the PR we can discuss more for the MDS auth caps. And after the PR
>>>>    > > getting merged then in this patch series you need to check the
>>>>    > > corresponding option or flag to determine whether could the idmap
>>>>    > > mounting succeed.
>>>>    >
>>>>    > I'm sorry but I don't understand what we want to support here. Do we
>>>> want to
>>>>    > add some new ceph request that allows to check if UID/GID-based
>>>>    > permissions are applied for
>>>>    > a particular ceph client user?
>>>>
>>>> IMO we should prevent user to set UID/GID-based permisions caps from
>>>> ceph side.
>>>>
>>>> As I know currently there is no way to prevent users to set MDS auth
>>>> caps, IMO in ceph side at least we need one flag or option to disable
>>>> this once users want this fs cluster sever for idmap mounts use case.
>>> How this should be visible from the user side? We introducing a new
>>> kernel client mount option,
>>> like "nomdscaps", then pass flag to the MDS and MDS should check that
>>> MDS auth permissions
>>> are not applied (on the mount time) and prevent them from being
>>> applied later while session is active. Like that?
>>>
>>> At the same time I'm thinking about protocol extension that adds 2
>>> additional fields for UID/GID. This will allow to correctly
>>> handle everything. I wanted to avoid any changes to the protocol or
>>> server-side things. But if we want to change MDS side,
>>> maybe it's better then to go this way?
> Hi Xiubo,
>
>> There is another way:
>>
>> For each client it will have a dedicated client auth caps, something like:
>>
>> client.foo
>>     key: *key*
>>     caps: [mds] allow r, allow rw path=/bar
>>     caps: [mon] allow r
>>     caps: [osd] allow rw tag cephfs data=cephfs_a
> Do we have any infrastructure to get this caps list on the client side
> right now?
> (I've taken a quick look through the code and can't find anything
> related to this.)

I am afraid there is no.

But just after the following ceph PR gets merged it will be easy to do this:

https://github.com/ceph/ceph/pull/48027

This is still under testing.

>> When mounting this client with idmap enabled, then we can just check the
>> above [mds] caps, if there has any UID/GID based permissions set, then
>> fail the mounting.
> understood
>
>> That means this kind client couldn't be mounted with idmap enabled.
>>
>> Also we need to make sure that once there is a mount with idmap enabled,
>> the corresponding client caps couldn't be append the UID/GID based
>> permissions. This need a patch in ceph anyway IMO.
> So, yeah we will need to effectively block cephx permission changes if
> there is a client mounted with
> an active idmapped mount. Sounds as something that require massive
> changes on the server side.

Maybe no need much, it should be simple IMO. But I am not 100% sure.

> At the same time this will just block users from using idmapped mounts
> along with UID/GID restrictions.
>
> If you want me to change server-side anyways, isn't it better just to
> extend cephfs protocol to properly
> handle UID/GIDs with idmapped mounts? (It was originally proposed by Christian.)
> What we need to do here is to add a separate UID/GID fields for ceph
> requests those are creating a new inodes
> (like mknod, symlink, etc).

BTW, could you explain it more ? How could this resolve the issue we are 
discussing here ?

Thanks

- Xiubo


>
> Kind regards,
> Alex
>
>> Thanks
>>
>> - Xiubo
>>
>>
>>
>>
>>
>>> Thanks,
>>> Alex
>>>
>>>> Thanks
>>>>
>>>> - Xiubo
>>>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-06-26 11:49                           ` Aleksandr Mikhalitsyn
@ 2023-07-04  1:10                             ` Xiubo Li
  0 siblings, 0 replies; 47+ messages in thread
From: Xiubo Li @ 2023-07-04  1:10 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel


On 6/26/23 19:49, Aleksandr Mikhalitsyn wrote:
> On Mon, Jun 26, 2023 at 1:23 PM Aleksandr Mikhalitsyn
> <aleksandr.mikhalitsyn@canonical.com> wrote:
>> On Mon, Jun 26, 2023 at 4:12 AM Xiubo Li <xiubli@redhat.com> wrote:
>>>
>>> On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
>>>> On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li <xiubli@redhat.com> wrote:
>>>>> [...]
>>>>>
>>>>>    > > >
>>>>>    > > > I thought about this too and came to the same conclusion, that
>>>>> UID/GID
>>>>>    > > > based
>>>>>    > > > restriction can be applied dynamically, so detecting it on mount-time
>>>>>    > > > helps not so much.
>>>>>    > > >
>>>>>    > > For this you please raise one PR to ceph first to support this, and in
>>>>>    > > the PR we can discuss more for the MDS auth caps. And after the PR
>>>>>    > > getting merged then in this patch series you need to check the
>>>>>    > > corresponding option or flag to determine whether could the idmap
>>>>>    > > mounting succeed.
>>>>>    >
>>>>>    > I'm sorry but I don't understand what we want to support here. Do we
>>>>> want to
>>>>>    > add some new ceph request that allows to check if UID/GID-based
>>>>>    > permissions are applied for
>>>>>    > a particular ceph client user?
>>>>>
>>>>> IMO we should prevent user to set UID/GID-based permisions caps from
>>>>> ceph side.
>>>>>
>>>>> As I know currently there is no way to prevent users to set MDS auth
>>>>> caps, IMO in ceph side at least we need one flag or option to disable
>>>>> this once users want this fs cluster sever for idmap mounts use case.
>>>> How this should be visible from the user side? We introducing a new
>>>> kernel client mount option,
>>>> like "nomdscaps", then pass flag to the MDS and MDS should check that
>>>> MDS auth permissions
>>>> are not applied (on the mount time) and prevent them from being
>>>> applied later while session is active. Like that?
>>>>
>>>> At the same time I'm thinking about protocol extension that adds 2
>>>> additional fields for UID/GID. This will allow to correctly
>>>> handle everything. I wanted to avoid any changes to the protocol or
>>>> server-side things. But if we want to change MDS side,
>>>> maybe it's better then to go this way?
>> Hi Xiubo,
>>
>>> There is another way:
>>>
>>> For each client it will have a dedicated client auth caps, something like:
>>>
>>> client.foo
>>>     key: *key*
>>>     caps: [mds] allow r, allow rw path=/bar
>>>     caps: [mon] allow r
>>>     caps: [osd] allow rw tag cephfs data=cephfs_a
>> Do we have any infrastructure to get this caps list on the client side
>> right now?
>> (I've taken a quick look through the code and can't find anything
>> related to this.)
> I've found your PR that looks related https://github.com/ceph/ceph/pull/48027

Yeah, after this we need to do some extra work in kclient and then it 
will be easy to check the caps I think.

Thanks

- Xiubo

>>> When mounting this client with idmap enabled, then we can just check the
>>> above [mds] caps, if there has any UID/GID based permissions set, then
>>> fail the mounting.
>> understood
>>
>>> That means this kind client couldn't be mounted with idmap enabled.
>>>
>>> Also we need to make sure that once there is a mount with idmap enabled,
>>> the corresponding client caps couldn't be append the UID/GID based
>>> permissions. This need a patch in ceph anyway IMO.
>> So, yeah we will need to effectively block cephx permission changes if
>> there is a client mounted with
>> an active idmapped mount. Sounds as something that require massive
>> changes on the server side.
>>
>> At the same time this will just block users from using idmapped mounts
>> along with UID/GID restrictions.
>>
>> If you want me to change server-side anyways, isn't it better just to
>> extend cephfs protocol to properly
>> handle UID/GIDs with idmapped mounts? (It was originally proposed by Christian.)
>> What we need to do here is to add a separate UID/GID fields for ceph
>> requests those are creating a new inodes
>> (like mknod, symlink, etc).
>>
>> Kind regards,
>> Alex
>>
>>> Thanks
>>>
>>> - Xiubo
>>>
>>>
>>>
>>>
>>>
>>>> Thanks,
>>>> Alex
>>>>
>>>>> Thanks
>>>>>
>>>>> - Xiubo
>>>>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-07-04  1:08                           ` Xiubo Li
@ 2023-07-14 12:57                             ` Aleksandr Mikhalitsyn
  2023-07-18  1:44                               ` Xiubo Li
  0 siblings, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-07-14 12:57 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Tue, Jul 4, 2023 at 3:09 AM Xiubo Li <xiubli@redhat.com> wrote:
>
> Sorry, not sure, why my last reply wasn't sent out.
>
> Do it again.
>
>
> On 6/26/23 19:23, Aleksandr Mikhalitsyn wrote:
> > On Mon, Jun 26, 2023 at 4:12 AM Xiubo Li<xiubli@redhat.com>  wrote:
> >> On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
> >>> On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li<xiubli@redhat.com>  wrote:
> >>>> [...]
> >>>>
> >>>>    > > >
> >>>>    > > > I thought about this too and came to the same conclusion, that
> >>>> UID/GID
> >>>>    > > > based
> >>>>    > > > restriction can be applied dynamically, so detecting it on mount-time
> >>>>    > > > helps not so much.
> >>>>    > > >
> >>>>    > > For this you please raise one PR to ceph first to support this, and in
> >>>>    > > the PR we can discuss more for the MDS auth caps. And after the PR
> >>>>    > > getting merged then in this patch series you need to check the
> >>>>    > > corresponding option or flag to determine whether could the idmap
> >>>>    > > mounting succeed.
> >>>>    >
> >>>>    > I'm sorry but I don't understand what we want to support here. Do we
> >>>> want to
> >>>>    > add some new ceph request that allows to check if UID/GID-based
> >>>>    > permissions are applied for
> >>>>    > a particular ceph client user?
> >>>>
> >>>> IMO we should prevent user to set UID/GID-based permisions caps from
> >>>> ceph side.
> >>>>
> >>>> As I know currently there is no way to prevent users to set MDS auth
> >>>> caps, IMO in ceph side at least we need one flag or option to disable
> >>>> this once users want this fs cluster sever for idmap mounts use case.
> >>> How this should be visible from the user side? We introducing a new
> >>> kernel client mount option,
> >>> like "nomdscaps", then pass flag to the MDS and MDS should check that
> >>> MDS auth permissions
> >>> are not applied (on the mount time) and prevent them from being
> >>> applied later while session is active. Like that?
> >>>
> >>> At the same time I'm thinking about protocol extension that adds 2
> >>> additional fields for UID/GID. This will allow to correctly
> >>> handle everything. I wanted to avoid any changes to the protocol or
> >>> server-side things. But if we want to change MDS side,
> >>> maybe it's better then to go this way?
> > Hi Xiubo,
> >
> >> There is another way:
> >>
> >> For each client it will have a dedicated client auth caps, something like:
> >>
> >> client.foo
> >>     key: *key*
> >>     caps: [mds] allow r, allow rw path=/bar
> >>     caps: [mon] allow r
> >>     caps: [osd] allow rw tag cephfs data=cephfs_a
> > Do we have any infrastructure to get this caps list on the client side
> > right now?
> > (I've taken a quick look through the code and can't find anything
> > related to this.)
>
> I am afraid there is no.
>
> But just after the following ceph PR gets merged it will be easy to do this:
>
> https://github.com/ceph/ceph/pull/48027
>
> This is still under testing.
>
> >> When mounting this client with idmap enabled, then we can just check the
> >> above [mds] caps, if there has any UID/GID based permissions set, then
> >> fail the mounting.
> > understood
> >
> >> That means this kind client couldn't be mounted with idmap enabled.
> >>
> >> Also we need to make sure that once there is a mount with idmap enabled,
> >> the corresponding client caps couldn't be append the UID/GID based
> >> permissions. This need a patch in ceph anyway IMO.
> > So, yeah we will need to effectively block cephx permission changes if
> > there is a client mounted with
> > an active idmapped mount. Sounds as something that require massive
> > changes on the server side.
>
> Maybe no need much, it should be simple IMO. But I am not 100% sure.
>
> > At the same time this will just block users from using idmapped mounts
> > along with UID/GID restrictions.
> >
> > If you want me to change server-side anyways, isn't it better just to
> > extend cephfs protocol to properly
> > handle UID/GIDs with idmapped mounts? (It was originally proposed by Christian.)
> > What we need to do here is to add a separate UID/GID fields for ceph
> > requests those are creating a new inodes
> > (like mknod, symlink, etc).

Dear Xiubo,

I'm sorry for delay with reply, I've missed this message accidentally.

>
> BTW, could you explain it more ? How could this resolve the issue we are
> discussing here ?

This was briefly mentioned here
https://lore.kernel.org/all/20220105141023.vrrbfhti5apdvkz7@wittgenstein/#t
by Christian. Let me describe it in detail.

In the current approach we apply mount idmapping to
head->caller_{uid,gid} fields
to make mkdir/mknod/symlink operations set a proper inode owner
uid/gid in according with an idmapping.

This makes a problem with path-based UID/GID restriction mechanism,
because it uses head->caller_{uid,gid} fields
to check if UID/GID is permitted or not.

So, the problem is that we have one field in ceph request for two
different needs - to control permissions and to set inode owner.
Christian pointed that the most saner way is to modify ceph protocol
and add a separate field to store inode owner UID/GID,
and only this fields should be idmapped, but head->caller_{uid,gid}
will be untouched.

With this approach, we will not affect UID/GID-based permission rules
with an idmapped mounts at all.

Kind regards,
Alex

>
> Thanks
>
> - Xiubo
>
>
> >
> > Kind regards,
> > Alex
> >
> >> Thanks
> >>
> >> - Xiubo
> >>
> >>
> >>
> >>
> >>
> >>> Thanks,
> >>> Alex
> >>>
> >>>> Thanks
> >>>>
> >>>> - Xiubo
> >>>>
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-07-14 12:57                             ` Aleksandr Mikhalitsyn
@ 2023-07-18  1:44                               ` Xiubo Li
  2023-07-18 14:49                                 ` Aleksandr Mikhalitsyn
  0 siblings, 1 reply; 47+ messages in thread
From: Xiubo Li @ 2023-07-18  1:44 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel


On 7/14/23 20:57, Aleksandr Mikhalitsyn wrote:
> On Tue, Jul 4, 2023 at 3:09 AM Xiubo Li <xiubli@redhat.com> wrote:
>> Sorry, not sure, why my last reply wasn't sent out.
>>
>> Do it again.
>>
>>
>> On 6/26/23 19:23, Aleksandr Mikhalitsyn wrote:
>>> On Mon, Jun 26, 2023 at 4:12 AM Xiubo Li<xiubli@redhat.com>  wrote:
>>>> On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
>>>>> On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li<xiubli@redhat.com>  wrote:
>>>>>> [...]
>>>>>>
>>>>>>     > > >
>>>>>>     > > > I thought about this too and came to the same conclusion, that
>>>>>> UID/GID
>>>>>>     > > > based
>>>>>>     > > > restriction can be applied dynamically, so detecting it on mount-time
>>>>>>     > > > helps not so much.
>>>>>>     > > >
>>>>>>     > > For this you please raise one PR to ceph first to support this, and in
>>>>>>     > > the PR we can discuss more for the MDS auth caps. And after the PR
>>>>>>     > > getting merged then in this patch series you need to check the
>>>>>>     > > corresponding option or flag to determine whether could the idmap
>>>>>>     > > mounting succeed.
>>>>>>     >
>>>>>>     > I'm sorry but I don't understand what we want to support here. Do we
>>>>>> want to
>>>>>>     > add some new ceph request that allows to check if UID/GID-based
>>>>>>     > permissions are applied for
>>>>>>     > a particular ceph client user?
>>>>>>
>>>>>> IMO we should prevent user to set UID/GID-based permisions caps from
>>>>>> ceph side.
>>>>>>
>>>>>> As I know currently there is no way to prevent users to set MDS auth
>>>>>> caps, IMO in ceph side at least we need one flag or option to disable
>>>>>> this once users want this fs cluster sever for idmap mounts use case.
>>>>> How this should be visible from the user side? We introducing a new
>>>>> kernel client mount option,
>>>>> like "nomdscaps", then pass flag to the MDS and MDS should check that
>>>>> MDS auth permissions
>>>>> are not applied (on the mount time) and prevent them from being
>>>>> applied later while session is active. Like that?
>>>>>
>>>>> At the same time I'm thinking about protocol extension that adds 2
>>>>> additional fields for UID/GID. This will allow to correctly
>>>>> handle everything. I wanted to avoid any changes to the protocol or
>>>>> server-side things. But if we want to change MDS side,
>>>>> maybe it's better then to go this way?
>>> Hi Xiubo,
>>>
>>>> There is another way:
>>>>
>>>> For each client it will have a dedicated client auth caps, something like:
>>>>
>>>> client.foo
>>>>      key: *key*
>>>>      caps: [mds] allow r, allow rw path=/bar
>>>>      caps: [mon] allow r
>>>>      caps: [osd] allow rw tag cephfs data=cephfs_a
>>> Do we have any infrastructure to get this caps list on the client side
>>> right now?
>>> (I've taken a quick look through the code and can't find anything
>>> related to this.)
>> I am afraid there is no.
>>
>> But just after the following ceph PR gets merged it will be easy to do this:
>>
>> https://github.com/ceph/ceph/pull/48027
>>
>> This is still under testing.
>>
>>>> When mounting this client with idmap enabled, then we can just check the
>>>> above [mds] caps, if there has any UID/GID based permissions set, then
>>>> fail the mounting.
>>> understood
>>>
>>>> That means this kind client couldn't be mounted with idmap enabled.
>>>>
>>>> Also we need to make sure that once there is a mount with idmap enabled,
>>>> the corresponding client caps couldn't be append the UID/GID based
>>>> permissions. This need a patch in ceph anyway IMO.
>>> So, yeah we will need to effectively block cephx permission changes if
>>> there is a client mounted with
>>> an active idmapped mount. Sounds as something that require massive
>>> changes on the server side.
>> Maybe no need much, it should be simple IMO. But I am not 100% sure.
>>
>>> At the same time this will just block users from using idmapped mounts
>>> along with UID/GID restrictions.
>>>
>>> If you want me to change server-side anyways, isn't it better just to
>>> extend cephfs protocol to properly
>>> handle UID/GIDs with idmapped mounts? (It was originally proposed by Christian.)
>>> What we need to do here is to add a separate UID/GID fields for ceph
>>> requests those are creating a new inodes
>>> (like mknod, symlink, etc).
> Dear Xiubo,
>
> I'm sorry for delay with reply, I've missed this message accidentally.
>
>> BTW, could you explain it more ? How could this resolve the issue we are
>> discussing here ?
> This was briefly mentioned here
> https://lore.kernel.org/all/20220105141023.vrrbfhti5apdvkz7@wittgenstein/#t
> by Christian. Let me describe it in detail.
>
> In the current approach we apply mount idmapping to
> head->caller_{uid,gid} fields
> to make mkdir/mknod/symlink operations set a proper inode owner
> uid/gid in according with an idmapping.

Sorry for late.

I still couldn't get how this could resolve the lookup case.

For a lookup request the caller_{uid, gid} still will be the mapped 
{uid, gid}, right ? And also the same for other non-create requests. If 
so this will be incorrect for the cephx perm checks IMO.

Thanks

- Xiubo


> This makes a problem with path-based UID/GID restriction mechanism,
> because it uses head->caller_{uid,gid} fields
> to check if UID/GID is permitted or not.
>
> So, the problem is that we have one field in ceph request for two
> different needs - to control permissions and to set inode owner.
> Christian pointed that the most saner way is to modify ceph protocol
> and add a separate field to store inode owner UID/GID,
> and only this fields should be idmapped, but head->caller_{uid,gid}
> will be untouched.
>
> With this approach, we will not affect UID/GID-based permission rules
> with an idmapped mounts at all.
>
> Kind regards,
> Alex
>
>> Thanks
>>
>> - Xiubo
>>
>>
>>> Kind regards,
>>> Alex
>>>
>>>> Thanks
>>>>
>>>> - Xiubo
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Thanks,
>>>>> Alex
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> - Xiubo
>>>>>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-07-18  1:44                               ` Xiubo Li
@ 2023-07-18 14:49                                 ` Aleksandr Mikhalitsyn
  2023-07-19 11:57                                   ` Aleksandr Mikhalitsyn
  0 siblings, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-07-18 14:49 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Tue, Jul 18, 2023 at 3:45 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 7/14/23 20:57, Aleksandr Mikhalitsyn wrote:
> > On Tue, Jul 4, 2023 at 3:09 AM Xiubo Li <xiubli@redhat.com> wrote:
> >> Sorry, not sure, why my last reply wasn't sent out.
> >>
> >> Do it again.
> >>
> >>
> >> On 6/26/23 19:23, Aleksandr Mikhalitsyn wrote:
> >>> On Mon, Jun 26, 2023 at 4:12 AM Xiubo Li<xiubli@redhat.com>  wrote:
> >>>> On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
> >>>>> On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li<xiubli@redhat.com>  wrote:
> >>>>>> [...]
> >>>>>>
> >>>>>>     > > >
> >>>>>>     > > > I thought about this too and came to the same conclusion, that
> >>>>>> UID/GID
> >>>>>>     > > > based
> >>>>>>     > > > restriction can be applied dynamically, so detecting it on mount-time
> >>>>>>     > > > helps not so much.
> >>>>>>     > > >
> >>>>>>     > > For this you please raise one PR to ceph first to support this, and in
> >>>>>>     > > the PR we can discuss more for the MDS auth caps. And after the PR
> >>>>>>     > > getting merged then in this patch series you need to check the
> >>>>>>     > > corresponding option or flag to determine whether could the idmap
> >>>>>>     > > mounting succeed.
> >>>>>>     >
> >>>>>>     > I'm sorry but I don't understand what we want to support here. Do we
> >>>>>> want to
> >>>>>>     > add some new ceph request that allows to check if UID/GID-based
> >>>>>>     > permissions are applied for
> >>>>>>     > a particular ceph client user?
> >>>>>>
> >>>>>> IMO we should prevent user to set UID/GID-based permisions caps from
> >>>>>> ceph side.
> >>>>>>
> >>>>>> As I know currently there is no way to prevent users to set MDS auth
> >>>>>> caps, IMO in ceph side at least we need one flag or option to disable
> >>>>>> this once users want this fs cluster sever for idmap mounts use case.
> >>>>> How this should be visible from the user side? We introducing a new
> >>>>> kernel client mount option,
> >>>>> like "nomdscaps", then pass flag to the MDS and MDS should check that
> >>>>> MDS auth permissions
> >>>>> are not applied (on the mount time) and prevent them from being
> >>>>> applied later while session is active. Like that?
> >>>>>
> >>>>> At the same time I'm thinking about protocol extension that adds 2
> >>>>> additional fields for UID/GID. This will allow to correctly
> >>>>> handle everything. I wanted to avoid any changes to the protocol or
> >>>>> server-side things. But if we want to change MDS side,
> >>>>> maybe it's better then to go this way?
> >>> Hi Xiubo,
> >>>
> >>>> There is another way:
> >>>>
> >>>> For each client it will have a dedicated client auth caps, something like:
> >>>>
> >>>> client.foo
> >>>>      key: *key*
> >>>>      caps: [mds] allow r, allow rw path=/bar
> >>>>      caps: [mon] allow r
> >>>>      caps: [osd] allow rw tag cephfs data=cephfs_a
> >>> Do we have any infrastructure to get this caps list on the client side
> >>> right now?
> >>> (I've taken a quick look through the code and can't find anything
> >>> related to this.)
> >> I am afraid there is no.
> >>
> >> But just after the following ceph PR gets merged it will be easy to do this:
> >>
> >> https://github.com/ceph/ceph/pull/48027
> >>
> >> This is still under testing.
> >>
> >>>> When mounting this client with idmap enabled, then we can just check the
> >>>> above [mds] caps, if there has any UID/GID based permissions set, then
> >>>> fail the mounting.
> >>> understood
> >>>
> >>>> That means this kind client couldn't be mounted with idmap enabled.
> >>>>
> >>>> Also we need to make sure that once there is a mount with idmap enabled,
> >>>> the corresponding client caps couldn't be append the UID/GID based
> >>>> permissions. This need a patch in ceph anyway IMO.
> >>> So, yeah we will need to effectively block cephx permission changes if
> >>> there is a client mounted with
> >>> an active idmapped mount. Sounds as something that require massive
> >>> changes on the server side.
> >> Maybe no need much, it should be simple IMO. But I am not 100% sure.
> >>
> >>> At the same time this will just block users from using idmapped mounts
> >>> along with UID/GID restrictions.
> >>>
> >>> If you want me to change server-side anyways, isn't it better just to
> >>> extend cephfs protocol to properly
> >>> handle UID/GIDs with idmapped mounts? (It was originally proposed by Christian.)
> >>> What we need to do here is to add a separate UID/GID fields for ceph
> >>> requests those are creating a new inodes
> >>> (like mknod, symlink, etc).
> > Dear Xiubo,
> >
> > I'm sorry for delay with reply, I've missed this message accidentally.
> >
> >> BTW, could you explain it more ? How could this resolve the issue we are
> >> discussing here ?
> > This was briefly mentioned here
> > https://lore.kernel.org/all/20220105141023.vrrbfhti5apdvkz7@wittgenstein/#t
> > by Christian. Let me describe it in detail.
> >
> > In the current approach we apply mount idmapping to
> > head->caller_{uid,gid} fields
> > to make mkdir/mknod/symlink operations set a proper inode owner
> > uid/gid in according with an idmapping.
>
> Sorry for late.
>
> I still couldn't get how this could resolve the lookup case.
>
> For a lookup request the caller_{uid, gid} still will be the mapped
> {uid, gid}, right ?

No, the idea is to stop mapping a caller_{uid, gid}. And to add a new
fields like
inode_owner_{uid, gid} which will be idmapped (this field will be specific only
for those operations that create a new inode).

>
And also the same for other non-create requests. If
> so this will be incorrect for the cephx perm checks IMO.

Thanks,
Alex

>
> Thanks
>
> - Xiubo
>
>
> > This makes a problem with path-based UID/GID restriction mechanism,
> > because it uses head->caller_{uid,gid} fields
> > to check if UID/GID is permitted or not.
> >
> > So, the problem is that we have one field in ceph request for two
> > different needs - to control permissions and to set inode owner.
> > Christian pointed that the most saner way is to modify ceph protocol
> > and add a separate field to store inode owner UID/GID,
> > and only this fields should be idmapped, but head->caller_{uid,gid}
> > will be untouched.
> >
> > With this approach, we will not affect UID/GID-based permission rules
> > with an idmapped mounts at all.
> >
> > Kind regards,
> > Alex
> >
> >> Thanks
> >>
> >> - Xiubo
> >>
> >>
> >>> Kind regards,
> >>> Alex
> >>>
> >>>> Thanks
> >>>>
> >>>> - Xiubo
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> Thanks,
> >>>>> Alex
> >>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> - Xiubo
> >>>>>>
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-07-18 14:49                                 ` Aleksandr Mikhalitsyn
@ 2023-07-19 11:57                                   ` Aleksandr Mikhalitsyn
  2023-07-20  6:36                                     ` Xiubo Li
  0 siblings, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-07-19 11:57 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Tue, Jul 18, 2023 at 4:49 PM Aleksandr Mikhalitsyn
<aleksandr.mikhalitsyn@canonical.com> wrote:
>
> On Tue, Jul 18, 2023 at 3:45 AM Xiubo Li <xiubli@redhat.com> wrote:
> >
> >
> > On 7/14/23 20:57, Aleksandr Mikhalitsyn wrote:
> > > On Tue, Jul 4, 2023 at 3:09 AM Xiubo Li <xiubli@redhat.com> wrote:
> > >> Sorry, not sure, why my last reply wasn't sent out.
> > >>
> > >> Do it again.
> > >>
> > >>
> > >> On 6/26/23 19:23, Aleksandr Mikhalitsyn wrote:
> > >>> On Mon, Jun 26, 2023 at 4:12 AM Xiubo Li<xiubli@redhat.com>  wrote:
> > >>>> On 6/24/23 15:11, Aleksandr Mikhalitsyn wrote:
> > >>>>> On Sat, Jun 24, 2023 at 3:37 AM Xiubo Li<xiubli@redhat.com>  wrote:
> > >>>>>> [...]
> > >>>>>>
> > >>>>>>     > > >
> > >>>>>>     > > > I thought about this too and came to the same conclusion, that
> > >>>>>> UID/GID
> > >>>>>>     > > > based
> > >>>>>>     > > > restriction can be applied dynamically, so detecting it on mount-time
> > >>>>>>     > > > helps not so much.
> > >>>>>>     > > >
> > >>>>>>     > > For this you please raise one PR to ceph first to support this, and in
> > >>>>>>     > > the PR we can discuss more for the MDS auth caps. And after the PR
> > >>>>>>     > > getting merged then in this patch series you need to check the
> > >>>>>>     > > corresponding option or flag to determine whether could the idmap
> > >>>>>>     > > mounting succeed.
> > >>>>>>     >
> > >>>>>>     > I'm sorry but I don't understand what we want to support here. Do we
> > >>>>>> want to
> > >>>>>>     > add some new ceph request that allows to check if UID/GID-based
> > >>>>>>     > permissions are applied for
> > >>>>>>     > a particular ceph client user?
> > >>>>>>
> > >>>>>> IMO we should prevent user to set UID/GID-based permisions caps from
> > >>>>>> ceph side.
> > >>>>>>
> > >>>>>> As I know currently there is no way to prevent users to set MDS auth
> > >>>>>> caps, IMO in ceph side at least we need one flag or option to disable
> > >>>>>> this once users want this fs cluster sever for idmap mounts use case.
> > >>>>> How this should be visible from the user side? We introducing a new
> > >>>>> kernel client mount option,
> > >>>>> like "nomdscaps", then pass flag to the MDS and MDS should check that
> > >>>>> MDS auth permissions
> > >>>>> are not applied (on the mount time) and prevent them from being
> > >>>>> applied later while session is active. Like that?
> > >>>>>
> > >>>>> At the same time I'm thinking about protocol extension that adds 2
> > >>>>> additional fields for UID/GID. This will allow to correctly
> > >>>>> handle everything. I wanted to avoid any changes to the protocol or
> > >>>>> server-side things. But if we want to change MDS side,
> > >>>>> maybe it's better then to go this way?
> > >>> Hi Xiubo,
> > >>>
> > >>>> There is another way:
> > >>>>
> > >>>> For each client it will have a dedicated client auth caps, something like:
> > >>>>
> > >>>> client.foo
> > >>>>      key: *key*
> > >>>>      caps: [mds] allow r, allow rw path=/bar
> > >>>>      caps: [mon] allow r
> > >>>>      caps: [osd] allow rw tag cephfs data=cephfs_a
> > >>> Do we have any infrastructure to get this caps list on the client side
> > >>> right now?
> > >>> (I've taken a quick look through the code and can't find anything
> > >>> related to this.)
> > >> I am afraid there is no.
> > >>
> > >> But just after the following ceph PR gets merged it will be easy to do this:
> > >>
> > >> https://github.com/ceph/ceph/pull/48027
> > >>
> > >> This is still under testing.
> > >>
> > >>>> When mounting this client with idmap enabled, then we can just check the
> > >>>> above [mds] caps, if there has any UID/GID based permissions set, then
> > >>>> fail the mounting.
> > >>> understood
> > >>>
> > >>>> That means this kind client couldn't be mounted with idmap enabled.
> > >>>>
> > >>>> Also we need to make sure that once there is a mount with idmap enabled,
> > >>>> the corresponding client caps couldn't be append the UID/GID based
> > >>>> permissions. This need a patch in ceph anyway IMO.
> > >>> So, yeah we will need to effectively block cephx permission changes if
> > >>> there is a client mounted with
> > >>> an active idmapped mount. Sounds as something that require massive
> > >>> changes on the server side.
> > >> Maybe no need much, it should be simple IMO. But I am not 100% sure.
> > >>
> > >>> At the same time this will just block users from using idmapped mounts
> > >>> along with UID/GID restrictions.
> > >>>
> > >>> If you want me to change server-side anyways, isn't it better just to
> > >>> extend cephfs protocol to properly
> > >>> handle UID/GIDs with idmapped mounts? (It was originally proposed by Christian.)
> > >>> What we need to do here is to add a separate UID/GID fields for ceph
> > >>> requests those are creating a new inodes
> > >>> (like mknod, symlink, etc).
> > > Dear Xiubo,
> > >
> > > I'm sorry for delay with reply, I've missed this message accidentally.
> > >
> > >> BTW, could you explain it more ? How could this resolve the issue we are
> > >> discussing here ?
> > > This was briefly mentioned here
> > > https://lore.kernel.org/all/20220105141023.vrrbfhti5apdvkz7@wittgenstein/#t
> > > by Christian. Let me describe it in detail.
> > >
> > > In the current approach we apply mount idmapping to
> > > head->caller_{uid,gid} fields
> > > to make mkdir/mknod/symlink operations set a proper inode owner
> > > uid/gid in according with an idmapping.
> >
> > Sorry for late.
> >
> > I still couldn't get how this could resolve the lookup case.
> >
> > For a lookup request the caller_{uid, gid} still will be the mapped
> > {uid, gid}, right ?
>
> No, the idea is to stop mapping a caller_{uid, gid}. And to add a new
> fields like
> inode_owner_{uid, gid} which will be idmapped (this field will be specific only
> for those operations that create a new inode).

I've decided to write some summary of different approaches and
elaborate tricky places.

Current implementation.

We have head->caller_{uid,gid} fields mapped in according
to the mount's idmapping. But as we don't have information about
mount's idmapping in all call stacks (like ->lookup case), we
are not able to map it always and they are left untouched in these cases.

This tends to an inconsistency between different inode_operations,
for example ->lookup (don't have an access to an idmapping) and
->mkdir (have an idmapping as an argument).

This inconsistency is absolutely harmless if the user does not
use UID/GID-based restrictions. Because in this use case head->caller_{uid,gid}
fields used *only* to set inode owner UID/GID during the inode_operations
which create inodes.

Conclusion 1. head->caller_{uid,gid} fields have two meanings
- UID/GID-based permission checks
- inode owner information

Solution 0. Ignore the issue with UID/GID-based restrictions and idmapped mounts
until we are not blamed by users ;-)

As far as I can see you are not happy about this way. :-)

Solution 1. Let's add mount's idmapping argument to all inode_operations
and always map head->caller_{uid,gid} fields.

Not a best idea, because:
- big modification of VFS layer
- ideologically incorrect, for instance ->lookup should not care
and know *anything* about mount's idmapping, because ->lookup works
not on the mount level (it's not important who and through which mount
triggered the ->lookup). Imagine that you've dentry cache filled and call
open(...) in this case ->lookup can be uncalled. But if the user was not lucky
enough to have cache filled then open(..) will trigger the lookup and
then ->lookup results will be dependent on the mount's idmapping. It
seems incorrect
and unobvious consequence of introducing such a parameter to ->lookup operation.
To summarize, ->lookup is about filling dentry cache and dentry cache
is superblock-level
thing, not mount-level.

Solution 2. Add some kind of extra checks to ceph-client and ceph
server to detect that
mount idmappings used with UID/GID-based restrictions and restrict such mounts.

Seems not ideal to me too. Because it's not a fix, it's a limitation
and this limitation is
not cheap from the implementation perspective (we need heavy changes
in ceph server side and
client side too). Btw, currently VFS API is also not ready for that,
because we can't
decide if idmapped mounts are allowed or not in runtime. It's a static
thing that should be declared
with FS_ALLOW_IDMAP flag in (struct file_system_type)->fs_flags. Not a
big deal, but...

Solution 3. Add a new UID/GID fields to ceph request structure in
addition to head->caller_{uid,gid}
to store information about inode owners (only for inode_operations
which create inodes).

How does it solves the problem?
With these new fields we can leave head->caller_{uid,gid} untouched
with an idmapped mounts code.
It means that UID/GID-based restrictions will continue to work as intended.

At the same time, new fields (let say "inode_owner_{uid,gid}") will be
mapped in accordance with
a mount's idmapping.

This solution seems ideal, because it is philosophically correct, it
makes cephfs idmapped mounts to work
in the same manner and way as idmapped mounts work for any other
filesystem like ext4.

But yes, this requires cephfs protocol changes...

I personally still believe that the "Solution 0" approach is optimal
and we can go with "Solution 3" way
as the next iteration.

Kind regards,
Alex

>
> >
> And also the same for other non-create requests. If
> > so this will be incorrect for the cephx perm checks IMO.
>
> Thanks,
> Alex
>
> >
> > Thanks
> >
> > - Xiubo
> >
> >
> > > This makes a problem with path-based UID/GID restriction mechanism,
> > > because it uses head->caller_{uid,gid} fields
> > > to check if UID/GID is permitted or not.
> > >
> > > So, the problem is that we have one field in ceph request for two
> > > different needs - to control permissions and to set inode owner.
> > > Christian pointed that the most saner way is to modify ceph protocol
> > > and add a separate field to store inode owner UID/GID,
> > > and only this fields should be idmapped, but head->caller_{uid,gid}
> > > will be untouched.
> > >
> > > With this approach, we will not affect UID/GID-based permission rules
> > > with an idmapped mounts at all.
> > >
> > > Kind regards,
> > > Alex
> > >
> > >> Thanks
> > >>
> > >> - Xiubo
> > >>
> > >>
> > >>> Kind regards,
> > >>> Alex
> > >>>
> > >>>> Thanks
> > >>>>
> > >>>> - Xiubo
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>> Thanks,
> > >>>>> Alex
> > >>>>>
> > >>>>>> Thanks
> > >>>>>>
> > >>>>>> - Xiubo
> > >>>>>>
> >

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-07-19 11:57                                   ` Aleksandr Mikhalitsyn
@ 2023-07-20  6:36                                     ` Xiubo Li
  2023-07-20  6:41                                       ` Aleksandr Mikhalitsyn
  2023-07-21 15:43                                       ` Aleksandr Mikhalitsyn
  0 siblings, 2 replies; 47+ messages in thread
From: Xiubo Li @ 2023-07-20  6:36 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel


On 7/19/23 19:57, Aleksandr Mikhalitsyn wrote:
> On Tue, Jul 18, 2023 at 4:49 PM Aleksandr Mikhalitsyn
> <aleksandr.mikhalitsyn@canonical.com> wrote:
>> On Tue, Jul 18, 2023 at 3:45 AM Xiubo Li <xiubli@redhat.com> wrote:
[...]
>> No, the idea is to stop mapping a caller_{uid, gid}. And to add a new
>> fields like
>> inode_owner_{uid, gid} which will be idmapped (this field will be specific only
>> for those operations that create a new inode).
> I've decided to write some summary of different approaches and
> elaborate tricky places.
>
> Current implementation.
>
> We have head->caller_{uid,gid} fields mapped in according
> to the mount's idmapping. But as we don't have information about
> mount's idmapping in all call stacks (like ->lookup case), we
> are not able to map it always and they are left untouched in these cases.
>
> This tends to an inconsistency between different inode_operations,
> for example ->lookup (don't have an access to an idmapping) and
> ->mkdir (have an idmapping as an argument).
>
> This inconsistency is absolutely harmless if the user does not
> use UID/GID-based restrictions. Because in this use case head->caller_{uid,gid}
> fields used *only* to set inode owner UID/GID during the inode_operations
> which create inodes.
>
> Conclusion 1. head->caller_{uid,gid} fields have two meanings
> - UID/GID-based permission checks
> - inode owner information
>
> Solution 0. Ignore the issue with UID/GID-based restrictions and idmapped mounts
> until we are not blamed by users ;-)
>
> As far as I can see you are not happy about this way. :-)
>
> Solution 1. Let's add mount's idmapping argument to all inode_operations
> and always map head->caller_{uid,gid} fields.
>
> Not a best idea, because:
> - big modification of VFS layer
> - ideologically incorrect, for instance ->lookup should not care
> and know *anything* about mount's idmapping, because ->lookup works
> not on the mount level (it's not important who and through which mount
> triggered the ->lookup). Imagine that you've dentry cache filled and call
> open(...) in this case ->lookup can be uncalled. But if the user was not lucky
> enough to have cache filled then open(..) will trigger the lookup and
> then ->lookup results will be dependent on the mount's idmapping. It
> seems incorrect
> and unobvious consequence of introducing such a parameter to ->lookup operation.
> To summarize, ->lookup is about filling dentry cache and dentry cache
> is superblock-level
> thing, not mount-level.
>
> Solution 2. Add some kind of extra checks to ceph-client and ceph
> server to detect that
> mount idmappings used with UID/GID-based restrictions and restrict such mounts.
>
> Seems not ideal to me too. Because it's not a fix, it's a limitation
> and this limitation is
> not cheap from the implementation perspective (we need heavy changes
> in ceph server side and
> client side too). Btw, currently VFS API is also not ready for that,
> because we can't
> decide if idmapped mounts are allowed or not in runtime. It's a static
> thing that should be declared
> with FS_ALLOW_IDMAP flag in (struct file_system_type)->fs_flags. Not a
> big deal, but...
>
> Solution 3. Add a new UID/GID fields to ceph request structure in
> addition to head->caller_{uid,gid}
> to store information about inode owners (only for inode_operations
> which create inodes).
>
> How does it solves the problem?
> With these new fields we can leave head->caller_{uid,gid} untouched
> with an idmapped mounts code.
> It means that UID/GID-based restrictions will continue to work as intended.
>
> At the same time, new fields (let say "inode_owner_{uid,gid}") will be
> mapped in accordance with
> a mount's idmapping.
>
> This solution seems ideal, because it is philosophically correct, it
> makes cephfs idmapped mounts to work
> in the same manner and way as idmapped mounts work for any other
> filesystem like ext4.

Okay, this approach sounds more reasonable to me. But you need to do 
some extra work to make it to be compatible between {old,new} kernels 
and  {old,new} cephs.

So then the caller uid/gid will always be the user uid/gid issuing the 
requests as now.

Thanks

- Xiubo


> But yes, this requires cephfs protocol changes...
>
> I personally still believe that the "Solution 0" approach is optimal
> and we can go with "Solution 3" way
> as the next iteration.
>
> Kind regards,
> Alex
>
>> And also the same for other non-create requests. If
>>> so this will be incorrect for the cephx perm checks IMO.
>> Thanks,
>> Alex
>>
>>> Thanks
>>>
>>> - Xiubo
>>>
>>>
>>>> This makes a problem with path-based UID/GID restriction mechanism,
>>>> because it uses head->caller_{uid,gid} fields
>>>> to check if UID/GID is permitted or not.
>>>>
>>>> So, the problem is that we have one field in ceph request for two
>>>> different needs - to control permissions and to set inode owner.
>>>> Christian pointed that the most saner way is to modify ceph protocol
>>>> and add a separate field to store inode owner UID/GID,
>>>> and only this fields should be idmapped, but head->caller_{uid,gid}
>>>> will be untouched.
>>>>
>>>> With this approach, we will not affect UID/GID-based permission rules
>>>> with an idmapped mounts at all.
>>>>
>>>> Kind regards,
>>>> Alex
>>>>
>>>>> Thanks
>>>>>
>>>>> - Xiubo
>>>>>
>>>>>
>>>>>> Kind regards,
>>>>>> Alex
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> - Xiubo
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Alex
>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> - Xiubo
>>>>>>>>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-07-20  6:36                                     ` Xiubo Li
@ 2023-07-20  6:41                                       ` Aleksandr Mikhalitsyn
  2023-07-21 15:43                                       ` Aleksandr Mikhalitsyn
  1 sibling, 0 replies; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-07-20  6:41 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Thu, Jul 20, 2023 at 8:36 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 7/19/23 19:57, Aleksandr Mikhalitsyn wrote:
> > On Tue, Jul 18, 2023 at 4:49 PM Aleksandr Mikhalitsyn
> > <aleksandr.mikhalitsyn@canonical.com> wrote:
> >> On Tue, Jul 18, 2023 at 3:45 AM Xiubo Li <xiubli@redhat.com> wrote:
> [...]
> >> No, the idea is to stop mapping a caller_{uid, gid}. And to add a new
> >> fields like
> >> inode_owner_{uid, gid} which will be idmapped (this field will be specific only
> >> for those operations that create a new inode).
> > I've decided to write some summary of different approaches and
> > elaborate tricky places.
> >
> > Current implementation.
> >
> > We have head->caller_{uid,gid} fields mapped in according
> > to the mount's idmapping. But as we don't have information about
> > mount's idmapping in all call stacks (like ->lookup case), we
> > are not able to map it always and they are left untouched in these cases.
> >
> > This tends to an inconsistency between different inode_operations,
> > for example ->lookup (don't have an access to an idmapping) and
> > ->mkdir (have an idmapping as an argument).
> >
> > This inconsistency is absolutely harmless if the user does not
> > use UID/GID-based restrictions. Because in this use case head->caller_{uid,gid}
> > fields used *only* to set inode owner UID/GID during the inode_operations
> > which create inodes.
> >
> > Conclusion 1. head->caller_{uid,gid} fields have two meanings
> > - UID/GID-based permission checks
> > - inode owner information
> >
> > Solution 0. Ignore the issue with UID/GID-based restrictions and idmapped mounts
> > until we are not blamed by users ;-)
> >
> > As far as I can see you are not happy about this way. :-)
> >
> > Solution 1. Let's add mount's idmapping argument to all inode_operations
> > and always map head->caller_{uid,gid} fields.
> >
> > Not a best idea, because:
> > - big modification of VFS layer
> > - ideologically incorrect, for instance ->lookup should not care
> > and know *anything* about mount's idmapping, because ->lookup works
> > not on the mount level (it's not important who and through which mount
> > triggered the ->lookup). Imagine that you've dentry cache filled and call
> > open(...) in this case ->lookup can be uncalled. But if the user was not lucky
> > enough to have cache filled then open(..) will trigger the lookup and
> > then ->lookup results will be dependent on the mount's idmapping. It
> > seems incorrect
> > and unobvious consequence of introducing such a parameter to ->lookup operation.
> > To summarize, ->lookup is about filling dentry cache and dentry cache
> > is superblock-level
> > thing, not mount-level.
> >
> > Solution 2. Add some kind of extra checks to ceph-client and ceph
> > server to detect that
> > mount idmappings used with UID/GID-based restrictions and restrict such mounts.
> >
> > Seems not ideal to me too. Because it's not a fix, it's a limitation
> > and this limitation is
> > not cheap from the implementation perspective (we need heavy changes
> > in ceph server side and
> > client side too). Btw, currently VFS API is also not ready for that,
> > because we can't
> > decide if idmapped mounts are allowed or not in runtime. It's a static
> > thing that should be declared
> > with FS_ALLOW_IDMAP flag in (struct file_system_type)->fs_flags. Not a
> > big deal, but...
> >
> > Solution 3. Add a new UID/GID fields to ceph request structure in
> > addition to head->caller_{uid,gid}
> > to store information about inode owners (only for inode_operations
> > which create inodes).
> >
> > How does it solves the problem?
> > With these new fields we can leave head->caller_{uid,gid} untouched
> > with an idmapped mounts code.
> > It means that UID/GID-based restrictions will continue to work as intended.
> >
> > At the same time, new fields (let say "inode_owner_{uid,gid}") will be
> > mapped in accordance with
> > a mount's idmapping.
> >
> > This solution seems ideal, because it is philosophically correct, it
> > makes cephfs idmapped mounts to work
> > in the same manner and way as idmapped mounts work for any other
> > filesystem like ext4.
>
> Okay, this approach sounds more reasonable to me. But you need to do
> some extra work to make it to be compatible between {old,new} kernels
> and  {old,new} cephs.

Sure. Then I'll start implementing this.

Kind regards,
Alex

>
> So then the caller uid/gid will always be the user uid/gid issuing the
> requests as now.
>
> Thanks
>
> - Xiubo
>
>
> > But yes, this requires cephfs protocol changes...
> >
> > I personally still believe that the "Solution 0" approach is optimal
> > and we can go with "Solution 3" way
> > as the next iteration.
> >
> > Kind regards,
> > Alex
> >
> >> And also the same for other non-create requests. If
> >>> so this will be incorrect for the cephx perm checks IMO.
> >> Thanks,
> >> Alex
> >>
> >>> Thanks
> >>>
> >>> - Xiubo
> >>>
> >>>
> >>>> This makes a problem with path-based UID/GID restriction mechanism,
> >>>> because it uses head->caller_{uid,gid} fields
> >>>> to check if UID/GID is permitted or not.
> >>>>
> >>>> So, the problem is that we have one field in ceph request for two
> >>>> different needs - to control permissions and to set inode owner.
> >>>> Christian pointed that the most saner way is to modify ceph protocol
> >>>> and add a separate field to store inode owner UID/GID,
> >>>> and only this fields should be idmapped, but head->caller_{uid,gid}
> >>>> will be untouched.
> >>>>
> >>>> With this approach, we will not affect UID/GID-based permission rules
> >>>> with an idmapped mounts at all.
> >>>>
> >>>> Kind regards,
> >>>> Alex
> >>>>
> >>>>> Thanks
> >>>>>
> >>>>> - Xiubo
> >>>>>
> >>>>>
> >>>>>> Kind regards,
> >>>>>> Alex
> >>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> - Xiubo
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Alex
> >>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>>
> >>>>>>>>> - Xiubo
> >>>>>>>>>
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-07-20  6:36                                     ` Xiubo Li
  2023-07-20  6:41                                       ` Aleksandr Mikhalitsyn
@ 2023-07-21 15:43                                       ` Aleksandr Mikhalitsyn
  2023-07-24  1:02                                         ` Xiubo Li
  1 sibling, 1 reply; 47+ messages in thread
From: Aleksandr Mikhalitsyn @ 2023-07-21 15:43 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel

On Thu, Jul 20, 2023 at 8:36 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 7/19/23 19:57, Aleksandr Mikhalitsyn wrote:
> > On Tue, Jul 18, 2023 at 4:49 PM Aleksandr Mikhalitsyn
> > <aleksandr.mikhalitsyn@canonical.com> wrote:
> >> On Tue, Jul 18, 2023 at 3:45 AM Xiubo Li <xiubli@redhat.com> wrote:
> [...]
> >> No, the idea is to stop mapping a caller_{uid, gid}. And to add a new
> >> fields like
> >> inode_owner_{uid, gid} which will be idmapped (this field will be specific only
> >> for those operations that create a new inode).
> > I've decided to write some summary of different approaches and
> > elaborate tricky places.
> >
> > Current implementation.
> >
> > We have head->caller_{uid,gid} fields mapped in according
> > to the mount's idmapping. But as we don't have information about
> > mount's idmapping in all call stacks (like ->lookup case), we
> > are not able to map it always and they are left untouched in these cases.
> >
> > This tends to an inconsistency between different inode_operations,
> > for example ->lookup (don't have an access to an idmapping) and
> > ->mkdir (have an idmapping as an argument).
> >
> > This inconsistency is absolutely harmless if the user does not
> > use UID/GID-based restrictions. Because in this use case head->caller_{uid,gid}
> > fields used *only* to set inode owner UID/GID during the inode_operations
> > which create inodes.
> >
> > Conclusion 1. head->caller_{uid,gid} fields have two meanings
> > - UID/GID-based permission checks
> > - inode owner information
> >
> > Solution 0. Ignore the issue with UID/GID-based restrictions and idmapped mounts
> > until we are not blamed by users ;-)
> >
> > As far as I can see you are not happy about this way. :-)
> >
> > Solution 1. Let's add mount's idmapping argument to all inode_operations
> > and always map head->caller_{uid,gid} fields.
> >
> > Not a best idea, because:
> > - big modification of VFS layer
> > - ideologically incorrect, for instance ->lookup should not care
> > and know *anything* about mount's idmapping, because ->lookup works
> > not on the mount level (it's not important who and through which mount
> > triggered the ->lookup). Imagine that you've dentry cache filled and call
> > open(...) in this case ->lookup can be uncalled. But if the user was not lucky
> > enough to have cache filled then open(..) will trigger the lookup and
> > then ->lookup results will be dependent on the mount's idmapping. It
> > seems incorrect
> > and unobvious consequence of introducing such a parameter to ->lookup operation.
> > To summarize, ->lookup is about filling dentry cache and dentry cache
> > is superblock-level
> > thing, not mount-level.
> >
> > Solution 2. Add some kind of extra checks to ceph-client and ceph
> > server to detect that
> > mount idmappings used with UID/GID-based restrictions and restrict such mounts.
> >
> > Seems not ideal to me too. Because it's not a fix, it's a limitation
> > and this limitation is
> > not cheap from the implementation perspective (we need heavy changes
> > in ceph server side and
> > client side too). Btw, currently VFS API is also not ready for that,
> > because we can't
> > decide if idmapped mounts are allowed or not in runtime. It's a static
> > thing that should be declared
> > with FS_ALLOW_IDMAP flag in (struct file_system_type)->fs_flags. Not a
> > big deal, but...
> >
> > Solution 3. Add a new UID/GID fields to ceph request structure in
> > addition to head->caller_{uid,gid}
> > to store information about inode owners (only for inode_operations
> > which create inodes).
> >
> > How does it solves the problem?
> > With these new fields we can leave head->caller_{uid,gid} untouched
> > with an idmapped mounts code.
> > It means that UID/GID-based restrictions will continue to work as intended.
> >
> > At the same time, new fields (let say "inode_owner_{uid,gid}") will be
> > mapped in accordance with
> > a mount's idmapping.
> >
> > This solution seems ideal, because it is philosophically correct, it
> > makes cephfs idmapped mounts to work
> > in the same manner and way as idmapped mounts work for any other
> > filesystem like ext4.
>
> Okay, this approach sounds more reasonable to me. But you need to do
> some extra work to make it to be compatible between {old,new} kernels
> and  {old,new} cephs.
>
> So then the caller uid/gid will always be the user uid/gid issuing the
> requests as now.

Dear Xiubo,

I've posted a PR https://github.com/ceph/ceph/pull/52575

Kind regards,
Alex

>
> Thanks
>
> - Xiubo
>
>
> > But yes, this requires cephfs protocol changes...
> >
> > I personally still believe that the "Solution 0" approach is optimal
> > and we can go with "Solution 3" way
> > as the next iteration.
> >
> > Kind regards,
> > Alex
> >
> >> And also the same for other non-create requests. If
> >>> so this will be incorrect for the cephx perm checks IMO.
> >> Thanks,
> >> Alex
> >>
> >>> Thanks
> >>>
> >>> - Xiubo
> >>>
> >>>
> >>>> This makes a problem with path-based UID/GID restriction mechanism,
> >>>> because it uses head->caller_{uid,gid} fields
> >>>> to check if UID/GID is permitted or not.
> >>>>
> >>>> So, the problem is that we have one field in ceph request for two
> >>>> different needs - to control permissions and to set inode owner.
> >>>> Christian pointed that the most saner way is to modify ceph protocol
> >>>> and add a separate field to store inode owner UID/GID,
> >>>> and only this fields should be idmapped, but head->caller_{uid,gid}
> >>>> will be untouched.
> >>>>
> >>>> With this approach, we will not affect UID/GID-based permission rules
> >>>> with an idmapped mounts at all.
> >>>>
> >>>> Kind regards,
> >>>> Alex
> >>>>
> >>>>> Thanks
> >>>>>
> >>>>> - Xiubo
> >>>>>
> >>>>>
> >>>>>> Kind regards,
> >>>>>> Alex
> >>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> - Xiubo
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Alex
> >>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>>
> >>>>>>>>> - Xiubo
> >>>>>>>>>
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5 00/14] ceph: support idmapped mounts
  2023-07-21 15:43                                       ` Aleksandr Mikhalitsyn
@ 2023-07-24  1:02                                         ` Xiubo Li
  0 siblings, 0 replies; 47+ messages in thread
From: Xiubo Li @ 2023-07-24  1:02 UTC (permalink / raw)
  To: Aleksandr Mikhalitsyn
  Cc: Gregory Farnum, Christian Brauner, stgraber, linux-fsdevel,
	Ilya Dryomov, Jeff Layton, ceph-devel, linux-kernel


On 7/21/23 23:43, Aleksandr Mikhalitsyn wrote:
> On Thu, Jul 20, 2023 at 8:36 AM Xiubo Li <xiubli@redhat.com> wrote:
>>
>> On 7/19/23 19:57, Aleksandr Mikhalitsyn wrote:
>>> On Tue, Jul 18, 2023 at 4:49 PM Aleksandr Mikhalitsyn
>>> <aleksandr.mikhalitsyn@canonical.com> wrote:
>>>> On Tue, Jul 18, 2023 at 3:45 AM Xiubo Li <xiubli@redhat.com> wrote:
>> [...]
>>>> No, the idea is to stop mapping a caller_{uid, gid}. And to add a new
>>>> fields like
>>>> inode_owner_{uid, gid} which will be idmapped (this field will be specific only
>>>> for those operations that create a new inode).
>>> I've decided to write some summary of different approaches and
>>> elaborate tricky places.
>>>
>>> Current implementation.
>>>
>>> We have head->caller_{uid,gid} fields mapped in according
>>> to the mount's idmapping. But as we don't have information about
>>> mount's idmapping in all call stacks (like ->lookup case), we
>>> are not able to map it always and they are left untouched in these cases.
>>>
>>> This tends to an inconsistency between different inode_operations,
>>> for example ->lookup (don't have an access to an idmapping) and
>>> ->mkdir (have an idmapping as an argument).
>>>
>>> This inconsistency is absolutely harmless if the user does not
>>> use UID/GID-based restrictions. Because in this use case head->caller_{uid,gid}
>>> fields used *only* to set inode owner UID/GID during the inode_operations
>>> which create inodes.
>>>
>>> Conclusion 1. head->caller_{uid,gid} fields have two meanings
>>> - UID/GID-based permission checks
>>> - inode owner information
>>>
>>> Solution 0. Ignore the issue with UID/GID-based restrictions and idmapped mounts
>>> until we are not blamed by users ;-)
>>>
>>> As far as I can see you are not happy about this way. :-)
>>>
>>> Solution 1. Let's add mount's idmapping argument to all inode_operations
>>> and always map head->caller_{uid,gid} fields.
>>>
>>> Not a best idea, because:
>>> - big modification of VFS layer
>>> - ideologically incorrect, for instance ->lookup should not care
>>> and know *anything* about mount's idmapping, because ->lookup works
>>> not on the mount level (it's not important who and through which mount
>>> triggered the ->lookup). Imagine that you've dentry cache filled and call
>>> open(...) in this case ->lookup can be uncalled. But if the user was not lucky
>>> enough to have cache filled then open(..) will trigger the lookup and
>>> then ->lookup results will be dependent on the mount's idmapping. It
>>> seems incorrect
>>> and unobvious consequence of introducing such a parameter to ->lookup operation.
>>> To summarize, ->lookup is about filling dentry cache and dentry cache
>>> is superblock-level
>>> thing, not mount-level.
>>>
>>> Solution 2. Add some kind of extra checks to ceph-client and ceph
>>> server to detect that
>>> mount idmappings used with UID/GID-based restrictions and restrict such mounts.
>>>
>>> Seems not ideal to me too. Because it's not a fix, it's a limitation
>>> and this limitation is
>>> not cheap from the implementation perspective (we need heavy changes
>>> in ceph server side and
>>> client side too). Btw, currently VFS API is also not ready for that,
>>> because we can't
>>> decide if idmapped mounts are allowed or not in runtime. It's a static
>>> thing that should be declared
>>> with FS_ALLOW_IDMAP flag in (struct file_system_type)->fs_flags. Not a
>>> big deal, but...
>>>
>>> Solution 3. Add a new UID/GID fields to ceph request structure in
>>> addition to head->caller_{uid,gid}
>>> to store information about inode owners (only for inode_operations
>>> which create inodes).
>>>
>>> How does it solves the problem?
>>> With these new fields we can leave head->caller_{uid,gid} untouched
>>> with an idmapped mounts code.
>>> It means that UID/GID-based restrictions will continue to work as intended.
>>>
>>> At the same time, new fields (let say "inode_owner_{uid,gid}") will be
>>> mapped in accordance with
>>> a mount's idmapping.
>>>
>>> This solution seems ideal, because it is philosophically correct, it
>>> makes cephfs idmapped mounts to work
>>> in the same manner and way as idmapped mounts work for any other
>>> filesystem like ext4.
>> Okay, this approach sounds more reasonable to me. But you need to do
>> some extra work to make it to be compatible between {old,new} kernels
>> and  {old,new} cephs.
>>
>> So then the caller uid/gid will always be the user uid/gid issuing the
>> requests as now.
> Dear Xiubo,
>
> I've posted a PR https://github.com/ceph/ceph/pull/52575

Sure. Will check.

Thanks

- Xiubo

> Kind regards,
> Alex
>
>> Thanks
>>
>> - Xiubo
>>
>>
>>> But yes, this requires cephfs protocol changes...
>>>
>>> I personally still believe that the "Solution 0" approach is optimal
>>> and we can go with "Solution 3" way
>>> as the next iteration.
>>>
>>> Kind regards,
>>> Alex
>>>
>>>> And also the same for other non-create requests. If
>>>>> so this will be incorrect for the cephx perm checks IMO.
>>>> Thanks,
>>>> Alex
>>>>
>>>>> Thanks
>>>>>
>>>>> - Xiubo
>>>>>
>>>>>
>>>>>> This makes a problem with path-based UID/GID restriction mechanism,
>>>>>> because it uses head->caller_{uid,gid} fields
>>>>>> to check if UID/GID is permitted or not.
>>>>>>
>>>>>> So, the problem is that we have one field in ceph request for two
>>>>>> different needs - to control permissions and to set inode owner.
>>>>>> Christian pointed that the most saner way is to modify ceph protocol
>>>>>> and add a separate field to store inode owner UID/GID,
>>>>>> and only this fields should be idmapped, but head->caller_{uid,gid}
>>>>>> will be untouched.
>>>>>>
>>>>>> With this approach, we will not affect UID/GID-based permission rules
>>>>>> with an idmapped mounts at all.
>>>>>>
>>>>>> Kind regards,
>>>>>> Alex
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> - Xiubo
>>>>>>>
>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Alex
>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> - Xiubo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Alex
>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> - Xiubo
>>>>>>>>>>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2023-07-24  1:03 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-08 15:42 [PATCH v5 00/14] ceph: support idmapped mounts Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 01/14] fs: export mnt_idmap_get/mnt_idmap_put Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 02/14] ceph: stash idmapping in mdsc request Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 03/14] ceph: handle idmapped mounts in create_request_message() Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 04/14] ceph: pass an idmapping to mknod/symlink/mkdir/rename Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 05/14] ceph: allow idmapped getattr inode op Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 06/14] ceph: allow idmapped permission " Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 07/14] ceph: pass idmap to __ceph_setattr Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 08/14] ceph: allow idmapped setattr inode op Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 09/14] ceph/acl: allow idmapped set_acl " Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 10/14] ceph/file: allow idmapped atomic_open " Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 11/14] ceph: pass idmap to ceph_do_getattr Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 12/14] ceph: pass idmap to __ceph_setxattr Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 13/14] ceph: pass idmap to ceph_open/ioctl_set_layout Alexander Mikhalitsyn
2023-06-08 15:42 ` [PATCH v5 14/14] ceph: allow idmapped mounts Alexander Mikhalitsyn
2023-06-09  1:57 ` [PATCH v5 00/14] ceph: support " Xiubo Li
2023-06-09  8:59   ` Aleksandr Mikhalitsyn
2023-06-09  9:59     ` Christian Brauner
2023-06-09 10:12       ` Aleksandr Mikhalitsyn
2023-06-13  1:43         ` Xiubo Li
2023-06-13 12:46           ` Aleksandr Mikhalitsyn
2023-06-14  9:45             ` Christian Brauner
2023-06-13 14:53           ` Gregory Farnum
2023-06-13 16:27             ` Aleksandr Mikhalitsyn
2023-06-14  1:52             ` Xiubo Li
2023-06-14 12:39               ` Aleksandr Mikhalitsyn
     [not found]               ` <CAEivzxcr99sERxZX17rZ5jW9YSzAWYvAjOOhBH+FqRoso2=yng@mail.gmail.com>
2023-06-15  5:08                 ` Xiubo Li
2023-06-15 11:05                   ` Aleksandr Mikhalitsyn
2023-06-15 12:29                   ` Xiubo Li
2023-06-15 12:54                     ` Aleksandr Mikhalitsyn
2023-06-21 16:55                       ` Aleksandr Mikhalitsyn
2023-06-26  1:04                       ` Xiubo Li
2023-06-24  1:36                   ` Xiubo Li
2023-06-24  7:11                     ` Aleksandr Mikhalitsyn
2023-06-26  2:12                       ` Xiubo Li
2023-06-26 11:23                         ` Aleksandr Mikhalitsyn
2023-06-26 11:49                           ` Aleksandr Mikhalitsyn
2023-07-04  1:10                             ` Xiubo Li
2023-07-04  1:08                           ` Xiubo Li
2023-07-14 12:57                             ` Aleksandr Mikhalitsyn
2023-07-18  1:44                               ` Xiubo Li
2023-07-18 14:49                                 ` Aleksandr Mikhalitsyn
2023-07-19 11:57                                   ` Aleksandr Mikhalitsyn
2023-07-20  6:36                                     ` Xiubo Li
2023-07-20  6:41                                       ` Aleksandr Mikhalitsyn
2023-07-21 15:43                                       ` Aleksandr Mikhalitsyn
2023-07-24  1:02                                         ` Xiubo Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).