From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16E53EB64D9 for ; Wed, 14 Jun 2023 09:46:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231660AbjFNJqn (ORCPT ); Wed, 14 Jun 2023 05:46:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46574 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244234AbjFNJqQ (ORCPT ); Wed, 14 Jun 2023 05:46:16 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AFB6B269E; Wed, 14 Jun 2023 02:45:51 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 183C263A37; Wed, 14 Jun 2023 09:45:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6BF80C433C0; Wed, 14 Jun 2023 09:45:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1686735950; bh=bOUTEmQRNYc2XRJruJB03nYq61euHP/sjkiGBDp0jWY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=YagmxNqX4bkrWOB9BBQcotdoXxdLP0fEnyz9wXn8Sl7+EjU/LxuJDdp9uBB38g0YT mRxpCUyx9P6EIwxoACtoqJiSwoWqfAbU7UdZVVcdTSLc7F1aPshmkHgjI4E+Eus30y KepqggXk0js7bVCTE8quhjgSx70PCFTnpgQoTJU0SzXjUfKtWdByl13UOcfXTdTsKN 3N1RD4YgTpBGTPj1vFue4yhScbXUY8z4yta5NFW094UZGmvAO96SvWQp1e+iQx9E7J aq9N5Z8k/eB5IE529DjlRUXFOVzfT3c+Ch0i+yrrhxr+i8q8T4XLKpTPuNWhenChf0 moyzO6KNMk/zA== Date: Wed, 14 Jun 2023 11:45:45 +0200 From: Christian Brauner To: Aleksandr Mikhalitsyn Cc: Xiubo Li , Gregory Farnum , stgraber@ubuntu.com, linux-fsdevel@vger.kernel.org, Ilya Dryomov , Jeff Layton , ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v5 00/14] ceph: support idmapped mounts Message-ID: <20230614-westseite-urlaub-7a5afcf0577a@brauner> References: <20230608154256.562906-1-aleksandr.mikhalitsyn@canonical.com> <20230609-alufolie-gezaubert-f18ef17cda12@brauner> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jun 13, 2023 at 02:46:02PM +0200, Aleksandr Mikhalitsyn wrote: > On Tue, Jun 13, 2023 at 3:43 AM Xiubo Li wrote: > > > > > > On 6/9/23 18:12, Aleksandr Mikhalitsyn wrote: > > > On Fri, Jun 9, 2023 at 12:00 PM Christian Brauner wrote: > > >> On Fri, Jun 09, 2023 at 10:59:19AM +0200, Aleksandr Mikhalitsyn wrote: > > >>> On Fri, Jun 9, 2023 at 3:57 AM Xiubo Li wrote: > > >>>> > > >>>> On 6/8/23 23:42, Alexander Mikhalitsyn wrote: > > >>>>> Dear friends, > > >>>>> > > >>>>> This patchset was originally developed by Christian Brauner but I'll continue > > >>>>> to push it forward. Christian allowed me to do that :) > > >>>>> > > >>>>> This feature is already actively used/tested with LXD/LXC project. > > >>>>> > > >>>>> Git tree (based on https://github.com/ceph/ceph-client.git master): > > >>> Hi Xiubo! > > >>> > > >>>> Could you rebase these patches to 'testing' branch ? > > >>> Will do in -v6. > > >>> > > >>>> And you still have missed several places, for example the following cases: > > >>>> > > >>>> > > >>>> 1 269 fs/ceph/addr.c <> > > >>>> req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_GETATTR, > > >>>> mode); > > >>> + > > >>> > > >>>> 2 389 fs/ceph/dir.c <> > > >>>> req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS); > > >>> + > > >>> > > >>>> 3 789 fs/ceph/dir.c <> > > >>>> req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS); > > >>> We don't have an idmapping passed to lookup from the VFS layer. As I > > >>> mentioned before, it's just impossible now. > > >> ->lookup() doesn't deal with idmappings and really can't otherwise you > > >> risk ending up with inode aliasing which is really not something you > > >> want. IOW, you can't fill in inode->i_{g,u}id based on a mount's > > >> idmapping as inode->i_{g,u}id absolutely needs to be a filesystem wide > > >> value. So better not even risk exposing the idmapping in there at all. > > > Thanks for adding, Christian! > > > > > > I agree, every time when we use an idmapping we need to be careful with > > > what we map. AFAIU, inode->i_{g,u}id should be based on the filesystem > > > idmapping (not mount), > > > but in this case, Xiubo want's current_fs{u,g}id to be mapped > > > according to an idmapping. > > > Anyway, it's impossible at now and IMHO, until we don't have any > > > practical use case where > > > UID/GID-based path restriction is used in combination with idmapped > > > mounts it's not worth to > > > make such big changes in the VFS layer. > > > > > > May be I'm not right, but it seems like UID/GID-based path restriction > > > is not a widespread > > > feature and I can hardly imagine it to be used with the container > > > workloads (for instance), > > > because it will require to always keep in sync MDS permissions > > > configuration with the > > > possible UID/GID ranges on the client. It looks like a nightmare for sysadmin. > > > It is useful when cephfs is used as an external storage on the host, but if you > > > share cephfs with a few containers with different user namespaces idmapping... > > > > Hmm, while this will break the MDS permission check in cephfs then in > > lookup case. If we really couldn't support it we should make it to > > escape the check anyway or some OPs may fail and won't work as expected. > > Hi Xiubo! > > Disabling UID/GID checks on the MDS side looks reasonable. IMHO the > most important checks are: > - open > - mknod/mkdir/symlink/rename > and for these checks we already have an idmapping. > > Also, I want to add that it's a little bit unusual when permission > checks are done against the caller UID/GID. The server side permission checking based on the sender's fs{g,u}id is rather esoteric imho. So I would just disable it for idmapped mounts. > Usually, if we have opened a file descriptor and, for instance, passed > this file descriptor through a unix socket then > file descriptor holder will be able to use it in accordance with the > flags (O_RDONLY, O_RDWR, ...). > We also have ->f_cred on the struct file that contains credentials of > the file opener and permission checks are usually done > based on this. But in cephfs we are always using syscall caller's > credentials. It makes cephfs file descriptor "not transferable" > in terms of permission checks. Yeah, that's another good point.