From: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
To: Christian Brauner <brauner@kernel.org>, Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org, djwong@kernel.org
Subject: Re: [PATCH] xfs: do not clear S_ISUID|S_ISGID for idmapped mounts
Date: Tue, 22 Feb 2022 14:19:16 +0300 [thread overview]
Message-ID: <bdfa9081-1994-95f9-6feb-6710d34b33a1@virtuozzo.com> (raw)
In-Reply-To: <20220222102405.mmqlzimwabz7v67d@wittgenstein>
On 2/22/22 13:24, Christian Brauner wrote:
> On Tue, Feb 22, 2022 at 09:33:40AM +0100, Christoph Hellwig wrote:
>> On Mon, Feb 21, 2022 at 09:22:18PM +0300, Andrey Zhadchenko wrote:
>>> xfs_fileattr_set() handles idmapped mounts correctly and do not drop this
>>> bits.
>>> Unfortunately chown syscall results in different callstask:
>>> i_op->xfs_vn_setattr()->...->xfs_setattr_nonsize() which checks if process
>>> has CAP_FSETID capable in init_user_ns rather than mntns userns.
>>
>> Can you add an xfstests the exercises this path?
>>
>> The fix itself looks good:
>>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> So for anything other than directories the s{g,u}id bits are cleared on
> every chown in notify_change() by the vfs; even for the root user (Also
> documented on chown(2) manpage).
Only exception - chown preserves setgid bit set on a
non-group-executable file (also documented there) but do not take root
privileges into account at vfs level.
>
> So the only scenario were this change would be relevant is for
> directories afaict:
>
> 1. So ext4 has the behavior:
>
> ubuntu@f2-vm|~
> > mkdir suid.dir
> ubuntu@f2-vm|~
> > perms ./suid.dir
> drwxrwxr-x 775 (1000:1000) ./suid.dir
> ubuntu@f2-vm|~
> > chmod u+s ./suid.dir/
> ubuntu@f2-vm|~
> > perms ./suid.dir
> drwsrwxr-x 4775 (1000:1000) ./suid.dir
> ubuntu@f2-vm|~
> > chmod g+s ./suid.dir/
> ubuntu@f2-vm|~
> > perms ./suid.dir
> drwsrwsr-x 6775 (1000:1000) ./suid.dir
> ubuntu@f2-vm|~
> > chown 1000:1000 ./suid.dir/
> ubuntu@f2-vm|~
> > perms ./suid.dir/
> drwsrwsr-x 6775 (1000:1000) ./suid.dir/
>
> meaning that both s{g,u}id bits are retained for directories. (Just to
> make this explicit: changing {g,u}id to the same {g,u}id still ends up
> calling into the filesystem.)
>
> 2. Whereas xfs currently has:
>
> brauner@wittgenstein|~
> > mkdir suid.dir
> brauner@wittgenstein|~
> > perms ./suid.dir
> drwxrwxr-x 775 ./suid.dir
> brauner@wittgenstein|~
> > chmod u+s ./suid.dir/
> brauner@wittgenstein|~
> > perms ./suid.dir
> drwsrwxr-x 4775 ./suid.dir
> brauner@wittgenstein|~
> > chmod g+s ./suid.dir/
> brauner@wittgenstein|~
> > perms ./suid.dir
> drwsrwsr-x 6775 ./suid.dir
> brauner@wittgenstein|~
> > chown 1000:1000 ./suid.dir/
> brauner@wittgenstein|~
> > perms ./suid.dir/
> drwxrwxr-x 775 ./suid.dir/
>
> meaning that both s{g,u}id bits are cleared for directories.
>
> Since the vfs will always ensure that s{g,u}id bits are stripped for
> anything that isn't a directory in the vfs:
> - ATTR_KILL_S{G,U}ID is raised in chown_common():
>
> if (!S_ISDIR(inode->i_mode))
> newattrs.ia_valid |=
> ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
>
> - and then in notify_change() we'll get the bits stripped and ATTR_MODE
> raised:
>
> if (ia_valid & ATTR_KILL_SUID) {
> if (mode & S_ISUID) {
> ia_valid = attr->ia_valid |= ATTR_MODE;
> attr->ia_mode = (inode->i_mode & ~S_ISUID);
> }
> }
> if (ia_valid & ATTR_KILL_SGID) {
> if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
So SGID is not killed if there is no S_IXGRP (yet no capability check)
Actually I do not really understand why do kernel expects filesystems to
further apply restrictions with CAP_FSETID. Why not kill it here since
we have all info?
> if (!(ia_valid & ATTR_MODE)) {
> ia_valid = attr->ia_valid |= ATTR_MODE;
> attr->ia_mode = inode->i_mode;
> }
> attr->ia_mode &= ~S_ISGID;
> }
> }
>
> we can change this codepath to just mirror setattr_copy() or switch
> fully to setattr_copy() (if feasible).
>
> Because as of right now the code seems to imply that the xfs code itself
> is responsible for stripping s{g,u}id bits for all files whereas it is
> the vfs that does it for any non-directory. So I'd propose to either try
> and switch that code to setattr_copy() or to do open-code the
> setattr_copy() check:
>
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index b79b3846e71b..ff55b31521a2 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -748,9 +748,13 @@ xfs_setattr_nonsize(
> * The set-user-ID and set-group-ID bits of a file will be
> * cleared upon successful return from chown()
> */
> - if ((inode->i_mode & (S_ISUID|S_ISGID)) &&
> - !capable(CAP_FSETID))
> - inode->i_mode &= ~(S_ISUID|S_ISGID);
> + if (iattr->ia_valid & ATTR_MODE) {
> + umode_t mode = iattr->ia_mode;
> + if (!in_group_p(i_gid_into_mnt(mnt_userns, inode)) &&
> + !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
> + mode &= ~S_ISGID;
> + inode->i_mode = mode;
> + }
>
> /*
> * Change the ownerships and register quota modifications
>
> which aligns xfs with ext4 and any other filesystem. Any thoughts on
> this?
>
> For @Andrey specifically: the tests these should go into:
>
> https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/src/idmapped-mounts/idmapped-mounts.c
>
> Note that there are already setgid inheritance tests and set*id
> execution tests in there.
> You should be able to copy a lot of code from them. Could you please add
> the test I sketched above and ideally also a test that the set{g,u}id
> bits are stripped during chown for regular files?
Thanks for the link. To clarify what tests and result you expect:
- for directory chown we expect to preserve s{g,u}id
- for regfile chown we expect to preserve S_ISGID only if S_IXGRP is
absent and CAP_FSETID is present
JFYI: I found out this problem while running LTP (specifically
syscalls/chown02 test) on idmapped XFS. Maybe I will be able to find
more, who knows.
next prev parent reply other threads:[~2022-02-22 11:19 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-21 18:22 [PATCH] xfs: do not clear S_ISUID|S_ISGID for idmapped mounts Andrey Zhadchenko
2022-02-22 8:33 ` Christoph Hellwig
2022-02-22 9:25 ` Andrey Zhadchenko
2022-02-22 10:24 ` Christian Brauner
2022-02-22 11:19 ` Andrey Zhadchenko [this message]
2022-02-22 12:23 ` Christian Brauner
2022-02-22 12:36 ` Christian Brauner
2022-02-22 12:44 ` Christian Brauner
2022-02-22 14:54 ` Andrey Zhadchenko
2022-02-22 15:03 ` Christian Brauner
2022-02-22 21:40 ` Dave Chinner
2022-02-23 8:11 ` Christian Brauner
2022-02-25 1:57 ` Darrick J. Wong
2022-02-25 9:45 ` Christian Brauner
2022-02-25 10:42 ` Andrey Zhadchenko
2022-02-25 17:11 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bdfa9081-1994-95f9-6feb-6710d34b33a1@virtuozzo.com \
--to=andrey.zhadchenko@virtuozzo.com \
--cc=brauner@kernel.org \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).