linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Marchevsky <davemarchevsky@fb.com>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: <linux-fsdevel@vger.kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Seth Forshee <sforshee@digitalocean.com>,
	Rik van Riel <riel@surriel.com>, <kernel-team@fb.com>
Subject: Re: [PATCH] fuse: allow CAP_SYS_ADMIN in root userns to access allow_other mount
Date: Fri, 12 Nov 2021 18:29:20 -0500	[thread overview]
Message-ID: <0515c3c8-c9e3-25dd-4b49-bb8e19c76f0d@fb.com> (raw)
In-Reply-To: <20211112101307.iqf3nhxgchf2u2i3@wittgenstein>

On 11/12/21 5:13 AM, Christian Brauner wrote:   
> On Thu, Nov 11, 2021 at 02:11:42PM -0800, Dave Marchevsky wrote:
>> Since commit 73f03c2b4b52 ("fuse: Restrict allow_other to the
>> superblock's namespace or a descendant"), access to allow_other FUSE
>> filesystems has been limited to users in the mounting user namespace or
>> descendants. This prevents a process that is privileged in its userns -
>> but not its parent namespaces - from mounting a FUSE fs w/ allow_other
>> that is accessible to processes in parent namespaces.
>>
>> While this restriction makes sense overall it breaks a legitimate
>> usecase for me. I have a tracing daemon which needs to peek into
>> process' open files in order to symbolicate - similar to 'perf'. The
>> daemon is a privileged process in the root userns, but is unable to peek
>> into FUSE filesystems mounted with allow_other by processes in child
>> namespaces.
>>
>> This patch adds an escape hatch to the descendant userns logic
>> specifically for processes with CAP_SYS_ADMIN in the root userns. Such
>> processes can already do many dangerous things regardless of namespace,
>> and moreover could fork and setns into any child userns with a FUSE
>> mount, so it's reasonable to allow them to interact with all allow_other
>> FUSE filesystems.
>>
>> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
>> Cc: Miklos Szeredi <miklos@szeredi.hu>
>> Cc: Seth Forshee <sforshee@digitalocean.com>
>> Cc: Rik van Riel <riel@surriel.com>
>> Cc: kernel-team@fb.com
>> ---
> 
> If your tracing daemon runs in init_user_ns with CAP_SYS_ADMIN why can't
> it simply use a helper process/thread to
> setns(userns_fd/pidfd, CLONE_NEWUSER)
> to the target userns? This way we don't need to special-case
> init_user_ns at all.

helper process + setns could work for my usecase. But the fact that there's no
way to say "I know what I am about to do is potentially stupid and dangerous,
but I am root so let me do it", without spawning a helper process in this case,
feels like it'll result in special-case userspace workarounds for anyone doing
symbolication of backtraces.

e.g. perf will have to add some logic: "did I fail
to grab this exe that some process had mapped? Is it in a FUSE mounted by some
descendant userns? let's fork a helper process..." Not the end of the world,
but unnecessary complexity nonetheless.

That being said, I agree that this patch's special-casing of init_user_ns is
hacky. What do you think about a more explicit and general "let me do this
stupid and dangerous thing" mechanism - perhaps a new struct fuse_conn field
containing a set of exception userns', populated with ioctl or similar.

> 
>>
>> Note: I was unsure whether CAP_SYS_ADMIN or CAP_SYS_PTRACE was the best
>> choice of capability here. Went with the former as it's checked
>> elsewhere in fs/fuse while CAP_SYS_PTRACE isn't.
>>
>>  fs/fuse/dir.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
>> index 0654bfedcbb0..2524eeb0f35d 100644
>> --- a/fs/fuse/dir.c
>> +++ b/fs/fuse/dir.c
>> @@ -1134,7 +1134,7 @@ int fuse_allow_current_process(struct fuse_conn *fc)
>>  	const struct cred *cred;
>>  
>>  	if (fc->allow_other)
>> -		return current_in_userns(fc->user_ns);
>> +		return current_in_userns(fc->user_ns) || capable(CAP_SYS_ADMIN);
>>  
>>  	cred = current_cred();
>>  	if (uid_eq(cred->euid, fc->user_id) &&
>> -- 
>> 2.30.2
>>
>>


  reply	other threads:[~2021-11-12 23:29 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-11 22:11 [PATCH] fuse: allow CAP_SYS_ADMIN in root userns to access allow_other mount Dave Marchevsky
2021-11-12  2:10 ` Rik van Riel
2021-11-12 10:13 ` Christian Brauner
2021-11-12 23:29   ` Dave Marchevsky [this message]
2021-11-15 15:28     ` Miklos Szeredi
2022-05-17 16:50       ` Dave Marchevsky
2022-05-18 11:22         ` Christian Brauner
2022-05-18 11:26           ` Miklos Szeredi
2022-05-19  4:56             ` Andrii Nakryiko
2022-05-19  8:59               ` Christian Brauner
2022-05-24  4:35                 ` Andrii Nakryiko
2022-05-24  7:07                   ` Miklos Szeredi
2022-05-24 14:59                     ` Rik van Riel
2022-05-24 15:44                     ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0515c3c8-c9e3-25dd-4b49-bb8e19c76f0d@fb.com \
    --to=davemarchevsky@fb.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=kernel-team@fb.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=riel@surriel.com \
    --cc=sforshee@digitalocean.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).