From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751613AbdASJbE (ORCPT <rfc822;w@1wt.eu>);
        Thu, 19 Jan 2017 04:31:04 -0500
Received: from mx2.suse.de ([195.135.220.15]:58939 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751487AbdASJ3p (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 19 Jan 2017 04:29:45 -0500
Date: Thu, 19 Jan 2017 10:29:31 +0100
From: Michal Hocko <mhocko@kernel.org>
To: Aleksa Sarai <asarai@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
        Oleg Nesterov <oleg@redhat.com>, Kees Cook <keescook@chromium.org>,
        Al Viro <viro@zeniv.linux.org.uk>,
        John Stultz <john.stultz@linaro.org>,
        Mateusz Guzik <mguzik@redhat.com>,
        Janis Danisevskis <jdanis@google.com>, linux-kernel@vger.kernel.org,
        dev@opencontainers.org, containers@lists.linux-foundation.org,
        "Eric W. Biederman" <ebiederm@xmission.com>
Subject: Re: [PATCH] procfs: change the owner of non-dumpable and writeable
 files
Message-ID: <20170119092930.GJ30786@dhcp22.suse.cz>
References: <20170118040159.4751-1-asarai@suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170118040159.4751-1-asarai@suse.de>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Cc Eric

On Wed 18-01-17 15:01:59, Aleksa Sarai wrote:
> In order to protect against ptrace(2) and similar attacks on container
> runtimes when they join namespaces, many runtimes set mm->dumpable to
> SUID_DUMP_DISABLE. However, doing this means that attempting to set up
> an unprivileged user namespace will fail because an unprivileged process
> can no longer access /proc/self/{setgroups,{uid,gid}_map} for the
> container process (which is the same uid as the runtime process).
> 
> Fix this by changing pid_getattr to *also* change the owner of regular
> files that have a mode of 0644 (when the process is not dumpable). This
> ensures that the important /proc/[pid]/... files mentioned above are
> properly accessible by a container runtime in a rootless container
> context.
> 
> The most blantant issue is that a non-dumpable process in a rootless
> container context is unable to open /proc/self/setgroups, because it
> doesn't own the file.
> 
> int main(void)
> {
> 	prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
> 	unshare(CLONE_NEWUSER);
> 
> 	/* This will fail. */
> 	int fd = open("/proc/self/setgroups", O_WRONLY);
> 	if (fd < 0)
> 		abort();
> 
> 	return 0;
> }

I do agree that failing to open anything in /proc/self/ is more than
unexepcted! I cannot judge the patch but my gut feeling tells me that
the fix should be somewhere in the open handler.

One nit below

> 
> Cc: dev@opencontainers.org
> Cc: containers@lists.linux-foundation.org
> Signed-off-by: Aleksa Sarai <asarai@suse.de>
> ---
>  fs/proc/base.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index ca651ac00660..ebabb12f4536 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1729,6 +1729,7 @@ int pid_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
>  			return -ENOENT;
>  		}
>  		if ((inode->i_mode == (S_IFDIR|S_IRUGO|S_IXUGO)) ||
> +		    (inode->i_mode == (S_IFREG|S_IRUGO|S_IWUSR)) ||
>  		    task_dumpable(task)) {
>  			cred = __task_cred(task);
>  			stat->uid = cred->euid;
> @@ -1770,6 +1771,7 @@ int pid_revalidate(struct dentry *dentry, unsigned int flags)
>  
>  	if (task) {
>  		if ((inode->i_mode == (S_IFDIR|S_IRUGO|S_IXUGO)) ||
> +		    (inode->i_mode == (S_IFREG|S_IRUGO|S_IWUSR)) ||
>  		    task_dumpable(task)) {
>  			rcu_read_lock();
>  			cred = __task_cred(task);
> @@ -2394,7 +2396,7 @@ static int proc_pident_instantiate(struct inode *dir,
>  	return -ENOENT;
>  }
>  
> -static struct dentry *proc_pident_lookup(struct inode *dir, 
> +static struct dentry *proc_pident_lookup(struct inode *dir,
>  					 struct dentry *dentry,
>  					 const struct pid_entry *ents,
>  					 unsigned int nents)
> @@ -2536,7 +2538,7 @@ static const struct pid_entry attr_dir_stuff[] = {
>  
>  static int proc_attr_dir_readdir(struct file *file, struct dir_context *ctx)
>  {
> -	return proc_pident_readdir(file, ctx, 
> +	return proc_pident_readdir(file, ctx,
>  				   attr_dir_stuff, ARRAY_SIZE(attr_dir_stuff));
>  }

this two are just whitespace noise

-- 
Michal Hocko
SUSE Labs