From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751804AbdARXay (ORCPT ); Wed, 18 Jan 2017 18:30:54 -0500 Received: from mail-it0-f52.google.com ([209.85.214.52]:36625 "EHLO mail-it0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751354AbdARXaE (ORCPT ); Wed, 18 Jan 2017 18:30:04 -0500 MIME-Version: 1.0 In-Reply-To: <20170118040159.4751-1-asarai@suse.de> References: <20170118040159.4751-1-asarai@suse.de> From: Kees Cook Date: Wed, 18 Jan 2017 15:22:48 -0800 X-Google-Sender-Auth: SKcBYEGghSBow9pnk6COxWQAEGk Message-ID: Subject: Re: [PATCH] procfs: change the owner of non-dumpable and writeable files To: Aleksa Sarai Cc: Andrew Morton , Michal Hocko , Oleg Nesterov , Al Viro , John Stultz , Mateusz Guzik , Janis Danisevskis , LKML , dev@opencontainers.org, Linux Containers Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 17, 2017 at 8:01 PM, Aleksa Sarai wrote: > In order to protect against ptrace(2) and similar attacks on container > runtimes when they join namespaces, many runtimes set mm->dumpable to > SUID_DUMP_DISABLE. However, doing this means that attempting to set up > an unprivileged user namespace will fail because an unprivileged process > can no longer access /proc/self/{setgroups,{uid,gid}_map} for the > container process (which is the same uid as the runtime process). > > Fix this by changing pid_getattr to *also* change the owner of regular > files that have a mode of 0644 (when the process is not dumpable). This > ensures that the important /proc/[pid]/... files mentioned above are > properly accessible by a container runtime in a rootless container > context. > > The most blantant issue is that a non-dumpable process in a rootless > container context is unable to open /proc/self/setgroups, because it > doesn't own the file. This changes a lot more than just setgroups, doesn't it? This bypasses the task_dumpable check for all kinds of things. Though, I expect the has_pid_permissions() check to be the harder one to pass. Why does has_pid_permissions() succeed in the case you've given? -Kees > int main(void) > { > prctl(PR_SET_DUMPABLE, 0, 0, 0, 0); > unshare(CLONE_NEWUSER); > > /* This will fail. */ > int fd = open("/proc/self/setgroups", O_WRONLY); > if (fd < 0) > abort(); > > return 0; > } > > Cc: dev@opencontainers.org > Cc: containers@lists.linux-foundation.org > Signed-off-by: Aleksa Sarai > --- > fs/proc/base.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/fs/proc/base.c b/fs/proc/base.c > index ca651ac00660..ebabb12f4536 100644 > --- a/fs/proc/base.c > +++ b/fs/proc/base.c > @@ -1729,6 +1729,7 @@ int pid_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat) > return -ENOENT; > } > if ((inode->i_mode == (S_IFDIR|S_IRUGO|S_IXUGO)) || > + (inode->i_mode == (S_IFREG|S_IRUGO|S_IWUSR)) || > task_dumpable(task)) { > cred = __task_cred(task); > stat->uid = cred->euid; > @@ -1770,6 +1771,7 @@ int pid_revalidate(struct dentry *dentry, unsigned int flags) > > if (task) { > if ((inode->i_mode == (S_IFDIR|S_IRUGO|S_IXUGO)) || > + (inode->i_mode == (S_IFREG|S_IRUGO|S_IWUSR)) || > task_dumpable(task)) { > rcu_read_lock(); > cred = __task_cred(task); > @@ -2394,7 +2396,7 @@ static int proc_pident_instantiate(struct inode *dir, > return -ENOENT; > } > > -static struct dentry *proc_pident_lookup(struct inode *dir, > +static struct dentry *proc_pident_lookup(struct inode *dir, > struct dentry *dentry, > const struct pid_entry *ents, > unsigned int nents) > @@ -2536,7 +2538,7 @@ static const struct pid_entry attr_dir_stuff[] = { > > static int proc_attr_dir_readdir(struct file *file, struct dir_context *ctx) > { > - return proc_pident_readdir(file, ctx, > + return proc_pident_readdir(file, ctx, > attr_dir_stuff, ARRAY_SIZE(attr_dir_stuff)); > } > > -- > 2.11.0 > -- Kees Cook Nexus Security