From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753625AbbFIDkY (ORCPT ); Mon, 8 Jun 2015 23:40:24 -0400 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:64586 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752983AbbFIDkQ (ORCPT ); Mon, 8 Jun 2015 23:40:16 -0400 From: Calvin Owens To: Andrew Morton , Alexey Dobriyan , "Eric W. Biederman" , Al Viro , Miklos Szeredi , Zefan Li , Oleg Nesterov , Joe Perches , David Howells CC: Calvin Owens , , , Andy Lutomirski , Cyrill Gorcunov , Kees Cook , "Kirill A. Shutemov" Subject: [PATCH v6] procfs: Always expose /proc//map_files/ and make it readable Date: Mon, 8 Jun 2015 20:39:33 -0700 Message-ID: <1433821173-2804704-1-git-send-email-calvinowens@fb.com> X-Mailer: git-send-email 1.8.1 In-Reply-To: <1432005006-3428-1-git-send-email-calvinowens@fb.com> References: <1432005006-3428-1-git-send-email-calvinowens@fb.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [192.168.52.123] X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.14.151,1.0.33,0.0.0000 definitions=2015-06-09_03:2015-06-08,2015-06-09,1970-01-01 signatures=0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Currently, /proc//map_files/ is restricted to CAP_SYS_ADMIN, and is only exposed if CONFIG_CHECKPOINT_RESTORE is set. This interface very useful because it allows userspace to stat() deleted files that are still mapped by some process, which enables a much quicker and more accurate answer to the question "How much disk space is being consumed by files that are deleted but still mapped?" than is currently possible. This patch moves map_files/ out from behind CONFIG_CHECKPOINT_RESTORE, and adjusts the permissions enforced on it as follows: * proc_map_files_lookup() * proc_map_files_readdir() * map_files_d_revalidate() Remove the CAP_SYS_ADMIN restriction, leaving only the current restriction requiring PTRACE_MODE_READ. In earlier versions of this patch, I changed the ptrace checks in the functions above to enforce MODE_ATTACH instead of MODE_READ. That was an oversight: all the information exposed by the above three functions is already available with MODE_READ from /proc/PID/maps. I was only being asked to strengthen the protection around functionality provided by follow_link(), not the above. So, I've left the checks for MODE_READ as-is, since AFAICS all objections raised so far are addressed by the new CAP_SYS_ADMIN check in follow_link(), explained below. * proc_map_files_follow_link() This stub has been added, and requires that the user have CAP_SYS_ADMIN in order to follow the links in map_files/, since there was concern on LKML both about the potential for bypassing permissions on ancestor directories in the path to files pointed to, and about what happens with more exotic memory mappings created by some drivers (ie dma-buf). Cc: Andrew Morton Cc: Andy Lutomirski Cc: Cyrill Gorcunov Cc: Joe Perches Cc: Kees Cook Cc: Kirill A. Shutemov Signed-off-by: Calvin Owens --- Changes in v6: Require CAP_SYS_ADMIN for follow_link(). Leave other PTRACE_MODE_READ checks as-is, since CAP_SYS_ADMIN alone addresses all concerns raised AFAICS. Changes in v5: s/dentry->d_inode/d_inode(dentry)/g Changes in v4: Return -ESRCH from follow_link() when get_proc_task() returns NULL. Changes in v3: Changed permission checks to use PTRACE_MODE_ATTACH instead of PTRACE_MODE_READ, and added a stub to enforce MODE_ATTACH on follow_link() as well. Changes in v2: Removed the follow_link() stub that returned -EPERM if the caller didn't have CAP_SYS_ADMIN, since the caller in my chroot() scenario gets -EACCES anyway. fs/proc/base.c | 42 +++++++++++++++++++++++------------------- 1 file changed, 23 insertions(+), 19 deletions(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index 093ca14..0270191 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -1641,8 +1641,6 @@ end_instantiate: return dir_emit(ctx, name, len, 1, DT_UNKNOWN); } -#ifdef CONFIG_CHECKPOINT_RESTORE - /* * dname_to_vma_addr - maps a dentry name into two unsigned longs * which represent vma start and end addresses. @@ -1669,11 +1667,6 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags) if (flags & LOOKUP_RCU) return -ECHILD; - if (!capable(CAP_SYS_ADMIN)) { - status = -EPERM; - goto out_notask; - } - inode = d_inode(dentry); task = get_proc_task(inode); if (!task) @@ -1762,6 +1755,28 @@ struct map_files_info { unsigned char name[4*sizeof(long)+2]; /* max: %lx-%lx\0 */ }; +/* + * Only allow CAP_SYS_ADMIN to follow the links, due to concerns about how the + * symlinks may be used to bypass permissions on ancestor directories in the + * path to the file in question. + */ +static void *proc_map_files_follow_link(struct dentry *dentry, struct nameidata *nd) +{ + if (!capable(CAP_SYS_ADMIN)) + return ERR_PTR(-EPERM); + + return proc_pid_follow_link(dentry, nd); +} + +/* + * Identical to proc_pid_link_inode_operations except for follow_link() + */ +static const struct inode_operations proc_map_files_link_inode_operations = { + .readlink = proc_pid_readlink, + .follow_link = proc_map_files_follow_link, + .setattr = proc_setattr, +}; + static int proc_map_files_instantiate(struct inode *dir, struct dentry *dentry, struct task_struct *task, const void *ptr) @@ -1777,7 +1792,7 @@ proc_map_files_instantiate(struct inode *dir, struct dentry *dentry, ei = PROC_I(inode); ei->op.proc_get_link = proc_map_files_get_link; - inode->i_op = &proc_pid_link_inode_operations; + inode->i_op = &proc_map_files_link_inode_operations; inode->i_size = 64; inode->i_mode = S_IFLNK; @@ -1801,10 +1816,6 @@ static struct dentry *proc_map_files_lookup(struct inode *dir, int result; struct mm_struct *mm; - result = -EPERM; - if (!capable(CAP_SYS_ADMIN)) - goto out; - result = -ENOENT; task = get_proc_task(dir); if (!task) @@ -1858,10 +1869,6 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx) struct map_files_info *p; int ret; - ret = -EPERM; - if (!capable(CAP_SYS_ADMIN)) - goto out; - ret = -ENOENT; task = get_proc_task(file_inode(file)); if (!task) @@ -2050,7 +2057,6 @@ static const struct file_operations proc_timers_operations = { .llseek = seq_lseek, .release = seq_release_private, }; -#endif /* CONFIG_CHECKPOINT_RESTORE */ static int proc_pident_instantiate(struct inode *dir, struct dentry *dentry, struct task_struct *task, const void *ptr) @@ -2549,9 +2555,7 @@ static const struct inode_operations proc_task_inode_operations; static const struct pid_entry tgid_base_stuff[] = { DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations), DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), -#ifdef CONFIG_CHECKPOINT_RESTORE DIR("map_files", S_IRUSR|S_IXUSR, proc_map_files_inode_operations, proc_map_files_operations), -#endif DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations), #ifdef CONFIG_NET -- 1.8.1