From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andy Lutomirski Subject: Re: [RFC PATCH] fs: Add user_file_or_path_at and use it for truncate Date: Wed, 28 Aug 2013 12:04:43 -0700 Message-ID: References: <7d1419dda1da70a8ad915f85b093a58b86bcaf3b.1377630856.git.luto@amacapital.net> <20130828061623.GJ27005@ZenIV.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Cc: Linus Torvalds , Willy Tarreau , Ingo Molnar , "security@kernel.org" , Linux Kernel Mailing List , Oleg Nesterov , Linux FS Devel , Brad Spengler To: Al Viro Return-path: In-Reply-To: <20130828061623.GJ27005@ZenIV.linux.org.uk> Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Tue, Aug 27, 2013 at 11:16 PM, Al Viro wrote: > On Tue, Aug 27, 2013 at 01:28:27PM -0700, Andy Lutomirski wrote: >> There are also O_PATH fds, and I'm not sure what the semantics of >> O_PATH fds are or should be when they refer to something other than a >> directory. > > O_PATH file just points to specific location in the tree, no more and > no less. I don't know whether ftruncate(some O_PATH fd) should work. But this probably barely matters. > > g) name_to_handle_at(). Not sure if we need any capability checks, but > we definitely don't give a damn about the way file had been opened. open_by_handle(name_to_handle(read-only fd)) is dangerous for the same reason that open("/proc/self/fd/N", O_RDWR) is, but the capability checks have that covered. > > h) one case in quotactl(). With CAP_SYS_ADMIN required. If attacker got > that, it's too fucking late anyway. > > i) fstatat() and friends, statfs(), utimes() et.al, {set,get,list,remove}xattr() > - no difference in checks between pathname- and descriptor-based calls > anyway. Please, note that for fsetxattr() we do *NOT* require the file > to be opened for write; adding such requirement would be a user-visible > API change, and one fairly likely to break stuff, at that. Sigh. I wonder if that was deliberate. Anyway, it is what it is. > > j) umount() and pivot_root(). We really don't care how the file had been > opened, and attacker capable of playing with that can do a lot more. > > k) inotify_add_watch()/fanotify_mark(). No descriptor-based versions, > no permission checks other than "may read whatever we ended up with". > I really doubt that we care about the way fd had been opened in case > of /proc//fd/. > > l) truncate() mess. > > m) open() mess. > > AFAICS, the *only* cases when we might possibly care are linkat() target, > truncate() and open(). Note, BTW, that right now we *do* allow an attempt > to reopen a file via procfs symlink r/w, even when file had been r/o. > It's subject to permissions on the object being opened, but that's it. > > I'm not sure we can change that - again, it's a user-visible API, and > the change is very likely to break some scripts. In fact, it's about > as dangerous as a full-blown switch to dup-style semantics for procfs > opens, and it's a lot less attractive. > > For truncate() we would only need to have FMODE_WRITE reported, more or > less the same way as FMODE_FLINK. And without open() changes it doesn't > buy us anything at all... > > I've no problem with unrolling the user_path_at() in do_sys_truncate() > into an explicit loop by trailing symlinks and checking for indication > left by proc_pid_follow_link(), more or less the same way as with > LOOKUP_LINK in lookup_last(). It's _far_ less invasive than playing > with "oh, here we fill a struct path or maybe a struct file" horrors, > pinning struct file for no reason, etc. > > AFAICS, the real question is whether we dare to change open() behaviour on > /proc/*/fd/*. I've played with that a bit and I believe that I can do > the switch to dup-style with very localized changes in fs/namei.c and > fs/proc/{base,fd}.c. Will be even binary compatible kernel-side - > ->atomic_open() returns NULL/ERR_PTR where it used to return 0/-error, > not that we had many instances to convert. *IF* that variant is not > out of consideration for userland API stability reasons, I would certainly > prefer to go that way; turns out that these days we can pull it off without > black magic in descriptor handling, etc. Linus? I personally find the check-mode-but-get-a-new-struct-file version to be less weird that the dup approach. Either approach will break scripts that try to write to /dev/stdin (which is the whole point). --Andy