From: Rich Felker <dalias@aerifal.cx>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Drysdale <drysdale@google.com>,
	"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Meredydd Luff <meredydd@senatehouse.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Miller <davem@davemloft.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Oleg Nesterov <oleg@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Kees Cook <keescook@chromium.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Christoph Hellwig <hch@infradead.org>,
	X86 ML <x86@kernel.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	sparclinux@vger.kernel.org
Subject: Re: [PATCHv10 man-pages 5/5] execveat.2: initial man page for execveat(2)
Date: Fri, 9 Jan 2015 16:28:52 -0500
Message-ID: <20150109212852.GU4574@brightrain.aerifal.cx>
In-Reply-To: <20150109210941.GL22149@ZenIV.linux.org.uk>

On Fri, Jan 09, 2015 at 09:09:41PM +0000, Al Viro wrote:
> On Fri, Jan 09, 2015 at 03:59:26PM -0500, Rich Felker wrote:
>
> > > For fsck sake, folks, if you have bloody /proc, you don't need that shite
> > > at all! Just do execve on /proc/self/fd/n, and be done with that.
> > >
> > > The sole excuse for merging that thing in the first place had been
> > > "would anybody think of children^Wsclerotic^Whardened environments
> > > where they have no /proc at all".
> >
> > That doesn't work. With O_CLOEXEC, /proc/self/fd/n is already gone at
> > the time the interpreter runs, whether you're using fexecveat or
> > execve with "/proc/self/fd/n" to implement POSIX fexecve(). That's the
> > problem. This breaks the intended idiom for fexecve.
>
> Just what will your magical symlink do in case when the file is opened,
> unlinked and marked O_CLOEXEC? When should actual freeing of disk blocks,
> etc. happen? And no, you can't assume that interpreter will open the
> damn thing even once - there's nothing to oblige it to do so.

Unlinking is not relevant. Magical symlinks refer to open file
descriptions (either real ones or O_PATH inode-reference-only ones),
not files. There is no new complexity proposed for freeing disk blocks
here. Semantics are identical to existing O_PATH inode references.

> Al, more and more tempted to ask reverting the whole thing - this hardcoded
> /dev/fd/... (in fs/exec.c, no less) is disgraceful enough, but threats of
> even more revolting kludges in the name of "intended idiom for fexecve"...

If you have a multithreaded process that's executing an external
program via fexecve, then unless it has specialized knowledge about
what other parts of the program/libraries are doing, it needs to be
using O_CLOEXEC for the file descriptor. Otherwise, the file
descriptor could be leaked to child processes started by other
threads. This is what I mean by the "intended idiom". Note that it's
easier to use pathnames instead of fexecve, but doing so may not be an
option if the program needs to verify the file before exec'ing it.

This issue can be avoided if you're going to fork-and-fexecve rather
than replacing the calling process, since after forking it's safe to
remove the close-on-exec flag. But then you still have the issue that
the child process, after exec, keeps a spurious file descriptor to its
own process image (executable file) open which it can never close
(because it doesn't know the number). This could eventually lead to fd
exhaustion after many generations.

The "magic open-once magic symlink" approach is really the cleanest
solution I can find. In the case where the interpreter does not open
the script, nothing terribly bad happens; the magic symlink just
sticks around until _exit or exec. In the case where the interpreter
opens it more than once, you get a failure, but as far as I know
existing interpreters don't do this, and it's arguably bad design. In
any case it's a caught error.

Rich
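
The fork-and-fexecve workaround described in the message can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the thread: the helper names fd_is_cloexec, fd_clear_cloexec and spawn_from_fd are hypothetical, and the sketch assumes a Linux/glibc system where fexecve(3) is available. The parent keeps the descriptor O_CLOEXEC so other threads cannot leak it; only the freshly forked, single-threaded child clears the flag just before exec.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: 1 if FD_CLOEXEC is set on fd, 0 if clear, -1 on error. */
static int fd_is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    return flags < 0 ? -1 : (flags & FD_CLOEXEC) != 0;
}

/* Hypothetical helper: clear FD_CLOEXEC on fd. FD_CLOEXEC is a per-descriptor
 * flag, so doing this in a forked child does not affect the parent's copy. */
static int fd_clear_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}

/* Hypothetical helper: fork, clear close-on-exec in the child only, then
 * fexecve. Returns the child's exit status, or -1 on failure. The exec'd
 * program inherits fd as an extra open descriptor, which is the "spurious
 * file descriptor" drawback the message points out. */
static int spawn_from_fd(int fd, char *const argv[], char *const envp[])
{
    assert(fd >= 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: single-threaded after fork, so no other thread can leak fd. */
        if (fd_clear_cloexec(fd) == 0)
            fexecve(fd, argv, envp);
        _exit(127); /* reached only if clearing the flag or fexecve failed */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would open the file with O_RDONLY | O_CLOEXEC, verify it as needed, and then pass the descriptor to spawn_from_fd. Because the flag is cleared only in the child, the parent's descriptor stays close-on-exec throughout. The sketch does not address the leftover descriptor in the exec'd program, which is the problem the proposed magic symlink is meant to solve.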