On 2020-05-15, Kees Cook <keescook@chromium.org> wrote:
> On Fri, May 15, 2020 at 04:43:37PM +0200, Florian Weimer wrote:
> > * Kees Cook:
> > 
> > > On Fri, May 15, 2020 at 10:43:34AM +0200, Florian Weimer wrote:
> > >> * Kees Cook:
> > >> 
> > >> > Maybe I've missed some earlier discussion that ruled this out, but I
> > >> > couldn't find it: let's just add O_EXEC and be done with it. It actually
> > >> > makes the execve() path more like openat2() and is much cleaner after
> > >> > a little refactoring. Here are the results, though I haven't emailed it
> > >> > yet since I still want to do some more testing:
> > >> > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/o_exec/v1
> > >> 
> > >> I think POSIX specifies O_EXEC in such a way that it does not confer
> > >> read permissions.  This seems incompatible with what we are trying to
> > >> achieve here.
> > >
> > > I was trying to retain this behavior, since we already make this
> > > distinction between execve() and uselib() with the MAY_* flags:
> > >
> > > execve():
> > >         struct open_flags open_exec_flags = {
> > >                 .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
> > >                 .acc_mode = MAY_EXEC,
> > >
> > > uselib():
> > >         static const struct open_flags uselib_flags = {
> > >                 .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
> > >                 .acc_mode = MAY_READ | MAY_EXEC,
> > >
> > > I tried to retain this in my proposal, in the O_EXEC does not imply
> > > MAY_READ:
> > 
> > That doesn't quite parse for me, sorry.
> > 
> > The point is that the script interpreter actually needs to *read* those
> > files in order to execute them.
> 
> I think I misunderstood what you meant (Mickaël got me sorted out
> now). If O_EXEC is already meant to be "EXEC and _not_ READ nor WRITE",
> then yes, this new flag can't be O_EXEC. I was reading the glibc
> documentation (which treats it as a permission bit flag, not POSIX,
> which treats it as a complete mode description).

On the other hand, if we had O_EXEC (or O_EXONLY a-la O_RDONLY) then the
interpreter could re-open the file descriptor as O_RDONLY after O_EXEC
succeeds. Not ideal, but I don't think it's a deal-breaker.

Regarding O_MAYEXEC, I do feel a little conflicted.

I do understand that its goal is not to be what O_EXEC was supposed to
be (which is loosely what O_PATH has effectively become), so I think
that this is not really a huge problem -- especially since you could
just do O_MAYEXEC|O_PATH if you wanted to disallow reading explicitly.
It would be nice to have an O_EXONLY concept, but it's several decades
too late to make it mandatory (and making it optional has questionable
utility IMHO).

However, the thing I still feel mildly conflicted about is the sysctl. I
do understand the argument for it (ultimately, whether O_MAYEXEC is
usable on a system depends on the distribution) but it means that any
program which uses O_MAYEXEC cannot rely on it to provide the security
guarantees they expect. Even if the program goes and reads the sysctl
value, it could change underneath them. If this is just meant to be a
best-effort protection then this doesn't matter too much, but I just
feel uneasy about these kinds of best-effort protections.

I do wonder if we could require that fexecve(3) can only be done with
file descriptors that have been opened with O_MAYEXEC (obviously this
would also need to be a sysctl -- *sigh*). This would tie in to some of
the magic-link changes I wanted to push (namely, upgrade_mask).

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>