linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	David Howells <dhowells@redhat.com>,
	linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	torvalds@linux-foundation.org,
	linux-security-module@vger.kernel.org
Subject: Re: [PATCH 02/10] vfs: syscall: Add move_mount(2) to move mounts around
Date: Mon, 8 Jul 2019 19:01:32 +0100	[thread overview]
Message-ID: <20190708180132.GU17978@ZenIV.linux.org.uk> (raw)
In-Reply-To: <874l3wo3gq.fsf@xmission.com>

On Mon, Jul 08, 2019 at 12:12:21PM -0500, Eric W. Biederman wrote:

> Al you do realize that the TOCTOU you are talking about comes the system
> call API.  TOMOYO can only be faulted for not playing in their own
> sandbox and not reaching out and fixing the vfs implementation details.
>
> Userspace has always had to very careful to only mount filesystems
> on paths that root completely controls and won't change.

That has nothing whatsoever to do with the path where you are mounting
something.  _That_ is actually looked up before ->sb_mount() gets called;
no TOCTOU there.

The thing where ->sb_mount() is fucked by design is its handling of
	* device name
	* old tree in mount --bind
	* old tree in mount --move
	* things like journal name (not that any of the instances had tried
to do anything with that)

All of those *do* have TOCTOU, and that's an inevitable result of the
idiotic hook fetishism of LSM design.  Instead of "we want something
to happen when such-and-such predicate is about to change", it's
"lemme run my code, the earlier the better, I don't care about any
damn predicates, it's all too complicated anyway, whaddya mean
racy?"

Any time you have pathname resolution done twice, it's a built-in race.
If you want *ALL* checks on mount(2) to be done before the mean, nasty
kernel code gets to decide anything (bind/move/mount/etc. all squashed
together, just let us have at the syscall arguments, mmkay?) - that's
precisely what you get.

And no, that TOCTOU is not in syscall API.  "open() of an untrusted
pathname may end up trying to open hell knows what" is one thing;
"open() of an untrusted pathname may apply MAC checks to one object
and open something entirely different" is another.  The former is
inherent to syscall API.  The latter would be a badly fucked up
implementation (we don't have that issue on open(2), thankfully).

To make it clear, TOMOYO is not at fault here; LSM "architecture" is.
Note, BTW, that TOMOYO checks there do *NOT* limit the input pathname
at all - only the destination of the first pathwalk.  Repeating it
may easily lead to an entirely different place.

Canonicalized pathname is derived from pathwalk result; having concluded
that it's perfectly fine for the operation requested is pure security
theatre - it
	* says nothing about the trustedness of the original pathname
	* may have nothing whatsoever to the object yielded by the
second pathwalk, which is what'll end up actually used.
It's not even "this thing walks through /proc, and thus not to be trusted
to be stable" - the checks won't notice where the damn thing had been.

When somebody proposes _useful_ MAC for mount --move (and that really
can't be done at the level of syscall entry - we need to have already
figured out that with given combination of flags the 1st argument of
mount(2) will be a pathname *and* already looked it up), sure - it
will be added to do_move_mount(), which is where we have all lookups
done, and apply both for mount() and move_mount().

Right now anyone relying upon DAC enforced for MS_MOVE has worse problems
than "attacker will use move_mount(2) and bypass my policy" - the same
attacker can bloody well bypass those with nothing more exotic than
clone(2) and dup2(2) (and mount(2), of course).

And it's not just MS_MOVE (or MS_BIND).  Anyone trying to prevent
mounting e.g. ext2 from untrusted device and do that on the level of
->sb_mount() *is* *bloody* *well* *fucked*.  ->sb_mount() is simply
the wrong place for that.

  reply	other threads:[~2019-07-08 18:01 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-19 17:08 [PATCH 00/10] VFS: Provide new mount UAPI David Howells
2019-02-19 17:08 ` [PATCH 01/10] vfs: syscall: Add open_tree(2) to reference or clone a mount David Howells
2019-02-19 17:08 ` [PATCH 02/10] vfs: syscall: Add move_mount(2) to move mounts around David Howells
2019-02-20 12:32   ` Alan Jenkins
2019-02-20 12:41     ` Alan Jenkins
2019-02-20 16:23   ` Jann Horn
2019-07-08 12:02   ` Tetsuo Handa
2019-07-08 13:18     ` Al Viro
2019-07-08 17:12       ` Eric W. Biederman
2019-07-08 18:01         ` Al Viro [this message]
2019-07-08 18:13           ` Al Viro
2019-07-08 20:21           ` Al Viro
2019-07-09  0:13             ` Eric W. Biederman
2019-07-09 10:51               ` Tetsuo Handa
2019-07-22 10:12                 ` Tetsuo Handa
2019-07-23  4:16                   ` John Johansen
2019-07-23 13:45                     ` Tetsuo Handa
2019-08-06 10:43                       ` Tetsuo Handa
2019-08-22  3:51                         ` [RFC][PATCH] fix d_absolute_path() interplay with fsmount() Al Viro
2019-08-30 10:11                           ` Tetsuo Handa
2019-07-23 21:45             ` [PATCH 02/10] vfs: syscall: Add move_mount(2) to move mounts around James Morris
2019-07-23 23:30               ` Al Viro
2019-02-19 17:08 ` [PATCH 03/10] teach move_mount(2) to work with OPEN_TREE_CLONE David Howells
2019-02-20 18:59   ` Alan Jenkins
2019-02-26 17:45   ` Alan Jenkins
2019-02-19 17:08 ` [PATCH 04/10] Make anon_inodes unconditional David Howells
2019-02-19 17:09 ` [PATCH 05/10] vfs: syscall: Add fsopen() to prepare for superblock creation David Howells
2019-02-19 17:09 ` [PATCH 06/10] vfs: Implement logging through fs_context David Howells
2019-02-19 17:09 ` [PATCH 07/10] vfs: syscall: Add fsconfig() for configuring and managing a context David Howells
2019-02-19 17:09 ` [PATCH 08/10] vfs: syscall: Add fsmount() to create a mount for a superblock David Howells
2019-02-19 17:09 ` [PATCH 09/10] vfs: syscall: Add fspick() to select a superblock for reconfiguration David Howells
2019-02-19 17:09 ` [PATCH 10/10] vfs: Add a sample program for the new mount API David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190708180132.GU17978@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).