linux-kernel.vger.kernel.org archive mirror
From: bfields@fieldses.org (J. Bruce Fields)
To: Dave Chinner <david@fromorbit.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Casey Schaufler <casey@schaufler-ca.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Seth Forshee <seth.forshee@canonical.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	LSM List <linux-security-module@vger.kernel.org>,
	SELinux-NSA <selinux@tycho.nsa.gov>,
	Serge Hallyn <serge.hallyn@canonical.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/7] Initial support for user namespace owned mounts
Date: Tue, 21 Jul 2015 13:37:21 -0400
Message-ID: <20150721173721.GE11050@fieldses.org>
In-Reply-To: <20150717024735.GW3902@dastard>

On Fri, Jul 17, 2015 at 12:47:35PM +1000, Dave Chinner wrote:
> On Thu, Jul 16, 2015 at 07:42:03PM -0500, Eric W. Biederman wrote:
> > Dave Chinner <david@fromorbit.com> writes:
> > 
> > > On Wed, Jul 15, 2015 at 11:47:08PM -0500, Eric W. Biederman wrote:
> > >> Casey Schaufler <casey@schaufler-ca.com> writes:
> > >> > On 7/15/2015 6:08 PM, Andy Lutomirski wrote:
> > >> >> If I mount an unprivileged filesystem, then either the contents were
> > >> >> put there *by me*, in which case letting me access them is fine, or
> > >> >> (with Seth's patches and then some) I control the backing store, in
> > >> >> which case I can do whatever I want regardless of what LSM thinks.
> > >> >>
> > >> >> So I don't see the problem.  Why would Smack or any other LSM care at
> > >> >> all, unless it wants to prevent me from mounting the fs in the first
> > >> >> place?
> > >> >
> > >> > First off, I don't cotton to the notion that you should be able
> > >> > to mount filesystems without privilege. But it seems I'm being
> > >> > outvoted on that. I suspect that there are cases where it might
> > >> > be safe, but I can't think of one off the top of my head.
> > >> 
> > >> There are two fundamental issues with mounting filesystems without privilege,
> > >> by which I actually mean mounting filesystems as the root user in a user
> > >> namespace.
> > >> 
> > >> - Are the semantics safe.
> > >> - Is the extra attack surface a problem.
> > >
> > > I think the attack surface this exposes is the biggest problem
> > > facing this proposal.
> > 
> > I completely agree.
> > 
> > >> Figuring out how to make semantics safe is what we are talking about.
> > >> 
> > >> Once we sort out the semantics we can look at the handful of filesystems
> > >> like fuse where the extra attack surface is not a concern.
> > >> 
> > >> With that said desktop environments have for a long time been
> > >> automatically mounting whichever filesystem you place in your computer,
> > >> so in practice what this is really about is trying to align the kernel
> > >> with how people use filesystems.
> > >
> > > The key difference is that desktops only do this when you physically
> > > plug in a device. With unprivileged mounts, a hostile attacker
> > > doesn't need physical access to the machine to exploit lurking
> > > kernel filesystem bugs. i.e. they can just use loopback mounts, and
> > > they can keep mounting corrupted images until they find something
> > > that works.
> > 
> > Yep.  That magnifies the problem quite a bit.
> > 
> > > User namespaces are supposed to provide trust separation.  The
> > > kernel filesystems simply aren't hardened against unprivileged
> > > attacks from below - there is a trust relationship between root and
> > > the filesystem in that they are the only things that can write to
> > > the disk. Mounts from within a userns destroy this relationship as
> > > the userns root, by definition, is not a trusted actor.
> > 
> > I talked to Ted Tso a while back and ext4 is at least in principle
> > already hardened against that kind of attack.  I am not certain I
> > believe it, but if it is true I think it is fantastic.
> 
> No, it's not. No filesystem is, because to harden against such
> attacks requires complete verification of all metadata when it is
> read from disk, before it is used, or some method of ensuring the
> block was not tampered with. CRCs are not sufficient, because they
> can be tampered with, too.
> 
> The only way a filesystem would be able to trust what it reads from
> disk has not been tampered with in a system with untrusted mounts is
> if it has some kind of cryptographically secure signature in the
> metadata and the attacker is unable to access the key for that
> signature.

Preventing tampering is a little different from protecting the kernel
from attack, isn't it?  I thought the latter was what people were asking
about.

So, for example, a screwed up on-disk directory structure shouldn't
result in creating a cycle in the dcache and then deadlocking.
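
As a rough illustration of that kind of defensive check (a userspace
sketch only, not kernel code; struct dir_node, is_self_or_ancestor() and
attach_dir() are hypothetical names), code that links a directory read
from disk into an in-memory tree can refuse any link that would make a
directory its own ancestor:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory directory node; not the kernel's struct dentry. */
struct dir_node {
    struct dir_node *parent;    /* NULL at the root or while unattached */
};

/* Return true if 'candidate' is 'dir' itself or one of its ancestors. */
static bool is_self_or_ancestor(const struct dir_node *candidate,
                                const struct dir_node *dir)
{
    for (const struct dir_node *p = dir; p != NULL; p = p->parent)
        if (p == candidate)
            return true;
    return false;
}

/*
 * Attach 'child' under 'dir'.  Returns 0 on success, or -1 if the link
 * described by the (possibly corrupted) on-disk data would form a cycle.
 */
static int attach_dir(struct dir_node *dir, struct dir_node *child)
{
    if (is_self_or_ancestor(child, dir))
        return -1;              /* reject instead of building a loop */
    child->parent = dir;
    return 0;
}
```

This assumes the tree built so far is already acyclic, so the walk up
the parent chain always terminates.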

--b.

> No filesystem we have has that capability and AFAIA there
> are no plans for any filesystem to implement such tamper detection.
> And no, ext4 encryption does not provide this because it only stores
> the values and data in encrypted format and does not protect
> metadata from tampering when it is not mounted.
> 
> If we don't have crypto signatures in metadata, then XFS is probably
> the most robust against tampering as it does a lot more checking of
> the on-disk metadata before it is used than any other filesystem
> (i.e. see the verifier infrastructure that does corruption checks
> after read (in io completion) and before write (in io submission)
> to catch bad metadata before it is used by the kernel, or before it
> is written to disk by the kernel).
> 
> However, these checks are far from comprehensive. We can only check
> internal consistency of the metadata objects in the block, and even
> then we really only can check for values within range rather than
> absolute correctness. e.g. we can check a dirent has a valid name,
> length, ftype and inode number, but we can't validate that the inode
> is actually allocated or not because that requires a lookup in the
> allocated inode btree. We *trust* that inode number to be
> allocated and valid because it is in metadata the filesystem wrote.
> 
> For inode numbers that come from untrusted sources (NFS,
> open-by-handle, etc) we have a flag that does inode number
> validation on lookup (XFS_IGET_UNTRUSTED) to check against trusted
> metadata (i.e. the allocated inode btrees), but that is expensive
> and so not done on inodes that we pull directly from metadata that
> has come from disk. Indeed, we still trust on-disk metadata to be
> correct to validate that other metadata can be trusted, so if one
> structure can be tampered with, so can others.
> 
> IOWs, if we cannot trust one part of the filesystem metadata to be
> correct, then we cannot trust that filesystem *at all*, *for
> anything*. And even running fsck doesn't restore trust - all it does
> is tell us that any modification that was made is not a detectable
> inconsistency that needs fixing.
> 
> > At this point any setting of the FS_USER_MOUNT flag I figure needs to go
> > through the filesystem maintainers tree and they need to be aware of and
> > agree to deal with the attack from below issue.
> > 
> > The one filesystem I truly expect we can make work is fuse.  fuse has
> > been designed to deal with some variation of the attack from below issue
> > since day one.  We looked at what the patches to fuse would look like
> > with the current state of the vfs and it was not pretty.
> > 
> > We very much need to sort through as much as possible at the vfs layer,
> > and in generic code.  Allow everyone to see what is going on and how
> > it works before proceeding with enabling any filesystems.
> 
> The VFS protects us from attacks from above the filesystem, not
> below. The VFS plays no part in validating the on-disk structure of
> a filesystem which is what attacks from below will be attempting to
> exploit.
> 
> > I truly hope we can find a small set of block device filesystems that we
> > can harden from attack below.   That would allow linux to have serious
> > defenses against evil usb stick attacks.  I think that is going to take
> > a lot of careful coding, testing and validation and advancing the state
> > of the art to get there.
> 
> Somehow, I just can't see that happening.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

